5 February 2007

Biol 6312

Protein Folding

Pathways of Protein Folding

Proteins of a given primary structure seem to fold into a given 3-D structure.

1. Can this outcome be predicted?

2. What is the pathway?

The free energy diagram for protein folding is usually drawn something like this 2-state model. N is the folded state and D is the unfolded (denatured) state. ∆G of folding is typically -5 to -10 kcal (about -20 to -40 kJ) per mole. The activation free energy differs depending upon whether the reaction is in the folding direction or the unfolding direction. It also depends upon the transition state (the point on the path of highest free energy).

This diagram implies a single pathway through a transition state, with no side pathways or intermediates (i.e. it is rather simplistic.) It implies that folding of such proteins can be studied in either direction: refolding or unfolding.
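To get a feel for what a ∆G of folding of a few kcal/mole means, the short sketch below converts it into an equilibrium constant and a fraction of folded molecules, using K = exp(-∆G/RT) for a simple two-state equilibrium. It also shows how the two activation free energies in the diagram are related. The function names, the temperature of 25 ˚C, and the 15 kcal/mole folding barrier are illustrative assumptions, not values from a particular protein.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def folding_equilibrium(delta_g_fold_kcal, temp_k=298.15):
    """Return (K_fold, fraction_folded) for a two-state unfolded <-> N equilibrium."""
    k_fold = math.exp(-delta_g_fold_kcal / (R_KCAL * temp_k))
    return k_fold, k_fold / (1.0 + k_fold)


def unfolding_barrier(folding_barrier_kcal, delta_g_fold_kcal):
    """Activation free energy for unfolding on a single-barrier diagram.

    Both directions pass through the same transition state, so the
    unfolding barrier is the folding barrier minus the (negative) dG of folding.
    """
    return folding_barrier_kcal - delta_g_fold_kcal


if __name__ == "__main__":
    for dg in (-5.0, -10.0):
        k_fold, frac = folding_equilibrium(dg)
        print(f"dG = {dg:6.1f} kcal/mol  K_fold = {k_fold:.3g}  folded fraction = {frac:.6f}")

    # Hypothetical folding barrier of 15 kcal/mol, chosen only for illustration.
    print("unfolding barrier:", unfolding_barrier(15.0, -5.0), "kcal/mol")
```

Even at -5 kcal/mole, the folded state is favored by a factor of several thousand, yet this is a small number on the scale of the individual enthalpy and entropy terms involved.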

1) Can a protein fold spontaneously to its native state?

Yes. Christian Anfinsen showed this for ribonuclease (Profiles in Science).

Anfinsen and colleagues first denatured ribonuclease with urea and reduced its disulfide bonds to -SH groups. When the urea was removed slowly in the presence of oxygen, which allowed the disulfides to re-form, the enzyme returned to its native state.
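One way to appreciate this result is to count how many wrong disulfide pairings were available to the reduced enzyme. The sketch below counts the ways to pair up an even number of cysteines. The figure of 8 cysteines (4 disulfides) for bovine pancreatic ribonuclease A is not stated in these notes and is supplied here as an assumption; the function name is also just illustrative.

```python
from math import factorial


def disulfide_pairings(n_cysteines):
    """Number of ways to pair all cysteines into disulfides: (2n)! / (2^n * n!)."""
    if n_cysteines < 0 or n_cysteines % 2:
        raise ValueError("need a non-negative, even number of cysteines")
    n_bonds = n_cysteines // 2
    return factorial(n_cysteines) // (2 ** n_bonds * factorial(n_bonds))


if __name__ == "__main__":
    # Assumed value: ribonuclease A has 8 cysteines forming 4 disulfides.
    print(disulfide_pairings(8))
```

Under that assumption only one pairing out of the total is the native one, so random re-oxidation alone would give very little active enzyme.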

Anfinsen, Christian B., Edgar Haber, Michael Sela, and F. H. White Jr. "The Kinetics of Formation of Native Ribonuclease During Oxidation of the Reduced Polypeptide Chain." Proceedings of the National Academy of Sciences 47, 9 (15 September 1961): 1309-1314. (PDF)

He shared the Nobel Prize in Chemistry in 1972.

How does a protein do this?

2) Does a protein search all possible outcomes to find the final conformation, the one of lowest free energy?

No, as was first explained by Cyrus Levinthal, there is not enough time.
See his 1968 paper.

Each amino acid would have numerous possible conformations (most of them rather unfavorable), and multiplied together a typical protein would have an astronomical number of possible conformations. Even assigning a very small time interval for each step would lead to a time greater than the age of the universe.
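The size of the problem can be estimated in a few lines. The sketch below is a rough version of Levinthal's argument, not his original calculation: the chain length (100 residues), the number of conformations per residue (3), and the sampling rate (10^13 per second) are all assumed round numbers chosen for illustration.

```python
AGE_OF_UNIVERSE_S = 4.35e17  # roughly 13.8 billion years, in seconds


def levinthal_search_time(n_residues=100, states_per_residue=3, samples_per_second=1e13):
    """Seconds needed to visit every conformation once in a purely random search."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second


if __name__ == "__main__":
    seconds = levinthal_search_time()
    print(f"search time: {seconds:.2e} s")
    print(f"ages of the universe: {seconds / AGE_OF_UNIVERSE_S:.2e}")
```

Changing the assumed numbers shifts the answer by many orders of magnitude, but for any reasonable choice the search time stays far beyond the seconds or less in which real proteins fold.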

So there must be kinetic control, as well as thermodynamic control in protein folding. In other words, proteins not only seek the lowest free energy state, but they follow particular paths, not random ones.

Or, the answer to Levinthal's paradox is that proteins are a lot smarter than they look. Not only does the amino acid sequence seem to carry the information necessary to specify the final 3-D structure of the folded state, but also:

built into the amino acid sequence is information that guides the folding process--taking it along particular pathways, and avoiding others.

Are there detectable intermediates on the folding pathway? Fig U1-3.1

Some proteins show evidence of a "molten globule", the result of hydrophobic collapse. Such forms show some secondary structure, but not a correct tertiary structure. The collapse is driven by the rapid burial of hydrophobic side chains. This pathway is considered to consist of many equally probable alternative paths: a multi-dimensional folding landscape with many transition states and short-lived intermediates.

The view of Walter Englander and colleagues (see below), who study the folding of cytochrome c by hydrogen exchange, is that protein folding occurs by the sequential assembly of discrete units of secondary structure. They believe that intermediates occur but go largely undetected because they follow the rate-limiting step. The rate-limiting step is thought to occur early, as a kind of hydrophobic collapse in which the correct topology of the polypeptide chain is found.

Rumbley J, Hoang L, Mayne L, Englander SW
An amino acid code for protein folding.
Proc Natl Acad Sci U S A 2001 Jan 2;98(1):105-12

The folding pathway is Blue, Green, Yellow, Red and White.

Krishna MM, Maity H, Rumbley JN, Lin Y, Englander SW.
Order of steps in the cytochrome C folding pathway: evidence for a sequential stabilization mechanism.
J Mol Biol. (2006) 359:1410-9.

The folding pathways for other proteins have been studied by the group of Alan Fersht.

Daggett V, Fersht AR.
Is there a unifying mechanism for protein folding?
Trends Biochem Sci. 2003 Jan;28(1):18-25. Review.

Abstract: Proteins appear to fold by diverse pathways, but variations of a simple mechanism - nucleation-condensation - describe the overall features of folding of most domains. In general, secondary structure is inherently unstable and its stability is enhanced by tertiary interactions. Consequently, an extensive interplay of secondary and tertiary interactions determines the transition-state for folding, which is structurally similar to the native state, being formed in a general collapse (condensation) around a diffuse nucleus. As the propensity for stable secondary structure increases, folding becomes more hierarchical and eventually follows a framework mechanism where the transition state is assembled from pre-formed secondary structural elements.

Unfolding (a) and folding (b) of the engrailed homeodomain protein. Aromatic side chains are shown. These structures were obtained from molecular dynamics simulations.

Mayor U, Guydosh NR, Johnson CM, Grossmann JG, Sato S, Jas GS, Freund SM, Alonso DO, Daggett V, Fersht AR.
The complete folding pathway of a protein from nanoseconds to microseconds.
Nature. 2003 Feb 20;421(6925):863-7.

Religa TL, Markson JS, Mayor U, Freund SM, Fersht AR.
Solution structure of a protein denatured state and folding intermediate.
Nature. 2005 Oct 13;437(7061):1053-6.

Historical perspective:

Barry Honig, colleague of Levinthal

Honig B
Protein folding: from the Levinthal paradox to structure prediction
J Mol Biol 1999 Oct 22;293(2):283-93

Hydrophobic collapse must be associated with formation of secondary structure. As a protein moves from an extended, linear structure to a compact one, an inside is formed by burying the hydrophobic side chains. This will tend to take the backbone NH and CO groups away from water, and therefore new H-bonds need to be made. This is accomplished by the formation of regular secondary structure. (Fig. 1-23)

Knots in Proteins

Some proteins appear to contain "knots", resulting in more difficult folding pathways.

The entire protein is shown in the top panel.

The knot, from the upper domain, is shown in the lower panel.

Jmol view of the knotted protein

Taylor WR.
A deeply knotted protein structure and how it might fold.
Nature. 2000 Aug 24;406(6798):916-9.

Other proteins have been discovered with knots. For example, YibK, a homodimeric methyltransferase from Haemophilus influenzae, folds via several intermediates.

Mallam AL, Jackson SE.
Probing nature's knots: the folding pathway of a knotted homodimeric protein.
J Mol Biol. 2006 Jun 23;359(5):1420-36. Epub 2006 May 2.

Protein Folding Energy Landscapes:

The protein folding funnel concept was introduced by the group of Wolynes:

Bryngelson JD, Onuchic JN, Socci ND, Wolynes PG.
Funnels, pathways, and the energy landscape of protein folding: a synthesis.
Proteins. 1995 Mar;21(3):167-95.

Rapid protein folding can be explained by the energy landscape view: each point represents a conformation of the folding protein. The native state is marked by "N". There are some energetic peaks and valleys along the way, but essentially, all roads lead down.

Courtesy of Ken Dill. http://www.dillgroup.ucsf.edu. (Dill, K.A. and Chan, S.: From Levinthal to pathways to funnels. Nat. Struct. Biol. 1997, 4:10–19.)

See also Jose Onuchic:

Plotkin SS, Onuchic JN.
Understanding protein folding with energy landscape theory. Part I: Basic concepts.
Q Rev Biophys. 2002 May;35(2):111-67.

Understanding protein folding with energy landscape theory. Part II: Quantitative aspects.
Q Rev Biophys. 2002 Aug;35(3):205-86.

Intra-cellular Protein Folding

Formation of disulfide bonds is an important aspect of protein folding. First, this reaction is catalyzed by enzymes. That means there must be a supply of oxidized substrate for the net conversion of SH groups to S-S disulfides. This can come from oxygen through the electron transport chain (part A below).

This is the system in E. coli, the best characterized one. In panel A, the oxidation of DsbA generates electrons, which are taken up by DsbB and passed on to quinones, which enter the electron transport chain. Oxidized DsbA is the substrate for the formation of disulfide bonds in many proteins in the bacterial periplasm.

In panel B, NADPH via thioredoxin keeps DsbC reduced so that it can shuffle disulfides in proteins as they are folding. See below.


Nakamoto H, Bardwell JC.
Catalysis of disulfide bond formation and isomerization in the Escherichia coli periplasm.
Biochim Biophys Acta. 2004 Nov 11;1694(1-3):111-9.

Proline isomerization is an important step in protein folding, since the peptide bond on the N-side of proline residues is sometimes cis. Some of the enzymes that carry out this isomerization are called cyclophilins.

Wedemeyer WJ, Welker E, Scheraga HA.
Proline cis-trans isomerization and protein folding.
Biochemistry. 2002 Dec 17;41(50):14637-44.

Walter S, Buchner J
Molecular chaperones--cellular machines for protein folding.
Angew Chem Int Ed Engl. 2002 Apr 2;41(7):1098-113.

Molecular chaperones are proteins that assist other proteins in folding correctly. They bind to hydrophobic sequences and, using the energy of ATP hydrolysis, unfold them, allowing them to try to refold properly. They prevent aggregation and unwanted interactions, and so have become known as "chaperones".


Hsp70 binding a polypeptide substrate in red

14 copies of GroEL and 7 copies of GroES form a folding machine.

Bukau B, Horwich AL.
The Hsp70 and Hsp60 chaperone machines.
Cell. 1998 Feb 6;92(3):351-66. Review.

Tertiary Structure

The packing of elements of secondary structure into a compact form generates the tertiary structure. The various tertiary structures are called "folds". They are especially diverse, and so are not easily described. (Fig. 1-24).

A system of topological diagrams has been developed. (PubMed)

Beta-strands are triangles, alpha-helices are circles.

TOPS Web site

There are many arrangements of alpha-helices and beta-strands, and loops of different lengths. Loops connecting helices and strands may look open in certain models of structure, but they seldom are. A spacefilling view will usually show that the packing is tight. (Fig. 1-25) (Jmol)

Proteins also contain bound water, which is usually found in the crystal structure.

(Fig. 1-26) (Jmol)

Packing of the side chains is an important aspect of tertiary structure. This ensures the compact and stable nature of folded proteins. Commonly these interactions are between nonpolar groups, but a few polar interactions will also likely be found in any folded protein.

Close packing (Fig. 1-27)

Examples of packing motifs: alpha-helix bundles (Fig. 1-28) (Jmol). Sometimes small cavities exist that can contain water molecules as shown above.

Thermodynamics of Protein Folding

Most folded proteins have marginal stability. Although there are numerous enthalpy and entropy terms associated with the folding of a protein, usually they almost cancel, leaving a small negative ∆G of folding. Enthalpy terms include ion pairs and hydrogen bonding, but since protein groups can make very favorable interactions with water, the difference in free energy between the folded and the unfolded structures is small. Much entropy is lost as the unfolded protein assumes a compact form, but if hydrophobic side chains are buried, the entropy of the water increases. When all of the terms are summed, the net favorable free energy is often small, 30-50 kJ/mole.

What does an unfolded conformation look like? It usually has little secondary structure and no tertiary structure, but that also depends upon the unfolding conditions. For example, temperature can usually denature a protein. This can be monitored by Circular Dichroism spectroscopy, which is sensitive to secondary structure. (Fig. 1-35)
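A thermal denaturation curve of the kind followed by circular dichroism can be sketched with a two-state model. The example below assumes that the enthalpy of unfolding is constant with temperature (no heat capacity change), so that the free energy of unfolding is ∆H(1 - T/Tm). The melting temperature of 60 ˚C and the enthalpy of 400 kJ/mole are hypothetical values chosen for illustration, not data for a real protein.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def fraction_unfolded(temp_k, tm_k=333.15, dh_unfold_kj=400.0):
    """Two-state fraction unfolded at temp_k, assuming a temperature-independent dH.

    dG_unfold(T) = dH * (1 - T/Tm), so dG_unfold = 0 and the fraction is 0.5 at Tm.
    """
    dg_unfold = dh_unfold_kj * (1.0 - temp_k / tm_k)
    k_unfold = math.exp(-dg_unfold / (R_KJ * temp_k))
    return k_unfold / (1.0 + k_unfold)


if __name__ == "__main__":
    for temp_c in range(40, 81, 5):
        frac = fraction_unfolded(temp_c + 273.15)
        print(f"{temp_c:3d} C  fraction unfolded = {frac:.4f}")
```

With these assumed values the transition is sharp, going from almost fully folded to almost fully unfolded over about 20 degrees, which is the cooperative behavior expected for a two-state protein.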

Temperature-sensitive mutants are those that fold properly at lower temperatures, but not at higher temperatures. These can be used as conditional mutants for studying essential proteins.

Chemical denaturants, such as urea, guanidinium chloride, and SDS, can also denature proteins.

Some proteins are stable at very high temperatures. They usually come from thermophilic bacteria, which live at 60-100 ˚C.

Post-Translational Modifications

There are a few common covalent modifications to proteins, some of which impact stability.

A database of post-translational modifications, formerly at the Protein Information Resource, is at the European Bioinformatics Institute.

Disulfide bonds, an example we have seen before. (Fig. 1-37) (Jmol)

Metal binding, also seen before. (Fig. 1-38) (Jmol)

Other cofactors are common, such as hemes, FAD, pyridoxal phosphate, PQQ (Fig. 1-39) New cofactors (PubMed) (JPG)

See PubChem

FAD (Jmol)

PQQ (Jmol)

Other types of modification (Fig. 1-40)

Computer lab


Copyright 2007, Steven B. Vik, Southern Methodist University

Last modified 2/5/07