9 April 2007
Biol 6312
Motor protein switches
Motor proteins are used for intracellular transport of macromolecules, organelles, vesicles, and chromosomes. They are also used for chemotaxis and muscle movement. These proteins use ATP hydrolysis to generate conformational states. In general, motor proteins such as myosin, kinesin, and dynein use ATP hydrolysis to move along filaments. They have multiple conformational states involving ATP binding, ADP binding and covalently linked phosphate. (Fig. 3-17) They have switch regions, much like the G-proteins. (Fig. 3-18)
Recent progress in kinesin is discussed in the review below. The likely mode of kinesin movement is depicted in the movie below.
Yildiz A, Selvin PR.
Kinesin: walking, crawling or sliding along?
Trends Cell Biol. 2005 Feb;15(2):112-20.
Mechanisms of Regulation and Control
In metabolism there are two, sometimes distinct, situations that are often described by the terms regulation and control. Regulation is generally considered to refer to the process of responding to various environmental changes in order to maintain a constant level of a metabolite (homeostasis). In contrast, control is generally considered to refer to the process in which the level of a metabolite is increased or decreased in response to some signal.
The use of the terms regulation and control with respect to proteins may not be consistent with usage described above.
Regulation of proteins by degradation.
Proteins are degraded in cells by a variety of proteases, but primarily through the proteasome. This is a large structure that superficially resembles the HSP60 chaperonin. (Fig. 3-19) (Jmol)
In general, the level of any protein can be maintained by a balance between its synthesis and its degradation. The rate of degradation might be related to the intrinsic stability of the protein.
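The balance between synthesis and degradation can be sketched numerically. The example below assumes the simplest possible model, which is not stated in these notes: synthesis at a constant rate and first-order degradation, dP/dt = k_syn - k_deg * P. The names `k_syn` and `k_deg` and the numbers used are hypothetical, chosen only for illustration.

```python
def steady_state_level(k_syn, k_deg):
    """Steady-state level for dP/dt = k_syn - k_deg * P (assumed model)."""
    if k_deg <= 0:
        raise ValueError("k_deg must be positive")
    # At steady state synthesis equals degradation: k_syn = k_deg * P.
    return k_syn / k_deg


def simulate_level(k_syn, k_deg, p0=0.0, dt=0.01, steps=10000):
    """Simple Euler integration of the same model, starting from p0."""
    p = p0
    for _ in range(steps):
        p += (k_syn - k_deg * p) * dt
    return p


if __name__ == "__main__":
    # Hypothetical rates in arbitrary units.
    print(steady_state_level(10.0, 0.5))
    print(simulate_level(10.0, 0.5))
```

In this model, doubling the degradation rate constant halves the steady-state level, which is one way that targeted degradation can change the amount of a protein without any change in its synthesis.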
In addition, proteins might be targeted for degradation, in response to their own damage, or some outside signal. The primary signal that sends a protein to the proteasome for degradation is the covalent tagging by many molecules of ubiquitin. Ubiquitin is covalently linked to a lysine residue near the N-terminus of a protein. This is catalyzed by a ubiquitin ligase, a part of a larger pathway. Proteolysis by the proteasome is driven by ATP hydrolysis. (Fig. 3-20) Ubiquitin is not degraded, but is recycled.
Phosphorylation of specific residues can also be a signal to send a protein for degradation. For example, NFκB, a transcription factor, is retained in the cytoplasm by binding to another protein, IκB. The phosphorylation of 2 serine residues in IκB signals it to be ubiquitinated, and eventually degraded. This releases NFκB, allowing it to travel to the nucleus and be active.
Hypoxia-inducible factor (HIF) is another transcription factor. Under normal levels of oxygen, its prolines are hydroxylated, leading to ubiquitination and degradation. When oxygen levels fall sufficiently, it is no longer hydroxylated, and so the factor survives to function.
Control of protein function by post-translational modification
More than half of all human proteins are thought to be post-translationally modified. More than 40 different types of modifications have been observed. Common ones are phosphorylation, glycosylation, lipidation, and limited proteolysis; less common ones include methylation, N-acetylation, nitrosylation and attachment of SUMO.
Common effects of covalent modification include changing the location of a protein, its activity, or its interactions with other proteins. Limited proteolysis can trigger a cascade of activity by activating enzymes, such as in the blood clotting system.
Phosphorylation is the most common post-translational modification. Unlike limited proteolysis, it is reversible, making it suitable for rapidly turning enzyme activity on and off.
Phosphorylation is a switch mechanism
Phosphorylation of proteins occurs in all types of living organisms. Target proteins are phosphorylated by protein kinases and are dephosphorylated by protein phosphatases. This allows for independent regulation of the reaction in both directions. Nucleoside triphosphates are the source of the phosphoryl group, usually ATP.
In the human genome serine, threonine and tyrosine kinases constitute about 2% of the genome (575 instances), making them the third most common domain. Bacteria tend to have histidine and aspartate kinases, and in E. coli they make up about 1.5% of the genome.
The covalent attachment of a phosphate group can have large effects on the conformation of the protein. It introduces negative charge and H-bonding capacity.
Phosphorylation can affect the activity of the target protein. For example, glycogen phosphorylase undergoes a conformational change after phosphorylation of a single serine by phosphorylase kinase. It then becomes more active in the production of glucose-1-phosphate from glycogen. (Fig. 3-22)
Isocitrate dehydrogenase of E. coli is inactivated by phosphorylation of a serine residue at the active site. This does not involve a significant conformational change. (Fig. 3-23)
Phosphorylation can also create a new binding surface for another protein. For example SH2 domains bind to phosphorylated tyrosines. (Also see Fig. 3-21 in which phosphorylation of the receptor allows Grb to bind, triggering the MAP kinase pathway.)
Protein kinases are generally controlled by phosphorylation. Many kinases resemble the structure shown in Fig. 3-24. There are two domains, or lobes, with a hinge-like connection between them. A catalytic cleft exists between the 2 domains. A flexible activation loop controls the activity state of the kinase. (Fig. 3-25)
Src-family kinases
Src kinases are activated by phosphorylation of a tyrosine residue near the activation loop. The phosphotyrosine stabilizes the active conformation. (Fig. 3-26) Src kinases can autophosphorylate, leading to rapid conversion to the active state.
Src kinases contain 2 domains that help to keep the kinase inactive. The SH2 domain binds to a phosphotyrosine in the kinase, and the SH3 domain binds to a polyproline helix segment.
Cyclin-dependent kinases
These are enzymes that control the timing of the cell cycle. Cyclin-dependent kinase 2 (Cdk2) is activated by the binding of cyclin A and the phosphorylation of Threonine 160. Neither alone is sufficient. (Fig. 3-27) The isolated Cdk2 is inhibited by the location of the red helix, PSTAIRE, in which the catalytic Glutamate is not in position for hydrolysis of ATP. The binding of cyclin A helps form an active catalytic site by moving the red helix, and causing the short green helix to change into a β-strand. The phosphorylation of Thr 160 helps the kinase to better interact with its substrates.
Signaling systems in bacteria
Bacteria use a two-component signaling system that is somewhat different from the kinase cascade system in eukaryotes. The first component is a membrane-bound ATP-dependent histidine kinase receptor protein (HK). The second component is a cytoplasmic response regulator protein (RR). Signals from outside the cell can activate the kinase domain of the receptor. First, a histidine residue in the HK domain is autophosphorylated. Then the phosphoryl group is transferred to an aspartate residue in an RR protein. (Fig. 3-29)
RR regulatory domains have 3 distinct functions:
The effects of phosphorylation of the RR protein tend to be spread over a surface. Numerous residues undergo small conformational changes (magenta-phosphorylated) (Fig. 3-30)
Although many stimuli lead to changes in rates of transcription, a well-characterized two-component system signals bacterial chemotaxis in response to nutrients.
Control by proteolysis
Limited proteolysis is a mechanism used to activate some proteins. It is distinct from proteolysis for degradation.
After proteolysis, the fragments can remain together if they are linked by disulfide bonds. This is generally the case during the activation of proteolytic enzymes such as chymotrypsin. The key proteolytic step is the hydrolysis of the bond between Arg 15 and Ile 16. When this is cleaved, the enzyme can rearrange its conformation to make an active enzyme. (Fig. 3-31) This is illustrated for plasmin (blue, active) and plasminogen (red, inactive). (Fig. 3-32)
The processing of hormones leads to a variety of different products with distinct functions. The precursor to pituitary hormones is called Prepro-opiomelanocortin. Some steps are tissue-dependent. (Fig. 3-33)
The blood coagulation cascade is an example of a proteolytic cascade, in which a single activated protein can rapidly lead to millions of activated molecules downstream, such as would be needed to clot the flow of blood from a wound. (Fig. 3-34)
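The amplification in a proteolytic cascade can be illustrated with a small calculation. This is only a sketch: it assumes that each activated molecule at one stage activates a fixed number of molecules at the next stage, and the gain values are hypothetical, not measured values for the coagulation system.

```python
def cascade_output(gains):
    """Number of activated molecules at the last stage per initial molecule.

    gains: a list with one entry per stage, giving how many downstream
    molecules each activated molecule turns on (hypothetical values).
    """
    total = 1
    for gain in gains:
        total *= gain
    return total


if __name__ == "__main__":
    # Three stages, each with an assumed 100-fold gain.
    print(cascade_output([100, 100, 100]))
```

Because the gains multiply, even a few stages with modest gains produce a very large response from a single initial event.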
Protein splicing
There are about 100 examples of protein splicing known today. Typically, a single polypeptide chain is processed into two new chains, similar to the way in which introns are removed from mRNA. Each of the new chains can be a functional protein. (Fig. 3-35) Many inteins are nucleases. (Fig. 3-36) The mechanism of protein splicing requires no other factors. There are 3 key residues or regions, designated A at the N-terminal end of the intein, G at the C-terminal end of the intein, and B internal to the intein. Residue A usually contains an oxygen (Ser, Thr) or sulfur (Cys) for nucleophilic attack. B is usually TXXH, and is important for intein structure and function. G is usually Asn, which is important in the mechanism, shown in Fig. 3-38.
In a related process, the eukaryotic Hedgehog protein autocatalyzes the cleavage of its C-terminal domain, and links a cholesterol to the new C-terminus of the remaining N-terminal fragment. This binds the protein to a membrane.
Hall TM, Porter JA, Young KE, Koonin EV, Beachy PA, Leahy DJ.
Prediction of Protein Structure
Overview
Key References over the past few years:
Ginalski K, Grishin NV, Godzik A, Rychlewski L.
Practical lessons from protein structure prediction.
Nucleic Acids Res. 2005 Apr 1;33(6):1874-1891.
Schonbrun J, Wedemeyer WJ, Baker D.
Protein structure prediction in 2002.
Curr Opin Struct Biol. 2002 Jun;12(3):348-54.
Jones DT.
Protein structure prediction in the postgenomic era.
Curr Opin Struct Biol. 2000;10:371-379.
There continues to be a large gap between the number of proteins of known amino acid sequence and the number of proteins of known 3-D structure. This gap may never be eliminated.
But protein structure is essential for understanding the function of the protein:
Can the prediction of protein structure from sequence be improved enough to eliminate the need to crystallize each protein?
Probably not in the near future, but predictions can generate useful and generally reliable information.
There are 3 levels of analysis in the overall prediction scheme:
- Motif recognition in the primary sequence
- Secondary structure prediction
- Tertiary structure/fold prediction
A starting point is often to search for proteins with sequences that are similar to the protein under study. This usually involves a BLAST search:
BLAST (Basic Local Alignment Search Tool)
Altschul SF, Madden TL, Schaffer AA, Zhang JH, Zhang Z, Miller W, Lipman DJ:
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Nucleic Acids Res 1997, 25: 3389-3402.
Altschul SF, Koonin EV.
Iterated profile searches with PSI-BLAST--a tool for discovery in protein databases.
Trends Biochem Sci. 1998 Nov;23(11):444-7. Review.
From this analysis one might learn about the function of similar proteins, and whether a 3-D structure exists for any of them. Depending on the degree of amino acid identity of the closest "neighbors", one might surmise the function of the protein.
Example BLAST input page
Protein Motif Recognition
Prosite is a database of sequence motifs for functions such as:
post-translational modifications of N- or C-termini, signal sequences for localization, sites of lipid attachment, sites of phosphorylation, or markers of particular types of enzymes
Example: phosphorylase kinase KRKQISVR Reference: Kemp & Pearson (1990) TiBS 15:342-346
PROSITE Website
A protein sequence can be scanned against the PROSITE database from the PROSITE website.
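Motif scanning of this kind can be sketched with regular expressions. The code below is a rough illustration, not the PROSITE scanning tool itself. It assumes the usual PROSITE pattern conventions (elements separated by "-", "x" for any residue, "[...]" for allowed residues, "{...}" for excluded residues, "(n)" for repeats) and handles only that subset. The function names and the test sequence are made up for the example, and the N-glycosylation pattern is included only as a familiar illustration.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split off an optional repeat count such as (2) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            core = "."  # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        # Single letters and [..] sets are already valid regex.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(core)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    seq = "MKRKQISVRGLNGSA"  # made-up sequence for illustration
    print(find_motif(seq, "K-R-K-Q-I-S-V-R"))   # the site quoted in the notes
    print(find_motif(seq, "N-{P}-[ST]-{P}"))    # illustrative glycosylation pattern
```

Short patterns like these match by chance fairly often, so a hit is a suggestion to be checked against other evidence rather than proof of function.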
There are databases of sequence motifs. These can generate aligned sequences from their websites.
PRINTS a Protein Fingerprint database
Prediction of Secondary Structure
In some cases this is preliminary to prediction of tertiary structure.
a) Statistical methods, first developed by Peter Chou and Gerald Fasman
These are based on the tendencies of particular amino acids to be found in the different types of secondary structure. They are considered to be about 60% accurate.
Chou-Fasman Method
The tendencies of each type of amino acid to be found in each of the 3 types of secondary structure (helix, sheet, loop) are calculated from a database of high-resolution structures (2 Å):
Original database, 1974 contained 15 proteins (2473 amino acids)
Revised, 1978, containing 29 proteins (4741 amino acids)
Simply increasing the database did not increase the accuracy of the predictions.
The propensity is defined as
$$P_{ij} = \frac{n_{ij}/n_i}{n_j/N}$$
where the i's correspond to the amino acid type (20) and the j's are the secondary structures (3).
For example, $P_{\mathrm{Ala},\alpha}$ is the propensity for Ala to be found in an α-helix, with:
$n_{\mathrm{Ala},\alpha}$ = the number of Ala residues found in α-helix (in the database)
$n_{\mathrm{Ala}}$ = the number of Ala residues (in the database)
$n_{\alpha}$ = the number of amino acids found in α-helix (in the database)
$N$ = the number of amino acid residues in the database
So, the top of the propensity expression represents the fraction of all Ala residues that are α-helical,
while the bottom represents the fraction of ALL residues that are α-helical. Therefore, this ratio represents the tendency of each amino acid relative to the average amino acid.
A propensity of >1 indicates more likely than chance, while <1 indicates less likely than chance. A value of 1.0 indicates no prediction.
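The propensity ratio described above (the fraction of residues of one type found in a given structure, divided by the fraction of all residues found in that structure) can be computed from a set of sequences with known secondary structure. The following is a minimal sketch. The function name, the one-letter structure codes and the tiny input are assumptions made for illustration, and a real calculation would use a database of solved structures.

```python
from collections import Counter


def propensities(sequences, structures):
    """Propensity for every (residue, structure) pair.

    sequences: list of amino acid strings.
    structures: list of equal-length strings, one structure code per residue
    (for example H = helix, E = sheet, C = other; codes are an assumption).
    """
    pair_counts = Counter()
    residue_counts = Counter()
    state_counts = Counter()
    total = 0
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for aa, state in zip(seq, ss):
            pair_counts[(aa, state)] += 1
            residue_counts[aa] += 1
            state_counts[state] += 1
            total += 1
    table = {}
    for aa in residue_counts:
        for state in state_counts:
            fraction_of_residue = pair_counts[(aa, state)] / residue_counts[aa]
            fraction_of_all = state_counts[state] / total
            table[(aa, state)] = fraction_of_residue / fraction_of_all
    return table


if __name__ == "__main__":
    # Toy input, far too small to give meaningful values.
    table = propensities(["AAGG"], ["HHHC"])
    for key in sorted(table):
        print(key, round(table[key], 2))
```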
Since secondary structure is usually formed by several consecutive residues, it is more meaningful to take running averages of 5 or more amino acids at a time. This is called the window. A prediction will tend to be most accurate when the window matches the size of the actual segment of secondary structure.
The entire length of the protein is analyzed for each set of secondary structure propensities (helix, sheet, turn). The final prediction is made by comparing the 3 sets of values.
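The windowed comparison described above can be sketched as follows. This is an illustration rather than the published Chou-Fasman procedure, which includes additional rules. The propensity values in the example are placeholders chosen for demonstration, not the published parameters, and the function names are hypothetical.

```python
def window_average(values, window=5):
    """Centered running average; near the ends only the part that fits is used."""
    half = window // 2
    averaged = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averaged.append(sum(values[lo:hi]) / (hi - lo))
    return averaged


def predict(sequence, tables, window=5):
    """Assign each residue the structure with the highest windowed propensity.

    tables: dict mapping a structure code to a dict of residue propensities.
    """
    profiles = {
        state: window_average([table[aa] for aa in sequence], window)
        for state, table in tables.items()
    }
    return "".join(
        max(profiles, key=lambda state: profiles[state][i])
        for i in range(len(sequence))
    )


if __name__ == "__main__":
    # Placeholder propensities for three residue types only.
    toy_tables = {
        "H": {"A": 1.4, "V": 1.0, "G": 0.6},
        "E": {"A": 0.8, "V": 1.7, "G": 0.8},
        "T": {"A": 0.7, "V": 0.5, "G": 1.6},
    }
    print(predict("AAAAAGGGGGVVVVV", toy_tables))
```

Changing the `window` argument shows the effect mentioned above: a window much longer than the real segment smears the signal into neighboring regions, while a very short one gives a noisy profile.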
These calculations can be done in a spreadsheet such as Excel using the "LOOKUP" function. This can also be used for predicting transmembrane spans using hydrophobicity values.
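The same lookup-and-average calculation can be written outside a spreadsheet. The sketch below applies it to hydrophobicity. It assumes the Kyte-Doolittle hydropathy scale, a 19-residue window and a cutoff of 1.6, none of which are specified in these notes; other scales and cutoffs are in use. The function names and the test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not named in these notes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Mean hydropathy of every full window; one value per window start."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_span_starts(sequence, window=19, threshold=1.6):
    """1-based start positions of windows at or above the cutoff."""
    profile = hydropathy_profile(sequence, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]


if __name__ == "__main__":
    # Made-up sequence: a 19-residue Leu stretch flanked by Asp.
    seq = "D" * 10 + "L" * 19 + "D" * 10
    print(candidate_span_starts(seq))
```

The dictionary plays the role of the spreadsheet lookup table, and the list comprehension plays the role of the running average column.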
Special applications:
1) Regions that score as highly likely to be either α-helix or β-sheet are candidates for conformational changes.
2) The effects of mutations can be predicted by changing the sequence.
3) Turns can be predicted (with fair accuracy) on a residue basis, because they are not extended structures.
Limitations: Accuracy of this method seems to be limited because of the limited range of propensities. Most types of amino acids can be found often in any secondary structure. Few are really excluded from any of the secondary structures.
An extension of this approach uses position-specific propensities, e.g. for the first position in a helix or turn. This works well with turns or with the termini of helices/sheets.
Protein Sci 1994 Dec;3(12):2207-16
A revised set of potentials for β-turn formation in proteins.
Hutchinson EG, Thornton JM
Server for Chou-Fasman Prediction
Original references are too early for on-line abstracts etc.
Chou PY, Fasman GD.
Empirical predictions of protein conformation.
Annu Rev Biochem. 1978;47:251-76. Review.
Chou PY, Fasman GD.
Prediction of protein conformation.
Biochemistry. 1974 Jan 15;13(2):222-45.
Chou PY, Fasman GD.
Conformational parameters for amino acids in helical, β-sheet, and random coil regions calculated from proteins.
Biochemistry. 1974 Jan 15;13(2):211-22.
Garnier (GOR1) Predictor of secondary structure
Garnier J, Osguthorpe DJ, Robson B.
Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins.
J Mol Biol. 1978 Mar 25;120(1):97-120.
Garnier J, Gibrat JF, Robson B.
GOR method for predicting protein secondary structure from amino acid sequence.
Methods Enzymol. 1996;266:540-53.
Combination approach:
Kloczkowski A, Ting KL, Jernigan RL, Garnier J.
Combining the GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence.
Proteins. 2002 Nov 1;49(2):154-66.
b) Neural net predictions of Secondary Structure:
They use a training set of known proteins, and usually utilize multiple sequence alignments during the prediction. They are considered to be 70-80% accurate.
PredictProtein (Now at Columbia U.)
Rost B, Schneider R, Sander C.
Progress in protein structure prediction?
Trends Biochem Sci. 1993 Apr;18(4):120-3.
PsiPred David T. Jones
Jones DT:
Protein secondary structure prediction based on position-specific scoring matrices.
J Mol Biol 1999, 292: 195-202.
Membrane-spanning segments of integral membrane proteins can be predicted in similar ways. We will discuss this later.
Comments/questions: svik@mail.smu.edu
Copyright 2007, Steven B. Vik, Southern Methodist University