16 April 2007

Biol 6312

Mechanisms of Regulation and Control (continued)

Glycosylation

A wide variety of modifications exist. Carbohydrates are covalently attached to proteins through the side-chain oxygens of serine or threonine (O-linked), or through the side-chain nitrogen of asparagine (N-linked).

Generally, glycosylation is used in extracellular or surface-attached proteins for (1) recognition, or (2) protection (Fig. 3-40).

Simple glycosylation occurs in yeast. It is much more prevalent and complex in mammals.

Glycosylation typically proceeds through addition of a core oligosaccharide, followed by trimming and further glycosylation. N-linked and O-linked glycosylation start with different cores. (Fig. 3-41) (Animation)

Lipidation

Lipidation provides a way to target proteins to membrane surfaces, sometimes in a reversible manner, and sometimes to a specific membrane. There are 4 types of lipid modification:

  1. Myristoylation (a 14-carbon saturated fatty acid). It is attached via an amide bond to an N-terminal glycine residue.
  2. Palmitoylation (a 16-carbon saturated fatty acid, with some variation). It is attached via a thioester to a cysteine residue (S-acylation).
  3. Prenylation (a farnesyl or geranylgeranyl group, built from isoprene units). It is attached via a thioether to a cysteine residue. Usually the cysteine is 4 residues from the C-terminus, but becomes C-terminal after thioether linkage and further processing.
  4. Glycosylphosphatidylinositol anchor (Fig. 3-45)

Figure for the first three (Fig. 3-44).
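The positional requirements listed above can be checked directly on a sequence. The following is a minimal sketch, not a predictor: the function name and the example sequence are made up for illustration, the sequence is assumed to be given after removal of the initiator Met, and meeting a positional requirement is necessary but not sufficient for the modification to occur.

```python
def lipidation_hints(sequence):
    """Report which positional requirements for lipid modification are met.

    Assumption: `sequence` is the one-letter protein sequence after removal
    of the initiator Met, so an N-terminal Gly is residue 1.
    """
    seq = sequence.upper()
    return {
        # Myristoylation needs an N-terminal glycine.
        "myristoylation": seq.startswith("G"),
        # Prenylation usually uses a cysteine 4 residues from the C-terminus.
        "prenylation": len(seq) >= 4 and seq[-4] == "C",
        # Palmitoylation uses a cysteine; no positional rule is given,
        # so every cysteine is listed (1-based positions).
        "palmitoylation_candidates": [
            i + 1 for i, aa in enumerate(seq) if aa == "C"
        ],
    }


# Hypothetical sequence, chosen only to exercise all three checks.
print(lipidation_hints("GSSKSKPKCVLS"))
```

The GPI anchor is left out because the notes give no sequence rule for it.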

Methylation

Methylation of lysine and arginine residues is common on proteins found in the nuclei of eukaryotes. S-adenosylmethionine is the methyl donor in the reaction. Arginines can be mono- or di-methylated. Lysines can be mono-, di-, or tri-methylated. (Fig. 3-47) These modifications do not change the charge, but they add hydrophobic bulk and eliminate H-bond donors. They typically affect protein-protein interactions among proteins involved in regulating transcription.

The first lysine demethylase was discovered only recently.

Shi Y, Lan F, Matson C, Mulligan P, Whetstine JR, Cole PA, Casero RA, Shi Y.
Histone demethylation mediated by the nuclear amine oxidase homolog LSD1.
Cell. 2004 Dec 29;119(7):941-53.

N-acetylation

The N-termini of many proteins are acetylated. This is usually permanent, and provides protection from proteases. The epsilon-amino groups of lysine residues can also be acetylated, and these are usually deacetylated by other enzymes. For example, in the case of histones, histone acetyltransferases add acetyl groups and histone deacetylases remove them (Fig. 3-48).

Sumoylation

SUMO (Small Ubiquitin-like MOdifier) is a small protein. It is covalently attached to other proteins via the ε-amino group of a lysine residue, similar to ubiquitination (Fig. 3-49). The lysine is usually found within a consensus sequence: ψ-K-X-E, where ψ is a large hydrophobic residue and X is any amino acid. (Jmol)
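A consensus sequence of this kind can be searched for with a regular expression. The sketch below is illustrative only: the notes do not say which residues count as "large hydrophobic", so the set I, L, V, M, F is an assumption, and the function name and test sequence are made up.

```python
import re

# Assumption: psi ("large hydrophobic") is taken here as I, L, V, M or F.
PSI = "ILVMF"

# A lookahead is used so that overlapping motifs are all reported.
MOTIF = re.compile(rf"(?=([{PSI}]K[A-Z]E))")


def find_sumo_sites(sequence):
    """Return (1-based lysine position, matched motif) for each psi-K-X-E."""
    seq = sequence.upper()
    # m.start() is the 0-based index of psi; the lysine is the next residue.
    return [(m.start() + 2, m.group(1)) for m in MOTIF.finditer(seq)]


# Made-up sequence containing two matches (LKAE, VKTE) and one
# near-miss (AKQE, where A is not in the assumed psi set).
print(find_sumo_sites("MSDLKAEGGVKTEPAKQE"))
```

A match only marks a candidate lysine; it does not show that the site is modified.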

Nitrosylation

The use of nitric oxide (NO) as a protein modifier is widespread throughout living organisms. NO is a gaseous molecule produced by the enzyme nitric oxide synthase. It reacts with cysteine to form nitrosylated cysteine (Fig. 4-50). NO also reacts with metal ions, replacing ligands. How these reactions are reversed is not understood.

Prions

Prion diseases such as bovine spongiform encephalopathy (BSE) and Creutzfeldt-Jakob disease (CJD) are neurodegenerative diseases in which the prion protein is the primary, if not the only, infectious agent.

CJD can be (1) sporadic, (2) genetic, or (3) infectious. The prion protein (PrP-c) occurs in normal cells, on the outside surface of neurons, but a conformationally altered form called PrP-Sc (scrapie form) is thought to be the infectious agent. Scrapie is the name of the disease as originally discovered in sheep and goats. The altered form of PrP has been shown to be more resistant to protease digestion and to contain an increased amount of β-sheet. The structure of PrP-c from several mammals has been determined by NMR. PrP-Sc forms insoluble fibers and is therefore difficult to study by NMR. A monoclonal antibody has been found that is specific for PrP-Sc. Crystal structures of the truncated, globular C-terminal portion of PrP have been obtained. The N-terminus contains copper ion binding sites.

PrP: about 250 amino acids

  1. (Jmol) The original NMR structure of a segment of the mouse PrP
  2. (Jmol) A domain-swapped dimeric human PrP indicates a possible mechanism of oligomerization.
  3. (Jmol) Possible differences between PrP and its infectious form PrP-Sc

Short review:

Weissmann C
Molecular genetics of transmissible spongiform encephalopathies.
J Biol Chem 1999 Jan 1;274(1):3-6

Korth C, Stierli B, Streit P, Moser M, Schaller O, Fischer R, Schulz-Schaeffer W, Kretzschmar H, Raeber A, Braun U, Ehrensperger F, Hornemann S, Glockshuber R, Riek R, Billeter M, Wuthrich K, Oesch B
Prion (PrPSc)-specific epitope defined by a monoclonal antibody.
Nature 1997 Nov 6;390(6655):74-7

Alonso DO, DeArmond SJ, Cohen FE, Daggett V
Mapping the early steps in the pH-induced conformational conversion of the prion protein.
Proc Natl Acad Sci U S A 2001 Mar 13;98(6):2985-9

Knaus KJ, Morillas M, Swietnicki W, Malone M, Surewicz WK, Yee VC.
Crystal structure of the human prion protein reveals a mechanism for oligomerization.
Nat Struct Biol. 2001 Sep;8(9):770-4.

Copper binding

Viles JH, Cohen FE, Prusiner SB, Goodin DB, Wright PE, Dyson HJ
Copper binding to the prion protein: structural implications of four identical cooperative binding sites.
Proc Natl Acad Sci U S A 1999 Mar 2;96(5):2042-7

Whittal RM, Ball HL, Cohen FE, Burlingame AL, Prusiner SB, Baldwin MA
Copper binding to octarepeat peptides of the prion protein monitored by mass spectrometry
Protein Sci 2000 Feb;9(2):332-43

Yeast prions are an example of amyloid proteins (Fig. 4-53)

True HL, Lindquist SL
A yeast prion provides a mechanism for genetic variation and phenotypic diversity.
Nature 2000 Sep 28;407(6803):477-83

Balbirnie M, Grothe R, Eisenberg DS
An amyloid-forming peptide from the yeast prion Sup35 reveals a dehydrated beta-sheet structure for amyloid
Proc Natl Acad Sci U S A 2001 Feb 27;98(5):2375-2380

Serpins

Serpins are another class of proteins that can be considered metastable. Serpins are proteins that irreversibly inhibit serine proteases. For example, antithrombin is a serpin that inhibits thrombin, part of the coagulation pathway.

The action of a serpin was discovered in the case of trypsin. A loop in α1-antitrypsin reacts covalently with the catalytic serine in trypsin. This adduct is essentially the acyl-enzyme intermediate. Trypsin is unable to carry the reaction forward, because this first step triggers a conformational change in the antitrypsin. The loop is converted into a β-strand and is inserted into a β-sheet of the antitrypsin. Trypsin, covalently attached, is propelled from one end of the antitrypsin to the other. Trypsin is partly unfolded, becomes completely inactivated, and is sensitive to proteases.

This has been called "Inhibition by Deformation". (Fig. 4-54) (Jmol)

Prediction of Tertiary Structure

In general there are three approaches:

a) Try to model an amino acid sequence by homology (homology modeling)

b) Try to find compatibility to known structures (threading)

c) Try to fold an amino acid sequence based on physical principles (ab initio or de novo)

Identification of the topological fold is often the goal.

Sometimes these approaches are combined.

A) Modeling Approach

Look for sequences that have >30% sequence identity with a protein of known structure

(Sequences of 15-30% identity can be attempted)
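The identity thresholds above can be applied to a pairwise alignment as follows. This is a minimal sketch: the function names are hypothetical, and identity is computed over aligned columns that contain no gap, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the two strings come from an existing pairwise alignment
    and therefore have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def modeling_outlook(identity):
    """Map percent identity onto the rule of thumb given in the notes."""
    if identity > 30:
        return "homology modeling"
    if identity >= 15:
        return "homology modeling can be attempted"
    return "consider threading or ab initio methods"


# Toy alignment: 3 identical residues out of 4 aligned columns.
identity = percent_identity("ACDE", "ACDF")
print(identity, modeling_outlook(identity))
```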

Basic principles:

1) Buried amino acid residues are hydrophobic

2) Surface amino acid residues are polar

3) Within a family of homologous proteins, buried and active site residues are conserved.

4) Within a family of homologous proteins, surface residues are variable.

5) Elements of secondary structure will be more highly conserved than amino acid sequence.

3 steps in the procedure

1) Sequence alignment

2) Build sequence into secondary structure

3) Energy minimize to improve tertiary structure

Swiss-Model: The ExPASy server will build a model of a protein that is at least 30% identical to a protein in the Protein Data Bank. Click "First Approach Mode" at the left.

B) Threading

If no homologous protein can be identified by sequence comparison, the amino acid sequence of the target can instead be tested for compatibility with representations of all known folds (templates).

This is called threading. (Fig. 4-25)

Example:

J Mol Biol 1997 Apr 11;267(4):1026-38 
A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence.
Rice DW, Eisenberg D

They have derived a scoring matrix from a database of 119 pairs of proteins of known structure with the same fold, but with <30% sequence identity.

882 elements = 7 x 3 x 2 x 7 x 3

7 classes of amino acids (Cys; Trp; Arg,Lys; Tyr,Phe; Ile,Leu,Val,Met; Ala,Gly,Ser,Thr,Pro; Asp,Glu,Asn,Gln,His)

3 types of secondary structure (helix, sheet, turn)

2 locations (buried, exposed)

and in the target sequence 7 classes of a.a. and 3 types of secondary structure (from PredictProtein)
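The size of the matrix follows from enumerating every combination of the five categories. The sketch below only reproduces the counting; the variable and function names are made up, and no actual scores from the paper are included.

```python
from itertools import product

# The 7 amino acid classes listed above, in one-letter code.
AA_CLASSES = ["C", "W", "RK", "YF", "ILVM", "AGSTP", "DENQH"]
SS_TYPES = ["helix", "sheet", "turn"]
LOCATIONS = ["buried", "exposed"]

# Template side: amino acid class, secondary structure, location.
# Target side: amino acid class, predicted secondary structure.
keys = list(product(AA_CLASSES, SS_TYPES, LOCATIONS, AA_CLASSES, SS_TYPES))
print(len(keys))  # 7 * 3 * 2 * 7 * 3 = 882


def aa_class(residue):
    """Return the class that a one-letter residue code belongs to."""
    for cls in AA_CLASSES:
        if residue.upper() in cls:
            return cls
    raise ValueError(f"unknown residue: {residue!r}")


print(aa_class("K"))  # RK
```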

First: obtain secondary structure prediction from PredictProtein.

Second: calculate the score for each of the 119 folds.

Example:

The highest score is for a Trp, predicted to be in a helix, that matches a buried Trp in a helix: score = 4.5

A basic residue, predicted to be in a sheet, that matches an exposed basic residue in a sheet: score = 2.3

The same basic residue matched to an exposed basic residue in a helix: score = -9 (lowest score)
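Scoring a target against one fold then amounts to summing matrix entries over the aligned positions. The sketch below is a simplification under stated assumptions: it uses an ungapped, position-by-position alignment, it contains only the three example scores quoted above, and it gives every other combination a placeholder score of 0. The names are hypothetical and this is not the procedure from the paper.

```python
# Key: (template class, template structure, template location,
#       target class, predicted target structure).
# Only the three scores quoted in the notes are filled in; the real
# matrix has 882 entries.
EXAMPLE_SCORES = {
    ("W", "helix", "buried", "W", "helix"): 4.5,
    ("RK", "sheet", "exposed", "RK", "sheet"): 2.3,
    ("RK", "helix", "exposed", "RK", "sheet"): -9.0,
}


def threading_score(template, target, scores, default=0.0):
    """Sum matrix entries over aligned positions.

    template: list of (aa class, secondary structure, location) tuples
    target:   list of (aa class, predicted secondary structure) tuples
    Assumption: positions are already aligned one-to-one (no gaps), and
    combinations missing from `scores` contribute `default`.
    """
    total = 0.0
    for tpl, tgt in zip(template, target):
        total += scores.get(tpl + tgt, default)
    return total


target = [("W", "helix"), ("RK", "sheet")]
good_fold = [("W", "helix", "buried"), ("RK", "sheet", "exposed")]
poor_fold = [("W", "helix", "buried"), ("RK", "helix", "exposed")]

print(threading_score(good_fold, target, EXAMPLE_SCORES))  # 4.5 + 2.3
print(threading_score(poor_fold, target, EXAMPLE_SCORES))  # 4.5 - 9
```

The fold with the highest total would be taken as the most compatible template.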

Threading by PsiPred using Genthreader

Jones DT:
GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences.
J Mol Biol 1999, 287: 797–815.

Prospector server by Jeffrey Skolnick

C) Ab initio approach (physical principles)

This approach can work even in the absence of homology to known structures, but overall the reliability is low.

LINUS is one example: Local Independently Nucleated Units of Structure

50 amino acids are folded at a time, in an overlapping fashion: 1-50, 26-75, ...
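The overlapping segments can be generated as shown below. This is a sketch of the windowing only, not of LINUS itself: the step of 25 residues is inferred from the "1-50, 26-75" example, and truncating the last window at the end of the chain is an assumption.

```python
def overlapping_windows(length, size=50, step=25):
    """Return 1-based inclusive (start, end) windows covering a chain.

    Assumptions: the step of 25 is inferred from "1-50, 26-75", and the
    final window is simply cut off at the last residue.
    """
    if length <= 0:
        return []
    windows = []
    start = 1
    while True:
        end = min(start + size - 1, length)
        windows.append((start, end))
        if end == length:
            break
        start += step
    return windows


# A hypothetical 120-residue chain.
print(overlapping_windows(120))
# [(1, 50), (26, 75), (51, 100), (76, 120)]
```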

It is based on the idea that actual proteins fold by forming local secondary structure first.

Side chains are simplified. Only 3 interactions are used:

1 repulsive: steric

2 attractive: H-bonds and hydrophobic

All possibilities are then calculated in a search for the lowest free energy.

Proteins 1995 Jun;22(2):81-99
LINUS: a hierarchic procedure to predict the fold of a protein.
Srinivasan R, Rose GD

Proc. Natl. Acad. Sci. USA Vol. 96, Issue 25, 14258-14263, December 7, 1999
A physical basis for protein secondary structure
Rajgopal Srinivasan and George D. Rose

Srinivasan R, Rose GD.
Ab initio prediction of protein structure using LINUS.
Proteins. 2002 Jun 1;47(4):489-95.

Example prediction by LINUS

Touchstone II : no server is available
Zhang Y, Kolinski A, Skolnick J.
TOUCHSTONE II: a new approach to ab initio protein structure prediction.
Biophys J. 2003 Aug;85(2):1145-64.

De novo prediction: Rosetta by David Baker

Bonneau R, Strauss CE, Rohl CA, Chivian D, Bradley P, Malmstrom L, Robertson T, Baker D.
De novo prediction of three-dimensional structures for major protein families.
J Mol Biol. 2002 Sep 6;322(1):65-78.

Fully automated 3-d structure prediction: Examples Fig. 4-26, Fig. 4-27

Bystroff C, Shao Y.
Fully automated ab initio protein structure prediction using I-SITES, HMMSTR and ROSETTA.
Bioinformatics. 2002 Jul;18 Suppl 1:S54-61

Server prediction: Robetta

Kim DE, Chivian D, Baker D.
Protein structure prediction and analysis using the Robetta server.
Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W526-31.

Robetta = Robot Rosetta

These sites offer www access to predictions of tertiary structure:

3D-PSSM Imperial Cancer Research Fund, London (Now called Phyre)

Fugue University of Cambridge (other server)

SAM-T02 UC Santa Cruz

CAFASP2: Critical Assessment of Fully Automated Structure Prediction (CAFASP3) (CAFASP4)

Meta Prediction Server

CASP4 analysis:
Samudrala R, Levitt M.
A comprehensive analysis of 40 blind protein structure predictions.
BMC Struct Biol. 2002 Aug 1;2(1):3.

Fischer D, Barret C, Bryson K, Elofsson A, Godzik A, Jones D, Karplus KJ, Kelley LA, MacCallum RM, Pawlowski K et al.: CAFASP-1: critical assessment of fully automated structure prediction methods.
Proteins 1999, S3: 209–217


Comments/questions: svik@mail.smu.edu

Copyright 2007, Steven B. Vik, Southern Methodist University

Last modified 4/16/07