5 March 2007
Biol 6312
Protein Purification/X-Ray Crystallography/NMR
Protein Purification
To determine the 3-dimensional structure of a protein, it is essential to purify it to homogeneity. Many other types of analysis that are commonly applied to proteins also benefit from, or require, highly purified preparations.
Requirements for success in Protein Purification
There is often a trade-off between yield and purity.
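The trade-off is usually tracked with a purification table. The sketch below assumes the conventional bookkeeping (specific activity = activity / protein, yield relative to the crude extract, fold purification relative to the crude specific activity). The function name and all numbers are made up for illustration.

```python
def purification_summary(steps):
    """Summarize a purification.

    steps: list of (name, total_protein_mg, total_activity_units);
    the first entry is taken to be the crude extract.
    """
    _, protein0, activity0 = steps[0]
    base_specific = activity0 / protein0
    rows = []
    for name, protein_mg, activity_units in steps:
        specific = activity_units / protein_mg  # units per mg protein
        rows.append({
            "step": name,
            "specific_activity": specific,
            "yield_percent": 100.0 * activity_units / activity0,
            "fold_purification": specific / base_specific,
        })
    return rows


# Hypothetical numbers, chosen only to show yield falling as purity rises.
example_steps = [
    ("crude extract", 1000.0, 10000.0),
    ("ion exchange", 100.0, 8000.0),
    ("affinity column", 5.0, 5000.0),
]

for row in purification_summary(example_steps):
    print(row["step"], row["yield_percent"], row["fold_purification"])
```

In this made-up example the last step keeps only half of the starting activity but gives a 100-fold purification.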
Most proteins are not found in high abundance in any tissue, so expression of cloned genes is now commonly used for purification.
This also allows the incorporation of tags for purification or identification, the expression of domains of larger proteins, and the modification of proteins through mutagenesis. These steps are often important or even essential to carry out crystallization or many other studies.
Classical methods of protein purification rely upon differences among proteins in several properties:
Modern methods rely upon affinity tags (His, Strep, glutathione-S-transferase, maltose binding protein) or ligand binding (ATP), usually with chromatography.
X-Ray Crystallography
Usually considered the best method for atomic resolution structures of proteins.
Basic Procedure (Fig. 5-3)
What is a crystal?
What is a unit cell?
The smallest volume element that can build the entire crystal by simple translation. It may contain one or more protein molecules. (see Escher below)
The unit cell is marked by the green lines. It contains exactly 2 white creatures and 2 black creatures (proteins). In a protein crystal, the black creature would represent the water between the proteins.
Unit cells can be classified by size, shape and internal symmetry.
A unit cell is defined by 3 lengths, a, b, and c, and by 3 angles, α, β, and γ.
A triclinic unit cell has no symmetry, a≠b≠c, α≠β≠γ.
A cubic unit cell has the greatest symmetry, a=b=c, α=β=γ=90°.
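As a small illustration of how the six cell parameters are used, the sketch below computes the unit cell volume from the standard general (triclinic) formula. The function name and the example cell dimensions are assumptions for illustration.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell (in cubic Angstroms if a, b, c are in Angstroms).

    Uses the general triclinic formula
    V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2 cos(alpha) cos(beta) cos(gamma)).
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Hypothetical cubic cell: all edges 50 Angstroms, all angles 90 degrees.
print(unit_cell_volume(50.0, 50.0, 50.0, 90.0, 90.0, 90.0))

# Hypothetical triclinic cell: the same edges enclose a smaller volume.
print(unit_cell_volume(50.0, 50.0, 50.0, 80.0, 95.0, 100.0))
```

When all three angles are 90°, the formula reduces to V = abc.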
Additional symmetry may occur if 2 or more proteins are found in the unit cell.
The combinations of unit cell geometry and internal symmetry are called space groups.
Because proteins are chiral, protein crystals can adopt only 65 of the 230 possible space groups.
Theory of X-ray Diffraction
How do crystals and x-rays lead to atomic resolution structures?
X-rays are electromagnetic radiation, like visible light, but much higher energy, and shorter wavelength.
X-rays are scattered by the electrons of atoms, and their wavelength is comparable to atomic dimensions. In general, x-rays that are scattered by non-crystalline material will be moving in random directions. A crystal has a periodic arrangement of atoms, so it will diffract x-rays: some scattered x-rays are reinforced, and others average out to zero. This theory was developed by Sir W. L. Bragg.
In a crystal, atoms in each molecule will form parallel planes, due to the periodic arrangement. According to Bragg's Law
nλ = 2 d sin θ
where d is the distance between the planes
λ is the wavelength of the x-ray (n is an integer)
and θ is the angle between the incident ray and the planes (the scattered ray leaves the planes at the same angle).
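Bragg's Law can be rearranged to give either the diffraction angle for a known plane spacing or the spacing for a measured angle. The following is a minimal sketch. The function names are illustrative, and the wavelength of 1.5418 Å (copper Kα, a common laboratory source) is an assumption, not a value from these notes.

```python
import math


def bragg_angle_deg(d_angstrom, wavelength_angstrom, n=1):
    """Angle theta (degrees) satisfying n*lambda = 2*d*sin(theta)."""
    s = n * wavelength_angstrom / (2.0 * d_angstrom)
    if s > 1.0:
        raise ValueError("no diffraction possible: n*lambda exceeds 2d")
    return math.degrees(math.asin(s))


def plane_spacing(theta_deg, wavelength_angstrom, n=1):
    """Plane spacing d (Angstroms) from a measured angle theta."""
    return n * wavelength_angstrom / (2.0 * math.sin(math.radians(theta_deg)))


WAVELENGTH = 1.5418  # Angstroms; assumed Cu K-alpha source

# Smaller spacings (finer detail) diffract to larger angles.
for d in (10.0, 4.0, 2.0):
    print(d, round(bragg_angle_deg(d, WAVELENGTH), 2))
```

Because small spacings diffract to large angles, the spots farthest from the center of the diffraction pattern carry the highest-resolution information.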
The result of diffraction by a crystal is a pattern of scattered x-rays. These were formerly collected on film, but now electronic detectors are used. The locations and intensities of all spots are measured.
The pattern of spots is indicative of the crystal's space group.
The resolution of the resulting model is related to the size of the diffraction pattern (the number of spots). This can be determined before the data are analyzed.
The phase problem:
Information is lost, similar to deciphering an object by examining its shadow.
The data collected from the diffraction pattern are used to derive structure factors, F. The electron density is related to the structure factors by a Fourier transformation. But the structure factors are not completely determined by the diffraction pattern. They are vectors (complex numbers), whose magnitudes can be determined from the measured intensities, but not their phases (angles). This is referred to as "the phase problem", a fundamental difficulty in protein crystallography. In small molecule crystallography, it is possible to guess the result from the diffraction pattern, and Linus Pauling, for example, was brilliant at that.
Proteins have too many atoms, and they have too many possible conformations.
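The loss of information can be seen in a one-dimensional toy model. The sketch below assumes a made-up "electron density" on 8 grid points and a plain discrete Fourier transform. It is only an analogy for the real three-dimensional calculation, and all names are illustrative.

```python
import cmath
import math


def structure_factors(density):
    """F(h) = sum over x of rho(x) * exp(2*pi*i*h*x/N) for a 1-D density."""
    n = len(density)
    return [
        sum(rho * cmath.exp(2j * math.pi * h * x / n) for x, rho in enumerate(density))
        for h in range(n)
    ]


def density_from_factors(factors):
    """Inverse transform: rho(x) = (1/N) * sum over h of F(h) * exp(-2*pi*i*h*x/N)."""
    n = len(factors)
    return [
        (sum(f * cmath.exp(-2j * math.pi * h * x / n) for h, f in enumerate(factors)) / n).real
        for x in range(n)
    ]


density = [0, 0, 5, 1, 0, 0, 2, 0]  # made-up 1-D "electron density"
F = structure_factors(density)

# A diffraction experiment gives only the magnitudes |F|, not the phases.
amplitudes = [abs(f) for f in F]

with_phases = density_from_factors(F)              # recovers the density
without_phases = density_from_factors(amplitudes)  # all phases set to zero

print([round(v, 2) for v in with_phases])
print([round(v, 2) for v in without_phases])
```

With the phases included the original density comes back. With the magnitudes alone the result no longer resembles it, which is why the phases have to be obtained some other way.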
The phase problem was first solved by Max Perutz, working in Cambridge. He developed the multiple isomorphous replacement (MIR) method. He was able to grow crystals of hemoglobin with Hg salt that were isomorphous with those lacking Hg. Since Hg is so electron dense, it has a major impact on the diffraction pattern. By comparing diffraction patterns from several such crystal forms, he was able to solve the problem. Other heavy metal ions also work. This was the standard method for many years. (Interviews with Max Perutz) (1914-2002)
Other solutions are now available:
1. Molecular replacement: use the phases from another solved structure that is sufficiently similar (~50% identical)
2. MAD (multiple wavelength anomalous dispersion)
This requires an x-ray source with a tunable wavelength, such as a synchrotron, and the protein must have some larger atoms. This is usually accomplished by expressing the protein in E. coli grown with Se-Met (methionine with sulfur replaced by selenium).
Read RJ. As MAD as can be. Structure. 1996 Jan 15;4(1):11-4. Review.
Build the Structure
Fit the polypeptide chain to the electron density. There may be discontinuities because of disordered loops, etc. Unless the crystals are very highly ordered, it is likely that Carbon, Nitrogen and Oxygen atoms will all look the same, and Hydrogens will be unresolved.
Refinement
Most structures are refined to improve the quality. So, take the derived structural model and generate the expected diffraction pattern. Now, adjust the model to minimize the difference between the observed and calculated diffraction patterns. This is refinement.
Evaluation of a Structure
1) Resolution---refers to the minimum distance between 2 atoms at which they are still seen to be distinct. It is usually measured in Å. Lower number is higher resolution.
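2) R-factor----A test of the model's accuracy
R = Σ | |Fobs| - |Fcalc| | / Σ |Fobs|
The F's are the structure factors (observed, and calculated from the model), and the sums run over all measured reflections. For a perfect model R=0: the model generates exactly the observed diffraction pattern. A random structure will have an R=.59 or 59%.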
An R of 15 or 20% is considered good for a protein crystal. 25% might be OK.
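The R-factor calculation can be sketched in a few lines, assuming the conventional definition (the sum of the differences between observed and calculated amplitudes, divided by the sum of the observed amplitudes). The function name and the amplitudes are made up for illustration.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over all reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("need one calculated amplitude per observed amplitude")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical structure factor amplitudes for four reflections.
observed = [100.0, 50.0, 25.0, 25.0]
calculated = [90.0, 55.0, 30.0, 20.0]

print(r_factor(observed, calculated))  # 0.125, i.e. 12.5%
print(r_factor(observed, observed))    # 0.0 for a model that matches exactly
```

Refinement amounts to adjusting the model so that this number goes down.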
NMR-Nuclear Magnetic Resonance
This is a kind of spectroscopy related to a property of certain atomic nuclei, called spin. It requires nuclei with nonzero spin.
In the presence of a magnetic field, the 2 spin states of a spin-1/2 nucleus take on different energies (they are split).
The transition from one state to the other is a "resonance".
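The size of the splitting grows in proportion to the magnetic field, and the resonance frequency is the splitting divided by Planck's constant. The sketch below assumes approximate literature values of the gyromagnetic ratio (divided by 2π) for 1H and 13C. These constants and the function names are not from these notes.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz per tesla (assumed values).
GAMMA_OVER_2PI_MHZ_PER_T = {"1H": 42.577, "13C": 10.708}

PLANCK_J_S = 6.62607015e-34  # Planck's constant in J*s


def resonance_frequency_mhz(nucleus, field_tesla):
    """Resonance (Larmor) frequency in MHz: nu = (gamma / 2*pi) * B0."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * field_tesla


def splitting_energy_joule(nucleus, field_tesla):
    """Energy gap between the two spin states: delta E = h * nu."""
    return PLANCK_J_S * resonance_frequency_mhz(nucleus, field_tesla) * 1e6


# An 11.74 T magnet is what is usually called a "500 MHz" spectrometer.
for nucleus in ("1H", "13C"):
    print(nucleus, round(resonance_frequency_mhz(nucleus, 11.74), 1), "MHz")
```

The energy gap is tiny compared with thermal energy at room temperature, which is one reason NMR needs concentrated samples.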
Which atoms have nonzero spin?
| Element | Protons | Neutrons | Natural Abundance | Spin | Sensitivity (Relative) |
| 1H | 1 | 0 | 99.98% | 1/2 | 1.0 |
| 2H (D) | 1 | 1 | 0.016% | 1 | 0.0096 |
| 12C | 6 | 6 | 98.8% | 0 | - |
| 13C | 6 | 7 | 1.11% | 1/2 | 0.016 |
| 14N | 7 | 7 | 99+% | 1 | - |
| 15N | 7 | 8 | 0.37% | 1/2 | - |
| 16O | 8 | 8 | 99.76% | 0 | - |
| 19F | 9 | 10 | 100% | 1/2 | 0.834 |
| 31P | 15 | 16 | 100% | 1/2 | 0.066 |
1H occurs naturally in proteins, and 13C, to a small extent (1%).
13C and 15N can be introduced during protein expression (e.g. in bacteria).
Not so important for protein structure determination, but still of interest:
19F can be added to proteins as a part of a ligand, as a reporter group
31P is found in many biological molecules, ATP, phospholipids, etc.
In principle, macromolecular structure can be determined from an NMR spectrum because the NMR signal is highly sensitive to the environment of the atom. So, the value of the resonance can be shifted up or down in energy. The value of the resonance is called the "chemical shift".
You might be familiar with the proton resonances in the ethyl group: -CH2-CH3
In the -CH2- group there are 3 spin states: ↑↑ ↑↓ ↓↓
So the -CH3 protons see 3 different environments and have their resonances split into 3. There are 2 ways to make the middle state, so there is twice as much of that one.
Likewise, the -CH3 group has 4 spin states: ↑↑↑ ↑↑↓ ↑↓↓ ↓↓↓, so the -CH2- resonance is split into 4, with relative intensities 1:3:3:1.
Spectrum of ethyl group
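The counting argument for the ethyl group generalizes: n equivalent spin-1/2 neighbors give n+1 lines, with intensities equal to the number of ways of making each spin state (the binomial coefficients). A minimal sketch, with an illustrative function name:

```python
from math import comb


def multiplet_pattern(n_neighbors):
    """Relative line intensities for a resonance split by n equivalent spin-1/2 neighbors."""
    return [comb(n_neighbors, k) for k in range(n_neighbors + 1)]


# -CH3 protons are split by the 2 protons of -CH2-: a triplet.
print(multiplet_pattern(2))  # [1, 2, 1]

# -CH2- protons are split by the 3 protons of -CH3: a quartet.
print(multiplet_pattern(3))  # [1, 3, 3, 1]
```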
The spin-spin couplings described above are very short range interactions, and are the strongest influence on the splitting of the resonances. These are called "through bond" interactions. Using more sophisticated NMR techniques one can measure "through space" interactions. For example, in an alpha-helix, residues 1 and 5 are close in space (but not through bonds), and can influence each other's resonances.
The difficulty with a protein, as compared to a small organic molecule, is that there are so many Hydrogen atoms that the signals are on top of each other.
NMR experiment (Fig. 5-4)
The first thing to do is to try to identify the resonance associated with each amino acid. There are trends in chemical shift values:
| Groups | Chemical shifts |
| -CH3 (L,I,V,A,T) | 0.9-1.5 ppm |
| -CH2- | 1.5-3.5 |
| Cα-H | 3.5-5.5 |
| Ar-H (W,F,Y) | 6.4-7.4 |
| -N-H (amide) | 7.0-9.0 |
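The table can be used as a first-pass lookup. The sketch below simply encodes the ranges from the table and returns every group whose range contains a given shift. The names are illustrative, and a real assignment needs much more than this.

```python
# Proton chemical shift ranges (ppm), copied from the table above.
SHIFT_RANGES_PPM = [
    ("-CH3 (L,I,V,A,T)", 0.9, 1.5),
    ("-CH2-", 1.5, 3.5),
    ("Calpha-H", 3.5, 5.5),
    ("Ar-H (W,F,Y)", 6.4, 7.4),
    ("-N-H (amide)", 7.0, 9.0),
]


def candidate_groups(shift_ppm):
    """Return all groups whose typical range includes the given shift."""
    return [name for name, low, high in SHIFT_RANGES_PPM if low <= shift_ppm <= high]


print(candidate_groups(1.2))  # ['-CH3 (L,I,V,A,T)']
print(candidate_groups(7.2))  # ['Ar-H (W,F,Y)', '-N-H (amide)']
print(candidate_groups(6.0))  # []
```

The aromatic and amide ranges overlap, so a shift alone does not always identify the group.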
To associate observed resonances with amino acids, it is essential to know the amino acid sequence.
The key to determining the structure of the protein is to measure as many proton-proton distances as possible. This is done using 2-D NMR techniques, which allow one to determine which pairs of resonances are close enough to interact.
There are 2 common methods:
COSY (Correlation Spectroscopy) These are couplings through bonds.
NOESY (Nuclear Overhauser effect spectroscopy) These are couplings through space.
One must collect a large number of distances, which are then used as constraints to build a structural model. This is referred to as distance geometry. Often, 10 or 20 different models that are compatible with the constraints will be published. They should all be similar, or confidence in any of the models will be low. (Fig. 5-2)
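Checking whether a candidate model is compatible with the constraints can be sketched as follows. It assumes each constraint is an upper bound on the distance between two named protons. The atom names, coordinates, and the 5 Å bound are hypothetical.

```python
import math


def violated_restraints(coords, restraints, tolerance=0.0):
    """List the distance restraints that a model does not satisfy.

    coords: dict mapping atom name -> (x, y, z) in Angstroms
    restraints: list of (atom_i, atom_j, upper_bound_angstrom)
    """
    violations = []
    for atom_i, atom_j, upper in restraints:
        distance = math.dist(coords[atom_i], coords[atom_j])
        if distance > upper + tolerance:
            violations.append((atom_i, atom_j, distance, upper))
    return violations


# Hypothetical model with three protons and two upper-bound restraints.
model = {"H1": (0.0, 0.0, 0.0), "H2": (3.0, 0.0, 0.0), "H3": (0.0, 8.0, 0.0)}
restraints = [("H1", "H2", 5.0), ("H1", "H3", 5.0)]

for atom_i, atom_j, distance, upper in violated_restraints(model, restraints):
    print(atom_i, atom_j, "distance", distance, "exceeds", upper)
```

A model is accepted only if few or no restraints are violated. Structure calculation programs search for coordinates that satisfy as many as possible.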
An NMR structural model is evaluated by the root mean square deviation (RMSD) among the group of models determined. A good structure has a value of about 0.7 Å. Up to 1.0 Å is considered OK.
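The RMSD of a group of models can be sketched as the average over all pairs of models. This assumes the models have already been superimposed and list the same atoms in the same order. The superposition step is omitted, and the coordinates are made up.

```python
import math
from itertools import combinations


def rmsd(model_a, model_b):
    """Root mean square deviation between two lists of (x, y, z) coordinates."""
    if len(model_a) != len(model_b):
        raise ValueError("models must contain the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(model_a, model_b)
    )
    return math.sqrt(squared / len(model_a))


def mean_pairwise_rmsd(models):
    """Average RMSD over every pair of models in the group."""
    pairs = list(combinations(models, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical two-atom models, already superimposed.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 0.5), (1.0, 0.0, 0.5)],
]

print(rmsd(models[0], models[1]))
print(mean_pairwise_rmsd(models))
```

A small value means the models agree with each other. It does not by itself show that they are correct.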
What are the advantages of NMR analysis?
No crystals required (but high concentration is required, there can be solubility problems, or instability problems for long collection times)
The protein is in the soluble state (might be different in a crystal)
Disadvantages?
There is a size limitation (which keeps increasing, now about 30-50 kD, but new techniques will extend this). Larger proteins require C and N isotopes, and special techniques.
The resulting structural models are generally not considered high resolution.
The models often show regions of the protein that are not well-ordered, or lack interactions with other regions.
Copyright 2007, Steven B. Vik, Southern Methodist University