
Lecture 11 – Structural Transitions of Polypeptides and Proteins


Coil-Helix Transitions
 The transition between a random coil and a helix
structure:
– also called the ‘coil-helix’ transition…
– is an important component in protein folding pathways.
 The term, ‘random coil’:
– refers to a set of equivalent coil-like structures:
• each is unfolded, relative to a typical helical structure…
– α-helix, 3₁₀ helix, β-strand, etc.
 Focus: thermodynamic properties of these transitions.
– For simplicity, we begin with a homopolymer:
• a polypeptide of identical amino acids…
• and focus on the transition: random coil to α-helix.
– We investigate this coil-helix transition:
• using a statistical thermodynamic treatment: The Zipper Model.
The Nucleation of the α-helix
 The nucleation step in α-helix formation:
– involves formation of an H-bond between:
• Keto Oxygen of residue j.
• Amide Hydrogen of residue j+4.
– this requires the torsion angles to
assume mean values:
• φ = −57°, ψ = −47°.
• entropically unfavorable.
– This H-bond helps to stabilize
the helical structure.
 Energetic favorability, however:
– relies on isolation of the H-bond
from competition with water…
• as is the case in the folded protein’s interior;
• or in a non-polar solvent.
Model Polypeptide System
 As a result, many shorter oligo-polypeptides:
– form an α-helix only in organic solvent (e.g., octanol).
 Certain long polypeptides, however:
– can be induced to form an α-helix in aqueous solution…
 Classic Example (Zimm and Bragg, 1959):
– poly-[γ-benzyl-L-glutamate]
• in 80% dichloroacetic acid / 20% ethylene dichloride.
– undergoes a coil-to-helix transition when heated:
• the opposite of protein denaturation in aqueous solution…
• due to dichloroacetic acid’s ability to form strong H-bonds:
– with the amide nitrogens in the coil.
– Here, α-helix formation is thus endothermic.
• since $\Delta H^\circ > 0$, this process is entropy driven ($\Delta S^\circ > 0$)…
• consistent with release of bound solvent upon helix formation.
Estimating s
 In order to apply the Zipper model to a transition:
 s must be related to experimentally measured quantities:
• The midpoint of the melting transition:
– The ‘melting temperature’, $T_m$.
• $\Delta H^\circ$ for adding 1 helical unit onto a pre-existing helix:
– The enthalpy of helix growth, $\Delta H^\circ_g$.
 Method: Experimental determination of s
• s modeled as a micro-equilibrium constant of helix formation…

 In practice, $\Delta H^\circ_g$ is determined by comparing the $T_m$ values of polymers of 2 different lengths.
 This allows s to be modeled within the Zipper model.
– while σ is adjusted to yield the best fit.
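A minimal sketch of how s might be computed from $T_m$ and $\Delta H^\circ_g$. The functional form is an assumption, not taken from the lecture text: a van't Hoff relation with s = 1 at the transition midpoint, $\ln s = -(\Delta H^\circ_g/R)(1/T - 1/T_m)$. The melting temperature used below is a hypothetical placeholder.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def propagation_constant(temp_k, tm_k, dh_growth_kcal):
    """Assumed van't Hoff form with s = 1 at T = Tm (not stated in the source)."""
    return math.exp(-(dh_growth_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k))

TM_K = 300.0   # hypothetical melting temperature, for illustration only
DH_G = 0.89    # kcal/mol, the helix-growth enthalpy used in the lecture's fit

for t in (280.0, 300.0, 320.0):
    # With a positive growth enthalpy, s rises with temperature.
    print(t, round(propagation_constant(t, TM_K, DH_G), 4))
```

Under this assumed form, an endothermic growth step gives s < 1 below $T_m$ and s > 1 above it.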
Comparison with Experiment
 Points are experimental data (2 sets, Doty, et al.):
– determined by optical rotation.
– for several lengths:
• N = 26, 1500 residues.
 Curves: predicted values:
– Fraction of helical residues, $\Theta_h$.
• estimated by the Zipper model.
– s values computed using:
$\Delta H^\circ_g$ = 0.89 kcal/mol > 0.
• helix formation endothermic.
– thus, s increases with T.
– 2 fitted σ values shown:
• $\sigma = 1 \times 10^{-4}$ (dashed).
• $\sigma = 2 \times 10^{-4}$ (solid).

General Transition Characteristics
 Transition shows a large N-dependence.
– even though s and σ are length-independent.
 Cooperativity of the transition increases with N.
– as measured by the narrowness of the transition…
 Relative contributions of parameters also strongly
N-dependent
– In short polymers, nucleation (σ)
dominant
• initial formation unfavorable.
• Propagation strongly inhibited.
– At large N, propagation (s)
dominates…
• nucleation penalty distributed
over more residues…
• s quickly dominates as T increases.
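The N-dependence described above can be illustrated numerically. The sketch below assumes the standard single-helix-island form of the zipper partition function, $Z = 1 + \sum_{k=1}^{N}(N-k+1)\,\sigma s^k$, which is not written out in the lecture text; the function name and the s values scanned are illustrative choices.

```python
import math

def helix_fraction(n, s, sigma):
    """Fraction of helical residues, allowing at most one helical stretch."""
    # Log-weight of a stretch of k helical residues, with (n - k + 1) placements.
    logs = [math.log(n - k + 1) + math.log(sigma) + k * math.log(s)
            for k in range(1, n + 1)]
    shift = max(0.0, max(logs))  # the all-coil state has log-weight 0
    z = math.exp(-shift) + sum(math.exp(x - shift) for x in logs)
    helical = sum(k * math.exp(x - shift)
                  for k, x in zip(range(1, n + 1), logs))
    return helical / (n * z)

SIGMA = 2e-4  # one of the fitted values quoted in the lecture
for n in (26, 1500):
    row = [round(helix_fraction(n, s, SIGMA), 3) for s in (0.9, 1.0, 1.1, 1.3)]
    print(n, row)
```

The shift by the largest log-weight only guards against overflow for long chains; it cancels in the ratio.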
Validating the Size of σ
 Fitted Application of a Zipper model:
– predicts the coil to α-helix transition to be highly cooperative.
• fitted value, $\sigma \approx 10^{-4}$.
 This prediction can be separately validated as follows:
– statistical weight of nucleation = σs.
• This accounts for formation of 1 H-bond.
– s accounts for the balance between $\Delta H^\circ$ and $\Delta S^\circ$…
• for only 1 residue.
– σ accounts for the cooperativity of nucleation:
• nucleation restricts the angles of 4 residues:
– to values typical of an α-helix: (φ = −57°, ψ = −47°),
• the $\Delta S^\circ$ for only 1 of these 4 is included in s…
– thus, $\sigma = \exp[3\Delta S^\circ_{res}/R]$
– Net entropy change/residue:
$\Delta S^\circ_{res} = R \ln W_{helix} - R \ln W_{coil} = -R \ln 9 \approx -18$ J/(mol·K).
• Substitution yields the estimate, $\sigma \approx 1.5 \times 10^{-3}$.
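The arithmetic behind this estimate can be checked directly. The sketch below uses only the quantities stated above (a coil-to-helix conformational ratio of 9 and 3 extra restricted residues); the function name is illustrative.

```python
import math

R = 8.314  # gas constant in J/(mol*K)

def sigma_from_entropy(coil_to_helix_ratio=9.0, extra_residues=3):
    """Return (sigma, per-residue entropy change) for the nucleation estimate."""
    ds_res = -R * math.log(coil_to_helix_ratio)   # R ln W_helix - R ln W_coil
    sigma = math.exp(extra_residues * ds_res / R)
    return sigma, ds_res

sigma, ds_res = sigma_from_entropy()
print(round(ds_res, 2), sigma)
```

Without rounding, the entropy term is about −18.3 J/(mol·K) and σ = 9⁻³, about 1.4 × 10⁻³; rounding the entropy to −18 J/(mol·K) first gives the quoted 1.5 × 10⁻³.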
The Coil to 3₁₀-Helix Transition
 s and σ always depend on the nature of the transition.
 other transitions exhibit different s and σ values.
 Example: Sequences of type (AAAAK)ₙA.
 A, K = alanine, lysine, respectively.
 convert from 3₁₀ helices to α-helices…
• when n is increased from 3 to 4.
– i.e., with increasing total polymer length.
 This suggests that the 3₁₀ helix:
• is easier to initiate (from a coil) than an α-helix:
σ(3₁₀) > σ(α-helix)
• but, once initiated, the α-helix is more easily propagated:
s(α-helix) > s(3₁₀)
 The difference in σ is due to conformational entropies:
• nucleation of a 3₁₀ helix fixes the torsion angles of 1 fewer residue:
– α-helix: H-bond between residues j and j+4.
– 3₁₀ helix: H-bond between residues j and j+3.
Sequence Dependence of σ, s
 For regular α-helix formation (or melting), both σ
and s are sequence dependent.
– each type of residue characterized by a different s value.
– cooperativity, σ of α–helix formation will also vary:
• but with the mean residue content.
– Ex: In lysine-containing polymers, $\sigma \approx 10^{-3}$.
• nucleation 10x more favorable
– compared to poly-[γ-benzyl-L-glutamate].
 The impact of residue differences on α-helix
stability:
– studied using host-guest peptides.
• energetic variations due to a single, internal substituted residue are measured;
– essentially no variation in σ.
• emphasis: determination of variation in s.
Host-Guest Parameters
 Begin with a host α-helix (…yyyyyyy…):
– y = some residue stable as an α-helix.
• transition free energy/residue: $\Delta G^\circ(y)$.
– Replace 1 y with a guest residue (‘X’).
• yields sequence: …yyy-X-yyy…
– Measure $\delta\Delta G^\circ$ (kJ/mol) for all X values:
• then, $\Delta G^\circ(X) = \Delta G^\circ(y) + \delta\Delta G^\circ(X)$
 Values yield α-helix ‘propensities’.
– Here, shown normalized…
• relative to the $\delta\Delta G^\circ$ of Gly;
• since Gly lacks a Cβ (R = H).
– All but Pro more favorable than Gly.
• Pro is a strong α-helix breaker.
 Conversion from $\delta\Delta G^\circ(X)$ to s:
– assume σ ~ independent of ‘X’.
• then, $s = \exp[-\Delta G^\circ(X)/RT]$.
• Here, s = 1.0 means neutral favorability.
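A small sketch of the conversion from a host-guest free-energy shift to a propagation parameter. The numerical free energies below are hypothetical placeholders, not measured values; only the relations $\Delta G^\circ(X) = \Delta G^\circ(y) + \delta\Delta G^\circ(X)$ and $s = \exp[-\Delta G^\circ(X)/RT]$ come from the lecture.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)

def s_from_free_energy(dg_host_kj, ddg_guest_kj, temp_k=298.15):
    """Propagation parameter s for a guest residue X in a host helix."""
    dg_guest = dg_host_kj + ddg_guest_kj      # Delta G(X) = Delta G(y) + delta Delta G(X)
    return math.exp(-dg_guest / (R_KJ * temp_k))

# Hypothetical inputs, chosen only to show the direction of the effect.
print(s_from_free_energy(0.0, 0.0))    # neutral: s = 1
print(s_from_free_energy(0.0, -1.0))   # stabilizing guest: s > 1
print(s_from_free_energy(0.0, 2.0))    # destabilizing guest: s < 1
```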
Modeling Melting Initiation
 Consider an N-residue polypeptide:
– in the fully helical conformation: hhh…hhh
• weight: $\omega_N = \sigma s^N$.
 Melting of the Helix:
– can occur by 2 fundamentally different processes…
– melting of an end residue:
• 2 conformations: chh…hhh and hhh…hhc.
• total weight: $\omega_{N-1} = 2\sigma s^{N-1}$.
– melting a ‘middle’ residue…
• N−2 conformations, of the form: …hhhchhh…
– each has 2 ‘helix-islands’…
• total weight: $\omega_{N-1} = (N-2)\sigma^2 s^{N-1}$.
 More generally, a conformation with j helix-islands:
– will contain j factors of σ.
– this motivates our Zipper model.
End vs. Middle Melting
 The relative probabilities of initiating melting:
– at the end vs. the middle…
• estimated by a ratio of statistical weights:
$P_e/P_m = 2\sigma s^{N-1} / [(N-2)\sigma^2 s^{N-1}] = 2/[(N-2)\sigma] \approx 2/(N\sigma)$.
– we have 2 opposing factors:
• N ≈ number of central melting points.
• σ = penalty for initiating melting at a given middle point.
 Assuming the typical experimental value ($\sigma = 2 \times 10^{-4}$):
– $P_e/P_m = 1$ when $N = 10^4$.
– For short helices ($N < 10^4$ residues),
• σ dominates…
• transition initiates at the ends.
– For long helices ($N > 10^4$ residues),
• N overcomes σ…
• denaturation may then proceed from the middle.
– These trends are observed experimentally in globular proteins.
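The crossover length follows from setting the ratio of statistical weights to 1. A minimal sketch, using the exact (N − 2) count of middle positions rather than the large-N approximation; the function names are illustrative.

```python
def end_to_middle_ratio(n, sigma):
    """P_e / P_m for melting one residue of a fully helical N-mer."""
    # The common factor sigma * s**(N-1) cancels between the two weights.
    return 2.0 / ((n - 2) * sigma)

def crossover_length(sigma):
    """Chain length at which end and middle melting are equally likely."""
    return 2.0 / sigma + 2.0

SIGMA = 2e-4
print(crossover_length(SIGMA))            # close to 10**4
print(end_to_middle_ratio(100, SIGMA))    # short helix: ends favored
print(end_to_middle_ratio(10**5, SIGMA))  # long helix: middle favored
```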
Predicting Protein Structure
 Given sets of (s,σ) values for all 20 amino acids…
– for formation of all types of 2° structures:
• coil to α-helix, β-strand, or 3₁₀ helix, etc.
 We should be able to apply a Zipper model:
– to predict the probability of adopting each type of structure…
• during folding and melting of globular proteins.
 Added complication:
– $s = \exp(-\Delta G^\circ/RT)$ also depends on external factors.
• e.g., $\Delta G^\circ$ varies with residue environment.
– Globular proteins offer 2 distinct environments:
• hydrophobic: buried residues in the protein interior.
• hydrophilic: solvent accessible residues at the protein surface.
– meaningful assignment of an s value to each residue j:
• demands knowledge of whether j is buried, in each context.
Chou and Fasman
 Statistical Thermodynamics not routinely used to
model protein folding.
– however, many statistical methods for predicting 2° structure have been developed.
• these incorporate many of the essential features of the Zipper model.
 The Method of Chou and Fasman (1974):
– begins with an empirical set of residue parameters.
– defined not by measured transition energies ($\Delta G^\circ$),
• but by the statistical tendency of each residue to form each
type of structure…
• as determined from the mole fractions present in actual
protein crystals.
– first parameters used data from 64 different proteins.
Chou and Fasman Parameters
 For each type of amino acid…
– three types of parameters are computed:
• <Pα> = propensity to form an α-helix.
• <Pβ> = propensity to form a β-sheet.
• <PT> = propensity to turn (adopt a coil).
 Example: Determination of <Pα> values.
– First, a mole fraction is computed for each residue type, i:
• χα(i) = occurrences of i in an α-helix / occurrences of i in the data set.
– Secondly, an average alpha-helical amino acid is defined:
• with an average value of χα(i):
<χα> = Σ_i χα(i) / 20
– Parameter i then defined as a relative tendency:
<Pα(i)> = χα(i) / <χα>
• i.e., each mole fraction normalized by the average over all 20 types.
 Repeated for each type of 2° structure.
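The two-step normalization above can be sketched as follows. The residue counts are hypothetical toy numbers, and the average is taken over however many residue types are supplied (20 in the real parameter set).

```python
def helix_propensities(helix_counts, total_counts):
    """Relative helix tendency per residue type: chi(i) / <chi>."""
    chi = {aa: helix_counts.get(aa, 0) / total_counts[aa] for aa in total_counts}
    mean_chi = sum(chi.values()) / len(chi)   # divides by 20 for a full data set
    return {aa: chi[aa] / mean_chi for aa in chi}

# Hypothetical counts for three residue types, for illustration only.
toy_helix = {"A": 60, "G": 20, "P": 10}
toy_total = {"A": 100, "G": 100, "P": 100}
toy_p = helix_propensities(toy_helix, toy_total)
print({aa: round(value, 3) for aa, value in toy_p.items()})
```

By construction the propensities average to 1, so values above 1 mark residue types found in helices more often than average.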
The Favorability of Propagation
 Parameters <Pα> and <Pβ>:
– correspond to the mean propagation terms, <s>…
• in their respective Zipper models;
– averaged over solvent conditions.
– Ex.: <Pα> corresponds conceptually to <s>…
• in the Zipper model of the coil to α-helix transition.
 Qualitative propensity also assigned to each residue,
• relative to each type of structure.
– e.g., relative to α-helix formation, residues categorized as:
• Strong Helix Formers (H)
• Average Helix Formers (h)
• Weak Helix Formers (I)
• Indifferent (i)
• Weak Helix Breakers (b)
• Strong Helix Breakers (B)
 Again, this is repeated for each type of 2° structure.
Chou-Fasman Parameter Set
 Comparison w/ Host-Guest Parameters:
– relative favorabilities:
• general agreement.
• differences in the ordering.
– values play the role of propagation terms, s.
 Glycine:
– low <Pα>, but high <PT>.
– great conformational freedom.
– 3rd residue of a Type II turn.
 Proline:
– low <Pα> and <Pβ>.
– due to restricted torsion angles.
– α-helix, β-sheet breaker.
The Cooperativity of Nucleation
 The cooperativity of 2° structure formation:
– i.e., the statistical unlikelihood of nucleation.
– is also included in the Chou-Fasman model:
• but, implicitly, in the rules of region assignment:
– e.g., whether a sub-sequence is helix, sheet, or coil.
 Regions of 2° structure assigned by inspection:
– where any 2° structure requires a string of residues of similar propensity.
– Example: For an α-helix:
• initiation of a helix requires a contiguous set of helix formers:
– H, h, or I… with I given ½-weight.
– clearly modeling the cooperativity of helix nucleation.
• nucleated helices propagate through residues, H, h, I, and i.
• and terminate when two or more helix breakers are encountered.
– again, modeling the cooperativity of the process.
Example: Chou-Fasman Method
 Applied to the first 24 residues of Adenylate kinase.
– method predicts 2 structures:
• N-terminal string with α-helix forming tendency.
– mean weight: <Pα> = 1.39.
• 2nd string with both α-helix and β-sheet forming tendency.
– mean β-tendency higher: <Pβ> = 1.56.
– Experimentally, strings correspond to α-helix, β-sheet.
• A β-turn (specific coil) is also observed.
– predicted by a hydropathy-based modification by Rose (1978).
Example (cont.)
 Applied to the remainder of Adenylate kinase:
– And also compared with a 2nd method (Nagano).
– best results provided by a joint method:
• here, obtains ~ 70% accuracy.
Evaluating Accuracy
 The most widely used method:
– the overall, per-residue, 3-state accuracy (Q3):
$Q_3 = [(P_H + P_E + P_C)/N] \times 100\%$
• N = total number of residues.
• $P_X$ = number of correctly predicted residues in state X.
– X = α-Helix, β-shEet, or Coil.
– Although other methods exist,
• Q3 is the most conceptually simple.
 Pioneering method by Chou-Fasman:
– overall accuracy of only about Q3 = 50%.
• as assessed by a database of 267 known structures.
– initially very popular, due to conceptual simplicity.
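The Q3 score defined above is a per-residue count of agreements. A minimal sketch, with hypothetical predicted and observed state strings over the alphabet H, E, C:

```python
def q3(predicted, observed):
    """Per-residue 3-state accuracy, in percent."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("state strings must be non-empty and of equal length")
    correct = {state: 0 for state in "HEC"}   # P_H, P_E, P_C
    for p, o in zip(predicted, observed):
        if p == o and o in correct:
            correct[o] += 1
    return 100.0 * sum(correct.values()) / len(observed)

# Hypothetical example strings, for illustration only.
observed = "HHHHCCEEEE"
predicted = "HHHCCCEEEC"
print(q3(predicted, observed))
```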
Improvements on Chou-Fasman
 Many improvements have appeared.
– differ based on parameter definition and application.
– an in-depth consideration beyond the scope of this course.
• however, success correlated with the addition of relevant
statistical information…
 [1] Information regarding residue context.
– i.e., The propensity of a residue to adopt a given state:
• determined by its n neighboring residues…
– as compared with observations in a database.
– We examine: the GOR method (Garnier, 1987):
 [2] Information regarding homologous proteins.
– protein first subjected to multiple alignment.
• to identify homologous proteins.
– prediction then based on consensus propensities.
– We examine: the PHD method (Rost and Sander, 1993).
The GOR Method
 Propensity of a residue to adopt state S:
– defined not only by its own identity:
• as in Chou-Fasman,
– but also by the identities of neighboring residues.
– GOR uses a 17-residue window:
• a central, predicted residue + 8 flanking residues
on each side.
• e.g., residues 4-20 used to predict the state of the 12th
residue (F) of adenylate kinase:
The GOR Method (cont.)
 Using sequences in the database, 3 scoring matrices, $M_S$, were first constructed:
– One for each of the 3 basic structural states, S = {H, E, C}.
– Each is a 20×17 matrix, with elements $m_{xy}$:
• row, x = amino acid type (e.g., Ala).
• column, y = residue position within the ‘window’.
• $m_{xy}$ = the probability that the residue at position y is of type x…
– given that the central residue is in state S.
– So, the sum of the $m_{xy}$ values in each column is 1.
– Again, each matrix is constructed in advance,
• from observed frequencies in the database
– (e.g., from all known protein structures).

 Any candidate sequence is evaluated at each position, k:
– By applying each scoring matrix, $M_S$:
• Residue k is taken as the central residue of the window…
• And elements ($m_{xy}$) are summed for all 1 ≤ y ≤ 17,
– where x is the type of the residue at window position y.
– Highest of the 3 sums (H, E, and C):
• yields the prediction, S, for that residue’s state.
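The window-scoring procedure described above can be sketched as follows. The matrices here are random placeholders with columns normalized to 1, not real GOR parameters, and the test sequence is arbitrary; skipping window positions that fall off the chain ends is also an assumption.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 17
HALF = WINDOW // 2
STATES = "HEC"

def toy_matrix(rng):
    """Hypothetical 20x17 matrix whose columns each sum to 1."""
    matrix = {aa: [0.0] * WINDOW for aa in AMINO_ACIDS}
    for y in range(WINDOW):
        raw = [rng.random() for _ in AMINO_ACIDS]
        total = sum(raw)
        for aa, value in zip(AMINO_ACIDS, raw):
            matrix[aa][y] = value / total
    return matrix

def predict_states(sequence, matrices):
    """For each residue k, sum m_xy over the window and keep the best state."""
    prediction = []
    for k in range(len(sequence)):
        scores = {}
        for state, matrix in matrices.items():
            total = 0.0
            for y in range(WINDOW):
                pos = k - HALF + y
                if 0 <= pos < len(sequence):   # assumption: ignore off-chain positions
                    total += matrix[sequence[pos]][y]
            scores[state] = total
        prediction.append(max(scores, key=scores.get))
    return "".join(prediction)

rng = random.Random(0)
matrices = {state: toy_matrix(rng) for state in STATES}
toy_sequence = "ACDEFGHIKLMNPQRSTVWYACDE"   # arbitrary 24-residue string
toy_prediction = predict_states(toy_sequence, matrices)
print(toy_prediction)
```

With random placeholder matrices the output carries no structural meaning; the sketch only shows the bookkeeping of the window sum and the choice of the highest-scoring state.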
Example: The GOR Method
 An application of GOR IV shown below:
– run at the Network Protein Sequence @nalysis site:
• http://npsa-pbil.ibcp.fr/cgi-bin/secpred_gor4.pl.
– to the 1st 24 residues of Adenylate kinase:

– this method correctly predicts the β-sheet and β-turn.


– the α-helix at residues (1-8) is predicted coil:
• although its structural propensity to form a
helix is noted (blue line).
 Overall, GOR IV has an accuracy of Q3 = 64.4%.
Use of Multiple Alignments
 The method of ‘Multiple alignments’ was first used to aid in protein 2° structural prediction:
– by Zvelebil (1987), in combination with the GOR I method.
• accuracy improved by 9%.
 Basic idea:
– Given a sequence to be evaluated:
• identify a set of homologous (i.e., similar) sequences…
– each with > 25% sequence identity.
• 2° structure prediction then based on consensus propensities.
– one popular multiple-alignment-based method:
 The PHD Method:
– Profile network from HeiDelberg:
• combines sequence homology info. with a neural network.
Profile network from HeiDelberg
 The PHD Method (Rost and Sander, 1993):
– combines sequence homology information,
• with the optimization strength of a 2-layer neural network.
 1st Layer: Raw Predictions
– Input:
• fractions of the 20 types of residue at
each multiple-alignment position…
• in a 13-residue window around
evaluated residue, k.
– total of 20x13 = 260 input nodes.
– Output: probability for each state (PH, PE, PC).
 2nd Layer: Elimination of Infeasible Structures.
– input: output of the first layer.
• After application to each residue in the chain.
– output: refined probabilities.
• refines the raw predictions of the 1st layer…
– e.g., HHHEEHH becomes HHHHHHH
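The first-layer input described above (20 residue fractions at each of 13 window positions, 260 values in total) can be assembled as in the sketch below. The alignment is a hypothetical toy example, and zero-padding beyond the chain ends is an assumption; the neural network itself is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 13

def column_fractions(alignment):
    """Fractions of the 20 residue types at each alignment position."""
    profile = []
    for pos in range(len(alignment[0])):
        # Gap characters and other symbols are ignored.
        column = [seq[pos] for seq in alignment if seq[pos] in AMINO_ACIDS]
        profile.append([column.count(aa) / len(column) if column else 0.0
                        for aa in AMINO_ACIDS])
    return profile

def network_input(profile, k):
    """Flatten the 13-position window around residue k into 260 inputs."""
    half = WINDOW // 2
    features = []
    for pos in range(k - half, k + half + 1):
        if 0 <= pos < len(profile):
            features.extend(profile[pos])
        else:
            features.extend([0.0] * len(AMINO_ACIDS))  # assumption: zero-pad at chain ends
    return features

# Hypothetical aligned sequences of equal length, for illustration only.
toy_alignment = ["ACDEFGHIKLMNPQR", "ACDEYGHIRLMNPQR", "SCDEFGHIKLMNPQK"]
toy_profile = column_fractions(toy_alignment)
print(len(network_input(toy_profile, 0)))
```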
Example
 An application of PHD shown below:
– run at the PredictProtein server (Columbia):
• http://dodo.cpmc.columbia.edu/predictprotein/submit_def.html.
• initial homology search: Psi-Blast.
– to 1st 24 residues of Adenylate kinase.
 This method correctly:
– predicts the β-sheet (r10-r14).
• ‘E’ region (blue).
– predicts the reverse-turn (r16-r22).
• ‘L’ region (green).
– however, α-helix predicted to be coil.
• as in the GOR method…
• but, a 30% α-helical probability is
assigned to the region (red).
 Overall PHD accuracy: Q3 = 70.8%.
Conclusion
 In this Lecture, the helix-coil transition was used to
discuss:
– The coil to α-helix transition of a model polypeptide system:
• Poly-[γ-benzyl-L-glutamate];
• and the sequence-dependence of s and σ.
– The tendency of short polypeptides to melt at helix ends.
– The lower cooperativity of 3₁₀ helix formation.
 Limitations of the Zipper model were then discussed:
– The dependence of s on the (unknown) residue
environment.
– So that purely statistical methods of prediction are more
usual.
 The conceptual relationship b/w the Zipper model:
– and statistical methods of predicting protein 2o structure…
