Ahn A Mass Spec Methods Theory

Mass Spectrometry:
Methods & Theory
Proteomics Tools
Molecular Biology Tools
Separation & Display Tools
Protein Identification Tools
Protein Structure Tools
Mass Spectrometry Needs

Ionization: how the protein is injected into the
MS machine
Separation: mass and charge are determined
Activation: proteins are broken into smaller
fragments (peptides/AAs)
Mass Determination: m/z ratios are
determined for the ionized protein
fragments/peptides
Protein Identification
2D-GE + MALDI-MS
Peptide Mass Fingerprinting (PMF)
2D-GE + MS-MS
MS Peptide Sequencing/Fragment Ion Searching
Multidimensional LC + MS-MS
ICAT Methods (isotope labelling)

MudPIT (Multidimensional Protein Ident. Tech.)
1D-GE + LC + MS-MS
De Novo Peptide Sequencing
Mass Spectrometry (MS)

Introduce sample to the instrument
Generate ions in the gas phase
Separate ions on the basis of differences in
m/z with a mass analyzer
Detect ions
How does a mass spectrometer work?
Create ions
Ionization
method
MALDI
Electrospray
(Proteins must be
charged and dry)
Separate ions
Mass analyzer
MALDI-TOF
MW
Triple Quadrupole
AA seq
MALDI-QqTOF
AA seq and MW
QqTOF
AA seq and protein modif.
Detect ions
Mass
spectrum
Database
analysis
Generalized Protein Identification by MS
Library
Spot removed
from gel
Artificial
spectra built
Fragmented
using trypsin
Spectrum of
fragments
generated
MATCH
Artificially
trypsinated
Database of
sequences
(i.e. SwissProt)
Methods for
protein
identification
MS Principles
Different elements can be uniquely
identified by their mass
MS Principles
Different compounds can be uniquely
identified by their mass
Butorphanol, MW = 327.1
L-dopa, MW = 197.2
Ethanol (CH3CH2OH), MW = 46.1
Mass Spectrometry
Analytical method to measure the
molecular or atomic weight of samples
Weighing proteins
A mass spectrometer creates charged particles (ions) from molecules.
A common way is to add or take away an electron:
NaCl + e- → NaCl-
NaCl → NaCl+ + e-
It then analyzes those ions to provide information about the molecular
weight of the compound and its chemical structure.
Mass Spectrometry
For small organic molecules the MW can be
determined to within 5 ppm or 0.0005% which
is sufficiently accurate to confirm the
molecular formula from mass alone
For large biomolecules the MW can be
determined within an accuracy of 0.01% (i.e.
within 5 Da for a 50 kD protein)
Recall 1 dalton = 1 atomic mass unit (1 amu)
MS History
JJ Thomson built MS prototype to measure
m/z of electron, awarded Nobel Prize in 1906
MS concept first put into practice by Francis
Aston, a physicist working in Cambridge
England in 1919
Designed to measure mass of elements
Aston Awarded Nobel Prize in 1922
MS History
1948-52 - Time of Flight (TOF) mass
analyzers introduced
1955 - Quadrupole ion filters introduced by
W. Paul, also invents the ion trap in 1983
(wins 1989 Nobel Prize)
1968 - Tandem mass spectrometer appears
Mass spectrometers are now one of the
MOST POWERFUL ANALYTIC TOOLS IN
CHEMISTRY
MS Principles
Find a way to charge an atom or
molecule (ionization)
Place charged atom or molecule in a
magnetic field or subject it to an electric
field and measure its speed or radius of
curvature relative to its mass-to-charge
ratio (mass analyzer)
Detect ions using microchannel plate or
photomultiplier tube
Mass Spec Principles

Sample
+
_
Ionizer
Mass Analyzer
Detector
Mass spectrometers
Time of flight (TOF) (MALDI)

Measures the time required for ions to fly down the length
of a chamber.
Often combined with MALDI (MALDI-TOF). Detections from
multiple laser bursts are averaged.
[Figure: linear time-of-flight tube (ion source, detector) and reflector time-of-flight tube (ion source, reflector, detector)]
Tandem MS- MS/MS

- separation and identification of compounds in complex
mixtures
- induce fragmentation and mass analyze the fragment ions.
- Uses two or more mass analyzers/filters separated by a
collision cell filled with Argon or Xenon
Different MS-MS configurations
Quadrupole-quadrupole (low energy)
Magnetic sector-quadrupole (high)
Quadrupole-time-of-flight (low energy)
Time-of-flight-time-of-flight (low energy)
Typical Mass Spectrometer
LC/LC-MS/MS-Tandem LC, Tandem MS
Typical Mass Spectrum

Characterized by sharp, narrow peaks
X-axis position indicates the m/z ratio of a
given ion (for singly charged ions this
corresponds to the mass of the ion)
Height of peak indicates the relative
abundance of a given ion (not reliable for
quantitation)
Peak intensity indicates the ion's ability to
desorb or fly (some fly better than others)
All proteins are sorted based on a

mass to charge ratio (m/z)
m/z ratio:
Molecular weight divided by the
charge on this protein
Typical Mass Spectrum

[Figure: mass spectrum of aspirin; y-axis relative abundance, x-axis m/z (label at 120; for a singly charged ion this is the mass)]
Resolution & Resolving Power

Width of peak indicates the resolution of the
MS instrument
The better the resolution or resolving power,

the better the instrument and the better the
mass accuracy
Resolving power is defined as:
$R = \dfrac{M}{\Delta M}$
$M$ is the mass number of the observed mass
$\Delta M$ is the difference between two masses
that can be separated
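The definition of resolving power above can be tried out numerically. The following is a minimal sketch; the function name `resolving_power` and the example numbers are illustrative and do not come from the slides.

```python
def resolving_power(mass, delta_mass):
    """Return M / delta-M: observed mass over the smallest separable mass difference."""
    if delta_mass <= 0:
        raise ValueError("delta_mass must be positive")
    return mass / delta_mass


# Hypothetical example: peaks near m/z 1000 that can just be told apart
# when they are 0.1 m/z units from each other.
print(resolving_power(1000.0, 0.1))
```

A larger value means narrower peaks, so two ions of nearly equal mass can still be distinguished.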
Resolution in MS
[Figure: QTOF spectrum with labeled peaks at 783.455, 784.465 and 785.475, shown alongside a peak labeled 783.6]
Mass Spectrometer Schematic

High Vacuum System: turbo pumps, diffusion pumps, rough pumps, rotary pumps
Inlet: sample plate, target, HPLC, GC, solids probe
Ion Source: MALDI, ESI, IonSpray, FAB, LSIMS, EI/CI
Mass Filter: TOF, quadrupole, ion trap, mag. sector, FTMS
Detector: microchannel plate, electron mult., hybrid detec.
Data System: PCs, UNIX, Mac
Different Ionization Methods

Electron Impact (EI - Hard method)
small molecules, 1-1000 Daltons, structure
Fast Atom Bombardment (FAB Semi-hard)

peptides, sugars, up to 6000 Daltons
Electrospray Ionization (ESI - Soft)
peptides, proteins, up to 200,000 Daltons
Matrix Assisted Laser Desorption (MALDI-Soft)

peptides, proteins, DNA, up to 500 kD
Electron Impact Ionization

Sample introduced into instrument by
heating it until it evaporates
Gas phase sample is bombarded with
electrons coming from rhenium or
tungsten filament (energy = 70 eV)
Molecule is shattered into fragments (70
eV >> 5 eV bonds)
Fragments sent to mass analyzer
EI Fragmentation of CH3OH
CH3OH → CH3OH+
CH3OH → CH2O=H+ + H
CH3OH → CH3+ + OH
CH2O=H+ → CHO=H+ + H
Why wouldn't Electron Impact be suitable
for analyzing proteins?
Why You Can't Use EI For
Analyzing Proteins
EI shatters chemical bonds
Any given protein contains 20 different
amino acids
EI would shatter the protein not only
into amino acids but also into amino acid sub-fragments and even peptides of 2, 3 or 4
amino acids
Result is 10,000s of different signals from
a single protein, too complex to analyze
Soft Ionization Methods

[Figure: MALDI (337 nm UV laser, cyano-hydroxycinnamic acid matrix) and ESI (salt-free fluid sprayed from a gold tip needle)]
Soft Ionization
Soft ionization techniques keep the
molecule of interest fully intact
Electro-spray ionization first conceived in
1960s by Malcolm Dole but put into
practice in 1980s by John Fenn (Yale)
MALDI first introduced in 1985 by Franz
Hillenkamp and Michael Karas (Frankfurt)
Made it possible to analyze large
molecules via inexpensive mass analyzers
such as quadrupole, ion trap and TOF
Ionization methods
Electrospray mass spectrometry (ESI-MS)

Liquid containing analyte is forced through a steel capillary at high voltage
to electrostatically disperse analyte. Charge imparted from rapidly
evaporating liquid.
Electrospray Ionization
Sample dissolved in polar, volatile buffer
(no salts) and pumped through a stainless
steel capillary (70 - 150 µm) at a rate of 10-100 µL/min
Strong voltage (3-4 kV) applied at tip along
with flow of nebulizing gas causes the
sample to nebulize or aerosolize
Aerosol is directed through regions of
higher vacuum until droplets evaporate to
near atomic size (still carrying charges)
Electrospray (Detail)
Electrospray Ionization
Can be modified to nanospray system
with flow < 1 µL/min
Very sensitive technique, requires less
than a picomole of material
Strongly affected by salts & detergents
Positive ion mode measures (M + H)+ (add
formic acid to solvent)
Negative ion mode measures (M - H)- (add
ammonia to solvent)
Positive or Negative Ion Mode?

If the sample has functional groups that
readily accept H+ (such as amide and
amino groups found in peptides and
proteins) then positive ion detection is
used-PROTEINS
If a sample has functional groups that
readily lose a proton (such as carboxylic
acids and hydroxyls as found in nucleic
acids and sugars) then negative ion
detection is used-DNA
Matrix-Assisted Laser
Desorption Ionization
[Figure: 337 nm UV laser striking a sample embedded in cyano-hydroxycinnamic acid matrix]
MALDI
Sample is ionized by bombarding sample
with laser light
Sample is mixed with a UV-absorbing
matrix (sinapinic acid for proteins, 4-hydroxycinnamic acid for peptides)
Light wavelength matches that of
absorbance maximum of matrix so that
the matrix transfers some of its energy to
the analyte (leads to ion sputtering)
HT Spotting on a MALDI Plate
MALDI Ionization
[Figure: laser striking matrix and analyte, shown in three stages]
Absorption of UV radiation
by chromophoric matrix and
ionization of matrix
Dissociation of matrix,
phase change to super-compressed gas, charge
transfer to analyte molecule
Expansion of matrix at
supersonic velocity, analyte
trapped in expanding matrix
plume (explosion/popping)
MALDI
Unlike ESI, MALDI generates spectra that
have just a singly charged ion
Positive mode generates ions of M + H
Negative mode generates ions of M - H
Generally more robust than ESI (tolerates
salts and nonvolatile components)
Easier to use and maintain, capable of
higher throughput
Principle of MALDI-TOF MS

[Figure: a peptide mixture embedded in light-absorbing chemicals (matrix) is hit by a pulsed UV or IR laser (3-4 ns) under vacuum; a strong electric field (Vacc) accelerates the cloud of protonated peptide molecules down the time-of-flight tube to the detector]
Principle of MALDI-TOF MS

[Figure: linear time-of-flight tube (ion source, detector) and reflector time-of-flight tube (ion source, reflector, detector)]
MALDI = SELDI
[Figure: MALDI with 337 nm UV laser and cyano-hydroxycinnamic acid matrix]
MALDI/SELDI Spectra
[Figure: spectra from normal and tumor samples]

Different Mass Analyzers

Magnetic Sector Analyzer (MSA)
High resolution, exact mass, original MA
Quadrupole Analyzer (Q)
Low (1 amu) resolution, fast, cheap
Time-of-Flight Analyzer (TOF)
No upper m/z limit, high throughput
Ion Trap Mass Analyzer (QSTAR)
Good resolution, all-in-one mass analyzer
Ion Cyclotron Resonance (FT-ICR)
Different Types of MS
ESI-QTOF
Electrospray ionization source + quadrupole

mass filter + time-of-flight mass analyzer
MALDI-QTOF
Matrix-assisted laser desorption ionization +

quadrupole + time-of-flight mass analyzer
Both separate by MW and AA seq
Different Types of MS
GC-MS - Gas Chromatography MS
separates volatile compounds in gas column and IDs

by mass
LC-MS - Liquid Chromatography MS
separates delicate compounds in HPLC column and

IDs by mass
MS-MS - Tandem Mass Spectrometry
separates compound fragments by magnetic field and

IDs by mass
LC/LC-MS/MS-Tandem LC and Tandem MS
Magnetic Sector Analyzer
Quadrupole Mass Analyzer
A quadrupole mass filter consists of four

parallel metal rods with different charges
Two opposite rods have an applied +
potential and the other two rods have a - potential
The applied voltages affect the trajectory
of ions traveling down the flight path
For given dc and ac voltages, only ions of

a certain mass-to-charge ratio pass
through the quadrupole filter and all other
ions are thrown out of their original path
Quadrupole Mass Analyzer
Q-TOF Mass Analyzer

[Figure: Q-TOF schematic with labeled parts: nanospray tip, ion source, skimmer, hexapole, quadrupole, hexapole collision cell, hexapole, pusher, TOF, reflectron, MCP detector]
Mass Spec Equation (TOF)

$\dfrac{m}{z} = \dfrac{2Vt^{2}}{L^{2}}$
m = mass of ion, L = drift tube length
z = charge of ion, t = time of travel
V = voltage
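The TOF relation above can be turned into a small calculator. This is a sketch under stated assumptions: to work in SI units it multiplies by the elementary charge and converts daltons to kilograms (constants that the slide leaves implicit), and the function names, the 20 kV voltage and the 1 m tube length are illustrative only.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs (assumed unit handling)
DALTON = 1.66053906660e-27   # one dalton in kilograms


def mz_from_flight_time(voltage, time_s, length_m):
    """m/z in Da per elementary charge, from m/z = 2*V*t^2 / L^2."""
    return 2.0 * E_CHARGE * voltage * time_s ** 2 / (length_m ** 2 * DALTON)


def flight_time(mz, voltage, length_m):
    """Invert the same relation: t = L * sqrt((m/z) / (2*V))."""
    return length_m * math.sqrt(mz * DALTON / (2.0 * E_CHARGE * voltage))


# Hypothetical instrument: 20 kV accelerating voltage, 1 m drift tube.
t_light = flight_time(1000.0, 20000.0, 1.0)
t_heavy = flight_time(2000.0, 20000.0, 1.0)
print(f"{t_light * 1e6:.2f} us for m/z 1000, {t_heavy * 1e6:.2f} us for m/z 2000")
```

Doubling m/z lengthens the flight time by a factor of the square root of two, which is why heavier ions arrive later.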
Ion Trap Mass Analyzer

Ion traps are ion
trapping devices that
make use of a three-dimensional quadrupole
field to trap and mass-analyze ions
invented by Wolfgang
Paul (Nobel Prize 1989)
Offer good mass
resolving power
FT-ICR
Fourier-transform ion cyclotron resonance
Uses powerful magnet (5-10 Tesla) to

create a miniature cyclotron
Originally developed in Canada (UBC) by
A.G. Marshall in 1974
FT approach allows many ion masses to
be determined simultaneously (efficient)
Has higher mass resolution than any other
MS analyzer available
FT-Ion Cyclotron Analyzer
Current Mass Spec Technologies

Proteome profiling/separation
2D SDS PAGE - identify proteins
2-D LC/LC - high throughput analysis of lysates
(LC = Liquid Chromatography)
2-D LC/MS (MS= Mass spectrometry)
Protein identification
Peptide mass fingerprint
Tandem Mass Spectrometry (MS/MS)
Quantitative proteomics
ICAT (isotope-coded affinity tag)

iTRAQ
2D - LC/LC
Study protein
complexes
without gel
electrophoresis
Complex mixture is
simplified prior to
MS/MS by 2D LC
(trypsin)
Peptides all bind

to cation
exchange column
Successive elution
with increasing salt
gradients separates
peptides by charge
Peptides are
separated by
hydrophobicity on
reverse phase
column
2D LC/MS
Peptide Mass Fingerprinting

(PMF)
Peptide Mass Fingerprinting

Used to identify protein spots on gels or
protein peaks from an HPLC run
Depends on the fact that if a protein is cut up
or fragmented in a known way, the resulting
fragments (and resulting masses) are unique
enough to identify the protein
Requires a database of known sequences
Uses software to compare observed masses
with masses calculated from database
Principles of Fingerprinting
>Protein 1
Sequence: acedfhsakdfqea sdfpkivtmeeewe ndadnfekqwfe
Mass (M+H): 4842.05
Tryptic Fragments: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
>Protein 2
Sequence: acekdfhsadfqea sdfpkivtmeeewe nkdadnfeqwfe
Mass (M+H): 4842.05
Tryptic Fragments: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
>Protein 3
Sequence: acedfhsadfqeka sdfpkivtmeeewe ndakdnfeqwfe
Mass (M+H): 4842.05
Tryptic Fragments: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
Principles of Fingerprinting
[Figure: each of the three example proteins (Mass (M+H) = 4842.05) paired with its own mass spectrum]
Predicting Peptide Cleavages
http://ca.expasy.org/tools/peptidecutter/
http://ca.expasy.org/tools/peptidecutter/peptidecutter_enzymes.html#Tryps
Protease Cleavage Rules

Trypsin         XXX[KR]--[!P]XXX
Chymotrypsin    XX[FYW]--[!P]XXX
Lys C           XXXXXK-- XXXXX
Asp N endo      XXXXXD-- XXXXX
CNBr            XXXXXM--XXXXX
Sometimes inhibition occurs
K-Lysine, R-Arginine, F-Phenylalanine, Y-Tyrosine,
W-Tryptophan, D-Aspartic Acid, M-Methionine, P-Proline
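The trypsin rule above (cut after K or R unless the next residue is P) can be sketched as a zero-width regular-expression split. The function name `trypsin_digest` is illustrative, and the example sequence is Protein 1 from the fingerprinting slides; real digestion tools also handle missed cleavages and other exceptions not modelled here.

```python
import re


def trypsin_digest(sequence):
    """Split a protein sequence after K or R, except when the next residue is P.

    Splitting on an empty (zero-width) match requires Python 3.7 or newer.
    """
    pieces = re.split(r"(?<=[KRkr])(?![Pp])", sequence)
    return [piece for piece in pieces if piece]  # drop an empty tail, if any


protein_1 = "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe"
for fragment in trypsin_digest(protein_1):
    print(fragment)
```

The same pattern can be adapted to the other proteases in the table by changing the character classes.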
Why Trypsin?
Robust, stable enzyme

Works over a range of pH values & Temp.
Quite specific and consistent in cleavage
Cuts frequently to produce ideal MW peptides
Inexpensive, easily available/purified
Does produce autolysis peaks (which can be
used in MS calibrations)
1045.56, 1106.03, 1126.03, 1940.94, 2211.10, 2225.12,
2283.18, 2299.18
Digest with specific protease

546 aa
60 kDa; 57 461 Da
pI = 4.75
>RBME00320 Contig0311_1089618_1091255 EC-mopA 60 KDa chaperonin GroEL

MAAKDVKFGR TAREKMLRGV DILADAVKVT LGPKGRNVVI EKSFGAPRIT KDGVSVAKEV
ELEDKFENMG AQMLREVASK TNDTAGDGTT TATVLGQAIV QEGAKAVAAG MNPMDLKRGI
DLAVNEVVAE LLKKAKKINT SEEVAQVGTI SANGEAEIGK MIAEAMQKVG NEGVITVEEA
KTAETELEVV EGMQFDRGYL SPYFVTNPEK MVADLEDAYI LLHEKKLSNL QALLPVLEAV
VQTSKPLLII AEDVEGEALA TLVVNKLRGG LKIAAVKAPG FGDCRKAMLE DIAILTGGQV
ISEDLGIKLE SVTLDMLGRA KKVSISKENT TIVDGAGQKA EIDARVGQIK QQIEETTSDY
DREKLQERLA KLAGGVAVIR VGGATEVEVK EKKDRVDDAL NATRAAVEEG IVAGGGTALL
RASTKITAKG VNADQEAGIN IVRRAIQAPA RQITTNAGEE ASVIVGKILE NTSETFGYNT
ANGEYGDLIS LGIVDPVKVV RTALQNAASV AGLLITTEAM IAELPKKDAA PAGMPGGMGG
MGGMDF
Digest with specific protease

Trypsin yields 47 peptides (theoretically)
Peptide masses in Da:
501.3   674.3   861.4   1000.6  1249.6  1582.9  1790.6  2419.2
533.3   675.4   879.4   1196.6  1249.6  1583.9  1853.9  2526.4
544.3   701.4   921.5   1217.6  1344.7  1616.8  1869.9  2542.4
545.3   726.4   953.4   1228.5  1455.8  1726.7  2286.2  3329.6
614.4   822.4   974.5   1232.6  1484.6  1759.9  2302.2  4211.4
634.3   855.5   988.5   1233.7  1514.8  1775.9  2317.2
http://us.expasy.org/tools/peptide-mass.html
Digest with trypsin

In practice.......see far fewer by mass spec
- possibly incomplete digest (we allow 1 miss)
- lose peptides during each manipulation
washes during digestion
washes during cleanup step
some peptides will not ionize well
some signals (peaks) are poor
low intensity; lack resolution
What Are Missed Cleavages?

Sequence
>Protein 1
acedfhsakdfqea
sdfpkivtmeeewe
ndadnfekqwfe
Tryptic Fragments (no missed cleavage)

acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
Tryptic Fragments (1 missed cleavage)

acedfhsak (1007.4251)
dfqeasdfpk (1183.5266)
ivtmeeewendadnfek (2098.8909)
qwfe (609.2667)
acedfhsakdfqeasdfpk (2171.9338)
ivtmeeewendadnfekqwfe (2689.1398)
dfqeasdfpkivtmeeewendadnfek (3263.3997)
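The missed-cleavage lists above follow a simple pattern: a peptide with k missed cleavages is k+1 adjacent fragments joined together. A minimal sketch, with the illustrative helper name `with_missed_cleavages` and the Protein 1 fragments as input:

```python
def with_missed_cleavages(fragments, max_missed=1):
    """Return all peptides made by joining up to max_missed + 1 adjacent fragments."""
    peptides = []
    for missed in range(max_missed + 1):
        for start in range(len(fragments) - missed):
            peptides.append("".join(fragments[start:start + missed + 1]))
    return peptides


fragments = ["acedfhsak", "dfqeasdfpk", "ivtmeeewendadnfek", "qwfe"]
for peptide in with_missed_cleavages(fragments, max_missed=1):
    print(peptide)
```

With four fragments and one allowed miss this gives seven peptides, matching the length of the second list above.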
Calculating Peptide Masses

Sum the monoisotopic residue masses
Monoisotopic Mass: the sum of the exact or accurate masses of the lightest stable isotope of the atoms
in a molecule
Add mass of H2O (18.01056)

Add mass of H+ (1.00728) to get M+H
If Met is oxidized add 15.99491
If Cys has acrylamide adduct add 71.0371
If Cys is iodoacetylated add 58.0071
Other modifications are listed at
http://prowl.rockefeller.edu/aainfo/deltamassv2.html
¹H = 1.007828503 amu, ²H = 2.014017780 amu
¹²C = 12, ¹³C = 13.00335, ¹⁴C = 14.00324
Masses in MS
Monoisotopic
mass is the mass
determined using
the masses of the
most abundant
isotopes
Average mass is
the abundance
weighted mass of
all isotopic
components
Mass Calculation (Glycine)

Amino acid: NH2CH2COOH
Residue: R1-NH-CH2-CO-R3
Monoisotopic Mass
¹H = 1.007825
¹²C = 12.00000
¹⁴N = 14.00307
¹⁶O = 15.99491
Glycine Amino Acid Mass

5xH + 2xC + 2xO + 1xN
= 75.032015 amu
Glycine Residue Mass
3xH + 2xC + 1xO + 1xN
= 57.021455 amu
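The glycine arithmetic above is a weighted sum over the elemental composition. A minimal sketch, using the four atomic masses given on the slide; the names `ATOM_MASS` and `monoisotopic_mass` are illustrative.

```python
# Monoisotopic atomic masses as listed on the slide.
ATOM_MASS = {"H": 1.007825, "C": 12.00000, "N": 14.00307, "O": 15.99491}


def monoisotopic_mass(composition):
    """Sum atomic masses weighted by atom counts, e.g. {"H": 5, "C": 2, ...}."""
    return sum(ATOM_MASS[element] * count for element, count in composition.items())


glycine = {"H": 5, "C": 2, "O": 2, "N": 1}          # free amino acid
glycine_residue = {"H": 3, "C": 2, "O": 1, "N": 1}  # after losing H2O in a peptide bond

print(round(monoisotopic_mass(glycine), 6))
print(round(monoisotopic_mass(glycine_residue), 6))
```

The two results differ by one water (2xH + 1xO = 18.01056), which is the mass added back when summing residue masses into a peptide mass.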
Amino Acid Residue Masses

Monoisotopic Mass
Glycine         57.02147
Alanine         71.03712
Serine          87.03203
Proline         97.05277
Valine          99.06842
Threonine       101.04768
Cysteine        103.00919
Isoleucine      113.08407
Leucine         113.08407
Asparagine      114.04293
Aspartic acid   115.02695
Glutamine       128.05858
Lysine          128.09497
Glutamic acid   129.0426
Methionine      131.04049
Histidine       137.05891
Phenylalanine   147.06842
Arginine        156.10112
Tyrosine        163.06333
Tryptophan      186.07932
Amino Acid Residue Masses

Average Mass
Glycine         57.0520
Alanine         71.0788
Serine          87.0782
Proline         97.1167
Valine          99.1326
Threonine       101.1051
Cysteine        103.1448
Isoleucine      113.1595
Leucine         113.1595
Asparagine      114.1039
Aspartic acid   115.0886
Glutamine       128.1308
Lysine          128.1742
Glutamic acid   129.1155
Methionine      131.1986
Histidine       137.1412
Phenylalanine   147.1766
Arginine        156.1876
Tyrosine        163.1760
Tryptophan      186.2133
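The recipe under "Calculating Peptide Masses" (sum the residue masses, add water, add a proton) can be sketched with the monoisotopic table above. The proton mass of 1.00728 Da is an assumption of this sketch; it is the value that reproduces the peptide masses quoted elsewhere in these slides. The names `RESIDUE_MASS` and `peptide_mh` are illustrative.

```python
# Monoisotopic residue masses, copied from the table above.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056   # H2O, as given on the slide
PROTON = 1.00728   # assumed proton mass (see lead-in)


def peptide_mh(peptide):
    """Monoisotopic (M+H) of an unmodified peptide given in one-letter code."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER + PROTON


for peptide in ["acedfhsak", "dfqeasdfpk", "qwfe"]:
    print(peptide, round(peptide_mh(peptide), 4))
```

Modifications such as oxidized Met would be handled by adding the listed mass shift to the sum; that is left out here to keep the sketch short.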
Preparing a Peptide Mass

Fingerprint Database
Take a protein sequence database (Swiss-Prot or
nr-GenBank)
Determine cleavage sites and identify resulting
peptides for each protein entry
Calculate the mass (M+H) for each peptide
Sort the masses from lowest to highest
Have a pointer for each calculated mass to each
protein accession number in databank
Building A PMF Database

Sequence DB
>P12345
acedfhsakdfqea
sdfpkivtmeeewe
ndadnfekqwfe
Calc. Tryptic Frags: acedfhsak, dfqeasdfpk, ivtmeeewendadnfek, qwfe
>P21234
acekdfhsadfqea
sdfpkivtmeeewe
nkdadnfeqwfe
Calc. Tryptic Frags: acek, dfhsadfqeasdfpk, ivtmeeewenk, dadnfeqwfe
>P89212
acedfhsadfqeka
sdfpkivtmeeewe
ndakdnfeqwfe
Calc. Tryptic Frags: acedfhsadfqek, asdfpk, ivtmeeewendak, dnfeqwfe
Mass List
450.2017 (P21234)
609.2667 (P12345)
664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
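The database-building steps (digest each entry, compute M+H for each peptide, sort, keep a pointer to the accession) can be sketched end to end on the three toy entries above. This is an illustration, not the code behind any real PMF server: the function names are hypothetical, the proton mass of 1.00728 Da is an assumption chosen because it reproduces the slide's mass list, and small differences in the last decimal places are to be expected.

```python
import re

# Monoisotopic residue masses from the table earlier in this deck.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.01056
PROTON = 1.00728  # assumed proton mass

# Toy sequence database from the slide (accession -> sequence).
SEQUENCE_DB = {
    "P12345": "acedfhsakdfqeasdfpkivtmeeewendadnfekqwfe",
    "P21234": "acekdfhsadfqeasdfpkivtmeeewenkdadnfeqwfe",
    "P89212": "acedfhsadfqekasdfpkivtmeeewendakdnfeqwfe",
}


def trypsin_digest(sequence):
    """Cut after K or R unless followed by P (needs Python 3.7+)."""
    return [p for p in re.split(r"(?<=[KRkr])(?![Pp])", sequence) if p]


def peptide_mh(peptide):
    """Monoisotopic (M+H) of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER + PROTON


def build_pmf_database(sequence_db):
    """Return (mass, accession, peptide) tuples sorted from lowest to highest mass."""
    entries = []
    for accession, sequence in sequence_db.items():
        for peptide in trypsin_digest(sequence):
            entries.append((peptide_mh(peptide), accession, peptide))
    return sorted(entries)


for mass, accession, peptide in build_pmf_database(SEQUENCE_DB):
    print(f"{mass:10.4f}  {accession}  {peptide}")
```

Keeping the accession next to each mass is what lets a later search step count hits per protein.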
The Fingerprint (PMF) Algorithm

Take a mass spectrum of a trypsin-cleaved
protein (from gel or HPLC peak)
Identify as many masses as possible in spectrum
(avoid autolysis peaks of trypsin)
Compare query masses with database masses
and calculate # of matches or matching score
(based on length and mass difference)
Rank hits and return the top scoring entry; this is
the protein of interest
Query (MALDI) Spectrum

[Figure: query MALDI spectrum, m/z axis 500-2500, with peaks at 450, 609, 698, 1007, 1199, 1940 (trp), 2098 and 2211 (trp)]
Query vs. Database

Query Masses
450.2201
609.3667
698.3100
1007.5391
1199.4916
2098.9909
Database Mass List
450.2017 (P21234)
609.2667 (P12345)
664.3300 (P89212)
1007.4251 (P12345)
1114.4416 (P89212)
1183.5266 (P12345)
1300.5116 (P21234)
1407.6462 (P21234)
1526.6211 (P89212)
1593.7101 (P89212)
1740.7501 (P21234)
2098.8909 (P12345)
Results
2 Unknown masses
1 hit on P21234
3 hits on P12345
Conclude the query
protein is P12345
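The comparison above can be reproduced with a simple tolerance match. This sketch counts hits only and does not implement the scoring used by real search engines. The absolute tolerance of 0.2 Da is an assumption chosen so that the slide's example works out (its query masses are about 0.1 Da off), and the names `match_masses`, `DATABASE` and `QUERY` are illustrative.

```python
from collections import Counter

# Database mass list and query masses copied from the slide.
DATABASE = [
    (450.2017, "P21234"), (609.2667, "P12345"), (664.3300, "P89212"),
    (1007.4251, "P12345"), (1114.4416, "P89212"), (1183.5266, "P12345"),
    (1300.5116, "P21234"), (1407.6462, "P21234"), (1526.6211, "P89212"),
    (1593.7101, "P89212"), (1740.7501, "P21234"), (2098.8909, "P12345"),
]
QUERY = [450.2201, 609.3667, 698.3100, 1007.5391, 1199.4916, 2098.9909]


def match_masses(query, database, tol_da=0.2):
    """Count database hits per accession; return (hit counts, unmatched query masses)."""
    hits = Counter()
    unknown = []
    for query_mass in query:
        matched = [acc for mass, acc in database if abs(mass - query_mass) <= tol_da]
        if matched:
            hits.update(matched)
        else:
            unknown.append(query_mass)
    return hits, unknown


hits, unknown = match_masses(QUERY, DATABASE)
print(hits.most_common())   # accessions ranked by number of hits
print(unknown)              # query masses with no database match
```

Ranking by raw hit count is the simplest possible choice; a scoring scheme would also weigh how close each match is and how likely it is to occur by chance.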
Database search
PeptIdent (ExPasy)
Mascot (Matrix Science)
MS-Fit (Prospector; UCSF)
ProFound (Proteometrics)
MOWSE (HGMP)
Human Genome Mapping Project
Mascot
[Figure: theoretical and experimental spectra (m/z 800-2400) compared to give a protein ID]
What You Need To Do PMF

A list of query masses (as many as possible)
Protease(s) used or cleavage reagents
Databases to search (SWProt, Organism)
Estimated mass and pI of protein spot (opt)
Cysteine (or other) modifications

Minimum number of hits for significance
Mass tolerance (100 ppm = 1000.0 ± 0.1 Da)
A PMF website (Prowl, ProFound, Mascot, etc.)
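The mass tolerance in the list above is a relative quantity, so its width in Da scales with the mass. A minimal sketch of the conversion in both directions; the function names are illustrative.

```python
def ppm_to_da(mass, ppm):
    """Width in Da of a tolerance given in parts per million at a given mass."""
    return mass * ppm / 1e6


def ppm_error(observed, theoretical):
    """Signed difference between observed and theoretical mass, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


print(ppm_to_da(1000.0, 100))      # tolerance window at mass 1000.0
print(ppm_error(1000.1, 1000.0))   # error of a peak observed 0.1 Da too high
```

The same conversion applies to the accuracy figures quoted earlier in the deck: 0.01% is 100 ppm, which is 5 Da for a 50 kD protein.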
PMF on the Web

ProFound
http://129.85.19.192/profound_bin/WebProFound.exe
MOWSE
http://srs.hgmp.mrc.ac.uk/cgi-bin/mowse
PeptideSearch
http://www.narrador.embl-heidelberg.de/GroupPages/Homepage.html
Mascot
www.matrixscience.com
PeptIdent
http://us.expasy.org/tools/peptident.html
ProFound
ProFound Results
MOWSE
PeptIdent
MASCOT
Mascot Scoring
The statistics of peptide fragment
matching in MS (or PMF) is very similar to
the statistics used in BLAST
The scoring probability follows an extreme
value distribution
High scoring segment pairs (in BLAST)
are analogous to high scoring mass
matches in Mascot
Mascot scoring is much more robust than
arbitrary match cutoffs (like % ID)
Extreme Value Distribution
It is the limit distribution of the maxima of a sequence of independent and
identically distributed random variables. Because of this, the EVD is used as an
approximation to model the maxima of long (finite) sequences of random
variables.
$P(x) = 1 - e^{-e^{-x}}$
[Figure: histogram of scores, x-axis from <20 to >120, y-axis counts from 0 to 8000]
Scores greater than 72 are significant
MASCOT
Mascot/Mowse Scoring
The Mascot Score is given as S = -10*Log(P),
where P is the probability that the observed
match is a random event
Try to aim for probabilities where P<0.05 (less
than a 5% chance the peptide mass match is
random)
Mascot scores greater than 72 are significant
(p<0.05).
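The score formula above is a log transform of a probability and is easy to invert. This sketch assumes that Log means the base-10 logarithm; the function names are illustrative, and the threshold of 72 quoted on the slide is not derived here.

```python
import math


def mascot_score(probability):
    """S = -10 * log10(P), where P is the chance that the match is random."""
    return -10.0 * math.log10(probability)


def probability_from_score(score):
    """Invert the score: P = 10 ** (-S / 10)."""
    return 10.0 ** (-score / 10.0)


print(round(mascot_score(0.05), 2))
print(probability_from_score(72.0))
```

Each extra 10 points in the score corresponds to a tenfold smaller probability that the match arose by chance.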
Advantages of PMF
Uses a robust & inexpensive form of MS (MALDI)

Doesn't require too much sample optimization
Can be done by a moderately skilled operator (don't
need to be an MS expert)
Widely supported by web servers
Improves as DBs get larger & instrumentation gets
better
Very amenable to high throughput robotics (up to 500
samples a day)
Limitations With PMF

Requires that the protein of interest
already be in a sequence database
Spurious or missing critical mass peaks
always lead to problems
Mass resolution/accuracy is critical, best
to have <20 ppm mass resolution
Generally found to only be about 40%
effective in positively identifying gel spots
Tandem Mass Spectrometry

Purpose is to fragment ions from parent
ion to provide structural information about
a molecule
Also allows mass separation and AA
identification of compounds in complex
mixtures
Uses two or more mass analyzers/filters
separated by a collision cell filled with
Argon or Xenon
Collision cell is where selected ions are
MS-MS & Proteomics
Tandem Mass Spectrometry

Different MS-MS configurations
Quadrupole-quadrupole (low energy)
Magnetic sector-quadrupole (high)
Quadrupole-time-of-flight (low energy)
Time-of-flight-time-of-flight (low energy)
How Tandem MS
sequencing works
Use Tandem MS: two mass analyzers
in series with a collision cell in
between
Collision cell: a region where the
ions collide with a gas (He, Ne, Ar)
resulting in fragmentation of the ion
Fragmentation of the peptides occurs
in a predictable fashion, mainly at
the peptide bonds
The resulting daughter ions have
masses that are consistent with
known molecular weights of
dipeptides, tripeptides,
tetrapeptides
Ser-Glu-Leu-Ile-Arg-Trp
Collision Cell
Ser-Glu-Leu-Ile-Arg
Ser-Glu-Leu-Ile
Ser-Glu-Leu
Etc
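The ladder of shortened peptides above can be given masses to show how a sequence is read from the spacing between fragment peaks. This sketch treats each N-terminal fragment as a singly protonated b-type ion (sum of residue masses plus one proton); that choice of ion series is an assumption, since the slide does not name it. The names `prefix_ladder`, `RESIDUE_MASS` and `PROTON` are illustrative.

```python
# Monoisotopic residue masses for the residues of Ser-Glu-Leu-Ile-Arg-Trp,
# taken from the residue mass table in this deck.
RESIDUE_MASS = {
    "S": 87.03203, "E": 129.0426, "L": 113.08407,
    "I": 113.08407, "R": 156.10112, "W": 186.07932,
}
PROTON = 1.00728  # assumed proton mass


def prefix_ladder(peptide):
    """Return (fragment, m/z) for each N-terminal fragment shorter than the peptide."""
    ladder = []
    running = 0.0
    for length, residue in enumerate(peptide[:-1], start=1):
        running += RESIDUE_MASS[residue]
        ladder.append((peptide[:length], running + PROTON))
    return ladder


for fragment, mz in prefix_ladder("SELIRW"):
    print(f"{fragment:6s} {mz:10.4f}")
```

The difference between consecutive m/z values equals one residue mass, which is how the next amino acid is inferred. Leu and Ile have the same mass, so this ladder alone cannot tell them apart.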
Data Analysis Limitations

-You are dependent on well annotated genome
databases
-Data is noisy. The spectra are not always
perfect. Often requires manual determination.
-Database searches only give scores. So if you
have a false positive, you will have to manually
validate them
Advantages of Tandem Mass Spec

FAST
No Gels
Determines MW and AA sequence
Can be used on complex mixtures-including low copy #
Can detect post-translational modif.-ICAT
High-throughput capability
Disadvantages of Tandem Mass Spec

Very expensive-Campus
Hardware: $1000
Setup: $300
1 run: $1000
Requires sequence databases for analysis
MS-MS & Proteomics

Advantages
Provides precise sequence-specific data
More informative than PMF methods (>90%)
Can be used for de novo sequencing (not entirely dependent on databases)
Can be used to ID post-trans. modifications
Disadvantages
Requires more handling, refinement and sample manipulation
Requires more expensive and complicated equipment
Requires high level expertise
Slower, not generally high throughput
ISOTOPE-CODED AFFINITY TAG

(ICAT): a quantitative method
Label protein samples with heavy and light reagent
Reagent contains affinity tag and heavy or light isotopes
Chemically reactive group: forms a

covalent bond to the protein or peptide
Isotope-labeled linker: heavy or light,
depending on which isotope is used
Affinity tag: enables the protein or
peptide bearing an ICAT to be isolated by
affinity chromatography in a single step
Example of an ICAT Reagent

Reactive group: Thiol-reactive
group will bind to Cys
Biotin Affinity tag:

Binds tightly to
streptavidin-agarose
resin
Linker: Heavy version will

have deuteriums at *
Light version will have
hydrogens at *
[Figure: chemical structure of the ICAT reagent, with * marking the linker positions that carry deuterium or hydrogen]
The ICAT Reagent
How ICAT works?

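Lyse & Label (light and heavy reagent)
MIX
Proteolysis (i.e. trypsin)
Affinity isolation on streptavidin beads
Quantification by MS (light and heavy peak pair)
Identification by MS/MS (NH2-EACDPLR-COOH)
[Figure: MS spectrum, m/z 550-590, relative intensity up to 100; MS/MS spectrum, m/z 200-600]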
ICAT Quantitation
ICAT
Advantages vs. Disadvantages
Advantages
Estimates relative protein levels between samples with a reasonable level of accuracy (within 10%)
Can be used on complex mixtures of proteins
Cys-specific label reduces sample complexity
Peptides can be sequenced directly if tandem MS-MS is used
Disadvantages
Yield and non specificity
Expensive
Slight chromatography differences
Tag fragmentation
Meaning of relative quantification information
No presence of cysteine residues or not accessible by ICAT reagent

MS Detectors
Early detectors used photographic film

Today's detectors (ion channel and electron
multipliers) produce electronic signals via secondary
electron emission when struck by an ion
Timing mechanisms integrate these signals
with scanning voltages to allow the
instrument to report which m/z has struck the
detector
Need constant and regular calibration
Mass Detectors
Electron Multiplier (Dynode)
Limitations of Proteomics
-solubility of indiv. protein differs
-2D gels unable to resolve all proteins at a given time
-most proteins are not abundant (ie kinases)
-proteins not in the database cannot be identified
-multiple runs can be expensive
-proteins are fragile and can be degraded easily
-proteins exist in multiple isoforms
-no protein equivalent of PCR exists for amplification
of small samples
Shotgun Proteomics:
Multidimensional Protein
Identification Technology
(MudPIT)
General Strategy for Proteomics Characterization

Fractionation &
Isolation
2-DE
Liquid
Chromatography
Peptides
Characterization
Mass Spectrometry
Identification
Post Translational modifications
Quantification
Database Search
MALDI-TOF MS
-(LC)-ESI-MS/MS
Overview of Shotgun Proteomics: MudPIT

Protein Mixture
Digestion
Tandem Mass
Spectrometer
2D Chromatography
RP
MS/MS Spectrum
[Figure: MS/MS spectrum PySpzS5609 #2438 (RT 66.03, Full ms2 729.75@35.00, range 190.00-1470.00); relative abundance vs. m/z, base peak at 545.31]
SEQUEST
DTASelect &
Contrast
SCX
Peptide
Mixture
> 1,000 Proteins

Identified
MudPIT
IEX-HPLC
Trypsin
+ proteins
p53
RP-HPLC
Acquiring MS/MS Datasets
2D Chromatography
SCX
MudPIT Cycle
load sample
wash
salt step
wash
RP gradient
re-equilibration
RP
Tandem MS Spectrum
Peptide Sequence is Inferred from Fragment ions
x 3~18
MS/MS of Peptide Mixtures

LC
MS
(MW Profile)
MS/MS
(AA Identity)
Matching MS/MS Spectra to

Peptide Sequences
SEQUEST
Experimental MS/MS Spectrum
[Figure: MS/MS spectrum PySpzS5609 #2438 (RT 66.03, Full ms2 729.75@35.00, range 190.00-1470.00); relative abundance vs. m/z, base peak at 545.31]
Peptides Matching Precursor Ion Mass
#1 K.TVLIMELINNVAK.K
#2 L.NAKMELLIDLVKA.Q
#3 E.ELAILMQNNIIGE.N
#4 A.CGPSRQNLLNAMP.S
#5 L.FAPLQEIINGILE.G
CALCULATE
Theoretical MS/MS Spectra
COMPARE
SCORE
SEQUEST Output File
SEQUEST-PVM
Beowulf computing cluster
55 mixed CPU: Alpha chips and AMD
Athlon PC CPU
Filtering, Assembling &

Comparing Protein Lists
20,000s of SEQUEST Output
Files
PARSE
Protein
List
ASSEMBLE
DTASelect
FILTER
Criteria Sets
Contrast
COMPARE
Summary Table
Control
VISUALLY ASSESS SPECTRUM/PEPTIDE MATCHES
Post Analysis Software DTASelect:

Swimming or Drowning in Data
It processes tens of thousands of SEQUEST

outputs in a few minutes.
It applies criteria uniformly and therefore is

unbiased.
It is highly adaptable and re-analysis with a new

set of criteria is easy.
It saves time and effort for manual validation.
The CONTRAST feature can compare results

from different experiments.
Application of shotgun proteomics:

Comprehensive Analysis of Complex
Protein Mixtures
Purification
Cells/Tissues
Multiprotein Complex/
Organelle
Total Protein
Characterization
Yeast: A Perfect Model
Database   ORF     Unknown, uncoding, hypothetical   Known, biochem. or genetics
MIPS       6368    1568                              4344
YPD        6145    1833                              4270
SGD        ~6000   NA                                NA
Complete genome sequence information

An extensively studied organism
Optimal numbers of ORFs, easy for database search
Functional Categories of Yeast Proteins Identified
Used GO to determine functional groups
[Figure: chart of identified proteins by functional category]
Communication and Signal Transduction
Ionic Homeostasis
Cell Rescue, Defense, Death, and Ageing
Energy
Cellular Organization
Protein Destination
Transcription
Transport
Protein Synthesis
Metabolism
Unclassified
Cell Growth, Division, DNA Synthesis, and Biogenesis
Washburn et al. Nature Biotechnology 19, 242-7 (2001)
Summary of MudPIT
It is an automated, high-throughput technology.
It is an unbiased method for protein identification.
It identifies proteins missed by gel-based methods (e.g. low-abundance proteins, membrane proteins).
Post-translational modification information can be obtained, allowing the functional activities of proteins to be derived or inferred.
2-DE vs MudPIT
2-DE:
Widely used, highly commercialized
High resolving power
Visual presentation
Limited dynamic range
Only good for highly soluble and high-abundance proteins
Large amount of sample required
MudPIT:
Highly automated process
Identifies proteins with extreme pI values, low abundance, and those from membranes
Thousands of proteins can be identified
Not yet commercialized
Expensive
Computationally intensive
Quantitation
Peptide Masses From ESI
Each peak is given by:
m/z = (MW + n × H+) / n
m/z = mass-to-charge ratio of each peak on spectrum
MW = MW of parent molecule
n = number of charges (integer)
H+ = mass of hydrogen ion (1.008 Da)
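As a small illustration of the peak formula, the following Python sketch computes the m/z of each charge state of a molecule. The function names and the 10,000 Da example mass are assumptions chosen for easy arithmetic, not values from the slides; the 1.008 Da hydrogen ion mass is the one defined above.

```python
H_PLUS = 1.008  # mass of hydrogen ion in Da, as defined on the slide


def mz(mw, n):
    """m/z of a molecule of mass mw carrying n protons (n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (mw + n * H_PLUS) / n


def charge_series(mw, n_min, n_max):
    """(n, m/z) pairs for every charge state from n_min to n_max."""
    return [(n, mz(mw, n)) for n in range(n_min, n_max + 1)]


# Hypothetical 10,000 Da protein observed with 8 to 12 charges.
for n, value in charge_series(10000.0, 8, 12):
    print(n, round(value, 3))
```

Each additional charge moves the peak to lower m/z, which is why one protein appears as a series of peaks (a multiplet) in an ESI spectrum.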
Peptide Masses From ESI
Charge (n) is unknown; the key is to determine MW
Choose any two peaks separated by 1 charge
1431.6 = (MW + n × H+) / n
1301.4 = (MW + [n+1] × H+) / [n+1]
2 equations with 2 unknowns - solve for n first
n = (1301.4 - 1.008) / (1431.6 - 1301.4) = 1300.4/130.2 ≈ 10
Substitute 10 into first equation - solve for MW
MW = (10 × 1431.6) - (10 × 1.008) = 14316 - 10.08 = 14305.9
14,305.14
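The two-peak calculation can be written as a short Python sketch. It assumes the two peaks are adjacent charge states of the same molecule and that every charge is a proton of mass 1.008 Da; the function name is hypothetical.

```python
H_PLUS = 1.008  # mass of hydrogen ion in Da


def charge_and_mw(mz_high, mz_low):
    """Solve for n and MW from two adjacent peaks.

    mz_high carries n charges and mz_low carries n + 1 charges.
    Eliminating MW from the two peak equations gives
    n = (mz_low - H+) / (mz_high - mz_low).
    """
    n = round((mz_low - H_PLUS) / (mz_high - mz_low))
    mw = n * (mz_high - H_PLUS)   # rearranged from m/z = (MW + n*H+) / n
    return n, mw


n, mw = charge_and_mw(1431.6, 1301.4)
print(n, round(mw, 1))  # 10 14305.9
```

Rounding n to the nearest integer absorbs the small error that comes from reading peak positions to one decimal place.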
ESI Transformation
Software can be used to convert these multiplet spectra into single (zero charge) profiles, which give MW directly
This makes MS interpretation much easier and greatly increases signal to noise
Two methods are available:
Transformation (requires prior peak ID)
Maximum Entropy (no peak ID required)
[Figure: Maximum Entropy example]
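The Transformation approach, in which each peak's charge has already been identified, can be sketched in Python as below. This is a deliberately simplified stand-in that works on peak positions only and averages the resulting zero-charge masses; real software transforms the whole intensity profile, and Maximum Entropy methods are not shown. The function names and the example peak list are assumptions.

```python
from statistics import mean

H_PLUS = 1.008  # mass of hydrogen ion in Da


def zero_charge_masses(assigned_peaks):
    """Convert (m/z, n) pairs with known charge n to zero-charge masses."""
    return [n * (mz - H_PLUS) for mz, n in assigned_peaks]


def transform(assigned_peaks):
    """Average the zero-charge masses from every assigned peak."""
    return mean(zero_charge_masses(assigned_peaks))


# Hypothetical multiplet: a 10,000 Da molecule seen with 9, 10 and 11 charges.
peaks = [((10000.0 + n * H_PLUS) / n, n) for n in (9, 10, 11)]
print(round(transform(peaks), 3))
```

Because every peak in the multiplet contributes an independent estimate of the same mass, combining them is what improves signal to noise in the zero-charge profile.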
ESI and Protein Structure
ESI spectra are actually quite sensitive to the conformation of the protein
Folded, ligated or complexed proteins tend to display non-Gaussian peak distributions, with few observable peaks, weighted toward higher m/z values
Denatured or open-form proteins/peptides, which ionize more easily, tend to display many peaks with a classic Gaussian distribution
ESI and Protein Conformation
[Figure: ESI spectra of native azurin and denatured azurin]
Different MS-MS Modes
Product or Daughter Ion Scanning: first analyzer selects ion for further fragmentation; most often used for peptide sequencing
Precursor or Parent Ion Scanning: no first filtering; used for glycosylation studies
Neutral Loss Scanning: selects for ions of one chemical type (COOH, OH)
Selected/Multiple Reaction Monitoring: selects for known, well characterized ions only
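The logic behind neutral loss scanning can be illustrated with a short Python sketch that filters precursor/product pairs by a fixed mass difference. On an instrument the two analyzers are scanned together with a constant offset; this sketch only mimics the selection rule on a list of already recorded transitions. It assumes singly charged ions, uses loss of water (about 18.011 Da) as the example, and the function name, tolerance and m/z values are hypothetical.

```python
WATER_LOSS = 18.011  # Da, neutral loss of H2O


def neutral_loss_hits(transitions, loss, tol=0.02):
    """Return (precursor m/z, product m/z) pairs that differ by `loss`.

    transitions: iterable of (precursor_mz, product_mz) for singly
    charged ions (an assumption of this sketch).
    """
    return [
        (precursor, product)
        for precursor, product in transitions
        if abs((precursor - product) - loss) <= tol
    ]


# Made-up transitions for illustration only.
transitions = [(500.30, 482.29), (500.30, 350.20), (620.35, 602.34)]
for precursor, product in neutral_loss_hits(transitions, WATER_LOSS):
    print(precursor, "->", product)
```

Only ions that shed the chosen neutral fragment pass the filter, which is how this mode picks out one chemical type from a mixture.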
THE END
