
Bioinformatics for Proteomics –

2D Gels

By Andrew Garrow,
University of Leeds
Myself
 PhD – Bioinformatics for a proteomics
approach to understanding the
Schistosome tegument.
 Construction of a proteomics database
 2D Gels
 Mass spectrometry
 Data analysis
Lecture Layout

 Introduction to 2D gel electrophoresis


 2D gel databases
 2D gel analysis
Proteomics

 Analysis by direct measurement of


proteins in terms of their presence and
relative abundance.
Why study Proteomics?

To better understand protein expression and formation:


 Transcriptional control
 Post-transcriptional control, e.g. alternative splicing, RNA
editing
 Translational and degradation control, translational
frameshifting
 >200 known post-translational modifications (PTMs), e.g. phosphorylation, glycosylation, lipid
attachment, peptide cleavage

Number of proteins > number of genes


Biological question

Biological sample

Sample prep.

2D Gel separation

Imaging

Protein excision

Protein digestion

MS/Protein ID

Global bioinformatics

Proteomic discovery
2D Gels

 Used for the separation of proteins within


a sample.
 Dependent upon protein molecular weight and isoelectric point (pI).
 Can be used to resolve >1,800 spots (Choe and Lee, 2000).
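The two separation parameters can be estimated from a protein sequence, which is how a theoretical spot position is usually predicted. The following is a minimal sketch, not a calibrated tool: the function names are hypothetical, the residue masses are standard average masses, and the pKa values are approximate textbook figures chosen here as an assumption (published pI calculators use their own calibrated sets).

```python
# Average residue masses in daltons for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini

# ASSUMPTION: approximate textbook pKa values, for illustration only.
PKA_POSITIVE = {"nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"cterm": 2.0, "D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}


def molecular_weight(seq):
    """Average molecular weight (Da) of an unmodified peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def net_charge(seq, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    pos_counts = {"nterm": 1, **{aa: seq.count(aa) for aa in "KRH"}}
    neg_counts = {"cterm": 1, **{aa: seq.count(aa) for aa in "DECY"}}
    charge = 0.0
    for group, n in pos_counts.items():
        charge += n / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
    for group, n in neg_counts.items():
        charge -= n / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(seq, tol=1e-4):
    """Find the pH where the net charge is zero by bisection on [0, 14]."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(seq, mid) > 0:
            lo = mid  # still positively charged, so the pI lies higher
        else:
            hi = mid
    return (lo + hi) / 2
```

Because the net charge falls monotonically as pH rises, bisection is sufficient here. Post-translational modifications shift both values, so experimental spot positions often differ from such estimates.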
2D Gels
The Problem

 2D gels can routinely be used to separate >1000 spots, yet cells express
thousands to tens of thousands of proteins.
 Approaches to improve protein coverage:
 Separation on the basis of differential
compartmentalisation/solubilisation
 Narrow range IPG strips for focusing on particular pI
ranges.
Zooming


Use narrow range IPG strips to focus on particular pI ranges.
2D Gel Procedure

 Three day process


 Day 1 – Rehydration phase
 Day 2 – Isoelectric focusing (IEF)
 Day 3 – Second dimension
Staining
Stain                   Sensitivity   Process time/steps   Advantages
Bio-safe Coomassie      10 ng         2.5 hr / 3 steps     MS compatible; easily visualised; non-hazardous
Coomassie Blue R-250    40 ng         2.5 hr / 2 steps     Oldest and least expensive method
Silver Stain Plus       1 ng          1.5 hr / 3 steps     MS compatible; high sensitivity; low background
Bio-Rad silver stain    1 ng          2 hr / 7 steps       High sensitivity; detects some highly glycosylated and other difficult-to-stain proteins
Sypro Ruby              1 ng          3 hr / 2 steps       MS compatible; allows analysis in fluorescent imagers; linear over 3 orders of magnitude
Gel Analysis

 Gel images digitally captured using a charge-coupled
device (CCD) camera or scanner
 Analysed by specialised software
Phoretix 2D Advanced (www.phoretix.com)
PDQuest (www.proteomeworks.bio-rad.com)
2D Elite (http://www.imsupport.com/)
Melanie (www.expasy.ch/melanie)
Gel Analysis

Software features
 Spot detection
 Spot quantification
 Noise reduction
 Gel comparison by warping
 Linkage with robots
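Spot detection and quantification can be illustrated with a deliberately simple approach. This is a sketch under stated assumptions, not the algorithm used by any of the packages listed above: spots are taken to be strict local intensity maxima above a threshold, and spot volume is the background-subtracted intensity summed over a square window. The function names and the small synthetic image are hypothetical.

```python
def detect_spots(image, threshold):
    """Return (row, col) of pixels at or above threshold that are strict
    local maxima among their (up to 8) neighbours."""
    rows, cols = len(image), len(image[0])
    spots = []
    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            if value < threshold:
                continue
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                spots.append((r, c))
    return spots


def spot_volume(image, centre, radius, background):
    """Sum background-subtracted intensity in a square window around centre."""
    r0, c0 = centre
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
        for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
            total += max(0.0, image[r][c] - background)
    return total


# Synthetic 6x6 "gel": one large spot centred at (2, 2), one faint spot at (4, 4).
image = [
    [1, 1, 1, 1, 1, 1],
    [1, 3, 5, 3, 1, 1],
    [1, 5, 9, 5, 1, 1],
    [1, 3, 5, 3, 1, 1],
    [1, 1, 1, 1, 4, 1],
    [1, 1, 1, 1, 1, 1],
]
spots = detect_spots(image, threshold=4)
volumes = {s: spot_volume(image, s, radius=1, background=1) for s in spots}
```

Real gel images would need smoothing before this step, since a strict local-maximum test is very sensitive to noise, and overlapping spots call for more elaborate models.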
Robot Spot Cutter
Post Gel analysis

 Robotic spot picking


 Protein/spot digestion – e.g. with trypsin
 Mass spectrometry (MS)
 MS data analysis
 Data repository
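The digestion step has a direct in silico counterpart, used when comparing measured peptide masses with those predicted from a sequence. A minimal sketch, assuming the commonly used rule that trypsin cleaves after K or R except when the next residue is P; the function name and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an in silico trypsin digest.

    Cleaves after K or R unless the following residue is P, then joins
    up to `missed_cleavages` adjacent fragments to model incomplete digestion.
    """
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # C-terminal remainder

    peptides = []
    for n in range(missed_cleavages + 1):
        for j in range(len(fragments) - n):
            peptides.append("".join(fragments[j:j + n + 1]))
    return peptides


# Hypothetical sequence: the R at position 3 is followed by P, so it is not cut.
peptides = tryptic_digest("AKRPGKTR")
```

Allowing one or two missed cleavages is a common setting when searching MS data, because digestion is rarely complete.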
Data Repository

 Database – a collection of data records either


in a single file or in multiple files.
 Database management system (DBMS) – a
software suite including a database, the
utilities required to organize it, search, update,
maintain data security and control access.
 Databases – flat file, relational, object-oriented.
2D Gel Databases

www.expasy.ch - Swiss-2DPAGE

http://www.anl.gov/BIO/PMG/ - Mouse liver, human breast cell
lines, Pyrococcus. Argonne Protein Mapping Group.

http://www.harefield.nthames.nhs.uk/nhli/protein/index.html - HSC-
2DPAGE, Heart Science Centre, Harefield Hospital

http://oto.wustl.edu/thc/peri-gels.htm - Washington Univ. Inner Ear


Protein Database

http://ca.expasy.org/ch2d/2d-index.html - World 2DPAGE, Index of


2D gel databases
Federated 2D PAGE database

 Described by Appel et al. (1996)


 Aimed to tackle (then) emerging
problems with 2D Gel databases:
 non-uniformity of data-encoding conventions
 robustness
 consistency
 commitment of groups to maintain the databases
and data quality
Federated 2D PAGE database

 Rules:
 Rule 1 – Individual entries in the database must be accessible by a keyword
search. Other methods are possible but not required.
 Rule 2 – The database must be linked to other databases by active hypertext
cross-references, linking together all related databases. Database entries
must be at least linked to the main index.
 Rule 3 – A main index has to be supplied that provides a means of querying
all databases through one unique query point. Currently, the main index is the
SWISS-PROT database.
 Rule 4 – Individual protein entries must be available through clickable images.
 Rule 5 – 2DE analysis software designed for use with federated databases
must be able to access individual entries in any federated 2DE database.

http://ca.expasy.org/ch2d/fed-rules.html
Swiss 2DPAGE
 Established in 1993
 Maintained by the Central Clinical Chemistry
Laboratory of the Geneva University Hospital
and the Swiss Institute of Bioinformatics.
 Entries highly annotated -
 containing textual data on proteins including:
 mapping procedure
 physiological and pathological information,

 experimental data (isoelectric point, molecular weight,


amino acid composition, peptide masses)
 bibliographical references.
Swiss 2DPAGE

 Entries are linked to images showing the


experimentally determined and theoretical
protein locations.
 Cross-references are provided to other
federated 2D-PAGE database entries, Medline
and SWISS-PROT
 Search via - clickable images
- keywords
Make2DDB

 Software package provided by ExPASy
 Allows for production of a 2D-PAGE
database on the user's server.
 Creates a database that is queryable via
description, accession or spot clicking.
 Provides links to Swiss-Prot.
Make2DDB databases
http://semele.anu.edu.au/2d/2d.html -
ANU 2D-PAGE, Australian National University 2D-PAGE database

http://babbage.csc.ucm.es/2d/2d.html -
COMPLUYEAST 2DPAGE, Saccharomyces cerevisiae 2D-PAGE database at
Universidad Complutense Madrid, Spain

http://www.gram.au.dk/ -
PHCI-2DPAGE, Parasite host cell interaction 2D-PAGE database.

http://www.bio-mol.unisi.it/2d/2d.html -
Siena 2D PAGE

A sample of 2D-PAGE databases created with make2ddb.


2D Gel Databases
 Limitations of current databases:
 Do not contain strict/detailed descriptions of protocol
(buffers, sample volume and staining techniques are all important
information for gel comparisons).
 Designed as 2D (and not proteomics) databases and
therefore not readily expandable to incorporate other
proteomics data, e.g. MS, MDLC.
 Designed for reference gels, not on-going projects.
Proteomics Database Schema

 What should it encompass?


 Proteomics methods (e.g. protein sample prep, electrophoresis
buffers, staining techniques, digestion for MS etc.).
 Results from each stage of the experiment (e.g. gel images,
MS data).
 Parameters used for MS data analysis/statistical results
 All stored in strict format.
 Note: MIAME and MAGE-ML
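The requirements above can be sketched as a small relational schema. This is an illustration only, not the schema developed in the project: every table and column name is hypothetical, SQLite is used simply because it ships with Python, and gel images are stored as file paths rather than in the database.

```python
import sqlite3

# Hypothetical schema: methods, per-stage results and MS analysis parameters.
SCHEMA = """
CREATE TABLE sample (
    sample_id   INTEGER PRIMARY KEY,
    organism    TEXT NOT NULL,
    preparation TEXT NOT NULL          -- sample prep protocol
);
CREATE TABLE gel (
    gel_id     INTEGER PRIMARY KEY,
    sample_id  INTEGER NOT NULL REFERENCES sample(sample_id),
    ipg_range  TEXT NOT NULL,          -- pH range of the IPG strip
    buffer     TEXT NOT NULL,          -- electrophoresis buffer
    stain      TEXT NOT NULL,          -- staining technique
    image_path TEXT NOT NULL           -- link to the stored gel image
);
CREATE TABLE spot (
    spot_id INTEGER PRIMARY KEY,
    gel_id  INTEGER NOT NULL REFERENCES gel(gel_id),
    x       INTEGER NOT NULL,
    y       INTEGER NOT NULL,
    pi      REAL,
    mw_kda  REAL
);
CREATE TABLE ms_result (
    ms_id             INTEGER PRIMARY KEY,
    spot_id           INTEGER NOT NULL REFERENCES spot(spot_id),
    protease          TEXT NOT NULL,   -- digestion used for MS
    search_parameters TEXT NOT NULL,   -- parameters of the MS data analysis
    accession         TEXT,
    score             REAL
);
"""


def create_database(path=":memory:"):
    """Create the tables and return an open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def spots_for_stain(conn, stain):
    """List spots from gels with a given stain, with any MS identification."""
    cursor = conn.execute(
        "SELECT spot.spot_id, gel.image_path, ms_result.accession "
        "FROM spot JOIN gel ON gel.gel_id = spot.gel_id "
        "LEFT JOIN ms_result ON ms_result.spot_id = spot.spot_id "
        "WHERE gel.stain = ? ORDER BY spot.spot_id",
        (stain,),
    )
    return cursor.fetchall()
```

The `LEFT JOIN` keeps spots that have not yet been identified by MS, which suits on-going projects where results arrive stage by stage.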
Database querying

 Interact via web interface using Perl/CGI


 Clickable gel images
 Text querying – for keywords, gel/spot
name, author, sequence etc.
 XML used for data exchange
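As an illustration of XML for data exchange, a spot record can be serialised and read back with the Python standard library. The element names below are hypothetical and do not follow MAGE-ML or any other published format; the lecture's own interface uses Perl/CGI, so Python is used here only for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields of a spot record; not taken from any standard.
FIELDS = ("gel", "pi", "mw_kda", "accession")


def spot_to_xml(spot):
    """Serialise a spot record (a dict) to an XML string."""
    root = ET.Element("spot", id=str(spot["id"]))
    for field in FIELDS:
        child = ET.SubElement(root, field)
        child.text = str(spot[field])
    return ET.tostring(root, encoding="unicode")


def spot_from_xml(text):
    """Parse the XML back into a dict; all values are returned as strings."""
    root = ET.fromstring(text)
    record = {"id": root.get("id")}
    for child in root:
        record[child.tag] = child.text
    return record


example = {"id": 7, "gel": "gel1", "pi": 5.6, "mw_kda": 42.0, "accession": "ACC0001"}
xml_text = spot_to_xml(example)
```

Note that XML carries no type information by itself, so numeric fields come back as text unless a schema is agreed between the exchanging parties.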
Proteomics Database Schema
Introduction to databases
 Flat file – simplest database type, an ordered
collection of data entries, analogous to how files would
be stored in a filing cabinet.
 Relational – more sophisticated, storing data in inter-related
tables. Allows for flexible querying using
Structured Query Language (SQL).
 Object-oriented – database consistent with object-oriented
principles, allowing for storage of complex
data types (e.g. multimedia) and querying beyond that
permitted by a rigidly defined query language.
DBMS choice

 A flat file database would contain many redundancies in


storing complex data types.
 An object-oriented database could intrinsically store
complex data types, e.g. large images; however, a relational
database could contain links to images stored elsewhere.
 SQL would provide a fast and easy way of querying and
updating the database.
 A relational database would provide a platform that is easily
expandable to accommodate additional forms of data.
Future

 Standard database schema for proteomics and mark-up language


for data exchange.
 Improved spot detection, quantification and gel warping
algorithms.
 Improved sample preparation techniques.
 More automation (linkage of robots!).
 Protein array technologies.
References
 Appel RD, et al., 1993 – SWISS-2DPAGE: a database of two-dimensional gel electrophoresis
images, Electrophoresis, 14, 1232-1238.

 Appel RD, Bairoch A, Sanchez JC, Vargas JR, Golaz O, Pasquali C and Hochstrasser
DF, 1996 – Federated two-dimensional electrophoresis database: a simple means of
publishing two-dimensional electrophoresis data, Electrophoresis, 17, 540-546.

 Bjellqvist B, Ek K, Righetti PG, Gianazza E, Gorg A, Westermeier R and Postel W, 1982 –
Isoelectric focusing in immobilised pH gradients: principle, methodology and applications,
J. Biochem. Biophys. Methods, 6, 317-339.

 Brazma A, et al., 2001 – Minimum information about a microarray experiment (MIAME):
towards standards for microarray data, Nat. Genet., 29, 365-371.

 Hoogland C, Baujard, Sanchez JC, Hochstrasser DF and Appel RD, 1997 – Make2ddb: a
simple package to set up a two-dimensional electrophoresis database for the world wide web,
Electrophoresis, 18, 2755-2758.

 O'Farrell, 1975 – High resolution two-dimensional electrophoresis of proteins, J. Biol. Chem.,
250, 4007-4021.
