Emerging Technologies in Protein and Genomic Material Analysis
Ebook · 554 pages · 5 hours


About this ebook

It is widely recognized that analytical technologies and techniques play a pioneering role in many of today's most challenging scientific endeavours, especially biological and biomedical research. Worthy of mention, for example, are the role that high-performance separation techniques played in mapping the human genome and the pioneering work done in mass spectrometry.

It is also apparent that state-of-the-art pharmaceutical and biomedical research is the major driving force behind the development of new analytical techniques. Advances in genomics research have provided new drug targets and created a demand for new technologies, which has sped up drug discovery and helped to counteract the trend towards inflation of R&D costs.

This book has been designed as a reference covering a wide range of protein and genomic material analysis techniques. Emerging developments are presented with applications where relevant, and biological examples are included. It was developed to meet the ever-growing need for a comprehensive and balanced text on a field that has generated tremendous interest in recent years.

In addition, this book also serves as a modern textbook for advanced undergraduate and graduate courses in various disciplines including chemistry, biology and pharmacy.

The authors of the individual chapters are recognized leaders in their respective research disciplines and represent major contemporary research centres in this field.

· Contains state-of-the-art knowledge of the field and detailed descriptions of new technologies
· Provides examples of relevant applications and case studies
· Contributing authors are leading scientists in their own respective research fields
Language: English
Release date: Jul 2, 2003
ISBN: 9780080530840

    Book preview

    Emerging Technologies in Protein and Genomic Material Analysis - Gyorgy Marko-Varga

    Chapter 1

    Enabling Bio-analytical Technologies for Protein and Genomic Material Analysis and their Impact on Biology

    György A. Marko-Varga (a) and Peter L. Oroszlan (b)

    (a) Department of Analytical Chemistry, Lund University, Lund, Sweden

    (b) Zeptosens AG, Witterswil, Switzerland

    1.1 BACKGROUND

    Although the human genome was made public by the end of June 2000, it was not the complete human map; some parts were still not readily available. Currently, the National Human Genome Research Institute plans to publish the finished draft of the human-genome sequence in a scientific journal by 2003. In the period between the first release of the human-genome sequences (June 2000) and today, the focus has turned towards the post-genomic area. High-capacity, high-performance technologies able to identify the expression of proteins currently form a research area attracting enormous attention. This is because the DNA code, ‘the holy grail’ of human life, is no further away than the nearest computer with an Internet connection: almost the entire genome can be searched. Yet these gene sequences account for only a limited percentage of the entire proteome present in cells, because, upon transcription and translation, proteins in many cases undergo a post-translational modification step.

    Proteomics has become the new ‘hot’ research field, in which protein expression has received enormous attention scientifically as well as from investors in the stock market [1]. Several hundred million dollars have already been invested by Pharma and Biotech companies, and the business is expected to grow to $5 billion by 2005 [2].

    Academia, the pharmaceutical industry and the biotech sector have directed their efforts towards completing the human proteome and establishing what it holds. This task is not only a major effort to fulfil, but also difficult to define. What exactly the proteome holds in terms of gene products is one aspect of mapping the entire human proteome. The other aspect is that the post-translational modifications occurring within the cell reflect a fundamental and very important process through which a large number of additional proteins is present. Post-translationally modified proteins may occur as many variants sharing a large part of their sequence; for example, enzymes often occur as a single protein sequence holding an inactive form in a state of rest, which becomes activated after modification. These activation mechanisms include phosphorylation, carbohydrate modification, or simply cleavage of a protein sequence from the pre-protein.
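    The mass shifts introduced by such modifications are what mass spectrometric workflows ultimately measure. The arithmetic can be sketched as follows; the residue masses and the +79.966 Da phosphorylation delta are standard monoisotopic values, while the peptide sequence itself is an arbitrary illustrative one, not taken from the text.

```python
# Sketch: monoisotopic mass of a peptide, with an optional phosphorylation.

RESIDUE_MASS = {  # monoisotopic residue masses, Da
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056     # H2O added for the free termini
PHOSPHO = 79.96633   # HPO3 mass shift for one phosphorylation

def peptide_mass(sequence: str, n_phospho: int = 0) -> float:
    """Monoisotopic mass of a peptide carrying n_phospho phospho groups."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_phospho * PHOSPHO

print(round(peptide_mass("SAMPLER"), 4))     # unmodified peptide
print(round(peptide_mass("SAMPLER", 1), 4))  # singly phosphorylated
```

    The constant +79.966 Da offset between the two printed masses is exactly the kind of signature used to recognize a phosphopeptide in a spectrum.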

    Many academic scientists argue that efforts to map the entire proteome are a matter of organisation. This illustrates the need for a single body to coordinate research efforts and advance human knowledge as a whole rather than according to corporate agendas. Unlike the genome project, where DNA sequencing technology propelled the work towards a single, easily understood goal, the proteomics field is driven by an array of emerging technologies with multiple goals. Scientists would ultimately like to identify all human proteins, determine the shapes they assume inside cells and understand their function by observing how they change in different circumstances. The initiative of starting an inventory of the protein content of cells was suggested in the 1980s [3].

    The Human Proteome Organization (HUPO) is currently trying to provide some help coordinating large proteome investigations. There is also agreement within HUPO that the focus of its work will be on mapping proteins within certain biological materials generated from human clinical studies, with cancer research of primary interest [4].

    The ‘Human Proteome Index’ (HPI) [5] is another initiative, from the founders of SwissProt. It is a human protein database to which all human protein sequences can be submitted. Because national and international funding of proteomics research is steadily increasing, sequence information should be stored in a central database accessible to everyone. The number of inactive protein sequences forms one part of the required information. Detailed biological information on the cell compartment from which a specific protein originates, and/or where it was found to be present, is also of central importance. The identification by itself that a protein is expressed, or found to be up- or down-regulated in a certain disease or in an in-vivo or in-vitro disease model, comprises only the first step in the proteomics process. Unless the expression can be linked to a biological function, the verification of the biological role will remain limited.

    Sequencing the genome is just the start in a long march towards an understanding of how humans grow and develop and also to the link with disease and disease evolvement. Genes are powerful, but ultimately we need to understand the proteins that carry out bodily functions.

    In this context, the evolving ‘new science’ and its various disciplines are expected to have a large impact also on the life science industry.

    1.2 INDUSTRIAL IMPACT

    State-of-the-art drug discovery and development, the foremost consumer of new analytical technologies, is driven largely by chemistry and high-throughput screening. Questions about the biological effect of a candidate drug (i.e. toxicity, efficacy) are addressed only at a rather late phase of the development process. In the contemporary pharmaceutical business, much effort is directed to reducing drug development times significantly in order to reach the market sooner. A rational, time-saving approach could contribute significantly to the short-term (i.e. fast product launch) as well as the long-term (i.e. broad coverage of the therapeutic area by accumulating relevant mechanistic and efficacy data) business success of a given pharmaceutical enterprise.

    The advent and use of new, emerging analytical technologies result also in new disciplines, e.g. pharmacogenomics/toxicogenomics, shifting that paradigm to a much earlier phase of the drug development process. In the course of drug development, clinical trials determine whether a drug is safe and effective, at what doses it works best and what side effects it causes. Early availability of accurate pharmacokinetic and pharmacodynamic parameters helps in the individualization of dosages based on concentration/effect variability data. Moreover, the availability of (i) pharmacodynamic or surrogate efficacy markers bearing high predictive value regarding the clinical endpoint of the disease and (ii) high performance, accurate and predictive methods to evaluate the effect of medicines could significantly contribute to shorten the time it takes to determine the most effective combinations of current and experimental drugs.

    1.3 PROTEOMICS—THE NEW FRONTIER AFTER THE HUMAN GENOME

    Proteomics is a collection of scientific approaches and technologies aimed at characterizing the protein content of cells, tissues, and whole organisms, i.e. the proteome (the proteins expressed as gene products from a given biological material at any given time). An important goal of proteomics studies is to understand the biological roles of specific proteins and to use this information to identify new key regulatory targets.

    1.3.1 Human proteome initiatives

    A handful of biotech companies such as Oxford Glycosciences, Proteome Systems, MDS-Proteomics, CuraGen, Large Scale Biology, Celera Genomics (which recently announced that it had become a pharmaceutical company) and others are racing to develop new tools to mine the proteome and complete sets of human, as well as microorganism, proteins for clues to human diseases.

    Many biotech companies originate from the genomics field and are trying to apply the powerful shotgun approach, i.e. high-speed, high-capacity gene sequencing technology, to the protein sequencing field.

    DNA molecules are considered ‘simple targets’ when compared with proteins. The long spiralling ladder of the DNA molecule, the famous double helix, is composed of just four basic constituents. Proteins, on the other hand, are highly complicated biomacromolecules that fold up into intricate and often unpredictable shapes utilizing 20 amino acid building blocks. This alone makes sequencing proteins a much harder task. In addition, the set of proteins that a cell uses changes constantly. Post-translational modification (PTM) occurs consistently, which makes the proteome a moving target, so the actual expression composition in the cell is a dynamic event.

    PTMs also keep the biological system in balance. A single modification such as a phosphorylation can create the difference between a biologically active target and its inactive latent form. Some proteins are modified in less than a minute, while others can persist in the cell for hours or even days. Ultimately, however, the major difference in approaching the protein expression profiling research area is that we lack a PCR-like technology for proteins with which we could amplify any sequence we wished. Until that ingenious discovery is made, which would surely be awarded an additional Nobel prize, we are left with two principal options: (1) develop novel technology that increases protein sequencing throughput as well as making major improvements in detection sensitivity, and (2) make use of larger amounts of biological starting material that allow the sequence information to be determined. In many medical situations the latter possibility is ruled out because, for example, the specific tissue material important for the disease is not available.

    Today, studies are performed according to two main strategic approaches: (i) global expression profiling, where a high number of proteins is readily sequenced (here, tens of thousands of proteins can be identified), and (ii) a focused approach, where a given organelle in the cell is of particular interest because of a biological function occurring in an event of particular importance, or where a specific protein known to play a central role in a specific disease is targeted. In this respect, both the target protein itself and the alteration of neighboring proteins during disease progression are of key importance.

    Additionally, many critical functions of the cell are accomplished by a complex cascade of events, e.g. in signalling pathways, in which proteomics is an efficient and powerful technique for extracting specific details of signalling mechanisms. This has been explored and described in several interesting presentations by both Cellzome and MDS-Proteomics [6–8].

    1.3.2 Multidimensional liquid chromatography

    Traditional protein expression profiling using two-dimensional gel electrophoresis (2DGE) is still very useful in most proteomics laboratories all over the world. The combination of these two separation techniques is still the best method for analyzing intact proteins. 2DGE technology was originally developed in the early 1970s by O’Farrell and Laemmli [9,10].

    At present, there seems to be a movement towards 2D or multidimensional liquid chromatography replacing many experimental studies where 2D-gels are currently used. Still, 2D-gels will continue to have a place even in the future because of their inherent resolving power in displaying intact proteins at an impressive sensitivity level. A very interesting comparison of the above-mentioned methods was a set of yeast studies carried out using both 2D-gels and 2D-LC. Gygi et al. [11] found the capillary LC approach superior to the gel technology, while Fey and Larsen, in a repeat of the experiment made at the Center for Proteome Analysis (CPA), Odense, Denmark, presented 2D-gel image mapping at an impressive level, with even lower codon bias values, which exceeded the capacity of the 2D-capillary LC approach [12]. It is probable that further examples will occur in the future where 2D-capillary LC expression profiles show performance superior to that of 2D-gels. It is still early days for running proteomics with liquid-phase separation. The quantitative aspect, utilizing isotope labeling reagents such as ICAT and others, should not be underestimated [12–15].

    When it comes to integrated protein workstations, the most powerful separation technology interfaced to ESI-MS/MS is certainly multidimensional HPLC, with which research groups have been able to generate thousands of protein identities within a limited time. An additional benefit is that all these steps are run in an automated mode operated with a high degree of reproducibility.

    1.3.3 Fast and sensitive protein sequencing by mass spectrometry

    For fast access to protein sequence information for verifying the identities of proteins and protein complexes, the technology of choice for high-sensitivity, high-speed protein identification is peptide mass sequencing by mass spectrometry with the various ionisation principles available [16]. In this volume, Grönborg and Noerregaard Jensen introduce new concepts whereby phosphoproteins can be analyzed. Post-translational modifications, such as phosphorylation stoichiometry determined using a combination of chromatographic enrichment strategies, are outlined. Currently, there is an explosion of novel principles by which protein sequencing by mass spectrometry is done. It is also anticipated that in the future more work will be focused on protein–protein interactions, since proteins mostly execute biological events as complexes.
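    The identification step that precedes such measurements, matching observed peptide masses against an in-silico digest of candidate proteins, can be sketched as below. The cleavage rule (after K or R, but not when the next residue is P) is the standard trypsin convention; the protein sequence is a hypothetical example, not one from the text.

```python
# Sketch: in-silico tryptic digest for peptide mass fingerprinting.

def tryptic_digest(protein: str) -> list[str]:
    """Split a protein sequence into tryptic peptides (no missed cleavages).

    Trypsin cleaves C-terminal to K or R, except when followed by P.
    """
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides

print(tryptic_digest("MKWVTFISLLFRPK"))  # note R-P bond is NOT cleaved
```

    Each resulting peptide mass is then compared with the measured spectrum; enough matches identify the protein.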

    This was highlighted when a recent milestone was achieved simultaneously by two groups using high-throughput protein–protein interaction sequencing. Both of the groups used yeast two-hybrid systems to map the interactions.

    This experimental exercise can efficiently be utilized as a shortcut to human disease: recent work by Gavin et al. [6] and Ho et al. [7], mapping protein–protein interactions in Saccharomyces cerevisiae, resulted in the identification of more than 10 000 protein associations.

    Protein–protein interactions are vital biological processes within the cell. No protein operates on its own; on the contrary, most seem to function within complicated cellular pathways, interacting with other proteins either in pairs or as components of larger complexes.

    Both groups addressed the analysis of protein–protein interactions by mainly liquid phase separations, applied with liquid chromatographic approaches interfaced on-line to MS/MS. Gavin et al. [6] made an alteration using liquid chromatography in combination with 1D-gels, applying accelerated techniques that rapidly identify the protein function and characterize multidimensional protein complexes. Both groups made use of techniques that genetically pair engineered tags with sensitive automated mass spectrometry to probe multi-protein complexes in cells of yeast. Both single- and double-tagging approaches were applied to the yeast two-hybrid systems; however, neither the advantages nor the disadvantages of the two approaches are clear, even today.

    Gavin et al. [6] used a gene-specific cassette containing a TAP tag at the 3′-end of the genes. The TAP tag is used as a tandem-affinity approach in the purification of proteins. Using this approach, 1739 genes were tagged and 1167 genes were expressed in yeast.

    In the method of Ho et al. [7], recombinant-based cloning was utilized to add a flag epitope tag to 725 yeast genes. These clones were then transfected into yeast. Interactions were discovered for roughly 70% of the clones, for a total of 1578 different interacting proteins, representing 25% of the yeast genome. However, it appears that many examples of expected interactions are absent from the data sets generated in these studies, which also show shortcomings from a yeast perspective [17]. The network of protein–protein interactions is made fully interactive through the database, and the software allows highly complex hypothesis testing [8]. It includes protein–protein interactions already known from the literature as well as newly identified interactions from experimental studies.
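    Conceptually, bait–prey identifications of the kind generated by Gavin et al. and Ho et al. are assembled into an undirected network that can then be queried for interaction partners. A minimal sketch follows; the protein names and pairs are hypothetical placeholders, not data from those studies.

```python
# Sketch: building a queryable protein-protein interaction network
# from a list of (bait, prey) identifications.

from collections import defaultdict

def build_network(bait_prey_pairs):
    """Return an undirected adjacency map: protein -> set of partners."""
    network = defaultdict(set)
    for bait, prey in bait_prey_pairs:
        network[bait].add(prey)
        network[prey].add(bait)  # interactions are symmetric
    return network

# Hypothetical bait-prey identifications:
pairs = [("Cdc28", "Cln2"), ("Cdc28", "Clb5"), ("Cln2", "Sic1")]
net = build_network(pairs)
print(sorted(net["Cdc28"]))  # → ['Clb5', 'Cln2']
```

    Hypothesis testing on such a network, e.g. asking whether two proteins share a partner, reduces to simple set operations on the adjacency map.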

    As a new development for improving the speed, accuracy and sensitivity of protein sequencing, the sequencing MALDI instrument, the ABI 4700 TOF/TOF workstation, is an example of a unique instrument for MS/MS sequence generation. The precursor ion is selected and fragmented in a high-energy collision-induced dissociation cell [18]. Ions are then re-accelerated and analysed by the second TOF analyser. This principle offers coaxial MS/MS capability from a standard MALDI source. The pulsed laser fires 200 times per second (200 Hz), allowing fast data acquisition. High sample throughputs can be achieved, at speeds of 400–1000 samples per hour.

    At present, new Fourier transform ion cyclotron resonance mass spectrometry (FTICR) principles are being developed whereby peptide sequences can be obtained at low attomole levels [19].

    In one chapter, Palmblad and Bergkvist describe FTICR developments for peptide and protein determination, with examples of peptide sequencing in various human biofluids.

    Today, researchers use mass spectrometry analysis to complement and enhance experimental results obtained by X-ray crystallography and NMR. Protein sequencing using mass spectrometry is a highly valuable technique in experimental protein structure studies. The analytical power of MS for studying proteins has improved significantly and is used routinely for both expression proteomics and functional proteomics studies.

    In addition, the use of computational modeling techniques to generate 3D structures in silico is also an important feature for understanding basic molecular interactions. In order to generate 3D models, these techniques rely on the existence of experimentally determined structures as well as primary amino acid sequences. It is also foreseen that computational approaches to protein structure determination will play a critical role in mining the vast library of gene sequence databases.

    1.4 PROTEIN CHIP ARRAYS

    With the capacity to generate a million or so antibodies recognizing every protein in the human proteome, new horizons can be envisioned. The next step would be to deposit and incubate a few microliters of biofluid from a patient or control on the chip. Some of the sample’s proteins will bind to antibodies on the chip and latch onto them. Because the proteins have been labeled with a fluorescent dye, they can be seen, with the help of a laser, as little glowing squares of red, green or yellow. The protein pattern that lights up will differ in a few critical ways from that of a normal person, as identified by differential display analysis. By identifying those discrepancies and finding the proteins that cause them, it is possible to rapidly identify proteins that are present in normal samples and not in diseased ones. This identification may open the way to a hypothesis in which the link to the cause of the disease, direct or indirect, is found by depositing a few drops of sample fluid onto a chip and scanning for protein changes indicating differential protein expression.
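    The differential display step described above amounts to comparing fluorescence intensities between a sample and a control and flagging proteins whose ratio exceeds some fold-change cutoff. The sketch below illustrates the idea; the intensity values, protein names and the 2-fold threshold are illustrative assumptions, not parameters from the text.

```python
# Sketch: flagging differentially expressed proteins from chip fluorescence.

import math

def differential_proteins(sample, control, fold_cutoff=2.0):
    """Return {protein: log2 ratio} for |log2(sample/control)| >= cutoff."""
    cutoff = math.log2(fold_cutoff)
    hits = {}
    for protein, s in sample.items():
        c = control.get(protein)
        if c:  # skip proteins absent from the control
            ratio = math.log2(s / c)
            if abs(ratio) >= cutoff:
                hits[protein] = round(ratio, 2)
    return hits

# Hypothetical fluorescence intensities (arbitrary units):
sample  = {"P1": 800.0, "P2": 210.0, "P3": 95.0}
control = {"P1": 190.0, "P2": 200.0, "P3": 400.0}
print(differential_proteins(sample, control))  # P1 up, P3 down, P2 unchanged
```

    Positive log2 ratios mark up-regulated proteins, negative ones down-regulated; proteins near a 1:1 ratio, like P2 here, are filtered out.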

    This obviously will be a major challenge and undertaking since proteins are far more complex than strands of DNA; it is not just a simple transformation from DNA chips to protein chips.

    Although the pace of research successes is impressive, the major breakthrough is not anticipated to happen overnight, owing, for example, to the difficulties encountered in producing high-quality specific antibodies against every variant of every protein in the human body.

    Phage display technology is being used to make antibody libraries, with appropriate clone selection used to obtain high-affinity immunoreagents.

    The big question is whether protein chip technologies will totally supplant the need for DNA chips, or rather be used as a strong complementary analytical tool.

    Recently, a novel protein chip technology allowing the high-throughput analysis of biochemical activities has been used to analyse nearly all of the protein kinases from Saccharomyces cerevisiae. Protein chips are disposable arrays of microwells in silicone elastomer sheets placed on top of microscope slides. The high density and small size of the wells allow high-throughput batch processing and simultaneous analysis of many individual samples. Only small amounts of protein are required. Of the known and predicted yeast protein kinases, somewhat more than a hundred were overexpressed and analyzed using 17 different substrates and protein chips. Many novel activities were found, and a large number of protein kinases proved capable of phosphorylating tyrosine. Additionally, the tyrosine-phosphorylating enzymes were often found to share common amino acid residues lying near the catalytic region [19].

    1.5 DNA/RNA ANALYSIS

    Faced with the task of analyzing thousands of genes at once, researchers routinely turn to DNA chips to analyze expression patterns in various biological materials of interest. The chips are one of the key innovations that have transformed genomics from a cottage industry into a large-scale automated business. Most genes serve as blueprints for the production of proteins, the workhorses of living cells. To understand diseases, it is essential to survey the activities and functions of these proteins.

    Large-scale screening exercises, in which microorganisms or human clinical material are investigated, are directed towards finding genetic disorders in which altered gene sequences in given chromosomes can be localized and linked to a specific disease. To this end, several companies have made major efforts to use extremely large sample sets, e.g. human biofluid banks, which hold information from a large population. In many cases, these bio-banks have a track record of samples over time. This means that it is possible to follow a patient from an early age, when the disease was diagnosed, to late-stage disease in adulthood. This makes these screening studies unique, since it is possible to follow the change of disease over time and thereby find the mechanisms involved in malfunctions.

    1.5.1 Pharmacogenomics

    Pharmacogenomics, the study of how the genetic code affects an individual's response to drugs, holds the promise that drugs can be adapted to each person’s own genetic makeup, as the key towards creating more powerful, personalized drugs with greater efficacy and safety. It requires the combination and understanding of traditional pharmaceutical sciences (e.g. biochemistry) with annotated knowledge of genes and proteins. Global and tailored gene expression analysis techniques such as genomic microarrays or gene chips have the potential to identify toxicology-related problems and simultaneously predict risks associated with exposure to the drug candidate. Moreover, new technologies allowing simultaneous multiplexed analysis of biological markers (efficacy and surrogate markers), such as protein microarrays, are expected to deliver functional information on drug response and efficacy. This should allow short studies in the very early discovery phase to predict the long-term effect of the exposure and later-phase clinical outcomes of a treatment. The integration and industrialization of these new techniques and disciplines represent an enormous potential leading to early selection of ‘good’ drug candidates by gaining information on the therapeutic index (i.e. the measure of the beneficial effect of a drug compared with its safety). Ideally, safety and efficacy should be studied simultaneously in animal models by global assessment of potential surrogate markers, resulting in increased efficiency and the prospect of a significant return to the industry.
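    The therapeutic index mentioned above is commonly defined as the ratio of the median toxic dose to the median effective dose (TD50/ED50); a quick sketch of that calculation follows. The dose values are hypothetical illustrative numbers, not data from the text.

```python
# Sketch: the therapeutic index as TD50 / ED50 (a common definition).

def therapeutic_index(td50: float, ed50: float) -> float:
    """Ratio of median toxic dose to median effective dose.

    Higher values indicate a wider safety margin between the dose
    that works and the dose that harms.
    """
    return td50 / ed50

# A hypothetical candidate toxic at 150 mg/kg but effective at 5 mg/kg:
print(therapeutic_index(150.0, 5.0))  # → 30.0
```

    In the early-selection scenario described in the text, candidates with a larger index would be preferred, all else being equal.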

    1.6 MINIATURIZED, MICRO-SCALE SYSTEMS

    Large-scale shotgun sequencing using a high-throughput multi-capillary-based separation approach contributed enormously to the speeding up of sequencing efforts in the framework of the human genome project and continues to be the dominant technology in sequencing. As the next quantum leap in this direction, Guttman, in this book, proposes the dissemination and broad utilization of microfabricated, capillary-based integrated fluidic systems, to be used as stand-alone devices or integrated elements in hyphenated systems with relevance in genomics/proteomics. An important application practiced at the industrial scale and with potential for miniaturization is described by Irth et al.—the combination of homogeneous, continuous flow multi-protein biochemical assays with characterization of the interacting partners using mass spectrometry, thus allowing the simultaneous screening and affinity characterization of chemical compounds as well as the molecular mass determination of biologically active compounds in screening mode. Also in this volume, Palm presents an overview of ‘Capillary Isoelectric Focusing Developments in Protein
