
The human genome project

By,
Anu S
Contents
• What is a genome?
• Brief introduction to human genome
• Why human genome project?
• Goals of human genome project
• Techniques involved in the human genome project
1. Clone-by-clone sequencing
2. Celera shotgun sequencing
• Role of bioinformatics in HGP
• Genes and their role in the body
• Ethical, Legal, and Social Implications
• Advantages and Disadvantages of
human genome project
• Conclusion
• References
What is a genome?

• The entire genetic makeup of the cell nucleus of any
organism is called a genome.

• Genes carry the information for making all of the proteins
required by the body for growth and maintenance.

• The genome also encodes rRNA and tRNA, which are
involved in protein synthesis.
The Human Genome

• Early estimates put the number of genes at ~35,000-50,000;
the finished sequence revised this to about 20,000-25,000
genes, which code for functional proteins in the body
• Includes non-coding sequences located
between genes, which make up the vast
majority of the DNA in the genome (~95%)
• The particular order of nucleotide bases
(As, Gs, Cs, and Ts) determines the amino
acid sequence of proteins
• Information about DNA variations
(polymorphisms) among individuals can
lend insight into new technologies for
diagnosing, treating, and preventing
diseases that afflict humankind
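How base order determines amino acid order can be illustrated with a minimal Python sketch. The short DNA sequences below are made up for illustration, and the codon table is only the small part of the standard genetic code that these sequences need; it is not a complete table.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "ATG": "Met", "GAG": "Glu", "GTG": "Val", "CAC": "His",
    "CTG": "Leu", "ACT": "Thr", "CCT": "Pro", "TAA": "Stop",
}

def translate(dna):
    """Read the coding strand three bases at a time and map each codon to an amino acid."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

# Hypothetical sequences: the variant differs from the normal one by a single base.
normal = "ATGGTGCACCTGACTCCTGAGGAGTAA"
variant = "ATGGTGCACCTGACTCCTGTGGAGTAA"  # A -> T change in the seventh codon

print(translate(normal))
print(translate(variant))
```

A single-base polymorphism changes one codon (GAG to GTG) and therefore one amino acid (Glu to Val), which is the kind of variation the last bullet refers to.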
History of the human genome project
• The Human Genome Project officially
started in October 1990
• The project was planned to run for
15 years
• The countries that took part in the
Human Genome Project were:
France, Germany, Japan, China,
the UK and the USA
• A working draft was completed in 2000
• The Human Genome Project was
completed in April 2003
Why the human genome project?
• Most inherited diseases are rare, but taken together,
the more than 3,000 disorders known to result from
single altered genes rob millions of healthy and
productive lives.
• Today, little can be done to treat, let alone cure, most
of these diseases. But having a gene in hand allows
scientists to study its structure and characterize the
molecular alterations, or mutations, that result in
disease.
• Progress in understanding the causes of cancer
• Gene mutations probably play a role in many of
today's most common diseases, such as heart
disease, diabetes, immune system disorders, and
birth defects.
• These diseases are believed to result from complex
interactions between genes and environmental factors.
• When genes for diseases have been identified, scientists
can study how specific environmental factors, such as
food, drugs, or pollutants interact with those genes.
What Goals Were Established for the Human
Genome Project When it Began in 1990?

• Identify all of the genes in human DNA.
• Determine the sequence of the 3 billion chemical
nucleotide bases that make up human DNA.
• Store this information in databases.
• Develop faster, more efficient sequencing technologies.
• Develop tools for data analysis.
• Address the ethical, legal, and social issues (ELSI) that
may arise from the project.
Techniques involved in the
human genome project
• DNA sequencing
• Restriction fragment length polymorphisms (RFLP)
• Yeast artificial chromosomes (YAC)
• Bacterial artificial chromosomes (BAC)
• The polymerase chain reaction (PCR)
• Electrophoresis
• Clone-by-clone sequencing
• Celera shotgun sequencing
DNA sequencing
• DNA sequencing, the process of determining the
exact order of the 3 billion chemical building blocks
(called bases and abbreviated A, T, C, and G) that
make up the DNA of the 24 different human
chromosomes, was the greatest technical
challenge in the Human Genome Project.
• Achieving this goal has helped reveal the
estimated 20,000-25,000 human genes within our
DNA as well as the regions controlling them.
• The resulting DNA sequence maps are being used
by 21st Century scientists to explore human biology
and other complex phenomena.
• Sequencing can be done by four methods:
1. Maxam and Gilbert method of sequencing
2. Sanger's method of sequencing
3. Pyrosequencing
4. Automated sequencing
Restriction fragment length
polymorphism
• Restriction fragment length polymorphisms (RFLPs)
were the first type of molecular markers used in
linkage studies.
• RFLPs arise because mutations can create or destroy
the sites recognized by specific restriction enzymes,
leading to variations between individuals in the length
of restriction fragments produced from identical
regions of the genome.
• Differences in the sizes of restriction fragments
between individuals can be detected by Southern
blotting with a probe specific for a region of DNA
known to contain an RFLP.
• The segregation and meiotic recombination of such
DNA polymorphisms can be followed like typical
genetic markers.
• RFLP analysis of a family can detect the segregation
of an RFLP that can be used to test for statistically
significant linkage to the allele for an inherited disease
or some other human trait of interest.
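The way a mutation changes restriction fragment lengths can be sketched in a few lines of Python. This is an illustration only: the two alleles are invented sequences, and the digest is simulated on a single strand using the EcoRI recognition site (GAATTC, cut after the first G).

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition site; the enzyme cuts between G and AATTC

def digest(dna, site=ECORI_SITE, cut_offset=1):
    """Return the fragment lengths produced by cutting dna at every occurrence of site."""
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    boundaries = [0] + cuts + [len(dna)]
    return [b - a for a, b in zip(boundaries, boundaries[1:])]

# Hypothetical alleles of the same region: allele_2 carries a point mutation
# (GAATTC -> GAGTTC) that destroys the first restriction site.
allele_1 = "ATCG" * 5 + "GAATTC" + "TTAGC" * 6 + "GAATTC" + "CCGTA" * 4
allele_2 = allele_1.replace("GAATTC", "GAGTTC", 1)

print(digest(allele_1))  # three fragments
print(digest(allele_2))  # two fragments, one of them longer
```

The two alleles give different fragment length patterns from the same region, which is what a Southern blot with a region-specific probe would reveal as different band sizes.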
Yeast artificial chromosome
• This method was first described in 1983 by Murray
and Szostak.
• A yeast artificial chromosome (YAC for short) is a
vector used to clone large DNA fragments (larger
than 100 kb and up to 3000 kb).
• It is an artificially constructed chromosome and
contains the telomeric, centromeric, and
replication origin sequences needed for
replication and preservation in yeast cells.
• Built using an initial circular plasmid, they are
linearised by using restriction enzymes, and then
DNA ligase can add a sequence or gene of
interest within the linear molecule by the use of
cohesive ends.
• Use of different regions of DNA in different YACs
allows the rapid determination of the sequence,
or order of the constituents, of the DNA.
Bacterial artificial chromosome
• A bacterial artificial chromosome (BAC) is a
DNA construct, based on a functional fertility plasmid
(or F-plasmid), used for transforming and cloning in
bacteria, usually E. coli.
• F-plasmids play a crucial role because they contain
partition genes that promote the even distribution of
plasmids after bacterial cell division.
• The bacterial artificial chromosome's usual insert size
is 150-350 kbp, but can be greater than 700 kbp.
• BACs are often used to sequence the genomes of
organisms in genome projects, for example the
Human Genome Project. A short piece of the
organism's DNA is amplified as an insert in BACs,
and then sequenced. Finally, the sequenced parts
are assembled in silico, resulting in the genomic
sequence of the organism.
Polymerase chain reaction
• Using the polymerase chain reaction (PCR), millions of
copies of a specific DNA segment can be made in a test
tube.
• PCR is also an automated process. Many physical
mapping strategies depend on creating an array of linear
DNA overlaps.
• Multiple copies of DNA fragments are needed to
complete the mapping process.
• PCR can be applied for forensic purposes as well.
• From a very tiny amount of DNA, the polymerase chain
reaction can be used to produce more copies of the DNA
for analysis.
• Most mapping techniques in the Human Genome Project
(HGP) rely on PCR.
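The claim that millions of copies can be made from a tiny amount of DNA follows from simple doubling arithmetic. The sketch below assumes an idealized reaction in which every template is copied once per cycle; the `efficiency` parameter is an illustrative assumption for less-than-perfect cycles, not a measured value.

```python
def copies_after(cycles, starting_molecules=1, efficiency=1.0):
    """Copies present after a number of cycles; efficiency 1.0 means perfect doubling."""
    return starting_molecules * (1 + efficiency) ** cycles

def cycles_needed(target_copies, starting_molecules=1, efficiency=1.0):
    """Smallest whole number of cycles that reaches target_copies."""
    cycles = 0
    copies = starting_molecules
    while copies < target_copies:
        copies *= 1 + efficiency
        cycles += 1
    return cycles

print(copies_after(20))          # copies from one molecule after 20 perfect cycles
print(cycles_needed(1_000_000))  # cycles needed to pass one million copies
```

Under perfect doubling, one starting molecule passes the million-copy mark after 20 cycles, since 2 to the power 20 is 1,048,576.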
Clone-by-clone sequencing
• When whole-genome sequencing of humans and other organisms was initiated in the
late 1980s, it was decided that large segments (clones) of genomic DNA
(produced by partial digestion) should first be aligned in linear order on the
chromosomes as overlapping segments, which could then be used as landmarks for
the sequencing data.
• The sequences of individual clones can thus be conveniently merged to obtain the
DNA sequence covering an entire chromosome. Large DNA segments are cloned in
BAC vectors, and these BACs are used to construct physical maps.
• Because the physical position of each clone on a chromosome is defined in the form of
ordered BACs, such clone-based maps were considered necessary and useful for
complete genome sequencing in the late 1980s and early 1990s, and were
therefore prepared for several animal and plant genomes.
• Using these clone-based maps, whole-genome sequencing was successfully
completed in several eukaryotes including yeast (S. cerevisiae), a nematode (C.
elegans) and a higher plant (Arabidopsis thaliana). Such clone-based maps also
contributed, though only partly, to the whole-genome sequencing of Drosophila
melanogaster, the mouse and humans.
• Once the BACs are physically mapped, the physical maps can be used
for whole-genome sequencing in the following steps:
• (i) BAC clones are selected from the whole-genome BAC map, using
suitable algorithms (software), so that the minimum number of BAC clones with
minimum overlap is used to cover the entire genome. This is often
described as selecting a minimum tiling path. In the case of the human genome,
10,000 to 20,000 BACs were selected to generate a working draft of the human
genome;
• (ii) BAC clones are used for subcloning, so that small inserts of a
manageable size for sequencing are available in cosmid or plasmid vectors
(DNA segments longer than 500-800 base pairs cannot be sequenced
directly in manual or automated sequencers).
• These subclones are subjected to shotgun (random) sequencing without
ordering them within the BAC clone, so many subclones are sequenced
to ensure that all parts of a BAC are covered.
• This approach has been used to sequence the genomes
of yeast and the nematode C. elegans, and also partly the genomes of the fruit fly,
mouse and humans.
• In this approach, every part of the genome is sequenced roughly 4-5 times
to ensure that no part of the genome is left out.
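Step (i), selecting a minimum tiling path, can be sketched as a greedy interval-cover problem. The code below is a toy illustration under stated assumptions: the clone names and map coordinates (in kb) are hypothetical, and the greedy rule shown here is one simple way to pick few clones, not necessarily the software used in the project.

```python
def minimum_tiling_path(clones, region_start, region_end):
    """Greedy interval cover: repeatedly take the clone that starts inside the
    already covered region and reaches furthest to the right."""
    path = []
    covered_to = region_start
    remaining = sorted(clones, key=lambda c: c[1])  # sort by start coordinate
    i = 0
    while covered_to < region_end:
        best = None
        # Scan clones that begin at or before the current right edge of coverage.
        while i < len(remaining) and remaining[i][1] <= covered_to:
            if best is None or remaining[i][2] > best[2]:
                best = remaining[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError(f"gap in clone map at position {covered_to}")
        path.append(best[0])
        covered_to = best[2]
    return path

# Hypothetical mapped BACs: (name, start_kb, end_kb)
clones = [
    ("BAC_A", 0, 150), ("BAC_B", 50, 200), ("BAC_C", 120, 300),
    ("BAC_D", 140, 260), ("BAC_E", 280, 450), ("BAC_F", 290, 400),
]
print(minimum_tiling_path(clones, 0, 450))
```

Here three of the six clones are enough to cover the 450 kb region; the others are redundant and need not be subcloned and sequenced.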
Celera shotgun sequencing
• Celera was founded in 1998 by Craig Venter, with the
mission to sequence the human genome and provide
clients with early access to the resulting data.
• Using state-of-the-art sequencing technology supplied by
Applied Biosystems and sophisticated internally developed
informatics, Celera pioneered the application of "shotgun"
sequencing.
• Whole-genome shotgun sequencing involves shearing or
cleavage (partial digestion) of genomic DNA followed by
cloning, to produce a genomic library.
• This is followed by sequencing of cloned DNA fragments at
random, followed by assembly of the fragment sequences
into larger units on the basis of their overlaps.
• This technique is described as shotgun assembly.
• This approach does not require any physical map of the
genome for whole-genome sequencing.
• Craig Venter also made use of publicly available hierarchical
shotgun DNA sequence data generated by the International Human
Genome Sequencing Consortium (IHGSC).
• The sequences were initially obtained in the form of 140 sequenced
contigs, each consisting of 2-20 overlapping clones and
representing different non-overlapping portions of the genome (a
contig is a set of contiguous overlapping clones, each contig having
two to more than 25 clones, and a singleton is a clone not
incorporated into any contig).
• The gaps between these contigs were filled later. For this purpose,
the genomic library was searched for singletons whose end
sequences match the ends of two different contigs. If
such a clone (singleton) is available, its sequence will fill the gap
between the two contigs. As many as 99 gaps were filled in this manner.
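Assembly of random fragments "on the basis of their overlaps" can be illustrated with a toy greedy assembler. This is a minimal sketch, not Celera's assembler: the reads are invented, they are error-free and all from one strand, and real assemblers must also handle repeats, sequencing errors and paired reads.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that matches a prefix of b."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_assemble(reads, min_length=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    olen = overlap(a, b, min_length)
                    if olen > best_len:
                        best_len, best_i, best_j = olen, i, j
        if best_len == 0:
            break  # no overlaps left: the remaining pieces are separate contigs
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Hypothetical reads sampled from one short stretch of DNA, in random order.
reads = ["GTTAGCCTAGGA", "ATGGCGTAC", "CGTACGTTAG"]
print(greedy_assemble(reads))
```

When every read overlaps a neighbour, the result is a single contig; reads with no overlap to anything stay separate, which corresponds to the gaps between contigs described above.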
Difference between the clone-by-clone and Celera
shotgun methods

| Clone-by-clone method | Celera shotgun method |
|---|---|
| Requires a physical map of the whole genome (the crude map) | Moves straight to the job of sequencing |
| Many copies of randomly cut genome fragments (150,000 bp) are taken | The genome is shredded into pieces (2,000 bp), and a second time into 10,000 bp pieces |
| These fragments are inserted into BACs and a library is constructed | These fragments are inserted into a suitable vector and a library is constructed |
| The DNA is fingerprinted to give each piece a unique identification | - |
| Each BAC is then randomly broken into 1,500 bp pieces, which are placed in another artificial piece of DNA called M13, and an M13 library is constructed | - |
| The M13 libraries are then sequenced | The 2,000 bp and 10,000 bp plasmid libraries are sequenced |
| These sequences are fed into a computer program called PHRAP that looks for common sequences | Computer algorithms assemble the sequenced fragments into continuous stretches resembling each chromosome |
| The above steps are repeated 4-5 times | The above steps are repeated 8-9 times |
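If the repeat counts in the last row are read as sequencing depth (each base covered about 4-5 or 8-9 times on average), a rough estimate of how much sequence is still missed can be sketched. The Poisson model used here is an assumption brought in for illustration (reads landing uniformly at random); the 3 billion base genome size comes from the goals listed earlier.

```python
import math

def fraction_unsequenced(coverage):
    """Expected fraction of bases hit by no read, assuming reads land
    uniformly at random (Poisson model)."""
    return math.exp(-coverage)

def expected_missing_bases(coverage, genome_size=3_000_000_000):
    """Expected number of bases left uncovered at the given average coverage."""
    return genome_size * fraction_unsequenced(coverage)

for depth in (4, 5, 8, 9):
    print(depth, round(expected_missing_bases(depth)))
```

Under this simplified model the uncovered fraction falls exponentially with depth, which is one way to see why random shotgun sequencing of a whole genome calls for more passes than sequencing within already mapped clones.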
Role of bioinformatics in HGP

• One of the key research areas was bioinformatics. Without the
annotation provided via bioinformatics, the information gleaned from
the HGP is not very useful.
• Informatics is the creation, development, and operation of databases
and other computing tools to collect, organize, and interpret data.
• Continued investment in current and new databases and analytical
tools is critical to the future usefulness of HGP data.
• Databases must adapt to the evolving needs of the scientific
community and must allow queries to be answered easily.
• Planners suggest developing a human genome database, analogous
to model organism databases, that will link to phenotypic information.
• Also needed are databases and analytical tools for studying the
expanding body of gene-expression and functional data, for
modeling complex biological networks and interactions, and for
collecting and analyzing sequence-variation data.
Genes and their role in the body

• F5:
• Position: 1q23
• Full name: coagulation factor V
• Role in the body:
1. Coagulation factor V is an essential component of the blood coagulation
cascade.
2. Blood coagulation is initiated either by trauma or by damage to blood vessels
and culminates in the conversion of a circulating protein called fibrinogen into its
derivative fibrin, the substance of blood clots.
3. Factor V co-operates with another coagulation factor, known as factor X, to
convert the inactive polypeptide prothrombin into the active enzyme thrombin.
4. This enzyme then converts fibrinogen into fibrin and allows blood clots to form.
5. Interestingly, factor V is also cleaved by thrombin so there is a positive feedback
loop between the two enzymes - blood clotting stimulates more blood clotting.
This amplifies the coagulation cascade and results in rapid clotting when
required.
• Role in disease:
• Defects in the F5 gene generally block the coagulation cascade and
result in prolonged bleeding, either externally or into body cavities.
• One particular class of mutation (factor V Leiden mutations) has the
opposite effect - these mutations predispose the patient to frequent
clotting events, manifesting as deep vein thrombosis.
• This is because factor V also helps to inhibit blood clotting (it acts
as an anticoagulant).
• It does this by interacting with another anticoagulant protein called
activated protein C (APC).
• Were it not for such regulation, blood clotting would run out of
control every time we suffered a minor injury.
• Leiden mutations in F5 specifically prevent interaction between
factor V and APC, and therefore affect its anticoagulant activity but
not its role in the coagulation pathway.
• RHO
• Position: 3q21-q24
• Full name: rhodopsin (opsin 2, rod pigment)
• Role in the body:
– Rhodopsin is a membrane-spanning protein expressed in the light-sensitive rod
cells (photoreceptor cells) of the retina.
– The protein is functional when it is chemically attached to another molecule
called retinal, which is derived from vitamin A.
– The fully assembled protein facilitates the perception of dim light.
• Role in disease:
– Rhodopsin is required for normal photoreceptor development.
– The absence of rhodopsin (or the presence of a defective rhodopsin) results in
retinal degeneration, a condition known as retinitis pigmentosa, which is a major
cause of blindness in developed countries.
– About 15 per cent of retinal degeneration in humans is caused by mutations in
the RHO gene.
– Retinal degeneration can be slowed by supplementing the diet with vitamin A, as
the presence of excess retinal may help to stabilize the protein.
• HD
• Position: 4p16
• Full name: Huntington's disease
• Role in the body:
– The HD gene is expressed widely in the body and produces two distinct mRNAs.
– The larger of the two transcripts is expressed preferentially in the brain and encodes a protein
called huntingtin.
– The precise role of the protein is unknown but it is associated with microtubules and synaptic
vesicles.
– Microtubules are components of the cytoskeleton that give structural stability to the cell and
facilitate the transport of molecules and other components between cell compartments, while
synaptic vesicles are required for communication between neurons.
– It is therefore possible that the protein is involved in the transport of substances from the cell
body to the synapses. The protein may also play a role in apoptosis (deliberately programmed
cell death).
• Role in disease:
– The HD gene first came to notice as a candidate for Huntington's disease, a
neurodegenerative disorder in which certain neurons are progressively destroyed, leading to
dementia.
– The mutation that causes the disease is not a point mutation or a deletion as might be
expected, but an expansion of a trinucleotide repeat.
– There is a series of repeats (in this case the sequence CAG) within the coding region of the
gene that can expand or contract from generation to generation.
– This produces huntingtin proteins with variable numbers of glutamine residues, a so-called
polyglutamine tract.
– Once the number of repeats exceeds 35, it becomes unstable and can increase rapidly in
subsequent generations.
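Counting the CAG repeats described above is straightforward to sketch in Python. The sequences below are invented for illustration (they are not the real HD gene sequence), and the threshold of 35 repeats is the figure given in the text.

```python
import re

def longest_cag_run(dna):
    """Number of CAG units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)

def exceeds_threshold(dna, threshold=35):
    """True when the longest CAG run is longer than the threshold."""
    return longest_cag_run(dna) > threshold

# Hypothetical alleles: the same made-up flanking sequence with 20 or 40 repeats.
typical_allele = "ATGGCG" + "CAG" * 20 + "CCGCCA"
expanded_allele = "ATGGCG" + "CAG" * 40 + "CCGCCA"

print(longest_cag_run(typical_allele), exceeds_threshold(typical_allele))
print(longest_cag_run(expanded_allele), exceeds_threshold(expanded_allele))
```

Because each CAG codon encodes glutamine, the repeat count found this way is also the length of the polyglutamine tract in the encoded protein.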
• XIST
• Full name: X(inactive)-specific transcript
• Role in the body:
– The XIST gene is unusual in that it encodes a functional RNA
molecule rather than a protein.
– Most genes are transcribed to produce mRNAs that serve as
templates for protein synthesis.
– In the case of the XIST gene, the RNA itself carries out a function
in the cell. The function of the XIST RNA is intriguing.
– It is expressed from one of the two X-chromosomes in female
cells just prior to inactivation.
– It appears to coat the active regions of the chromosome from
which it is expressed and promote histone modifications that
favour the formation of heterochromatin.
– This results in the transcriptional repression of large parts of the
chromosome.
• Role in disease:
– Rare females have been identified with multiple
congenital malformations and severe mental
retardation that appear to result from the presence of
a small X-chromosome derivative that lacks the XIST
gene and therefore cannot be inactivated.
– The multiple symptoms reflect the doubling in dosage
of a large number of X-linked genes.
– Other families have been identified in which there is
skewed inactivation of either the paternal or maternal
X-chromosome.
– It has been suggested that up to 18 per cent of
spontaneous abortions may result from skewed
X-inactivation.
• SRY
• Position: Yp11
• Full name: sex-determining region Y
• Role in the body:
– The product of the SRY gene is a transcription factor - a protein that controls
gene expression.
– It is also known as the testis-determining factor and is required to initiate male
development.
– Following SRY expression in the sex-neutral genital ridge of the embryo, other
transcription factors are synthesized.
– One of these is called steroidogenic factor 1 and is encoded by the NR5A1 gene
on chromosome 9. It helps to activate genes that facilitate the synthesis of male
sex hormones, such as anti-Mullerian hormone and testosterone.

• Role in disease:
– The absence of the SRY gene in XY individuals leads to complete gonadal
dysgenesis, producing adults with streaks of gonadal tissue where ovaries would
normally be found and a complete set of Mullerian ducts (fallopian tubes, uterus).
– The external appearance of such individuals is female.
– Translocation of SRY to the tip of the X chromosome results in male
development in XX individuals. However, other genes on the Y chromosome
are required for sperm development, so XX SRY males are generally sterile.
Ethical, Legal, and Social Implications
• Fairness in the use of genetic information by insurers, employers, courts, schools,
adoption agencies, and the military, among others
• Privacy and confidentiality of genetic information.
• Psychological impact and stigmatization due to an individual's genetic differences.
• Reproductive issues including adequate informed consent for complex and potentially
controversial procedures, use of genetic information in reproductive decision making,
and reproductive rights
• Clinical issues including the education of doctors and other health service providers,
patients, and the general public in genetic capabilities, scientific limitations, and social
risks; and implementation of standards and quality-control measures in testing
procedures.
• Uncertainties associated with gene tests for susceptibilities and complex conditions
(e.g., heart disease) linked to multiple genes and gene-environment interactions
• Conceptual and philosophical implications regarding human responsibility, free will vs
genetic determinism, and concepts of health and disease.
• Health and environmental issues concerning genetically modified foods (GM) and
microbes.
• Commercialization of products including property rights (patents, copyrights, and
trade secrets) and accessibility of data and materials.
Advantages and Disadvantages of
human genome project
• Advantages:
– improving our knowledge of gene expression,
– elucidating the function of the large proportion of DNA we know
little about
– discovering possible means of diagnosis for some genetic
diseases,
– discovering possible treatments for currently untreatable genetic
diseases
– discovering new tools and techniques for genetic research,
– generating the ability to go directly from a trait to a gene,
– identifying genetically validated therapeutic targets which would
improve the cost-benefit ratio in pharmaceutical discovery,
– investigating the development of drug resistance in bacteria,
– investigating antigenic variation and host-parasite interaction at
both the host and parasite level
• Disadvantages:
– the cost: the money could be spent elsewhere,
– the anguish resulting from knowing that a person has an untreatable
genetic disease,
– the use or misuse of genetic information by such organisations as
insurance companies and employers,
– the ownership of genetic test results,
– the patenting of human genes and DNA,
– the increasing gap between rich and poor countries in the quality of
life and the level of health and disease treatment,
– the exploitation of isolated populations in the search for disease
genes,
– the ethics of accumulating genotypic profiles of people: can they
be used for anything that the researcher wants?
– decisions about the ownership of data by 'affected' or donor
individuals,
– the ethics of germline gene therapy,
– the ethics of somatic gene therapy,
– the costs of genetic treatment versus benefit to the community.
Conclusion
• Human Genome Project research will help solve one of
the greatest mysteries of life:
• How does one fertilized egg "know" to give rise to so
many different specialized cells, such as those making
up muscles, brain, heart, eyes, skin, blood, and so on?
For a human being or any organism to develop normally,
a specific gene or sets of genes must be switched on in
the right place in the body at exactly the right moment in
development.
• Information generated by the Human Genome Project
will shed light on how this intimate dance of gene activity
is choreographed into the wide variety of organs and
tissues that make up a human being.
References
• Biotechnology by Clark
• https://www.celera.com
• http://www.genome.gov
• http://www.accessexcellence.org/RC/AB/IE/Intro_The
• http://www.ornl.gov/sci/techresources/Human_Genom
• http://genome.wellcome.ac.uk/doc_WTD022307.htm
• http://customglassanddoors.com/index.php?key=gen
