DNA PROFILING

Overview
DNA Analyst Bob Blackett has graciously provided The Biology Project
with sample data from his own work. In this activity, you will learn
the concepts and techniques behind DNA profiling of the 13 core CODIS
"Short Tandem Repeat" (STR) loci used for the national DNA databank.
You will then have the opportunity to collect and interpret actual STR
data, and to answer one or more of the following questions:
1. How is STR data used in a DNA Paternity Test?
2. How can STR data from close relatives be used to create a genetic profile of a missing person?
3. How much genetic diversity exists among siblings?
4. How does one calculate the probability for a specific DNA profile?

Alternatively, you may wish to create your own activities, based on
some suggestions for open-ended inquiry that are offered below.

This activity is aimed at students with a basic knowledge of DNA
structure, Mendelian genetics, and human pedigree analysis.

The Science of STR DNA Profile Analysis
What is a Short Tandem Repeat Polymorphism (STR)?
What are the 13 core CODIS loci?
Methods of Analysis of STRs
Genetics of STR Inheritance
DNA Profile Frequency Calculations

Structured Inquiry Activities for Students
Some representative activities involving data collection,
interpretation, and analysis using Bob Blackett's data.
Create a Blackett Family Pedigree
Collecting STR DNA profile data
Paternity Testing with STR Data
DNA Profile of a "Missing Person"
DNA Profile Frequency Calculations

Open Ended Activities for Students
Here we suggest some starting points for in-depth exploration of the
topic of STR DNA profiling.

What is a Short Tandem Repeat Polymorphism (STR)?

STR Polymorphisms
Most of our DNA is identical to DNA of others. However, there are
inherited regions of our DNA that can vary from person to person.
Variations in DNA sequence between individuals are termed
"polymorphisms". As we will discover in this activity, sequences with
the highest degree of polymorphism are very useful for DNA analysis in
forensics cases and paternity testing. This activity is based on
analyzing the inheritance of a class of DNA polymorphisms known as
"Short Tandem Repeats", or simply STRs.

STRs are short sequences of DNA, normally of length 2-5 base pairs,
that are repeated numerous times in a head-tail manner. For example,
the 16 bp sequence "gatagatagatagata" represents 4 head-tail copies of
the tetramer "gata". The polymorphisms in STRs are due to the different
number of copies of the repeat element that can occur in a population
of individuals.

D7S820
D7S820 is one of the 13 core CODIS STR genetic loci. This DNA is found
on human chromosome 7. The DNA sequence of a representative allele of
this locus is shown below. This sequence comes from GenBank, a public
DNA database. The tetrameric repeat sequence of D7S820 is "gata".
Different alleles of this locus have from 6 to 15 tandem repeats of the
"gata" sequence. How many tetrameric repeats are present in the DNA
sequence shown below? Notice that one of the tetrameric sequences is
"gaca", rather than "gata".
  1 aatttttgta ttttttttag agacggggtt tcaccatgtt ggtcaggctg actatggagt
 61 tattttaagg ttaatatata taaagggtat gatagaacac ttgtcatagt ttagaacgaa
121 ctaacgatag atagatagat agatagatag atagatagat agatagatag atagacagat
181 tgatagtttt tttttatctc actaaatagt ctatagtaaa catttaatta ccaatatttg
241 gtgcaattct gtcaatgagg ataaatgtgg aatcgttata attcttaaga atatatattc
301 cctctgagtt tttgatacct cagattttaa ggcc
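
One way to check your count is with a short script. The sketch below is only an illustration, not part of the original activity: it copies the sequence listed above, and the helper name longest_tandem_run is our own. It reports the longest run of back-to-back tetramers twice, once counting only "gata" and once also accepting the "gaca" variant.

```python
import re

# Sequence of the representative allele as listed above (spaces removed).
SEQUENCE = "".join("""
aatttttgta ttttttttag agacggggtt tcaccatgtt ggtcaggctg actatggagt
tattttaagg ttaatatata taaagggtat gatagaacac ttgtcatagt ttagaacgaa
ctaacgatag atagatagat agatagatag atagatagat agatagatag atagacagat
tgatagtttt tttttatctc actaaatagt ctatagtaaa catttaatta ccaatatttg
gtgcaattct gtcaatgagg ataaatgtgg aatcgttata attcttaaga atatatattc
cctctgagtt tttgatacct cagattttaa ggcc
""".split())


def longest_tandem_run(sequence, units):
    """Return the largest number of back-to-back copies of the given repeat units.

    All units are assumed to have the same length (4 bp for a tetramer).
    """
    pattern = "(?:" + "|".join(re.escape(unit) for unit in units) + ")+"
    unit_length = len(units[0])
    runs = [len(m.group(0)) // unit_length for m in re.finditer(pattern, sequence)]
    return max(runs, default=0)


strict_count = longest_tandem_run(SEQUENCE, ["gata"])           # "gata" only
relaxed_count = longest_tandem_run(SEQUENCE, ["gata", "gaca"])  # also accept "gaca"
print(strict_count, relaxed_count)
```

With the sequence as listed, the script finds 12 consecutive copies of "gata", or 13 tetramers when the single "gaca" variant is counted as part of the run.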

What are the 13 core CODIS loci?

A National DNA Databank
The Federal Bureau of Investigation (FBI) of the US has been a leader
in developing DNA typing technology for use in the identification of
perpetrators of violent crime. In 1997, the FBI announced the selection
of 13 STR loci to constitute the core of the United States national
database, CODIS. All CODIS STRs are tetrameric repeat sequences. All
forensic laboratories that use the CODIS system can contribute to a
national database. DNA analysts like Bob Blackett can also attempt to
match the DNA profile of crime scene evidence to DNA profiles already
in the database.

There are many advantages to the CODIS STR system:
The CODIS system has been widely adopted by forensic DNA analysts.
STR alleles can be rapidly determined using commercially available kits.
STR alleles are discrete, and behave according to known principles of population genetics.
The data are digital, and therefore ideally suited for computer databases.
Laboratories worldwide are contributing to the analysis of STR allele frequency in different human populations.
STR profiles can be determined with very small amounts of DNA.

A DNA Profile: The 13 CODIS STR loci
As part of his training and proficiency testing for DNA Profile
analysis of STR (Short Tandem Repeat) Polymorphisms, Forensic Scientist
and DNA Analyst Bob Blackett created a DNA profile on his own DNA. Here
is Bob's DNA Profile for the 13 core Genetic Loci of the United States
national database, CODIS (Combined DNA Index System):

Locus      D3S1358  vWA      FGA      D8S1179  D21S11
Genotype   15, 18   16, 16   19, 24   12, 13   29, 31
Frequency  8.2%     4.4%     1.7%     9.9%     2.3%

Locus      D13S317  D7S820   D16S539  THO1     TPOX
Genotype   11, 11   10, 10   11, 11   9, 9.3   8, 8
Frequency  1.2%     6.3%     9.5%     9.6%     3.52%

For each genetic locus, Bob has determined his "genotype", and the
expected frequency of his genotype at each locus in a representative
population sample. For example, at the genetic locus known as
"D3S1358", Bob has the genotype of "15, 18". This genotype is shared by
about 8.2% of the population. By combining the frequency information
for all 13 CODIS loci, Bob can calculate that the frequency of his
profile would be 1 in 7.7 quadrillion Caucasians (1 in 7.7 times 10 to
the 15th power).

In Bob's forensic DNA work, he often compares the DNA profile of
biological evidence from a crime scene with a known reference sample
from a victim or suspect. If any two samples have matching genotypes at
all 13 CODIS loci, it is a virtual certainty that the two DNA samples
came from the same individual (or an identical twin).

Methods of Analysis of STRs
We will assume that you have a basic understanding of the Polymerase
Chain Reaction (PCR), and gel electrophoresis, especially as applied to
DNA sequence analysis. We will focus here on the special features of
PCR and gel electrophoresis as they are applied to STR
characterization. If you are unfamiliar with these techniques, you
should still be able to complete this activity.

Methods in Analysis of the 13 CODIS STR loci

1. DNA extraction
DNA can be extracted from almost any human tissue. Buccal cells from
the inside of the cheek are most commonly used for paternity tests.
Sources of DNA found at a crime scene might include blood, semen,
tissue from a deceased victim, cells in a hair follicle, and even
saliva. DNA extracted from items of evidence is compared to DNA
extracted from reference samples from known individuals.

2. PCR Amplification
DNA primers have been optimized to allow amplification of multiple STR
loci in a single reaction mixture. Because the distance of each primer
pair from the tetrameric repeat sequence is carefully adjusted, the
products from different loci do not overlap during gel electrophoresis.

In the partial results shown above, the three STRs D3S1358, vWA, and
FGA are being analyzed simultaneously. The lengths of the amplified
DNAs are shown by the scale from 100 bp to 280 bp at the top of the
figure. The middle panels with multiple peaks are reference standards
with the known alleles for each STR locus. Notice that the alleles for
the three different loci do not overlap. The lower panel shows the
alleles for Bob Blackett's mother Norma for the D3S1358, vWA, and FGA
loci. Norma's alleles have been compared by computer to the reference
standards, and labeled. To interpret this result, Norma's genotype is
15, 15 at the locus D3S1358, 14, 16 at vWA, and 24, 25 at FGA.

3. Detection of DNAs after PCR Amplification
The PCR primers in the commercial kits used for STR analysis have
fluorescent molecules covalently linked to the primer. To extend the
number of different loci that can be analyzed in a single PCR reaction,
multiple sets of primers with different "color" fluorescent labels are
used. Following the PCR reaction, internal DNA length standards are
added to the reaction mixture and the DNAs are separated by length in a
capillary gel electrophoresis machine. As DNA peaks elute from the gel
they are detected with laser activation. The sequencing machines used
for allele separation and detection are the same type currently being
used in the Human Genome Sequencing project, with digital output that
can be analyzed by special computer software.

In the AmpFLSTR Profiler Plus PCR Amplification Kit from Applied
Biosystems used by Bob Blackett, 9 STRs are analyzed by using three
sets of primers. Each set has a different colored fluorescent label. In
the figure above, three sets of STRs are represented by blue, three by
green, and three by yellow (shown as black) fluorescent peaks. The red
peaks are the DNA size standards. Special computer software is used to
display the different colors as separate panels of data and determine
the exact length of the DNAs. A tenth marker called AMEL is used to
distinguish male DNA as X, Y or female DNA as X, X.

A second kit, called Cofiler Plus, is used in a second PCR reaction to
amplify 4 additional STR loci, plus repeat some of the loci from the
Profiler Kit. The result from 2 PCR reactions is the analysis of the
entire CODIS set of 13 STRs, with overlap of some loci, and a test for
the sex chromosomes. The results are obtained as discrete, digital
alleles determined from the exact size of the amplified products
compared to known standards.
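
The idea of turning a measured fragment length into a discrete allele can be sketched in a few lines of code. This is only an illustration under stated assumptions: the ladder sizes, the 0.5 bp tolerance, and the function name call_allele are all hypothetical and are not taken from any commercial kit or from the analysis software described above.

```python
# Hypothetical allelic ladder for one locus: allele name -> fragment length in bp.
# These sizes are invented for illustration only.
HYPOTHETICAL_LADDER = {14: 120.0, 15: 124.0, 16: 128.0, 17: 132.0}


def call_allele(measured_bp, ladder, tolerance_bp=0.5):
    """Return the ladder allele closest to measured_bp, or None if no ladder
    allele lies within tolerance_bp of the measured length."""
    allele, size = min(ladder.items(), key=lambda item: abs(item[1] - measured_bp))
    if abs(size - measured_bp) <= tolerance_bp:
        return allele
    return None


# Two hypothetical peaks from one sample at this locus.
peaks = [124.2, 131.8]
genotype = [call_allele(peak, HYPOTHETICAL_LADDER) for peak in peaks]
print(genotype)
```

Because tetrameric alleles differ by 4 bp, a small tolerance is enough to assign each peak to one ladder allele, which is why the results can be treated as discrete, digital values.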

Genetics of STR Inheritance
Since there are no phenotypes associated with the CODIS STR loci,
understanding the genetics of STR inheritance is simplified compared to
other genetic problems. We need only consider the genotypes of the
parents and their offspring. The alleles of different STR loci are
inherited like any other Mendelian genetic markers. Diploid parents
each pass on one of their two alleles to their offspring.

Here is a brief review of the genetic concepts and terms important for
understanding STR allele inheritance. For an in-depth tutorial, see
Monohybrid Cross problems.

Allele. The different forms of a gene. Different STR repeat lengths
represent different alleles at a genetic locus, e.g. 8 and 9 are
different alleles of the THO1 locus.

Locus. The position on a specific chromosome where the different
alleles of a genetic marker are located. The plural is loci.

Monohybrid Cross. Genetic cross involving parents differing in only one
trait. Inheritance of each of the 13 STR loci can be treated as a
separate Monohybrid Cross.

Genotype. The genetic composition of the alleles at a locus. Since we
are diploid, we each have two alleles at each locus.

Homozygous. Both alleles at a locus are the same, e.g. Fred has a
genotype of 29, 29 at the D21S11 locus.

Heterozygous. Alleles at a locus are not the same, e.g. Norma has a
genotype of 29, 31 at the D21S11 locus.

Multiple Allelic Series. Many different alleles at a locus, e.g. the
known alleles at the vWA locus are 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, and 21.

Punnett Square. A diagram used to determine all possible genotypes that
can occur in a genetic cross. All of the diagrams on this page are
Punnett Squares.

Here are some examples of how STR data can be interpreted in a family
DNA study. The numbers outside the Punnett Squares are the parental
alleles that can be present in the egg or sperm of the parents. The
numbers inside the squares are the genotypes possible for the resulting
children.

Case 1
If the genotypes of both parents are known, we use a Punnett Square to
predict the possible genotypes of their offspring. Each child inherits
one allele of a given locus from each parent. Panel (a) - At the D21S11
locus, the children of Bob Blackett and wife Anne can have four
different genotypes. Son David is 28, 31. Daughter Katie is 29, 30.
Panel (b) - Bob Blackett inherited the 31 allele from his mother,
Norma. Therefore the 29 allele is paternal. If Bob's paternal allele
was not 29, what would be your conclusion?
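
The Punnett Square in this case can also be written as a small program. This is a minimal sketch: the function name punnett_square is our own, Bob's genotype of 29, 31 is taken from his profile, and Anne's genotype of 28, 30 is an assumption inferred from the alleles that David and Katie cannot have received from Bob.

```python
from itertools import product


def punnett_square(parent_a, parent_b):
    """Return the set of possible child genotypes, each as a sorted allele pair,
    formed by taking one allele from each parent."""
    return {tuple(sorted(pair)) for pair in product(parent_a, parent_b)}


bob = (29, 31)   # from Bob's profile at D21S11
anne = (28, 30)  # assumption: inferred from the children's genotypes
possible_children = punnett_square(bob, anne)
print(sorted(possible_children))
```

Both David (28, 31) and Katie (29, 30) appear among the four possible genotypes, which is what we expect if the stated parentage is correct.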

Case 2
If the genotypes of a mother and several children are known, it is
often possible to unambiguously predict the genotype of the father. In
this case, Karen is the mother with a genotype of 9, 9.3 at the THO1
locus. From the Punnett Square we can determine that the paternal
alleles of Tiffany, Melissa, and Amanda are 8, 9.3, and 9.3,
respectively. Therefore, their father Steve must have a genotype of
8, 9.3. If the three daughters had three different paternal alleles,
what would be your conclusion?

Case 3
Sometimes only one allele of the father can be predicted when the
genotypes of a mother and several children are known. In this example,
the genotype of Karen, the mother, is 16, 17 at the D18S51 locus. The
genotypes of the daughters are either 16, 18 or 17, 18. In each case,
Melissa, Tiffany, and Amanda inherited the 18 allele from their father,
Steve. We cannot determine if the genotype of Steve is homozygous,
18, 18, or 18, ? where the ? means any other allele.
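
The reasoning in this case can be sketched as code. The helper name possible_paternal_alleles is hypothetical. It assumes the stated mother really is the mother, so that one allele of each child comes from her and the other must come from the father.

```python
def possible_paternal_alleles(mother, child):
    """Return the alleles of the child that could have come from the father,
    given that the other allele must be one the mother carries."""
    first, second = child
    candidates = set()
    if second in mother:   # mother could have supplied `second`
        candidates.add(first)
    if first in mother:    # mother could have supplied `first`
        candidates.add(second)
    return candidates


karen = (16, 17)                      # mother's genotype at D18S51
daughter_genotypes = [(16, 18), (17, 18)]

# Alleles the father must carry: those that are the only paternal option for some child.
obligate_paternal = set()
for child in daughter_genotypes:
    candidates = possible_paternal_alleles(karen, child)
    if len(candidates) == 1:
        obligate_paternal |= candidates

print(obligate_paternal)
```

Only one obligate paternal allele is recovered here, so the father's second allele stays unknown, as the text concludes. An empty candidate set for any child would mean the child's genotype is not consistent with the stated mother.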
18
&ase :
s it possible to determine parental genotypes
when only the genotypes of their children are
known- %onsider the case of Bob Blackett3s 0
first coursins! 2arilyn! Buddy! Dick and Ianet.
Bob did not have DNA samples for their
parents! Bud and Aouise! who are both
deceased. n a real forensic case! Bud and
Aouise might represent (missing persons(. n
panel =a> we can arrange the $ known genotypes
of the 0 children. n panel =b> we predict the
only two paternal genotypes for the parents that
can account for the children. Note that we
cannot determine which genotype goes with
which parent.
19
&ase 9
A variation on %ase 0 is when there are only two
genotypes known for the children! and both
parental genotypes must be predicted. Panel =a>
1 2arilyn and Ianet are #6! #7 at the locus
D$'#$69. Buddy and Dick are #9! #9. Panel =b>
1 The only parental genotypes that can give this
result are #6! #9 and #7! #9. &nce again! we
cannot predict which parent as which genotype.

Case 6
Sometimes the parental genotypes cannot be predicted unambiguously from
the genotypes of their children. Marilyn is 16, 17 at the locus vWA.
Buddy, Dick, and Janet are 16, 18. What are the parental genotypes?
Panel (a) - One interpretation is that the parents are 16, 18 and
16, 17. Panel (b) - Another possibility is that one parent is 17, 18
and the other is 16, ?, where ? is any allele.
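
Cases 5 and 6 can be explored by brute force. The sketch below is an illustration with hypothetical function names. It tries every pair of parental genotypes built from the alleles seen in the children and keeps the pairs that can account for every child. As a simplifying assumption it ignores alleles that no child carries, so an unknown "?" allele is only represented by the alleles already observed.

```python
from itertools import combinations_with_replacement


def can_produce(parent_a, parent_b, child):
    """True if the child could receive one allele from each parent."""
    first, second = child
    return (first in parent_a and second in parent_b) or (
        second in parent_a and first in parent_b
    )


def candidate_parent_pairs(children):
    """List unordered pairs of parental genotypes consistent with all children,
    using only the alleles observed in the children."""
    alleles = sorted({allele for child in children for allele in child})
    genotypes = list(combinations_with_replacement(alleles, 2))
    return [
        (a, b)
        for a, b in combinations_with_replacement(genotypes, 2)
        if all(can_produce(a, b, child) for child in children)
    ]


case5 = candidate_parent_pairs([(15, 16), (18, 18)])  # D3S1358
case6 = candidate_parent_pairs([(16, 17), (16, 18)])  # vWA
print(case5)
print(case6)
```

For Case 5 a single pair of parental genotypes survives, while Case 6 leaves several candidates, which matches the distinction drawn in the text between an unambiguous and an ambiguous prediction.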

DNA Profile Frequency Calculations

Genotype Probability at any STR Locus
Part of the work of forensic DNA analysis is the creation of population
databases for the STR loci studied.

Probability calculations are based on knowing allele frequencies for
each STR locus for a representative human population (and showing
Hardy-Weinberg equilibrium for the population by statistical tests).

Allele frequency is defined as the number of copies of the allele in a
population divided by the sum of all alleles in a population.

For a heterozygous individual, if the two alleles have frequencies of p
and q in a population, the probability (P) of an individual having both
alleles at a single locus is

P = 2pq

If an individual is homozygous for an allele with a frequency of p, the
probability (P) of the genotype is

P = p^2

We saw earlier that Bob Blackett has the genotype 15, 18 at the locus
D3S1358. In a reference database of 200 U.S. Caucasians, the frequency
of the alleles 15 and 18 was 0.2825 and 0.1450, respectively. The
frequency of the 15, 18 genotype is therefore

P = 2 (0.2825) (0.1450) = 0.0819, or 8.2%.
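
The two formulas can be combined into one small function. This is a minimal sketch with a hypothetical function name, assuming Hardy-Weinberg equilibrium as stated above. The allele frequencies are the two D3S1358 values quoted in the text.

```python
def genotype_frequency(allele_frequencies, genotype):
    """Expected genotype frequency under Hardy-Weinberg equilibrium:
    p^2 for a homozygote, 2pq for a heterozygote."""
    first, second = genotype
    p = allele_frequencies[first]
    if first == second:
        return p * p
    q = allele_frequencies[second]
    return 2 * p * q


# Allele frequencies at D3S1358 quoted in the text.
d3s1358 = {15: 0.2825, 18: 0.1450}
bob_frequency = genotype_frequency(d3s1358, (15, 18))
print(f"{bob_frequency:.4f}")
```

The same function handles a homozygous genotype such as 18, 18 by returning the square of the single allele frequency.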

Probability for a DNA profile of Multiple Loci
If the different loci can be shown by appropriate statistical tests to
be independently inherited, the probability for the combined genotype
can be determined by multiplication (the product rule).

The probability (P) for a DNA profile is the product of the
probabilities (P_1, P_2, ... P_n) for each individual locus, i.e.

Profile Probability = (P_1) (P_2) ... (P_n)

The probability can be an extremely low number when all 13 CODIS STR
markers are included in the DNA profile. As mentioned earlier, Bob
Blackett calculated his own profile probability at 1.3 times 10^-16, or
no more frequent than 1 in 7.7 quadrillion individuals (7.7 million
billion), which is more than a million times the population of the
planet.
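
The product rule is easy to try out in code. The sketch below uses a hypothetical function name and multiplies the ten genotype frequencies that appear in Bob's profile table earlier in this activity. Because it covers ten loci rather than all 13, it is not expected to reproduce the 1.3 times 10^-16 figure.

```python
import math

# Genotype frequencies for the ten loci listed in Bob's profile table.
LOCUS_FREQUENCIES = {
    "D3S1358": 0.082,
    "vWA": 0.044,
    "FGA": 0.017,
    "D8S1179": 0.099,
    "D21S11": 0.023,
    "D13S317": 0.012,
    "D7S820": 0.063,
    "D16S539": 0.095,
    "THO1": 0.096,
    "TPOX": 0.0352,
}


def profile_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies (product rule).
    Assumes the loci are inherited independently."""
    return math.prod(genotype_frequencies)


ten_locus_probability = profile_probability(LOCUS_FREQUENCIES.values())
print(f"{ten_locus_probability:.2e}")
print(f"about 1 in {1 / ten_locus_probability:.2e}")
```

Each additional independent locus multiplies the result by another small fraction, which is why the full 13-locus profile becomes so rare.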