discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/268747417
Subhash Chandra
Biometrics Unit
Global Theme Biotechnology
International Crops Research Institute for the Semi-Arid Tropics
Patancheru 502 324, India
2004
ICRISAT
Contents
QTL Analysis: An Overview
- Detect regions of the genome associated with the trait (with a map)
  - How many QTLs are there, and where are they? Identify flanking markers
These notes are divided into four chapters and discuss only
mapping-population-based QTL analysis
- Chapter 1 is about The Things We Need To Get There
- Chapter 2 discusses issues related to Phenotyping
- Chapter 3 discusses Building a Linkage Map
- Chapter 4 discusses the methods for QTL Analysis, including
cross-validation and bootstrapping
Suggested Reading
Kearsey MJ, Farquhar AGL (1998). QTL analysis in plants: Where are we now? Heredity
80:137-142
Asins MJ (2002). Present and future of quantitative trait locus analysis in plant
breeding. Plant Breeding 121:281-291
Paterson AH (2002). What has QTL mapping taught us about plant domestication? New
Phytologist 154:591-608
Bernardo R (2004). What proportion of declared QTL in plants are false? Theoretical
and Applied Genetics 109:419-424
Schön CC et al. (2004). Quantitative trait locus mapping based on resampling in a vast
maize testcross experiment and its relevance to quantitative genetics for complex
traits. Genetics 167:485-498
Kraakman ATW et al. (2004). Linkage disequilibrium mapping of yield and yield
stability in modern spring barley cultivars. Genetics 168:435-446
Chapter 1
Things we need to get there
- Selection of parents
- Number of markers
For a fixed sample size, lowering one type of error (Type I or Type II) increases the other
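The trade-off above can be illustrated numerically. The following is a minimal sketch, assuming a one-sided z-test with known standard deviation; the function name, the effect size, and the sample size are hypothetical choices made only for illustration.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Type II error (beta) of a one-sided z-test of H0: mean = 0.

    alpha  : Type I error rate
    effect : true mean under the alternative (assumed > 0)
    sigma  : known standard deviation of one observation
    n      : sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)   # rejection threshold on the z scale
    shift = effect * n ** 0.5 / sigma          # mean of the z statistic under H1
    return std_normal.cdf(z_crit - shift)      # P(fail to reject H0 | H1 true)


# Fixed n: a stricter alpha gives a larger beta.
for alpha in (0.10, 0.05, 0.01, 0.001):
    beta = type_ii_error(alpha, effect=0.5, sigma=1.0, n=30)
    print(f"alpha={alpha:<6} beta={beta:.3f}")
```

Only a larger sample size (or a smaller error variance) lowers both error rates at once.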
- For eternal (permanent) populations such as RILs and DHs, both types of
markers (dominant and codominant) are equally informative
Lee M (1995). DNA markers and plant breeding programs. Advances in Agronomy
55:265-344
A: Parent 1, B: Parent 2 (one high, one low); (A,B) can also be coded as (1,0)
Chapter 2
Phenotyping of mapping population
Reliable QTL mapping demands reliable phenotyping of traits
of interest under defined target environmental (e.g. drought)
conditions
Phenotype data contain information on segregation and the
phenotypic effects of QTLs
$H^2 = \sigma_G^2 / [\sigma_G^2 + (\sigma_{GE}^2/n_E) + \{\sigma_e^2/(n_E n_r)\}]$
$\mathrm{MSE}(m) = E(m-\mu)^2$
$= E\{m-E(m)\}^2 + \{E(m)-\mu\}^2$
$= \{SE(m)\}^2 + \{\mathrm{Bias}(m)\}^2$
= Imprecision + Inaccuracy
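The decomposition above can be checked on any set of repeated estimates. This is a minimal sketch; the function name and the numbers are hypothetical, and the variance uses divisor n so that the identity holds exactly.

```python
def mse_decomposition(estimates, true_value):
    """Return (MSE, variance, squared bias) of repeated estimates m of a known mu."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    # Imprecision: spread of the estimates around their own mean (divisor n)
    variance = sum((m - mean_est) ** 2 for m in estimates) / n
    # Inaccuracy: squared distance of the mean estimate from the true value
    bias_sq = (mean_est - true_value) ** 2
    # Mean squared error around the true value
    mse = sum((m - true_value) ** 2 for m in estimates) / n
    return mse, variance, bias_sq


mse, variance, bias_sq = mse_decomposition([9.0, 11.0, 10.5, 11.5], true_value=10.0)
print(mse, variance + bias_sq)  # the two numbers agree
```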
Replication of entries
Randomization of entries
Local control of error arising from inter-plot variation
Increasing $n_r$
- Is not a device to reduce $\sigma_e^2$;
- Provides only a more stable estimate of $\sigma_e^2$.
Augmented designs (C = check, T = test line; each block is an incomplete block)
C T T T T T C T T T T T C T T T T T C   Block 1
C T T T T T C T T T T T C T T T T T C   Block 2
C T T T T T C T T T T T C T T T T T C   Block 3
C1 T T T T T C4 T T T T T C3 T T T T T C2 Block 1
C2 T T T T T C3 T T T T T C4 T T T T T C1 Block 2
C3 T T T T T C1 T T T T T C2 T T T T T C4 Block 3
For normally used rectangular plots, Lin & Poushinsky (1985, Can
J Plant Sci 65:743-749) suggest MAD-2, which can be adapted to
any standard design. An example of a 3x6 row-column design to
test 72 test lines is
Column
1 2 3 4 5 6
T T T T T T
T T T T T T
C C C C C C Row 1
T T T T T T
T T T T T T
T T T T T T
T T T T T T
C C C C C C Row 2
T T T T T T
T T T T T T
T T T T T T
T T T T T T
C C C C C C Row 3
T T T T T T
T T T T T T
$H^2 = \sigma_G^2 / [\sigma_G^2 + (\sigma_{GE}^2/n_E) + \{\sigma_e^2/(n_E n_r)\}]$
The error variance of a genotype mean,
$(\sigma_{GE}^2/n_E) + \{\sigma_e^2/(n_E n_r)\}$,
is reduced more by a larger $n_E$ than by a larger $n_r$.
$n_r = 2$ allows internal estimation of the error variance for each
individual trial, to assess the relative magnitude of errors
across environments and also to separate $\sigma_{GE}^2$ from $\sigma_e^2$.
For $n_r = 2$ and a given $n_E$, an attempt should be made to reduce
$\sigma_e^2$ in individual trials, to compensate for the fewer replications,
- By covariance adjustments;
- By spatial analysis.
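The claim that more environments help more than more replications can be checked directly from the formulas above. This is a minimal sketch; the function names and the variance components are hypothetical values chosen only for illustration.

```python
def genotype_mean_error_variance(var_ge, var_e, n_env, n_rep):
    """Error variance of a genotype mean: var_GE/nE + var_e/(nE*nr)."""
    return var_ge / n_env + var_e / (n_env * n_rep)


def heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean heritability H^2 across n_env environments and n_rep replications."""
    error = genotype_mean_error_variance(var_ge, var_e, n_env, n_rep)
    return var_g / (var_g + error)


# Same total of 8 plots per genotype, allocated in two ways.
for n_env, n_rep in ((2, 4), (4, 2)):
    err = genotype_mean_error_variance(var_ge=1.0, var_e=2.0, n_env=n_env, n_rep=n_rep)
    h2 = heritability(var_g=1.0, var_ge=1.0, var_e=2.0, n_env=n_env, n_rep=n_rep)
    print(f"nE={n_env} nr={n_rep} error variance={err:.3f} H2={h2:.3f}")
```

With these assumed components, four environments with two replications give an error variance of 0.50, against 0.75 for two environments with four replications.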
REML can be used
to get entry BLUPs and their SE, and estimates of $\sigma_G^2$ and $\sigma_e^2$ and
their SE.
$H^2 = \sigma_G^2 / [\sigma_G^2 + (\sigma_e^2/n_r)]$
Use REML, treating entry, block, replicate, and environment effects as random,
to get BLUPs of G and GEI and their SE, and estimates of $\sigma_G^2$, $\sigma_{GE}^2$
and $\sigma_e^2$ and their SE.
$H^2 = \sigma_G^2 / [\sigma_G^2 + (\sigma_{GE}^2/n_E) + \{\sigma_e^2/(n_E n_r)\}]$
Chapter 3
Building a linkage map
Linkage map: linear arrangement of genetic markers (loci) on the
genome obtained on the basis of estimates of recombination
fractions among the markers
Linkage grouping
Locus ordering
- Genotyping errors
- .
Consider two markers A & B each having two alleles (A,a) and
(B,b)
The ML estimator of r is
$r_{AB} = (n_2 + n_3)/n$, where $n = n_1 + n_4 + n_2 + n_3$
$\mathrm{Var}(r_{AB}) = r_{AB}(1 - r_{AB})/n$
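The estimator above is easy to compute from the four genotype-class counts. This is a minimal sketch, assuming (as in the formula) that classes 2 and 3 are the recombinant classes; the function name and the counts are hypothetical.

```python
def recombination_fraction(n1, n2, n3, n4):
    """ML estimate of r between two markers and its standard error.

    n1, n4 : counts of the two parental (non-recombinant) classes
    n2, n3 : counts of the two recombinant classes
    """
    n = n1 + n2 + n3 + n4
    if n == 0:
        raise ValueError("at least one individual is required")
    r = (n2 + n3) / n                  # proportion of recombinants
    se = (r * (1.0 - r) / n) ** 0.5    # square root of the binomial variance
    return r, se


r_ab, se_ab = recombination_fraction(n1=45, n2=5, n3=5, n4=45)
print(f"r = {r_ab:.3f}, SE = {se_ab:.3f}")
```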
k = e-Gd/{2(d-1)}
        rXY              rYZ
_____X__________Y_________________Z___
        dXY              dYZ
Recombination fractions are not additive because of multiple crossovers. For
example, for locus order (X,Y,Z), assuming no interference,
$r_{XZ} = r_{XY} + r_{YZ} - 2\,r_{XY}\,r_{YZ}$
$d_{ij}(\text{Haldane}) \ge d_{ij}(\text{Kosambi})$.
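The two mapping functions named above convert a recombination fraction into an additive map distance. This is a minimal sketch using their standard textbook forms (the notes do not show the formulas themselves); the function names are hypothetical.

```python
import math


def haldane_cM(r):
    """Haldane map distance in cM (assumes no crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cM(r):
    """Kosambi map distance in cM (allows for some interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


for r in (0.01, 0.10, 0.20, 0.30, 0.40):
    print(f"r={r:.2f}  Haldane={haldane_cM(r):7.2f} cM  Kosambi={kosambi_cM(r):7.2f} cM")
```

For small r both distances are close to 100r cM; the gap between them widens as r approaches 0.5.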
# loci:          2    3    5     10           20
# locus orders:  1    3    60    1,814,400    $1.22 \times 10^{18}$
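The counts in the table follow from the fact that m loci have m!/2 distinct orders, since an order and its reverse describe the same map. A minimal sketch (the function name is hypothetical):

```python
import math


def locus_orders(n_loci):
    """Number of distinct orders of n_loci loci on one linkage group: n!/2."""
    if n_loci < 2:
        raise ValueError("at least two loci are required")
    return math.factorial(n_loci) // 2


for m in (2, 3, 5, 10, 20):
    print(m, locus_orders(m))
```

This growth is why exhaustive evaluation of all orders is practical only for a handful of loci.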
Given total genome length (L) and a genome map (with map
length L*), these can be easily estimated.
_.__._____.__.____.___.______ L1
__.___._____.___._.____.___._ L2
_.___.____.__.____.___.__.___ L3
+d 0 -d        $L = L_1 + L_2 + L_3$
$c = 1 - e^{-2md/L}$
$P = 1 - [1 - (2d/L)]^m$
$2d = L\,[1 - (1-P)^{1/m}]$
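The coverage formulas above can be evaluated for a planned number of markers. This is a minimal sketch, assuming m is the number of markers, L the genome length in cM, and d the distance from a marker in cM; the function names and the example values are hypothetical.

```python
import math


def coverage_exponential(m, d, genome_length):
    """c = 1 - exp(-2md/L)."""
    return 1.0 - math.exp(-2.0 * m * d / genome_length)


def coverage_probability(m, d, genome_length):
    """P = 1 - [1 - (2d/L)]^m."""
    return 1.0 - (1.0 - 2.0 * d / genome_length) ** m


def window_width(m, prob, genome_length):
    """Invert P for the window width: 2d = L[1 - (1 - P)^(1/m)]."""
    return genome_length * (1.0 - (1.0 - prob) ** (1.0 / m))


m, d, L = 100, 10.0, 1500.0
p = coverage_probability(m, d, L)
print(f"c = {coverage_exponential(m, d, L):.3f}, P = {p:.3f}")
print(f"2d recovered from P: {window_width(m, p, L):.2f} cM")
```

The last line recovers the 20 cM window (2d) that was used to compute P, which confirms that the two formulas are inverses of each other.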
Chapter 4
QTL Analysis
4.1 The key idea
A genetic marker that tends to co-segregate with the trait is
likely to be close to a QTL controlling that trait
$Y = f\{\Pr(Q_k \mid M_j)\}$
Gene frequencies
Two-Marker Analysis
[Simple Interval Mapping (SIM)]
(QTL Cartographer, MapQTL, PlabQTL)
Multiple-Marker Analysis
[Multiple regression, CIM, MQM]
(QTL Cartographer, PlabQTL, MapQTL, ...)
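Conditional probability
$\Pr(Q \mid A)$
Above, the expected trait value, e.g. for marker genotype AA,
is derived as
$\mu_{AA} = \Pr(QQ \mid AA)\,\mu_{QQ} + \Pr(Qq \mid AA)\,\mu_{Qq} = (1-r)\,\mu_{QQ} + r\,\mu_{Qq}$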
t-test
$E[m(AA) - m(Aa)] = \mu_{AA} - \mu_{Aa}$
$= (1-2r)(\mu_{QQ} - \mu_{Qq})$
$= (1-2r)(a+d)$
$H_0: \mu_{AA} - \mu_{Aa} = 0$
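The single-marker t-test above compares the trait means of the two marker classes. This is a minimal sketch using a pooled-variance two-sample t statistic; the function names and the phenotype values are hypothetical, and the expected contrast simply evaluates the $(1-2r)(a+d)$ expression given above.

```python
from statistics import mean, variance


def expected_marker_contrast(r, a, d):
    """Expected difference between marker-class means: (1 - 2r)(a + d)."""
    return (1.0 - 2.0 * r) * (a + d)


def pooled_t_statistic(group_aa, group_ab):
    """Two-sample t statistic for H0: mu_AA - mu_Aa = 0 (equal variances assumed)."""
    n1, n2 = len(group_aa), len(group_ab)
    pooled_var = ((n1 - 1) * variance(group_aa)
                  + (n2 - 1) * variance(group_ab)) / (n1 + n2 - 2)
    se_diff = (pooled_var * (1.0 / n1 + 1.0 / n2)) ** 0.5
    return (mean(group_aa) - mean(group_ab)) / se_diff


# Hypothetical phenotypes of individuals with marker genotypes AA and Aa
t_value = pooled_t_statistic([10.0, 12.0, 11.0, 13.0], [8.0, 9.0, 10.0, 9.0])
print(f"t = {t_value:.3f} on 6 df")

# The marker contrast shrinks as the marker moves away from the QTL
for r in (0.0, 0.1, 0.3, 0.5):
    print(r, expected_marker_contrast(r, a=2.0, d=1.0))
```

At r = 0.5 the expected contrast is zero, so an unlinked marker carries no information about the QTL; a small contrast can mean either a small QTL effect or a loosely linked marker.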
Analysis of variance
Regression approach
Model $y_j = \beta_0 + \beta X_j + e_j$
Therefore
$\beta_0 = E[m(y)] = (\mu_{QQ} + \mu_{Qq})/2$
$H_0: \beta = 0$
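- Then, since
$\mu_{AA} = (1-r)\,\mu_{QQ} + r\,\mu_{Qq}$
$\mu_{Aa} = r\,\mu_{QQ} + (1-r)\,\mu_{Qq}$
$L(\mu_{QQ}, \mu_{Qq}, \sigma^2, r \mid y_i) = L(\mu_{QQ}, \mu_{Qq}, \sigma^2, r)$
$i = 1, \ldots, n$; $j = 1, 2$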
- LOD score
$\mathrm{LOD} = 0.21715\,G$, $G = 4.60517\,\mathrm{LOD}$
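The two constants above come from the change of logarithm base: G is twice the natural-log likelihood ratio and LOD is the base-10 log likelihood ratio, so G = 2 ln(10) x LOD. A minimal sketch (the function names are hypothetical):

```python
import math

TWO_LN_10 = 2.0 * math.log(10.0)   # about 4.60517


def lod_from_g(g):
    """Convert a likelihood-ratio statistic G to a LOD score."""
    return g / TWO_LN_10


def g_from_lod(lod):
    """Convert a LOD score to a likelihood-ratio statistic G."""
    return lod * TWO_LN_10


print(f"1 / (2 ln 10) = {1.0 / TWO_LN_10:.5f}")
print(f"LOD 3.0 corresponds to G = {g_from_lod(3.0):.3f}")
```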
With small r, the double-crossover term ($2 r_1 r_2$) in $r = r_1 + r_2 - 2 r_1 r_2$ may be ignored, which reduces the
equation to
$r = r_1 + r_2$
The QTL position can be represented by a point relative to the interval between the two
markers by a position parameter
$\rho = r_1/r$
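The size of the approximation above is easy to check numerically. This is a minimal sketch, assuming no interference for the exact expression; the function names and the values of r1 and r2 are hypothetical.

```python
def flanking_recombination(r1, r2, ignore_double_crossovers=False):
    """Recombination fraction between two flanking markers.

    r1 : recombination fraction between the left marker and the QTL
    r2 : recombination fraction between the QTL and the right marker
    """
    if ignore_double_crossovers:
        return r1 + r2                      # small-r approximation
    return r1 + r2 - 2.0 * r1 * r2          # exact under no interference


def position_parameter(r1, r):
    """Relative position of the QTL in the interval: r1 / r."""
    return r1 / r


r_exact = flanking_recombination(0.05, 0.10)
r_approx = flanking_recombination(0.05, 0.10, ignore_double_crossovers=True)
print(f"exact r = {r_exact:.3f}, approximate r = {r_approx:.3f}")
print(f"position parameter = {position_parameter(0.05, r_approx):.3f}")
```

A position parameter of 0 places the QTL at the left marker and 1 places it at the right marker.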
Marker genotype   Pr(QQ)   Pr(Qq)   Expected mean
AABB              1        0        $\mu_{QQ} = \mu_{AABB}$
AaBb              0        1        $\mu_{Qq} = \mu_{AaBb}$
Regression approach
- Model: $y = \beta_0 + X^* a + e$
          |<------ r ------>|
          |<-r1->|<-- r2 -->|
___.______.______.__________._____________.____
 Mi-1     Mi     Q          Mi+1          Mi+2
          (A)    QTL        (B)
The effects of other QTLs, if any, outside the test interval
($M_i$, $M_{i+1}$) are removed through regression on
markers outside the test interval. This increases the power of QTL
detection.
$R_{in} = [SS_{Res,u} - SS_{Res,u+1}] / [SS_{Res,u+1}/(n-u-2)] > F_{in}$
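The entry criterion above is a partial F statistic: it compares the residual sum of squares before and after adding one more marker to a model that already holds u markers. This is a minimal sketch; the function name, the sums of squares, and the threshold are hypothetical.

```python
def f_to_enter(ss_res_u, ss_res_u_plus_1, n, u):
    """F-to-enter for adding one marker to a regression with u markers.

    ss_res_u        : residual SS of the model with u markers
    ss_res_u_plus_1 : residual SS of the model with u + 1 markers
    n               : number of individuals
    """
    residual_df = n - u - 2                  # n - (u + 1 markers) - intercept
    if residual_df <= 0:
        raise ValueError("not enough individuals for this many markers")
    return (ss_res_u - ss_res_u_plus_1) / (ss_res_u_plus_1 / residual_df)


F_IN = 4.0   # hypothetical entry threshold chosen by the analyst
r_in = f_to_enter(ss_res_u=120.0, ss_res_u_plus_1=100.0, n=100, u=3)
print(f"R_in = {r_in:.2f}, marker enters: {r_in > F_IN}")
```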
$p = R^2_a / H^2$
$H^2 = \sigma_G^2 / \sigma_P^2$
$R^2 = SS_{reg}/SS_{total} = SS_{QTL}/SS_{total}$
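The ratio above expresses the QTL contribution as a share of the genetic, rather than the phenotypic, variance. This is a minimal sketch, assuming that $R^2_a$ denotes the adjusted $R^2$ and using the usual adjustment for the number of fitted terms; the function names and all numbers are hypothetical.

```python
def adjusted_r_squared(ss_qtl, ss_total, n, n_terms):
    """Adjusted R^2 from the QTL and total sums of squares.

    n       : number of individuals
    n_terms : number of QTL terms fitted (assumed adjustment: n - n_terms - 1 df)
    """
    r2 = ss_qtl / ss_total
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_terms - 1)


def proportion_genetic_variance_explained(r2_adjusted, h2):
    """p = adjusted R^2 / H^2."""
    return r2_adjusted / h2


r2_adj = adjusted_r_squared(ss_qtl=40.0, ss_total=100.0, n=101, n_terms=4)
p = proportion_genetic_variance_explained(r2_adj, h2=0.75)
print(f"adjusted R2 = {r2_adj:.3f}, p = {p:.3f}")
```

Because $H^2 < 1$, p is always larger than the adjusted $R^2$: QTLs that explain 37.5% of the phenotypic variance here account for half of the genetic variance.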
Sample size n
Population type
Heritability
Genetic architecture
$\hat{p}^{*} = 2\hat{p} - \sum_{b} \hat{p}_b / B$
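The bias-corrected estimate above subtracts the bootstrap estimate of bias (the mean of the bootstrap estimates minus the original estimate) from the original estimate. This is a minimal sketch; the function names, the statistic, and the data are hypothetical, and the resampling is of individuals with replacement.

```python
import random


def bootstrap_estimates(data, statistic, n_boot, seed=1):
    """Recompute `statistic` on n_boot samples drawn with replacement from data."""
    rng = random.Random(seed)
    n = len(data)
    return [statistic([data[rng.randrange(n)] for _ in range(n)])
            for _ in range(n_boot)]


def bias_corrected(p_hat, boot_estimates):
    """p* = 2 * p_hat - (sum of the bootstrap estimates) / B."""
    return 2.0 * p_hat - sum(boot_estimates) / len(boot_estimates)


def max_value(sample):
    """A deliberately biased statistic: the sample maximum."""
    return max(sample)


data = [0.12, 0.35, 0.28, 0.41, 0.19, 0.33, 0.25, 0.30]
p_hat = max_value(data)
boots = bootstrap_estimates(data, max_value, n_boot=200)
print(f"p_hat = {p_hat:.3f}, bias-corrected = {bias_corrected(p_hat, boots):.3f}")
```

In a QTL setting the statistic would be the whole analysis (QTL detection followed by estimation of p) repeated on each bootstrap sample, which is far more expensive than this toy statistic.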
Consider the example of a field trial laid out in an RCBD. Suppose there are t=16
groundnut varieties as treatments, each to be tested with r=3 replications.
The experimental field is divided into b=r=3 blocks, each individual block containing
k=t=16 plots. There are N = t x r = 48 plot observations corresponding to the 48 plots used
in the trial.
The following linear additive model is commonly assumed to represent any
individual plot observation in an RCBD
Plot Obs. = Trial Mean + Variety Effect + Block Effect + Error (1A)
$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}$ (1B)
The model above contains four terms [$\mu$, $\tau_i$, $\beta_j$, $\varepsilon_{ij}$]. We need to specify which of these
terms is a fixed and which is a random effect.
WHAT IS A FIXED EFFECT? A model term, representing the effect of a certain factor,
is said to be a fixed effect if the different levels of the factor, say t levels of the
factor variety included in the trial, represent t distinct populations (treatments), and
interest lies in estimation (BLUE) of the means of those, and only those, distinct
populations (treatments). For example, if the 16 groundnut varieties represent 16
distinct genetic populations, the corresponding model term $\tau_i$ will be taken as a fixed
effect. Our interest will be only in estimating the means of, and possibly some
well-defined differences among, these 16 genetic populations.
(a) Is it physically possible for the used factor levels to be repeated at some
future time or in some other place?
(b) If the answer to (a) is YES, would it be reasonable in the context of this research
for you or someone else to choose the same levels for repetition of this
research at some future time or in some other place?
If the answers to questions (a) and (b) are BOTH YES, declare the term
as a fixed effect.
If a fixed-effect factor involves a large number of levels (say >10) and there is no
structure among those levels, it might perhaps be best to declare the factor effect
as random and use BLUP to predict mean values.
The basis for this recommendation is: With a large number of unstructured
levels, it is likely that the extremely low or high means will partially reflect a
fortuitous combination of random error effects and should therefore be shrunken
toward the trial mean.
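The shrinkage described above can be sketched for the simplest balanced case. This is a minimal illustration, assuming known variance components and a single trial with equal replication, in which the BLUP of an entry mean is the trial mean plus the deviation multiplied by $\sigma_G^2/(\sigma_G^2 + \sigma_e^2/n_r)$; in practice REML estimates the components, and the function name and numbers here are hypothetical.

```python
def shrunken_means(entry_means, var_g, var_e, n_rep):
    """Shrink observed entry means toward the trial mean (balanced data only).

    var_g : genetic (entry) variance component
    var_e : plot error variance component
    n_rep : number of replications per entry
    """
    trial_mean = sum(entry_means) / len(entry_means)
    shrinkage = var_g / (var_g + var_e / n_rep)   # lies between 0 and 1
    return [trial_mean + shrinkage * (m - trial_mean) for m in entry_means]


observed = [8.0, 10.0, 12.0]
print(shrunken_means(observed, var_g=1.0, var_e=3.0, n_rep=3))
```

The extreme entries move furthest toward the trial mean, and the shrinkage weakens as the error variance falls or the replication rises.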
If a random-effect factor involves too few levels (say if the factor represents an
uninteresting nuisance factor in the trial), and if comparisons among levels of this
factor provide no information about other factors (no inter-class information), then
declare the factor effect as fixed. This will ease the computing demands and
provide identical results for the interesting factors.
NOTE 1: Whatever the type of model (Fixed, Random, or Mixed), $\mu$ is always taken as
fixed, and $\varepsilon_{ij}$, due to random allocation of treatments to plots, is always taken as
random. What makes a model Fixed, Random, or Mixed then depends on the nature
of the remaining terms in the model.
WHAT IS A FIXED-EFFECTS MODEL (Model 1)? Under the proviso of NOTE 1, if all
other model terms represent fixed effects, the model is called a fixed-effects (or a
fixed) model.
WHAT IS A RANDOM-EFFECTS MODEL (Model 2)? Under the proviso of NOTE 1, if all
other model terms represent random effects, the model is said to be a random-effects
(or a random) model.
WHAT IS A MIXED-EFFECTS MODEL (Model 3)? Under the proviso of NOTE 1, if some of
the remaining model terms are fixed effects and some are random effects, the model
is defined to be a mixed-effects (or a mixed) model.
Allelic effect
I. Dominance deviation:
This deviation can be detected in an F2 population
It is calculated as: Heterozygote - [(P1+P2)/2]
A positive effect reflects growth of the heterozygote that exceeds the midparent
A negative effect reflects growth that is less than the midparent
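The calculation above can be written out as follows. This is a minimal sketch; the function name and the trait values are hypothetical.

```python
def dominance_deviation(parent1_mean, parent2_mean, heterozygote_mean):
    """Heterozygote mean minus the midparent value (P1 + P2) / 2."""
    midparent = (parent1_mean + parent2_mean) / 2.0
    return heterozygote_mean - midparent


# Hypothetical genotype-class means at one marker in an F2 population
d = dominance_deviation(parent1_mean=12.0, parent2_mean=8.0, heterozygote_mean=14.0)
print(f"dominance deviation = {d:+.1f}")   # positive: heterozygote exceeds the midparent
```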
Marker analysis
Fill gaps
(More markers in target region)
More individuals
(More recombinants for the target region)
Disadvantages
Requires a well-equipped lab and well-trained workers
High cost