Professional Documents
Culture Documents
& 2008 Nature Publishing Group All rights reserved 1359-4184/08 $30.00
www.nature.com/mp
ORIGINAL ARTICLE
PDE4D
Introduction
Ever since neuroticism was first proposed as personality trait that reflects a tendency toward states of
negative affect,1 it has been included in nearly every
theory of personality. Its importance as a psychological construct is further enhanced by its well
documented correlation with common psychiatric
disorders, that is, high levels of neuroticism predict
the onset and subsequent episodes of major depression25 and are associated with symptoms of depression and anxiety in the general population.69 The
correlation between anxiety, major depression and
neuroticism is, in part, due to the presence of shared
genetic factors,1013 an observation that has spurred
attempts to map the genetic basis of neuroticism.1416
It is much easier to acquire the large samples
Correspondence: Dr J Flint, Wellcome Trust Centre for Human
Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN,
UK.
E-mail: jf@well.ox.ac.uk
Received 4 February 2007; revised 12 June 2007; accepted 26 June
2007; published online 31 July 2007
303
DNA pools
DNA samples were collected and extracted from
mouth swabs, quantified four times using the PicoGreen assay (Molecular Probes, Eugene, OR, USA)
and tested by PCR. The samples were divided into
eight groups according to N score (high and low) and
gender (men and women). Equal amounts of DNA
(150 ng) from each subject were mixed together to
create 48 small pools, which on average were made
from 43 individuals. Eight different pools were then
created once by combining DNA from the small pools
in proportion to the number of individuals in each
pool. The eight pools are as follows: (1) men with high
N score (n = 112), (2) men with low N score (n = 158),
(3) men with very high N score (n = 245), (4) men with
very low N score (n = 238), (5) women with high N
score (n = 320), (6) women with low N score (n = 205),
(7) women with very N high score (n = 340) and (8)
women with very N low score (n = 436) (very high or
low N scores are more than 1.5 s.d. from the mean
score adjusted to age and sex (on average 2 s.d.); high
and low N scores are between 1 and 1.5 s.d. from the
mean score (on average 1.3 s.d.)).
We tested the accuracy of the pools before starting
the experiment with the Affymetrix arrays by individually genotyping five SNPs and allelotyping the
pools seven times using the Sequenom platform.
Frequencies were averaged and corrected for unequal
detection of the two alleles based on the ratios in
individual heterozygotes.31 The pools were found to
be very accurate, with an s.d. from the expected
frequency of 0.016 for each of the eight large pools.
The s.d. attributed to the pools (construction error)
without the measurement error is estimated to be
0.012 for each pool and 0.0059 for frequency estimate
for the high and low N groups with four pools in each.
The observed frequencies were highly correlated with
the expected frequencies for the small pools and, as
expected, higher for the eight large pools (small pools:
R2 = 98.8%; large pools: R2 = 99.7%).
Molecular Psychiatry
304
Results
Sensitivity and specificity of DNA pooling
The design and interpretation of the DNA pooling
experiment require an assessment of the methods
likely power, that is, its sensitivity and specificity. If
we were to genotype all 2000 individuals individually, we would be able to detect an effect explaining 1,
0.5 and 0.1% (ORs of 1.72, 1.47 and 1.19) of the
additive genetic variance with power of 0.94, 0.34 and
0.001, respectively (assuming an alpha of 1.66 107
(300 000 independent tests, a complete linkage disequilibrium (LD) between the marker and the causative SNP and a trait increasing allele frequency of
0.2)). If we test only 20 markers, for example, in a
replication sample, our power is 1, 0.96 and 0.19 for
1, 0.5 and 0.1% effects, respectively.
To investigate the pooling strategy we simulated the
design of the study in which we used eight different
pools, each allelotyped five times (that is, 20
measurements for each group). Using samples of
CEPH individuals genotyped as part of the HapMap
project,33 we established that we could accurately
simulate the pooling experiment (see Supplementary
material for a full description). We first simulated a
situation with a null effect and found that the
genome-wide 5% threshold for individual genotyping
is 1.7 107 (log P = 6.77), which is equivalent to a 5%
threshold after applying a Bonferroni correction for
294 071 independent tests. Then, by simulation, we
determined the power to detect an association by
genotyping directly the simulated causal SNP, by
individual genotyping of SNPs on the Affymetrix
arrays (500 and 100 K with MAF > 0.05) and using a
DNA pooling strategy. We designed a random SNP
from the HapMap to be a causal SNP and we looked at
SNPs in the array within a 100 kb window. By
simulating an SNP from the HapMap as a causal
SNP we include in our simulation the effect of the
305
Functional
SNP (%)
5% threshold
1.2
1.3
1.5
1.7
2
3
Log P > 3
1.2
1.3
1.5
1.7
2
3
Individual
genotyping (%)
DNA
pooling (%)
0
6
51
80
93
99
0
4
41
68
82
90
0
3
31
56
72
85
19
53
88
97
99
100
19
49
80
90
92
95
4
16
47
66
76
87
Frequency
306
10000
5000
0
-3
-2
-1
0
1
2
Age and sex regressed scores
Table 2
307
SNP
SNP
identifier
rs322239
rs4841017
rs17385253
rs9329165f
rs10483573
rs10504830
rs13290746g
rs9959800
rs10797812
rs4875610
rs2123315f
rs702543
rs7666238
rs9431663
rs7594674g
rs873989
rs401897
rs201997
rs1452788
Geneb
PTN
CNBD1
CSMD1
PDE4D
TRIM67
DAB1
C20orf32
ADAM18
TCF4
DNA
pooling
ranka
Chrc
7
8
1
8
14
8
9
18
1
8
4
5
4
1
2
1
20
8
18
Position
136413862
8614503
34564102
8614052
46932227
88448321
34858377
52115138
179716254
3165318
18642838
58878531
142453368
227620956
67191611
58268619
54463165
39703325
51268302
12
53
21
160
258
115
146
48
159
430
136
184
315
211
432
148
394
186
384
Replicatione
All
All
4.8E06
6.5E06
3.3E05
3.8E05
5.6E05
1.1E04
1.5E04
1.7E04
2.4E04
4.1E04
4.3E04
8.2E04
9.0E04
1.2E03
2.6E03
3.5E03
5.6E03
6.4E03
6.5E03
1.1E04
2.3E05
7.7E03
3.0E05
3.6E03
6.8E04
3.6E02
1.1E02
2.2E02
2.6E02
2.0E01
9.8E04
2.1E03
8.2E01
4.3E01
8.2E01
6.5E04
2.7E04
6.7E01
5.7E03
1.5E02
1.5E03
4.9E02
3.2E03
2.2E02
1.6E03
6.0E03
4.1E03
6.3E03
3.8E03
9.5E02
6.9E02
1.2E04
6.5E04
1.3E04
3.4E01
5.3E01
9.3E04
8.8E01
2.3E01
2.8E01
2.1E01
6.1E01
5.7E01
3.8E01
1.4E01
5.1E01
6.5E01
9.6E02
7.7E04
6.1E01
9.7E01
3.9E01
9.3E01
5.2E01
1.2E01
5.7E01
7.4E01
3.7E01
1.3E01
9.4E01
7.8E01
1.4E01
2.8E01
8.0E01
2.6E01
4.7E01
5.1E01
2.7E01
1.8E01
8.6E02
4.0E01
4.9E01
8.3E01
6.0E02
3.9E01
6.5E01
4.1E01
8.5E01
9.5E02
3.9E01
6.8E01
7.8E01
3.8E02
7.2E02
9.9E01
1.3E01
5.9E04
7.9E02
1.6E01
7.9E02
5.2E01
4.4E01
6.4E01
9.7E01
308
Table 3
Haplotype analysis
Genea
LOC284661
DAB1
NRXN1
CTNNA2
DPP10
LRP1B
FHIT
NA
NA
NLGN1
NA
PDE4D
NA
NA
LAMA2
PARK2
LOC442275
CREB5
NA
RELN
PTN
NA
CSMD1
NRG1
NA
CSMD3
PTPRD
ASTN2
TMEM16C
BDNF
DLG2
CNTN5
GRIP1
RAB3IP
LOC284058
TCF4
NA
GRIK1
ERG
Chrb
Position
SNP1
SNP2
1
1
2
2
2
2
3
3
3
3
4
5
6
6
6
6
6
7
7
7
7
8
8
8
8
8
9
9
11
11
11
11
12
12
17
18
18
21
21
4370062
58268619
50761056
80715942
116252992
140847651
60694463
153382923
162481199
175393516
116523708
58878531
33080668
120286484
129281463
162539612
164461465
28296787
95721578
102871243
136413862
8614503
3178277
32336767
108959180
114031134
8481878
116859364
26402767
27715568
84283401
98413115
65150558
68480702
41550514
51268302
52115138
30306168
38888946
rs6679220
rs873989
rs7576283
rs216617
rs10496506
rs13021003
rs17361653
rs641025
rs7652267
rs6806890
rs1112531
rs702543
rs592625
rs10484965
rs7776116
rs6904579
rs206694
rs4722801
rs2724041
rs362698
rs322239
rs4841017
rs4395910
rs1487152
rs1389976
rs10505196
rs6477311
rs2418446
rs10834976
rs985205
rs17147761
rs10501893
rs4913307
rs17225638
rs10514901
rs1452788
rs9959800
rs2832533
rs17194186
rs349410
rs1213659
rs1037426
rs3770369
rs11123306
rs2046563
rs1882899
rs989920
rs1478565
rs6779246
rs963251
rs17782374
rs7743563
rs10499094
rs9321142
rs3019449
rs2764649
rs177582
rs2524976
rs17321820
rs322323
rs9329165
rs10503232
rs7002732
rs10505119
rs4876501
rs1359114
rs10513281
rs1381181
rs2049045
rs891773
rs10501898
rs12314237
rs11177914
rs17631303
rs12326693
rs1229587
rs2832438
rs2836502
SNP3
rs1202835
rs930295
rs7609261
rs17551974
rs1716741
rs325738
rs7623402
rs35277
rs2523666
rs9320695
rs6928495
rs17649761
rs1541329
rs322236
rs6601729
rs4875610
rs12114401
rs10505174
rs10511494
rs10983238
rs10834971
rs10501548
rs10501903
rs12581890
rs2532276
rs9957668
rs2836365
SNP min Pc
Global sim Pd
Max-stat sim Pe
4.2E02
1.3E04
1.0E01
2.9E03
2.1E01
1.5E02
1.3E03
2.6E03
8.3E02
1.6E02
5.1E02
8.2E04
6.9E02
7.3E03
1.1E03
1.6E03
4.3E02
3.7E02
6.9E03
2.4E02
4.8E06
6.5E06
4.1E04
4.6E04
1.9E02
2.2E02
1.2E02
3.4E02
4.4E03
1.7E01
4.4E02
4.1E02
1.2E02
2.0E02
9.4E03
9.3E04
1.7E04
5.4E03
2.2E02
2.1E01
1.5E02
7.2E01
3.6E02
5.4E01
8.6E02
8.6E04
2.6E03
1.3E01
7.2E02
3.6E01
4.7E04
4.2E01
1.3E01
3.4E02
4.1E02
8.1E02
9.0E02
6.5E02
3.0E01
8.0E05
1.9E04
8.5E04
4.1E04
7.9E02
2.0E01
7.5E02
2.2E01
1.4E02
4.3E01
3.7E01
4.9E02
1.1E02
6.9E03
6.7E02
2.8E01
1.5E03
1.0E02
2.4E02
1.9E01
3.9E02
7.3E01
1.2E01
8.2E01
2.9E02
1.9E02
8.7E03
1.4E01
1.1E02
6.4E01
2.2E04
1.9E01
3.7E01
5.9E03
1.9E02
5.0E02
1.3E01
1.0E01
2.6E01
1.0E05
1.7E04
6.2E04
3.0E03
3.2E02
4.4E01
3.1E02
4.1E01
2.7E02
3.6E01
4.9E01
6.2E02
2.0E02
1.5E02
2.2E02
3.2E01
1.4E03
1.0E02
1.8E02
SNPs with a P-value < 0.001 (sex-specific or sexaveraged) were genotyped on a sample that consisted
of 933 women and 601 men (all in the 10% low or
high tail of the N distribution) (Figure 1). Only one
SNP showed a statistically significant association in
the replication. The significant SNP, rs702543, had a
P-value of 0.00081 in the original sample, a P-value
of 0.00078 (Bonferroni corrected P-value with 19
tests = 0.015) in the replication sample and a combined P-value = 2 106. In both the samples the
frequency of the A allele was increased in the high
6
5
logP
4
3
2
1
0
40
40.5
41
41.5
42
42.5
43
Mb
309
Sample
Pool sample
Our replication
Australian
Dutch
Virginia
Three groups
All replications
High N
Low N
AA
AG
GG
A%a
AA
AG
GG
A%
348
289
129
75
83
287
576
428
316
163
84
132
379
695
156
123
57
31
38
126
249
60.3
61.4
60.3
62.0
58.9
60.2
60.8
932
728
349
190
253
792
1520
291
197
144
85
261
490
687
468
305
196
106
379
681
986
197
133
71
36
129
236
369
54.9
55.0
58.9
61.0
58.6
59.0
57.8
956
635
411
227
769
1407
2042
OR
P allele
P genotype
1.25
1.3
1.06
1.03
1.01
1.05
1.13
0.00082
0.00077
0.57
0.82
0.90
0.46
0.012
0.0035
0.0029
0.85
0.88
0.69
0.76
0.029
310
Discussion
We have performed the first whole genome analysis of
neuroticism, using pooled DNA. We show that, by
using eight pools with five replicates of each, we
retain between 21 and 94% (depending on the effect
size and threshold used) of the power from genotyping all 452 574 SNPs individually, equivalent to a
50-fold saving in the costs. We identified one SNP
rs702543 from an analysis of B2000 individuals
selected from the neuroticism extremes of 88 142
people, which we were able to replicate in a separate
sample from the same cohort. The SNP was genotyped in three other laboratories, on related, but not
identical, phenotypes. In each case the direction of
allelic effects was the same as in our sample, but the
test reached statistical significance in only one
sample, that is, only in one method of analysis
(family based analysis). Our study raises a number
of issues for whole genome association studies in
general and the genetic basis of neuroticism in
particular.
We deal first with the performance of the pooling
strategy. Our estimates of the power of our DNA
pooling strategy were derived by simulation using the
measurement error across all SNPs from both 500 and
100 K arrays. However, there is a 3.2-fold difference
between the variances of the repeated measurements
for the two types of arrays (mean variance for
100 K = 0.0017, and for 500 K = 0.0055). This difference in measurement accuracy is probably due to the
smaller number of features per SNP and the reduction
in feature size in the 500 K array relative to the
100 K.21 It means that the measurements in the 500 K
arrays should be at least triplicated to achieve the
genotype accuracy obtained with the 100 K arrays.
DNA pooling allows substantial savings in genotyping costs (see, for example, references4446) and
there have been successes using DNA pooling in the
analysis of a relatively small number of markers, for
example, in schizophrenia47,48 multiple sclerosis,49
and serum high-density lipoprotein cholesterol
levels.50,51 However, in the two cases where whole
genome association has been attempted (that is, in
memory52 and mild mental impairment53), the yields
have been low, that is, in both cases only one locus
was considered significant.
Our results are comparable, in that we have found
just one locus of small effect. Failure to replicate the
finding in all samples is not unexpected, given the
differences between the cohorts, in both phenotype
and recruitment. No sample used exactly the same
phenotype. We used the 23-item Eysenck N scale,
while the US and Australian sample used the short
(12-item) scale and the Dutch sample used the
Amsterdamse Biografische Vragenlijst, which is based
on the Eysenck scale but is not identical. However,
differences in the measures are likely to be outweighed by differences in the recruitment strategy.
Our sample represents the extremes of about 61 367
unrelated people, the Dutch from 3444 families, the
Molecular Psychiatry
10
11
12
13
14
15
16
17
18
19
Acknowledgments
20
21
22
References
1 Eysenck HJ. The Biological Basis of Personality. Thomas: Springfield, IL, 1967.
2 Angst J, Clayton P. Premorbid personality of depressive, bipolar
and schizophrenic patients with special reference to suicidal
issues. Compr Psychiatry 1986; 27: 511532.
3 Boyce P, Parker G, Barnett B, Cooney M, Smith F. Personality as a
vulnerability factor to depression. Br J Psychiatry 1991; 159:
106114.
4 Hirschfeld RM, Klerman GL, Lavori P, Keller MB, Griffith P,
Coryell W. Premorbid personality assessments of first onset of
major depression. Arch Gen Psychiatry 1989; 46: 345350.
5 Kendler KS, Neale MC, Kessler RC, Heath AC, Eaves LJ. A
longitudinal twin study of personality and major depression in
women. Arch Gen Psychiatry 1993; 50: 853862.
6 Bienvenu OJ, Samuels JF, Costa PT, Reti IM, Eaton WW, Nestadt G.
Anxiety and depressive disorders and the five-factor model of
personality: a higher- and lower-order personality trait investigation in a community sample. Depress Anxiety 2004; 20: 9297.
7 Cox BJ, MacPherson PS, Enns MW, McWilliams LA. Neuroticism
and self-criticism associated with posttraumatic stress disorder
23
24
25
26
27
28
311
Molecular Psychiatry
312
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
Supplementary Information accompanies the paper on the Molecular Psychiatry website (http://www.nature.com/mp)
Molecular Psychiatry