
PLINK files:

Flat files (MAP/PED):


PED, columns 1-5: Family ID, Individual ID, Paternal ID, Maternal ID, Sex
ex: FAM1 NA06985 0 0 1
Sex: 1=male; 2=female; other=unknown

Binary files (BED/BIM/FAM):

FAM, columns 1-5: Family ID, Individual ID, Paternal ID, Maternal ID, Sex
ex: FAM1 NA06985 0 0 1

BED: magic number | mode | genotype data
ex: 01101100 00011011 00000001 11011100 00001111 11100111
magic number (01101100 00011011): confirms that this is actually a BED file
mode (00000001): indicates whether the BED file is in SNP-major or individual-major mode
genotype data (11011100 00001111 11100111)

00000001=SNP-major=list all individuals for the first SNP, all individuals for the second SNP, etc.
00000000=individual-major=list all SNPs for the first individual, all SNPs for the second individual, etc.
In summary, the following define the BED file format:
First two bytes 01101100 00011011 for PLINK v1.00 BED file
Third byte is 00000001 (SNP-major) or 00000000 (individual-major)
Genotype data, either in SNP-major or individual-major order
New "row" always starts a new byte
Each byte encodes up to 4 genotypes
10 indicates missing genotype, otherwise 0 and 1 point to allele 1 or allele 2 in the BIM file, respectively
Bits in each byte are read in reverse order
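
The byte layout summarized above can be sketched in Python. This is a minimal illustration and not PLINK's own code: the function names (read_mode, decode_byte, bytes_per_row) are made up for this note, and the alleles A and T are taken from the BIM example.

```python
# Sketch of the BED layout described in these notes; all names are hypothetical.
MAGIC = bytes([0b01101100, 0b00011011])  # first two bytes of a PLINK v1.00 BED file
SNP_MAJOR = 0b00000001
INDIVIDUAL_MAJOR = 0b00000000


def read_mode(header):
    """Check the magic number and report the mode stored in the third byte."""
    if header[:2] != MAGIC:
        raise ValueError("not a PLINK v1.00 BED file")
    if header[2] == SNP_MAJOR:
        return "SNP-major"
    if header[2] == INDIVIDUAL_MAJOR:
        return "individual-major"
    raise ValueError("unknown mode byte")


def decode_byte(byte, allele1, allele2, count=4):
    """Decode up to 4 genotypes from one byte, reading the bits in reverse order."""
    alleles = (allele1, allele2)
    genotypes = []
    for i in range(count):
        first = (byte >> (2 * i)) & 1       # rightmost bit not yet read
        second = (byte >> (2 * i + 1)) & 1  # the bit to its left
        if (first, second) == (1, 0):
            genotypes.append(None)          # 10 = missing genotype
        else:
            # 0 points to allele 1, 1 points to allele 2 (from the BIM file)
            genotypes.append((alleles[first], alleles[second]))
    return genotypes


def bytes_per_row(n_genotypes):
    """A new row always starts a new byte, so round up to whole bytes."""
    return (n_genotypes + 3) // 4


example = bytes([0b01101100, 0b00011011, 0b00000001, 0b11011100])
print(read_mode(example))                 # SNP-major
print(decode_byte(example[3], "A", "T"))  # [('A', 'A'), ('T', 'T'), None, ('T', 'T')]
```

With 6 individuals in SNP-major mode, bytes_per_row(6) gives 2, so the last byte of each SNP carries only 2 genotypes and its remaining bits are padding.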

COMMANDS:

--file       load PED & MAP files
--rerun      scans the specified log file and re-executes the last command line. If a command from the log is specified again, the newer one takes priority.
--script     plink --script myscript1.txt (runs the commands written in a text file)
--bfile      load binary files (BED, FAM, BIM)
--geno       calculates missingness per marker and removes SNPs with a missing rate higher than a threshold (the log reports the SNPs that failed the test)
--maf        calculates allele frequencies and removes SNPs with a MAF lower than a threshold (the log reports the SNPs that failed the test)
--make-bed   makes binary files from flat files
--adjust     adjusts for multiple testing
--model      association statistics: 2-by-3 genotype tables, standard allelic test, Cochran-Armitage trend test

more myfile.extension   opens the file in cmd (not a PLINK option)
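PED, columns 6 onward: Phenotype, Genotype
ex: 1 AT TT GG CC AT TT GG CC
Phenotype: 0=unknown; 1=unaffected; 2=affected
Genotype: 2 alleles for each marker; 0=missing
AT=the genotype of the SNP on chr 21 from the MAP example; genotypes follow the marker order of the MAP file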
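
The PED columns and codes can be illustrated with a small parser. This is a sketch under assumptions: the function name parse_ped_line is hypothetical, the two alleles of each genotype are assumed to be separate whitespace-delimited fields (the example in these notes writes them joined, e.g. AT), and the sex labels follow the standard PLINK coding (1=male, 2=female).

```python
# Hypothetical helper for illustration; not part of PLINK.
SEX_CODES = {"1": "male", "2": "female"}               # anything else: unknown
PHENOTYPE_CODES = {"1": "unaffected", "2": "affected"}  # 0: unknown


def parse_ped_line(line):
    """Split one PED row into its six leading columns and the genotype pairs."""
    fields = line.split()
    family_id, individual_id, paternal_id, maternal_id, sex, phenotype = fields[:6]
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("expected 2 alleles for each marker")
    genotypes = []
    for i in range(0, len(alleles), 2):
        pair = (alleles[i], alleles[i + 1])
        # An allele coded 0 marks the genotype as missing.
        genotypes.append(None if "0" in pair else pair)
    return {
        "family_id": family_id,
        "individual_id": individual_id,
        "paternal_id": paternal_id,
        "maternal_id": maternal_id,
        "sex": SEX_CODES.get(sex, "unknown"),
        "phenotype": PHENOTYPE_CODES.get(phenotype, "unknown"),
        "genotypes": genotypes,  # same order as the markers in the MAP file
    }


record = parse_ped_line("FAM1 NA06985 0 0 1 1 A T T T G G C C")
print(record["sex"], record["phenotype"], record["genotypes"][0])
```

The first genotype pair belongs to the first marker in the MAP file, the second pair to the second marker, and so on.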

FAM, column 6: Phenotype
ex: 1




MAP, columns 1-3: Chromosome #, rs# (marker), genetic distance (cM)
ex: 21 rs11511647 0

BIM, columns 1-3: Chromosome #, rs# (marker), genetic distance (cM)
ex: 21 rs11511647 0

Feature                       As summary statistic   As inclusion criteria

Missingness per individual    --missing              --mind N
Missingness per marker        --missing              --geno N
Allele frequency              --freq                 --maf N
Hardy-Weinberg equilibrium    --hardy                --hwe N
Mendel error rates            --mendel               --me N M
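
To make the difference between a summary statistic and an inclusion criterion concrete, the per-marker checks behind --geno N and --maf N can be sketched as follows. This is an illustration only: the function names and the default thresholds are assumptions, genotypes are assumed to be biallelic, and a missing genotype is represented as None.

```python
from collections import Counter


def missing_rate(genotypes):
    """Fraction of individuals with a missing genotype at one marker."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the rarer allele among the non-missing genotypes."""
    counts = Counter(allele for g in genotypes if g is not None for allele in g)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # nothing observed, or only one allele observed
    return min(counts.values()) / total


def passes_filters(genotypes, geno=0.1, maf=0.01):
    """Keep the marker only if it passes both thresholds (illustrative defaults)."""
    return missing_rate(genotypes) <= geno and minor_allele_frequency(genotypes) >= maf


marker = [("A", "A"), ("A", "T"), None, ("T", "T")]
print(missing_rate(marker))              # 0.25
print(minor_allele_frequency(marker))    # 0.5
print(passes_filters(marker, geno=0.1))  # False: missing rate is above the threshold
```

The first two functions correspond to the summary-statistic column, while passes_filters corresponds to the inclusion-criteria column: the marker is dropped when its statistic is on the wrong side of the threshold.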

--adjust log output, ex:
Genomic inflation factor (based on median chi-squared) is 1.18739
Mean chi-squared statistic is 1.14813
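
The two values in this log output can be approximated from a list of per-SNP chi-squared statistics. The sketch below assumes 1-degree-of-freedom tests and divides the observed median by the expected median of that distribution (about 0.4549); the function names are hypothetical and PLINK's exact constant may differ slightly.

```python
import statistics

# Approximate median of a chi-squared distribution with 1 degree of freedom.
EXPECTED_MEDIAN_CHISQ_1DF = 0.4549


def genomic_inflation_factor(chi_squared_stats):
    """Observed median chi-squared divided by the median expected under the null."""
    return statistics.median(chi_squared_stats) / EXPECTED_MEDIAN_CHISQ_1DF


def mean_chi_squared(chi_squared_stats):
    """Mean chi-squared statistic (expected to be about 1 under the null)."""
    return statistics.mean(chi_squared_stats)


stats = [0.1, 0.4549, 2.0]  # made-up statistics for illustration
print(genomic_inflation_factor(stats))
print(mean_chi_squared(stats))
```

A factor close to 1 suggests little inflation of the test statistics; values above 1, such as the 1.18739 in the example log, suggest inflation.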

MAP, column 4: physical distance (bp)
ex: 26765

BIM, columns 4-6: physical distance (bp), Allele1 (minor allele), Allele2
ex: 26765 A T