Professional Documents
Culture Documents
Genomics
Tools for Navigating the
Genome of an Individual
Saul A. Kravitz
J. Craig Venter Institute
Rockville, MD
HuRef
variant: variant:
Haploid Consensus
ACCTTTGCAATTCCC
ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
Reads ACCTTTGTAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC
Computing Allelic
Contributions
• Consensus generation modified to separate alleles
•• Consensus generation
Bioinformatics. conflates
2008 Apr alleles
15;24(8):1035-40
Computing Allelic
Contributions
• Modified Consensus generation separates allele
• Compare HuRef alleles to identify SNP, MNP, Indel Variants
Haploid Consensus
ACCTTTGCAATTCCC
True Diploid Alleles
ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
ACCTTTGTAATTCCC
Reads ACCTTTGTAATTCCC
ACCTTTACAATTCCC ACCTTTACAATTCCC
ACCTTTACAATTCCC
ACCTTTACAATTCCC MNP Variant AC / GT
HuRef Variations
• 4.1 Million Variations (12.3 Mbp)
• 1.2 Million Novel
• Many non-synonymous changes
• ~700 indels and ~10,000 total SNPs
• Indels and non-SNP Sequence Variation
• 22% of all variant events, 74% of all
variant bases
• 0.5-1.0% difference between haploid
genomes
• 5-10x higher than previous estimates
HuRef Browser
• Why do this?
• Research tool focused on variation
• Verify assembly and variants
• Show ALL the evidence
• High Perfomance
• Features
• Use HuRef or NCBI as reference
• Genome vs Genome Comparison
• Drill down from chromosome to reads and
alignments
• Overlay of Ensembl and NCBI Annotation
• Links from HuRef features in NCBI (e.g., dbSNP)
• Export of data for further analysis
http://huref.jcvi.org
Search by Feature ID or
coordinates
Haplotype
Blocks
NCBI-36 Variations
Assembly-Assembly
HuRef Mapping
Assembly Structure
Protein Truncated by 476 bp
Insertion
chr19:57578700-57581000
Heterozygous Homozygous
SNP SNP
Assembly Structure
Drill Down to
Multi-sequence Alignment
• Data volumes
– read data included from new
technologies
– Multiplication of genomes
• Enormous number of potential
comparisons
– Populations, individuals, variants
• Dynamic generation of views in web
time
• Use cases are evolving
Conclusion
• A high performance visualization tool
for an individual genome
– Validation of variants
– Comparison with NCBI-36
• Planned extensions for multi-genome
era
• Website: http://huref.jcvi.org
• Contact: hurefhelp@jcvi.org
Acknowledgements
HuRef Browser: Nelson Axelrod, Yuan Lin, and Jonathan Crabtree
Scientific Leadership: Sam Levy, Craig Venter, Robert Strausberg, Marvin Frazier
Sequence Data Generation and Indel Validation: Yu-Hui Rogers, John Gill, Jon
Borman, JTC Production, Tina McIntosh, Karen Beeson, Dana Busam, Alexia
Tsiamouri, Celera Genomics.
Data Analysis: Sam Levy, Granger Sutton, Pauline Ng, Aaron Halpern, Brian Walenz,
Nelson Axelrod, Yuan Lin, Jiaqi Huang, Ewen Kirkness, Gennady Denisov, Tim
Stockwell, Vikas Basal, Vineet Bafna, Karin Remington, and Josep Abril
CNV, Genotyping, FISH mapping: Steve Scherer, Lars Feuk, Andy Wing Chun
Pang, Jeff MacDonald