INTRODUCTION
CONTENTS
• Overview
• Public Databases
○ 1) Primary sequence databases
○ 2) Meta-databases
○ 3) Genome Databases
○ 4) Genome Browsers
○ 7) Protein-protein interactions
○ 9) Microarray databases
OVERVIEW
PUBLIC DATABASES
Primary sequence databases
The International Nucleotide Sequence Database (INSD)
consists of the following databases.
1. DDBJ (DNA Data Bank of Japan)
6. Bioinformatics Harvester [3] (Karlsruhe Institute of
Technology) - Integrating 26 major protein/gene
resources.
7. MetaBase [4] (KOBIC) - A user contributed database of
biological databases.
Genome Databases
8. National Microbial Pathogen Data Resource. A
manually curated database of annotated genome data
for the pathogens Campylobacter, Chlamydia,
Chlamydophila, Haemophilus, Listeria, Mycoplasma,
Neisseria, Staphylococcus, Streptococcus, Treponema,
Ureaplasma, and Vibrio.
9. Saccharomyces Genome Database, genome of the yeast
model organism.
10. Viral Bioinformatics Resource Center. A curated
database containing annotated genome data for eleven
virus families.
11. The SEED platform for microbial genome analysis
includes all complete microbial genomes, and most
partial genomes. The platform is used to annotate
microbial genomes using subsystems.
12. WormBase, genome of the model organism
Caenorhabditis elegans.
13. Zebrafish Information Network, genome of this fish
model organism.
Genome Browsers
Genome browsers enable researchers to visualize and
browse entire genomes (most host many complete
genomes) together with annotated data, including gene
prediction and structure, proteins, expression, regulation,
variation, and comparative analysis. The annotated data
usually comes from multiple diverse sources.
1. Integrated Microbial Genomes (IMG) system by the
DOE-Joint Genome Institute
2. UCSC Genome Bioinformatics Genome Browser and
Tools (UCSC)
3. Ensembl The Ensembl Genome Browser (Sanger
Institute and EBI)
4. GBrowse The GMOD GBrowse Project
5. Pathway Tools Genome Browser
6. X:Map A genome browser that shows Affymetrix Exon
Microarray hit locations alongside the gene, transcript
and exon data, displayed using the Google Maps API
7. Viral Genome Organizer (VGO) A genome browser
providing visualization and analysis tools for annotated
whole genomes from the eleven virus families in the
VBRC (Viral Bioinformatics Resource Center)
databases
8. Apollo Genome Annotation Curation Tool A
cross-platform, Java-based standalone genome viewer with
enterprise-level functionality and customizations. It is
the standard for many model organism databases.
9. SEED viewer for visualizing and interrogating the
SEED database of complete microbial genomes
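The idea behind the browsers listed above, annotated data from several sources laid over genome coordinates, can be illustrated with a small sketch. This is not how any of the listed browsers is implemented. The Feature class, the features_in_region function, the source names and the coordinates are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotation placed on a genome (hypothetical structure)."""
    source: str   # where the annotation came from (made-up names below)
    kind: str     # e.g. "gene", "expression", "variation"
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive
    name: str


def features_in_region(features, start, end):
    """Return the features overlapping [start, end], ordered by start."""
    overlapping = [f for f in features if f.start <= end and f.end >= start]
    return sorted(overlapping, key=lambda f: f.start)


# Annotations from three illustrative sources on one stretch of genome.
tracks = [
    Feature("gene_predictor", "gene", 100, 900, "geneA"),
    Feature("expression_study", "expression", 850, 1200, "probe1"),
    Feature("variation_catalog", "variation", 1500, 1500, "snp1"),
]

# A browser window covering positions 800 to 1000.
visible = features_in_region(tracks, 800, 1000)
names = [f.name for f in visible]
print(names)
```

Only the features that overlap the requested window are returned, whichever source they came from. A real browser would use an indexed structure rather than a linear scan.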
Protein Sequence Databases
1. UniProt[5] Universal Protein Resource (UniProt
Consortium: EBI, Expasy, PIR)
2. PIR Protein Information Resource (Georgetown
University Medical Center (GUMC))
3. Swiss-Prot[6] Protein Knowledgebase (Swiss Institute
of Bioinformatics)
4. PEDANT Protein Extraction, Description and ANalysis
Tool (Forschungszentrum f. Umwelt & Gesundheit)
5. PROSITE Database of Protein Families and Domains
1. PathoOligoDB: A free QPCR oligo database for
pathogens
Specialized Databases
A biological database is a large, organized body of
persistent data, usually associated with
computerized software designed to update, query,
and retrieve components of the data stored within the
system. A simple database might be a single file
containing many records, each of which includes the
same set of information. For example, a record
associated with a nucleotide sequence database
typically contains information such as contact name;
the input sequence with a description of the type of
molecule; the scientific name of the source organism
from which it was isolated; and, often, literature
citations associated with the sequence.
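The record described above can be sketched as a small data structure. This is a minimal illustration and does not reproduce the format of any real database. The class names, field names and sample values are hypothetical; the fields simply mirror the items in the paragraph (contact name, sequence, molecule type, source organism, citations).

```python
from dataclasses import dataclass, field


@dataclass
class SequenceRecord:
    """One record of a hypothetical nucleotide sequence database."""
    contact_name: str
    sequence: str
    molecule_type: str
    organism: str
    citations: list = field(default_factory=list)


class SimpleDatabase:
    """Toy in-memory stand-in for a single file holding many records."""

    def __init__(self):
        self.records = []

    def update(self, record):
        # Add a new record to the stored data.
        self.records.append(record)

    def query(self, organism):
        # Retrieve every record isolated from the given organism.
        return [r for r in self.records if r.organism == organism]


db = SimpleDatabase()
db.update(SequenceRecord(
    contact_name="A. Researcher",            # made-up sample value
    sequence="ATGGCCATTGTAATG",              # made-up sample value
    molecule_type="DNA",
    organism="Saccharomyces cerevisiae",
    citations=["Example citation (placeholder)"],
))
db.update(SequenceRecord(
    contact_name="B. Researcher",
    sequence="AUGGCUAUU",
    molecule_type="RNA",
    organism="Caenorhabditis elegans",
))

hits = db.query("Saccharomyces cerevisiae")
print(len(hits), hits[0].molecule_type)
```

Every record carries the same set of fields, which is what makes it possible to update, query and retrieve the data in a uniform way.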
1) www.ncbi.nlm.nih.gov
2) www.wikipedia.org/wiki/sanger_institute
3) Biotechnology, U. Satyanarayana