From Wikipedia,
the free encyclopedia.
As of 2004, there are around
500 public and commercial
biological databases. These
databases usually contain
genomics and
proteomics data, but databases
are also used in
taxonomy. The data are
nucleotide sequences of
genes or
amino acid sequences of
proteins. In addition, the
databases hold information about
function, structure, chromosomal
location, and the clinical effects
of mutations, as well as
similarities between biological
sequences.
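As an illustration of the kinds of data described in the introduction, a single database entry might be modelled as in the sketch below. The class name `SequenceRecord`, its fields, and the example values are all assumptions made for illustration; they do not correspond to the schema of any particular database.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: a DNA alphabet with N standing for an unknown base.
NUCLEOTIDES = set("ACGTN")


@dataclass
class SequenceRecord:
    """Hypothetical entry combining a sequence with its annotations."""
    accession: str                      # identifier of the entry (made up here)
    sequence: str                       # nucleotide or amino acid letters
    kind: str                           # "nucleotide" or "protein"
    function: Optional[str] = None      # free-text description of function
    chromosome: Optional[str] = None    # location on a chromosome, if known
    clinical_notes: List[str] = field(default_factory=list)

    def is_valid_nucleotide(self) -> bool:
        """Return True for a nucleotide record that uses only A, C, G, T, N."""
        return (
            self.kind == "nucleotide"
            and set(self.sequence.upper()) <= NUCLEOTIDES
        )


# Made-up example entry, not taken from any real database.
example = SequenceRecord(
    accession="EXAMPLE0001",
    sequence="ATGGCCTAA",
    kind="nucleotide",
    function="hypothetical example gene",
    chromosome="7",
)
print(example.is_valid_nucleotide())
```

Real databases define far richer entry formats; the sketch only shows how a sequence and its annotations can be kept together in one record.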
Overview
Biological databases have
become an important tool that
helps scientists understand
and explain a host of biological
phenomena, from the structure of
biomolecules and their
interactions, to the whole
metabolism of organisms, to the
evolution of species. This
knowledge helps in the
fight against diseases, in
the development of medical
drugs, and in discovering basic
relationships amongst species in
the
history of life.
The biological knowledge held in
databases is usually (locally)
distributed amongst many different
specialized databases. This makes
it difficult to ensure the
consistency of information, which
sometimes leads to low data
quality.
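The consistency problem can be made concrete with a small sketch. It assumes that each database can be reduced to a simple table mapping an identifier to a stored value; the function name `find_conflicts`, the database names, and the data are hypothetical.

```python
from typing import Dict


def find_conflicts(sources: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return the entries whose stored value differs between sources.

    `sources` maps a database name to its {identifier: value} table.
    The result maps each conflicting identifier to {database name: value}.
    """
    seen: Dict[str, Dict[str, str]] = {}
    for db_name, table in sources.items():
        for identifier, value in table.items():
            seen.setdefault(identifier, {})[db_name] = value
    # Keep only identifiers for which more than one distinct value was stored.
    return {
        identifier: by_db
        for identifier, by_db in seen.items()
        if len(set(by_db.values())) > 1
    }


# Made-up data: two hypothetical databases that disagree about one gene.
sources = {
    "db_a": {"gene1": "ATGGCC", "gene2": "ATGTTT"},
    "db_b": {"gene1": "ATGGCC", "gene2": "ATGTTA"},
}
conflicts = find_conflicts(sources)
print(conflicts)
```

In practice the comparison is much harder, because different databases often use different identifiers and formats for the same entity.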
By far the most important
resource for biological databases
is a special (yearly) issue of the
journal "Nucleic Acids Research" (NAR).
The Database Issue is
freely available, and categorizes
all the publicly available
online databases related to
computational biology (or
bioinformatics).
The Database Issue of NAR
See also:
NCBI,
PubMed
Most important public
databases for molecular biology
(from
www.kokocinski.net)
Primary sequence databases
-
DDBJ (DNA DataBase of Japan)
-
EMBL Nucleotide DB (European
Molecular Biology Laboratory)
-
GenBank
[1] (National
Center for Biotechnology
Information)
These databanks represent the
current knowledge about the
sequences of all organisms. They
interchange the stored information
and are the source for many other
databases.
Meta-databases
-
Entrez (National Center for
Biotechnology Information)
-
euGenes (Univ. of Indiana)
-
GeneCards (Weizmann Inst.)
-
SOURCE (Univ. of Stanford)
Strictly speaking, a meta-database
can be considered a database of
databases, rather than any one
integration project or technology.
It collects information from
various other sources and
usually makes it available in a
new and more convenient form.
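The idea of collecting information from several sources and presenting it in one place can be sketched as follows. This is a minimal illustration, not the design of any of the meta-databases listed here: the function `make_meta_lookup`, the source names, and the in-memory tables standing in for remote databases are all assumptions.

```python
from typing import Callable, Dict

# A source is modelled as a function from an identifier to a dict of fields.
Lookup = Callable[[str], Dict[str, str]]


def make_meta_lookup(
    sources: Dict[str, Lookup],
) -> Callable[[str], Dict[str, Dict[str, str]]]:
    """Build a lookup that queries every source and groups answers by source."""

    def lookup(identifier: str) -> Dict[str, Dict[str, str]]:
        combined: Dict[str, Dict[str, str]] = {}
        for name, source in sources.items():
            fields = source(identifier)
            if fields:  # skip sources that know nothing about this identifier
                combined[name] = fields
        return combined

    return lookup


# Made-up in-memory tables standing in for two separate databases.
sequence_db = {"gene1": {"sequence": "ATGGCC"}}
disease_db = {"gene1": {"disease": "example disorder"}}

meta = make_meta_lookup({
    "sequences": lambda i: sequence_db.get(i, {}),
    "diseases": lambda i: disease_db.get(i, {}),
})
report = meta("gene1")
print(report)
```

A single call to `meta` returns what every source knows about an identifier, which is the convenience a meta-database aims to provide.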
Specialized databases
-
CGAP Cancer Genes (National
Cancer Institute)
-
Clone Registry Clone Collections
(National Center for
Biotechnology Information)
-
DBGET H.sapiens (Univ. of
Kyoto)
-
Ensembl Genome Browser Annotated
Genomes (EMBL-EBI and Sanger
Inst.)
-
GDB Hum. Genome Db (Human
Genome Organization)
-
I.M.A.G.E Clone Collections
(Image Consortium)
-
KEGG Functional Db (Univ. of
Kyoto)
-
MGI Mouse Genome (Jackson
Lab.)
-
NCBI-UniGene (National
Center for Biotechnology
Information)
-
OMIM Inherited Diseases
(National Center for
Biotechnology Information)
-
Off. Hum. Genome Db (HUGO
Gene Nomenclature Committee)
-
List with SNP-Databases
Protein sequence databases
-
UniProt Universal Protein
Resource (UniProt Consortium:
EBI, Expasy, PIR)
-
PIR Protein Information
Resource (Georgetown University
Medical Center (GUMC))
-
SWISS-PROT Protein
knowledgebase (Swiss Institute
of Bioinformatics)
-
UCSC Genome Bioinformatics
Genome Browser and Tools (UCSC)
-
Ensembl Genome Browser
(Sanger Institute and EBI)
-
PEDANT Protein Extraction,
Description and ANalysis Tool (Forschungszentrum
f. Umwelt & Gesundheit)
-
PROSITE Database of Protein
Families and Domains
-
DIP Database of Interacting
Proteins (Univ. of California)
-
Pfam Protein families
database of alignments and HMMs
(Sanger Institute)
-
ProDom Comprehensive set of
Protein Domain Families (INRA/CNRS)
-
SignalP Server for signal
peptide prediction
Protein structure databases
-
PDB Protein Data Bank
(Research Collaboratory for
Structural Bioinformatics (RCSB))
-
CATH Protein Structure
Classification
-
SCOP Structural
Classification of Proteins
-
SWISS-MODEL Server and
Repository for Protein Structure
Models
-
ModBase Database of
Comparative Protein Structure
Models (Sali Lab, UCSF)
Microarray databases
-
ArrayExpress (European
Bioinformatics Institute)
-
Gene Expression Omnibus
(National Center for
Biotechnology Information)
-
maxd (Univ. of Manchester)
-
SMD (Univ. of Stanford)
-
GPX (Scottish Centre for
Genomic Technology and
Informatics)
Protein-Protein Interactions
-
The General Repository for
Interaction Datasets (The GRID)
(Samuel Lunenfeld Research
Institute)