From Wikipedia,
the free encyclopedia.
The GenBank
sequence database is an
annotated collection of all
publicly available
nucleotide sequences and their
protein translations. This
database is produced at the
National Center for Biotechnology
Information (NCBI) as part of
an international collaboration
with the
European Molecular Biology
Laboratory (EMBL) Data Library
from the
European Bioinformatics Institute
(EBI) and the
DNA Data Bank of Japan (DDBJ).
GenBank and its collaborators
receive sequences from more than
100,000 distinct organisms,
produced in laboratories throughout
the world. GenBank continues to
grow at an exponential rate,
doubling every 10 months. Release
134, produced in
February 2003, contained over
29.3 billion nucleotide bases in
more than 23.0 million sequences.
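
The growth figures above can be turned into a rough projection. The following is a minimal sketch, assuming the stated 10-month doubling time holds constant, which is an idealization; the function and constant names are illustrative and not part of any GenBank tool.

```python
# Rough projection of database size under a constant doubling time.
# Assumption: size(t) = size(0) * 2 ** (t / doubling_time), with t in months.

RELEASE_134_BASES = 29.3e9      # nucleotide bases in Release 134 (February 2003)
RELEASE_134_SEQUENCES = 23.0e6  # sequences in Release 134


def projected_bases(initial_bases, months_elapsed, doubling_months=10.0):
    """Return the projected number of bases after months_elapsed months."""
    return initial_bases * 2 ** (months_elapsed / doubling_months)


def average_sequence_length(total_bases, total_sequences):
    """Return the mean number of bases per sequence."""
    return total_bases / total_sequences


if __name__ == "__main__":
    for months in (0, 10, 20, 30):
        size = projected_bases(RELEASE_134_BASES, months)
        print(f"{months:2d} months after Release 134: {size / 1e9:.1f} billion bases")
    mean_length = average_sequence_length(RELEASE_134_BASES, RELEASE_134_SEQUENCES)
    print(f"Mean sequence length in Release 134: {mean_length:.0f} bases")
```

Under this assumption the database would hold about twice the Release 134 total ten months later; the real growth rate need not stay constant.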
GenBank is built by direct
submissions from individual
laboratories, as well as from bulk
submissions from large-scale
sequencing centers.
Direct submissions are made to
GenBank using
BankIt, a Web-based form, or
Sequin, a stand-alone
submission program. Upon receipt of a
sequence submission, the GenBank
staff assigns an
accession number to the
sequence and performs quality
assurance checks. The submissions
are then released to the public
database, where the entries are
retrievable through
Entrez or downloadable by
FTP. Bulk submissions of
Expressed Sequence Tag (EST),
Sequence Tagged Site (STS),
Genome Survey Sequence (GSS),
and
High-Throughput Genome Sequence
(HTGS) data are most often
submitted by large-scale
sequencing centers. The GenBank
direct submissions group also
processes complete microbial
genome sequences.
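
Retrieval of a released entry by its accession number, as described above, can also be done programmatically. The sketch below assumes NCBI's E-utilities `efetch` endpoint, a web interface to Entrez that the text above does not itself describe. The function names are illustrative, and the accession `U49845` is only an example value.

```python
# Sketch: fetch one GenBank flat-file record by accession number.
# Assumption: NCBI's E-utilities efetch endpoint is used as the Entrez interface.
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, db="nuccore", rettype="gb"):
    """Build the efetch URL for one accession in GenBank flat-file format."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH_BASE}?{urlencode(params)}"


def fetch_genbank_record(accession):
    """Download the record as text. Requires network access."""
    with urlopen(efetch_url(accession), timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Example accession only; substitute any released accession number.
    record = fetch_genbank_record("U49845")
    print(record.splitlines()[0])
```

Building the URL separately from the download keeps the request easy to inspect before any network call is made.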