From Wikipedia,
the free encyclopedia.
CpG sites are regions of
the
DNA where a
cytosine
nucleotide occurs next to a
guanine
nucleotide. "CpG" stands for
cytosine and guanine separated by
a
phosphate, which links the two
nucleotides together in DNA. The "CpG"
notation is used to distinguish a
cytosine followed by guanine on the same strand from
a CG base pair, in which the two bases sit on opposite strands.
Assuming a random distribution
of nucleotides, the
probability of a cytosine being
followed by a guanine
is about 1 in 16 (taking all four bases as equally frequent). However, there are
actually very few CpG sites in
eukaryotic
genomes. This is due to the
action of
DNA methyltransferase, which
recognizes these CpG sites and
methylates the cytosine,
turning it into
5-methylcytosine. Following
spontaneous
deamination, the
5-methylcytosine converts into
thymine. Because thymine is a normal DNA base, the error is often
not corrected by the repair machinery; if the mutation has no
harmful effect (as in most cases), it persists, thus
resulting in the loss of the CpG
site. CpG sites thus tend to be
eliminated from the
genomes of
eukaryotes.
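The depletion described above is usually quantified as an observed/expected ratio. The following is a minimal sketch, assuming the expected CpG count is taken as (number of C × number of G) / sequence length, i.e. that the two bases occur independently; the function name `cpg_obs_exp` is hypothetical and chosen only for illustration.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for a single DNA strand.

    Assumption: the expected count treats C and G as occurring
    independently, so expected = (#C * #G) / length.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    if n == 0 or c == 0 or g == 0:
        # No CpG can exist without both bases; report 0.0 rather than divide by zero.
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every CpG dinucleotide.
    observed = seq.count("CG")
    expected = c * g / n
    return observed / expected


print(cpg_obs_exp("CGCGCGCG"))  # CpG-rich toy sequence
print(cpg_obs_exp("CATGCATG"))  # contains C and G but no CpG
```

A ratio well below 1 indicates that CpG sites are rarer than base composition alone would predict, which is the pattern the text describes for most of a eukaryotic genome.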
However, there are regions of
the DNA which have a high
concentration of CpG sites. These
regions, known as
CpG islands, are often found at or near the
promoters of eukaryotic
genes. Surprisingly, these CpG
sites are usually unmethylated, so a
spontaneous deamination converts
cytosine to uracil, which is not a
normal DNA base; it is therefore recognized by the
repair machinery and the CpG site
is restored.
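The contrast between the two deamination outcomes can be written down as a toy model. This is only an illustrative sketch: the function name `deaminate_cpg` is hypothetical, and it assumes the simplest possible rule, namely that a thymine produced from 5-methylcytosine always escapes repair while a uracil produced from unmethylated cytosine is always repaired.

```python
def deaminate_cpg(seq: str, pos: int, methylated: bool) -> str:
    """Toy model: sequence after deamination of the cytosine at seq[pos].

    Assumptions (simplifications, not measured repair rates):
      * methylated C   -> T, never repaired, so the CpG site is lost
      * unmethylated C -> U, always repaired, so the CpG site is restored
    """
    if seq[pos:pos + 2] != "CG":
        raise ValueError("pos must point at the C of a CpG site")
    if methylated:
        # 5-methylcytosine deaminates to thymine, a normal DNA base.
        return seq[:pos] + "T" + seq[pos + 1:]
    # Cytosine deaminates to uracil, which is foreign to DNA and is
    # assumed here to be excised and replaced by cytosine.
    return seq


print(deaminate_cpg("AACGTT", 2, methylated=True))   # CpG becomes TpG
print(deaminate_cpg("AACGTT", 2, methylated=False))  # CpG is kept
```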
A high occurrence of CpGs in many
cases marks the existence of
downstream genes and is frequently
used in genome annotation as
an indicator of gene density.
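As a rough illustration of how CpG density can be used in annotation, the sketch below scans a sequence with a sliding window and reports windows that look CpG-rich. The thresholds (200-base window, GC fraction above 0.5, observed/expected CpG ratio above 0.6) are commonly cited defaults and are treated here as assumptions rather than as the method of any particular annotation pipeline; the function name `find_cpg_rich_windows` is hypothetical.

```python
def find_cpg_rich_windows(seq, window=200, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Return start offsets of windows passing both thresholds.

    Assumed criteria: GC fraction > min_gc and
    observed/expected CpG ratio > min_obs_exp within each window.
    """
    seq = seq.upper()
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        c, g = w.count("C"), w.count("G")
        if c == 0 or g == 0:
            continue  # a window lacking C or G cannot contain a CpG
        gc_fraction = (c + g) / window
        # observed / expected, with expected = c * g / window
        obs_exp = w.count("CG") * window / (c * g)
        if gc_fraction > min_gc and obs_exp > min_obs_exp:
            hits.append(start)
    return hits


# Synthetic example: an AT-only flank, a CpG-dense core, another AT-only flank.
toy = "AT" * 100 + "CG" * 100 + "AT" * 100
print(find_cpg_rich_windows(toy, window=200, step=100))
```

Overlapping or adjacent hits would normally be merged into a single island; that step is left out to keep the sketch short.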
Note: the mechanisms of
methylation and de-methylation are
largely unknown, as are the
various enzymes and modes of
regulation.