From Wikipedia,
the free encyclopedia.
An intein is a segment
of a
protein that is able to excise
itself and rejoin the remaining
portions (the exteins) with a
peptide bond. Inteins have
also been called "protein
introns".
Most reported inteins also
contain an
endonuclease domain that plays
a role in intein propagation. In
fact, many
genes have unrelated intein-coding
segments inserted at different
positions. For these and other
reasons, inteins (or more
properly, the gene segments coding
for inteins) are sometimes called
selfish genetic elements,
but it may be more accurate to
call them
parasitic.
Intein-mediated
protein splicing occurs after
mRNA has been translated into
a protein. This precursor protein
contains three segments: an N-extein
followed by the intein, followed by
a C-extein. After splicing has
taken place, the resulting protein is also
called an extein.
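The three-segment layout described above can be illustrated with simple string handling. The following is a minimal sketch only: the `Precursor` class, the `splice` function, and the toy sequences are hypothetical names and data introduced for illustration, and the code models just the outcome of splicing (intein removed, exteins joined), not the underlying chemistry.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Precursor:
    """Toy model of a precursor protein (hypothetical, for illustration)."""
    n_extein: str
    intein: str
    c_extein: str

    @property
    def sequence(self) -> str:
        # Order after translation: N-extein, then intein, then C-extein.
        return self.n_extein + self.intein + self.c_extein


def splice(precursor: Precursor) -> Tuple[str, str]:
    """Return (joined exteins, excised intein)."""
    return precursor.n_extein + precursor.c_extein, precursor.intein


# Made-up one-letter amino acid strings, not real protein sequences.
p = Precursor(n_extein="MKTAY", intein="CLAEGTRIFDHN", c_extein="SGGLV")
mature, excised = splice(p)
```

Here `mature` holds the two exteins joined end to end and `excised` holds the intein, so the lengths of the two results add up to the length of `p.sequence`.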
The first intein was discovered
in
1987. Since then, inteins have
been found in all three domains of
life (eukaryotes, bacteria, and
archaea). The splicing mechanism
is a natural analogue of
native chemical ligation,
a technique for chemically
generating large proteins
that was developed at about the same
time as inteins were discovered.
Inteins in biotechnology
Inteins are very efficient at
protein splicing and they have
accordingly found an important
role in
biotechnology. Inteins have
been engineered for particular
applications such as protein
synthesis and the selective
labeling of protein segments,
which is useful for
NMR studies of large proteins.
Pharmaceutical inhibition of
intein excision may be a useful tool
for
drug development: a protein
that contains an intein will not
carry out its normal function if
the intein does not excise, since
its structure will be disrupted.
A notable use of inteins
in biotechnology comes from
David R. Liu's
studies of the directed evolution
of molecular switches. Through the
process of selection, his group
obtained a versatile intein
component whose activity
depends on the binding of a
small molecule,
4-hydroxytamoxifen. This binding
event triggers a conformational
change that restores the activity of a
protein which was engineered to
be disrupted by the intein.
Intein naming conventions
The first part of an intein
name is based on the
scientific name of the
organism in which it is found,
and the second part is based on
the name of the corresponding gene
or extein. For example, the intein
found in
Thermoplasma acidophilum
and associated with 'Vacuolar
ATPase subunit A' (VMA) is called
'Tac VMA'.
Normally, as in this example,
just three letters suffice to
specify the organism, but there
are variations. For example,
additional letters may be added to
indicate a strain. If more than
one intein is encoded in the
corresponding gene, the inteins
are given a numerical suffix,
numbered from 5' to 3' or in order
of their identification, for
example "Msm dnaB-1".
The segment of the gene that
encodes the intein is usually
given the same name as the intein,
but to avoid confusion, the name
of the intein proper is usually
capitalized (e.g. Pfu RIR1-1),
whereas the name of the
corresponding gene segment is
italicized.
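The naming convention in this section can be sketched as a small helper. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the abbreviation rule (first letter of the genus plus the first two letters of the species) is an assumption inferred from the 'Tac VMA' example, not an official rule. Strain letters and other variations are not handled.

```python
from typing import Optional


def organism_prefix(scientific_name: str) -> str:
    """Abbreviate 'Genus species' to three letters (assumed rule)."""
    genus, species = scientific_name.split()[:2]
    return genus[0].upper() + species[:2].lower()


def intein_name(scientific_name: str, gene: str,
                index: Optional[int] = None) -> str:
    """Build an intein name: organism prefix, gene or extein name, and an
    optional numerical suffix for genes that encode more than one intein."""
    name = f"{organism_prefix(scientific_name)} {gene}"
    if index is not None:
        name += f"-{index}"
    return name


# Example taken from the text above.
tac = intein_name("Thermoplasma acidophilum", "VMA")
# Assumes 'Msm' abbreviates Mycobacterium smegmatis.
msm = intein_name("Mycobacterium smegmatis", "dnaB", index=1)
```

The optional `index` argument mirrors the numerical suffix used when a gene encodes more than one intein.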
Full and mini inteins
Inteins can contain a
homing endonuclease gene (HEG)
domain in addition to the splicing
domains. This domain is
responsible for the spread of the
intein by cleaving DNA at an
intein-free allele on the
homologous chromosome,
triggering the
DNA double-strand break repair
system, which then repairs the
break, thus copying the intein
into a previously intein-free
site. The HEG domain is not
necessary for intein splicing, and
so it can be lost, forming a
minimal, or mini, intein. Several
studies have demonstrated the
modular nature of inteins by
adding or removing HEG domains and
determining the activity of the
new construct.
Split inteins
Sometimes, the intein of the
precursor protein comes from two
genes. In this case, the intein is
said to be a split intein. For
example, in
cyanobacteria, DnaE, the
catalytic alpha subunit of DNA
polymerase III, is encoded by two
separate genes, dnaE-n and dnaE-c.
The dnaE-n product consists of an
N-extein sequence followed by a
123-aa (amino acid) intein
sequence, whereas the dnaE-c
product consists of a 36-aa intein
sequence followed by a C-extein
sequence.
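The layout of the two DnaE gene products can be sketched as sequence bookkeeping. This is a hedged illustration only: the `trans_splice` function and the placeholder sequences are hypothetical, the extein lengths are made up, and the code assumes the two intein fragments simply combine and are removed while the exteins are joined. Only the fragment lengths (123 and 36 amino acids) come from the text.

```python
from typing import Tuple

# Intein fragment lengths taken from the text above.
INTEIN_N_LEN = 123  # intein residues at the end of the dnaE-n product
INTEIN_C_LEN = 36   # intein residues at the start of the dnaE-c product


def trans_splice(n_product: str, c_product: str) -> Tuple[str, str]:
    """Return (joined exteins, reassembled intein) for a split intein."""
    if len(n_product) < INTEIN_N_LEN or len(c_product) < INTEIN_C_LEN:
        raise ValueError("gene product shorter than its intein fragment")
    split_at = len(n_product) - INTEIN_N_LEN
    n_extein, intein_n = n_product[:split_at], n_product[split_at:]
    intein_c, c_extein = c_product[:INTEIN_C_LEN], c_product[INTEIN_C_LEN:]
    return n_extein + c_extein, intein_n + intein_c


# Placeholder residues only; these are not real DnaE sequences.
dnae_n_product = "M" * 10 + "C" * INTEIN_N_LEN
dnae_c_product = "N" * INTEIN_C_LEN + "S" * 8
spliced, intein = trans_splice(dnae_n_product, dnae_c_product)
```

In this sketch the two intein fragments together span 159 residues, and `spliced` contains only the N-extein followed by the C-extein.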