Computational molecular
docking, sometimes called
virtual screening or ligand
docking, is a research
technique for predicting whether a
small molecule, called a
ligand, will bind to a
protein. This is done by
modelling the interaction
between protein and ligand: if the
geometry of the pair is
complementary and involves
favorable
biochemical interactions, the
ligand will potentially bind the
protein
in vitro or
in vivo.
Applications
A binding interaction may mean
that the ligand
inhibits the protein's
function or acts as an
agonist. Docking is most
pertinent to the field of
drug design—most drugs are
small molecules, and using a
computational approach allows
researchers to quickly screen
large databases of potential drugs
(e.g., the
ZINC database of compounds for
virtual screening) against protein
targets such as
HIV
reverse transcriptase.
Traditional discovery of drug
candidates occurs by chance or
through painstaking work in the
lab. Virtual screening and
related combinatorial chemistry
techniques are particularly
important in searching for new
antibiotics, as strains of
resistant bacteria increasingly
appear due to overuse of
antibiotics. Penicillins, for
example, are becoming less
effective as bacteria continue to
evolve beta-lactamases that confer
resistance to nearly every
penicillin derivative that has
been found to be active.
The mechanics of docking
To perform a docking screen,
the first requirement is a
structure of the protein of
interest. Usually the structure
has been determined in the lab
using a biophysical technique such
as
X-ray crystallography or, less
often,
NMR spectroscopy. This protein
structure and a database of
potential ligands serve as inputs
to a docking program. The success
of a docking program depends on
two components: the
search algorithm and the
scoring function.
The search algorithm
The
search space consists of all
possible orientations and
conformations of the protein
paired with the ligand. With
present computing resources, it is
impossible to exhaustively explore
the search space—this would
involve enumerating all possible
distortions of each molecule
(molecules are dynamic and exist
in an ensemble of conformational
states) and all possible
rotational and translational
orientations of the ligand
relative to the protein at a given
level of
granularity. Most docking
programs in use account for a
flexible ligand, and several are
attempting to model a flexible
protein receptor. Each "snapshot"
of the pair is referred to as a
pose. There are many
strategies for sampling the search
space. Here are some examples:
- Use a coarse-grained
molecular dynamics
simulation to propose
energetically reasonable poses
- Use a "linear
combination" of multiple
structures determined for the
same protein to emulate receptor
flexibility
- Use a
genetic algorithm to
"evolve" new poses that are
successively more and more
likely to represent favorable
binding interactions
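The genetic-algorithm strategy in the last item can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular docking program: a pose is reduced to six numbers (three translations and three rotation angles), and `toy_score` is a made-up stand-in for a real scoring function, with lower values treated as better.

```python
import random

# Hypothetical pose: 3 translations (angstroms) + 3 rotation angles (radians).
POSE_LENGTH = 6

def toy_score(pose):
    """Stand-in scoring function (an assumption, not a real force field).
    Lower is better, with a minimum of 0 at an arbitrary 'ideal' pose."""
    ideal = [1.0, -2.0, 0.5, 0.3, 1.2, -0.7]
    return sum((p - q) ** 2 for p, q in zip(pose, ideal))

def random_pose(rng):
    return [rng.uniform(-5.0, 5.0) for _ in range(POSE_LENGTH)]

def crossover(parent_a, parent_b, rng):
    # Uniform crossover: each coordinate is copied from one parent or the other.
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def mutate(pose, rng, rate=0.2, step=0.5):
    # Perturb each coordinate with probability `rate` by Gaussian noise.
    return [p + rng.gauss(0.0, step) if rng.random() < rate else p for p in pose]

def evolve(score, generations=50, population_size=40, seed=0):
    """Return the best pose found and the best score seen at each generation."""
    rng = random.Random(seed)
    population = [random_pose(rng) for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score)
        history.append(score(population[0]))
        # Elitist selection: the better half survives unchanged.
        survivors = population[: population_size // 2]
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(population_size - len(survivors))
        ]
        population = survivors + children
    best = min(population, key=score)
    return best, history

best_pose, best_scores = evolve(toy_score)
```

Because the surviving half is carried over unchanged, the best score in `best_scores` never gets worse from one generation to the next. A real program would replace `toy_score` with its own scoring function and would also encode the ligand's internal torsion angles in the pose.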
The scoring function
The scoring function takes a
pose as input and returns a number
indicating the likelihood that the
pose represents a favorable
binding interaction.
Most scoring functions are
physics-based
molecular mechanics
force fields that estimate the
energy of the pose; a low
(negative) energy indicates a
stable system and thus a likely
binding interaction. An
alternative approach is to derive
a statistical potential for
interactions from a large database
of protein-ligand complexes, such
as the
Protein Data Bank, and
evaluate the fit of the pose
according to this inferred
potential.
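As a rough illustration of the physics-based approach, the sketch below sums a Lennard-Jones term and a Coulomb term over every protein-ligand atom pair. It is a simplification under stated assumptions: the `epsilon`, `sigma` and `dielectric` values are illustrative placeholders applied to all atoms alike, whereas a real force field assigns parameters per atom type and includes further terms.

```python
import math

# Coulomb conversion factor in kcal*angstrom/(mol*e^2).
COULOMB_CONSTANT = 332.06

def pair_energy(distance, charge_a, charge_b,
                epsilon=0.2, sigma=3.4, dielectric=4.0):
    """Energy of one atom pair in kcal/mol (illustrative parameters)."""
    ratio6 = (sigma / distance) ** 6
    # 12-6 Lennard-Jones: steep repulsion at short range, weak attraction beyond.
    lennard_jones = 4.0 * epsilon * (ratio6 ** 2 - ratio6)
    # Coulomb interaction between partial charges, damped by a dielectric.
    coulomb = COULOMB_CONSTANT * charge_a * charge_b / (dielectric * distance)
    return lennard_jones + coulomb

def score_pose(protein_atoms, ligand_atoms):
    """Sum pair energies; each atom is a tuple (x, y, z, partial_charge)."""
    total = 0.0
    for px, py, pz, p_charge in protein_atoms:
        for lx, ly, lz, l_charge in ligand_atoms:
            distance = math.dist((px, py, pz), (lx, ly, lz))
            total += pair_energy(distance, p_charge, l_charge)
    return total

# Hypothetical one-atom "protein" and two candidate ligand placements.
protein = [(0.0, 0.0, 0.0, -0.5)]
good_contact = [(4.0, 0.0, 0.0, 0.5)]   # opposite charges at contact distance
steric_clash = [(1.0, 0.0, 0.0, 0.5)]   # atoms overlapping
```

With these placeholder values, `score_pose(protein, good_contact)` is negative (favorable) while `score_pose(protein, steric_clash)` is large and positive, which is the behavior a search algorithm relies on to discard overlapping poses.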
All scoring functions used in
docking will yield a large number
of
false positive hits, i.e.,
ligands predicted to bind to the
protein that do not actually bind
when tested in vitro.
One way to reduce the number of
false positives is to recalculate
the energy of the top-hit poses
using a more accurate (and
therefore slower) technique such
as the generalized Born or
Poisson-Boltzmann methods [1].
Even so, a researcher will
typically screen a database
of tens to hundreds of thousands
of compounds and test the top 60
or so in vitro; identifying
any true binders is still
considered a success.
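The rescoring workflow described above can be sketched as a two-stage ranking. This is an illustrative outline only: the compound names and both sets of scores are made up, and `fast_score` and `slow_score` stand in for a docking score and a more expensive rescoring method respectively, with lower values treated as better.

```python
def two_stage_screen(compounds, fast_score, slow_score,
                     shortlist_size, final_size):
    """Rank everything with the cheap score, then re-rank only a shortlist
    with the expensive score. Lower scores are better in both stages."""
    shortlist = sorted(compounds, key=fast_score)[:shortlist_size]
    rescored = sorted(shortlist, key=slow_score)
    return rescored[:final_size]

# Made-up scores for five hypothetical compounds: (fast estimate, slow estimate).
TOY_SCORES = {
    "cmpd_a": (-9.0, -2.0),
    "cmpd_b": (-8.5, -7.5),
    "cmpd_c": (-8.0, -6.0),
    "cmpd_d": (-3.0, -9.0),
    "cmpd_e": (-1.0, -1.0),
}

picks = two_stage_screen(
    TOY_SCORES,
    fast_score=lambda name: TOY_SCORES[name][0],
    slow_score=lambda name: TOY_SCORES[name][1],
    shortlist_size=3,
    final_size=2,
)
print(picks)  # ['cmpd_b', 'cmpd_c']
```

In this toy data, `cmpd_a` looks best to the fast score but is demoted by rescoring, which is the false-positive reduction the text describes. Note that `cmpd_d` is never rescored because it misses the shortlist, so the second stage cannot recover compounds that the first stage ranks poorly.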
Reference
- [1] Feig, et al. (2004).
Performance comparison of
generalized Born and Poisson
methods in the calculation of
electrostatic solvation energies
for protein structures. J
Comput Chem. 25(2):265-84.