Chapter 3:
DNA IN THE LABORATORY
Basic Genetics: Why DNA Typing Works
There are about 100 trillion cells in the adult human
body. Most of them have a nucleus, or center, that contains thread-like
bundles of chromosomes. In these chromosomes are all of the instructions
and information needed to make a human being. Each parent contributes
one chromosome to each of the 23 pairs found in all normal people.
Within the chromosomes are up to 100,000 paired genes, the fundamental
units of heredity. Each gene can have different versions (as many
as 100 or more in rare cases) called alleles, but most are the same
from person to person. Genes determine all inherited traits including
those that give the individual specific characteristics (blue eyes
rather than brown eyes) as well as common characteristics (two eyes,
two arms, etc.).
Genes are made of deoxyribonucleic acid (DNA). Hence,
DNA is the master molecule of life and controls the growth and development
of every living thing. It is a polymer, i.e., a long string of simple
repeating units. These repeating units are called nucleotides and
are of four types: adenine (A), cytosine (C), guanine (G), and thymine
(T). Just as the order of the letters of the alphabet determines
the information content of words, the order in which these four
bases are strung together is what gives DNA its information content.
The complete DNA molecule consists of two of these strands of the
four bases.
In the two strands, A always is across from or paired
with T and G always is paired with C. These are the base pairs that
are the unit of measurement in determining the size of a given segment
of DNA. This structure suggests a natural mechanism for the duplication
or replication of the DNA molecule, as occurs during cell division.
These pairings are what connect the two strands of DNA together
to form a tightly coiled, twisted ladder. This spiral staircase,
the famous double helix, is the natural form in which DNA is found
within the nucleus of the cells.
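The pairing rule above can be sketched in a few lines of Python. This
is only an illustration: the function name and the example sequence
are made up for the purpose, and the sketch ignores strand direction.

```python
# Pairing rules stated in the text: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand):
    """Return the partner strand, base by base, in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())

template = "GATTACA"          # hypothetical example sequence
partner = complement(template)
print(partner)                # CTAATGT

# Applying the rule twice gives back the original strand, which is why
# either strand alone is enough to rebuild the other during replication.
print(complement(partner) == template)  # True
```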
If uncoiled, the DNA molecules in every human cell
would measure six feet in length. That is the total length of the
3.3 billion base pairs that make up the total human genetic complement
or genome. Except for identical twins, the sequence of the base
pairs within the DNA helix is unique for every person, and forms
the individual's genetic code or blueprint.
Perhaps the basis of DNA typing can be best understood
by comparing the way in which genetic information is stored in the
DNA to the way in which printed information is stored in books.
For example, if we were to cut all the sentences in forty volumes
of the Encyclopedia Britannica into strips, and tape them together
end to end, then we would have an amount of information equivalent
to that contained in the DNA within each of the cells that make
up our bodies. Furthermore, the information would then be in the
same physical form as the DNA information, i.e., a long linear strip
sometimes likened to a computer punch tape.
The genetic information contained in the DNA is organized
and packaged into chromosomes, much as printed information is organized
into volumes. Just as a specific passage in the encyclopedia can
be identified by specifying a volume, page, and line number, a specific
genetic passage or location, known as a locus, can be identified.
A specific naming system identifies genes by numbers issued by the
Human Gene Mapping Committee. For example, if we see the designation
D4S139 in a report, then we know exactly what gene has been
analyzed, that it is on chromosome four, and that it is the 139th
DNA probe to be mapped to chromosome four.
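As a rough illustration, a designation such as D4S139 can be split
into its parts mechanically. The pattern below is an assumption
inferred from that single example (the letter D, a chromosome, the
letter S, and a serial number); it is not an official grammar, and
the function name is hypothetical.

```python
import re

# Assumed shape: D<chromosome>S<serial number>, as in D4S139.
LOCUS_PATTERN = re.compile(r"^D(\d{1,2}|X|Y)S(\d+)$")

def parse_locus(designation):
    """Split a locus designation into its chromosome and serial number."""
    match = LOCUS_PATTERN.match(designation.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized locus designation: {designation!r}")
    chromosome, serial = match.groups()
    return {"chromosome": chromosome, "serial": int(serial)}

print(parse_locus("D4S139"))  # {'chromosome': '4', 'serial': 139}
```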
A significant difference between the way information
is stored in the cell and in the encyclopedia is that there are
two copies of the information in each of the cells, one from the
mother and one from the father. These two copies of the genetic
information, which are largely identical, come together at the moment
of conception when the sperm and the egg join together. All of the
child's cells contain DNA derived from this original fertilized
cell, half from the mother and half from the father. It is this
basic principle of heredity, first discovered over 150 years ago,
that allows us to reliably perform parentage tests.
Variable DNA: The Key to DNA Typing
The DNA and hence the genetic code of humans is almost
the same for all individuals. It is the very small amount that differs
from person to person that forensic scientists analyze to identify
people. These differences are called polymorphisms (from the Greek
for "many forms") and are the key to DNA typing.
Two major kinds of polymorphisms are most useful
to forensic scientists. The first consists of variations in the
length of the DNA at specific locations (loci) known as VNTRs (variable
number of tandem repeats). These VNTR regions consist of stretches
of DNA made up of short repeating DNA sequences. The number of times
the short DNA sequence is repeated determines the physical length
of the DNA molecule at these specific loci. Each of the many versions
that may be found in the population is known as an allele. These
variations are examined by the method known as the RFLP technique.
The second type of variation in the DNA is simply a difference in
the nucleotide letters found at a specific pair of bases. These
are examined by the method known as PCR.
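The relationship between repeat number and fragment length at a VNTR
locus can be written as simple arithmetic. The sketch below assumes
that a fragment consists of a fixed amount of flanking DNA plus the
repeated units; the repeat size, flanking length, and repeat counts
are invented numbers used only for illustration.

```python
def vntr_fragment_length(repeat_count, repeat_unit_bp, flanking_bp):
    """Length in base pairs of a fragment containing a VNTR region."""
    return flanking_bp + repeat_count * repeat_unit_bp

# Two hypothetical alleles at the same locus: same repeat unit and
# flanking DNA, but a different number of tandem repeats.
allele_a = vntr_fragment_length(repeat_count=100, repeat_unit_bp=30, flanking_bp=500)
allele_b = vntr_fragment_length(repeat_count=140, repeat_unit_bp=30, flanking_bp=500)
print(allele_a, allele_b)  # 3500 4700
```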
The goals are the same for both types of DNA tests:
to isolate a distinctive DNA sequence and record its presence in
a way that can be examined visually. The basic procedure of RFLP
analysis is known as "Southern blotting" after Edwin Southern,
a British biochemist who developed the technique in the
mid-1970s. The basic procedures of PCR-based testing were invented in
the early 1980s by Kary Mullis while working for a California biotechnology
company.
Deciding which method to use is determined by the
amount of DNA that is available and how deteriorated or degraded
it is. RFLP analysis is more often performed because it is more
discriminating. If there is only a small amount of material or it
is highly degraded, PCR analysis is used, because this method requires
less material and can produce a result on DNA of poor quality. As
little as two billionths of a gram (2 ng), the amount of DNA contained
in about 700 sperm cells, will suffice for PCR analysis. The RFLP
test requires 20 to 50 ng of DNA. PCR tests can be performed in
a matter of days as compared to weeks for RFLP tests.
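The choice described above amounts to a simple decision rule, sketched
here in Python. The thresholds are taken from the figures quoted in
the text (the lower end of the 20 to 50 ng range for RFLP, and 2 ng
for PCR); treating them as hard cutoffs, and the function name itself,
are assumptions made for illustration. A real laboratory would weigh
other factors as well.

```python
RFLP_MIN_NG = 20.0  # lower end of the 20 to 50 ng range quoted in the text
PCR_MIN_NG = 2.0    # "as little as" 2 ng, per the text

def choose_method(dna_ng, degraded):
    """Pick a test from the amount of DNA (in ng) and its condition."""
    if not degraded and dna_ng >= RFLP_MIN_NG:
        return "RFLP"          # enough intact DNA for the more discriminating test
    if dna_ng >= PCR_MIN_NG:
        return "PCR"           # small or degraded sample
    return "insufficient"      # below the amounts quoted for either test

print(choose_method(50, degraded=False))  # RFLP
print(choose_method(50, degraded=True))   # PCR
print(choose_method(5, degraded=False))   # PCR
```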
DNA Evaluation
Once the evidence has been documented and screened
in the laboratory, and deemed appropriate for DNA analysis, the
initial step is to isolate or purify the DNA. First, it must be
removed from whatever object it is attached to and removed from
the cell. Unless the major non-DNA constituents of the cell such
as proteins, fats and carbohydrates are removed, the enzymes essential
in the next step will not be able to do their work. This isolation
of DNA begins with the controlled destruction of cellular integrity,
releasing the DNA from its nuclear and chromosomal packaging. The
cell membranes are dissolved with a detergent and the proteins are digested
by enzymes.
The DNA then is purified by the methods of extraction
and precipitation. Once the DNA is isolated and concentrated, a
small sample is tested to determine quality and quantity. If intact
(high molecular weight) human DNA is present in sufficient amounts,
RFLP testing can proceed. If the DNA is degraded, or present in
minute amounts, PCR testing is used.
Restriction Fragment Length Polymorphism
(RFLP) Analysis
RFLP typing of purified DNA consists of the following
six steps:
- Cutting the DNA into pieces (Restriction Enzyme Digestion);
- Separating the DNA by size (Gel Electrophoresis);
- Transferring the DNA to a solid support surface (Southern Transfer);
- Targeting and visualizing the DNA of interest (Hybridization
and Autoradiography);
- Reading the DNA profile (Interpretation of Data); and
- Determining the rarity of the DNA profile (Population Genetics
and Frequency Estimates).
Restriction Enzyme Digestion
Restriction enzymes reproducibly cut DNA at specific
four- or six-base-pair sequences called restriction sites. Hae III
(Haemophilus aegyptius III), the most commonly used enzyme in forensic
science, cuts the DNA everywhere the bases are arranged in the sequence
GGCC. These sites are found throughout the human genome and are,
for the most part, the same in everyone. Hae III cuts human DNA
into approximately 12 million different restriction fragments ranging
in size from a few hundred to 10,000 or more base pairs in length.
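A toy version of the digestion step is sketched below. It cuts a DNA
string at every GGCC site, placing the cut between the second G and
the first C, which is where Hae III cleaves. The input sequence is
invented and far shorter than any real fragment, and the function
name is hypothetical.

```python
def digest(dna, site="GGCC", cut_offset=2):
    """Cut dna at every occurrence of site; return the list of fragments."""
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset          # cut between GG and CC
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)   # look for the next site
    fragments.append(dna[start:])       # remainder after the last cut
    return fragments

pieces = digest("AAGGCCTTTTGGCCA")      # hypothetical sequence
print(pieces)                           # ['AAGG', 'CCTTTTGG', 'CCA']
print([len(p) for p in pieces])         # [4, 8, 3]
```

Joining the fragments back together reproduces the original sequence,
which is a convenient check that nothing was lost in the cutting.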
Gel Electrophoresis
The physical length of DNA restriction fragments
in VNTRs is one fundamental biochemical characteristic that varies
from person to person. Electrophoresis is the technique by which
the different-sized fragments are separated. The DNA is loaded into
a hole, or well, at one end of a slab of semi-solid gel, a porous,
Jello-like substance. When an electrical field is applied to the
gel containing the DNA, the DNA moves toward the positive electrode
because it has a negative electrical charge.
The sieving action of the gel allows the small restriction
fragments to migrate at a faster rate than the larger ones, just
as a small rabbit would move faster and farther through a thicket
than a large hound in pursuit. The relative position of these fragments
within the gel after overnight current application is determined
by their length or molecular weight.
Gels have several lanes within them which contain
DNA, including marker lanes that measure how far fragments move
through the gel and a lane for control DNA that produces a known
pattern and can be used to verify that the test was properly conducted.
The DNA in the gel can be stained and seen under ultraviolet (black)
light.
Southern Transfer
Because the gel is fragile, it is necessary to remove
the DNA from the gel and permanently attach it to a solid support.
This is accomplished by the process of Southern blotting. The first
step is to denature the DNA in the gel which means that the double-stranded
restriction fragments are chemically separated into the single-stranded
form. The DNA then is transferred by the process of blotting to
a sheet of nylon. The nylon acts like an ink blotter and "blots"
up the separated DNA fragments. The restriction fragments, invisible
at this stage, are irreversibly attached to the positively-charged
nylon membrane called the "blot."
DNA Probes
Visual observation of an individual's DNA pattern
requires the use of DNA probes. The DNA probe, like a guided missile,
will seek out and find its target sequence. There are approximately
a dozen VNTR DNA probes in common usage in forensic RFLP testing.
These probes have been patented and are commercially available from
biotechnology suppliers. Each DNA probe can be used to develop the
DNA profile at a particular VNTR locus. Most of these are located
on different chromosomes, an important factor to be considered when
performing a statistical analysis. Most forensic tests use a combination
of at least four to six separate DNA probes in a sequential manner.
All of these DNA probes have been obtained by the process of cloning,
which simply means that they are free of all other human DNA. Therefore,
large quantities of probe DNA can be made and labeled with a radioactive
or other tracer in the laboratory.
Hybridization
To detect a VNTR locus immobilized on the Southern
blot, one uses a DNA probe that has a base pair sequence complementary
to the DNA sequence at the VNTR locus. The double-stranded nature
of DNA and a phenomenon called hybridization provide the scientific
basis for the usefulness of DNA probes. Double-stranded DNA fragments
will separate when heated. When the DNA cools down, the two strands
will reconnect and become double-stranded again. This is not a random
process. Because of the complementarity of the two DNA strands,
the single strands will only reconnect with another strand that
has a complementary sequence.
Just prior to its use, the DNA probe is labeled with
a radioactive or other tracer and boiled to convert it to the single-stranded
form. The DNA on the blot, also single-stranded, is soaked or incubated
in a solution containing the DNA probe. The probe hybridizes, or
binds, only to the DNA fragments that bear the complementary sequences
of DNA bases. These will be the restriction fragments corresponding
to the locus from which the probe was originally cloned.
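The selectivity of a probe can be illustrated with a short sketch. It
treats hybridization as an exact text search for the probe's
complement, read in the opposite direction because paired strands run
antiparallel. Real hybridization tolerates some mismatch depending on
conditions, so this is a simplification; the probe and fragment
sequences are invented.

```python
PAIRS = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement each base, then reverse to account for strand direction."""
    return seq.upper().translate(PAIRS)[::-1]

def hybridizes(probe, fragment):
    """True if the single-stranded fragment contains the probe's binding site."""
    return reverse_complement(probe) in fragment.upper()

probe = "GCCTAA"                                        # hypothetical probe
fragments = ["CCTTAGGCATGG", "CCAAAAAAGG", "CCGCCTAAGG"]  # hypothetical blot
print([hybridizes(probe, f) for f in fragments])        # [True, False, False]
```

Note that the third fragment contains the probe sequence itself rather
than its complement, so the probe does not bind to it.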
Autoradiography
The excess probe is washed off and the blot placed
in contact with a sheet of film. The film, exposed by the radioactive
tracer instead of light, is developed and becomes an autoradiograph,
commonly known as an autorad. The autorad is the final product of
the RFLP analysis. It reveals the overall quality of the testing
and can be copied, distributed and interpreted by other DNA experts.
The autorad has darkened areas known as bands corresponding
to the position of the DNA probes and hence the restriction fragments
bound to the membrane. Typically, several probes are used sequentially
in order to compile forensically significant DNA profiles. This
requires that the blot be stripped of the first radioactive probe,
hybridized to a second, washed and exposed to a new film to make
another autorad. Each round of hybridization and autoradiography
may take up to ten days. This is why RFLP testing takes longer than
conventional blood or PCR testing.
Interpretation of Data
Up to this point in DNA analysis, there is little
argument about its validity (provided that it is done correctly)
because we are dealing with physical reality. The interpretation
of the data, that is, its meaning, is another story. This requires
that inferences based on the science of statistics, population genetics
and probability theory be applied to the measurements of physical
reality that the autorads reveal.
Forensic scientists use these mathematical concepts
to calculate and report an estimate of how frequently the genetic
profile they have observed might be found in major population groups.
If the genetic profiles found in two different samples, say one
from a piece of evidence and one from a suspect, are indistinguishable,
they are said to match. A typical population frequency for conventional
blood typing might be 1 in 200; for DNA, 1 in 5,000,000. This means
that only about 1 in 5,000,000 people would be expected to have the same DNA profile.
All others would be excluded from being the source of the matching
evidence.
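One common way such small frequencies arise is by multiplying the
frequencies observed at several loci, which is valid only if the loci
are statistically independent. The sketch below assumes that
independence, and the per-probe frequencies are invented numbers
chosen so that the product equals the 1 in 5,000,000 figure used
above. It also shows how many matching people each frequency would
lead one to expect in a hypothetical population of one million.

```python
from fractions import Fraction

# Hypothetical frequencies of the observed pattern at each of four probes.
locus_frequencies = {
    "probe 1": Fraction(1, 50),
    "probe 2": Fraction(1, 20),
    "probe 3": Fraction(1, 100),
    "probe 4": Fraction(1, 50),
}

def combined_frequency(frequencies):
    """Multiply per-locus frequencies (assumes the loci are independent)."""
    combined = Fraction(1)
    for freq in frequencies:
        combined *= freq
    return combined

def expected_matches(frequency, population):
    """Expected number of people in the population sharing the profile."""
    return float(frequency * population)

profile = combined_frequency(locus_frequencies.values())
print(profile)                                         # 1/5000000
print(expected_matches(Fraction(1, 200), 1_000_000))   # 5000.0
print(expected_matches(profile, 1_000_000))            # 0.2
```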
Irrespective of what calculation method is
used, it is a physical fact that the genetic profiles match and
that they would be found at some frequency in the population. In
attempting to call the frequency estimates into question, attorneys
are fond of pointing out that they are a comparison between the
observed profile and randomly chosen individuals, and that a relative
of the person with the profile is much more likely to share that
profile than any of the random individuals.
When forensic scientists provide a report and testimony
about the frequency estimates, their job is done. The judgment of
guilt or innocence reached by the court may take these estimates
into account, but they must be placed into the circumstances surrounding
the crime and the intent, motive, means and opportunity available
to the defendant.
COMMON FALLACIES
The meaning of the genetic profile frequencies
is often misconstrued by attorneys. For example, the argument might
be made that if a matching genetic profile is found in
1 in 200 individuals, then there is a 1 in 200 chance that the two samples came
from different sources. Not true. The chance that they came
from different sources cannot be determined by the genetic evidence
alone. It depends on all of the circumstances surrounding the case.
Another fallacy is the one heard in the Simpson case
preliminary hearing. O.J. Simpson's blood type matched blood found
on the sidewalk trailing away from the murder scene. Defense attorneys
pointed out that 80,000 people in Los Angeles share that blood type.
True, but those 80,000 people did not all visit 875 South Bundy,
where the homicides occurred.
The prosecutor's fallacy is to argue that because the
genetic profile is found in 1 in 1,000 people, there is only
a 1 in 1,000 chance that the defendant is innocent. Once again this
ignores the other facts of the case and asks the scientific evidence
to determine guilt or innocence. Science cannot do that. Guilt or
innocence must be determined by the judge or jury.
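A small calculation shows why the profile frequency alone does not
give the chance that the defendant is the source. The model below is
deliberately simplified and is an assumption, not a method described
in this chapter: the true source is certain to match, every other
person who could have left the evidence matches only by chance, and,
before the DNA result, all of them are considered equally likely to
be the source. The number of other candidates stands in for "the
other facts of the case."

```python
from fractions import Fraction

def prob_suspect_is_source(profile_freq, other_candidates):
    """Chance that a matching suspect is the source under the simple model above."""
    return 1 / (1 + other_candidates * profile_freq)

freq = Fraction(1, 1000)

# Same genetic evidence, different circumstances:
print(prob_suspect_is_source(freq, 10_000))  # 1/11
print(prob_suspect_is_source(freq, 10))      # 100/101
print(prob_suspect_is_source(freq, 0))       # 1
```

With 10,000 other possible sources, a 1 in 1,000 profile leaves the
suspect at about 9 percent under this model, nowhere near the 99.9
percent the prosecutor's fallacy would claim. With only ten others,
the same match is far more telling.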
EVALUATING THE ARGUMENTS
Critics of DNA testing have generated and seized
upon disagreements about the best way to interpret the data as a
reason not to admit DNA evidence. Forensic scientists themselves
have given the critics ammunition by not always following widely
used scientific conventions for rounding off and reporting the significance
of the numbers that are reported.
In evaluating the criticisms that are made, primarily
of the validity of the population data on which probability estimates
are made, it is important to remember that the critics:
- If they are geneticists at all,
study non-human organisms or came to the study of human genetics
from other fields. None are forensic scientists.
- Use the same assumptions in
their own work that they argue against in forensic science.
- Base their critiques on overdrawn
hypothetical and theoretical arguments for which no data exists.
- Ignore the safeguards and quality
programs that are routinely used by forensic DNA labs.
Much of the population genetics
controversy has been generated by an elementary misunderstanding
of the basics of RFLP testing in the early writing of Eric Lander.
Another source of misunderstanding was an article published in 1972
by Richard Lewontin and since repudiated by the author. Long after
Lewontin changed his mind, the article continued to be used to support
defense motions to quash DNA evidence.
What this reveals is that one
defense strategy is the shotgun approach: throw up a barrage of
flak and hope that the judge or jury accepts the validity of at
least one of the arguments. The critics also change their arguments
over time. As various lines of attack are countered or exposed as
spurious, new objections are raised. Frequency estimates are fertile
territory for the attackers because most people are not conversant
with the methods of statistics and the meaning of probability.
Reading the Autorad
Before an opinion can be formed
as to the forensic significance of a set of DNA profiles, a series
of data interpretation steps must be completed.
- First, the examiner inspects
the case specific data and records to determine if procedures
have been followed and interpretable results have been produced.
- Second, a visual inspection
of the autorad is made and an opinion of the testing results is
formed. Is the suspect included or excluded from the group of
individuals who could be the source of the evidence?
- Third, computerized electronic
measurements and statistical calculations are made that must confirm
the examiner's visual "match" calls.
- Finally, probability calculations
using population genetic data are used to calculate a conservative
estimate of the occurrence of the DNA profile in major population
groups.
In forensic science, statistics
are used to ensure that the data is interpreted in a conservative
fashion, so that the reported figure can be guaranteed to be an overstatement of the
true occurrence of a DNA profile in the population. For example,
if we were to test everybody in the entire world then we would know
exactly how often a given genetic profile occurs. Let's say that
it is 1 in a million. If we report that profile out at 1 in 500,000,
that is acceptable because it understates how rare the profile is. The data has
been systematically skewed in favor of the defendant.
Visual Interpretation
The purpose of the visual inspection
is to form an opinion as to which of the DNA samples in the various
lanes in the autorad could have come from the same individual source
and which could not. Included in the autorad are bands from any
DNA that was tested as well as control specimens from known individuals.
In order to be valid, the opinion formed at this stage must be confirmed
by the following quantitative tests.
DNA Fragment Size Estimation
Central to the analysis is the
series of "sizing ladders" containing up to 30 closely spaced bands.
Each of the bands in the sizing ladders corresponds to a fragment
of DNA of exactly known size. These are the rulers which computerized
video devices use to measure the DNA fragments in each of the sample
lanes. The estimated sizes of the fragments being measured are reported
in base pairs.
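How a ladder serves as a ruler can be sketched as follows. The example
assumes that, between two neighboring ladder bands, the logarithm of
fragment size falls off in a straight line with the distance
migrated. That is a common textbook approximation and not necessarily
the method used by the computerized devices mentioned above; the
ladder values are invented.

```python
import math

def estimate_size(distance_mm, ladder):
    """Estimate fragment size in base pairs from its migration distance.

    ladder is a list of (distance_mm, size_bp) pairs with distinct distances.
    """
    points = sorted(ladder)  # shortest distance (largest fragment) first
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            fraction = (distance_mm - d1) / (d2 - d1)
            log_size = math.log(s1) + fraction * (math.log(s2) - math.log(s1))
            return math.exp(log_size)
    raise ValueError("band lies outside the sizing ladder")

# Hypothetical three-band ladder: (distance migrated in mm, size in bp).
ladder = [(10, 8000), (20, 4000), (30, 2000)]
print(round(estimate_size(15, ladder)))  # 5657
print(round(estimate_size(25, ladder)))  # 2828
```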
At least one sample of human
DNA, known as K562, is run on virtually every membrane used in forensic
casework. The K562 DNA is available as a standard reference material
from the National Institute of Standards and Technology. Sizing
estimates of the K562 must be within established tolerances before
sizing estimates from that membrane are acceptable.
Band Comparisons by "Match Criteria"
Comparisons are made between the
DNA profiles of the known samples (generally blood samples taken
from the suspect and/or victim) and DNA profiles of the evidence
samples (for example, bloodstains from the crime scene or dried
vaginal swabs). There are three possible results:
- There is a match between
two specimens being compared. If the match is
between a suspect and the DNA
in the evidence, then that suspect is included in the group of
individuals who could be the source of that evidence;
- There is not a match between
the samples. The suspect could not be the source of the evidence.
They are excluded from the group of individuals who could have
contributed the specimen; or
- The data is inconclusive, meaning
that it is not possible to make a determination. This can be
caused by a variety of factors, and usually occurs when the DNA
is old and heavily contaminated.
Interpretation of RFLP data has
required the adoption of statistical techniques new to forensic
serology because of the numeric nature of the data. In ABO and other
typing systems, the data is a non-numeric listing of discrete types.
In the RFLP technique, the sizing estimates form a continuous distribution
because restriction fragments differing by one VNTR repeat unit
cannot be distinguished. Therefore, it becomes necessary to perform
statistical calculations to determine if two bands on an autorad
can be distinguished from one another.
For forensic purposes, two DNA
profiles are said to match if they are statistically indistinguishable
and are therefore consistent with having been produced by DNA from
the same individual. First a quantitative match range for each appropriate
VNTR band is calculated. Next the match ranges of corresponding
bands are compared lane to lane. If these ranges overlap then the
bands are indistinguishable from one another and are said to match.
The process continues until all bands at a locus, and all bands
at all the loci examined have been compared. In order for two composite
DNA profiles to be declared a match and therefore included in the
group of DNA profiles that could have come from an individual source,
all bands must match.
Most laboratories use a match
range, or "window", on the order of ± 2.5% of the measured size
or molecular weight of the band. This match window has been determined
by comparing the DNA profiles obtained from vaginal swabs and fresh
blood from rape victims. When working with fresh blood samples,
such as in a parentage laboratory, the measurement error is less
than 1% of the measured size of the bands.
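The match test described above can be sketched in code. Each band gets
a range of plus or minus 2.5 percent of its measured size, two bands
match if their ranges overlap, and two profiles match only if every
band matches. Pairing bands by sorted size and requiring the same
number of bands at each locus are simplifying assumptions, and the
band sizes are invented for illustration.

```python
WINDOW = 0.025  # plus or minus 2.5 percent, per the text

def match_range(size_bp, window=WINDOW):
    """Lower and upper bounds of the match range for one band."""
    return size_bp * (1 - window), size_bp * (1 + window)

def bands_match(a, b, window=WINDOW):
    """Two bands match if their match ranges overlap."""
    low_a, high_a = match_range(a, window)
    low_b, high_b = match_range(b, window)
    return low_a <= high_b and low_b <= high_a

def profiles_match(evidence, known, window=WINDOW):
    """Profiles map a locus name to a list of band sizes; all bands must match."""
    if evidence.keys() != known.keys():
        return False
    for locus in evidence:
        ev, kn = sorted(evidence[locus]), sorted(known[locus])
        if len(ev) != len(kn):
            return False
        if not all(bands_match(e, k, window) for e, k in zip(ev, kn)):
            return False
    return True

evidence = {"D4S139": [2000, 5000], "D2S44": [3100]}   # hypothetical sizes
suspect = {"D4S139": [5050, 2030], "D2S44": [3150]}
print(bands_match(2000, 2080))            # True
print(bands_match(2000, 2200))            # False
print(profiles_match(evidence, suspect))  # True
```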
Bandshifting
The relatively large size of the
forensic match window compensates for the phenomenon known as "bandshifting".
When DNA fragments are separated by electrophoresis, each sample
is loaded into its own lane on the gel. Sometime