Huelsenbeck - 1997 - Maximum Likelihood
Huelsenbeck - 1997 - Maximum Likelihood
Huelsenbeck - 1997 - Maximum Likelihood
Keith A. Crandall
Department of Zoology and M. L. Bean Museum, Brigham Young University, Provo,
UT 84602
KEY WORDS: maximum likelihood, phylogeny, hypothesis test, evolutionary model, molecular
evolution
ABSTRACT
One of the strengths of the maximum likelihood method of phylogenetic estima-
tion is the ease with which hypotheses can be formulated and tested. Maximum
likelihood analysis of DNA and amino acid sequence data has been made practi-
cal with recent advances in models of DNA substitution, computer programs, and
computational speed. Here, we describe the maximum likelihood method and the
recent improvements in models of substitution. We also describe how likelihood
ratio tests of a variety of biological hypotheses can be formulated and tested using
computer simulation to generate the null distribution of the likelihood ratio test
statistic.
INTRODUCTION
Only recently has phylogenetics been recognized as a field that has basic rele-
vance to many questions in biology. Phylogenies have proven to be important
tools of research in fields such as human epidemiology (42, 86), ecology (7),
and evolutionary biology (43). In fact, for any question in which history may
be a confounding factor, phylogenies have a central role (25). To the outsider
interested in using a phylogeny, one of the most frustrating aspects of the field
437
0066-4162/97/1120-0437$08.00
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
Figure 1 The likelihood surface for one possible realization of the coin tossing experiment. Here,
six heads and four tails are observed. The likelihood appears to be maximized when p = 0.6.
are sites in the sequence, with the elements of each vector denoting the nu-
cleotide state for each taxon for site i.
Note that two of the sites exhibit the same site pattern (x1 and x4). There
are a total of r = 4s site patterns possible for s species. The number of sites
exhibiting different site patterns can also be considered as the data in a phylo-
genetic analysis. For example, the above data matrix can also be described as
Number 2 0 0 0 0 0 0 0 0 . . . 1 . . . 1 . . . 1. . . 1. . . 0 0 0
where the matrix is now a 4 × 256 matrix of all r = 44 = 256 site patterns possi-
ble for four species. The site patterns are labeled 1, 2, . . . , r. Most of the possible
site patterns for our sample data set are not observed. However, 5 site patterns
are observed (now labeled y1 = {A, A, A, A}0 , y65 = {C, A, A, A}0 , y86 =
{C, C, C, C}0 , y94 = {C, C, T, C}0 , and y176 = {G, G, T, T}0 ). The numbers of
sites exhibiting each site pattern are contained in a vector n (n1 = 2, n65 = 1,
n86 = 1, n94 = 1, n176 = 1, with all other ni = 0 for the example data matrix).
Maximum likelihood assumes an explicit model for the data. Just as with the
coin tossing experiment, the data are considered as random variables. However,
instead of two possible outcomes, there are r = 4s possible outcomes for DNA
sequences. Hence, the data can be described using a multinomial distribution.
The multinomial distribution is a generalization of the binomial distribution
and has the following form:
à ! r
n Y n
Pr[n 1 , n 2 , . . . , n r | p1 , p2 , . . . , pr ] = p i, 5.
n 1 , n 2 , . . . , n r i=1 i
¡ ¢
where n 1 ,n 2n,...,nr is the number of ways that n objects can be grouped into
r classes, ni is the number of observations of the ith site pattern, and pi is
the probability that site pattern i occurs. A maximum likelihood estimate of
pi is p̂i = ni/n (that is, the probability of the ith class is the proportion of
the time it was observed). The likelihood, then, can be calculated assuming a
multinomial distribution by setting the likelihood equal to Equation 5. However,
by using a multinomial distribution, one cannot estimate topology or other
biologically interesting parameters. Hence, models that incorporate phylogeny
are assumed. The difference in the log likelihood of the data under multinomial
and phylogenetic models, however, represents the cost associated with assuming
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
Figure 2 The likelihood method assumes that observed sequences (here, the nucleotides for site
pattern 176 from the example in the text; y176 = {G, G, T, T}0 ) are related by a phylogenetic tree
(τ ) with branch lengths (v 1, . . . , v 6) specified in terms of expected number of substitutions per site.
The probability of observing the data (y176) is a sum over the possible assignments of nucleotides
to the internal nodes, i, j, and k.
a phylogenetic tree and substitution model (32, 80, 93) and has also been used
to show that maximum likelihood in phylogenetics is consistent (117).
Just like the multinomial probability model, phylogenetic models specify the
probability of observing different site patterns. At a minimum, a phylogenetic
model for molecular data includes a tree (τ ) relating the sequences with branch
lengths of the tree (v) specified in terms of expected number of changes per
site and a model of sequence change. Consider just one of the nucleotide
site patterns, above (y176), for the tree of Figure 2. Because the identities of
the nucleotides at the internal nodes i, j, and k are unknown, the probability
of observing site pattern 176 is a sum of 64 terms (the 43 = 64 possible
assignments of nucleotides to nodes i, j, and k),
Pr[y176 = {G, G, T, T} | τ, v1 , . . . , v6 , 2]
X
4 X
4 X
4
= πk pGi (v1 , 2) pGi (v2 , 2) pT j (v3 , 2) 6.
i=1 j=1 k=1
much more quickly by taking advantage of the tree structure when performing
the summations over nucleotides at interior nodes. If instead of DNA sequences,
amino acid sequences are used, the above equation involves a summation over
203 = 8000 possible assignments of amino acids to the internal nodes, and
px y (v i, 2) represents the probability of observing amino acids x and y at the
tips of a branch given the branch length and other parameters of the substitution
model.
Assuming independence among sites, the likelihood of a tree (τ ) is
à ! r
n Y
L(τ, v, 2 | y1 , . . . , yr ) = Pr[yi | τ, v, 2]ni , 7.
n 1 , n 2 , . . . , n r i=1
The method of Felsenstein (24) is to choose the tree that maximizes the likeli-
hood as the best estimate of phylogeny. This application of likelihood is unusual
because the likelihood function changes depending on the tree (81, 124). In prin-
(2s−5)!
ciple, to find the maximum likelihood tree, one must visit each of the 2s−3 (s−3)!
possible unrooted bifurcating trees in turn (23). For each tree, one finds the
combination of branch lengths and other parameters that maximizes the likeli-
hood of the tree (that maximizes the likelihood function, above). The maximum
likelihood estimate of phylogeny is the tree with the greatest likelihood.
This procedure of visiting all possible trees and calculating the likelihood
for each is computationally expensive. Fortunately, there are many short cuts
that can substantially speed up the procedure. As mentioned above, Felsenstein
described an efficient method to calculate the likelihood by taking advantage of
the tree topology when summing over all possible assignments of nucleotides to
internal nodes (24). There are also efficient ways of optimizing branch lengths
that involve taking the first and second derivatives of the likelihood function
with respect to the branch of interest (see 67). Finally, rather than visiting each
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
possible tree, search algorithms concentrate on only those trees that have a good
chance of maximizing the likelihood function (see 105).
Figure 3 The likelihood method assumes that substitutions follow a Poisson process, with the rate
of nucleotide substitution specified by a rate matrix, Q. The rates of change from one nucleotide
to another are specified by the parameters r1, . . . , r12 (A). The rate matrix, Q, corresponding
to the diagram (A) is shown in B. Different models of substitution are just special cases of the
general matrix shown here. For example, to obtain the Jukes-Cantor (JC69; 58) model, the rates
of change among all nucleotides are equal (r1 = r2 = · · · = r12) and base frequencies are equal
(πA = πC = πG = π T).
analyses and is assumed when calculating the likelihood of a tree (the joint prob-
ability of observing all site patterns is the product of the individual site patterns).
However, biologists know that the substitution processes at different sites in a
sequence often are not independent. For example, hydrogen-bonded sites in
the stem regions of ribosomal genes are not independent because a substitution
in one nucleotide changes the probability that a compensatory substitution will
occur in its partner.
Several authors have examined the effect of non-independent substitution
among pair-bonded stem nucleotides. For example, Dixon & Hillis (18) ex-
amined the appropriate weights that stem sites should receive to correctly ac-
commodate non-independence in a parsimony analysis. Others have devised
time-homogeneous Poisson process models to describe substitutions in stem
regions. Instead of a 4 × 4 matrix, these models assume a 16 × 16 rate matrix
(Q) of all possible nucleotide doublets possible in a stem (76, 95, 98, 99). The
instantaneous rate of change is set to zero if more than one substitution is re-
quired to change from doublet i to doublet j (e.g. qi j = 0 for AC → CG). The
other rate parameters of the Q matrix are specified in different ways depending
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
Table 1 Parameter settings for a variety of evolutionary models employed in maximum likelihood
analyses. Parameters of the substitution matrix, Q, are shown in Figure 3.
JC69 πA = πC = πG = πT r1 = r2 = r3 = r4 = r5 = r6 58
= r7 = r8 = r9 = r10 = r11 = r12
K80 πA = πC = πG = πT r3 = r4 = r9 = r10; r1 = r2 = r5 60
= r6 = r7 = r8 = r11 = r12
K3ST πA = πC = πG = πT r3 = r4 = r9 = r10; r5 = r6 = r7 61
= r8; r1 = r2 = r11 = r12
F81 πA; πC; πG; πT r1 = r2 = r3 = r4 = r5 = r6 24
= r7 = r8 = r9 = r10 = r11 = r12
HKY85 πA; πC; πG; πT r3 = r4 = r9 = r10; r1 = r2 = r5 45
= r6 = r7 = r8 = r11 = r12
TrN πA; πC; πG; πT r3 = r4; r9 = r10; r1 = r2 = r5 107
= r6 = r7 = r8 = r11 = r12
SYM πA = πC = πG = πT r1 = r2; r3 = r4; r5 = r6; r7 = r8; r9 126
= r10; r11 = r12
GTR πA; πC; πG; πT r1 = r2; r3 = r4; r5 = r6; r7 = r8; r9 64
= r10 ; r11 = r12
on the assumptions the biologist is willing to make. For example, Schöniger &
von Haeseler (98) considered the simplest case by setting qi j = π j if doublets
i and j differ at one nucleotide (e.g., qAC,AG = πAG).
Other models consider the substitution process at the level of the codon. Fun-
damental to interpreting changes in substitution rates is an accurate assessment
of how these changes influence the resulting protein; that is, does a substitution
produce a change in the amino acid (a nonsynonymous change), or does the sub-
stitution not alter the protein (a synonymous change) (100). An increase in the
relative amount of nonsynonymous change can be strong evidence for adaptive
evolution (72). Many methods exist for estimating synonymous and nonsynony-
mous substitution rates (68–70, 75, 82, 87, 89). Recently, maximum likelihood
has been used to estimate synonymous and nonsynonymous rates of change
(33, 77, 78). These authors modeled the substitution process at the level of the
codon. They used a 61 × 61 matrix to describe the instantaneous rate of change
from one codon to another (the three stop codons are not considered). For the
model of Muse & Gaut (78) the instantaneous rate of change from codon i to j
is zero (qi j = 0) if the change requires more than one substitution, qi j = βπni j
if the substitution causes a change in the amino acid, and qi j = απni j if the
substitution is synonymous (where nij is the equilibrium frequency of the sub-
stituted nucleotide). The model of Goldman & Yang (33) is similar except that
they allow a transition/transversion rate bias and consider the physicochemical
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
properties of the 20 amino acids by using Grantham’s distances (34). The pa-
rameters of both models can be estimated using maximum likelihood. Unlike
many of the standard methods, these approaches do not rely on the assumption
that the number and location of silent/replacement sites do not change over
time. The codon-based approach has been used successfully to explain the rate
heterogeneity found in the chloroplast genome as being due primarily to differ-
ences in nonsynonymous substitution rates (78). As an alternative to modeling
the substitution process at the level of the codon, the substitution process has
also been modeled at the amino acid level (1, 9, 110).
The assumption of equal rates at different sites can also be relaxed. Several
models of among-site rate heterogeneity have been developed (e.g., assume
that the sites are log normally distributed—39, 85; assume an invariant rate
class—12, 45, 91; estimate the rates in different data partitions separately, see
105; or use a combination of different rate distributions—38, 114). Probably
the most widely used model is one that assumes that rates are gamma distributed
(57, 116). The gamma distribution is a continuous probability density function
that has wide application in probability and statistics (the well-known χ 2 and
exponential distributions are special cases of the gamma distribution). System-
atists have co-opted this distribution for their own use because the shape of the
gamma distribution changes dramatically depending on the value of the shape
parameter, α, and the scale parameter, λ. Systematists set α = 1/λ = a so
that the mean of the gamma distribution is 1.0 and the variance is 1/a. Rates at
different sites, then, are thought of as random variables drawn from a gamma
distribution with shape parameter a. When a is equal to infinity, the gamma
model of among-site rate heterogeneity collapses to the equal rates case. How-
ever, most empirical estimates of the shape parameter a fall in the range of 0.1
to 0.5 (121), indicating substantial rate variation among sites. Yang (116, 119)
provides details on how to calculate the likelihood under a gamma model of
rate heterogeneity.
An advantage of the likelihood approach is that the models can be made com-
plicated to incorporate other biologically important processes. For example,
the models of substitution can be modified to account for insertion and deletion
events (5, 111, 112), secondary structure of proteins (76a, 110), and correlated
rates among sites (11, 27, 120). In the course of estimating phylogeny, the max-
imum likelihood method provides estimates of model parameters that may be
of interest to the biologist. If the biologist is only interested in phylogeny, then
these additional parameters are considered nuisance parameters (i.e., parame-
ters not of direct interest to the biologist but which must be accommodated in the
analysis by either integrated likelihood or maximum relative likelihood meth-
ods; see 31). However, maximum likelihood estimates of parameters such as
the variance in the rate of substitution among sites or the bias in the substitution
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
(19, 59, 71, 101). Here, the maximum likelihood calculated under the null hy-
pothesis (H0) is in the numerator, and the maximum likelihood calculated under
the alternative hypothesis (H1) is in the denominator. When 3 is less than one,
H0 is discredited and when 3 is greater than one, H1 is discredited. 3 greater
than one is only possible for non-nested models. When nested models are con-
sidered (i.e., the null hypothesis is a subset or special case of the alternative
hypothesis), 3 < 1 and −2 log 3 is asymptotically χ 2 distributed under the null
hypothesis with q degrees of freedom, where q is the difference in the number
of free parameters between the general and restricted hypotheses.
An alternative means of generating the null distribution of −2 log 3 is through
Monte Carlo simulation (parametric bootstrapping; 13, 14). Felsenstein (26)
first suggested the use of the parametric bootstrap procedure in phylogenetics.
Goldman (32) was among the first to apply the method in phylogenetics and
to demonstrate that the usual χ 2 approximation of the null distribution is not
appropriate for some tests involving phylogeny. In parametric bootstrapping,
replicate data sets are generated using simulation under the assumption that the
null hypothesis is correct. The maximum likelihood estimates of the model
parameters under the null hypothesis are used to parameterize the simulations.
For the phylogeny problem, these parameters would include the tree topology,
branch lengths, and substitution parameters (e.g., transition:transversion rate
ratio or the shape parameter of the gamma distribution). For each simulated
data set, −2 log 3 is calculated anew by maximizing the likelihood under the
null and alternative hypotheses. The proportion of the time that the observed
value of −2 log 3 exceeds the values observed in the simulations represents
the significance level of the test. Typically, the rejection level is set to 5%; if
the observed value for the likelihood ratio test statistic is exceeded in less than
5% of the simulations, then the null hypothesis is rejected. Although there are
good statistical reasons for test statistics based on the probability density of the
data, the parametric bootstrap procedure may also be used to determine the null
distribution of other test statistics (47).
Maximum likelihood allows the easy formulation and testing of phyloge-
netic hypotheses through the use of likelihood ratio tests (though also see 96).
Furthermore, likelihood ratio tests are known to have desirable statistical prop-
erties. For example, they are known to be uniformly most powerful when simple
hypotheses are considered and often outperform other hypothesis tests for com-
posite hypotheses (92). Over the past two decades, numerous likelihood ratio
tests have been suggested. These include tests of the null hypotheses that (a) a
model of DNA substitution adequately explains the data (32, 80, 93), (b) rates
of nucleotide substitution are biased (32, 80, 93), (c) rates of substitution are
constant among lineages (24, 65, 79, 115), (d ) rates are equal among sites (123),
(e) rates of substitution are the same in different data partitions (30, 66, 122),
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
( f ) substitution parameters are the same among data partitions (122), (g) the
same topology underlies different data partitions (51), (h) a prespecified group
is monophyletic (53), (i) hosts and associated parasites have corresponding phy-
logenies (56), ( j) hosts and parasites have identical speciation times (56), and
(k) rates of synonymous and nonsynonymous substitution are the same (77).
Here, we describe a few of these tests. Our goal is not to provide an exhaustive
list and description of all the likelihood ratio tests that have been proposed in
phylogenetics but rather to illustrate how biological questions can be addressed
in a simple way using likelihood.
Figure 4 The hierarchy of hypotheses examined for the albumin data from five vertebrates. The
parameters of the models are explained in Table 1. At each level, the null hypothesis is either
accepted, “A,” or rejected, “R.”
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
(b) transitions and transversions occur at the same rate, (c) the rate of substi-
tution is equal among sites, and (d ) rates among lineages are constant through
time (i.e., the molecular clock holds).
Figure 5 shows the trees and log likelihoods of the albumin data under several
different models of DNA substitution. For the first null hypothesis considered—
that base frequencies are equal—the likelihood under the JC69 model is com-
pared to the likelihood under the alternative hypothesis calculated assuming the
F81 model. These two models are identical except that the F81 model allows
for unequal base frequencies and the JC69 model assumes equal base frequen-
cies. Furthermore, the models are nested because the JC69 model is simply
a special case of the F81 model. Therefore, the likelihood ratio test statistic,
−2 log 3, can be compared to a χ 2 distribution with 3 degrees of freedom. For
the test of equal base frequencies, −2 log 3 = 17.56, a value much greater
than 7.82, which represents the 95% critical value from a χ 2 distribution with
3 degrees of freedom. Therefore, the null hypothesis that base frequencies are
equal is rejected, and the F81 model is preferred to the JC69 model of DNA
substitution. The other three null hypotheses considered can also be tested
using likelihood ratio tests. Table 2 shows the results of these tests. For the
albumin data set, the most parameter-rich model considered (HKY85+0) is the
best fitting model (i.e., provides a statistically significant increase in likelihood
over the other models considered). Although the HKY85+0 model was found
to best fit the data, a general test of model adequacy indicates that the model is
inadequate to explain the data (H0: HKY85+0, H1: unconstrained model, H0
rejected at P < 0.01; 32). As expected, our models of DNA substitution do not
fully describe the process of evolution leading to the observed sequences.
Although this conclusion sounds dire, it should be taken with a grain of salt
because we know a priori that our models are inadequate to explain all the fea-
tures of the evolutionary process. However, although the model is in some sense
false, this does not detract from the utility of the model for estimating parame-
ters such as topology and branch lengths, especially given the observation that
phylogenetic methods in general, and maximum likelihood in particular, can be
robust to violation of assumptions (50). Note that although a hierarchy of hy-
potheses was considered in this example, an alternative means of specifying the
tests would be to treat the most general model as the alternative against which
the other models are compared. Also note that the same tree was estimated for
each of these models and that this tree is the one generally acknowledged as the
best based on other sources of evidence (e.g., fossil and morphological data; 4).
This is true even though the assumptions of all of the models are violated to
some degree and some of the models considered (e.g., the JC69 model) poorly
describe the data. Hence, the contention that methods using wrong models are
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
Figure 5 The phylogenies estimated for the albumin data under five different models of DNA
substitution. The same phylogeny is estimated in each case, and this phylogeny is consistent with
the traditional phylogeny of vertebrates [(Fish,(Frog,(Bird,(Rat,Human))))]. Note that the estimate
of the transition:transversion rate ratio (κ) changes depending on whether or not among-site rate
variation is accounted for; κ is underestimated when rate variation is not accounted for. Models
that assume gamma distributed rates are denoted with a “+0.”
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
Table 2 The results of likelihood ratio tests performed on the albumin DNA data from five
vertebrates
Tests of Topology
IS A PRESPECIFIED GROUP MONOPHYLETIC? A taxonomic group is monophy-
letic if all its members share a most recent common ancestor. Many phylogenetic
studies are aimed at determining the monophyly, or lack thereof, of some group
of organisms. The most controversial of these studies have questioned the mono-
phyly of groups, such as the rodents, bats, and toothed whales (35, 36, 73, 90),
long defined as having a common evolutionary history based on morphologi-
cal similarities. Often, the monophyly of a group has important evolutionary
implications, particularly with respect to selection and adaptation. Pettigrew,
for example, argued on the basis of neurological characters that megachiropter-
ans (flying foxes) are more closely related to primates than they are to mi-
crochiropterans (90). This hypothesis of relationship implies that bats are not
a monophyletic group and that either flight evolved twice (independently) in
mammals or that flight evolved once in mammals but was subsequently lost in
the primates.
How can the Pettigrew hypothesis that bats do not form a monophyletic group
be tested? An analysis of the interphotoreceptor retinoid binding protein (IRBP)
gene, that has been sequenced in primates, bats, and other mammals, provides
an example of a likelihood ratio test that can be generalized to other questions
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
of relationship (53, 103, 113). The maximum likelihood tree for the IRBP gene
is shown in Figure 6A (113). The tree is consistent with the monophyly of bats.
The best tree based on the assumption that bats are not monophyletic has a
log likelihood 40.16 less than the best tree (Figure 6B). There are two ways to
explain this result: (a) the Pettigrew hypothesis that bats are not monophyletic is
correct, but estimation error has resulted in a tree consistent with bat monophyly;
or (b) bats are a monophyletic group.
The ratio of the likelihoods calculated under the null model (bats constrained
to be nonmonophyletic) and under the alternative hypothesis (no constraints
placed on relationships) provides a measure of the relative support of the two
hypotheses. In this case, the ratio of the likelihoods is −log 3 = 40.16 (113).
How damaging is this likelihood ratio to the Pettigrew hypothesis? Figure 6C
shows the distribution of −log 3 that would be expected if the null hypothesis
that bats are not monophyletic is true. The observed value of −log 3 is much
greater than would be expected under the null hypothesis. Hence, the Pettigrew
hypothesis can be rejected for the IRBP gene with a significance level of less
than 1% (113).
What are the statistical properties of the likelihood ratio test of monophyly?
Simulation study suggests that the method can be powerful (i.e., frequently
rejects the null hypothesis when, in fact, the null hypothesis is false). Figure 7
shows the results of a study in which one of two trees was simulated (53).
Simulating data for Tree 1 generates the distribution under the null hypoth-
esis, whereas simulating data for Tree 2 generates data under the alternative
hypothesis. The graph shows that the likelihood ratio test of monophyly can be
powerful; the power of the test increases, as expected, when the number of sites
in the analysis is increased. Although promising, the simulations presented in
Figure 7 represent an ideal situation for the likelihood ratio test of monophyly.
Additional simulations suggest that the test also performs well when the overall
rate of substitution is low and an incorrect model is implemented in the likeli-
hood analysis. However, when an incorrect model is used and the overall rate
of substitution is high, the test rejects the null hypothesis too often. Hence, the
likelihood ratio test of monophyly should be implemented with as biologically
realistic a model as possible to prevent false rejection of the null hypothesis.
partitions of DNA characters include first- and second- versus third-codon po-
sitions, different genes, different coding regions within genes, or different ge-
nomic regions (e.g., nuclear versus mitochondrial, or different viral segments).
The incongruence of the trees estimated from different genes in bacteria has
been used to demonstrate horizontal gene transfer and recombination in bac-
teria (40). Similarly, the incongruence of trees estimated from different viral
DNA segments has been used to show that reassortment of the segments has
frequently occurred in the hanta virus (46). Such an approach can also be used
to indicate method failure; if the partitioned data have evolved on the same
underlying phylogenetic tree, but different trees have been estimated from each
data partition, then either sampling error or systematic bias by the phylogenetic
method are at fault (10). Finally, the congruence of trees estimated from host
and associated parasites can be used to infer cospeciation (6, 41, 102).
How can incongruence of phylogenetic trees estimated from different data
partitions be tested? A simple likelihood ratio test can be used to test whether
the same phylogeny underlies all data partitions (51). The likelihood under
the null hypothesis (L0) is calculated by assuming that the same phylogenetic
tree underlies all of the data partitions. However, branch lengths and other
parameters of the substitution model are estimated independently in each. The
likelihood under the alternative hypothesis (L1) is calculated by relaxing the con-
straint that the same tree underlies each data partition. The alternative hypothesis
allows the possibility that the histories of all data partitions are different. The
likelihood ratio test statistic [−2 log 3 = −2 (log L0 −log L1)] is compared to
a null distribution generated using parametric bootstrapping.
Rejection of the null hypothesis of homogeneity (i.e., the same phylogeny
for all data partitions) can indicate one of several different processes. One
possibility is that a different history underlies different data partitions; incon-
gruence of this sort could be caused by recombination, horizontal gene transfer,
or ancestral polymorphism. Another possibility is that the phylogenetic meth-
ods have failed for one or more data partitions. All phylogenetic methods can
←−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−−
Figure 6 The phylogenetic relationship of bats and other mammals. Trees were estimated using
maximum likelihood under a model that allows an unequal transition:transversion rate, unequal base
frequencies, and among-site rate heterogeneity (the HKY85+0 model). Maximum likelihood trees
were obtained using a tester version of the program PAUP∗ 4.0 (104). Bats form a monophyletic
group in the maximum likelihood tree (log L = −5936.52) (A). The best tree under the Pettigrew
hypothesis (that megachiroptera are constrained to be a sister group with the primates) (log L =
−5976.69) (B). The distribution of the likelihood ratio test statistic under the assumption that the
null hypothesis (the Pettigrew hypothesis) of relationship is correct (C). The observed likelihood
ratio test statistic (−log 3 = 40.16) is significant at P < 0.01. Hence, the Pettigrew hypothesis is
rejected.
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
Figure 7 The power of the likelihood ratio test of monophyly. The graph represents the results
of a computer simulation of two unrooted four taxon trees. The null hypothesis is true when the
tree ((A, B), C, D) is simulated and false when ((A, C), B, D) is simulated. The external branches
were constrained to be equal in length. R represents the ratio of the length of the internal branch
to the external branches; when R = 1.0, all branches of the tree are equal in length. The null
hypothesis is that the taxa A and B are a group (that the taxon bipartition {A, B} {C, D} exists).
The null hypothesis is rarely rejected when it is true and frequently rejected when false.
CONCLUSIONS
Phylogenies are being applied to a wider variety of biological questions than
ever before. One of the challenges for systematists is to develop appropriate
tests to address the questions posed by evolutionary biologists. Statistical tests
of phylogenetic questions can be formulated in many different ways. However,
for many of the tests posed to date, the underlying assumptions are not clear.
In fact, for some tests the null hypothesis is unclear (106). The testing of
phylogenetic hypotheses in a likelihood framework should prove useful in the
future. Likelihood provides a unified framework for the evaluation of alternative
hypotheses. With the use of parametric bootstrapping, likelihood ratio tests can
be applied to questions for which the null distribution is difficult to determine
analytically.
The application of likelihood ratio tests in phylogenetics is a recent phe-
nomenon, with most of the research activity occurring in the past five years.
However, in that time the approach has proven powerful. Likelihood ratio tests
have already provided information on the pattern of DNA substitution. Fur-
thermore, the approach has been applied to questions involving topology and
has even allowed the examination of whether hosts and parasites cospeciated
(56). Future research can investigate the statistical properties of likelihood ratio
tests with the objective of determining the power and robustness of the tests.
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
ACKNOWLEDGMENTS
We thank Rasmus Nielsen, Jack Sites, and Spencer Muse for comments on
earlier versions of this manuscript. Spencer Muse, especially, made many
helpful comments pointing out inadequacies of an earlier manuscript. This
work was supported by a Miller postdoctoral fellowship awarded to JPH and
an Alfred P. Sloan Young Investigator Award to KAC.
Literature Cited
1. Adachi J, Hasegawa M. 1992. Amino acid 11. Churchill GA. 1989. Stochastic models
substitution of proteins coded for in mito- for heterogeous DNA sequences. Bull.
chondrial DNA during mammalian evo- Math. Biol. 51:79–94
lution. Jpn. J. Genet. 67:187–97 12. Churchill GA, von Haeseler A, Navidi
2. Allard MW, Miyamoto MM. 1992. WC. 1992. Sample size for a phylogenetic
Perspective: testing phylogenetic ap- inference. Mol. Biol. Evol. 9:753–69
proaches with empirical data, as illus- 13. Cox DR. 1961. Tests of separate families
trated with the parsimony method. Mol. of hypotheses. Proc. 4th Berkeley Symp.
Biol. Evol. 9:778–86 Math. Stat. Prob. 1:105–23
3. Barry D, Hartigan JA. 1987. Statistical 14. Cox DR. 1962. Further results on tests of
analysis of hominoid molecular evolu- separate families of hypotheses. J. R. Stat.
tion. Stat. Sci. 2:191–210 Soc. B 24:406–24
4. Benton MJ. 1990. Phylogeny of the major 15. Cox DR, Miller HD. 1977. The Theory of
tetrapod groups: morphological data and Stochastic Processes. London: Chapman
divergence dates. J. Mol. Biol. 30:409–24 & Hall
5. Bishop MJ, Thompson EA. 1986. Max- 16. Crandall KA. 1994. Intraspecific clado-
imum likelihood alignment of DNA se- gram estimation: accuracy at higher lev-
quences. J. Mol. Biol. 190:159–65 els of divergence. Syst. Biol. 43:222–35
6. Brooks DR. 1981. Hennig’s parasitolog- 17. Cummings MP, Otto SP, Wakeley J. 1995.
ical method: a proposed solution. Syst. Sampling properties of DNA sequence
Zool. 30:229–49 data in phylogenetic analysis. Mol. Biol.
7. Brooks DR, McLennan DA. 1991. Phy- Evol. 12:814–22
logeny, Ecology, and Behavior. Chicago, 18. Dixon MT, Hillis DM. 1993. Ribosomal
IL: Univ. Chicago Press. 434 pp. RNA secondary structure: compensatory
8. Brower AVZ, DeSalle R, Vogler A. 1996. mutations and implications for phyloge-
Gene trees, species trees, and systematics: netic analysis. Mol. Biol. Evol. 10:256–67
a cladistic perspective. Annu. Rev. Ecol. 19. Edwards AWF. 1972. Likelihood. Cam-
Syst. 27:423–50 bridge: Cambridge Univ. Press
9. Bruno WJ. 1996. Modeling residue us- 20. Edwards AWF, Cavalli-Sforza LL. 1964.
age in aligned protein sequences via Reconstruction of evolutionary trees. In
maximum likelihood. Mol. Biol. Evol. Phenetic and Phylogenetic Classification,
13:1368–74 ed. J McNeill, pp. 67–76. London: Syst.
10. Bull JJ, Huelsenbeck JP, Cunningham Assoc.
CW, Swofford DL, Waddell PJ. 1993. Par- 21. Farris JS. 1983. The logical basis of phy-
titioning and combining data in phyloge- logenetic analysis. In Advances in Cladis-
netic analysis. Syst. Biol. 42:384–97 tics, ed. NI Platnick, VA Funk, 2:7–36.
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
New York: Columbia Univ. Press analysis based on rRNA sequences sup-
22. Felsenstein J. 1978. Cases in which par- ports the archaebacterial rather than the
simony or compatibility methods will be eocyte tree. Nature 339:145–47
positively misleading. Syst. Zool. 27:401– 40. Guttman DS. 1997. Recombination and
10 clonality in natural populations of Es-
23. Felsenstein J. 1978. The number of evo- cherichia coli. Trends Ecol. Evol. 12:16–
lutionary trees. Syst. Zool. 27:27–33 22
24. Felsenstein J. 1981. Evolutionary trees 41. Hafner MS, Nadler SA. 1988. Phyloge-
from DNA sequences: a maximum like- netic trees support the coevolution of par-
lihood approach. J. Mol. Evol. 17:368–76 asites and their hosts. Nature 332:258–59
25. Felsenstein J. 1985. Phylogenies and the 42. Harvey PH, Nee S. 1994. Phylogenetic
comparative method. Am. Nat. 125:1–15 epidemiology lives. Trends Ecol. Evol.
26. Felsenstein J. 1988. Phylogenies from 9:361–63
molecular sequences. Annu. Rev. Genet. 43. Harvey PH, Pagel MD. 1991. The Com-
22:521–65 parative Method in Evolutionary Biology.
27. Felsenstein J, Churchill GA. 1996. A hid- Oxford: Oxford Univ. Press. 239 pp.
den Markov model approach to variation 44. Hasegawa M, Kishino H, Saitou N. 1991.
among sites in rate of evolution. Mol. Biol. On the maximum likelihood method in
Evol. 13:93–104 molecular phylogenetics. J. Mol. Evol.
28. Fisher RA. 1922. On the mathematical 32:443–45
foundations of theoretical statistics. Phi- 45. Hasegawa M, Kishino K, Yano T. 1985.
los. Trans. R. Soc. London Ser. A 222:309– Dating the human-ape splitting by a
68 molecular clock of mitochondrial DNA.
29. Gaut BS, Lewis PO. 1995. Success of J. Mol. Evol. 22:160–74
maximum likelihood phylogeny infer- 46. Henderson WW, Monroe MC, St. Jeor
ence in the four-taxon case. Mol. Biol. SC, Thayer WP, Rowe JE, et al. 1995. Nat-
Evol. 12:152–62 urally occurring sin nombre virus genetic
30. Gaut BS, Weir BS. 1994. Detecting reassortants. Virology 214:602–10
substitution-rate heterogeneity among re- 47. Hillis DM. 1997. Biology recapitulates
gions of a nucleotide sequence. Mol. Biol. phylogeny. Science 276:218–19
Evol. 11:620–29 48. Hillis DM, Bull JJ, White ME, Badgett
31. Goldman N. 1990. Maximum likelihood MR, Molineux IJ. 1992. Experimental
inference of phylogenetic trees with spe- phylogenetics: generation of a known
cial reference to a Poisson process model phylogeny. Science 255:589–91
of DNA substitution and to parsimony 49. Huelsenbeck JP. 1995. Performance of
analyses. Syst. Zool. 39:345–61 phylogenetic methods in simulation. Syst.
32. Goldman N. 1993. Statistical tests of Biol. 44:17–48
models of DNA substitution. J. Mol. Evol. 50. Huelsenbeck JP. 1995. The robustness of
36:182–98 two phylogenetic methods: four-taxon
33. Goldman N, Yang Z. 1994. A codon- simulations reveal a slight superiority of
based model of nucleotide substitution maximum likelihood over neighbor join-
for protein-coding DNA sequences. Mol. ing. Mol. Biol. Evol. 12:843–49
Biol. Evol. 11:725–36 51. Huelsenbeck JP, Bull JJ. 1996. A likeli-
34. Grantham R. 1974. Amino acid difference hood ratio test to detect conflicting phy-
formula to help explain protein evolution. logenetic signal. Syst. Biol. 45:92–98
Science 185:862–64 52. Huelsenbeck JP, Cunningham CW, Gray-
35. Grauer D, Hide WA, Li W-H. 1991. Is the beal A. 1997. The performance of phy-
guinea-pig a rodent? Nature 351:649–52 logenetic methods for a well supported
36. Grauer D, Hide WA, Zharkikh A, Li phylogeny. Syst. Biol. In press
W-H. 1992. The biochemical phylogeny 53. Huelsenbeck JP, Hillis DM, Nielsen R.
of guinea-pigs and gundis, and the pa- 1996. A likelihood-ratio test of mono-
raphyly of the order rodentia. Comp. phyly. Syst. Biol. 45:546–58
Biochem. Physiol. B 101:495–98 54. Huelsenbeck JP, Hillis DM. 1993. Suc-
37. Griffiths RC, Tavaré S. 1994. Simulating cess of phylogenetic methods in the four-
probability distributions. Theor. Popul. taxon case. Syst. Biol. 42:247–64
Biol. 46:131–59 55. Huelsenbeck JP, Nielsen R. 1997. The ef-
38. Gu X, Fu Y-X, Li W-H. 1995. Maximum fect of non-independent substitution on
likelihood estimation of the heterogene- phylogenetic accuracy. Syst. Biol. Sub-
ity of substitution rate among nucleotide mitted
sites. Mol. Biol. Evol. 12:546–57 56. Huelsenbeck JP, Rannala B, Yang Z.
39. Guoy M, Li W-H. 1989. Phylogenetic 1997. Statistical tests of host parasite
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
techniques. Cold Spring Harbor Symp. 101. Silvey SD. 1975. Statistical Inference.
Quant. Biol. 52:825–37 London: Chapman & Hall
86. Ou C-Y, Ciesielski CA, Myers G, Ban- 102. Simberloff D. 1987. Calculating proba-
dea CI, Luo C-C, et al. 1992. Molecular bilities that cladograms match: a method
epidemiology of HIV transmission in a of biogeographic inference. Syst. Zool.
dental practice. Science 256:1165–71 36:175–95
87. Pamilo P, Bianchi NO. 1993. Evolution of 103. Stanhope MJ, Czelusniak J, Si J-S, Nick-
the Zfx and Zfy genes: rates and interde- erson J, Goodman M. 1992. A molecu-
pendence between the genes. Mol. Biol. lar perspective on mammalian evolution
Evol. 10:271–81 from the gene encoding Interphotorecep-
88. Penny D, Hendy MD, Steel MA. 1992. tor Retinoid Binding Protein, with con-
Progress with methods for constructing vincing evidence for bat monophyly. Mol.
evolutionary trees. Trends Ecol. Evol. Phylogeny Evol. 1:148–60
7:73–79 104. Swofford DL. 1996. PAUP∗ : Phylo-
89. Perler R, Efstratiadis A, Lomedico P, genetic Analysis Using Parsimony (and
Gilbert W, Klodner R, Dodgson J. 1980. Other Methods), Version 4.0. Sunderland,
The evolution of genes: the chicken pre- MA: Sinauer Assoc.
proinsulin gene. Cell 20:555–66 105. Swofford DL, Olsen GJ, Waddell PJ,
90. Pettigrew JD. 1986. Flying primates? Hillis DM. 1996. Phylogenetic inference.
Megabats have the advanced pathway In Molecular Systematics, ed. DM Hillis,
from eye to midbrain. Science 231:1304– C Moritz, BK Mable, pp. 407–514. Sun-
6 derland, MA: Sinauer. 2nd ed.
91. Reeves JH. 1992. Heterogeneity in the 106. Swofford DL, Thorne JS, Felsenstein
substitution process of amino acid sites J, Wiegmann BM. 1996. The topology-
of proteins coded for by mitochondrial dependent permutation test for mono-
DNA. J. Mol. Evol. 35:17–31 phyly does not test for monophyly. Syst.
92. Rice JA. 1995. Mathematical Statis- Biol. 45:575–79
tics and Data Analysis. Belmont, CA: 107. Tamura K, Nei M. 1993. Estimation of
Duxbury the number of nucleotide substitutions in
93. Ritland K, Clegg MT. 1987. Evolution- the control region of mitochondrial DNA
ary analysis of plant DNA sequences. Am. in humans and chimpanzees. Mol. Biol.
Nat. 130:S74–100 Evol. 10:512–26
94. Rodriguez F, Oliver JF, Marin A, Medina 108. Tateno Y, Takezaki N, Nei M. 1994.
JR. 1990. The general stochastic model Relative efficiencies of the maximum-
of nucleotide substitutions. J. Theor. Biol. likelihood, neighbor-joining, and maxi-
142:485–501 mum-parsimony methods when substitu-
95. Rzhetsky A. 1995. Estimating substitu- tion rate varies with site. Mol. Biol. Evol.
tion rates in ribosomal RNA genes. Ge- 11:261–77
netics 141:771–83 109. Tavaré S. 1986. Some probabilistic and
96. Rzhetsky A, Nei M. 1995. Tests of ap- statistical aspects of the primary structure
plicability of several substitution models of nucleotide sequences. In Lectures on
for DNA sequence data. Mol. Biol. Evol. Mathematics in the Life Sciences, ed. RM
12:131–51 Miura, pp. 57–86. Providence, RI: Am.
97. Saitou N, Imanishi T. 1989. Rela- Math. Soc.
tive efficiencies of the Fitch-Margoliash, 110. Thorne JL, Goldman N, Jones DT. 1996.
maximum-parsimony, maximum-likeli- Combining protein evolution and sec-
hood, minimum-evolution, and neighbor- ondary structure. Mol. Biol. Evol. 13:666–
joining methods of phylogenetic tree con- 73
struction in obtaining the correct tree. 111. Thorne JL, Kishino H, Felsenstein J.
Mol. Biol. Evol. 6:514–25 1991. An evolutionary model for max-
98. Schöniger M, von Haeseler A. 1994. A imum likelihood alignment of DNA se-
stochastic model for the evolution of au- quences. J. Mol. Evol. 33:114–24
tocorrelated DNA sequences. Mol. Phy- 112. Thorne JL, Kishino H, Felsenstein J.
logeny Evol. 3:240–47 1992. Inching toward reality: an im-
99. Schöniger M, von Haeseler A. 1995. Per- proved likelihood model of sequence evo-
formance of the maximum likelihood, lution. J. Mol. Evol. 34:3–16
neighbor joining, and maximum parsi- 113. Van Den Bussche RA, Baker RJ, Huelsen-
mony methods when sequence sites are beck JP, Hillis DM. 1997. Base composi-
not independent. Syst. Biol. 44:533–47 tional bias and phylogenetic analysis: a
100. Sharp PM. 1997. In search of molecular test of the “flying DNA” hypothesis. Mol.
darwinism. Nature 385:111–12 Phyl. Evol. Submitted
P1: ARS
September 10, 1997 17:12 Annual Reviews AR042-17
114. Waddell PJ, Penny D. 1996. Extend- model for the evolution of DNA se-
ing hadamard conjugations to model quences. Genetics 139:993–1005
sequence evolution with variable rates 121. Yang Z. 1996. Among-site rate variation
across sites. Available by anonymous ftp and its impact on phylogenetic analyses.
from onyx.si.edu. Trends Ecol. Evol. 11:367–72
115. Weir BS. 1990. Genetic Data Analysis. 122. Yang Z. 1996. Maximum likelihood mod-
Sunderland, MA: Sinauer els for combined analyses of multiple se-
116. Yang Z. 1993. Maximum likelihood es- quence data. J. Mol. Evol. 42:587–96
timation of phylogeny from DNA se- 123. Yang Z, Goldman N, Friday AE. 1994.
quences when substitution rates differ Comparison of models for nucleotide
over sites. Mol. Biol. Evol. 10:1396–401 substitution used in maximum-likelihood
117. Yang Z. 1994. Statistical properties of the phylogenetic estimation. Mol. Biol. Evol.
maximum likelihood method of phyloge- 11:316–24
netic estimation and comparison with dis- 124. Yang Z, Goldman N, Friday AE. 1995.
tance methods. Syst. Biol. 43:329–42 Maximum likelihood trees from DNA se-
118. Yang Z. 1994. Estimating the pattern quences: a peculiar statistical estimation
of nucleotide substitution. J. Mol. Evol. problem. Syst. Biol. 44:384–99
39:105–11 125. Yang Z, Rannala B. 1997. Bayesian
119. Yang Z. 1994. Maximum likelihood phylogenetic inference using DNA se-
phylogenetic estimation from DNA se- quences: a Markov chain Monte Carlo
quences with variable rates over sites: ap- method. Mol. Biol. Evol. 14:In press
proximate methods. J. Mol. Evol. 39:306– 126. Zharkikh A. 1994. Estimation of evolu-
14 tionary distances between nucleotide se-
120. Yang Z. 1995. A space-time process quences. J. Mol. Evol. 39:315–29