Phylogenetics

Phylogenetics n.,[ˌfaɪləd͡ʒəˈnɛtɪks] Definition: the scientific study of taxonomic relatedness of organisms based on evolutionary features and history

Phylogenetics Definition

Phylogenetics is the scientific study of phylogeny. It studies evolutionary relationships among groups of organisms based on their evolutionary history, similarities, and differences. It uses molecular sequence data (such as nucleotide and protein sequences, compared across homologous regions) and morphological data matrices to analyze the evolution of genes and proteins in related groups of organisms.

Etymology: from the Greek phyle/phylon, meaning “tribe” or “race”, and genetikos, meaning “relative to birth”, from genesis (“birth”).

Key Concepts

Common concepts that are highly relevant and important in understanding phylogenetics:

Phylogenetic tree

A phylogenetic tree is a “tree” diagram that shows the hypothesized evolutionary relatedness and history of groups of organisms. Phylogenetic trees are used to study biodiversity, genetics, evolution, and the ecology of organisms.

Evolutionary trees are essential phylogenetic tools for learning about common ancestors based on evolutionary closeness and branch lengths. In a phylogenetic tree depicting the evolution of all biological entities, both extinct and living, the “root” (bottom) depicts the last universal common ancestor of life on Earth. The “branches” represent the genealogy of a group of organisms, and a “node” is the point where lineages come together (or bifurcate, depending on the point of view). Each node represents the origin (ancestor) of a group. The “tips” of the branches represent the most recent lineages, including the organisms living at present.
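The parts of a tree described above (root, nodes, tips) can be sketched in a few lines of Python. This is only an illustration: the tip labels A to E and the branching order are hypothetical, and representing a tree as nested tuples is an assumption made for brevity, not a standard format.

```python
# A rooted tree as nested tuples: a string is a tip, a tuple is a node.
# The outermost tuple is the root. Labels and branching order are hypothetical.
tree = ("A", ("B", ("C", ("D", "E"))))

def tips(node):
    """Return the tip labels below a node, left to right."""
    if isinstance(node, str):
        return [node]
    labels = []
    for child in node:
        labels.extend(tips(child))
    return labels

def count_nodes(node):
    """Count the branching points (internal nodes), including the root."""
    if isinstance(node, str):
        return 0
    return 1 + sum(count_nodes(child) for child in node)

print(tips(tree))
print(count_nodes(tree))
```

Each tuple plays the role of a node (a common ancestor), and each string plays the role of a tip.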

Monophyly, polyphyly, and paraphyly

Monophyly is the condition of being a clade. A monophyletic group consists of a common ancestor and all of its descendants. Hence, members of this group are more closely related to each other than to any organism outside the group. A derived characteristic that arose in the most recent common ancestor and is shared by its descendants is called a synapomorphy. Synapomorphies, when identified, are what define the monophyly of a group.

For example, chordates (phylum Chordata) are a monophyletic group consisting of the descendants of a single common ancestor, which was itself a chordate. They share a single evolutionary history and, if represented on a phylogenetic tree, they would all come from the same single node.

Polyphyly is when the members of a group are an assemblage of organisms that descended from more than one common ancestor. Thus, a polyphyletic group refers to a group of organisms descending from multiple ancestral lineages. They have a mixed evolutionary origin.

Paraphyly refers to a group of organisms that descended from a single common ancestor but that excludes some of that ancestor's descendants. A paraphyletic group, therefore, includes only some of the descendants of the common ancestor, whereas a monophyletic group includes all of them.
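The definition of monophyly can be checked mechanically once a tree is given. The sketch below assumes a hypothetical tree with tips A to E written as nested tuples, and tests whether a proposed group is exactly the set of tips under one node. It does not try to tell paraphyly from polyphyly, which also depends on whether the common ancestor would be counted as a member of the group.

```python
# Hypothetical rooted tree: a string is a tip, a tuple is an internal node.
tree = ("A", ("B", ("C", ("D", "E"))))

def tips(node):
    """Return the set of tip labels below a node."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found

def smallest_clade(node, group):
    """Return the tips of the smallest clade containing every member of group."""
    here = tips(node)
    if not group <= here:
        return None  # this subtree is missing at least one member
    if not isinstance(node, str):
        for child in node:
            inner = smallest_clade(child, group)
            if inner is not None:
                return inner
    return here

def is_monophyletic(tree, group):
    """A group is monophyletic if it equals the smallest clade that contains it."""
    return smallest_clade(tree, set(group)) == set(group)

print(is_monophyletic(tree, {"C", "D", "E"}))
print(is_monophyletic(tree, {"B", "C"}))
```

In this hypothetical tree, the group of C, D, and E passes the check, while B and C do not, because the smallest clade containing both also contains D and E.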

Clade and taxon

A clade refers to a monophyletic group of organisms. A taxon refers to any group or rank in biological classification. Taxa (plural form) are essentially used in classifying organisms based on the characteristics depicting relatedness (e.g., morphological, behavioral, and/or genetic features common among members of the taxon). Examples of taxonomic ranks or groups are phylum, family, genus, etc.

A taxon is a wider concept than a clade. A taxon may be monophyletic, polyphyletic, or paraphyletic. A monophyletic taxon is a clade: all of its members descended from the same most recent common ancestor, and it includes all descendants of that ancestor. Apart from chordates, other examples of monophyletic taxa are mammals, insects, and angiosperms. Take, for instance, the mammals. All of them have mammary glands, a trait believed to have originated in a common ancestor. A trait passed down to descendants in this way is described as homologous (in contrast to an analogous trait, which is similar but arose independently).

A polyphyletic taxon is one whose members descended from two or more separate ancestors rather than from a single most recent common ancestor unique to the group. An example of a polyphyletic group is trees. Trees share a typical morphology, such as a sturdy, upright single stem that supports the branches, but they have different most recent common ancestors: the most recent ancestor of angiosperms and the most recent ancestor of gymnosperms, for instance.

A paraphyletic taxon consists of members that descended from a single most recent common ancestor, but not all of that ancestor's descendants are members of the taxon. Gymnosperms, for example, exclude some of the descendants of their most recent ancestor.

To better understand this, take, for instance, cellular organisms. All cellular organisms belong to a monophyletic taxon because they share a common characteristic, being made up of one or more cells, and descended from a single common ancestor. However, not all cells have a nucleus. Cellular organisms that lack a nucleus belong to the paraphyletic taxon Prokaryota (the prokaryotes), which excludes the clade of cellular organisms that have a nucleus, Eukaryota (the eukaryotes).

Frequently Asked Questions on Phylogenetics

What is the purpose of phylogenetics?

Phylogenetics is important for understanding evolutionary processes and for the phylogenetic classification of organisms. It also helps formulate evolutionary hypotheses through the construction of molecular evolutionary trees. Phylogenetics thus helps us understand the phylogenetic diversity and history of various groups of organisms.

Diversity of Cladosporium

Cladosporium is a genus of fungi that forms a monophyletic group. A study of this genus used molecular phylogenetics (in particular, multilocus DNA sequences and typing), in addition to morphological observations and microbial cultures, to learn more about the species found indoors. In all, 46 species were identified and characterized.

How is phylogenetics done?

One way of studying phylogeny is by framing hypotheses as phylogenetic trees. These diagrams are useful for reconstructing the evolution and distribution of a character trait, which can serve as the basis for phylogenetic inference and cladistic analysis. The hierarchy of traits on a phylogenetic tree can be used to identify the order in which characteristics appeared.

In essence, phylogeneticists aim to answer general questions such as “how do sequences evolve?”, “how are the organisms of interest (at the individual or genetic level) evolutionarily related to other organisms?”, or “how can we come up with a sound evolutionary model?”

Initially, these diagrams were based heavily on morphological analyses and the identification of phylogenetic branching patterns. Phylogenetics was revolutionized when molecular biology techniques came about and eventually found their way into the field.

Among these molecular approaches are the generation and analysis of biomolecular sequences, such as DNA (nucleotide) sequences and protein sequences, and the comparison of homologous sequences.

Molecular sequencing (e.g., DNA sequencing) became an indispensable tool for phylogenetic analyses of observable heritable traits. It helps verify whether a trait is homologous or analogous (thus avoiding “false positives” of common ancestry).

Alignment-based or alignment-free methods are used for sequence comparisons. Sequence alignment helps identify homologous sequences. It can be done pairwise (where two sequences are compared) or by multiple sequence alignment (where multiple sequences are compared simultaneously).

These molecular methods provide quantitative molecular data to back up phylogenetic inferences.
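To make pairwise alignment concrete, here is a minimal sketch of a global alignment score computed with the Needleman-Wunsch dynamic-programming recurrence. The choice of that algorithm and the scoring values (match +1, mismatch -1, gap -1) are assumptions made for illustration; the text above does not prescribe a particular method, and real analyses use substitution matrices and more refined gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best end-to-end alignment score of sequences a and b (Needleman-Wunsch)."""
    # prev[j] holds the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix costs i gaps
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + pair,   # align a[i-1] with b[j-1]
                            prev[j] + gap,        # gap in b
                            curr[j - 1] + gap))   # gap in a
        prev = curr
    return prev[-1]

print(global_alignment_score("ACGT", "AGT"))
```

A higher score means the two sequences can be lined up with more matching positions and fewer gaps, which is one signal used when judging whether sequences are homologous.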

Statistical analyses, e.g., maximum likelihood methods, became essential to phylogeneticists, too, because they make it possible to estimate the parameters of a probability distribution, such as those of an evolutionary model. This approach is used in phylogenetic inference to evaluate competing hypotheses about evolutionary histories by seeking the “best” hypothetical tree, the one under which the observed data are most probable.

Bayesian phylogenetic inference (or simply, Bayesian inference) is one of the most popular methods used in molecular phylogenetics and has become a standard statistical approach. Phylogeneticists have used Bayesian methods to infer phylogenies and phenotypic trait evolution, evaluate phylogenetic uncertainty, and analyze molecular dating and the dynamics of species diversification and extinction, among others. Commonly used software for Bayesian analysis includes MrBayes, BEAST (Bayesian Evolutionary Analysis Sampling Trees), and PhyloBayes / PhyloBayes MPI, a Markov chain Monte Carlo (MCMC) sampler for phylogenetic reconstruction.

These techniques helped phylogeneticists develop better evolutionary models, aided by fast computer programs that can manage large sets of statistical and sequence data. As a result, they are able to infer phylogenetic relationships and sequence evolution more reliably.

In 1977, Carl Woese and George Fox compared the ribosomal RNA (rRNA) sequences of prokaryotes and found that some prokaryotes are genetically distinct from bacteria. The rRNA sequences of these prokaryotes lack the signatures typical of bacteria, showing that they are only distantly related to bacteria. Woese and Fox called these prokaryotes “Archaebacteria”.
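As a small illustration of maximum likelihood estimation in this setting, the sketch below estimates a single parameter, the evolutionary distance between two aligned sequences, rather than a whole tree. It assumes the Jukes-Cantor (JC69) substitution model, which the text above does not name, and the site counts are hypothetical.

```python
import math

def jc69_diff_prob(d):
    """Probability that a site differs after d expected substitutions per site (JC69)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))

def log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of observing n_diff differing sites out of n_sites at distance d."""
    p = jc69_diff_prob(d)
    return n_diff * math.log(p) + (n_sites - n_diff) * math.log(1.0 - p)

def ml_distance(n_sites, n_diff):
    """Closed-form maximum likelihood distance under JC69 (needs n_diff/n_sites < 0.75)."""
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * n_diff / n_sites)

n_sites, n_diff = 100, 10  # hypothetical alignment: 10 of 100 sites differ

# Brute-force search over candidate distances, to compare with the closed form.
grid = [i / 1000 for i in range(1, 2000)]
best = max(grid, key=lambda d: log_likelihood(d, n_sites, n_diff))

print(best, ml_distance(n_sites, n_diff))
```

The grid search and the closed-form estimate should agree to within the grid spacing. Tree-building programs apply the same principle to far more parameters, including the branching order itself.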
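The MCMC idea behind such programs can be shown on a toy problem. The sketch below does not call MrBayes, BEAST, or PhyloBayes, and it does not sample trees. It runs a Metropolis sampler over one parameter, the distance between two sequences, assuming a Jukes-Cantor (JC69) likelihood and an exponential prior. The model, the prior rate, the step size, and the site counts are all assumptions chosen for illustration.

```python
import math
import random

def log_posterior(d, n_sites, n_diff, prior_rate=10.0):
    """Unnormalized log posterior of distance d: JC69 likelihood times exponential prior."""
    if d <= 0.0:
        return -math.inf  # distances must be positive
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    log_lik = n_diff * math.log(p) + (n_sites - n_diff) * math.log(1.0 - p)
    return log_lik - prior_rate * d

def sample_distance(n_sites, n_diff, steps=20000, burn_in=2000, step=0.05, seed=1):
    """Metropolis sampler with a symmetric uniform proposal around the current value."""
    rng = random.Random(seed)
    d = 0.1
    current = log_posterior(d, n_sites, n_diff)
    samples = []
    for i in range(steps):
        proposal = d + rng.uniform(-step, step)
        candidate = log_posterior(proposal, n_sites, n_diff)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            d, current = proposal, candidate
        if i >= burn_in:
            samples.append(d)
    return samples

samples = sample_distance(n_sites=100, n_diff=10)  # hypothetical counts
posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 3))
```

The spread of the retained samples, not just their mean, is what gives a Bayesian analysis its measure of uncertainty.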

Is phylogeny the same as phylogenetics?

Not quite, but they are related. Phylogeny is defined as the evolutionary history of a group of organisms. Phylogenetics is the science that studies phylogeny. The phylogenetic approach to studying the evolutionary relatedness and histories of organisms makes use of phylogenetic trees. Molecular sequence data are also used in inferring phylogenies. As discussed above, a phylogenetic tree depicts how one group of organisms is related to another, and molecular sequencing provides a genetic basis for that relatedness.

Related Sciences

Here, you will find some of the scientific disciplines that are closely related to phylogenetics. Thus, topics could overlap between or among them.

Systematics and Evolutionary biology

Phylogeny is the evolutionary history of a taxonomic group of organisms. Thus, phylogenetics is mainly concerned with the phylogenetic relationships and molecular evolution of organisms according to evolutionary similarities and differences. Phylogenetics, therefore, is a part of biological systematics, which has a wider scope because it involves not only the phylogenetics of organisms but also their identification and classification. In that sense, systematics encompasses two fields of study: taxonomy and phylogenetics. In particular, phylogenetic systematics is a subfield of systematics in which a phylogenetic approach is applied to studying the diversity and relationships of organisms through time.

The subfields of systematics include molecular systematics, which explores relationships via a molecular approach; numerical systematics or numerical taxonomy, which is based on biostatistical analysis and data; and experimental systematics or evolutionary systematics, which attempts to understand systematics through the factors affecting the process of evolution.

Phylogenetics is also related to taxonomy, the branch of science concerned with finding, describing, classifying, and naming organisms, including the study of the relationships between taxa and the principles underlying such classification. Phylogenetics informs taxonomy in the classification and identification of organisms.

The study of phylogenies is heavily based on the core principles and practice of genetics. Genetics helps us understand how species evolve at the level of genes and genomes, particularly genetic changes through time.

Further Reading

  • systematics
  • evolutionary biology
  • phylogenetic tree
  • Systematics: Meaning, Branches and Its Application. (2016, May 27). Biology Discussion. https://www.biologydiscussion.com/animals-2/systematics-meaning-branches-and-its-application/32374
  • Graphical explanation of phylogenetic terms. (2022). Berkeley.edu. https://ucmp.berkeley.edu/glossary/gloss1/phyly.html
  • Haber, M., & Velasco, J. (2021). Phylogenetic Inference (Stanford Encyclopedia of Philosophy). Stanford.edu. https://plato.stanford.edu/entries/phylogenetic-inference/#:~:text=At%20its%20core%2C%20phylogenetic%20inference,underdetermination%20of%20theory%20by%20evidence%20.

©BiologyOnline.com. Content provided and moderated by Biology Online Editors.

Last updated on May 29th, 2023

Building a phylogenetic tree

Key points:

  • Phylogenetic trees represent hypotheses about the evolutionary relationships among a group of organisms.
  • A phylogenetic tree may be built using morphological (body shape), biochemical, behavioral, or molecular features of species or other groups.
  • In building a tree, we organize species into nested groups based on shared derived traits (traits different from those of the group's ancestor).
  • The sequences of genes or proteins can be compared among species and used to build phylogenetic trees. Closely related species typically have few sequence differences, while less related species tend to have more.
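The idea of organizing species into nested groups based on shared derived traits can be sketched as a simple consistency check. The species labels, trait names, and trait assignments below are hypothetical. The check only asks whether every pair of trait-defined groups is either nested or non-overlapping, which is what a single tree requires.

```python
from itertools import combinations

# For each derived trait, the set of species that have it (hypothetical data).
derived = {
    "fuzzy tail": {"B", "C", "D", "E"},
    "big ears": {"C", "D", "E"},
    "whiskers": {"D", "E"},
}

def groups_are_compatible(groups):
    """True if every pair of groups is nested or disjoint, so all fit on one tree."""
    return all(a <= b or b <= a or not (a & b)
               for a, b in combinations(groups, 2))

print(groups_are_compatible(derived.values()))

# If whiskers were also observed in species B, the whisker group would overlap
# the big-ears group without either containing the other: the traits conflict.
conflicting = dict(derived, whiskers={"B", "D", "E"})
print(groups_are_compatible(conflicting.values()))
```

When the check fails, at least one trait must have been gained or lost more than once, and a method such as parsimony is needed to choose among trees.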

Example: building a phylogenetic tree

To build a tree from trait data, we first need to know which traits are derived and which are ancestral. That information can come from several sources:

  • In the context of homework or a test, the question you are solving may tell you which traits are derived vs. ancestral.
  • If you are doing your own research, you may have knowledge that allows you to identify ancestral and derived traits (e.g., based on fossils).
  • You may be given information about an outgroup, a species that's more distantly related to the species of interest than they are to one another.

Parsimony and pitfalls in tree construction

  • We may not always be able to distinguish features that reflect shared ancestry (homologous features) from features that are similar but arose independently (analogous features arising by convergent evolution).

    Example: Imagine that the tree described below shows the actual evolutionary history of a group of rodents. In this tree, whiskers arise two independent times. If we didn't know the true history of the group and were trying to reconstruct it, we might interpret the whiskers as arising from a single event. The whisker data would then conflict with data for the other traits.

    [Figure: diagonal phylogenetic tree titled “Convergent evolution of whiskers”. Species A branches off first from the most recent common ancestor of A through E and, like that ancestor, has no whiskers, no fuzzy tail, and no big ears. A fuzzy tail arises on the lineage leading to B through E. Species B then branches off and gains whiskers (1). Big ears arise on the lineage leading to C, D, and E. Species C branches off without whiskers, and whiskers (2) arise on the lineage leading to D and E. Present-day species: A (no derived traits), B (fuzzy tail, whiskers), C (fuzzy tail, big ears), D and E (fuzzy tail, big ears, whiskers).]

  • Traits can be gained and lost multiple times over the evolutionary history of a species. A species may have a derived trait, but then lose that trait (revert back to the ancestral form) over the course of evolution.

    Example: Imagine that the tree described below shows the actual evolutionary history of a group of rodents. In this tree, species E undergoes a genetic change that causes it to lose its bushy tail and gain the skinny tail present in the group's ancestor. If we didn't know the true history of the group and were trying to reconstruct it, we might assume that species E was descended from an ancestor without a bushy tail. Under this assumption, the tail data would conflict with data for other traits.

    [Figure: diagonal phylogenetic tree titled “Gain and loss of fuzzy tail”. Species A branches off first from the most recent common ancestor of A through E and has no derived traits. A fuzzy tail arises on the lineage leading to B through E. Species B branches off with a fuzzy tail. Big ears arise on the lineage leading to C, D, and E. Species C branches off with a fuzzy tail and big ears. Whiskers arise on the lineage leading to D and E. Species D has a fuzzy tail, big ears, and whiskers. Species E passes through a point labeled “Skinny tail (reversion)” and has a skinny tail, big ears, and whiskers.]
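The two pitfalls above can be put in numbers by counting the minimum number of trait changes a tree requires. The sketch below uses Fitch's small-parsimony algorithm, which is one standard way to do this but is not named in the text. The trees are written as nested tuples, and the alternative tree is a hypothetical reconstruction that groups all whiskered species together.

```python
def fitch(node, states):
    """Return (possible ancestral states, minimum number of changes) for a binary tree."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: at least one change happened just below this node.
    return left_set | right_set, left_cost + right_cost + 1

# "True" history from the examples: A splits first, then B, then C, then D and E.
true_tree = ("A", ("B", ("C", ("D", "E"))))
# Hypothetical alternative that groups all whiskered species (B, D, E).
alt_tree = ("A", ("C", ("B", ("D", "E"))))

whiskers = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 1}    # 1 = has whiskers
fuzzy_tail = {"A": 0, "B": 1, "C": 1, "D": 1, "E": 0}  # E reverted to a skinny tail

print(fitch(true_tree, whiskers)[1])
print(fitch(alt_tree, whiskers)[1])
print(fitch(true_tree, fuzzy_tail)[1])
```

Judged on whiskers alone, the alternative tree needs fewer changes than the true one, which is how convergent evolution can mislead a parsimony-based reconstruction when only a few traits are available.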

Using molecular data to build trees

When the sequences of a gene or protein are compared between two species:

  • A larger number of differences corresponds to less related species
  • A smaller number of differences corresponds to more related species
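Counting sequence differences is straightforward once the sequences are aligned. The sketch below uses three made-up DNA sequences of equal length. The species names and sequences are hypothetical and chosen only to show the comparison.

```python
from itertools import combinations

# Hypothetical aligned sequences, all the same length.
sequences = {
    "species1": "ACGTACGTAC",
    "species2": "ACGTACGTAT",
    "species3": "ACGAACCTTT",
}

def count_differences(a, b):
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))

# Differences for every pair of species.
differences = {
    (m, n): count_differences(sequences[m], sequences[n])
    for m, n in combinations(sequences, 2)
}

for pair, diff in differences.items():
    print(pair, diff)

closest = min(differences, key=differences.get)
print(closest)
```

The pair with the fewest differences is the best candidate for the most closely related pair, and a full set of pairwise counts is the starting point for distance-based tree building.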

Attribution

Works cited

  • David Baum, "Reading a Phylogenetic Tree: The Meaning of Monophyletic Groups," Nature Education 1, no. 1 (2008): 190, http://www.nature.com/scitable/topicpage/reading-a-phylogenetic-tree-the-meaning-of-41956.

What is phylogenetics?

Phylogenetics is the study of the evolutionary relationships between organisms, based on their genetic material revealed through DNA and RNA sequencing. 

  • A phylogeny, or phylogenetic tree, is a way of visually representing evolutionary relationships. It is a scientist’s best guess as to how an organism or group of organisms has evolved.
  • Evolutionary biologists use phylogenetic trees to describe how individuals or groups are related to each other through their shared ancestry.

Species: a group of closely related organisms that have common physical and genetic characteristics and are able to interbreed to produce fertile offspring.

Evolution: the process by which organisms change and adapt by natural selection.

What is a phylogenetic tree?

  • Phylogenetic trees are made up of ‘leaves’ and ‘branches’. The leaves are at the tips of the tree and represent the organisms whose evolution you are describing.
  • Leaf labels are often letters, pictures, or species names.
  • A leaf can represent an individual organism – such as a virus or bacterium, an entire species, or even a single gene. A tree can be used to represent one species’ evolution, or the evolution of all animals.
  • Each leaf is connected by a series of branches. These branches represent how leaves relate to one another.

Phylogenetic Inference

Phylogenetics is the study of the evolutionary history and relationships among individuals, groups of organisms (e.g., populations, species, or higher taxa), or other biological entities with evolutionary histories (e.g., genes, biochemicals, or developmental mechanisms). Phylogenetic inference is the task of inferring this history, and as with other problems of inference, there are interesting and difficult questions regarding how these inferences are justified.

In this entry, we examine what phylogenetic inference is and how it works. In the first section, we briefly introduce the field of phylogenetics and its history. In section 2, we explore how phylogenetic inference provides useful problems for philosophers to examine, and where philosophical approaches have contributed to the scientific examination of phylogenetics. This will help display why phylogenetic inference is not merely a biological research problem, but a philosophical one as well. Finally in section 3, we will move to a discussion of some of the contemporary debates about foundational issues in phylogenetics, and what the future of phylogenetic inference looks like.

1. Phylogenetic Inference in Biology

1.1 Primer and Introduction to Phylogenetics

A phylogeny is a reconstruction of evolutionary history. Thus the discovery of evolution is a good starting point for the history of phylogenetics. While Darwin was not the first to propose that some species were genealogically related to others, it was On The Origin of Species (Darwin 1859) that convinced many biologists to accept common ancestry and to start building phylogenies. One of the immediate—and ongoing—impacts was the question this raised in how phylogenies (i.e., reconstructions of evolutionary history) relate to taxonomies (i.e., how we group taxa).

Figure 1 Phylogenetic (or evolutionary) trees have become commonplace in biology research articles. This tree displays a recent hypothesis on the relative placement of lungfish and coelacanth in the evolutionary history of tetrapods (Amemiya et al. 2013: 312, fig. 1). [An extended description of figure 6 is in the supplement.]

Pre-Darwinian taxonomy focused on classifying according to the “natural system” where taxa were united into large groups due to their “natural affinities” (Winsor 1976). In an effort to clarify the concept of affinity, Richard Owen used the term “homologue” to refer to “the same organ in different animals under every variety of form and function” (Owen 1843). Put into an evolutionary framework, the natural affinities uniting these groups were regarded as the result of descent with modification from common ancestors and “homology” has since become a central term in comparative biology (Donoghue 1992; Hall 1994). To see how homology relates to phylogenetic inference, as well as to introduce some basic terminology, it will be useful to consider an example phylogeny (Figure 1).

Figure 1 provides a typical contemporary example of a phylogenetic tree —a branching diagram that displays genealogical relationships. Phylogenetic trees like these include branches (the lines) and nodes (where the lines branch or come together, depending on the direction you are reading the tree). Terminal branches are marked with the entities whose evolutionary relationships are being studied. In this case those entities are species or higher taxa, but trees may be constructed for other taxonomic levels, or any other biological entities whose evolutionary relationship can be studied (e.g., viruses, stretches of DNA, extinct taxa). Phylogenetic methods can even be employed to study ontogeny or cultural evolution, e.g., they have been used to construct cell fate maps to reveal how the developmental parts of a single organism are related (Salipante & Horwitz 2006), and used to reconstruct the expansion of language families and help estimate historical human migration patterns (Gray & Jordan 2000; Gray & Atkinson 2003; Gray, Drummond, & Greenhill 2009). More recently, phylogenetic tools have been used to aid epidemiology studies of COVID-19 (Lemieux et al. 2021).

Since so much of modern biology requires the ability to read phylogenetic trees, good guides are commonplace. This includes journal articles (e.g., Baum, Smith, & Donovan 2005; Baum 2008; Yang & Rannala 2012), online guides (e.g., Understanding Evolution, Other Internet Resources), and more comprehensive textbooks (e.g., Felsenstein 2004; Lemey, Salemi, & Vandamme [eds] 2009; Wiley & Lieberman 2011; Baum & Smith 2013). Here, we provide only a brief introduction, but recommend any of these (among others) to readers looking for a more thorough and technical introduction.

On the phylogenetic tree in Figure 1, time passes from left to right with extant groups labeled on the terminals of branches on the right. [1] Starting from the left there are two distinct branches, one leading to Cartilaginous fish (sharks and rays), and the other to a node. From that node the branches split, with one leading to Ray-finned fishes, and the other leading to another, more recent node (that ultimately leads to frogs, lizards, and mammals, among other taxa). Branches may be read as representing ancestral lineages, with nodes representing common ancestors that diverged (or split) into distinct lineal branches.

We can also read the tree in Figure 1 by starting with the groups on the right and moving leftward, tracing branches back to where they join at nodes. This direction takes us backwards in time, and allows us to read off which groups are more closely related to others by looking at recency of common ancestry. For example, lungfish are more closely related to frogs than they are to pufferfish since lungfish and frogs share a more recent common ancestor with each other than either does with the pufferfish. To see this, trace back along the branches from each group, starting on the right and moving left. You will converge on a node between the lungfish and frogs before reaching the node between either of those taxa and the pufferfish. Another way of saying this is that there is a monophyletic group (or clade) that includes lungfish and frogs that does not include pufferfish.
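The tracing procedure described above can be sketched in code. The tree below is a heavily simplified, assumed version of the branching order described for Figure 1 (it omits most taxa, such as the coelacanth), written as nested tuples. Relatedness is compared by how many nodes two root-to-tip paths share: the more they share, the more recent the common ancestor.

```python
# Simplified, assumed branching order: a string is a tip, a tuple is a node.
tree = ("shark", ("pufferfish", ("lungfish", ("frog", ("lizard", "mouse")))))

def path_to(node, target, path=()):
    """Return the internal nodes passed on the way from the root to a tip."""
    if isinstance(node, str):
        return path if node == target else None
    for child in node:
        found = path_to(child, target, path + (node,))
        if found is not None:
            return found
    return None

def shared_depth(tree, a, b):
    """Count the nodes shared by the root-to-tip paths of a and b."""
    shared = 0
    for x, y in zip(path_to(tree, a), path_to(tree, b)):
        if x is not y:
            break
        shared += 1
    return shared

# Lungfish and frog share more of their history than lungfish and pufferfish.
print(shared_depth(tree, "lungfish", "frog"))
print(shared_depth(tree, "lungfish", "pufferfish"))
```

Note that the comparison uses only the branching order, not how similar the organisms look.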

A monophyletic group consists of an ancestor and all of its descendants. Some familiar groups turn out not to be monophyletic. For example, if we tried to unite pufferfish and sharks into a single group that excluded frogs and other tetrapods (such as the traditional Pisces) we get what is called a paraphyletic group. Individuals in paraphyletic groups all share a single common ancestor, but exclude some descendants of that ancestor. Artificial groups like “flying tetrapods” (birds plus the bats) would be called polyphyletic since they have multiple, distinct origins. Because paraphyletic and polyphyletic groups do not share a single history that is unique just to them (i.e., there is no branch on the tree that leads just to this group), they cannot feature in explanations in the same way that monophyletic groups can, i.e., they lack the utility and explanatory power of monophyletic groups (Velasco 2008a). For example, biologists might be interested in studying whether flower diversity is driven by pollinator syndrome (i.e., moth, bird, or bee) or vice-versa. Phylogenetic principles predict different patterns of monophyly on these competing hypotheses, which can be tested empirically and used by biologists as they design experiments to test for the evidence of selection in cases like these (Whittall & Hodges 2007). This is simply unavailable without the explanatory and organizing principles of monophyly, and one reason it has grown to dominate systematics.

Each node on a tree is the origin of a monophyletic group. For example, the claim that the mammals form a single, united monophyletic group just means that all and only mammals share (i.e., are descended from) a common ancestor. We have very good reasons to believe that this is true. For example, all and only the mammals have certain traits (or characters) such as mammary glands, hair, and ossicles (three bones in the middle ear). Each of these traits is a homologue, i.e., its similarity is due to shared ancestry. (Homologous characters are typically contrasted with analogous characters, i.e., traits whose similarity is not explained by common ancestry, but by some other process, e.g., convergent evolution or reversal). The mammary glands in humans and in elephants are homologous because the trait has been inherited from their common ancestor. In Figure 1, the ancestor of the mammals is located at the node with branches leading to the platypus and the rest of the mammals. Once traits emerge (as evolutionary novelties) on a branch they can be passed down (unchanged or modified) to the descendants of that lineage. Since platypuses and mice have mammary glands, but no taxa stemming away from earlier nodes do, we can infer that mammary glands evolved after the node where mammals and birds split, but prior to the mammalian node. That is one way phylogenetic trees support inferential claims in biology. If mammary glands had evolved prior to the node joining mammals and birds, then we might expect to find mammary glands on birds or lizards. Platypuses and mice are also born inside an amniotic sac, as are turkeys and lizards. However, frogs do not have an amniotic sac. This tells us that the amniotic sac evolved after the frog lineage branched off from the other tetrapods. Parallel reasoning allows us to infer the history of the evolution of bones, tetrapod limb structure, feathers, etc.
These homologies form a nested hierarchy of traits just as the monophyletic groups are nested, so, for example, the mammals are a part of amniotes which themselves are a part of the tetrapods. On the other hand, while the mammals and the birds are both part of the amniotes, they do not overlap at all. And indeed, no organisms have both mammary glands and feathers.
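The reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple tree below is a toy topology built from taxa named in the text (it is not a reproduction of figure 1), and the function names are our own. The sketch checks whether the taxa sharing a trait make up exactly the tips below some node, i.e., whether they form a monophyletic group.

```python
# Toy rooted tree as nested tuples; leaves are taxon names.
# ASSUMPTION: this topology is illustrative, based only on taxa named in the text.
TREE = ("frog", (("lizard", "turkey"), ("platypus", "mouse")))


def leaves(node):
    """Return the set of tip names at or below a node."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def is_monophyletic(tree, group):
    """True if some node's descendant tips are exactly the taxa in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


mammary_glands = {"platypus", "mouse"}
amniotic_sac = {"lizard", "turkey", "platypus", "mouse"}
arbitrary_pair = {"frog", "lizard"}  # hypothetical grouping with no node of its own

print(is_monophyletic(TREE, mammary_glands))  # True
print(is_monophyletic(TREE, amniotic_sac))    # True
print(is_monophyletic(TREE, arbitrary_pair))  # False
```

On this toy tree the taxa with mammary glands sit inside the group of taxa with an amniotic sac, mirroring the nested hierarchy of traits described above.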

Explaining character distributions thus requires knowing which groups are clades (i.e., monophyletic groups). Phylogenetic trees depict precisely that, stating which groups are clades. This information is called the topology of the tree, which conveys the relative order of the nodes and nothing about their absolute dating in time. Knowing which groups are clades allows us to reconstruct the evolution of character traits, and it is the distribution of character traits that is the basic way that we infer which groups are clades in the first place. This style of reasoning (dubbed “reciprocal illumination” by Willi Hennig (1966)) led to charges of circular reasoning from critics of phylogenetic inference (e.g., Sokal & Sneath 1963), whereas advocates have sought to explain why this worry is misplaced (see Wiley 1975; Sober 1988b, among many others). In fact, the nested hierarchy of traits is what led systematists to classify taxa into a nested hierarchy in the first place, and it was the nested hierarchy of the taxonomic system that Darwin took to be the most important evidence for common ancestry (Winsor 2009; Sober 2009, 2010). The problems of understanding what it means for traits to be homologous and of inferring homology will be discussed later (§2.1). Complicating things, groups of homologies do not always form unambiguous nested hierarchies, and sorting out which phylogenetic hypotheses are best supported in these cases is a big part of what generates debates over phylogenetic inference (more on this in §2 and §3).

Explaining character trait distributions is just the tip of the iceberg. Phylogenies are centrally important for all research in evolutionary biology. Phylogenetics lies at the heart of the linking of the fields of systematics [ 2 ] and population genetics. Knowing a phylogeny is an important first step to studying problems in evolutionary biology, functional genetics, comparative anatomy, and evolutionary developmental biology. Just as evolution is the unifying, organizing theme in biology, phylogenetics is the backbone of biological inference more generally. As Sterelny & Griffiths (1999: 379) put it, “Nothing in biology makes sense except in the context of its place in phylogeny”.

1.2 From Darwin to Today: Three Interweaving Histories

For roughly 100 years after Darwin’s Origin , phylogenetic research in biology was common and important. Yet, phylogenetics proceeded without much in the way of underlying theory or explicable methodology—rather, the systematist with extensive knowledge of some group simply relied on their judgment as to which character traits looked genuinely homologous and which character transformations seemed plausible. Some phylogenies were widely agreed upon due to the fact that when homologies are clear, inference is easy. As early as the late 1800s, there was a general consensus among systematists around the branching order of many of the groups presented in Figure 1 . However, the relative placement of the coelacanth and lungfish was disputed almost immediately upon discovery of the former in 1938 (Thomson 1991)—a dispute that has continued to draw attention from biologists (e.g., Bockmann, De Carvalho, & De Carvalho 2013; Amemiya et al. 2013; Takezaki & Nishihara 2017).

The lungfish and the coelacanth appear key to understanding the origins of the tetrapods, which has remained a major question in the reconstruction of evolutionary history since Darwin. But this is just one example among many. Bowler (1996) surveys the development of phylogenetics over this time period and examines a number of these debates in more detail such as whether the arthropods form a monophyletic group, and the question of the origin of birds, and, in particular, how they are related to the extinct dinosaurs.

While there are still important disagreements regarding phylogenies, the nature of these disputes has changed substantially as phylogeneticists no longer rely largely on individual expert judgment. The mid-twentieth century saw the introduction of competing accounts of the theoretical foundations of systematics (§1.2.1), a flurry of new algorithmic methods for constructing phylogenies and taxonomic classifications, and the beginning of relatively easy access to and employment of computational power (§1.2.3), all combined with new sources of evidence in the form of various kinds of molecular data (§1.2.2). As biologists sought to incorporate these emerging features into phylogenetic method and theory, debates arose over how (or whether) this fundamentally changed the nature of systematics. At the same time, debates about the foundations of systematics and the proper methods for classification and taxonomy became highly prevalent and interwoven with arguments about these new methods and newly available data.

Traditionally, the emergence of phylogenetics has been presented largely in the context of debates between three major schools of twentieth century taxonomic thought, often called “The Systematics Wars” ( §1.2.1 ). As centrally important as this was, a singular focus on that history risks overshadowing the impact of the molecular revolution ( §1.2.2 ) and ever-increasing access to greater computing power ( §1.2.3 ) on the establishment of phylogenetics. Below we consider how each of these twentieth century developments impacted the emergence of phylogenetics as a distinct field of research. Yet, there are many overlapping themes cross-cutting these developments (often reinforced by allegiances between individual researchers), suggesting that, ultimately, an integrated approach is needed for a rich, nuanced history of phylogenetics.

Historically, phylogenetics emerged out of the larger field of biological systematics, the field of biology that studies the diversity of life and the relationships of living things through time. Today, systematists typically treat “relatedness” solely in terms of recency of common ancestry, but this was not always the case. Pre-Darwinian taxonomists discussed the relationships of various groups and their place in the “natural system”, and while the rise of evolutionary theory allowed that one sense of relatedness was genealogical, it did not eliminate the idea of the broader notion. Debates about the role of phylogeny in classification and taxonomy were widespread (e.g., Huxley [ed.] 1940; Winsor 1995) though they began to take on a new form beginning in the late 1950s as collaborations turned into organized research programs pushing their agendas.

In his analysis of the period, David Hull (1988: ch. 5) titled one of his chapters “Systematists at War” and thus the name “The Systematics Wars” is sometimes used to describe the debates of the period. Hull (1970) influentially compared three “contemporary systematic philosophies”, typically called pheneticism (or numerical taxonomy, after the seminal Sokal & Sneath (1963) textbook of the same name), cladistics or phylogenetic systematics (after Hennig’s 1966 foundational text, an earlier version of which was published in German in 1950), and evolutionary systematics (following the lead of proponents like Simpson [1961], Mayr [1969], and others).

The pheneticists disputed that we were in a position to know common ancestry with enough certainty to justify adopting it as foundational to our notion of “relatedness”, and maintained that the taxonomists’ job was discovering clusters of similarity (Sokal & Sneath 1963). Pheneticists were critical of what they viewed as the entrenched approach to systematics, arguing that the traditional approach to selecting between hypotheses largely relied on the relative prominence or status of the biologist rather than good scientific methodology. They explicitly framed their views in the context of arguments about what constituted “good science” and sought to promote independent and objective modes of assessing competing taxonomic groupings that relied on transparent and repeatable methods. This included advocating for a greater separation between biological theory and the construction of similarity groups.

Around the same time, another group (phylogenetic systematists) insisted that recency of common ancestry provided the best theoretically-motivated foundation for relatedness (Hennig 1966). Like the pheneticists, they too sought to promote good scientific methods in systematics, and were similarly critical of established approaches. The phylogenetic systematists can be characterized by their insistence that all taxonomic groups above the species level should be monophyletic. Thus, if birds are descended from dinosaurs then they are dinosaurs (Padian & Horner 2002). If the dinosaurs (or the reptiles more generally) are defined in a way that excludes birds, then the group is paraphyletic and thus cannot be a taxon in a phylogenetic taxonomic system. Ernst Mayr derisively named them “cladists” for their obsessive focus on recovering clades (i.e., monophyletic groups) (Mayr 1965a), though its practitioners were happy to embrace the name. When consistently carried out, phylogenetic taxonomic principles led to major revisions in long-standing, traditional classifications—revisions that often generated deep controversy (e.g., Halstead 1978; Gardiner et al. 1979).

Both pheneticism and cladism were developed in opposition to entrenched views in systematics. In response, biologists committed to more traditional methods sought to articulate the theoretical underpinnings of their approach, which came to be known as evolutionary systematics. With Ernst Mayr as their most prominent advocate, evolutionary systematists sought to incorporate phylogeny into classifications, but permitted that classifications might also (or even instead) represent important adaptive differences between groups (Simpson 1961; Mayr 1969). A prominent example that regularly featured in these debates is the taxonomic placement of the birds and crocodiles. Mayr argued that though birds are descended from dinosaurs and reptiles more generally, birds share a cluster of adaptations which justify classifying them as a distinctive group at a taxonomically equivalent rank as reptiles. However, even though crocodiles are more closely related to birds than to lizards and thus belong to a clade with birds that lizards are not a part of, Mayr still classifies crocodiles as part of the reptile “grade”—a group characterized by a well integrated adaptive complex (Mayr 1974). Thus unlike the cladists, evolutionary systematists advocated for the inclusion of reptiles and other paraphyletic groups in taxonomy.

One natural way to understand these debates is to recognize them as systematists sorting out “reconstructing a phylogeny” and “classifying a group of taxa” as two distinct tasks, and working out how (if at all) these tasks ought to be reciprocally informative (Mayr 1965a, 1974; Griffiths 1974; Hennig 1975; de Queiroz & Gauthier 1990, 1992). With respect to this particular issue, cladistics has largely won; today when scientists designate higher taxa they are typically describing monophyletic groups. The principle of monophyly has proven to be such a powerful explanatory and organizing tool that the commitment to it has grown widely and now dominates systematics, in addition to informing both theory and methodology across fields of biology. This is one reason why this period is sometimes referred to as “The Cladistic Revolution” (Hull 1988; Kearney 2007; Haber 2009).

Yet, another framing is available. Namely, that as biologists shifted to using phylogenies (rather than traditional taxonomies) to support their explanations and justify their inferences, disagreements over taxonomic classifications faded into the background. On this account, it is disagreements over phylogenetic hypotheses that are impactful; disputes over classifications become less consequential as they get replaced in scientific inference by phylogenies. Joseph Felsenstein (2004: 145) dubs this the “It-Doesn’t-Matter-Very-Much school”, arguing that “systematists ‘voted with their feet’ to establish this school, long before I announced its existence”. How phylogenetics ought to influence, inform, or constrain taxonomic classifications is still a live debate (see Ereshefsky 2001 as well as debates over the adoption of the PhyloCode, among other debates about phylogenetic taxonomy), though one that is orthogonal to the inference of phylogenies. Contemporary debates on taxonomic classification are largely over how to draw inferences from phylogenies for classifications; there is very little dispute over the coherence of reconstructing phylogenies. Biological classification, on this account, is not viewed as a competitor to phylogenetics, but as dependent on it, and not the other way around. Phylogeneticists, like Felsenstein, can largely proceed in reconstructing phylogenies without paying attention to classificatory questions, though some of the metaphysical questions about the units of phylogenetics can come back to impact the inference of phylogeny itself (see §2.6.2).

As an important aside, we say ‘largely’ in the previous paragraph because, as in many cases in biology, there is not unanimity. Notably, biologists who identify as pattern cladists would dispute many of the characterizations we have offered about phylogenetics, including the very first sentence of this entry, i.e., “a phylogeny is a reconstruction of evolutionary history.” Though there are a number of views that might be identified with pattern cladism, all would typically regard that claim as unjustifiably reading a process into what is largely a claim about patterns only. On pattern cladism, phylogenetic trees are typically regarded as graphical representations of evidence from characters; in that regard, a phylogeny would just be a classification of those characters. For the pattern cladist, directly inferring process or evolutionary history from phylogenies (or, worse, building evolutionary process assumptions into phylogenetic inference) is a mistake. They defend this for numerous reasons, often on philosophical grounds, i.e., that pattern cladism embodies good scientific inference, or that evolutionary claims require additional inferential steps that need to be made explicit (Farris 1983). This has contributed to a split in the larger cladistics community. One upshot of this process/pattern split is that the term ‘cladist’ has become ambiguous, contested, and imprecise, and it is not always clear who gets regarded as what kind of cladist (Carpenter 1987; The Editors 2016; Quinn 2017; Brower 2018a; Williams and Ebach 2018). For example, while pattern cladists reject any a priori assumptions about evolution (Brower 2019), so-called process (or, sometimes, phylogenetic) cladists regard evolutionary claims as foundational to phylogenetics, while others (e.g., statistical phylogeneticists) have distanced themselves from using ‘cladist’ as a label altogether (see §2.3 and the subsection §2.3.1 for more on this disputed taxonomy).
We recommend that philosophers writing in this area carefully specify the precise sense of ‘cladist/m’, if they use that term at all, and caution against using it interchangeably with ‘phylogenetics’.

Philosophers have largely focused on topics in systematics other than the pattern/process split. This is somewhat surprising as there are rich philosophical topics that could be fruitfully unpacked or drawn on as data by philosophers of science, especially for those interested in the nature of scientific explanation and inference, or the role of theory in biological methodology. For foundational literature on process versus pattern cladism see Beatty’s (1982) introduction of the designation of pattern cladism, with responses by Brady (1982), Patterson (1982), and Platnick (1982). Two good recent pattern cladist textbooks are Williams & Ebach (2020) and Brower & Schuh (2021). There is an expansive though well-contained literature within cladistics featuring debates between pattern and process cladists. Carpenter (1987), Brower (2000, 2002, 2019), and Lee (2002) are just a very small selection of this literature that philosophers interested in the justification of scientific inference and methodology might explore.

Regardless of whether disputes about taxonomic philosophy died down because systematists came to agree or because they became less important, it is a mistake for philosophers to treat the Systematics Wars as a live dispute, as opposed to a historical one. Yet, it remains an important episode in the history of systematics, and the core conceptual debates left their mark on contemporary debates in phylogenetics (Haber 2009). At stake was not merely what notion of “relatedness” systematists ought to adopt, but what constituted good science in the context of inferring history and constructing classifications, and the relative value (and even coherence) of concepts like objectivity, testability, and repeatability in science. Systematists no longer dispute whether phylogenetic inference may be justified as a scientific activity; rather, they dispute how best to carry it out. A large part of why this change took place has to do with the availability of new sources of molecular data and the introduction of new methods to take advantage of them. It is to this history that we now turn.

Like Felsenstein, Sterner and Lidgard (2018) encourage historians and philosophers of biology to “move past the systematics wars”. Their point is not that this was unimportant, but that a singular focus on it tends to overshadow the impact of other important historical advances during this period. We agree. To remedy this, Sterner and Lidgard suggest paying attention to the practice of systematists in this period, rather than more narrowly focusing on definitional disputes (embodying the so-called philosophy of science in practice approach (Ankeny, Chang, Boumans, & Boon 2011)). Doing so reveals that alongside the Systematics Wars were the dual methodological revolutions of the mathematization of systematics and the incorporation of molecular data. These trends cut across the separate schools of systematics, generating their own set of controversies. This is particularly acute in phylogenetics, where the rise of the use of mathematical and computational tools was both enthusiastically embraced and deeply resisted. We start with a short description of the impact of molecular data on phylogenetics, before moving to how mathematical and statistical tools were incorporated into phylogenetic methods in section 1.2.3 (for other histories on these topics see Dietrich 1994, 1998; Hagen 2001; Sterner & Lidgard 2018).

Let’s first consider the appeal of including molecular data in phylogenetic analyses, perhaps even to the exclusion of morphological characters (e.g., Scotland, Olmstead, & Bennett 2003, though see Wiens 2004). One of the most important aspects of the molecular revolution for phylogenetics was the sheer amount of new data available to systematists, including for groups such as bacteria, where fossil data were scarce or severely lacking. Yet it was not merely the scale and availability of molecular data that held such appeal. As it became easier to collect, code, and collate these data, biologists argued that molecular phylogenetics provided a source of evidence independent of morphological studies, one that bolstered evidence for common ancestry and patterns of phylogeny (Zuckerkandl & Pauling 1965). In contrast, more traditional systematists such as Theodosius Dobzhansky, Ernst Mayr and George Gaylord Simpson were skeptical of molecular evolution studies quite generally, and, in particular, of the idea that they were in some way superior to or could replace morphological studies (Dietrich 1998).

Many of the earliest molecular studies typically supported established phylogenies, e.g., Margoliash (1963) discusses the evolution of cytochrome c , though simply presumes the received phylogeny of species as correct. Later, Fitch & Margoliash (1967) construct a phylogeny solely from the cytochrome c data, primarily to demonstrate molecular phylogenetic methods and how to test the reliability of results. Interestingly, one of the “anomalies” they mention is that turtles are placed closer to birds than to snakes—but this turns out to be no anomaly! The vast majority of molecular studies have since placed the turtles as a sister group to the archosaurs (a group which includes the birds and the crocodiles) including the largest genome-scale studies done to date (Crawford et al. 2015).

The case of the phylogeny of turtles as well as other recent molecular studies (e.g., of the lungfish and the coelacanth, see Amemiya et al. 2013; Takezaki & Nishihara 2017) illustrate a general point. Even in groups where we have extensive fossils and easily accessible living specimens with detailed anatomical and morphological studies, molecular evidence has often been taken to settle disputes and even to overturn previously widely held views. This is in part due to the sheer volume of molecular data, which can overwhelm other sources of evidence, but also because of the way molecular data may be viewed as evolving “neutrally”, i.e., as the result of mutation and genetic drift rather than natural selection (Duret 2008), and, thus, be less prone to displaying analogies (the result of convergent evolution), which can be mistaken for homologies (the result of descent from common ancestry). (An analogy is the similarity relation between two analogous characters, and is a kind of “homoplasy” relation, i.e., characters whose similarity is not due to descent from common ancestors.)

Analogies are false positives of evidence of common ancestry. Morphological traits, it is presumed, are more subject to selective pressures that can generate these false positives, and so molecular data from sources less subject to selective pressures (i.e., neutral sites) are preferable for reconstructing phylogenetic history. From those phylogenies, analogies can be better identified, offering evidence for appeals to natural selection (in the form of convergent evolution or other evolutionary processes) as an explanation for the pattern of biodiversity and trait distribution (Whittall & Hodges 2007 provides a good example of this reasoning in regard to pollinator mode and morphology). Put another way, molecular data helps biologists explain why what appear to be homologies fail to form nested hierarchies. Molecular data can identify which of those might be analogies instead, and this helps resolve what would otherwise appear to be conflicts in phylogenetic inference. This is one way biologists use phylogenies to offer data-driven, theoretically-grounded evolutionary explanations, in contrast to “just-so” stories.

In addition to adding more evidence in cases where systematists already had a great deal of it, molecular studies can shed light on cases where other sources of evidence are extremely weak. For example, molecular data may preserve deep phylogenetic history otherwise obscured at the morphological level by millions of years of adaptive changes. Phylogeneticists depicting the very earliest branches in plant and animal phylogenies, as well as relationships between the different groups of eukaryotes, recognized how molecular data could both test and enrich these deep phylogenetic hypotheses (Kenrick & Crane 1997; Baldauf 2003). Subsequent molecular phylogenetic analyses have demonstrated the fruitfulness of using molecular data (e.g., amino acid sequences, RNA transcriptomes, and nuclear and mitochondrial genomes) for constructing deep phylogeny hypotheses (Heckman et al. 2001; Dunn, Giribet, Edgecombe, & Hejnol 2014).

Emile Zuckerkandl and Linus Pauling’s introduction and use of the hypothesis of a molecular clock provides a good example of the powerful utility of molecular data and was an important early episode in the history of molecular phylogenetics. The term “molecular clock” was coined in Zuckerkandl & Pauling (1965), but they had already used it without naming it in earlier publications (Zuckerkandl & Pauling 1962; Pauling & Zuckerkandl 1963). The basic idea is fairly simple. Given some estimated rate of molecular evolution, the relative ages of branches and nodes may be estimated given some set of molecular data mapped onto a phylogenetic topology. Thus, molecular models of evolution were incorporated into phylogenetic methods ( §1.2.3 ).

As more sequences of proteins became available, Zuckerkandl and Pauling noted that homologous proteins displayed slight yet systematically different sequences across taxa. For example, while there were 18 amino acid differences between the human and horse hemoglobin α-chains, there were thought to be only two differences between gorillas and humans. [ 3 ]

Zuckerkandl and Pauling then assumed that amino acid changes were roughly linearly proportional to time, and thus that the time since the divergence of humans and horses was nine times greater than that since the divergence of humans and gorillas. Using paleontological estimates for calibration, Zuckerkandl and Pauling concluded that humans and gorillas diverged a mere 14.5 million years ago, much more recently than was traditionally thought. This line of inference is not limited to molecular phylogenetics, but has been imported for application to other data types, e.g., similar reasoning from phylogenetic techniques has been applied to language-tree divergence times to estimate Indo-European and Polynesian human migration (Gray & Jordan 2000; Gray & Atkinson 2003; Gray, Drummond, & Greenhill 2009). (For more on the early development of the molecular clock, see Morgan 1998.)
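The arithmetic behind this inference can be made explicit with a short Python sketch. The two difference counts come from the text; the calibration date for the human-horse split is a hypothetical placeholder chosen only for illustration, not the paleontological figure Zuckerkandl and Pauling used, and the function name is our own.

```python
def clock_divergence_time(n_diff, n_diff_calibration, t_calibration_myr):
    """Scale a calibrated divergence time by the ratio of sequence differences.

    Assumes differences accumulate linearly with time (a strict clock).
    """
    return t_calibration_myr * n_diff / n_diff_calibration


HUMAN_HORSE_DIFFS = 18    # alpha-chain differences, from the text
HUMAN_GORILLA_DIFFS = 2   # alpha-chain differences, from the text
CALIBRATION_MYR = 130.0   # HYPOTHETICAL human-horse divergence date (million years)

ratio = HUMAN_HORSE_DIFFS / HUMAN_GORILLA_DIFFS
estimate = clock_divergence_time(
    HUMAN_GORILLA_DIFFS, HUMAN_HORSE_DIFFS, CALIBRATION_MYR
)

print(ratio)               # 9.0
print(round(estimate, 1))  # 14.4
```

The estimate depends directly on the calibration date: doubling the placeholder doubles the inferred human-gorilla divergence time, which is why the choice of paleontological calibration matters so much to clock-based dating.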

Later, it was thought that the neutral theory of evolution [ 4 ] provided some theoretical justification for the molecular clock hypothesis (Kimura 1968, 1969; Duret 2008), though Dietrich (1994) argues that the relationship goes in the other direction with Zuckerkandl and Pauling’s research providing an essential foundation for the development of the neutral theory in the first place.

The molecular clock is but one of many examples of the importance of molecular biology to phylogenetics. Importantly, incorporating molecular data helped extend phylogenetic studies beyond eukaryotes, i.e., the molecular revolution has fundamentally altered (or, arguably, created) the field of microbial phylogenetics. In 1969, Carl Woese began to sequence ribosomal RNA (rRNA) from across the prokaryotic kingdom in an effort to understand bacterial phylogeny. The work was slow and difficult, but by 1975 Woese and his lab had sequenced the rRNA of 27 different bacterial taxa (Woese et al. 1975). In 1977, Woese and George Fox dropped a bombshell. They had sequenced genes from a number of prokaryotic organisms whose rRNA genes lacked the standard signatures of all previously known bacterial rRNA genes and which appeared as distantly related to the bacteria as did the eukaryotes. Woese and Fox named them the “Archaebacteria” to emphasize their distinctness (Woese & Fox 1977). In 1990, Woese, Kandler, and Wheelis (1990) produced the first attempt at building a universal tree of life that included bacteria, archaebacteria, and the eukaryotes based on molecular data, with the project continuing to this day (Ciccarelli et al. 2006; Hinchliff et al. 2015; Hug et al. 2016). All of this is only possible because of molecular sequence data.

The introduction of molecular data was highly significant in phylogenetics. In addition to the amount of data it generated, and the way molecular data provided evidence where other data were scarce, this new type of data was also significant in how it promoted the mathematization of phylogenetics. Molecular data lends itself to mathematical and statistical analysis, in part due to (a) the ease by which molecular data may be encoded; (b) the systematic way molecular evolution may be mathematically modeled and subsequently incorporated into statistical methods; and (c) the manner by which mathematical and statistical tools help analyze the large volume of molecular data under consideration. So though the molecular revolution and mathematization of systematics need not go hand-in-hand, it is hardly surprising that they do.

A number of authors have discussed the rise of mathematical thinking and computational tools in systematics, especially in connection with numerical taxonomy and phenetic classification (e.g., Hagen 2001; Sterner & Lidgard 2014, 2018). Here we focus on the rise of statistical methods for inferring phylogenies and later (§2) we discuss the philosophical debates that arose regarding the use of these methods. (Good and more thorough textbook introductions to phylogenetic methods include Felsenstein (2004), Lemey et al. (2009) and Baum & Smith (2013). Yang and Rannala (2012) provide a helpful recent review in the context of molecular phylogenetics.)

Incorporating molecular data both encourages and is facilitated by the adoption of mathematical tools. We got a hint of that in the discussion of molecular clocks (§1.2.2), which use estimated rates of molecular evolution to estimate divergence times. Around the same time, Anthony Edwards and Luca Cavalli-Sforza (both students of R.A. Fisher) began thinking of phylogenetic inference as a problem for statistics. After initially working independently on the problem, they collaborated to produce what is perhaps the first set of papers presenting phylogenetic inference in a statistical framework (as attributed by Felsenstein 2001, 2004; Edwards’ own recollections of this work can be found in Edwards 1996).

Among the earliest algorithmic methods for inferring phylogenies are those known as “distance” methods. “Distance” could represent whichever character coding scheme we might be using to construct the tree, molecular or morphological, e.g., the number of nucleotide differences between two sampled sequences of DNA. A distance method uses these pair-wise distances to infer the phylogeny.
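As a minimal sketch of the input to a distance method, the Python below counts nucleotide differences between every pair of aligned sequences. The taxon names and sequences are made up for illustration, and raw difference counts are only one of several possible distance measures.

```python
from itertools import combinations

# HYPOTHETICAL aligned DNA sequences of equal length (invented for illustration).
SEQS = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACGAACGTTG",
}


def count_differences(s1, s2):
    """Number of aligned sites at which two sequences differ."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(s1, s2))


def pairwise_distances(seqs):
    """Map each unordered pair of taxa to its difference count."""
    return {
        (x, y): count_differences(seqs[x], seqs[y])
        for x, y in combinations(sorted(seqs), 2)
    }


distances = pairwise_distances(SEQS)
print(distances[("taxon_A", "taxon_B")])  # 1
print(distances[("taxon_A", "taxon_C")])  # 3
print(distances[("taxon_B", "taxon_C")])  # 2
```

A distance method takes a table like `distances` as its only input; the individual character states are discarded once the pairwise values are computed.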

The first distance method developed is among the best justified statistically, namely, the least-squares method of Cavalli-Sforza and Edwards (1967). Here each topology has a set of “best-fitting” branch lengths where fit is measured by summing the squared differences between the expected distance and the actual distance between each pair of taxa. The best tree topology is the one whose best fit branch lengths are a better fit than any other tree’s best fit.
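In symbols, the criterion just described can be written as follows. The notation is ours rather than the original authors': \(D_{ij}\) is the observed distance between taxa \(i\) and \(j\), and \(d_{ij}(T, b)\) is the distance expected on topology \(T\) with branch lengths \(b\), i.e., the sum of the branch lengths on the path connecting \(i\) and \(j\). This is the unweighted form; weighted variants of the sum are also possible.

\[
Q(T) = \min_{b} \sum_{i<j} \bigl( D_{ij} - d_{ij}(T, b) \bigr)^{2},
\qquad
\hat{T} = \arg\min_{T} Q(T)
\]

The inner minimization finds the best-fitting branch lengths for a fixed topology, and the outer one compares topologies by their best fits.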

An alternative class of methods is known as parsimony methods. Here we simply need a measure of how many steps it takes for one character state to transform into another. Recall that different tree hypotheses generate different explanations of the current character distributions. The basic idea behind parsimony is that the tree hypothesis that explains these character distributions with the fewest evolutionary changes is to be preferred.
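To make the counting concrete, here is a minimal Python sketch that scores one character on two competing rooted binary trees. It uses a Fitch-style count for unordered character states, which is one standard variant and not necessarily the procedure of any particular paper cited here. The trees, the 0/1 coding of the character, and the function names are illustrative assumptions.

```python
def fitch(node, states):
    """Return (candidate ancestral states, minimum changes) for a subtree.

    `node` is a taxon name or a 2-tuple of subtrees (rooted binary tree).
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree: at least one change occurred on a branch below.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, states):
    """Minimum number of state changes needed to explain `states` on `tree`."""
    return fitch(tree, states)[1]


# HYPOTHETICAL coding: 1 = mammary glands present, 0 = absent.
MAMMARY = {"frog": 0, "lizard": 0, "turkey": 0, "platypus": 1, "mouse": 1}

TREE_1 = ("frog", (("lizard", "turkey"), ("platypus", "mouse")))
TREE_2 = ("frog", (("lizard", "platypus"), ("turkey", "mouse")))

print(parsimony_score(TREE_1, MAMMARY))  # 1
print(parsimony_score(TREE_2, MAMMARY))  # 2
```

On this single character, the first tree needs one change and the second needs two, so parsimony prefers the first. In practice the scores are summed over many characters before trees are compared.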

The first parsimony algorithm was published in Camin and Sokal (1965) and a number of variants of parsimony have since been developed. For example, one might want to weight certain kinds of changes relative to others or even disallow certain kinds of changes (such as reversals) altogether. A survey of such methods as well as a collection of different algorithms for carrying them out can be found in Felsenstein (2004). Parsimony was the chosen method for the cladistic school, which is part of the reason that the history sketched in “The Systematics Wars” (§1.2.1) above is intertwined with the history of mathematization sketched here. Philosophical arguments for and against the superiority of parsimony as an inference method will be discussed later in section 2.3.

If we think of phylogenetic inference as an instance of statistical inference more generally, then likelihood and Bayesian methods spring to mind as methods that exemplify more general, statistical paradigms. Both require calculating probabilities and thus will require models of evolutionary change, i.e., some description of the process of evolution and an assignment of probabilities to possible changes in character states (such as from one nucleotide to another). Let’s briefly walk through what that means, as how and whether to incorporate models of evolution in phylogenetic inference has been a major topic of debate.

If the data are observed nucleotide sequences from different groups of terminal taxa, then the calculation we are after is the probability that those observed sequences of nucleotides would arise on the tips of competing phylogenetic hypotheses. The simplest way to calculate that is to use the Jukes-Cantor model of evolution (Jukes & Cantor 1969). This model assumes an equal frequency of the four nucleotides, and a uniform probability that any one nucleotide might change to another. (These assumptions are typically violated, but reflecting that in a model of evolution introduces additional parameters, which increases computing time—a valuable and limited resource in 1969.) The Jukes-Cantor model provides a model of evolution on which the relevant probability that any given sequence might have evolved from another more primitive sequence may be calculated. Different tree topologies will generate different probabilities that the observed sequences evolved from a hypothetical ancestral state. This conditional probability, \(P(\text{Data} | \text{Tree})\), is called the likelihood of the tree.
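A small sketch may help show what the Jukes-Cantor assumptions buy us computationally. It uses the standard closed-form substitution probabilities for this model, with branch length `d` measured in expected substitutions per site. The function names and the toy sequences are assumptions made for illustration.

```python
import math

NUCLEOTIDES = "ACGT"

def jc69_prob(start, end, d):
    """Probability that `start` has become `end` after a branch of length d
    (expected substitutions per site) under the Jukes-Cantor model."""
    decay = math.exp(-4.0 * d / 3.0)
    if start == end:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay

def pair_log_likelihood(seq_x, seq_y, d):
    """Log-probability of two aligned sequences separated by total length d.

    Each site contributes the equilibrium frequency of its first base (1/4)
    times the probability of ending at the second base.
    """
    return sum(math.log(0.25 * jc69_prob(x, y, d)) for x, y in zip(seq_x, seq_y))

# Every row of substitution probabilities sums to one.
row_sum = sum(jc69_prob("A", end, 0.3) for end in NUCLEOTIDES)

# Two nearly identical (hypothetical) sequences are better explained
# by a short branch than by a long one.
short = pair_log_likelihood("ACGTACGT", "ACGTACGA", 0.1)
long_ = pair_log_likelihood("ACGTACGT", "ACGTACGA", 2.0)
print(row_sum, short, long_)
```

Because every nucleotide is treated alike, a single number `d` fixes all sixteen substitution probabilities, which is what kept the model computationally cheap.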

Though the Jukes-Cantor model is still used, more sophisticated models of evolution are available, e.g., the K80 (Kimura 1980), HKY85 (Hasegawa, Kishino, & Yano 1985) and Tamura-Nei (Tamura & Nei 1993) models of evolution. These introduce more parameters, e.g., assumptions about equal frequencies of nucleotides and/or the uniform probability of changing nucleotide states may be relaxed and even customized; models may also incorporate molecular clock assumptions.

These parameter-rich models reflect a better understanding of molecular evolution (e.g., that transversion and transition [ 5 ] rates between nucleotides are rarely equivalent) and the combination of lowered costs and increasing accessibility to growing computational power that permits more parameter-rich analyses. The large number of models available leads to an interesting statistical problem in its own right known as the problem of model selection, i.e., which model of evolution do we use in our phylogenetic tools? For discussions of this problem in the context of phylogenetics, see Posada & Crandall (1998), Sullivan & Joyce (2005), and Posada (2008).
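One widely used way of framing the model selection problem is the Akaike information criterion (AIC), which rewards fit but charges a penalty for each free parameter. The sketch below is an illustration under stated assumptions and is not attributed to any of the works cited above: the log-likelihood values are invented, and the parameter counts cover only the substitution model (branch lengths are assumed to be shared across models).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical maximised log-likelihoods for one data set and one tree.
# Free substitution parameters: JC69 has none, K80 adds a
# transition/transversion ratio, HKY85 adds three free base frequencies,
# and Tamura-Nei splits the transition rate in two.
candidates = {
    "JC69": (-2050.0, 0),
    "K80": (-2030.0, 1),
    "HKY85": (-2025.0, 4),
    "TN93": (-2024.8, 5),
}
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

With these invented numbers the most parameter-rich model fits best in raw likelihood but is not selected, because its small gain in fit does not pay for the extra parameter.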

Once we choose a model or collection of models to work with, we can calculate probabilities conditional on these additional parameters (\(\theta\)). This likelihood may be formally represented as:

\[L(\text{Tree}, \theta) = P(\text{Data} | \text{Tree}, \theta)\]

As a reminder, the likelihood of a hypothesis given a set of data is just equal to the conditional probability of those data given that hypothesis. On these model-based phylogenetic approaches, the data are conditional on both a particular phylogenetic tree hypothesis and the stipulated parameters.

Using different values for the parameters in your model will lead to different likelihoods for the tree. The maximum likelihood estimate is the tree hypothesis along with the fitted parameters which maximizes this likelihood. Felsenstein (1981) gave the first computationally feasible algorithm for calculating likelihoods, and with modern computing power combined with the huge amounts of sequence data available, likelihood methods have become the most commonly used methods in phylogenetics.
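The idea of fitting parameters by maximizing the likelihood can be shown in the simplest possible case, where the "tree" is a single branch joining two sequences and the only parameter is its length. This is a sketch under the Jukes-Cantor model with invented site counts. A real analysis would also search over topologies, which this example does not attempt.

```python
import math

def log_likelihood(d, n_sites, n_diff):
    """Jukes-Cantor log-likelihood of branch length d for two aligned
    sequences that differ at n_diff of n_sites positions."""
    p_same = 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)
    p_diff = -0.25 * math.expm1(-4.0 * d / 3.0)  # equals 0.25 - 0.25 * exp(-4d/3)
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))

def maximise(f, lo, hi, tol=1e-8):
    """Ternary search for the maximum of a single-peaked function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

n_sites, n_diff = 20, 2  # hypothetical counts
d_numeric = maximise(lambda d: log_likelihood(d, n_sites, n_diff), 1e-6, 5.0)

# For this model the maximum is also known in closed form.
d_closed = -0.75 * math.log(1 - 4 * (n_diff / n_sites) / 3)
print(d_numeric, d_closed)
```

The numerical search and the closed-form estimate agree, which is a useful sanity check before moving to models where no closed form exists.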

Just as with likelihood methods, Bayesian methods require a model of evolution before any inferences are possible. The fundamental difference is that likelihood inference treats parameters of the model as nuisance parameters which have a fixed but unknown value. In the frequentist method of maximum likelihood, these parameters are set to their best fitting value. Bayesian inference treats parameters as random variables with unknown distributions and probabilities are used as a measure of uncertainty. The probability of a tree is a conditional probability \(P(\text{Tree} | \text{Data})\) under the assumption of a particular model. When we incorporate the model parameters \(\theta\), by Bayes’ theorem, we have:

\[P(\text{Tree}, \theta | \text{Data}) = \frac{P(\text{Data} | \text{Tree}, \theta) \, P(\text{Tree}, \theta)}{P(\text{Data})}\]

Controversially, this requires that we attach prior probabilities to the possible hypotheses as well as the parameters in our model. Pickett and Randle (2005) claim that it is impossible to assign priors in a consistent way. Velasco (2007) argues that this claim rests on a mistake, though the problem remains a difficult one. He later advocates assigning priors to tree topologies, but leaves open the more difficult problem of priors on branch lengths or model parameters (Velasco 2008b). Alfaro and Holder (2006) attempt to address some of these issues.
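The role that priors play can be seen in a toy calculation over three candidate topologies. The likelihood values below are invented and are assumed to have already been averaged over the model parameters, so that only the prior on topologies remains. The two priors are hypothetical and are not drawn from any of the authors cited above.

```python
def posterior(likelihoods, priors):
    """Posterior probability of each tree: likelihood times prior, renormalised."""
    joint = {tree: likelihoods[tree] * priors[tree] for tree in likelihoods}
    evidence = sum(joint.values())  # plays the role of P(Data)
    return {tree: value / evidence for tree, value in joint.items()}

# Hypothetical likelihoods P(Data | Tree) for three topologies.
likelihoods = {"AB|CD": 2.0e-12, "AC|BD": 5.0e-13, "AD|BC": 5.0e-13}

flat_prior = {tree: 1 / 3 for tree in likelihoods}
skewed_prior = {"AB|CD": 0.2, "AC|BD": 0.4, "AD|BC": 0.4}

post_flat = posterior(likelihoods, flat_prior)
post_skewed = posterior(likelihoods, skewed_prior)
print(post_flat)
print(post_skewed)
```

The same data give the first topology a posterior of about two thirds under the flat prior but only one half under the skewed one, which is why the choice of prior attracts so much scrutiny.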

Anthony Edwards (1970) was the first to discuss Bayesian methods in phylogenetics, but it was computationally impossible to carry these inferences out at the time. In the 1990s, three groups independently developed methods for carrying out Bayesian inferences of phylogenies in practice (Rannala & Yang 1996; Mau & Newton 1997; Yang & Rannala 1997; Li, Pearl, & Doss 2000). Each group used Bayesian Markov chain Monte Carlo (MCMC) algorithms to estimate posterior probabilities, which other groups further refined (e.g., Larget & Simon 1999; Ané, Larget, Baum, Smith, & Rokas 2007; Huelsenbeck, Ané, Larget, & Ronquist 2008). (See Archibald, Mort, & Crawford 2003 for an early and non-technical primer on Bayesian phylogenetic methods.) With further statistical and computational improvements, Bayesian methods are now fairly commonplace in the literature alongside likelihood methods.

2. Phylogenetic Inference and Philosophy

Section 1 served as an introduction to the history and problem of phylogenetic inference. In this section, we will look at how phylogenetics and philosophy are intertwined—in particular, we will examine some foundational debates in phylogenetics that have a distinctly philosophical flavor, and we will point out more traditional philosophical questions that the study of phylogenetics can shed light on.

At its core, phylogenetic inference is about evaluating competing hypotheses. In an important sense, phylogeneticists are faced with what philosophers of science would identify as a problem of underdetermination of theory by evidence. Multiple competing phylogenetic trees can explain the same data, though in conflicting ways; it is the phylogeneticist’s task to identify which of those hypotheses best explains those data. It will be useful to separate the problem into two parts:

  • identifying what the evidence is; and
  • constructing a phylogeny to explain the evidence we have.

In the case of phylogenetic inference, we are in the philosophically interesting and puzzling situation where it seems that these two questions cannot really be separated.

The question of which method of phylogenetic inference is best justified is clearly important, but skips over an important step in phylogenetic inference—the construction of a data matrix of characters. Mishler (2005), for one, argues that it is the most important step! Following Winther (2009), we can separate character analysis which results in the construction of a data matrix from phylogenetic analysis which treats the data as input and produces a phylogeny. This first stage of character analysis has been widely discussed in biology (though surely less widely discussed than the second stage). In addition to the voluminous literature on homology, useful collections of work on characters include those edited by Wagner (2000) and Scotland & Pennington (2000).

Character analysis includes identifying characters of the organisms we are looking at, and in particular, identifying when a character in one organism is the same character as in another organism. This is the problem of homology. A different, but related problem, is assigning character states. For example, one bear might have black fur while another has white fur. Here, “fur” is the character, but it is in a different state (fur color) in the two organisms. Though it is standard to treat characters and character states separately, the extent to which these are genuinely different problems rather than aspects of the same problem is a matter of debate (Stevens 2000; Freudenstein 2005; Sereno 2005).

Identifying and coding characters is certainly a biological problem, but it is a philosophical one as well. On what basis can we judge that one character is homologous to another? Debates on this extend back to the very beginning of contemporary biology, e.g., Richard Owen’s influential pre-Darwinian notion of homologue to Ernst Haeckel working to develop accounts of homology in the context of Darwinian evolution. In the mid-twentieth century, the pheneticists argued that character identification must be “theory-free” and advocated using raw similarity as a guide to identifying homology (Sokal & Sneath 1963; Sneath & Sokal 1973). There are well known philosophical and biological problems with trying to use similarity in this way (e.g., Goodman 1972), many of which were raised as objections to strict pheneticism during the Systematics Wars (e.g., Mayr 1965b; Neff 1986; Rieppel & Kearney 2002 though see Lewens 2012 for a recent defense of pheneticism in a contemporary context).

In a phylogenetic context, homologies are regarded as the result of descent with modification from common ancestors. Similarity and identity, in that context, are expressions of shared evolutionary history. At the same time, homologies are also treated as hypotheses from which we can infer phylogenetic hypotheses (and, subsequently, both test and explain our hypotheses of homology). Clearly, even in a contemporary phylogenetic context, questions remain on how to individuate characters and identify homologies. Richards (2002, 2003) worries that since there is no algorithm for individuating characters, we rely on illegitimate, nonscientific factors rendering character identification and phylogenetic analysis ultimately subjective. Rieppel and Kearney (2007) and Winther (2009) respond that character analysis can be considered objective since it relies on various kinds of causal criteria; Kendig (2016) adopts a slightly different strategy, rejecting the idea that we need to develop an analytic definition of “homology”, instead looking to biological practice as a guide for individuating characters.

While character analysis and phylogenetic analysis are logically separate tasks, in theory they should be intertwined. A given phylogeny can make an assessment of homology more or less plausible. One proposal to avoid this problem is the introduction of a distinction between “primary” homologies (an initial character identification based on structural features) and “secondary” homologies (similarity due to common ancestry) which are inferred through phylogenetic analysis (Pinna 1991; see Roffe 2020 for a recent philosophical treatment of this distinction). On the other hand, Rieppel and Kearney (2002) argue that character hypotheses need to be independently testable and phylogenetic analysis can clearly provide evidence for miscoding of characters—even if this is not the intended goal.

Molecular data presents a different set of challenges for individuating characters and coding them into a matrix, nicely illustrated by DNA sequence data. With only four base nucleotides, coding is easy (§1.2.3), and modern laboratory techniques provide highly reliable sequences of nucleotides in a stretch of DNA. But here, the problem is alignment. How do we know that one stretch of nucleotide sequences should correspond to similar stretches in other taxa? We might consider chromosomal location, or even functional similarities, though each of these raises other questions of how we decide what ought to be treated as homologous characters (even contingently) across taxa. This becomes even more complicated as we consider how historical processes like gene duplication/loss or recombination add complexity to these questions. This is the problem of positional homology. Like morphological character coding, molecular alignment is intertwined with phylogenetic analysis and various methods of joint inference have been proposed (Redelings & Suchard 2005; Wheeler et al. 2006).

Bearing in mind what we said above about the interconnectedness of character analysis and phylogenetic analysis, we will now proceed to the second part, analysis of the data. At this stage, computational limits loom large. Even if we are concerned only with inferring the topology of a phylogeny, the number of possible trees grows exponentially relative to the number of terminal branches we include on a tree. The formula for calculating the number of possible tree topologies is relatively straight-forward (Felsenstein 2004):

\[\frac{(2n-3)!}{2^{n-2}(n-2)!}\]

where n is the number of taxa. The upshot is that even a small expansion of terminal branches results in an exponential increase in the number of possible topologies. So, for example, there are 15 possible phylogenetic trees for a group of 4 species, over 34 million for 10 species, and \(8 \times 10^{21}\) with 20 species. [ 6 ]
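The figures just quoted can be checked with a few lines of code. The sketch assumes the count is for rooted, strictly bifurcating trees with labelled tips, given by \((2n-3)!/(2^{n-2}(n-2)!)\), which reproduces the values of 15 and roughly 34 million stated above. The function name is an assumption.

```python
from math import factorial

def rooted_tree_count(n_taxa):
    """Number of distinct rooted, bifurcating topologies for n_taxa labelled tips."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return factorial(2 * n_taxa - 3) // (2 ** (n_taxa - 2) * factorial(n_taxa - 2))

for n in (4, 10, 20):
    print(n, rooted_tree_count(n))
# rooted_tree_count(4) is 15 and rooted_tree_count(10) is 34459425
```

Integer division is exact here because the formula always yields a whole number, and Python's arbitrary-precision integers avoid any overflow for large numbers of taxa.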

This means that exhaustive evaluation of all possible hypotheses very quickly becomes all but impossible—even as our capacity for this has grown with increased computing power. Finding the most parsimonious tree, the maximum likelihood tree, and even doing a multiple sequence alignment to begin with are all NP-complete problems [ 7 ] (Graham & Foulds 1982; Wang & Jiang 1994; Chor & Tuller 2005). Bayesian inference is similarly computationally intractable (Ronquist, Mark, & Huelsenbeck 2009).

In response, researchers have developed heuristic search strategies to explore possible tree space. For example, in Bayesian analysis the probability space of possible trees is so large that starting by randomly selecting a tree will be highly inefficient. It will almost surely fail to provide a good estimate of the posterior probability distribution, since the interesting regions will only occupy a small part of that distribution space (Ronquist et al. 2009). A common heuristic is to start with a parsimony or distance-methods tree. Though these methods typically perform worse than model-based approaches, they are much less computationally expensive and can quickly generate an initially plausible tree that Bayesian phylogeneticists use as a launching off point to explore the tree probability space (e.g., using hill-climbing algorithms such as Markov Chain Monte Carlo (MCMC) sampling). Innovations like these are one reason that computational phylogenetics is an important field in both statistics and computer science. Philosophers of science interested in how scientists explore and use so-called “Big Data” would profit from examining how phylogeneticists have developed, adapted, and employed heuristic search strategies.
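The sampling idea behind MCMC can be sketched on a deliberately tiny "tree space" of three topologies. This is a generic Metropolis sampler written for illustration. The posterior weights are invented, and real phylogenetic software proposes local rearrangements of much larger trees and also updates branch lengths and model parameters, none of which is attempted here.

```python
import random
from collections import Counter

def metropolis(weights, n_steps, seed=0):
    """Sample states in proportion to unnormalised `weights`.

    Each step proposes one of the other states uniformly at random (a symmetric
    proposal) and accepts it with probability min(1, w_new / w_old).
    """
    rng = random.Random(seed)
    states = list(weights)
    current = states[0]  # in practice this might be a parsimony or distance tree
    visits = Counter()
    for _ in range(n_steps):
        proposal = rng.choice([s for s in states if s != current])
        if rng.random() < min(1.0, weights[proposal] / weights[current]):
            current = proposal
        visits[current] += 1
    return {s: visits[s] / n_steps for s in states}

# Hypothetical unnormalised posterior weights for three topologies.
weights = {"AB|CD": 6.0, "AC|BD": 3.0, "AD|BC": 1.0}
frequencies = metropolis(weights, n_steps=50_000)
print(frequencies)
```

The sampler never needs the normalising constant \(P(\text{Data})\), only ratios of weights, and the fraction of time it spends on each topology approximates that topology's posterior probability.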

Setting aside computational and practical problems leaves us with what appears to be a straightforward epistemological question—which methods are best justified in licensing phylogenetic inferences? We begin our examination of this question by looking at arguments in favor of a particular method, parsimony.

Parsimony methods were among the very first adopted by cladists. This is often taken to include Hennig (1966), who laid out a process of phylogenetic inference that is now referred to as “Hennigian reasoning” in his seminal text Phylogenetic Systematics (Baum & Smith 2013). Like other forms of parsimony, this method sought to identify the tree that explained the distribution of homologies by appeal to the fewest number of evolutionary events.

Parsimony techniques have grown more sophisticated, and, as larger data sets have been assessed and bigger trees constructed, justifications of parsimony approaches have advanced and adapted. Early on, much of this took place in the context of the Systematics Wars (§1.2.1), where a premium was placed on defending approaches as properly scientific. Thus, these debates tended to veer into philosophy of science, with systematists disputing what counted as doing good science in the context of inferring history. This helps explain the influence of Karl Popper’s work. Early proponents of parsimony were attracted to Popper’s falsificationist account of the scientific method, both in the way it provided demarcation criteria for scientific activity, and in the way it was positioned as critical of logical positivist approaches. Falsificationist justifications were employed to criticize numerical taxonomists as verificationists relying on flawed hypothetico-deductive methods.

Wiley (1975) provides a representative example of a falsificationist defense of parsimony that proved deeply influential, e.g., Eldredge & Cracraft (1980) adopt Wiley’s core justification in their influential textbook, as does Farris (1983) in his impactful defense of parsimony. On Wiley’s account, each taxonomic character is treated as an homology, and considered independently of all others. For each homology, a tree hypothesis is generated that requires the fewest number of evolutionary changes to explain the pattern of character distribution across taxa. These are hypotheses of homology and, on the falsificationist account, these hypotheses are mutually testable. That is, the tree hypotheses generated for each character constitute a test of the trees hypothesized for every other character; falsified hypotheses of homology are treated as ad hoc assumptions of homoplasy. The maximum parsimony tree is that topology which generates the fewest falsified hypotheses of homology, and is regarded as the most corroborated tree. Adding new characters or taxa constitutes further testing of these hypotheses, and generates more potential falsifiers. In this way, parsimony is simultaneously a mode of tree construction and evaluation (Haber 2009); Wiley (1975), Eldredge and Cracraft (1980), and Farris (1983) describe this as a logical basis of reasoning about phylogenetic inference, citing the philosophical justification of falsificationism.

Whether this represents a faithful account of Popper’s falsificationism, or whether falsificationism represents the optimal (or even a desirable) mode of scientific reasoning has been a matter taken up by philosophers (Hull 1983, 1999; Sober 1988b), though that is, in some ways, beside the point. (For a philosophically-minded biologist’s survey of the role of Popper in systematics, see Gaffney 1979, Rieppel 2008, or Santis 2020.) Regardless, characterizing parsimony in falsificationist terms provided an important justification that resonates to this day. More recent invocations of Popper in phylogenetic inference have focused on whether parsimony or model-based methods such as Maximum Likelihood (§1.2.3) better align with falsificationism (e.g., Siddall & Kluge 1997; de Queiroz & Poe 2001, 2003; Kluge 2001), and whether probabilistic and statistical reasoning is relevant for studying singular historical events (Siddall & Kluge 1997; Haber 2005). (See Turner 2007; Cleland 2011; Currie 2018, 2019, for general philosophy of historical science.) In section 2.4 we describe how Felsenstein’s (1978) discovery that parsimony is prone to systematic errors poses a serious challenge to this underlying falsificationist justification.

Parsimony can be defended without falsificationism, and other justifications have been offered that appeal to the method’s namesake, i.e., the principle of parsimonious reasoning or appeals to Ockham’s Razor (e.g., Kluge 2005). It amounts to (i) defending parsimony (the principle) as the criterion by which we ought to assess competing phylogenetic hypotheses, and (ii) arguing that parsimony (the method) is the best mode for achieving that goal by seeking out that tree that minimizes the number of evolutionary novelties required to explain the distribution of characters. This may also be presented as identifying that tree that minimizes the number of ad hoc assumptions of homoplasy. Though this appeal to simplicity may be wedded to falsificationist justifications (as we saw above), they are logically independent defenses. There is certainly an intuitive appeal to the principle of parsimony in the sciences, and Ockham’s Razor remains influential for good reason. But justifying parsimony methods by appeal to simplicity is not so straight-forward. Evaluating which phylogenetic approaches are justified by Ockham’s Razor depends on what is being counted (Haber 2009). Sober (1988b, 2015, among others) provides useful discussions of the role of simplicity and the principle of parsimony in phylogenetics.

Other philosophical justifications of parsimony have been offered. Fitzhugh (2008 and elsewhere) has argued that the problem of phylogenetic inference is best understood as an abductive problem, and that treating it as a problem for statistical (or inductive) inference amounts to a category error. Sober (1988b) agrees that phylogenetics involves abductive reasoning, though argues that this justifies the use of statistical approaches (especially likelihood).

It is also worth recalling why pattern cladists in particular object to model-based methods of phylogeny reconstruction. Pattern cladists typically reject any method of phylogeny reconstruction that includes assumptions about the evolutionary process (§1.2.1), e.g., Brower (2019: 717) identifies a central commitment distinguishing pattern from process (or traditional or phylogenetic) cladism as the rejection of “a priori evolutionary background assumptions from the inference of patterns of relationships among taxa.” As such, pattern cladists have strongly objected to likelihood and other model-based phylogenetic methods, as those explicitly incorporate evolutionary models (§1.2.3 and §2.4). This has typically gone hand-in-hand with a rejection of statistical approaches more generally, and a commitment to parsimony, e.g., “the statistical approach to phylogenetic inference was wrong from the start, for it rests on the idea that to study phylogeny at all, one must first know in great detail how evolution has proceeded. That cannot very well be the way in which scientific knowledge is obtained” (Farris 1983: 17).

In response, statistical phylogeneticists have argued that parsimony methods implicitly include evolutionary models, which model-based methods simply make explicit and, thus, subject to interrogation (de Queiroz & Poe 2001, 2003; Swofford & Sullivan 2009). This fault line is further reflected in the way pattern cladists view other proponents of parsimony as allies in this debate, as Brower (2019: 717), approvingly citing Kluge & Farris (1999), makes clear: “despite this difference of opinion regarding critical background knowledge, there is no practical distinction between pattern and process cladists’ evidence, analyses or results. From a methodological perspective, all cladists group by synapomorphy alone.” That is, it is in the interpretation of methodology and inferential justification that distinctions are drawn between pattern and process cladists; in the larger debate with statistical phylogeneticists they are allied. Note, however, that ever finer distinctions of the term ‘cladist’ are being introduced (let alone ‘statistical phylogeneticists’, who typically associate with model-based approaches). This reflects the way this term has grown contested and associated with various competing sets of commitments. It is worth taking a brief detour in section 2.3.1 to sort through this a bit.

2.3.1 Who Owns the Term “Cladist”?

In section 1.2.1, we used the term “cladistics” to refer to a school of taxonomy which held that higher taxa must be monophyletic, and stood in contrast to the competing schools of phenetics/numerical taxonomy and evolutionary taxonomy. In this historical context, “cladistics” was sometimes used interchangeably with “phylogenetics”. We caution against that practice, especially when discussing more contemporary practices. Though it is hard to pin down a precise date when these terms began to carry different connotations, Felsenstein (1978) and Beatty (1982) are as good a marker as any for when these terms began to diverge and take on more nuanced senses. This is particularly true of the term “cladist”, which has become a highly disputed term, and one that has been claimed by sub-groups within the larger field of phylogenetics.

An apt example of this is the journal Cladistics, whose editors published a controversial editorial in 2016 stating that submitted manuscripts should reflect the philosophical commitments of the journal and its sponsoring professional society (The Willi Hennig Society) (Editors 2016). They write,

The epistemological paradigm of this journal is parsimony. There are strong philosophical arguments in support of parsimony versus other methods of phylogenetic inference (e.g., Farris 1983).

The editorial continues, stipulating that “Phylogenetic data sets submitted to this journal should be analysed using parsimony”. If trees generated by alternative methods show different results, these may be included if authors prefer those topologies, but only if authors are “prepared to defend it on philosophical grounds” (all quotes from Editors 2016: 1). The editors, here, are using the imprimatur of the journal Cladistics to stake out a specific sense of how “cladistics” ought to be understood, and what calling oneself a “Cladist” signals about the underlying theoretical and philosophical commitments held about phylogenetic inference, i.e., an exclusive commitment to parsimony methods—going so far as to reject model-based statistical approaches entirely.

Aleta Quinn (2017) argues that the term “cladist” remains one claimed by multiple overlapping (yet importantly distinct) camps. She tackles the ambiguity head on, providing a useful “foreign language dictionary” of ways that “cladist” gets used in systematics. Quinn argues that the term is in need of disambiguation, drawing seven different senses from the literature. This includes systematists that advocate adherence to particular phylogenetic methods (e.g., parsimony, as in the case of the Cladistics editorial) or philosophical justifications (especially to a version of Popperian falsificationism), though the various distinctions of competing senses of “cladist” go well beyond this. In response to Quinn, Williams and Ebach (2018) propose an eighth (and their preferred) sense of cladist, and Brower (2018a) offers a focused commentary on Quinn’s article, as well as other attempts to clarify the philosophical foundations and commitments of pattern cladism. For example, a distinguishing feature of pattern cladism is the rejection of a priori assumptions about evolutionary theory in phylogenetic methods. This has been interpreted as a claim of theory-neutrality or minimality by some philosophers and biologists (e.g., Pearson 2010). Brower (2019) contests this interpretation, arguing that the claim of theory neutrality is specific to evolutionary assumptions, as opposed to a more general claim.

The primary take away is that care should be taken when philosophers use the term “cladist”. Though there are a lot of interesting philosophical topics to unpack around this term, it must be acknowledged that it has become ambiguous. There are simply too many senses of this term for it to be used without clearly specifying precisely what sense is being invoked. We encourage philosophers writing about phylogenetics to follow the lead of systematists, and avoid using the terms “cladist”, “cladistics”, and other cognates as synonymous with “phylogenetics”. Instead, philosophers should reserve usage of those terms and carefully specify which sub-set of phylogeneticists (or their work) those terms are referring to.

In section 2.3 we mentioned that the falsificationist justification of parsimony was met with a serious challenge by the discovery that it was subject to making systematic errors under certain conditions. Joseph Felsenstein’s (1978) paper, “Cases in which Parsimony or Compatibility Methods Will be Positively Misleading” first describes what came to be known as the phenomenon of “long-branch attraction”. Though hardly the beginning of debates between parsimony and model-based phylogenetic methods, it represents an important inflection point in discussions about phylogenetic inference.

Long-branch attraction refers to the tendency of some methods to preferentially group together long branches (those with more evolutionary changes) regardless of their actual history. Methods designed to identify trees requiring the fewest number of evolutionary changes to explain the data will prefer explanations of shared ancestry over convergent evolution, even in cases where convergent evolution occurred. For example, imagine two branches with rapidly evolving DNA sequences. By chance alone, the two branches might have the same mutation at the same site, which parsimony treats as evidence that these branches share a recent history. In a range of cases, the chance of these similar mutations occurring is so high that we expect the number of these convergent cases to overwhelm the signal from the true homologies. In these cases, the parsimony method is statistically inconsistent. A method is statistically consistent when it is guaranteed, as more data are added, to identify the correct solution. Methods that fail to do this are statistically inconsistent. In this case, not only does parsimony fail to be consistent, but the method reliably returns increased support for a specific, incorrect outcome as more data are added and hence is “positively misleading”.
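The effect can be demonstrated with an exact calculation on a four-taxon tree. The sketch makes simplifying assumptions that are not in the text above: characters have only two states, change is symmetric, the true unrooted tree is AB|CD, and taxa A and C sit on long branches (change probability `p_long`) while B, D and the internal branch are short (`q_short`). It sums the expected frequency of the site patterns that parsimony counts in favour of each topology.

```python
from itertools import product

def support_by_topology(p_long, q_short):
    """Expected frequency of parsimony-informative site patterns per topology,
    when the true tree is AB|CD with long branches leading to A and C."""
    change = {"A": p_long, "B": q_short, "C": p_long, "D": q_short,
              "internal": q_short}

    def step(x, y, branch):
        # Probability of going from state x to state y along `branch`.
        return change[branch] if x != y else 1.0 - change[branch]

    support = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    for a, b, c, d in product((0, 1), repeat=4):
        # Sum over the unobserved states u, v at the two internal nodes.
        prob = sum(
            0.5 * step(u, a, "A") * step(u, b, "B") * step(u, v, "internal")
            * step(v, c, "C") * step(v, d, "D")
            for u, v in product((0, 1), repeat=2)
        )
        if a == b and c == d and a != c:
            support["AB|CD"] += prob
        elif a == c and b == d and a != b:
            support["AC|BD"] += prob
        elif a == d and b == c and a != b:
            support["AD|BC"] += prob
    return support

unequal = support_by_topology(p_long=0.40, q_short=0.05)
equal = support_by_topology(p_long=0.05, q_short=0.05)
print(unequal)
print(equal)
```

With equal, short branches the true grouping AB|CD receives the most support. With two long branches, patterns favouring the incorrect grouping AC|BD are expected more often than those favouring the true tree, so adding more sites only strengthens the wrong answer.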

Long-branch attraction is more than just an operational problem; it also challenges the falsificationist underpinnings of parsimony (Haber 2009). Recall that on Wiley’s (1975) influential account, phylogenetic trees are subject to being falsified as more data (characters) are discovered or added (§2.3). If a more parsimonious tree is available to explain those data, the previously most parsimonious tree is treated as falsified (or as containing more ad hoc hypotheses of homoplasy). Felsenstein’s discovery was that in some circumstances, as more characters are included in a parsimony analysis, parsimony will be prone to rejecting hypotheses that report the actual phylogenetic relations, while corroborating more parsimonious (but incorrect) phylogenetic hypotheses. This undermines the logical justification of parsimony analysis.

Felsenstein (1978) was (and remains) an immensely impactful paper, generating numerous research programs. It is a largely theoretical and abstract paper that considers how different methods perform under different biological conditions. One natural question that might be posed concerns how prevalent these conditions are in the actual systems being studied. This is an empirical question, of course, and can be studied as such, though disputes over how to interpret empirical cases have proven controversial (Huelsenbeck 1997; Whiting 1998; Siddall & Whiting 1999; Yang & Rannala 2012).

In the 1990s, experimental phylogeneticists used empirical and simulation studies to test claims about the performance of competing phylogenetic methods more rigorously. This included testing how well competing methods reconstructed known in vivo phylogenies (of bacteriophage) that were carefully constructed, tracked, and archived over many generations, as well as more complex simulated phylogenies (taking advantage of the ever-growing processing power available on the lab bench) (Hillis, Bull, White, Badgett, & Molineux 1992; Hillis & Bull 1993; Hillis, Huelsenbeck, & Cunningham 1994). These largely confirmed that distance and parsimony methods were subject to long-branch attraction, whereas model-based statistical methods such as likelihood avoided these epistemic traps.

Felsenstein’s (1978) paper is also impactful because of his commitment to statistical consistency as the standard bearer for evaluating competing statistical approaches. For Felsenstein and many other phylogeneticists, showing that parsimony is statistically inconsistent is a fatal blow. But not everyone agrees. In fact, some authors argue that it is close to an irrelevant consideration. We now turn to this debate about the importance of consistency as an example of how philosophy plays a central role in phylogenetic theory.
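
Stated for tree topology, and assuming that the characters are independent draws from a fixed evolutionary process (an assumption added here for illustration), the property at issue can be written as follows, where $\hat{T}_n$ denotes the tree a method returns from $n$ characters and $T$ denotes the true tree:

```latex
\lim_{n \to \infty} \Pr\bigl(\hat{T}_n = T\bigr) = 1
```

A method is inconsistent under given conditions when this limit fails. In Felsenstein's long-branch cases, the probability that parsimony recovers $T$ instead falls toward zero as $n$ grows.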

Like Edwards and Cavalli-Sforza before him, Felsenstein (1978) treats phylogenetic inference as a problem for statistics. Jerzy Neyman (1971) pointed to the emerging field of molecular phylogenetics as a source of novel and interesting statistical problems. In the intervening fifty years the field has only expanded in importance to the point where phylogenetic inference is now one of the central problems in statistics. Similarly, research on algorithms for inferring phylogenies has been important in computer science for decades.

If phylogenetic inference is seen as a problem in statistical inference, it might be thought that general arguments for ideal statistical methods will just apply in this case. Maximum likelihood or Bayesian methods could be defended on these grounds for example. In this light, parsimony is a statistical method and as such, we can study its statistical properties (Felsenstein 1983, 2004). Anthony Edwards (1996) claims that his initial usage of the “minimum evolution” principle was as an approximation to the maximum likelihood solution which he believed to be justified on general, statistical grounds, i.e., parsimony methods could be assessed as a statistical approach. Since that time, a number of authors have shown exact connections (ordinally equivalent rankings) between parsimony and likelihood in a range of cases (Felsenstein 1973; Sober 1985; Tuffley & Steel 1997; Steel & Penny 2000).

In an exchange on parsimony and likelihood between Felsenstein and Elliott Sober, Felsenstein defends the idea that consistency “is a fundamental property” and in particular, that it is more fundamental than likelihood (Felsenstein & Sober 1986). He reiterates his earlier claim that

maximum likelihood methods are not desirable in themselves, but because they have desirable statistical properties such as consistency and asymptotic efficiency. (Felsenstein 1978: 408)

With this attitude, Felsenstein is merely following an old and venerable position in statistics. For example, Fisher (1956: 141) called consistency “the fundamental criterion of estimation” and said that inconsistent estimators are “outside the pale of decent usage” (Fisher 1935 [1950: 11]). Neyman (1952: 188) agrees.

Contrary to Felsenstein’s position, Sober claims that

One does not “justify” a method by showing that there is an extremely special case in which it does its work well; nor does one “refute” a method by showing that there is another special case in which it makes a hash of things. (Sober 1988b: 76)

One particular argument that he gives against consistency is that it can conflict with likelihood. Sober (1988a) presents an example in which likelihood inference can fail to be consistent. But Sober follows Anthony Edwards (1972) in positing that likelihood is a “primitive postulate” which does not need justification based on repeated sampling. If a likelihood inference can fail to be statistically consistent but is still justified, then clearly consistency cannot be a necessary criterion of justified inference.

A different kind of worry about consistency is expressed by authors who point out that it is not clear what consistent method we could use. As the editors of Cladistics put it in their editorial defending the use of parsimony,

we do not consider the hypothetical problem of statistical inconsistency to constitute a philosophical argument for the rejection of parsimony. All phylogenetic methods, including parsimony, may produce inconsistent or otherwise inaccurate results for a given data set. (Editors 2016)

Maximum likelihood estimation is known to be provably consistent under a wide variety of conditions (Wald 1949), but several authors have argued that these conditions do not apply to estimating the tree topology since tree topologies are discrete, not continuous, parameters (Yang 1996; Siddall 1998; Farris 1999). However, Swofford et al. (2001) argue that Wald’s conditions do apply and Yang (1994), Chang (1996), and Rogers (2001) prove that maximum likelihood is consistent under different assumptions about character evolution. Regardless of which assumptions suffice for likelihood to be consistent, some assumptions about the evolutionary process are definitely required. What happens, then, when the model we use is not the correct model? This is especially problematic when combined with the view that to be a correct model it should be true, and that we could never know that our model was true or perhaps even that there is no such thing as a true model, as some pattern cladists have claimed (e.g., Brower 2016, 2018b). As Felsenstein (2004: 272) puts it,

likelihood is usually consistent if we use the correct model in our analysis. When we use the wrong model, there are few guarantees.

2.6 Phylogenetics and Philosophy

The philosophy of statistics is but one area among many where philosophical arguments contribute to work in phylogenetics. Though phylogenetics is a subfield of biology (and, arguably, of statistics and computer science), phylogenetic inference is a topic on which both philosophers and biologists work, often in overlapping ways. Indeed, the boundary between science and philosophy here is blurry in a way reminiscent of some of the natural philosophy of the seventeenth and eighteenth centuries. An upshot of this is that phylogenetic inference is an enormously rich area for philosophers to explore. Biologists are actively debating the conceptual foundations of phylogenetic methodology, justification, goals, explanatory models, etc. It is not uncommon to see philosophers playing a central role in these debates (e.g., Hull 1988; Sober 1988b). It also means that many biologists are eager to engage with philosophers in meaningful and deeply informed ways (e.g., Sterner & Lidgard 2018; Havstad & Smith 2019).

Phylogenetic inference is, generally, best categorized as a problem of epistemology. It is, after all, a problem of inference, albeit specialized to how we justify or ground our beliefs about phylogeny. An epistemologist who wonders how we can have knowledge of the past can hardly do better than to look to phylogenetics (Sober 1988b). And as we have seen, philosophy of statistics and of inductive inference more generally looms large. But moving beyond epistemology, any philosopher interested in how science works can learn from phylogenetics. Here we will look at the cases of metaphysics and the general philosophy of science.

Understanding details of the evolutionary history of life is not simply a stand-alone problem (as interesting as that problem would be). The history of life and the history of the planet are tied together such that knowledge in one domain aids our inferences in the other. Knowing what types of fossils are embedded in rocks together with a phylogeny can help scientists date the rocks just as dating the rocks through independent means aids in dating the fossils and thus in estimating aspects of the phylogeny (Grantham 2004). And just as the physics and chemistry of the planet have shaped living things, life has shaped the planet as well. Photosynthetic cyanobacteria were partially responsible for the great oxygenation event roughly 2.4 billion years ago, when oxygen buildup in the Earth’s atmosphere fundamentally altered the chemistry and physics of the planet.

This entanglement with other sciences makes phylogenetics a rich source for philosophers of science. It provides a model to better understand how science works when specialized fields inform and rely on each other. Philosophers might draw on phylogenetics to better understand the social structure of complexes of sciences (Longino 1990, 2019) (and complex sciences that draw on other fields), or to characterize what constitutes good scientific activity for what might be called “service sciences”, i.e., sciences whose products are largely used by other scientists as end-users.

One way to frame debates in the Systematics Wars is as scientists wrestling over the question of what constitutes good science. Here we take the opportunity to highlight what may be an under-appreciated aspect of this debate. Namely, does good science require theory? Though many of the traditional sciences studied by philosophers of science are clearly driven by (or, at least, tightly linked to) theory, this does not always appear to be the case for other sciences. Contrast Newtonian mechanics or population genetics with, say, the study of stem cells (Fagan 2013) or developmental biology (Love 2014), where it is unclear whether there are any underlying theoretical structures.

In the case of systematics, it is unclear how to articulate a precise theory for any of these competing views. It is probably more accurately portrayed as competing sets of commitments, some of which may resemble (or be) scientific theories (depending on the characterizations of “scientific theory”), where other commitments might be better described in terms of scientific practice or mode of reasoning (Giere 1979 [1997]; Griesemer 2000).

Whether phylogenetics has a theoretical structure, and what that structure is, remains an open debate. Yet, though scientists who research, study, and develop phylogenetic methods have a lot at stake in these debates, most of the end users of these methods are rather agnostic about those competing methods (and, in many cases, the underlying theories supporting those methods). In practice, most biologists who use phylogenetic methods typically use several of them. Philosophers of science might consider whether this amounts to an adoption of methodological pluralism or theory agnosticism, and whether trees produced under competing methodologies constitute robust results (e.g., when identifying the same monophyletic groups or homologies).

Phylogenetic inference should also be of interest to a wider range of philosophers for the way it can function as a case study for other philosophical issues. The recent history of phylogenetics has provided philosophers with test cases for how scientists shift from one set of commitments to another (Hull 1988), for the role of inductive logic and probability in both explanation and model construction (Sober 1988a,b; Haber 2005), and for what constitutes a good scientific explanation (Felsenstein & Sober 1986; Sober 2004; Haber 2009). O’Hara (1988) and Ereshefsky (2012) look to phylogenetics to see how narrative or historical explanations function in the sciences. Other examples of philosophers using phylogenetics to assess philosophical theses include Velasco (2012) and Vassend (2020) using phylogenetics to examine practices of modeling and idealization, and Haber (2005) looking at what phylogenetic inference tells us of how probabilistic reasoning may be used to assess singular historical events. Aleta Quinn proposes a taxonomy of disputes in contemporary systematics that she argues stems from disputes over application of the principle of total evidence (2019), and uses the case of phylogenetics to examine a well known objection leveled against proponents of inference to the best explanation in the sciences (2016), the “best of the bad lot argument” (van Fraassen 1989), or the “underconsideration objection” (Lipton 2004). And finally, Andreasen (1998, 2000) looks to phylogenetics to answer the question of whether biological races are real.

Though phylogenetics as we described it is largely an epistemological enterprise, biological systematics as a whole is filled with important and controversial metaphysical issues which might seem to be largely independent of problems of inference. Whether you treat taxa as traditional natural kinds (Devitt 2008), homeostatic property clusters (Boyd 1999; Wilson, Barker, & Brigandt 2007), individuals (Ghiselin 1966, 1974; Hull 1976, 1978; Sober 1980; Ereshefsky 1991; Haber 2016a) or something else (Slater 2013, 2014) will typically carry little to no impact on debates about phylogenetic inference—especially as those played out in the philosophical literature.

Yet early phylogeneticists certainly saw metaphysical implications of their views, and this impacted how they approached the problem of reconstructing phylogeny. Simply put, if you think that taxa are historical entities, and that the aim of phylogeny is the reconstruction of their histories, the task is very different than if you take the aim to be to find ahistorical similarities or shared essential characters (Hull 1970; Griffiths 1974). Even given the ubiquitous adoption of the phylogenetic project (in a broad sense), debates over how metaphysics ought to impact inferential methods continue in some areas of phylogenetics.

Do competing metaphysical views in systematics impact phylogenetic inference? Yes, though only obliquely and indirectly, and it’s very important to take care not to conflate issues of classification with the problem of reconstructing phylogenies—even when they overlap. Let’s take a very brief look at the former, followed by some commentary on the oblique and indirect way it may overlap with phylogenetic inference.

Ereshefsky (2001) argues that a commitment to phylogenetic principles entails a reformation of the practice and norms around biological classification. Specifically, he recommends leaving behind the traditional Linnaean classification naming practices in favor of a phylogenetically informed practice. This follows a tradition stemming back to the very earliest phylogenetic literature that sought to reform biological classification around the principle of monophyly (e.g., Hennig 1966; Griffiths 1976; Wiley 1981). For many phylogeneticists this goal continues to this day—most recently around the development of the PhyloCode (see Other Internet Resources ; de Queiroz & Gauthier 1990, 1992, 1994; Cantino & de Queiroz 2020). Though structural reforms to wholly replace Linnaean taxonomy have struggled to gain wide-scale traction and adoption, taxonomists have largely adopted monophyly as a guiding principle in classification and systematics—even in the context of so-called traditional taxonomy (though microbial systematics has proven more resistant; see §3.1 ).

Of course, how biologists reconstruct a phylogeny and identify monophyletic groups is a matter of taxonomic freedom on the PhyloCode and other phylogenetic classifications. That’s just to say that phylogenetic inference will operate largely independently of these metaphysical and classificatory issues—except insofar as these classifications inform biologists about what it is that they are aiming to reconstruct a history of in the first place. Haber (2019) provides a framing for this challenge, arguing that the units of phylogeny are the units of divergence and diversification. Where Ereshefsky (2001) and other advocates of phylogenetic classification provide a specific way of cashing out those units, Haber describes this as a rich, general research project, similar to other “unit” problems in philosophy of biology. In much the same way that how we cash out, say, the units of selection may impact how we study or draw inferences about evolutionary processes, how we identify the units of divergence and diversification will impact how we draw inferences in phylogenetics. (See de Queiroz 2007 for related foundational issues.)

Padian & Horner (2002) offer a coarse-grained example of how a metaphysical commitment overlaps with issues of phylogenetic inference in their defense of treating Aves as the sole living branch of dinosaurs. In this case, they set “transformationist” thinking against typological thinking. In the former, the identity of taxa is associated with historical continuity, and untethered from the expression of any intrinsic characters. Thus, the transformation of ancient theropod dinosaurs into the living clade of birds means that birds are dinosaurs, and that this ought to be reflected in our phylogenetic reconstructions. In contrast, they argue, a typological account will treat birds as distinct from dinosaurs, and risks a phylogenetic reconstruction that misrepresents the historical relationship of these taxa. Moreover, they argue that transformationist and typological approaches will differentially categorize features of avian and ancient theropods as evidence and data; what even counts as inferentially relevant is itself at stake. Padian and Horner, here, are identifying metaphysical commitments from phylogenetics to argue how paleontologists ought to reconstruct evolutionary history, what counts as evidence, and what kinds of evidence are important for drawing those inferences, i.e., how characters are coded is treated as a methodological dispute entailed by theoretical commitments. (See Haber 2016b in Other Internet Resources for further discussion.)

3. Looking Ahead: New Challenges and Opportunities

We began by noting that a phylogeny reconstructs evolutionary history and that different types of entities have an evolutionary history: for example, genes, organisms, populations, and species all have genealogical histories. Assume that we are trying to reconstruct the history of species or higher taxa (indeed, “phylogeny” itself is sometimes defined this way, e.g., Wiley 1981; S. Edwards 2009). Standard methods such as parsimony, likelihood, and Bayesian methods all directly infer the history of characters. So what is the connection between the history of these characters and the phylogeny of the species? The unstated assumption is that the histories of these characters are the same and line up with the species history. We conclude this entry by acknowledging that this unstated assumption is often violated and we consider what effect this has on the theory and practice of phylogenetic inference. As we will see, examining this unstated assumption forces us to directly face the question of what a phylogeny actually represents and thus what “phylogenetic inference” even means.

In order to facilitate an entry point into this topic we treated phylogenetic trees and the histories they represent as simple bifurcating entities, i.e., as straight-forward branch-and-node structures. Yet, this is an idealization, and actual phylogenetic histories are typically more complex than this. Branches can fuse together, and branching events turn out to be more complex than the unidimensional instantaneous point a node might suggest.

As philosophers of science are well aware, idealization in science is both common and serves important purposes (e.g., Cartwright 1983; McMullin 1985; Wimsatt 2007; Weisberg 2007). Yet, it can be fruitful to pause and consider if some of these idealizations and simplifying assumptions need to be re-examined, are hindering studies, or generating other road blocks. Indeed, that’s precisely what has happened recently in phylogenetics. Re-examining the utility of simplifying assumptions in phylogenetics has come both from outside the field (particularly from microbiology) and internally (as phylogeneticists began to appreciate what might be gained from incorporating more complexity). This is (or, at least, can be) a good and healthy thing for a science. It puts pressure on scientists and philosophers to further articulate the central commitments of a science, and to consider how to account for new challenges. Challenges can also push a science to adapt and revise those central commitments as new evidence, theories, and tools are discovered or developed, or to revisit the utility of underlying simplifying assumptions and idealizations. This, in turn, helps philosophers of science understand how science can advance. (Haber 2019 describes this in the context of phylogenetics as productive disruptions .)

Revisiting the utility of idealizing away the complexity of phylogenetic histories has generated both hard challenges and new opportunities for phylogeneticists moving forward (Haber 2012, 2019; Velasco 2012). Here we provide a brief overview of what relaxing these assumptions amounts to, and what kinds of complexity we are acknowledging when these assumptions no longer hold. Following this, we briefly describe how this impacts the central commitments and project of phylogenetics.

In the previous sections we treated phylogenetic trees as straightforwardly bifurcating models, regularly referring back to Figure 1. Branches represent ancestral lineages, and nodes represent lineage splits. Yet, this view of phylogenetics is also highly idealized, reflecting at least three simplifying assumptions: (1) that lineages split but do not fuse; (2) that changes on one lineage do not affect other lineages; and (3) that splits are singular, simple events. Each of these assumptions turns out to be false.

In order to think about how lineages split or fuse, we have to first answer the question of what a lineage represents. In the case of a phylogeny where the tips are species, the natural answer is that the lineages are tracking species through time and the nodes represent speciation events. But consider the assumption that lineages split but do not fuse. For many readers, hybridization may be the most obvious counter-example. After all, hybridization is relatively familiar, especially for cases involving sexually reproducing charismatic mega-fauna and, even more frequently, plants.

In the case of speciation by hybridization, it is clear that entire lineages can fuse. Exactly how often this occurs is disputed and of course will depend on how you define species in the first place. But it is a widespread occurrence in plants and is known to happen in animals as well (Mallet 2007; Schumer, Rosenthal, & Andolfatto 2014).

While most occurrences of hybridization do not lead to new species, many lead to introgression, that is, the introduction of genetic information from one lineage to another. If we model this as one lineage connecting to another, we violate the assumption that a phylogeny is a strictly branching tree diagram. But if we instead do not depict these lineages as directly connected, then we violate another background assumption in phylogenetics, namely, that changes along one branch do not affect any other branches. In either case, we now have a situation where the branching history of some characters is not the same as the history of others or the history of the species as a whole. In the case of genetic history, this is known as genealogical discordance.

While hybridization among sexually reproducing species already provides good counter-examples to our assumptions about the independence of lineages, the bigger challenges have come from microbiology . More specifically, what we find in microbial genomics are high levels of horizontal (or lateral) gene transfer (HGT or LGT). For readers unfamiliar with microbial genomics, this can be thought of as the microbial analogue to hybridization, though there are important qualitative and quantitative differences.

Qualitatively, HGT is not a mode of reproduction, though it does involve the transfer of genomic parts between microbes. Unlike in reproduction, the genomic transfer is not (typically) between parent and offspring microbes, but adjacent ones. Roughly, when microbes make contact, they can exchange parts of their genome, which means that the history of these genomic parts may be distinct from the history of the microbial organisms in which we find those genetic units. HGT can be highly advantageous for microbes, and genomic parts can quickly spread through microbial populations, especially if they carry beneficial characteristics. Quantitatively, the rate of HGT in microbes is enormously high and commonplace, especially relative to hybridization in, say, mammalian species.

The upshot of widespread HGT is that microbial genes will frequently have different histories, and genetic parts will sometimes flow from one lineage to very distantly related lineages. Microbialists and philosophers of biology have taken up the question of whether reticulation in microbial histories is enough to entirely undercut the utility of phylogenetics in microbial genomics, and how this has impacted the utility of the tree of life metaphor more generally (e.g., Martin 1999; Doolittle 1999; Velasco 2013; see also the special issue of Biology & Philosophy [O’Malley, Martin, & Dupré 2010]).

The final simplifying assumption treats lineage splits as singular events that wholly separate a lineage into two distinct new branches that we can cleanly track. When we think about the process of speciation, it is clear that speciation, and lineage separation more generally, is often not instantaneous (de Queiroz 1998; Harrison 1998). What is less clear is how this matters to phylogenetics. Thinking carefully about lineage splits and the internal structure of lineages forces us to examine the nature and relationships of the multilevel organization of the lineages we track (Haber 2012, 2019). More inclusive lineages contain less-inclusive lineages as parts, reflecting the hierarchical organization found in biology (a feature of lineages well appreciated by early phylogeneticists, e.g., Hennig 1966; Griffiths 1974). Species trees, for example, contain gene trees within their branches (Maddison 1997).

When individual gene histories are incongruent with each other or with their containing species trees we have genealogical discordance. We have already seen how hybridization and horizontal gene transfer can cause discordance, but as systematics has become increasingly connected with population genetics, another source of discordance has loomed large: incomplete lineage sorting. Any two copies of a gene in a population will share a common ancestor in the past called their point of coalescence. If two gene lineages in the same species lineage fail to coalesce within that species branch (that is, their point of coalescence is earlier than the most recent speciation or lineage branching) we have what is called incomplete lineage sorting (ILS). In a case of incomplete lineage sorting it is possible that one of the gene lineages first coalesces with a gene lineage from another species, producing genealogical discordance. This generates what Avise and Robinson (2008) have termed hemiplasies, i.e., homologous characters whose phylogenetic histories are topologically discordant with higher level phylogenies due to ILS. Maddison (1997) provides a very readable introduction to the relationship between gene trees and species trees and the problems of genealogical discordance, while Degnan and Rosenberg (2009) is a slightly more advanced introduction to the biology of incomplete lineage sorting and its effects on phylogenetic inference.
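
How often incomplete lineage sorting produces discordance can be sketched for the simplest case: three species with species tree ((A,B),C), one gene copy sampled per species, and standard neutral coalescent assumptions. Under those assumptions (which are supplied here for illustration and are not argued for in the text), the two lineages from A and B fail to coalesce within an internal branch of length t, measured in coalescent units, with probability e^(-t), and the gene tree is then discordant with probability 2/3. The function names below are hypothetical.

```python
import math
import random


def discordance_probability(t_internal):
    """Analytic probability that a gene tree disagrees with the species tree
    ((A,B),C), given an internal branch of t_internal coalescent units."""
    return (2.0 / 3.0) * math.exp(-t_internal)


def simulate_discordance(t_internal, n_reps=20000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    discordant = 0
    for _ in range(n_reps):
        # Waiting time for the A and B lineages to coalesce (rate 1).
        wait = rng.expovariate(1.0)
        if wait < t_internal:
            continue  # coalescence inside the internal branch: concordant
        # Incomplete lineage sorting: all three lineages enter the ancestral
        # population, where each pair is equally likely to coalesce first.
        first_pair = rng.choice(["AB", "AC", "BC"])
        if first_pair != "AB":
            discordant += 1
    return discordant / n_reps
```

For example, `discordance_probability(1.0)` is roughly 0.245, and shorter internal branches push the value toward its maximum of 2/3, which is why rapid successive splits are where discordance from ILS is expected to be most common.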

To summarize, we first discussed phylogenetic inference as it was typically done prior to around the 1990s. First, infer the history of a number of characters (for example, genes). Each of these histories is a particular, branching tree. Now assume that these histories are the same as the species history. It is this assumption that allows methods such as parsimony or likelihood to evaluate competing hypotheses. The species history is then also a branching tree even though we know that, in general, this is a kind of idealization assuming (1) that lineages split but do not fuse, (2) that information is not transferred between lineages, and (3) that splits are singular, simple events. Each of these assumptions turns out to be false.

The reticulation and genealogical discordance generated by hybridization, HGT, ILS, and other complexities stemming from the structure and processes involving biological lineages have, historically, been obscured by idealizing assumptions in phylogenetics. In many cases treating phylogenetic trees as idealized models of genealogical history is useful and appropriate for the research question at hand. Figure 1 offers a good example of this. This reflects how the level of resolution relative to our research questions can drive the way we use phylogenetic tools (O’Hara 1993).

But what happens when these idealizing assumptions are critically interrogated? First, the fact that different entities have different, often incompatible, genealogical histories brings into question the very nature of phylogeny. One line of thought is that a phylogeny should directly track genetic history rather than the history of speciation. As Maddison (1995: 285) puts it,

one possible interpretation of a species phylogeny is that it depicts the lines by which genetic information was passed on and nothing more.

Baum and Shaw (1995), Baum (2007), and Velasco (2010, 2019) represent a series of papers devoted to developing this “concordance tree” idea. Baum (2009) incorporates this line of thought about phylogenetic history into the debate over the nature of species and other taxa.

Philosophers have paid particular attention to reticulation in the context of microbial genomics, and for good reason (O’Malley et al. 2010 is a notable special edition dedicated to this topic and is a useful resource). These are often framed as challenges to the phylogenetic approach itself, or, in other cases, how and whether the tools of phylogenetics apply to microbial genomics (Martin 1999; Doolittle 1999). The breadth of HGT in microbial systems has also led to challenges to the more conceptual and metaphysical commitments about species and other phylogenetic units (Franklin 2007).

However, the majority of work in phylogenetics today continues with the same goals as before, but with biologists using better and more powerful tools for incorporating these complexities into their phylogenetic models. Haber (2019) offers a philosophical treatment of how reticulation and genealogical discordance introduce both challenges and opportunities in the context of phylogenetic inference, asking whether the branches of phylogenetic trees are “too thin” and obscure relevant details found in the internal structure of those lineages. There are many good reasons phylogeneticists might aim to include reticulate structure in their phylogenetic reconstructions. Among other things, it means treating genealogical discordance as data that can be mined, as opposed to noise that needs to be filtered out (Haber 2012, 2019). Philosophers have begun paying more attention to ways that reticulation and discordance have impacted phylogenetic inference beyond microbial genomics. Rather than coming as challenges from cognate fields, these include ways that practicing phylogeneticists have sought to account for and accommodate what we have come to understand about reticulation and discordance (see Haber & Molter 2019 for a special issue dedicated to this topic).

Population biologists were among the first to operationalize methods for extracting information about the history of lineages from the internal structure of those lineages, i.e., by examining the structure of individuals in a population, population biologists developed methods for inferring historical coalescent points of those individuals (Hudson 1983; Kingman 1982, 2000; Tajima 1983). These methods exploit the fact that after a lineage splits it takes time for that split to be wholly reflected in the kinship relations of the parts of those lineages. Or, to put it in more familiar terms, it takes time for the parts of a lineage to all be more closely related to each other than any are to parts in other lineages. These approaches can be viewed as the continued development of methods in the larger context of the close ties between the application of statistical tools and molecular biology (§1.2.3).
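
The time this takes can be sketched with the basic coalescent model, assuming a single constant-size, randomly mating population with no selection (idealizations supplied here for illustration). With k gene lineages present, the waiting time to the next coalescence is exponential with rate k(k-1)/2 in coalescent units, so the expected time for n sampled copies to reach their common ancestor is 2(1 - 1/n). The function names are hypothetical.

```python
import random


def expected_tmrca(n):
    """Expected time, in coalescent units, for n sampled gene copies to
    coalesce into a single common ancestor: sum over k of 1 / C(k, 2)."""
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


def simulate_tmrca(n, rng):
    """One simulated time to the most recent common ancestor of n copies."""
    total = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2.0       # number of pairs that could coalesce
        total += rng.expovariate(rate)  # waiting time while k lineages remain
    return total
```

Averaging `simulate_tmrca(10, random.Random(1))` over many draws from the same generator should approach `expected_tmrca(10)`, which equals 1.8. Most of that time is spent waiting for the last two lineages to coalesce, which is one way to see why old splits can leave traces within a lineage long after the split itself.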

Maddison (1997) represents a more recent launching point for a new appreciation of the role that the internal structure of lineages can play in phylogenetic inference. Briefly, the continued advancement of molecular and computational techniques provided the opportunity to produce ever more finely resolved phylogenies of the systems of interest. What quickly became apparent was a confirmation of what is only implicit in Hennig (1966): that the hierarchical structure of taxa may result in discordant phylogenies at different levels (or even within levels) of the biological hierarchy. As an example, Maddison describes how the phylogenies of individual genes or nucleotides may display different phylogenetic topologies from the organisms or populations from which those genes were sampled. This is not a result of error or homoplasy, but of genuine phylogenetic discordance between gene-level lineages and the higher-level lineages in which they reside.

Maddison (1997) is also directly about how we ought to incorporate what we learn about the internal structure of lineages into our models of phylogenetic inference, and how we might build on the earlier work on coalescent models. Maddison is among the first to deeply appreciate how impactful the then-new work on genomics might be for phylogenetics, in the ways that discordance presents both serious challenges and opportunities for phylogenetic inference. This has helped spur new developments in this field of phylogenetics (e.g., Maddison & Knowles 2006; Degnan & Rosenberg 2006, 2009; Baum 2007; Nakhleh 2013; Hahn & Nakhleh 2016; Mallet, Besansky, & Hahn 2016).

Haber (2019) offers two reasons why these new developments in phylogenetic inference should be of interest to philosophers. First, he offers a real-time example of scientists shifting from treating a feature of a system as noise to recognizing it as powerful evidence. Pease et al. (2016) is a prime example of this. They use some of the new tools developed to draw phylogenetic inference in highly discordant systems to offer a reconstruction of the tomato clade. Where earlier analyses regarded the system as too noisy to pull out a strong phylogenetic signal, these new models permitted a reconstruction that recognized specific patterns of discordance. This represents a substantial advancement in the level of resolution phylogeneticists are able to reconstruct and identify.

Second, the new attitudes toward discordance and reticulation in phylogenetics provide an example of a community of scientists shifting from one set of commitments to another. Haber (2019) offers a specific example. He describes one of the entrenched core commitments in phylogenetic methodology as being the goal of “resolving” competing phylogenetic trees (Felsenstein 1988). That is, in cases where different analyses or data are generating conflicting phylogenetic trees, the goal should be to resolve this conflict by identifying which of those trees is best supported. On a model that recognizes genealogical discordance, though, conflicting trees may accurately reflect a reticulate or discordant phylogenetic history, e.g., if different data strongly support different phylogenetic trees, one possible explanation is that we have discovered an instance of discordance. Nakhleh (2013) argues that this means phylogeneticists must now also consider whether conflicting trees may be “reconciled” rather than resolved. Haber (2019) argues that this is an example of what he calls a “productive disruption”, which is one way that science might advance.

The conceptual and epistemological impacts of discordance on phylogenetic inference come together in recent debates over Sukumaran and Knowles’ (2017) work and commentary on the multispecies coalescent (Rannala & Yang 2003; Yang & Rannala 2010). Very roughly, the multispecies coalescent extends the principle of the coalescent from population genetics to phylogenetics. In population genetics, a gene can be sampled from multiple individuals in a single population; coalescent theory provides a way to model the ancestral history of that gene to a common ancestor, i.e., where that history coalesces. The multispecies coalescent does something similar, but with samples from multiple populations (or even species). These and other techniques were developed in response to the challenge posed by Maddison (1997) to develop methods that recognize speciation as an extended process. Roughly, the lineages of genes, individual organisms, populations, and species within a taxon can all “come apart”, or have different phylogenetic histories, and display “a fractal hierarchy of divergences” (Sukumaran & Knowles 2017: 1607). Multispecies coalescent techniques help reveal the fine-grained structure of complex multi-level lineages (Haber 2012), which has generated discussion on how these internal lineage structures relate to species delimitation and other phylogenetic inferences (e.g., de Queiroz 2007; Knowles & Carstens 2007; Haber 2012; Carstens, Pelletier, Reid, & Satler 2013; Hahn & Nakhleh 2016; Mallet et al. 2016; Velasco 2019).
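
How often gene-level and species-level histories “come apart” can be made concrete in the simplest case. The sketch below assumes three species with topology ((A, B), C), one sampled gene copy per species, no gene flow, and an internal branch of length T in coalescent units. Under the multispecies coalescent with these assumptions, the probability that the gene tree matches the species tree is 1 - (2/3)e^(-T). The function names are illustrative and are not taken from the works cited.

```python
import math
import random


def concordance_probability(t):
    """Probability that a three-taxon gene tree matches the species tree
    ((A, B), C) when the internal branch has length t (coalescent units)."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_gene_tree(t, rng):
    """Draw one gene-tree topology for species tree ((A, B), C)."""
    # The A and B lineages share the ancestral A+B population for time t
    # and coalesce there at rate 1.
    if rng.expovariate(1.0) < t:
        return "AB|C"
    # Otherwise all three lineages reach the root population uncoalesced,
    # and each of the three pairs is equally likely to coalesce first.
    return rng.choice(["AB|C", "AC|B", "BC|A"])


def simulated_concordance(t, replicates, seed=0):
    """Fraction of simulated gene trees that match the species tree."""
    rng = random.Random(seed)
    hits = sum(simulate_gene_tree(t, rng) == "AB|C" for _ in range(replicates))
    return hits / replicates
```

At T = 0 the three topologies are equally likely, and as T grows the gene tree almost always matches the species tree. On this simplified model, short internal branches yield substantial discordance without any error or homoplasy.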

Sukumaran and Knowles (2017) provide a good example of the discussions over what kinds of inferences are licensed by the multispecies coalescent, and the way that metaphysical concerns overlap with epistemological ones. They focus on what kinds of inferences may be drawn from multispecies coalescent methods, arguing that it is a mistake to simply treat the units identified by these methods as species, when they may turn out to be population-level structures instead (i.e., populations of a species, rather than distinct species). They seek to highlight that mistaking population structures for species (or too quickly drawing the inference of the latter from the former, without further evidence), can undermine precisely the sorts of inferences that phylogenetics seeks to license by artificially and unjustifiably inflating species counts:

Specifically, all fields that rely on species as units of analysis, from conservation biology to studies of macroevolutionary dynamics, will be impacted by inflated estimates of the number of species, especially as genomic resources provide unprecedented power for detecting increasingly finer-scaled genetic structure under the multispecies coalescent. (Sukumaran & Knowles 2017: 1607)

This may be framed as a claim about the units of divergence and diversification being identified by these multispecies coalescent methods (a metaphysical claim), and the sorts of inferences that may be drawn about those units from these methods (an epistemological claim) (Haber 2019).

To be clear, the relation between the metaphysics and epistemology is not direct, but oblique, yet important in regard to the inferences licensed by phylogenetic methods. Sukumaran and Knowles (2017), for example, conclude by issuing a plea for phylogeneticists to exercise a more careful stance towards relying on genomic data alone, and in using the multispecies coalescent as a straightforward way to identify species. They instead propose a more cautious inferential stance, treating the units identified by the multispecies coalescent as population-level structures (which may or may not correspond to species delimitation). This is merely an example of the rich sorts of discussions around good scientific inference that are available to philosophers willing to dive into this literature.

  • Albert, Victor A., 2006, Parsimony, Phylogeny, and Genomics , Oxford: Oxford University Press. doi:10.1093/acprof:oso/9780199297306.001.0001
  • Alfaro, Michael E. and Mark T. Holder, 2006, “The Posterior and the Prior in Bayesian Phylogenetics”, Annual Review of Ecology, Evolution, and Systematics , 37: 19–42. doi:10.1146/annurev.ecolsys.37.091305.110021
  • Amemiya, Chris T., Jessica Alföldi, Alison P. Lee, Shaohua Fan, Hervé Philippe, Iain MacCallum, Ingo Braasch, Tereza Manousaki, Igor Schneider, Nicolas Rohner, Chris Organ, Domitille Chalopin, Jeramiah J. Smith, Mark Robinson, Rosemary A. Dorrington, Marco Gerdol, Bronwen Aken, Maria Assunta Biscotti, Marco Barucca, Denis Baurain, et al., 2013, “The African Coelacanth Genome Provides Insights into Tetrapod Evolution”, Nature , 496(7445): 311–316. doi:10.1038/nature12027
  • Andreasen, Robin O., 1998, “A New Perspective on the Race Debate”, The British Journal for the Philosophy of Science , 49(2): 199–225. doi:10.1093/bjps/49.2.199
  • –––, 2000, “Race: Biological Reality or Social Construct?”, Philosophy of Science , 67: S653–S666. doi:10.1086/392853
  • Ané, Cécile, Bret Larget, David A. Baum, Stacey D. Smith, and Antonis Rokas, 2007, “Bayesian Estimation of Concordance among Gene Trees”, Molecular Biology and Evolution , 24(2): 412–426. doi:10.1093/molbev/msl170
  • Ankeny, Rachel, Hasok Chang, Marcel Boumans, and Mieke Boon, 2011, “Introduction: Philosophy of Science in Practice”, European Journal for Philosophy of Science , 1(3): 303–307. doi:10.1007/s13194-011-0036-4
  • Archibald, Jenny K., Mark E. Mort, and Daniel J. Crawford, 2003, “Bayesian Inference of Phylogeny: A Non-Technical Primer”, TAXON , 52(2): 187–191. doi:10.2307/3647388
  • Avise, John C. and Terence J. Robinson, 2008, “Hemiplasy: A New Term in the Lexicon of Phylogenetics”, Systematic Biology , 57(3): 503–507. doi:10.1080/10635150802164587
  • Baldauf, S. L., 2003, “The Deep Roots of Eukaryotes”, Science , 300(5626): 1703–1706. doi:10.1126/science.1085544
  • Baum, David A., 2007, “Concordance Trees, Concordance Factors, and the Exploration of Reticulate Genealogy”, TAXON , 56(2): 417–426. doi:10.1002/tax.562013
  • –––, 2008, “Reading a Phylogenetic Tree: The Meaning of Monophyletic Groups”, Nature Education , 1(1): 190.
  • –––, 2009, “Species as Ranked Taxa”, Systematic Biology , 58(1): 74–86. doi:10.1093/sysbio/syp011
  • Baum, David A. and Kerry L. Shaw, 1995, “Genealogical Perspectives of the Species Problem”, in Hoch and Stephenson 1995: 289–303.
  • Baum, David A. and Stacey D. Smith, 2013, Tree Thinking: An Introduction to Phylogenetic Biology , Greenwood Village, CO: Roberts and Company.
  • Baum, David A., Stacey DeWitt Smith, and Samuel S. S. Donovan, 2005, “The Tree-Thinking Challenge”, Science , 310(5750): 979–980. doi:10.1126/science.1117727
  • Beatty, John, 1982, “Classes and Cladists”, Systematic Biology , 31(1): 25–34. doi:10.1093/sysbio/31.1.25
  • Bockmann, Flávio A., Marcelo R. De Carvalho, and Murilo De Carvalho, 2013, “The Salmon, the Lungfish (or the Coelacanth) and the Cow: A Revival?”, Zootaxa , 3750(3): 265–276. doi:10.11646/zootaxa.3750.3.6
  • Bowler, Peter J., 1996, Life’s Splendid Drama: Evolutionary Biology and the Reconstruction of Life’s Ancestry, 1860–1940 , Chicago, IL: University of Chicago Press.
  • Boyd, Richard, 1999, “Homeostasis, Species and Higher Taxa”, in Species: New Interdisciplinary Essays , Robert A. Wilson (ed.), Cambridge, MA: MIT Press, pp. 141–185.
  • Brady, Ronald H., 1982, “Theoretical Issues and ‘Pattern Cladistics’”, Systematic Biology , 31(3): 286–291. doi:10.1093/sysbio/31.3.286
  • Brower, Andrew V.Z., 2000, “Evolution is Not a Necessary Assumption of Cladistics”, Cladistics , 16(1): 143–154. doi:10.1111/j.1096-0031.2000.tb00351.x
  • –––, 2002, “Cladistics, Phylogeny, Evidence and Explanation—a Reply to Lee”, Zoologica Scripta , 31(2): 221–223. doi:10.1046/j.1463-6409.2002.00102.x
  • –––, 2016, “Are We All Cladists?”, in The Future of Phylogenetic Systematics: The Legacy of Willi Hennig , David Williams, Michael Schmitt, and Quentin Wheeler (eds.), (Systematics Association Special Volume Series 86), Cambridge: Cambridge University Press, 88–114. doi:10.1017/CBO9781316338797.007
  • –––, 2018a, “Fifty Shades of Cladism”, Biology & Philosophy , 33(1): 8. doi:10.1007/s10539-018-9622-6
  • –––, 2018b, “Statistical Consistency and Phylogenetic Inference: A Brief Review”, Cladistics , 34(5): 562–567. doi:10.1111/cla.12216
  • –––, 2019, “Background Knowledge: The Assumptions of Pattern Cladistics”, Cladistics , 35(6): 717–731. doi:10.1111/cla.12379
  • Brower, Andrew V.Z. and Randall T. Schuh, 2021, Biological Systematics: Principles and Applications , Ithaca, NY: Cornell University Press.
  • Camin, Joseph H. and Robert R. Sokal, 1965, “A Method for Deducing Branching Sequences in Phylogeny”, Evolution , 19(3): 311–326. doi:10.1111/j.1558-5646.1965.tb01722.x
  • Cantino, Philip D. and Kevin de Queiroz, 2020, “ International Code of Phylogenetic Nomenclature , Version 6”, [ Cantino & de Queiroz 2020 available online ].
  • Carstens, Bryan C., Tara A. Pelletier, Noah M. Reid, and Jordan D. Satler, 2013, “How to Fail at Species Delimitation”, Molecular Ecology , 22(17): 4369–4383. doi:10.1111/mec.12413
  • Carpenter, James M., 1987, “Cladistics of Cladists”, Cladistics , 3(4): 363–375. doi:10.1111/j.1096-0031.1987.tb00899.x
  • Cartwright, Nancy, 1983, How the Laws of Physics Lie , Oxford: Oxford University Press. doi:10.1093/0198247044.001.0001
  • Cavalli-Sforza, L. L. and A. W. Edwards, 1967, “Phylogenetic Analysis. Models and Estimation Procedures”, American Journal of Human Genetics , 19(3 Pt 1): 233–257. [ Cavalli-Sforza and Edwards 1967 available online ]
  • Chang, Joseph T., 1996, “Full Reconstruction of Markov Models on Evolutionary Trees: Identifiability and Consistency”, Mathematical Biosciences , 137(1): 51–73. doi:10.1016/S0025-5564(96)00075-2
  • Chor, Benny and Tamir Tuller, 2005, “Maximum Likelihood of Evolutionary Trees: Hardness and Approximation”, Bioinformatics , 21(suppl 1): i97–i106. doi:10.1093/bioinformatics/bti1027
  • Ciccarelli, Francesca D., Tobias Doerks, Christian von Mering, Christopher J. Creevey, Berend Snel, and Peer Bork, 2006, “Toward Automatic Reconstruction of a Highly Resolved Tree of Life”, Science , 311(5765): 1283–1287. doi:10.1126/science.1123061
  • Cleland, Carol E., 2011, “Prediction and Explanation in Historical Natural Science”, The British Journal for the Philosophy of Science , 62(3): 551–582. doi:10.1093/bjps/axq024
  • Crawford, Nicholas G., James F. Parham, Anna B. Sellas, Brant C. Faircloth, Travis C. Glenn, Theodore J. Papenfuss, James B. Henderson, Madison H. Hansen, and W. Brian Simison, 2015, “A Phylogenomic Analysis of Turtles”, Molecular Phylogenetics and Evolution , 83: 250–257. doi:10.1016/j.ympev.2014.10.021
  • Currie, Adrian, 2018, Rock, Bone, and Ruin: An Optimist’s Guide to the Historical Sciences , Cambridge, MA: MIT Press.
  • –––, 2019, Scientific Knowledge and the Deep Past: History Matters , Cambridge: Cambridge University Press. doi:10.1017/9781108582490
  • Darwin, Charles, 1859, On the Origin of Species , London: John Murray.
  • Degnan, James H. and Noah A. Rosenberg, 2006, “Discordance of Species Trees with Their Most Likely Gene Trees”, PLoS Genetics , 2(5): e68. doi:10.1371/journal.pgen.0020068
  • –––, 2009, “Gene Tree Discordance, Phylogenetic Inference and the Multispecies Coalescent”, Trends in Ecology & Evolution , 24(6): 332–340. doi:10.1016/j.tree.2009.01.009
  • de Queiroz, Kevin, 1998, “The General Lineage Concept of Species, Species Criteria, and the Process of Speciation: A Conceptual Unification and Terminological Recommendations”, in Howard and Berlocher 1998: 57–75.
  • –––, 2007, “Species Concepts and Species Delimitation”, Systematic Biology , 56(6): 879–886. doi:10.1080/10635150701701083
  • de Queiroz, Kevin and Jacques Gauthier, 1990, “Phylogeny as a Central Principle in Taxonomy: Phylogenetic Definitions of Taxon Names”, Systematic Zoology , 39(4): 307–322. doi:10.2307/2992353
  • –––, 1992, “Phylogenetic Taxonomy”, Annual Review of Ecology and Systematics , 23(1): 449–480. doi:10.1146/annurev.es.23.110192.002313
  • –––, 1994, “Toward a Phylogenetic System of Biological Nomenclature”, Trends in Ecology & Evolution , 9(1): 27–31. doi:10.1016/0169-5347(94)90231-3
  • de Queiroz, Kevin and Steven Poe, 2001, “Philosophy and Phylogenetic Inference: A Comparison of Likelihood and Parsimony Methods in the Context of Karl Popper’s Writings on Corroboration”, Systematic Biology , 50(3): 305–321. doi:10.1080/106351501300317941
  • –––, 2003, “Failed Refutations: Further Comments on Parsimony and Likelihood Methods and Their Relationship to Popper’s Degree of Corroboration”, Systematic Biology , 52(3): 352–367. doi:10.1080/10635150309324
  • Devitt, Michael, 2008, “Resurrecting Biological Essentialism”, Philosophy of Science , 75(3): 344–382. doi:10.1086/593566
  • Dietrich, Michael R., 1994, “The Origins of the Neutral Theory of Molecular Evolution”, Journal of the History of Biology , 27(1): 21–59. doi:10.1007/BF01058626
  • –––, 1998, “Paradox and Persuasion: Negotiating the Place of Molecular Evolution within Evolutionary Biology”, Journal of the History of Biology , 31(1): 85–111. doi:10.1023/A:1004257523100
  • Donoghue, Michael J., 1992, “Homology”, in Keywords in Evolutionary Biology , Evelyn Fox Keller and Elisabeth A. Lloyd (eds.), Cambridge, MA: Harvard University Press, 170–179.
  • Doolittle, W. Ford, 1999, “Phylogenetic Classification and the Universal Tree”, Science , 284(5423): 2124–2128. doi:10.1126/science.284.5423.2124
  • Dunn, Casey W., Gonzalo Giribet, Gregory D. Edgecombe, and Andreas Hejnol, 2014, “Animal Phylogeny and Its Evolutionary Implications”, Annual Review of Ecology, Evolution, and Systematics , 45: 371–395. doi:10.1146/annurev-ecolsys-120213-091627
  • Duret, Laurent, 2008, “Neutral Theory: The Null Hypothesis of Molecular Evolution”, Nature Education , 1(1): 218.
  • Editors, 2016, “Editorial”, Cladistics , 32(1): 1. doi:10.1111/cla.12148
  • Edwards, Anthony W. F., 1970, “Estimation of the Branch Points of a Branching Diffusion Process”, Journal of the Royal Statistical Society: Series B (Methodological) , 32(2): 155–164. doi:10.1111/j.2517-6161.1970.tb00828.x
  • –––, 1972, Likelihood: An Account of the Statistical Concept of Likelihood and Its Application to Scientific Inference , Cambridge: Cambridge University Press.
  • –––, 1996, “The Origin and Early Development of the Method of Minimum Evolution for the Reconstruction of Phylogenetic Trees”, Systematic Biology , 45(1): 79–91. doi:10.1093/sysbio/45.1.79
  • Edwards, Scott V., 2009, “Is a New and General Theory of Molecular Systematics Emerging?”, Evolution , 63(1): 1–19. doi:10.1111/j.1558-5646.2008.00549.x
  • Eldredge, Niles and Joel Cracraft, 1980, Phylogenetic Patterns and the Evolutionary Process , New York: Columbia University Press.
  • Ereshefsky, Marc, 1991, “Species, Higher Taxa, and the Units of Evolution”, Philosophy of Science , 58(1): 84–101. doi:10.1086/289600
  • –––, 2001, The Poverty of the Linnaean Hierarchy , Cambridge: Cambridge University Press. doi:10.1017/CBO9780511498459
  • –––, 2012, “Homology Thinking”, Biology and Philosophy , 27(3): 381–400. doi:10.1007/s10539-012-9313-7
  • Fagan, Melinda Bonnie, 2013, Philosophy of Stem Cell Biology: Knowledge in Flesh and Blood , New York: Palgrave Macmillan. doi:10.1057/9781137296023
  • Farris, James S., 1983, “The Logical Basis of Phylogenetic Analysis”, in Advances in Cladistics: Proceedings of the Third Meeting of the Willi Hennig Society , Vol. 2, N. I. Platnick & V. A. Funk (eds.), New York: Columbia University Press, pp. 7–36.
  • –––, 1999, “Likelihood and Inconsistency”, Cladistics , 15(2): 199–204. doi:10.1111/j.1096-0031.1999.tb00262.x
  • Felsenstein, Joseph, 1973, “Maximum Likelihood and Minimum-Steps Methods for Estimating Evolutionary Trees from Data on Discrete Characters”, Systematic Zoology , 22(3): 240–249. doi:10.1093/sysbio/22.3.240
  • –––, 1978, “Cases in Which Parsimony or Compatibility Methods Will Be Positively Misleading”, Systematic Zoology , 27(4): 401–410. doi:10.2307/2412923
  • –––, 1981, “Evolutionary Trees from DNA Sequences: A Maximum Likelihood Approach”, Journal of Molecular Evolution , 17(6): 368–376. doi:10.1007/BF01734359
  • –––, 1983, “Parsimony in Systematics: Biological and Statistical Issues”, Annual Review of Ecology and Systematics , 14(1): 313–333. doi:10.1146/annurev.es.14.110183.001525
  • –––, 1988, “Phylogenies from Molecular Sequences: Inference and Reliability”, Annual Review of Genetics , 22(1): 521–565. doi:10.1146/annurev.ge.22.120188.002513
  • –––, 2001, “The Troubled Growth of Statistical Phylogenetics”, Systematic Biology , 50(4): 465–467. doi:10.1080/10635150119297
  • –––, 2004, Inferring Phylogenies , volume 2, Sunderland, MA: Sinauer Associates.
  • Felsenstein, Joseph and Elliott Sober, 1986, “Parsimony and Likelihood: An Exchange”, Systematic Zoology , 35(4): 617–626. doi:10.2307/2413121
  • Fisher, Ronald A., 1935 [1950], Statistical Methods for Research Workers , Edinburgh: Oliver and Boyd. Eleventh edition 1950.
  • –––, 1956, Statistical Methods and Scientific Inference , first edition, Edinburgh: Oliver and Boyd.
  • Fitch, Walter M. and Emanuel Margoliash, 1967, “Construction of Phylogenetic Trees: A Method Based on Mutation Distances as Estimated from Cytochrome c Sequences Is of General Applicability”, Science , 155(3760): 279–284. doi:10.1126/science.155.3760.279
  • Fitzhugh, Kirk, 2008, “Abductive Inference: Implications for ‘Linnean’ and ‘Phylogenetic’ Approaches for Representing Biological Systematization”, Evolutionary Biology , 35(1): 52–82. doi:10.1007/s11692-008-9015-x
  • Franklin, L. R., 2007, “Bacteria, Sex, and Systematics”, Philosophy of Science , 74(1): 69–95. doi:10.1086/519476
  • Freudenstein, John V., 2005, “Characters, States and Homology”, Systematic Biology , 54(6): 965–973. doi:10.1080/10635150500354654
  • Gaffney, Eugene S., 1979, “An Introduction to the Logic of Phylogeny Reconstruction”, in Phylogenetic Analysis and Paleontology , Joel Cracraft and Niles Eldredge (eds), New York: Columbia University Press, pp. 79–111.
  • Gardiner, B. G., P. Janvier, C. Patterson, P. L. Forey, P. H. Greenwood, R. S. Miles, and R. P. S. Jefferies, 1979, “The Salmon, the Lungfish and the Cow: A Reply”, Nature , 277(5693): 175–176. doi:10.1038/277175b0
  • Ghiselin, Michael T., 1966, “On Psychologism in the Logic of Taxonomic Controversies”, Systematic Zoology , 15(3): 207–215. doi:10.2307/2411392
  • –––, 1974, “A Radical Solution to the Species Problem”, Systematic Zoology , 23(4): 536–544. doi:10.2307/2412471
  • Giere, Ronald N., 1979 [1997], Understanding Scientific Reasoning , New York: Holt, Rinehart, and Winston. Fourth edition, Fort Worth, TX: Harcourt Brace College Publishers, 1997.
  • Goodman, Nelson, 1972, “Seven Strictures on Similarity”, in his Problems and Projects , Indianapolis, IN: The Bobbs-Merrill Company, pp. 437–446.
  • Graham, R.L. and L.R. Foulds, 1982, “Unlikelihood That Minimal Phylogenies for a Realistic Biological Study Can Be Constructed in Reasonable Computational Time”, Mathematical Biosciences , 60(2): 133–142. doi:10.1016/0025-5564(82)90125-0
  • Grantham, Todd, 2004, “The Role of Fossils in Phylogeny Reconstruction: Why Is It so Difficult to Integrate Paleobiological and Neontological Evolutionary Biology?”, Biology and Philosophy , 19(5): 687–720. doi:10.1007/s10539-005-0370-z
  • Gray, Russell D. and Quentin D. Atkinson, 2003, “Language-Tree Divergence Times Support the Anatolian Theory of Indo-European Origin”, Nature , 426(6965): 435–439. doi:10.1038/nature02029
  • Gray, Russell D., A. J. Drummond, and S. J. Greenhill, 2009, “Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement”, Science , 323(5913): 479–483. doi:10.1126/science.1166858
  • Gray, Russell D. and Fiona M. Jordan, 2000, “Language Trees Support the Express-Train Sequence of Austronesian Expansion”, Nature , 405(6790): 1052–1055. doi:10.1038/35016575
  • Griesemer, James, 2000, “Development, Culture, and the Units of Inheritance”, Philosophy of Science , 67(S1): S348–S368. doi:10.1086/392831
  • Griffiths, Graham C. D., 1974, “On the Foundations of Biological Systematics”, Acta Biotheoretica , 23(3–4): 85–131. doi:10.1007/BF01556343
  • –––, 1976, “The Future of Linnaean Nomenclature”, Systematic Zoology , 25(2): 168–173. doi:10.2307/2412743
  • Haber, Matthew H., 2005, “On Probability and Systematics: Possibility, Probability, and Phylogenetic Inference”, Systematic Biology , 54(5): 831–841. doi:10.1080/106351591007444
  • –––, 2009, “Phylogenetic Inference”, in A Companion to the Philosophy of History and Historiography , Aviezer Tucker (ed.), (Blackwell Companions to Philosophy 41), Oxford, UK: Wiley-Blackwell, 231–242. doi:10.1002/9781444304916.ch20
  • –––, 2012, “Multilevel Lineages and Multidimensional Trees: The Levels of Lineage and Phylogeny Reconstruction”, Philosophy of Science , 79(5): 609–623. doi:10.1086/667849
  • –––, 2016a, “The Individuality Thesis (3 Ways)”, Biology & Philosophy , 31(6): 913–930. doi:10.1007/s10539-016-9548-9
  • –––, 2019, “Species in the Age of Discordance”, Philosophy, Theory, and Practice in Biology , 11: art. 21. doi:10.3998/ptpbio.16039257.0011.021
  • Haber, Matthew H. and Daniel J. Molter, 2019, “Species in the Age of Discordance: Meeting Report and Introduction”, Philosophy, Theory, and Practice in Biology , 11: art. 12. doi:10.3998/ptpbio.16039257.0011.012
  • Hagen, Joel B., 2001, “The Introduction of Computers into Systematic Research in the United States during the 1960s”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 32(2): 291–314. doi:10.1016/S1369-8486(01)00005-X
  • Hahn, Matthew W. and Luay Nakhleh, 2016, “Irrational Exuberance for Resolved Species Trees”, Evolution , 70(1): 7–17. doi:10.1111/evo.12832
  • Hall, Brian K., 1994, Homology: The Hierarchical Basis of Comparative Biology , San Diego, CA: Academic Press.
  • Halstead, L. B., 1978, “The Cladistic Revolution–Can It Make the Grade?”, Nature , 276(5690): 759–760. doi:10.1038/276759a0
  • Harrison, Richard G., 1998, “Linking Evolutionary Pattern and Process”, in Howard and Berlocher 1998: 19–31.
  • Hasegawa, Masami, Hirohisa Kishino, and Taka-aki Yano, 1985, “Dating of the Human-Ape Splitting by a Molecular Clock of Mitochondrial DNA”, Journal of Molecular Evolution , 22(2): 160–174. doi:10.1007/BF02101694
  • Havstad, Joyce C. and N. Adam Smith, 2019, “Fossils with Feathers and Philosophy of Science”, Systematic Biology , 68(5): 840–851. doi:10.1093/sysbio/syz010
  • Heckman, Daniel S., David M. Geiser, Brooke R. Eidell, Rebecca L. Stauffer, Natalie L. Kardos, and S. Blair Hedges, 2001, “Molecular Evidence for the Early Colonization of Land by Fungi and Plants”, Science , 293(5532): 1129–1133. doi:10.1126/science.1061457
  • Hennig, Willi, 1966, Phylogenetic Systematics , D. Dwight Davis and Rainer Zangerl (trans.), Urbana, IL: University of Illinois Press; this is a revised and translated version of the German original, Grundzüge einer Theorie der phylogenetischen Systematik , Berlin: Deutscher Zentralverlag, 1950.
  • –––, 1975, “‘Cladistic Analysis or Cladistic Classification’: A Reply to Ernst Mayr”, Systematic Zoology , 24(2): 244–256. doi:10.1093/sysbio/24.2.244
  • Hillis, David M. and James J. Bull, 1993, “An Empirical Test of Bootstrapping as a Method for Assessing Confidence in Phylogenetic Analysis”, Systematic Biology , 42(2): 182–192. doi:10.1093/sysbio/42.2.182
  • Hillis, David M., James J. Bull, Mary E. White, Marty R. Badgett, and Ian J. Molineux, 1992, “Experimental Phylogenetics: Generation of a Known Phylogeny”, Science , 255(5044): 589–592. doi:10.1126/science.1736360
  • Hillis, David M., John P. Huelsenbeck, and Clifford W. Cunningham, 1994, “Application and Accuracy of Molecular Phylogenies”, Science , 264(5159): 671–677. doi:10.1126/science.8171318
  • Hinchliff, Cody E., Stephen A. Smith, James F. Allman, J. Gordon Burleigh, Ruchi Chaudhary, Lyndon M. Coghill, Keith A. Crandall, Jiabin Deng, Bryan T. Drew, Romina Gazis, Karl Gude, David S. Hibbett, Laura A. Katz, H. Dail Laughinghouse, Emily Jane McTavish, Peter E. Midford, Christopher L. Owen, Richard H. Ree, Jonathan A. Rees, Douglas E. Soltis, et al., 2015, “Synthesis of Phylogeny and Taxonomy into a Comprehensive Tree of Life”, Proceedings of the National Academy of Sciences , 112(41): 12764–12769. doi:10.1073/pnas.1423041112
  • Hoch, Peter C. and A. G. Stephenson (eds), 1995, Experimental and Molecular Approaches to Plant Biosystematics , St. Louis, MO: Proceedings of the Fifth International Symposium of the International Organization of Plant Biosystematists; Missouri Botanical Garden Press.
  • Howard, Daniel J. and Stewart H. Berlocher (eds.), 1998, Endless Forms: Species and Speciation , New York: Oxford University Press.
  • Hudson, Richard R., 1983, “Properties of a Neutral Allele Model with Intragenic Recombination”, Theoretical Population Biology , 23(2): 183–201. doi:10.1016/0040-5809(83)90013-8
  • Huelsenbeck, John P., 1997, “Is the Felsenstein Zone a Fly Trap?”, Systematic Biology , 46(1): 69–74. doi:10.1093/sysbio/46.1.69
  • Huelsenbeck, John P., Cécile Ané, Bret Larget, and Fredrik Ronquist, 2008, “A Bayesian Perspective on a Non-Parsimonious Parsimony Model”, Systematic Biology , 57(3): 406–419. doi:10.1080/10635150802166046
  • Hug, Laura A., Brett J. Baker, Karthik Anantharaman, Christopher T. Brown, Alexander J. Probst, Cindy J. Castelle, Cristina N. Butterfield, Alex W. Hernsdorf, Yuki Amano, Kotaro Ise, Yohey Suzuki, Natasha Dudek, David A. Relman, Kari M. Finstad, Ronald Amundson, Brian C. Thomas, and Jillian F. Banfield, 2016, “A New View of the Tree of Life”, Nature Microbiology , 1(5): 16048. doi:10.1038/nmicrobiol.2016.48
  • Hull, David L., 1970, “Contemporary Systematic Philosophies”, Annual Review of Ecology and Systematics , 1(1): 19–54. doi:10.1146/annurev.es.01.110170.000315
  • –––, 1976, “Are Species Really Individuals?”, Systematic Zoology , 25(2): 174–191. doi:10.2307/2412744
  • –––, 1978, “A Matter of Individuality”, Philosophy of Science , 45(3): 335–360. doi:10.1086/288811
  • –––, 1983, “Karl Popper and Plato’s Metaphor”, in Advances in Cladistics , Vol. 2, N. I. Platnick & V. A. Funk (eds.), New York: Columbia University Press, pp. 177–189.
  • –––, 1988, Science as a Process: An Evolutionary Account of the Social and Conceptual Development of Science , Chicago, IL: The University of Chicago Press.
  • –––, 1999, “The Use and Abuse of Sir Karl Popper”, Biology and Philosophy , 14(4): 481–504. doi:10.1023/A:1006554919188
  • Huxley, Julian (ed.), 1940, The New Systematics , Oxford: Clarendon Press.
  • Jukes, Thomas H. and Charles R. Cantor, 1969, “Evolution of Protein Molecules”, in Mammalian Protein Metabolism , volume 3, New York: Academic Press, 21–132. doi:10.1016/B978-1-4832-3211-9.50009-7
  • Kearney, Maureen, 2007, “Philosophy and Phylogenetics: Historical and Current Connections”, in The Cambridge Companion to the Philosophy of Biology , David L. Hull and Michael Ruse (eds.), Cambridge: Cambridge University Press, 211–232. doi:10.1017/CCOL9780521851282.011
  • Kendig, Catherine, 2016, “Homologizing as Kinding”, in Natural Kinds and Classification in Scientific Practice , Catherine Kendig (ed.), London: Routledge, 106–125.
  • Kenrick, Paul and Peter R. Crane, 1997, “The Origin and Early Evolution of Plants on Land”, Nature , 389(6646): 33–39. doi:10.1038/37918
  • Kimura, Motoo, 1968, “Evolutionary Rate at the Molecular Level”, Nature , 217(5129): 624–626. doi:10.1038/217624a0
  • –––, 1969, “The Rate of Molecular Evolution Considered from the Standpoint of Population Genetics”, Proceedings of the National Academy of Sciences , 63(4): 1181–1188. doi:10.1073/pnas.63.4.1181
  • –––, 1980, “A Simple Method for Estimating Evolutionary Rates of Base Substitutions through Comparative Studies of Nucleotide Sequences”, Journal of Molecular Evolution , 16(2): 111–120. doi:10.1007/BF01731581
  • Kingman, J. F. C., 1982, “On the Genealogy of Large Populations”, Journal of Applied Probability , 19: 27–43. doi:10.2307/3213548
  • –––, 2000, “Origins of the Coalescent: 1974–1982”, Genetics , 156(4): 1461–1463.
  • Kluge, Arnold G., 2001, “Philosophical Conjectures and Their Refutation”, Systematic Biology , 50(3): 322–330. doi:10.1080/10635150119615
  • –––, 2005, “What Is the Rationale for ‘Ockham’s Razor’ (aka Parsimony) in Phylogenetic Inference”, in Albert 2005: 15–42.
  • Kluge, Arnold G. and James S. Farris, 1999, “Taxic Homology = Overall Similarity”, Cladistics , 15(2): 205–212. doi:10.1111/j.1096-0031.1999.tb00263.x
  • Knowles, L. Lacey and Bryan C. Carstens, 2007, “Delimiting Species without Monophyletic Gene Trees”, Systematic Biology , 56(6): 887–895. doi:10.1080/10635150701701091
  • Larget, Bret and Donald L. Simon, 1999, “Markov Chain Monte Carlo Algorithms for the Bayesian Analysis of Phylogenetic Trees”, Molecular Biology and Evolution , 16(6): 750–759. doi:10.1093/oxfordjournals.molbev.a026160
  • Lee, Michael S.Y., 2002, “Divergent Evolution, Hierarchy, and Cladistics”, Zoologica Scripta , 31(2): 217–219. doi:10.1046/j.1463-6409.2002.00101.x
  • Lemey, Philippe, Marco Salemi, and Anne-Mieke Vandamme (eds.), 2009, The Phylogenetic Handbook: A Practical Approach to Phylogenetic Analysis and Hypothesis Testing , second edition, Cambridge: Cambridge University Press. doi:10.1017/CBO9780511819049
  • Lemieux, Jacob E., Katherine J. Siddle, Bennett M. Shaw, Christine Loreth, Stephen F. Schaffner, Adrianne Gladden-Young, Gordon Adams, Timelia Fink, Christopher H. Tomkins-Tinch, Lydia A. Krasilnikova, Katherine C. DeRuff, Melissa Rudy, Matthew R. Bauer, Kim A. Lagerborg, Erica Normandin, Sinéad B. Chapman, Steven K. Reilly, Melis N. Anahtar, Aaron E. Lin, Amber Carter, et al., 2021, “Phylogenetic Analysis of SARS-CoV-2 in Boston Highlights the Impact of Superspreading Events”, Science , 371(6529): 574–575. doi:10.1126/science.abe3261
  • Lewens, Tim, 2012, “Pheneticism Reconsidered”, Biology and Philosophy , 27(2), Springer Netherlands: 159–177. doi:10.1007/s10539-011-9302-2
  • Li, Shuying, Dennis K. Pearl, and Hani Doss, 2000, “Phylogenetic Tree Construction Using Markov Chain Monte Carlo”, Journal of the American Statistical Association , 95(450): 493–508. doi:10.1080/01621459.2000.10474227
  • Lipton, Peter, 2004, Inference to the Best Explanation , second edition, London: Routledge. doi:10.4324/9780203470855
  • Longino, Helen E., 1990, Science as Social Knowledge: Values and Objectivity in Scientific Inquiry , Princeton, NJ: Princeton University Press.
  • –––, 2019, “The Social Dimensions of Scientific Knowledge”, in The Stanford Encyclopedia of Philosophy (Summer 2019), Edward N. Zalta (ed.). URL= < https://plato.stanford.edu/archives/sum2019/entries/scientific-knowledge-social/ >
  • Love, Alan C., 2014, “The Erotetic Organization of Developmental Biology”, in Towards a Theory of Development , Alessandro Minelli and Thomas Pradeu (eds.), Oxford: Oxford University Press, 33–55. doi:10.1093/acprof:oso/9780199671427.003.0003
  • Maddison, Wayne P., 1995, “Phylogenetic Histories Within and Among Species”, in Hoch and Stephenson 1995: 273–287.
  • –––, 1997, “Gene Trees in Species Trees”, Systematic Biology , 46(3): 523–536. doi:10.1093/sysbio/46.3.523
  • Maddison, Wayne P. and L. Lacey Knowles, 2006, “Inferring Phylogeny Despite Incomplete Lineage Sorting”, Systematic Biology , 55(1): 21–30. doi:10.1080/10635150500354928
  • Mallet, James, 2007, “Hybrid Speciation”, Nature , 446(7133): 279–283. doi:10.1038/nature05706
  • Mallet, James, Nora Besansky, and Matthew W. Hahn, 2016, “How Reticulated Are Species?”, BioEssays , 38(2): 140–149. doi:10.1002/bies.201500149
  • Margoliash, Emanuel, 1963, “Primary Structure and Evolution of Cytochrome c ”, Proceedings of the National Academy of Sciences , 50(4): 672–679. doi:10.1073/pnas.50.4.672
  • Martin, William, 1999, “Mosaic Bacterial Chromosomes: A Challenge En Route to a Tree of Genomes”, BioEssays , 21(2): 99–104. doi:10.1002/(SICI)1521-1878(199902)21:2<99::AID-BIES3>3.0.CO;2-B
  • Mau, Bob and Michael A. Newton, 1997, “Phylogenetic Inference for Binary Data on Dendograms Using Markov Chain Monte Carlo”, Journal of Computational and Graphical Statistics , 6(1): 122–131. doi:10.1080/10618600.1997.10474731
  • Mayr, Ernst, 1965a, “Classification and Phylogeny”, American Zoologist , 5(1): 165–174. doi:10.1093/icb/5.1.165
  • –––, 1965b, “Numerical Phenetics and Taxonomic Theory”, Systematic Zoology , 14(2): 73–97. doi:10.2307/2411730
  • –––, 1969, Principles of Systematic Zoology , New York: McGraw-Hill.
  • –––, 1974, “Cladistic Analysis or Cladistic Classification?”, Journal of Zoological Systematics and Evolutionary Research , 12(1): 94–128. doi:10.1111/j.1439-0469.1974.tb00160.x
  • McMullin, Ernan, 1985, “Galilean Idealization”, Studies in History and Philosophy of Science Part A , 16(3): 247–273. doi:10.1016/0039-3681(85)90003-2
  • Mishler, Brent D., 2005, “The Logic of the Data Matrix in Phylogenetic Analysis”, in Albert 2005: 57–70.
  • Morgan, Gregory J., 1998, “Emile Zuckerkandl, Linus Pauling, and the Molecular Evolutionary Clock, 1959-1965”, Journal of the History of Biology , 31(2): 155–178. doi:10.1023/A:1004394418084
  • Nakhleh, Luay, 2013, “Computational Approaches to Species Phylogeny Inference and Gene Tree Reconciliation”, Trends in Ecology & Evolution , 28(12): 719–728. doi:10.1016/j.tree.2013.09.004
  • Neff, Nancy A., 1986, “A Rational Basis for a Priori Character Weighting”, Systematic Zoology , 35(1): 110–123. doi:10.1093/sysbio/35.1.110
  • Neyman, Jerzy, 1952, Lectures and Conferences on Mathematical Statistics and Probability , second edition, Washington, DC: Graduate School, United States Department of Agriculture.
  • –––, 1971, “Molecular Studies of Evolution: A Source of Novel Statistical Problems”, in Statistical Decision Theory and Related Topics , Shanti S. Gupta and James Yackel (eds), New York: Academic Press, 1–27.
  • O’Hara, Robert J., 1988, “Homage to Clio, or, Toward an Historical Philosophy for Evolutionary Biology”, Systematic Zoology , 37(2): 142–155. doi:10.2307/2992272
  • –––, 1993, “Systematic Generalization, Historical Fate, and the Species Problem”, Systematic Biology , 42(3): 231–246. doi:10.1093/sysbio/42.3.231
  • O’Malley, Maureen A., William Martin, and John Dupré, 2010, “The Tree of Life: Introduction to an Evolutionary Debate”, Biology and Philosophy , 25(4): 441–453. doi:10.1007/s10539-010-9208-4
  • Owen, Richard, 1843, Lectures on the Comparative Anatomy and Physiology of the Invertebrate Animals , London: Longman, Brown, Greene, and Longmans.
  • Padian, Kevin and John R. Horner, 2002, “Typology versus Transformation in the Origin of Birds”, Trends in Ecology & Evolution , 17(3): 120–124. doi:10.1016/S0169-5347(01)02409-0
  • Patterson, Colin, 1982, “Classes and Cladists or Individuals and Evolution”, Systematic Biology , 31(3): 284–286. doi:10.1093/sysbio/31.3.284
  • Pauling, Linus and Emile Zuckerkandl, 1963, “Chemical Paleogenetics. Molecular ‘Restoration Studies’ of Extinct Forms of Life”, Acta Chemica Scandinavica , 17 supl.: 9–16. doi:10.3891/acta.chem.scand.17s-0009
  • Pearson, Christopher H., 2010, “Pattern Cladism, Homology, and Theory-Neutrality”, History and Philosophy of the Life Sciences , 32(4): 475–492.
  • Pease, James B., David C. Haak, Matthew W. Hahn, and Leonie C. Moyle, 2016, “Phylogenomics Reveals Three Sources of Adaptive Variation during a Rapid Radiation”, PLOS Biology , 14(2): 1–24. doi:10.1371/journal.pbio.1002379
  • Pickett, K. M. and C. P. Randle, 2005, “Strange Bayes Indeed: Uniform Topological Priors Imply Non-Uniform Clade Priors”, Molecular Phylogenetics and Evolution , 34(1): 203–211. doi:10.1016/j.ympev.2004.09.001
  • Pinna, Mário C. C. de, 1991, “Concepts and Tests of Homology in the Cladistic Paradigm”, Cladistics , 7(4): 367–394. doi:10.1111/j.1096-0031.1991.tb00045.x
  • Platnick, Norman I., 1982, “Defining Characters and Evolutionary Groups”, Systematic Biology , 31(3): 282–282. doi:10.1093/sysbio/31.3.282
  • Posada, David, 2008, “JModelTest: Phylogenetic Model Averaging”, Molecular Biology and Evolution , 25(7): 1253–1256. doi:10.1093/molbev/msn083
  • Posada, David and Keith A. Crandall, 1998, “MODELTEST: Testing the Model of DNA Substitution”, Bioinformatics , 14(9): 817–818. doi:10.1093/bioinformatics/14.9.817
  • Quinn, Aleta, 2016, “Phylogenetic Inference to the Best Explanation and the Bad Lot Argument”, Synthese , 193(9): 3025–3039. doi:10.1007/s11229-015-0908-9
  • –––, 2017, “When Is a Cladist Not a Cladist?”, Biology & Philosophy , 32(4): 581–598. doi:10.1007/s10539-017-9577-z
  • –––, 2019, “Diagnosing Discordance: Signal in Data, Conflict in Paradigms”, Philosophy, Theory, and Practice in Biology , 11(July): art. 017. doi:10.3998/ptpbio.16039257.0011.017
  • Rannala, Bruce and Ziheng Yang, 1996, “Probability Distribution of Molecular Evolutionary Trees: A New Method of Phylogenetic Inference”, Journal of Molecular Evolution , 43(3): 304–311. doi:10.1007/BF02338839
  • –––, 2003, “Bayes Estimation of Species Divergence Times and Ancestral Population Sizes Using DNA Sequences from Multiple Loci”, Genetics , 164(4): 1645–1656. doi:10.1093/genetics/164.4.1645
  • Redelings, Benjamin D. and Marc A. Suchard, 2005, “Joint Bayesian Estimation of Alignment and Phylogeny”, Systematic Biology , 54(3): 401–418. doi:10.1080/10635150590947041
  • Richards, Richard, 2002, “Kuhnian Values and Cladistic Parsimony”, Perspectives on Science , 10(1): 1–27. doi:10.1162/106361402762674780
  • –––, 2003, “Character Individuation in Phylogenetic Inference”, Philosophy of Science , 70(2): 264–279. doi:10.1086/375467
  • Rieppel, Olivier, 2008, “Re-Writing Popper’s Philosophy of Science for Systematics”, History and Philosophy of the Life Sciences , 30(3–4): 293–316.
  • Rieppel, Olivier and Maureen Kearney, 2002, “Similarity”, Biological Journal of the Linnean Society , 75(1): 59–82. doi:10.1046/j.1095-8312.2002.00006.x
  • –––, 2007, “The Poverty of Taxonomic Characters”, Biology & Philosophy , 22(1): 95–113. doi:10.1007/s10539-006-9024-z
  • Roffé, Ariel Jonathan, 2020, “Dynamic Homology and Circularity in Cladistic Analysis”, Biology & Philosophy , 35(1): art. 21. doi:10.1007/s10539-020-9737-4
  • Rogers, James S., 2001, “Maximum Likelihood Estimation of Phylogenetic Trees Is Consistent When Substitution Rates Vary According to the Invariable Sites plus Gamma Distribution”, Systematic Biology , 50(5): 713–722. doi:10.1080/106351501753328839
  • Ronquist, Fredrik, Paul van der Mark, and John P. Huelsenbeck, 2009, “Bayesian Phylogenetic Analysis Using MRBAYES”, in Lemey, Salemi, and Vandamme 2009: 210–266. doi:10.1017/CBO9780511819049.009
  • Salipante, Stephen J. and Marshall S. Horwitz, 2006, “Phylogenetic Fate Mapping”, Proceedings of the National Academy of Sciences , 103(14): 5448–5453. doi:10.1073/pnas.0601265103
  • Santis, Marcelo Domingos de, 2020, “Popper as a process: revisiting the appropriation of the Popperian philosophy by the cladists during the ‘systematics wars’”, Arquivos De Zoologia , 51(2): 13–20. doi:10.11606/2176-7793/2020.51.02
  • Schumer, Molly, Gil G. Rosenthal, and Peter Andolfatto, 2014, “How Common Is Homoploid Hybrid Speciation?”, Evolution , 68(6): 1553–1560. doi:10.1111/evo.12399
  • Scotland, Robert and R. Toby Pennington (eds.), 2000, Homology and Systematics: Coding Characters for Phylogenetic Analysis , London: CRC Press. doi:10.1201/9781482268249
  • Scotland, Robert W., Richard G. Olmstead, and Jonathan R. Bennett, 2003, “Phylogeny Reconstruction: The Role of Morphology”, Systematic Biology , 52(4): 539–548. doi:10.1080/10635150309309
  • Sereno, Paul C., 2005, “The Logical Basis of Phylogenetic Taxonomy”, Systematic Biology , 54(4): 595–619. doi:10.1080/106351591007453
  • Siddall, Mark E., 1998, “Success of Parsimony in the Four-Taxon Case: Long-Branch Repulsion by Likelihood in the Farris Zone”, Cladistics , 14(3): 209–220. doi:10.1111/j.1096-0031.1998.tb00334.x
  • Siddall, Mark E. and Arnold G. Kluge, 1997, “Probabilism and Phylogenetic Inference”, Cladistics , 13(4): 313–336. doi:10.1111/j.1096-0031.1997.tb00322.x
  • Siddall, Mark E. and Michael F. Whiting, 1999, “Long-Branch Abstractions”, Cladistics , 15(1): 9–24. doi:10.1111/j.1096-0031.1999.tb00391.x
  • Simpson, George Gaylord, 1961, Principles of Animal Taxonomy , New York: Columbia University Press.
  • Slater, Matthew H., 2013, Are Species Real? An Essay on the Metaphysics of Species , (New Directions in the Philosophy of Science), New York: Palgrave Macmillan. doi:10.1057/9780230393233
  • –––, 2015, “Natural Kindness”, The British Journal for the Philosophy of Science , 66(2): 375–411. doi:10.1093/bjps/axt033
  • Sneath, Peter H. A. and Robert R. Sokal, 1973, Numerical Taxonomy. The Principles and Practice of Numerical Classification , San Francisco, CA: W. H. Freeman.
  • Sober, Elliott, 1980, “Evolution, Population Thinking, and Essentialism”, Philosophy of Science , 47(3): 350–383. doi:10.1086/288942
  • –––, 1985, “A Likelihood Justification of Parsimony”, Cladistics , 1(3): 209–233. doi:10.1111/j.1096-0031.1985.tb00424.x
  • –––, 1988a, “Likelihood and Convergence”, Philosophy of Science , 55(2): 228–237. doi:10.1086/289429
  • –––, 1988b, Reconstructing the Past: Parsimony, Evolution, and Inference , Cambridge, MA: MIT Press.
  • –––, 2004, “The Contest Between Parsimony and Likelihood”, Systematic Biology , 53(4): 644–653. doi:10.1080/10635150490468657
  • –––, 2009, “Did Darwin Write the Origin Backwards?”, Proceedings of the National Academy of Sciences , 106(Supplement 1): 10048–10055. doi:10.1073/pnas.0901109106
  • –––, 2010, Did Darwin Write the “Origin” Backwards? Philosophical Essays on Darwin’s Theory , Amherst, NY: Prometheus Books.
  • –––, 2015, Ockham’s Razors: A User’s Manual , Cambridge: Cambridge University Press. doi:10.1017/CBO9781107705937
  • Sokal, Robert R. and Peter H. A. Sneath, 1963, Principles of Numerical Taxonomy , San Francisco, CA: W. H. Freeman.
  • Steel, Mike and David Penny, 2000, “Parsimony, Likelihood, and the Role of Models in Molecular Phylogenetics”, Molecular Biology and Evolution , 17(6): 839–850. doi:10.1093/oxfordjournals.molbev.a026364
  • Sterelny, Kim and Paul Griffiths, 1999, Sex and Death: An Introduction to Philosophy of Biology , (Science and Its Conceptual Foundations), Chicago, IL: University of Chicago Press.
  • Sterner, Beckett and Scott Lidgard, 2014, “The Normative Structure of Mathematization in Systematic Biology”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 46: 44–54. doi:10.1016/j.shpsc.2014.03.001
  • –––, 2018, “Moving Past the Systematics Wars”, Journal of the History of Biology , 51(1): 31–67. doi:10.1007/s10739-017-9471-1
  • Stevens, Peter F., 2000, “On Characters and Character States: Do Overlapping and Non-Overlapping Variation, Morphology and Molecules All Yield Data of the Same Value?”, in Scotland and Pennington 2000: 81–105.
  • Sukumaran, Jeet and L. Lacey Knowles, 2017, “Multispecies Coalescent Delimits Structure, Not Species”, Proceedings of the National Academy of Sciences , 114(7): 1607–1612. doi:10.1073/pnas.1607921114
  • Sullivan, Jack and Paul Joyce, 2005, “Model Selection in Phylogenetics”, Annual Review of Ecology, Evolution, and Systematics , 36: 445–466. doi:10.1146/annurev.ecolsys.36.102003.152633
  • Swofford, David L. and Jack Sullivan, 2009, “Phylogeny Inference Based on Parsimony and Other Methods Using PAUP”, in Lemey, Salemi, and Vandamme 2009: 267–312.
  • Swofford, David L., Peter J. Waddell, John P. Huelsenbeck, Peter G. Foster, Paul O. Lewis, and James S. Rogers, 2001, “Bias in Phylogenetic Estimation and Its Relevance to the Choice between Parsimony and Likelihood Methods”, Systematic Biology , 50(4): 525–539. doi:10.1080/106351501750435086
  • Tajima, Fumio, 1983, “Evolutionary Relationship of DNA Sequences in Finite Populations”, Genetics , 105(2): 437–460. doi:10.1093/genetics/105.2.437
  • Takezaki, Naoko and Hidenori Nishihara, 2017, “Support for Lungfish as the Closest Relative of Tetrapods by Using Slowly Evolving Ray-Finned Fish as the Outgroup”, Genome Biology and Evolution , 9(1): 93–101. doi:10.1093/gbe/evw288
  • Tamura, K. and M. Nei, 1993, “Estimation of the Number of Nucleotide Substitutions in the Control Region of Mitochondrial DNA in Humans and Chimpanzees”, Molecular Biology and Evolution , 10(3): 512–526. doi:10.1093/oxfordjournals.molbev.a040023
  • Thomson, Keith Stewart, 1991, Living Fossil: The Story of the Coelacanth , New York: W.W. Norton.
  • Tuffley, Chris and Mike Steel, 1997, “Links between Maximum Likelihood and Maximum Parsimony under a Simple Model of Site Substitution”, Bulletin of Mathematical Biology , 59(3): 581–607. doi:10.1007/BF02459467
  • Turner, Derek, 2007, Making Prehistory: Historical Science and the Scientific Realism Debate , Cambridge: Cambridge University Press. doi:10.1017/CBO9780511487385
  • van Fraassen, Bas C., 1989, Laws and Symmetry , Oxford: Oxford University Press. doi:10.1093/0198248601.001.0001
  • Vassend, Olav B., 2020, “A Verisimilitude Framework for Inductive Inference, with an Application to Phylogenetics”, The British Journal for the Philosophy of Science , 71(4): 1359–1383. doi:10.1093/bjps/axy054
  • Velasco, Joel D., 2007, “Why Non-Uniform Priors on Clades Are Both Unavoidable and Unobjectionable”, Molecular Phylogenetics and Evolution , 45(2): 748–749. doi:10.1016/j.ympev.2007.08.003
  • –––, 2008a, “Species Concepts Should Not Conflict with Evolutionary History, but Often Do”, Studies in History and Philosophy of Science Part C: Studies in History and Philosophy of Biological and Biomedical Sciences , 39(4): 407–414. doi:10.1016/j.shpsc.2008.09.007
  • –––, 2008b, “The Prior Probabilities of Phylogenetic Trees”, Biology & Philosophy , 23(4): 455–473. doi:10.1007/s10539-007-9105-7
  • –––, 2010, “Species, Genes, and the Tree of Life”, The British Journal for the Philosophy of Science , 61(3): 599–619. doi:10.1093/bjps/axp051
  • –––, 2012, “The Future of Systematics: Tree Thinking without the Tree”, Philosophy of Science , 79(5): 624–636. doi:10.1086/667878
  • –––, 2013, “The Tree of Life”, in The Cambridge Encyclopedia of Darwin and Evolutionary Thought , Michael Ruse (ed.), Cambridge: Cambridge University Press, 340–345. doi:10.1017/CBO9781139026895.043
  • –––, 2019, “The Foundations of Concordance Views of Phylogeny”, Philosophy, Theory, and Practice in Biology , 11(July): art. 020. doi:10.3998/ptpbio.16039257.0011.020
  • Wagner, Günter P. (ed.), 2000, The Character Concept in Evolutionary Biology , San Diego, CA: Academic Press. doi:10.1016/B978-0-12-730055-9.X5005-8
  • Wald, Abraham, 1949, “Note on the Consistency of the Maximum Likelihood Estimate”, The Annals of Mathematical Statistics , 20(4): 595–601. doi:10.1214/aoms/1177729952
  • Wang, Lusheng and Tao Jiang, 1994, “On the Complexity of Multiple Sequence Alignment”, Journal of Computational Biology , 1(4): 337–348. doi:10.1089/cmb.1994.1.337
  • Weisberg, Michael, 2007, “Three Kinds of Idealization”, Journal of Philosophy , 104(12): 639–659. doi:10.5840/jphil20071041240
  • Wheeler, Ward C., Lone Aagesen, Claudia P. Arango, Julián Faivovich, Taran Grant, Cyrille D’Haese, Daniel Janies, William Leo Smith, Andrés Varón, and Gonzalo Giribet, 2006, Dynamic Homology and Phylogenetic Systematics: A Unified Approach Using POY , Washington, DC: American Museum of Natural History.
  • Whiting, Michael F., 1998, “Long-Branch Distraction and the Strepsiptera”, Systematic Biology , 47(1): 134–137. doi:10.1080/106351598261076
  • Whittall, Justen B. and Scott A. Hodges, 2007, “Pollinator Shifts Drive Increasingly Long Nectar Spurs in Columbine Flowers”, Nature , 447(7145): 706–709. doi:10.1038/nature05857
  • Wiens, John J., 2004, “The Role of Morphological Data in Phylogeny Reconstruction”, Systematic Biology , 53(4): 653–661. doi:10.1080/10635150490472959
  • Wiley, E. O., 1975, “Karl R. Popper, Systematics, and Classification: A Reply to Walter Bock and Other Evolutionary Taxonomists”, Systematic Zoology , 24(2): 233–243. doi:10.1093/sysbio/24.2.233
  • –––, 1981, Phylogenetics: The Theory and Practice of Phylogenetic Systematics , New York: Wiley. Second edition is Wiley and Lieberman 2011.
  • Wiley, E. O. and Bruce S. Lieberman, 2011, Phylogenetics: Theory and Practice of Phylogenetic Systematics , second edition, Hoboken, NJ: John Wiley & Sons. See Wiley 1981 for first edition. doi:10.1002/9781118017883
  • Williams, David M. and Malte C. Ebach, 2018, “A Cladist Is a Systematist Who Seeks a Natural Classification: Some Comments on Quinn (2017)”, Biology & Philosophy , 33(1–2): art. 10. doi:10.1007/s10539-018-9621-7
  • Williams, David M. and Malte C. Ebach, 2020, Cladistics: A Guide to Biological Classification , third edition, Cambridge: Cambridge University Press. doi:10.1017/9781139047678
  • Wilson, Robert A., Matthew J. Barker, and Ingo Brigandt, 2007, “When Traditional Essentialism Fails: Biological Natural Kinds”, Philosophical Topics , 35(1–2): 189–215. doi:10.5840/philtopics2007351/29
  • Wimsatt, William C., 2007, Re-Engineering Philosophy for Limited Beings: Piecewise Approximations to Reality , Cambridge, MA: Harvard University Press.
  • Winsor, Mary Pickard, 1976, Starfish, Jellyfish, and the Order of Life: Issues in Nineteenth Century Science , New Haven, CT: Yale University Press.
  • –––, 1995, “The English Debate on Taxonomy and Phylogeny, 1937–1940”, History and Philosophy of the Life Sciences , 17(2): 227–252.
  • –––, 2009, “Taxonomy Was the Foundation of Darwin’s Evolution”, TAXON , 58(1): 43–49. doi:10.1002/tax.581007
  • Winther, Rasmus Grønfeldt, 2009, “Character Analysis in Cladistics: Abstraction, Reification, and the Search for Objectivity”, Acta Biotheoretica , 57(1–2): 129–162. doi:10.1007/s10441-008-9064-7
  • Woese, Carl R. and George E. Fox, 1977, “Phylogenetic Structure of the Prokaryotic Domain: The Primary Kingdoms”, Proceedings of the National Academy of Sciences , 74(11): 5088–5090. doi:10.1073/pnas.74.11.5088
  • Woese, Carl R., George E. Fox, Lawrence Zablen, Tsuneko Uchida, Linda Bonen, Kenneth Pechman, Bobby J. Lewis, and David Stahl, 1975, “Conservation of Primary Structure in 16S Ribosomal RNA”, Nature , 254(5495): 83–86. doi:10.1038/254083a0
  • Woese, Carl R., Otto Kandler, and Mark L. Wheelis, 1990, “Towards a Natural System of Organisms: Proposal for the Domains Archaea, Bacteria, and Eucarya”, Proceedings of the National Academy of Sciences , 87(12): 4576–4579. doi:10.1073/pnas.87.12.4576
  • Yang, Ziheng, 1994, “Statistical Properties of the Maximum Likelihood Method of Phylogenetic Estimation and Comparison with Distance Matrix Methods”, Systematic Biology , 43(3): 329–342. doi:10.1093/sysbio/43.3.329
  • –––, 1996, “Phylogenetic Analysis Using Parsimony and Likelihood Methods”, Journal of Molecular Evolution , 42(2): 294–307. doi:10.1007/BF02198856
  • Yang, Ziheng and Bruce Rannala, 1997, “Bayesian Phylogenetic Inference Using DNA Sequences: A Markov Chain Monte Carlo Method”, Molecular Biology and Evolution , 14(7): 717–724. doi:10.1093/oxfordjournals.molbev.a025811
  • –––, 2010, “Bayesian Species Delimitation Using Multilocus Sequence Data”, Proceedings of the National Academy of Sciences , 107(20): 9264–9269. doi:10.1073/pnas.0913022107
  • –––, 2012, “Molecular Phylogenetics: Principles and Practice”, Nature Reviews Genetics , 13(5): 303–314. doi:10.1038/nrg3186
  • Zuckerkandl, Emile and Linus Pauling, 1962, “Molecular Disease, Evolution, and Genic Heterogeneity”, in Horizons in Biochemistry: Albert Szent-Györgyi Dedicatory Volume , Michael Kasha and Bernard Pullman (eds.), New York: Academic Press, 189–225.
  • –––, 1965, “Evolutionary Divergence and Convergence in Proteins”, in Evolving Genes and Proteins , Vernon Bryson and Henry J. Vogel (eds.), New York: Academic Press, 97–166. doi:10.1016/B978-1-4832-2734-4.50017-6
  • Zuckerkandl, Emile and Walter A. Schroeder, 1961, “Amino-Acid Composition of the Polypeptide Chains of Gorilla Haemoglobin”, Nature , 192(4806): 984–985. doi:10.1038/192984a0
  • Haber, Matthew H., 2016b, “ Transformation, Persistence, and Identity ”, Extinct: The Philosophy of Paleontology Blog .
  • “ Understanding Evolution ”, University of California/Berkeley.
  • PhyloCode Literature , foundational literature, critiques, reply to critics, etc.
  • The Society of Systematic Biologists
  • “ Bears, Species, and DNA ”, the University of Utah Genetic Science Learning Center.

adaptationism | biology: philosophy of | developmental biology | developmental biology: evolution and development | empiricism: logical | evolution | evolution: cultural | genetics: ecological | genomics and postgenomics | microbiology, philosophy of | molecular biology | Popper, Karl | species | statistics, philosophy of | underdetermination, of scientific theories

Copyright © 2022 by Matt Haber < matt . haber @ utah . edu > Joel Velasco < joel . velasco @ ttu . edu >


The Stanford Encyclopedia of Philosophy is copyright © 2024 by The Metaphysics Research Lab , Department of Philosophy, Stanford University

Library of Congress Catalog Data: ISSN 1095-5054


Biology LibreTexts

1.14: Phylogenetic Trees


Learning Objectives

Explain the purpose of phylogenetic trees

In scientific terms, the evolutionary history and relationship of an organism or group of organisms is called phylogeny. Phylogeny describes the relationships of one organism to others—such as which organisms it is thought to have evolved from, which species it is most closely related to, and so forth. Phylogenetic relationships provide information on shared ancestry but not necessarily on how organisms are similar or different.

Phylogenetic Trees

Scientists use a tool called a phylogenetic tree to show the evolutionary pathways and connections among organisms. A phylogenetic tree is a diagram used to reflect evolutionary relationships among organisms or groups of organisms. Scientists consider phylogenetic trees to be a hypothesis of the evolutionary past since one cannot go back to confirm the proposed relationships. In other words, a “tree of life” can be constructed to illustrate when different organisms evolved and to show the relationships among different organisms (Figure 1).

Figure 1. A rooted phylogenetic tree resembles a living tree, with a common ancestor indicated as the base of the trunk. Two branches form from the trunk. The left branch leads to the domain Bacteria. The right branch branches again, giving rise to Archaea and Eukarya. Smaller branches within each domain indicate the groups present in that domain.

A phylogenetic tree can be read like a map of evolutionary history. Many phylogenetic trees have a single lineage at the base representing a common ancestor. Scientists call such trees rooted, which means there is a single ancestral lineage (typically drawn from the bottom or left) to which all organisms represented in the diagram relate. Notice in the rooted phylogenetic tree that the three domains—Bacteria, Archaea, and Eukarya—diverge from a single point and branch off. The small branch that plants and animals (including humans) occupy in this diagram shows how recent and minuscule these groups are compared with other organisms. Unrooted trees don’t show a common ancestor but do show relationships among species (Figure 2).
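The branching order of a rooted tree can also be written down as a nested data structure. The sketch below is only an illustration: it assumes the three-domain topology described for Figure 1, and the names `tree`, `tips`, and `sister_groups` are hypothetical helpers, not part of any standard library.

```python
# Hypothetical, minimal encoding of the rooted three-domain tree from Figure 1.
# An internal node is a tuple of child nodes; a tip is a plain string.
# The outermost tuple is the root (the single ancestral lineage).
tree = ("Bacteria", ("Archaea", "Eukarya"))


def tips(node):
    """Return the tip labels found under a node, left to right."""
    if isinstance(node, str):
        return [node]
    labels = []
    for child in node:
        labels.extend(tips(child))
    return labels


def sister_groups(node):
    """Return the groups that split at each internal node, root first."""
    if isinstance(node, str):
        return []
    # One entry per internal node: the sorted tip set of each child lineage.
    splits = [tuple(sorted(tips(child)) for child in node)]
    for child in node:
        splits.extend(sister_groups(child))
    return splits


if __name__ == "__main__":
    print(tips(tree))
    for split in sister_groups(tree):
        print(split)
```

Read from the root, the first split separates Bacteria from the lineage leading to Archaea and Eukarya, and the second split separates those two domains. An unrooted tree would carry the same groupings without designating any node as the common ancestor.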

In the past, biologists grouped living organisms into five kingdoms: animals, plants, fungi, protists, and bacteria. The organizational scheme was based mainly on physical features, as opposed to physiology, biochemistry, or molecular biology, all of which are used by modern systematics. The pioneering work of American microbiologist Carl Woese in the early 1970s showed, however, that life on Earth has evolved along three lineages, now called domains: Bacteria, Archaea, and Eukarya. The first two are prokaryotic groups of microbes that lack membrane-enclosed nuclei and organelles. The third domain contains the eukaryotes and includes unicellular microorganisms together with the four remaining original kingdoms (animals, plants, fungi, and protists). Woese defined Archaea as a new domain, and this resulted in a new taxonomic tree (Figure 1). Many organisms belonging to the Archaea domain live under extreme conditions and are called extremophiles. To construct his tree, Woese used genetic relationships rather than similarities based on morphology (shape).

Woese’s tree was constructed from comparative sequencing of genes that are universally distributed (present in every organism) and conserved (meaning that these genes have remained essentially unchanged throughout evolution). Woese’s approach was revolutionary because comparisons of physical features are insufficient to differentiate between prokaryotes, which appear fairly similar in spite of their tremendous biochemical diversity and genetic variability (Figure 3). The comparison of homologous DNA and RNA sequences provided Woese with a sensitive tool that revealed the extensive variability of prokaryotes and justified their separation into two domains: Bacteria and Archaea.
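A rough sense of how homologous sequences are compared can be had from the proportion of aligned sites at which two sequences differ (often called the p-distance). The sketch below is an assumption-laden illustration, not Woese's actual procedure: the three short RNA strings are invented toy data, and `p_distance` is a hypothetical helper.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap character.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


# Invented toy alignment (not real rRNA data), one row per hypothetical taxon.
alignment = {
    "taxon_A": "ACGUACGUAC",
    "taxon_B": "ACGUACGUAU",
    "taxon_C": "ACGAUCGAAU",
}

if __name__ == "__main__":
    names = sorted(alignment)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = p_distance(alignment[first], alignment[second])
            print(first, second, round(d, 2))
```

In this toy data, taxon_A and taxon_B differ at fewer sites than either does from taxon_C, so a distance-based method would group A with B. Real analyses use far longer sequences and correct for multiple substitutions at the same site.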

Figure: photos of (A) bacterial cells, (B) a natural hot vent, (C) a sunflower, and (D) a lion.

Contributors and Attributions

  • Biology. Provided by : OpenStax CNX. Located at : http://cnx.org/contents/[email protected] . License : CC BY: Attribution . License Terms : Download for free at http://cnx.org/contents/[email protected]
  • Collapsed tree Labels. Authored by : TimVickers, SVG conversion by User_A1. Located at : https://commons.wikimedia.org/wiki/File:CollapsedtreeLabels-simplified.svg . License : Public Domain: No Known Copyright

Phylogenetics and Systematics in a Nutshell

  • First Online: 20 October 2020

  • Alejandro Espinosa de los Monteros

During the last 50 years, phylogenetic systematics has undergone a substantial transformation in philosophy and methods. Systematics has gone from being a merely descriptive discipline to a scientific theory encompassing solid evolutionary principles, capable of inferring robust and replicable historical hypotheses about the interrelationships of taxa. This chapter provides the basic concepts in the field of systematic biology (e.g., terminology, character coding, tree description) and phylogenetic reconstruction (e.g., alignments, reconstruction methods, support measures). Particular emphasis is given to nucleotide data. The chapter provides a guide on how sequences can be used to detect natural selection, adaptation, and recombination, and to evaluate substitution saturation. In particular, it seeks to provide the novice with the basic concepts necessary to understand and interpret phylogenetic hypotheses: for instance, nucleotide substitution models, what a molecular clock is, tree selection methods (e.g., Maximum Parsimony, Maximum Likelihood, Bayesian), how to interpret node support values, and how to test tree topologies (e.g., Kishino-Hasegawa). Finally, a short review is presented on the current phylogenetic knowledge of avian Haemosporida.

Albu M, Min XJ, Hickey D et al (2008) Uncorrected nucleotide bias in mtDNA can mimic the effects of positive Darwinian selection. Mol Biol Evol 25:2521–2524

Article   CAS   PubMed   PubMed Central   Google Scholar  

Altschul SF, Gish W, Miller W et al (1990) Basic local alignment search tool. J Mol Biol 215:403–410

Google Scholar  

Avise JC (2000) Phylogeography: the history and formation of species. Harvard University Press, London

Avise JC (2004) Molecular markers, natural history, and evolution. Sinauer Associates, Sunderland

Avise JC, Aquadro CF, Patton JC (1982) Evolutionary genetics of birds. Genetic distances within Mimidae (mimic thrushes) and Vireonidae (vireos). Biochem Genet 20:95–104

Article   CAS   PubMed   Google Scholar  

Baker MC (1982) Vocal dialect recognition and population genetics sequences. Am Zool 22:561–569

Article   Google Scholar  

Baldauf SL (2003) Phylogeny for the faint of heart: a tutorial. Trends Genet 19:345–351

Barker FK, Lutzoni FM (2002) The utility of the incongruence length difference test. Syst Biol 51:625–637

Article   PubMed   Google Scholar  

Barta JR (1989) Phylogenetic analysis of the class Sporozoea (Phylum: Apicomplexa Levine, 1970): evidence for the independent evolution of heteroxenous life cycles. J Parasitol 75:195–206

Bensch S, Stjernman M, Hasselquist D et al (2000) Host specificity in avian blood parasites: a study of Plasmodium and Haemoproteus mitochondrial DNA amplified from birds. Proc R Soc Lond B Biol Sci 267:1583–1589

Biedrzycka A, Migalska M, Bielański W (2016) A quantitative PCR protocol for detecting specific Haemoproteus lineages: molecular characterization of blood parasites in a Sedge Warbler population from southern Poland. J Ornithol 156:201–208

Borner J, Pick C, Thiede J et al (2016) Phylogeny of haemosporidian blood parasites revealed by a multi-gene approach. Mol Phylogenet Evol 94:221–231

Bromham L (2002) Molecular clocks in reptiles: life history influences rate of molecular evolution. Mol Biol Evol 19:302–309

Buckley TR (2002) Model misspecification and probabilistic test of topology: evidence from empirical data sets. Syst Biol 51:509–523

Cadotte MW, Davies TJ (eds) (2016) Phylogenies in ecology: a guide to concepts and methods. Princeton University Press, Princeton

Campana MG, Hawkins MTR, Henson LH et al (2016) Simultaneous identification of host, ectoparasite and pathogen DNA via in-solution capture. Mol Ecol Resour 16:1224–1239

Carpenter JM (1992) Random cladistics. Cladistics 8:147–153

Cracraft J (1983) Species concepts and speciation analysis. Curr Ornithol 1:159–187

Cracraft J (2002) The seven great questions of systematic biology: an essential foundation for conservation and sustainable use of biodiversity. Ann Mo Bot Gard 89:127–144

Dickerson RE (1971) The structure of cytochrome c and the rates of molecular evolution. J Mol Evol 1(1):26–45

Dobrow RP (2016) Introduction to Stochastic processes with R. Wiley, Hoboken

Book   Google Scholar  

Espinosa de los Monteros A (2000) Higher-level phylogeny of Trogoniformes. Mol Phylogenet Evol 14:20–34

Espinosa de los Monteros A (2003) Models of the primary and secondary structure for the 12S rRNA of birds. DNA Seq 14:241–256

Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughout. Nucleic Acids Res 32:1792–1797

Emerson BC, Ibrahim KC, Hewitt GM (2001) Selection of evolutionary models for phylogenetic hypothesis testing using parametric methods. J Evol Biol 14:620–631

Article   CAS   Google Scholar  

Escalante AA, Freeland DE, Collins WE et al (1998) The evolution of primate malaria parasites based on the gene encoding cytochrome b from the linear mitochondrial genome. Proc Natl Acad Sci U S A 95:8124–8129

Farris JS (1970) Methods of computing Wagner trees. Syst Zool 19:83–92

Felsenstein J (1977) The number of evolutionary trees. Syst Zool 27:27–33

Felsenstein J (1985) Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791

Felsenstein J (2004) Inferring phylogenies. Sinauer Associates, Sunderland

Galen SC, Borner J, Martinsen ES et al (2018) The polyphyly of Plasmodium : comprehensive phylogenetic analyses of the malaria parasites (Order Haemosporida) reveal widespread taxonomic conflict. R Soc Open Sci 5:171–780

Garcia-Sandoval R (2014) Why some clades have low bootstrap frequencies and high Bayesian posterior probabilities. Isr J Ecol Evol 60:41–44

Geyer CJ (1991) Markov chain Monte Carlo maximum likelihood. In: Keramidas EM (ed) Computing Science and Statistics. Proceedings of the 23rd symposium on the interface. Interface Foundation of North America, Fairfax Station, pp 156–163

Goldman N, Anderson JP, Rodrigo AG (2000) Likelihood-based test of topologies in phylogenetics. Syst Biol 49:652–670

Goloboff P, Szumik A (2016) Problems with supertrees based on the subtree prune-and-regraft distance, with comments on majority rule supertrees. Cladistics 32:82–89

Goloboff P, Torres A, Arias JS (2018) Weighted parsimony outperforms other methods of phylogenetic inference under models appropriate for morphology. Cladistics 34:407–437

Graybeal A (1998) Is it better to add taxa or characters to a difficult phylogenetic problem? Syst Biol 47:9–17

Gribaldo S, Philippe H (2002) Ancient phylogenetic relationships. Theor Popul Biol 61:391–408

Gutierrez RJ, Zink RM, Yang SY (1983) Genetic variation, systematics and biogeographic relationships of some galliform birds. Auk 100:33–47

Hasegawa M, Kishino H (1989) Confidence limits on the maximum-likelihood estimate of the hominoid tree from mitochondrial-DNA sequences. Evolution 43:672–677

PubMed   Google Scholar  

Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109

Hernández-Lara C, Espinosa de los Monteros A, Ibarra-Cerdeña CN et al (2018) Combining morphological and molecular data to reconstruct the phylogeny of avian Haemosporida. Int J Parasitol 48:1137–1148

Hillis DM, Moritz C, Mable BK (eds) (1991) Molecular systematics. Sinauer Associates, Sunderland

Hipsley CA, Müller J (2014) Beyond fossil calibrations: realities of molecular clock practices in evolutionary biology. Front Genet 5:138

Article   PubMed   PubMed Central   CAS   Google Scholar  

Ho SYW, Shapiro B (2011) Skyline-plot methods for estimating demographic history from nucleotide sequences. Mol Ecol Resour 11:423–434

Huelsenbeck JP, Bull JJ, Cunningham CW (1996) Combining data in phylogenetic analysis. TREE 11:152–157

CAS   PubMed   Google Scholar  

Huelsenbeck JP, Ronquist F, Nielsen R et al (2001) Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294:2310–2314

Johnson KP, Dietrich CH, Friedrich F et al (2018) Phylogenomics and the evolution of hemipteroid insects. Proc Natl Acad Sci U S A 115:12775–12780

Jukes TH, Cantor CR (1969) Evolution of protein molecules. In: Munro NH (ed) Mammalian protein metabolism. Academic Press, New York, pp 21–123

Chapter   Google Scholar  

Kimura M (1968) Evolutionary rate at the molecular level. Nature 217:624–626

Kimura M (1980) A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 16:111–120

Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, Cambridge

Kingman JFC (1982a) The coalescent. Stoch Process Appl 13:235–248

Kingman JFC (1982b) On the genealogy of large populations. J Appl Probab 19:27–43

Kishino H, Hasegawa M (1989) Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J Mol Evol 29:170–179

Kolaczkowski B, Thornton JW (2004) Performance of maximum parsimony and likelihood phylogenetics when evolution is heterogeneous. Nature 431:980–984

Kumar S (2005) Molecular clocks: four decades of evolution. Nat Rev Genet 6:654–662

Lanyon SM (1992) Phylogeny and classification of birds. A study in molecular evolution. Condor 94:304–307

Larkin MA, Blackshields G, Brown NP et al (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948

Lemmon AR, Moriarty EC (2004) The importance of proper model assumption in Bayesian Phylogenetics. Syst Biol 53:265–277

Li WH (1997) Molecular evolution. Sinauer Associates, Sunderland

Liang L, Yu L, Kubatko L et al (2009) Coalescent methods for estimating phylogenetic trees. Mol Phylogenet Evol 53:320–328

Margush T, McMorris FR (1981) Consensus n-trees. Bull Math Biol 43:239–244

Martin AP, Palumbi SR (1993) Body size, metabolic rate, generation time, and the molecular clock. Proc Natl Acad Sci U S A 90:4087–4091

Martinsen ES, Waite JL, Schall JJ (2007) Morphologically defined subgenera of Plasmodium from avian hosts: test of monophyly by phylogenetic analysis of two mitochondrial genes. Parasitology 134:483–490

Martinsen ES, Perkins SL, Schall JJ (2008) A three-genome phylogeny of malaria parasites ( Plasmodium and closely related genera): evolution of life-history traits and host switches. Mol Phylogenet Evol 47:261–273

Matthiopoulos J (2011) How to be a quantitative ecologist: the A to R of Green mathematics and statistics. Wiley, West Sussex

Mayr E (1982) The growth of biological thought: diversity, evolution, and inheritance. Belknap Press, Cambridge

Mayr E (1993) Fifty years of progress in research on species and speciation. Proc Calif Acad Sci 48:131–140

Mayr E (2000) The biological species concept. In: Wheeler QD, Meier R (eds) Species concepts and phylogenetic theory: a debate. Columbia University Press, New York, pp 17–29

Metropolis N, Rosenbluth AW, Rosenbluth NM et al (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1092

Nei M, Kumar S (2000) Molecular evolution and phylogenetics. Oxford University Press, New York

Nei M, Xu P, Glazko G (2001) Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms. Proc Natl Acad Sci U S A 98:2497–2502

Nguyen LT, Schmidt HA, von Haeseler A et al (2015) IQ-TREE: a fast and effective stochastic algorithm for estimating maximum likelihood phylogenies. Mol Biol Evol 32:268–274

Nixon KC (1999) The parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics 15:407–414

Nuttall GHF (1904) Blood immunity and blood relationship. Cambridge University Press, Cambridge

Pacheco MA, Matta NE, Valkiūnas G et al (2018) Mode and rate of evolution of haemosporidian mitochondrial genomes: timing the radiation of avian parasites. Mol Biol Evol 35:383–403

Perkins SL, Schall JJ (2002) A molecular phylogeny of malarial parasites recovered from cytochrome b gene sequences. J Parasitol 88:972–978

Piontkivska H (2004) Efficiencies of maximum likelihood methods of phylogenetic inferences when different substitution models are used. Mol Phylogenet Evol 31:865–873

Posada D, Buckley TR (2004) Model selection and model averaging in phylogenetics: advantages of Akaike Information Criterion and Bayesian Approaches over Likelihood ratio tests. Syst Biol 53:793–808

Purvis A, Agapow PM (2002) Phylogemetic imbalance: taxonomic level matters. Syst Biol 51:844–854

Rockwell RF, Barrowclough GF (1987) Gene flow and genetic structure of populations. In: Cooke F, Buckley PA (eds) Avian genetics. Academic Press, London, pp 223–255

Ronquist F, Teslenko M, van der Mark P et al (2012) MRBAYES 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61:539–542

Article   PubMed   PubMed Central   Google Scholar  

Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425

Sanger F, Niklen S, Coulson AR (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A 74:5463–5467

Santiago-Alarcon D, Palinauskas V, Schaefer HM (2012) Diptera vectors of avian haemosporidian parasites: untangling parasite life cycles and their taxonomy. Biol Rev 87:928–964

Shimodaira H (1998) An application of multiple comparison techniques to model selection. Ann Inst Stat Math 50:1–13

Singer GA, Hickey DA (2000) Nucleotide bias causes a genome-wide bias in the amino acid composition of proteins. Mol Biol Evol 17:1581–1588

Smith MA, Bertrand C, Crosby K et al (2012) Wolbachia and DNA barcoding insects: patterns, potential, and problems. PLoS One 7:e36514

Sokal RR, Rohlf FJ (1981) Taxonomic congruence in the Leptopodomorpha reexamined. Syst Zool 30:309–325

Stamatakis A (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313

Stephen F. Altschul, Warren Gish, Webb Miller, Eugene W. Myers, David J. Lipman, (1990) Basic local alignment search tool. J Mol Biol 215 (3):403-410

Swofford DL (1991) When are phylogeny estimates from molecular and morphological data incongruent? In: Miyamoto MM, Cracraft J (eds) Phylogenetic analysis of DNA sequences. Oxford University Press, New York, pp 293–333

Tutar Y (2012) Pseudogenes. Comp Funct Genomics 2012:424526

Valkiūnas G (2005) Avian malaria parasites and other haemosporidia. CRC Press, Boca Raton

Wang H (2010) The effects of nucleotide bias on genome evolution: the causes and effects of wide variations in G+C content of the genomes. VDM, Saarbrücken

Wheeler QD, Meier R (eds) (2000) Species concepts and phylogenetic theory: a debate. Columbia University Press, New York

Whelan S, Liò P, Goldman N (2001) Molecular phylogenetics: state-of-the-art methods for looking into the past. Trends Genet 17:262–272

Yang Z, Rannala B (2012) Molecular phylogenetics: principles and practice. Nat Rev Genet 13:303–314

Zharkikh A (1994) Estimation of evolutionary distances between nucleotide sequences. J Mol Evol 39:315–329

Zuckerkandl E, Pauling LB (1962) Molecular disease, evolution, and genetic heterogeneity. In: Kasha M, Pullman B (eds) Horizons in biochemistry. Academic Press, New York, pp 189–225

Zwickl DJ (2006) Genetic algorithm approaches for the phylogenetic analysis of large biological sequence datasets under the maximum likelihood criterion. Dissertation, University of Texas at Austin. www.bio.utexas.edu/faculty/antisense/garli/Garli.html . Accessed 15 Feb 2019

Download references

Author information

Authors and affiliations.

Laboratorio de Sistematica Filogenetica. Departamento de Biología Evolutiva, Instituto de Ecología, Xalapa, Veracruz, Mexico

Alejandro Espinosa de los Monteros

Editor information

Editors and affiliations.

Red de Biología y Conservación de Vertebrados, Instituto de Ecología, A.C. - CONACYT, Xalapa, Veracruz, Mexico

Diego Santiago-Alarcon

Department of Anatomy, Cellular Biology and Zoology, University of Extremadura, Badajoz, Badajoz, Spain

Alfonso Marzal

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Espinosa de los Monteros, A. (2020). Phylogenetics and Systematics in a Nutshell. In: Santiago-Alarcon, D., Marzal, A. (eds) Avian Malaria and Related Parasites in the Tropics. Springer, Cham. https://doi.org/10.1007/978-3-030-51633-8_3

DOI : https://doi.org/10.1007/978-3-030-51633-8_3

Published : 20 October 2020

Publisher Name : Springer, Cham

Print ISBN : 978-3-030-51632-1

Online ISBN : 978-3-030-51633-8

eBook Packages : Biomedical and Life Sciences Biomedical and Life Sciences (R0)

Geosciences LibreTexts

4.1: Phylogenetic Trees

  • Dawn Sumner
  • University of California, Davis

LEARNING OBJECTIVES

Understand how life can be organized

Describe how phylogenetic trees show how we organize life

Categorizing Life

We interpret all life on Earth as related based on its shared characteristics. For example, all life uses DNA to store genetic information; that DNA is transcribed into RNA, and RNA is translated into proteins that do the biochemistry of cells. Various lineages of life share similar characteristics within a lineage, but they differ from other lineages. For example, the compositions of bacterial cell walls are similar to each other but differ from those of animal cell boundaries in important ways. Similarly, the symmetry in the bodies of different types of animals varies in systematic ways; echinoderms have 5-fold symmetry whereas bilaterians have 2-fold symmetry. Most of these similarities and differences in characteristics reflect historical evolutionary processes, and we use those that do to classify subsets of life and to reconstruct the evolutionary history of life. One of the ways we perform, evaluate, and interpret the classifications is by making phylogenetic trees.

Phylogenetic trees are diagrams that show relationships among organisms. Scientists consider phylogenetic trees as hypotheses of the evolutionary past built from the observable characteristics of the organisms in the tree. In other words, a “tree of life”, as it is sometimes called, can be constructed to illustrate the relationships among different organisms. These organisms can be modern or fossil with the trees based on data from genomes, function, or morphology. The trees illustrate hypotheses about when different organisms diverged from their ancient common ancestors and how they evolved. 

The hypothesized relationships are graphically represented by lines that connect the organisms at the tips of the lines to their ancestors deeper in the tree. The lines branch where one lineage of life evolved (or split) into two lineages. These branch points and their order in the tree record the hypothesized relationships among the organisms. The length of each line represents the relative evolutionary difference of the organism at the tip to the most recent common ancestor with other lineages at the branch point; long lines represent significant amounts of evolutionary difference, whereas short lines represent closer evolutionary relationships among the organisms. Sometimes the lines are calibrated to time, with the location of each branch point corresponding to the hypothesized time the younger lineages diverged.
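The role of branch lengths can be made concrete with a minimal sketch. It assumes a toy rooted tree stored as a mapping from each node to its parent and the length of the line leading to it; the node names, the lengths, and the function names are all invented for illustration and do not come from any real data set. The evolutionary difference between two tips is taken here as the sum of the line lengths along the path that connects them through their most recent common ancestor.

```python
# Hypothetical toy tree: each node maps to (parent, length of the line leading to it).
# All names and lengths are invented for illustration only.
TREE = {
    "root": (None, 0.0),
    "n1": ("root", 0.5),
    "Bacteria": ("root", 1.0),
    "Archaea": ("n1", 0.4),
    "Eukarya": ("n1", 0.6),
}


def path_to_root(tree, node):
    """Return the nodes visited when walking from `node` up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = tree[node][0]
    return path


def path_length(tree, a, b):
    """Sum the line lengths on the path connecting tips `a` and `b`."""
    ancestors_of_a = set(path_to_root(tree, a))
    total = 0.0
    # Climb from b until reaching a node that is also an ancestor of a:
    # that node is the most recent common ancestor of the two tips.
    node = b
    while node not in ancestors_of_a:
        total += tree[node][1]
        node = tree[node][0]
    common_ancestor = node
    # Climb from a to the same common ancestor, adding the lengths on that side.
    node = a
    while node != common_ancestor:
        total += tree[node][1]
        node = tree[node][0]
    return total
```

With these made-up lengths, `path_length(TREE, "Archaea", "Eukarya")` adds 0.4 and 0.6, while `path_length(TREE, "Bacteria", "Archaea")` adds 1.0, 0.5, and 0.4, so the second pair is separated by more evolutionary difference than the first.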

If a phylogenetic tree has a line at the base (usually drawn at the bottom or left; see phylogenetic tree (a) in the figure below), the tree is "rooted". Rooted trees can be read like a map of evolutionary history, starting from a single lineage at the base of the tree, which represents a common ancestor of all of the life shown in the tree, and moving up through the branches to the tips, which represent living organisms or the most recent fossil representative of the lineage. The rooted phylogenetic tree (a) shown below includes the three domains of life (Bacteria, Archaea, and Eukarya), with the "last common ancestor" of all life diverging first into two lineages: one leading to Bacteria and one leading to Archaea and Eukarya. At some later point in time, the last common ancestor of Archaea and Eukarya diverged into two lineages, leading to modern Archaea and Eukarya. The Eukarya lineage eventually diverged into many more lineages, including plants and animals (including humans). The short lines between plants and animals show that they are more closely related to each other than any of the bacteria are to animals, for example. Similarly, the location of the branch point between plants and animals near the outside of the tree shows that they shared a common ancestor in much more recent times than the last common ancestor of animals and any of the Archaea, for example. Unrooted trees (see phylogenetic tree (b) in the figure below) do not show a last common ancestor of all the life in the tree. The branch geometry shows evolutionary relationships among the organisms, but the central point does not always indicate a common ancestor. Unrooted trees show hypothesized relationships among organisms, but not necessarily their evolutionary history.

image

Phylogenetic trees : Both of these phylogenetic trees show the relationship of the three domains of life (Bacteria, Archaea, and Eukarya), but the (a) rooted tree attempts to identify when various species diverged from a common ancestor, while the (b) unrooted tree does not.
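Reading relatedness from a rooted tree can be sketched in a few lines. The example below assumes a simplified rooted tree written as nested tuples, where each tuple is a branch point and each string is a tip; the grouping only loosely mirrors the rooted tree described above, and the function names are hypothetical. Two tips are more closely related than another pair when the smallest group containing them (everything descended from their most recent common ancestor) is nested inside the other pair's group.

```python
# Hypothetical rooted tree as nested tuples: a tuple is a branch point, a string is a tip.
# The grouping is a simplified illustration, not a tree taken from real data.
TREE = ("Bacteria", ("Archaea", ("Plants", "Animals")))


def tips(node):
    """Return the set of tip names found at or below `node`."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= tips(child)
    return found


def smallest_shared_group(node, a, b):
    """Return the tips descended from the most recent common ancestor of `a` and `b`.

    Assumes both names occur somewhere below `node`.
    """
    if not isinstance(node, str):
        for child in node:
            # If one child already contains both tips, their common ancestor is deeper.
            if {a, b} <= tips(child):
                return smallest_shared_group(child, a, b)
    return tips(node)


plants_animals = smallest_shared_group(TREE, "Plants", "Animals")
animals_archaea = smallest_shared_group(TREE, "Animals", "Archaea")
# A strictly nested group means a more recent common ancestor.
more_recent = plants_animals < animals_archaea
```

Here the group shared by plants and animals contains only those two tips, whereas the group shared by animals and Archaea also contains plants, which matches the statement that plants and animals share the more recent common ancestor.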

Drawing Trees and Terminology

In a phylogenetic tree, each branch point represents a single lineage evolving into two distinct ones. A lineage that evolved early from the root of a tree and remains unbranched is called a basal taxon. When two lineages stem from the same branch point, they are called sister taxa. A branch with more than two lineages is called a polytomy and serves to illustrate where scientists have not definitively determined all of the evolutionary relationships.

image

Rooted phylogenetic trees : The root of a phylogenetic tree indicates that an ancestral lineage gave rise to all organisms on the tree. A branch point indicates where lineages diverged from each other. A lineage that evolved early and remains unbranched is a basal taxon. When two lineages stem from the same branch point, they are sister taxa. A branch with more than two lineages is a polytomy.
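The terms defined above can be picked out mechanically from a tree. The following minimal sketch assumes a rooted tree written as nested tuples (a tuple is a branch point, a string is a tip); the placeholder taxon names A to F and the function names are invented for illustration. For simplicity it reports only sister taxa that are both tips, and it treats a tip attached directly to the root as a basal taxon.

```python
# Hypothetical rooted tree: a tuple is a branch point, a string is a tip.
# The placeholder taxa A-F are invented for illustration.
TREE = ("A", (("B", "C"), ("D", "E", "F")))


def tips(node):
    """Return the sorted tip names found at or below `node`."""
    if isinstance(node, str):
        return [node]
    return sorted(name for child in node for name in tips(child))


def basal_taxa(root):
    """Tips attached directly to the root, i.e. early, unbranched lineages."""
    return [child for child in root if isinstance(child, str)]


def sister_tips(node):
    """Pairs of tips that stem from the same two-way branch point."""
    if isinstance(node, str):
        return []
    pairs = []
    if len(node) == 2 and all(isinstance(child, str) for child in node):
        pairs.append(tuple(node))
    for child in node:
        pairs.extend(sister_tips(child))
    return pairs


def polytomies(node):
    """Tip lists of every branch point with more than two descending lineages."""
    if isinstance(node, str):
        return []
    found = [tips(node)] if len(node) > 2 else []
    for child in node:
        found.extend(polytomies(child))
    return found
```

In this toy tree, A is the basal taxon, B and C are sister taxa, and D, E, and F form a polytomy whose internal relationships are unresolved.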

It is important to note that the lines making up the tree represent ancestral organisms, not the organisms that are present at the tips of the lines. Sister taxa, and the lineages in a polytomy, share an ancestor, but these groups of organisms did not split or evolve from each other; rather, they split and evolved from a common ancestor. Organisms in two taxa may have split apart at a specific branch point, but neither taxon gave rise to the other; both have evolved, as represented by the length of the lines extending from the branch point to the tips.

Phylogenetic trees can be drawn in multiple ways, but the order of branches is always the same for all trees based on the same data. Phylogenetic trees can look very different, but still illustrate the same relationships. For example, sister taxa can be rotated around their branch point, but they are still sister taxa. The location of the Animals and Fungi can be exchanged in the rooted tree above (a rotation around their branch point). When branch points are rotated, the taxon order changes but the information in the tree is the same because the sequence of branch points is the same. If you trace the evolutionary path from a common ancestor to an organism at a branch tip, the path does not change even if the branches are rotated. In other words, the evolution of each taxon from a branch point is independent of the other organisms in the tree. Similarly, branches can be drawn at angles to each other or with horizontal, vertical, or circumferential bars connecting them (compare trees (a) and (b) above). These geometrical options in drawing phylogenetic trees allow scientists to visualize complicated data to help with interpretations. Scientists often choose options that emphasize the aspects of the evolutionary relationships that they are most interested in illustrating.
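The point that rotating branches leaves the information unchanged can be checked with a short sketch. It assumes rooted trees written as nested tuples (a tuple is a branch point, a string is a tip) and uses a hypothetical helper, `canonical`, that sorts the lineages at every branch point into a fixed order; two drawings describe the same relationships exactly when their canonical forms are equal. The example trees are simplified illustrations.

```python
def canonical(node):
    """Return `node` with the lineages at every branch point in a fixed order.

    Sorting by `repr` is an arbitrary but consistent choice; any fixed order works.
    """
    if isinstance(node, str):
        return node
    return tuple(sorted((canonical(child) for child in node), key=repr))


def same_relationships(tree_a, tree_b):
    """True when the two rooted trees differ only by rotations around branch points."""
    return canonical(tree_a) == canonical(tree_b)


# Hypothetical example trees (simplified; not taken from real data).
drawn_one_way = ("Bacteria", ("Archaea", ("Animals", "Fungi")))
# Same tree with every branch point rotated, so the taxon order is reversed.
drawn_rotated = ((("Fungi", "Animals"), "Archaea"), "Bacteria")
# A genuinely different hypothesis: Archaea, not Bacteria, diverges first.
different_tree = ("Archaea", ("Bacteria", ("Animals", "Fungi")))
```

Comparing `drawn_one_way` with `drawn_rotated` reports the same relationships even though the tips appear in reverse order, while `different_tree` is reported as different because the sequence of branch points has changed.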

Interpreting Rooted Trees

Rooted phylogenetic trees can serve as a pathway to understanding evolutionary history. A pathway can be traced from an ancestor, or even the origin of life, to any individual species by navigating through the evolutionary branches between the two points. Also, by starting with a single species and tracing back towards the “trunk” of the tree, one can discover that species’ ancestors, as well as where various lineages share a common ancestry. In addition, the tree can be used to study the evolutionary relationships within or between entire groups of organisms.

Many disciplines within biology contribute to understanding how past and present life evolved over time; together, these disciplines contribute to building, updating, and maintaining the “tree of life.” Information is used to organize and classify organisms based on evolutionary relationships in a scientific field called systematics. Data may be collected from fossils, from studying the structure of body parts or molecules used by an organism, and by DNA analysis. By combining data from many sources, scientists can put together the phylogeny of organisms. Since phylogenetic trees are hypotheses, they will continue to change as new types of life are discovered and new information is learned.

  • Rooted trees have a single lineage at the base representing a common ancestor that connects all organisms presented in a phylogenetic diagram.
  • Branch points in a phylogenetic tree represent a split where a single lineage evolved into two distinct lineages, while a basal taxon is an unbranched lineage that diverged early from the root.
  • Unrooted trees portray relationships among species, but do not depict their common ancestor.
  • Phylogenetic trees are hypotheses and are, therefore, modified as more and better data becomes available.
  • Systematics uses data from fossils, the study of bodily structures, molecules used by a species, and DNA analysis to contribute to the building, updating, and maintaining of phylogenetic trees.
  • polytomy : a section of a phylogeny in which the evolutionary relationships cannot be fully resolved to dichotomies (splits into only two lineages)
  • basal taxon : a lineage, displayed using a phylogenetic tree, that diverged early from the root and from which no other branches have diverged
  • systematics : research into the relationships of organisms; the science of systematic classification
  • phylogeny : the evolutionary history of a group of organisms, often represented visually as a tree; based on rigorous analyses

Questions to ponder

  • What aspects of organisms provide useful data for their classification?
  • Look up a phylogenetic tree in a scientific publication. Are you comfortable interpreting it? What questions do you have about it?

  • Review Article
  • Published: 28 March 2012

Molecular phylogenetics: principles and practice

  • Ziheng Yang &
  • Bruce Rannala

Nature Reviews Genetics volume 13, pages 303–314 (2012)

  • Bioinformatics
  • Evolutionary biology
  • Phylogenetics
  • Phylogenomics
  • Population genetics

The rapid accumulation of genome sequence data has made phylogenetics an indispensable tool to various branches of biology. However, it has also posed considerable statistical and computational challenges to data analysis.

Distance, parsimony, likelihood and Bayesian methods of phylogenetic analysis have different strengths and weaknesses. Although distance methods are good for large data sets of highly similar sequences, likelihood and Bayesian methods often have more power and are more robust, especially for inferring deep phylogenies.

Assessing phylogenetic uncertainty remains a difficult statistical problem.

Data partitioning may have an important influence on the phylogenetic analysis of genome-scale data sets.

Systematic biases, such as long-branch attraction, may be more important than random sampling errors in the analysis of genomic-scale data sets.

Phylogenies are important for addressing various biological questions such as relationships among species or genes, the origin and spread of viral infection and the demographic changes and migration patterns of species. The advancement of sequencing technologies has taken phylogenetic analysis to a new height. Phylogenies have permeated nearly every branch of biology, and the plethora of phylogenetic methods and software packages that are now available may seem daunting to an experimental biologist. Here, we review the major methods of phylogenetic analysis, including parsimony, distance, likelihood and Bayesian methods. We discuss their strengths and weaknesses and provide guidance for their use.
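As a rough illustration of what a distance method starts from, the sketch below computes a pairwise distance between two aligned nucleotide sequences. It is not taken from the review: it assumes the Jukes-Cantor correction (one simple substitution model among many), two sequences that are already aligned and of equal length, and made-up example sequences; the function names are hypothetical. The raw proportion of differing sites, p, is corrected for multiple substitutions at the same site as d = -(3/4) ln(1 - 4p/3).

```python
import math


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which the two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    return differences / len(seq1)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p.

    The correction is undefined once p reaches 3/4, the value expected
    for unrelated random sequences under this model.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


# Made-up aligned sequences for illustration only.
observed = p_distance("ACGTACGT", "ACGTACGA")
corrected = jukes_cantor_distance(observed)
```

The corrected distance is always at least as large as the observed proportion, because some substitutions are hidden by later changes at the same site. A distance method such as neighbor joining would take a matrix of such pairwise distances as its input.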

Maser, P. et al. Phylogenetic relationships within cation transporter families of Arabidopsis . Plant Physiol. 126 , 1646–1667 (2001).

Article   CAS   PubMed   PubMed Central   Google Scholar  

Edwards, S. V. Is a new and general theory of molecular systematics emerging? Evolution 63 , 1–19 (2009).

Article   CAS   PubMed   Google Scholar  

Marra, M. A. et al. The genome sequence of the SARS-associated coronavirus. Science 300 , 1399–1404 (2003).

Grenfell, B. T. et al. Unifying the epidemiological and evolutionary dynamics of pathogens. Science 303 , 327–332 (2004).

Salipante, S. J. & Horwitz, M. S. Phylogenetic fate mapping. Proc. Natl Acad. Sci. USA 103 , 5448–5453 (2006).

Gray, R. D., Drummond, A. J. & Greenhill, S. J. Language phylogenies reveal expansion pulses and pauses in pacific settlement. Science 323 , 479–483 (2009).

Brady, A. & Salzberg, S. PhymmBL expanded: confidence scores, custom databases, parallelization and more. Nature Methods 8 , 367 (2011).

Kellis, M., Patterson, N., Endrizzi, M., Birren, B. & Lander, E. S. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423 , 241–254 (2003).

Pedersen, J. S. et al. Identification and classification of conserved RNA secondary structures in the human genome. PLoS Comput. Biol. 2 , e33 (2006).

Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478 , 476–482 (2011).

Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328 , 710–722 (2010).

Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. & Siepel, A. Bayesian inference of ancient human demography from individual genome sequences. Nature Genet. 43 , 1031–1034 (2011).

Li, H. & Durbin, R. Inference of human population history from individual whole-genome sequences. Nature 475 , 493–496 (2011).

Paten, B. et al. Genome-wide nucleotide-level mammalian ancestor reconstruction. Genome Res. 18 , 1829–1843 (2008).

Ma, J. Reconstructing the history of large-scale genomic changes: biological questions and computational challenges. J. Comput. Biol. 18 , 879–893 (2011).

Kingman, J. F. C. On the genealogy of large populations. J. Appl. Probab. 19A , 27–43 (1982).

Article   Google Scholar  

Kingman, J. F. C. The coalescent. Stoch. Process. Appl. 13 , 235–248 (1982).

Edwards, S. V., Liu, L. & Pearl, D. K. High-resolution species trees without concatenation. Proc. Natl Acad. Sci. USA 104 , 5936–5941 (2007). This paper introduces a method for estimating the species tree despite the presence of conflicting gene trees.

Than, C. & Nakhleh, L. Species tree inference by minimizing deep coalescences. PLoS Comput. Biol. 5 , e1000501 (2009).

Article   PubMed   PubMed Central   CAS   Google Scholar  

Rannala, B. & Yang, Z. Phylogenetic inference using whole genomes. Annu. Rev. Genomics Hum. Genet. 9 , 217–231 (2008).

Felsenstein, J. Phylogenies and the comparative method. Am. Nat. 125 , 1–15 (1985). This paper introduces the bootstrap approach to phylogenetic analysis. This is the most commonly used method for assessing sampling errors in estimated phylogenies.

Yang, Z. in Handbook of Statistical Genetics (eds Balding, D., Bishop, M. & Cannings, C.) 377–406 (Wiley, New York, 2007).

Google Scholar  

Felsenstein, J. Inferring Phylogenies (Sinauer Associates, Sunderland, Massachusetts, 2004).

Yang, Z. Computational Molecular Evolution (Oxford Univ. Press, UK, 2006).

Book   Google Scholar  

Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4 , 406–425 (1987).

CAS   PubMed   Google Scholar  

Jukes, T. H. & Cantor, C. R. in Mammalian Protein Metabolism (ed. Munro, H. N.) 21–123 (Academic Press, New York, 1969).

Kimura, M. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16 , 111–120 (1980).

Hasegawa, M., Kishino, H. & Yano, T. Dating the human–ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22 , 160–174 (1985).

Tavaré, S. Some probabilistic and statistical problems on the analysis of DNA sequences. Lect. Math. Life Sci. 17 , 57–86 (1986).

Yang, Z. Estimating the pattern of nucleotide substitution. J. Mol. Evol. 39 , 105–111 (1994).

PubMed   Google Scholar  

Yang, Z. Maximum-likelihood estimation of phylogeny from DNA sequences when substitution rates differ over sites. Mol. Biol. Evol. 10 , 1396–1401 (1993).

Cavalli-Sforza, L. L. & Edwards, A. W. F. Phylogenetic analysis: models and estimation procedures. Evolution 21 , 550–570 (1967).

Fitch, W. M. & Margoliash, E. Construction of phylogenetic trees. Science 155 , 279–284 (1967).

Rzhetsky, A. & Nei, M. A simple method for estimating and testing minimum-evolution trees. Mol. Biol. Evol. 9 , 945–967 (1992).

CAS   Google Scholar  

Desper, R. & Gascuel, O. Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J. Comput. Biol. 9 , 687–705 (2002).

Gascuel, O. & Steel, M. Neighbor-joining revealed. Mol. Biol. Evol. 23 , 1997–2000 (2006).

Tamura, K. et al. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28 , 2731–2739 (2011).

Bruno, W. J., Socci, N. D. & Halpern, A. L. Weighted neighbor joining: a likelihood-based approach to distance-based phylogeny reconstruction. Mol. Biol. Evol. 17 , 189–197 (2000).

Fitch, W. M. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20 , 406–416 (1971).

Hartigan, J. A. Minimum evolution fits to a given tree. Biometrics 29 , 53–65 (1973).

Swofford, D. L. PAUP * : Phylogenetic Analysis by Parsimony (and Other Methods)4.0 Beta (Sinauer Associates, Massachusetts, 2000).

Goloboff, P. A., Farris, J. S. & Nixon, K. C. TNT, a free program for phylogenetic analysis. Cladistics 24 , 774–786 (2008).

Felsenstein, J. Cases in which parsimony and compatibility methods will be positively misleading. Syst. Zool. 27 , 401–410 (1978).

Huelsenbeck, J. P. Systematic bias in phylogenetic analysis: is the Strepsiptera problem solved? Syst. Biol. 47 , 519–537 (1998).

Swofford, D. L. et al. Bias in phylogenetic estimation and its relevance to the choice between parsimony and likelihood methods. Syst. Biol. 50 , 525–539 (2001).

Yang, Z. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol. Evol. 11 , 367–372 (1996).

Philippe, H. et al. Acoelomorph flatworms are deuterostomes related to Xenoturbella . Nature 470 , 255–258 (2011).

Zhong, B. et al. Systematic error in seed plant phylogenomics. Genome Biol. Evol. 3 , 1340–1348 (2011).

Felsenstein, J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17 , 368–376 (1981). This paper introduces the pruning algorithm for likelihood calculation on a tree. This approach forms the basis for modern likelihood and Bayesian methods of phylogenetic analysis.

Yang, Z. Phylogenetic analysis using parsimony and likelihood methods. J. Mol. Evol. 42 , 294–307 (1996).

Felsenstein, J. Phylip: Phylogenetic Inference Program, Version 3.6. (Univ. of Washington, Seattle, 2005).

Adachi, J. & Hasegawa, M. MOLPHY version 2.3: programs for molecular phylogenetics based on maximum likelihood. Comput. Sci. Monogr. 28 , 1–150 (1996).

Guindon, S. & Gascuel, O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52 , 696–704 (2003).

Article   PubMed   Google Scholar  

Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22 , 2688–2690 (2006).

Zwickl, D. Genetic Algorithm Approaches for the Phylogenetic Analysis of Large Biological Sequence Datasets Under the Maximum Likelihood Criterion . Thesis, Univ. Texas at Austin (2006).

Yang, Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J. Mol. Evol. 39 , 306–314 (1994).

Lartillot, N. & Philippe, H. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21 , 1095–1109 (2004).

Blanquart, S. & Lartillot, N. A site- and time-heterogeneous model of amino acid replacement. Mol. Biol. Evol. 25 , 842–858 (2008).

Goldman, N. Statistical tests of models of DNA substitution. J. Mol. Evol. 36 , 182–198 (1993).

Zuckerkandl, E. & Pauling, L. in Evolving Genes and Proteins (eds Bryson, V. & Vogel, H. J.) 97–166 (Academic Press, New York, 1965).

Nielsen, R. & Yang, Z. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148 , 929–936 (1998).

Yang, Z. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15 , 568–573 (1998).

Yang, Z. & Nielsen, R. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol. Biol. Evol. 19 , 908–917 (2002).

Huelsenbeck, J. P. & Rannala, B. Phylogenetic methods come of age: testing hypotheses in an evolutionary context. Science 276 , 227–232 (1997).

Whelan, S., Liò, P. & Goldman, N. Molecular phylogenetics: state of the art methods for looking into the past. Trends Genet. 17 , 262–272 (2001).

Rannala, B. & Yang, Z. Probability distribution of molecular evolutionary trees: a new method of phylogenetic inference. J. Mol. Evol. 43 , 304–311 (1996).

Yang, Z. & Rannala, B. Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo Method. Mol. Biol. Evol. 14 , 717–724 (1997).

Mau, B. & Newton, M. A. Phylogenetic inference for binary data on dendrograms using Markov chain Monte Carlo. J. Comput. Graph. Stat. 6 , 122–131 (1997).

Li, S., Pearl, D. & Doss, H. Phylogenetic tree reconstruction using Markov chain Monte Carlo. J. Am. Stat. Assoc. 95 , 493–508 (2000).

Larget, B. & Simon, D. L. Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Mol. Biol. Evol. 16 , 750–759 (1999).

Article   CAS   Google Scholar  

Huelsenbeck, J. P. & Ronquist, F. MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics 17 , 754–755 (2001).

Drummond, A. J., Ho, S. Y. W., Phillips, M. J. & Rambaut, A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4 , e88 (2006). This paper introduces a Bayesian MCMC algorithm (the BEAST program) to estimate rooted trees under relaxed-clock models.

Felsenstein, J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39 , 783–791 (1985).

Felsenstein, J. & Kishino, H. Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull. Syst. Biol. 42 , 193–200 (1993).

Efron, B., Halloran, E. & Holmes, S. Bootstrap confidence levels for phylogenetic trees. Proc. Natl Acad. Sci. USA 93 , 7085–7090 (1996); corrected article Proc. Natl Acad. Sci. USA 93 , 13429–13434 (1996).

Berry, V. & Gascuel, O. On the interpretation of bootstrap trees: appropriate threshold of clade selection and induced gain. Mol. Biol. Evol. 13 , 999–1011 (1996).

Susko, E. First-order correct bootstrap support adjustments for splits that allow hypothesis testing when using maximum likelihood estimation. Mol. Biol. Evol. 27 , 1621–1629 (2010).

Suzuki, Y., Glazko, G. V. & Nei, M. Overcredibility of molecular phylogenies obtained by Bayesian phylogenetics. Proc. Natl Acad. Sci. USA 99 , 16138–16143 (2002).

Lewis, P. O., Holder, M. T. & Holsinger, K. E. Polytomies and Bayesian phylogenetic inference. Syst. Biol. 54 , 241–253 (2005).

Yang, Z. & Rannala, B. Branch-length prior influences Bayesian posterior probability of phylogeny. Syst. Biol. 54 , 455–470 (2005).

Huelsenbeck, J. P. & Rannala, B. Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Syst. Biol. 53 , 904–913 (2004).

Brown, J. M., Hedtke, S. M., Lemmon, A. R. & Lemmon, E. M. When trees grow too long: investigating the causes of highly inaccurate Bayesian branch-length estimates. Syst. Biol. 59 , 145–161 (2010).

Rannala, B., Zhu, T. & Yang, Z. Tail paradox, partial identifiability and influential priors in Bayesian branch length inference. Mol. Biol. Evol. 29 , 325–335 (2012).

Zhang, C., Rannala, B. & Yang, Z. Robustness of compound Dirichlet priors for Bayesian inference of branch lengths. Syst. Biol. 10 Feb 2012 (doi: 10.1093/sysbio/sys030).

Suchard, M. & Rambaut, A. Many-core algorithms for statistical phylogenetics. Bioinformatics 25 , 1370–1376 (2009).

Zierke, S. & Bakos, J. FPGA acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods. BMC Bioinform. 11 , 184 (2010).

Bininda-Emonds, O. R. P. Phylogenetic Supertrees: Combining Information to Reveal the Tree of Life (Kluwer Academic, the Netherlands, 2004).

de Queiroz, A. & Gatesy, J. The supermatrix approach to systematics. Trends Ecol. Evol. 22 , 34–41 (2007).

Yang, Z. Maximum-likelihood models for combined analyses of multiple sequence data. J. Mol. Evol. 42 , 587–596 (1996).

Shapiro, B., Rambaut, A. & Drummond, A. J. Choosing appropriate substitution models for the phylogenetic analysis of protein-coding sequences. Mol. Biol. Evol. 23 , 7–9 (2006).

Ren, F., Tanaka, H. & Yang, Z. A likelihood look at the supermatrix–supertree controversy. Gene 441 , 119–125 (2009).

Criscuolo, A., Berry, V., Douzery, E. J. & Gascuel, O. SDM: a fast distance-based approach for (super) tree building in phylogenomics. Syst. Biol. 55 , 740–755 (2006).

Wiens, J. J. & Moen, D. S. Missing data and the accuracy of Bayesian phylogenetics. J. Syst. Evol. 46 , 307–314 (2008).

Dwivedi, B. & Gadagkar, S. Phylogenetic inference under varying proportions of indel-induced alignment gaps. BMC Evol. Biol. 9 , 1471–2148 (2009).

Rodrigue, N., Philippe, H. & Lartillot, N. Mutation-selection models of coding sequence evolution with site-heterogeneous amino acid fitness profiles. Proc. Natl Acad. Sci. USA 107 , 4629–4634 (2010).

Pagel, M. & Meade, A. A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Syst. Biol. 53 , 571–581 (2004).

Nishihara, H., Okada, N. & Hasegawa, M. Rooting the Eutherian tree — the power and pitfalls of phylogenomics. Genome Biol. 8 , R199 (2007).

Leigh, J. W., Susko, E., Baumgartner, M. & Roger, A. J. Testing congruence in phylogenomic analysis. Syst. Biol. 57 , 104–115 (2008).

Higgins, D. G. & Sharp, P. M. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73 , 237–244 (1988).

Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32 , 1792–1797 (2004).

Löytynoja, A. & Goldman, N. An algorithm for progressive multiple alignment of sequences with insertions. Proc. Natl Acad. Sci. USA 102 , 10557–10562 (2005).

Article   PubMed   CAS   PubMed Central   Google Scholar  

Löytynoja, A. & Goldman, N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science 320 , 1632–1635 (2008).

Thorne, J. L., Kishino, H. & Felsenstein, J. An evolutionary model for maximum likelihood alignment of DNA sequences. J. Mol. Evol. 33 , 114–124 (1991); erratum J. Mol. Evol . 34 , 91 (1992).

Hein, J., Jensen, J. L. & Pedersen, C. N. Recursions for statistical multiple alignment. Proc. Natl Acad. Sci. USA 100 , 14960–14965 (2003).

Redelings, B. D. & Suchard, M. A. Joint Bayesian estimation of alignment and phylogeny. Syst. Biol. 54 , 401–418 (2005).

Lunter, G., Miklos, I., Drummond, A., Jensen, J. L. & Hein, J. Bayesian coestimation of phylogeny and sequence alignment. BMC Bioinformatics 6 , 83 (2005).

Thorne, J. L., Kishino, H. & Painter, I. S. Estimating the rate of evolution of the rate of molecular evolution. Mol. Biol. Evol. 15 , 1647–1657 (1998). This paper describes the first Bayesian MCMC method for dating species divergence using minimum and maximum bounds to incorporate fossil calibrations.

Kishino, H., Thorne, J. L. & Bruno, W. J. Performance of a divergence time estimation method under a probabilistic model of rate evolution. Mol. Biol. Evol. 18 , 352–361 (2001).

Rannala, B. & Yang, Z. Inferring speciation times under an episodic molecular clock. Syst. Biol. 56 , 453–466 (2007).

Yang, Z. & Rannala, B. Bayesian estimation of species divergence times under a molecular clock using multiple fossil calibrations with soft bounds. Mol. Biol. Evol. 23 , 212–226 (2006).

Inoue, J., Donoghue, P. C. H. & Yang, Z. The impact of the representation of fossil calibrations on Bayesian estimation of species divergence times. Syst. Biol. 59 , 74–89 (2010).

Tavaré, S., Marshall, C. R., Will, O., Soligos, C. & Martin, R. D. Using the fossil record to estimate the age of the last common ancestor of extant primates. Nature 416 , 726–729 (2002).

Article   PubMed   CAS   Google Scholar  

Wilkinson, R. D. et al. Dating primate divergences through an integrated analysis of palaeontological and molecular data. Syst. Biol. 60 , 16–31 (2011).

Knowles, L. L. Statistical phylogeography. Annu. Rev. Ecol. Syst. 40 , 593–612 (2009).

Lemey, P., Rambaut, A., Drummond, A. J. & Suchard, M. A. Bayesian phylogeography finds its roots. PLoS Comp. Biol. 5 , e1000520 (2009).

Lemey, P., Rambaut, A., Welch, J. J. & Suchard, M. A. Phylogeography takes a relaxed random walk in continuous space and time. Mol. Biol. Evol. 27 , 1877–1885 (2010).

Takahata, N., Satta, Y. & Klein, J. Divergence time and population size in the lineage leading to modern humans. Theor. Popul. Biol. 48 , 198–221 (1995).

Rannala, B. & Yang, Z. Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci. Genetics 164 , 1645–1656 (2003). This study describes the multi-species coalescent model. This is the basis for carrying out comparative analyses of individual genomes and phylogeographic studies and for applying species tree methods.

Drummond, A. J., Nicholls, G. K., Rodrigo, A. G. & Solomon, W. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161 , 1307–1320 (2002).

Hey, J. Isolation with migration models for more than two populations. Mol. Biol. Evol. 27 , 905–920 (2010).

Knowles, L. L. & Carstens, B. C. Delimiting species without monophyletic gene trees. Syst. Biol. 56 , 887–895 (2007).

Yang, Z. & Rannala, B. Bayesian species delimitation using multilocus sequence data. Proc. Natl Acad. Sci. USA 107 , 9264–9269 (2010). This paper describes a Bayesian MCMC method for delimiting species using sequence data from multiple loci under the multi-species coalescent model.

Rohland, N. et al. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants. PLoS Biol. 8 , e1000564 (2010).

Bos, K. I. et al. A draft genome of Yersinia pestis from victims of the Black Death. Nature 478 , 506–510 (2011).

Patterson, N., Richter, D. J., Gnerre, S., Lander, E. S. & Reich, D. Genetic evidence for complex speciation of humans and chimpanzees. Nature 441 , 1103–1108 (2006).

Innan, H. & Watanabe, H. The effect of gene flow on the coalescent time in the human–chimpanzee ancestral population. Mol. Biol. Evol. 23 , 1040–1047 (2006).

Becquet, C. & Przeworski, M. A new approach to estimate parameters of speciation models with application to apes. Genome Res. 17 , 1505–1519 (2007).

Hobolth, A., Christensen, O. F., Mailund, T. & Schierup, M. H. Genomic relationships and speciation times of human, chimpanzee, and gorilla inferred from a coalescent hidden Markov model. PLoS Genet. 3 , e7 (2007).

Burgess, R. & Yang, Z. Estimation of hominoid ancestral population sizes under Bayesian coalescent models incorporating mutation rate variation and sequencing errors. Mol. Biol. Evol. 25 , 1979–1994 (2008).

Becquet, C. & Przeworski, M. Learning about modes of speciation by computational approaches. Evolution 63 , 2547–2562 (2009).

Yang, Z. A likelihood ratio test of speciation with gene flow using genomic sequence data. Genome Biol. Evol. 2 , 200–211 (2010).

Reich, D. et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468 , 1053–1060 (2010).

Sitnikova, T., Rzhetsky, A. & Nei, M. Interior-branch and bootstrap tests of phylogenetic trees. Mol. Biol. Evol. 12 , 319–333 (1995).

Zhong, B., Yonezawa, T., Zhong, Y. & Hasegawa, M. The position of gnetales among seed plants: overcoming pitfalls of chloroplast phylogenomics. Mol. Biol. Evol. 27 , 2855–2863 (2010).

Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7 , 214 (2007).

Kosakovsky Pond, S. L., Frost, S. D. W. & Muse, S. V. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21 , 676–679 (2005).

Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24 , 1586–1591 (2007).

Lartillot, N. & Philippe, H. Computing Bayes factors using thermodynamic integration. Syst. Biol. 55 , 195–207 (2006).

Xie, W., Lewis, P. O., Fan, Y., Kuo, L. & Chen, M.-H. Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst. Biol. 60 , 150–160 (2011).

Download references

Acknowledgements

We thank the three referees for their constructive comments and M. Hasegawa and B. Zhong for providing the seed-plant phylogenies of Fig. 3. Z.Y. is supported by a UK Biotechnology and Biological Sciences Research Council grant and a Royal Society Wolfson Research Merit Award. B.R. is supported by a US National Institutes of Health grant.

Author information

Authors and affiliations.

Center for Computational and Evolutionary Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China

Ziheng Yang & Bruce Rannala

Department of Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK

Ziheng Yang

Genome Center and Department of Evolution and Ecology, University of California, Davis, California 95616, USA

Bruce Rannala

Corresponding author

Correspondence to Ziheng Yang.

Ethics declarations

Competing interests.

The authors declare no competing financial interests.

Related links

Further information.

Ziheng Yang's homepage

Bruce Rannala's homepage

A comprehensive list of phylogenetic programs maintained by Joe Felsenstein

Nature Reviews Genetics article series on Study designs

Glossary

Systematics. The inference of phylogenetic relationships among species and the use of such information to classify species.

Taxonomy. The description, classification and naming of species.

Coalescent. The process of joining ancestral lineages when the genealogical relationships of a random sample of sequences from a modern population are traced back.

Gene tree. The phylogenetic or genealogical tree of sequences at a gene locus or genomic region.

Statistical phylogeography. The statistical analysis of population data from closely related species to infer population parameters and processes such as population sizes, demography, migration patterns and rates.

Species tree. A phylogenetic tree for a set of species that underlies the gene trees at individual loci.

Systematic errors. Errors that are due to an incorrect model assumption. They are exacerbated when the data size increases.

Random sampling errors. Errors or uncertainties in parameter estimates owing to limited data.

Cluster algorithm. An algorithm for assigning a set of individuals to groups (or clusters) so that objects of the same cluster are more similar to each other than those from different clusters. Hierarchical cluster analysis can be agglomerative (starting with single elements and successively joining them into clusters) or divisive (starting with all objects and successively dividing them into partitions).

Markov chain. A stochastic sequence (or chain) of states with the property that, given the current state, the probabilities for the next state do not depend on the past states.

Transitions. Substitutions between the two pyrimidines (T↔C) or between the two purines (A↔G).

Transversions. Substitutions between a pyrimidine and a purine (T or C↔A or G).

Unrooted trees. Phylogenetic trees for which the location of the root is unspecified.

Long-branch attraction. The phenomenon of inferring an incorrect tree with long branches grouped together by parsimony or by model-based methods under simplistic models.

Likelihood ratio test. A general hypothesis-testing method that uses the likelihood to compare two nested hypotheses, often using the χ² distribution to assess significance.

Molecular clock. The hypothesis or observation that the evolutionary rate is constant over time or across lineages.

Prior distribution. The distribution assigned to parameters before the analysis of the data.

Posterior distribution. The distribution of the parameters (or models) conditional on the data. It combines the information in the prior and in the data (likelihood).

Markov chain Monte Carlo algorithms (MCMC algorithms). A Monte Carlo simulation is a computer simulation of a biological process using random numbers. An MCMC algorithm is a Monte Carlo simulation algorithm that generates a sample from a target distribution (often a Bayesian posterior distribution).

Clades. Groups of species that have descended from a common ancestor.

Graphical processing units (GPU). Specialized units that are traditionally used to manipulate output on a video display and have recently been explored for use in parallel computation.
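Two of the definitions above lend themselves to a short worked sketch: the split of nucleotide substitutions into those within the purines or within the pyrimidines (transitions) versus those between the two classes (transversions), and the likelihood-based comparison of two nested hypotheses against a χ² distribution. The Python below is a minimal illustration under stated assumptions: the function names and example values are hypothetical, and the test is restricted to one degree of freedom so that the χ² tail probability can be written with the standard library alone.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(x: str, y: str) -> str:
    """Label a pair of aligned bases as identical, transition or transversion."""
    x, y = x.upper(), y.upper()
    bases = PURINES | PYRIMIDINES
    if x not in bases or y not in bases:
        raise ValueError("expected unambiguous bases A, C, G or T")
    if x == y:
        return "identical"
    # Both purines or both pyrimidines: a transition.
    if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
        return "transition"
    # One purine and one pyrimidine: a transversion.
    return "transversion"


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    kinds = [classify_substitution(a, b) for a, b in zip(seq_a, seq_b)]
    return kinds.count("transition"), kinds.count("transversion")


def lrt_df1(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the statistic 2 * (lnL_alt - lnL_null) and its p-value under
    a chi-square distribution with 1 degree of freedom, for which the
    upper tail probability equals erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.erfc(math.sqrt(stat / 2.0))


if __name__ == "__main__":
    print(count_ts_tv("AACCGGTT", "GACTGCTA"))
    # Hypothetical log-likelihoods for a null and a one-parameter-richer model.
    print(lrt_df1(-1001.92, -1000.0))
```

The one-degree-of-freedom restriction is a simplification made to avoid a dependency; for models that differ by more parameters, a general χ² survival function from a statistics library would be the usual choice.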

About this article

Cite this article.

Yang, Z., Rannala, B. Molecular phylogenetics: principles and practice. Nat Rev Genet 13 , 303–314 (2012). https://doi.org/10.1038/nrg3186

Issue Date : May 2012

Cambridge Dictionary

Meaning of phylogenetic in English

  • To clarify this line of reasoning, it is necessary to consider our phylogenetic heritage.
  • The phylogenetic distance between recipient cytoplasm and donor nucleus could be partly responsible for these results.
  • The phylogenetic relationships of the three tick families are unresolved.
COMMENTS

  1. Phylogenetics

    In biology, phylogenetics (/ˌfaɪloʊdʒəˈnɛtɪks, -lə-/) [1][2][3] is the study of the evolutionary history and relationships among or within groups of organisms. These relationships are determined by phylogenetic inference, methods that focus on observed heritable traits, such as DNA sequences, protein amino acid sequences, or ...

  2. Phylogenetic trees

    A phylogenetic tree is a diagram that represents evolutionary relationships among organisms. Phylogenetic trees are hypotheses, not definitive facts. The pattern of branching in a phylogenetic tree reflects how species or other groups evolved from a series of common ancestors. In trees, two species are more related if they have a more recent ...

  3. Phylogeny

    phylogeny, the history of the evolution of a species or group, especially in reference to lines of descent and relationships among broad groups of organisms. Fundamental to phylogeny is the proposition, universally accepted in the scientific community, that plants or animals of different species descended from common ancestors.

  4. Phylogenetics

    Phylogenetics Definition. Phylogenetics is the scientific study of phylogeny. It studies evolutionary relationships among various groups of organisms based on evolutionary history, similarities, and differences. It makes use of molecular sequencing data (such as homologous sequences, protein sequences, nucleotide sequences, etc.) and morphological data matrices to understand and analyze the ...

  5. Phylogenetics

    Phylogenetics, in biology, the study of the ancestral relatedness of groups of organisms, whether alive or extinct. History: classification of the natural world into meaningful and useful categories has long been a basic human impulse and is systematically evident at least since the time of ancient Greece. Dominant for close to 2,000 years in the West was the notion of a "Great Chain of Being ...

  6. Phylogeny review (article)

    Phylogenetic trees are hypotheses of relatedness. Although we know that modern organisms evolved from ancient organisms, the pathway of this evolution is sometimes a best guess based on the amount of evidence available at the time. The more we uncover about the lineage of a set of organisms, the more accurate the phylogenetic trees become.

  7. Building a phylogenetic tree (article)

    Phylogenetic trees represent hypotheses about the evolutionary relationships among a group of organisms. A phylogenetic tree may be built using morphological (body shape), biochemical, behavioral, or molecular features of species or other groups. In building a tree, we organize species into nested groups based on shared derived traits (traits ...

  8. What is phylogenetics?

    Phylogenetics is the study of the evolutionary relationships between organisms, based on their genetic material revealed through DNA and RNA sequencing. A phylogeny, or a phylogenetic tree, is a way of visually representing evolutionary relationships. Phylogenies are a scientist's best guess as to how an organism or group of organisms has evolved.

  9. 20.1A: Phylogenetic Trees

    Figure 20.1A.1: Rooted phylogenetic trees: The root of a phylogenetic tree indicates that an ancestral lineage gave rise to all organisms on the tree. A branch point indicates where two lineages diverged. A lineage that evolved early and remains unbranched is a basal taxon. When two lineages stem from the same branch point, they are ...

  10. Phylogenetic Trees and Monophyletic Groups

    A phylogenetic tree, also known as a phylogeny, is a diagram that depicts the lines of evolutionary descent of different species, organisms, or genes from a common ancestor. Phylogenies are useful ...

  11. Phylogenetic Inference

    Phylogenetic Inference. First published Wed Dec 8, 2021; substantive revision Thu Jun 30, 2022. Phylogenetics is the study of the evolutionary history and relationships among individuals, groups of organisms (e.g., populations, species, or higher taxa), or other biological entities with evolutionary histories (e.g., genes, biochemicals, or ...

  12. Phylogenetics, Overview

    Definition. Phylogenetics, derived from the Greek terms phylon (meaning "tribe") and genetikos (meaning "genitive" or origin), is the study of the evolutionary history of species, organisms, genes, or proteins through the construction and analysis of mathematical entities known as trees or phylogenies.

  13. 1.14: Phylogenetic Trees

    A phylogenetic tree is a diagram used to reflect evolutionary relationships among organisms or groups of organisms. Scientists consider phylogenetic trees to be a hypothesis of the evolutionary past since one cannot go back to confirm the proposed relationships. In other words, a "tree of life" can be constructed to illustrate when ...

  14. Phylogenesis

    Phylogenesis. Phyletic gradualism (above) shows relatively slow change over geologic time; punctuated equilibrium (below) shows morphological stability interrupted by rare, relatively rapid evolutionary change. Phylogenesis (from Greek φῦλον phylon "tribe" + γένεσις genesis "origin") is the biological ...

  15. Phylogenetics and Systematics in a Nutshell

    Phylogenetic trees are hypotheses, based on character analysis, that explain the historical component of lineage evolution. Speciation is a fundamental process in evolutionary theory, in which a lineage (population) splits, giving rise to two or more new species (Mayr 1993).

  16. Phylogenetic tree

    A phylogenetic tree, phylogeny or evolutionary tree is a graphical representation which shows the evolutionary history between a set of species or taxa during a specific time. In other words, it is a branching diagram or a tree showing the evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical or genetic ...

  17. Phylogenetic tree

    phylogenetic tree, a diagram showing the evolutionary interrelations of a group of organisms derived from a common ancestral form. The ancestor is in the tree "trunk"; organisms that have arisen from it are placed at the ends of tree "branches." The distance of one group from the other groups indicates the degree of relationship; i.e., closely related groups are located on branches ...

  18. Phylogenomics and the reconstruction of the tree of life

    The three main types of standard phylogenetic reconstruction method (distance, parsimony and likelihood methods; Box 1) have been adapted for use in phylogenomics. Phylogenomic reconstruction ...

  19. 4.1: Phylogenetic Trees

    Phylogenetic trees are diagrams that show relationships among organisms. Scientists consider phylogenetic trees as hypotheses of the evolutionary past built from the observable characteristics of the organisms in the tree. In other words, a "tree of life", as it is sometimes called, can be constructed to illustrate the relationships among ...

  20. Phylogenetics is the New Genetics (for Most of Biodiversity)

    New phylogenetic methods have been developed with the explicit goal of linking genes and even specific mutations to species differences ('PhyloG2P'). Applications of these methods show great promise for uncovering new sources of functional variation and tackling traits beyond the reach of traditional genetic approaches.

  21. Molecular phylogenetics: principles and practice

    Phylogenetic analysis is pervading every field of biological study. The authors review and assess the main methods of phylogenetic analysis — including parsimony, distance, likelihood and ...

  22. Phylogenomics

    As greater amounts of data are incorporated into phylogenetic studies, new evidence and hypotheses regarding relationships among taxa can emerge, and placement of lineages within clades can change dramatically. Taxon sampling can thus greatly influence hypotheses supported by phylogenetic inference (Rosenberg & Kumar, 2003; Nabhan & Sarkar, 2012).

  23. PHYLOGENETIC

    PHYLOGENETIC definition: 1. relating to the development of organisms over time, including how they separate into different…