Abstracts For Putting Fossils In Trees Symposium

At the 2014 Annual Meeting of the Society of Vertebrate Paleontology, I am co-organizing a workshop and a special symposium on "tip-dating." I have pulled out all relevant abstracts from the SVP 2014 Program and Abstracts Book.

Note: Official citation information:

Citing an Abstract in the 2014 SVP Program and Abstracts Book: This Program and Abstracts Book is an official supplement to the online version of the Journal of Vertebrate Paleontology. The citation format for an abstract printed in this book is: Journal of Vertebrate Paleontology, Program and Abstracts, 2014, <insert page number here>.


SCHEDULE OF WORKSHOP AND SYMPOSIUM


SYMPOSIUM 3: Putting Fossils In Trees: New Methods for Combining Morphology, Time, and Molecules To Estimate Phylogenetic Position and Divergence Times of Living and Fossil Taxa

FRIDAY MORNING, NOVEMBER 7, 2014
8:00am – 12:15pm

LOCATION: ESTREL BERLIN, HALL A

MODERATORS: Nicholas J. Matzke and April Wright

Schedule of "Putting Fossils In Trees":
8:00 Matzke, N., Wright, A., Bapst, D. Incorporation of Absolute and Relative Fossil Dating Information In Bayesian Tip-Dating Analyses Using the R Package BEASTmasteR: Examples From Assassin Spiders, Salmonids, and Hominids

8:15 Irmis, R., Parham, J., Ksepka, D. Understanding and Incorporating Geologic Information In Divergence Dating Analyses

8:30 Wagner, P., Marcot, J. Macroevolutionary Models and Tip-Dating: Turning Putative Assumptions Into Testable Hypotheses

8:45 Guillerme, T., Cooper, N. Combining Living and Fossil Taxa Into Phylogenies: the Missing Data Issue

9:00 Pol, D., Xu, X. Effects of Non-Randomly Distributed Missing Data In Parsimony and Bayesian Analyses

9:15 Clarke, J., Boyd, C. Methods for the Quantitative Comparison of Molecular Estimates of Clade Age and the Fossil Record

9:30 Puttick, M., Benton, M., Thomas, G. Origin of Mammals: Molecular Versus Morphological Clocks

9:45 Warnock, R., Donoghue, P. Testing the Molecular Clock Using Simulated Trees, Fossils, and Sequences

10:00 BREAK

10:15 Ksepka, D., Phillips, M. Putting Fossil Birds In Trees: Empirical Evidence for Biases In Dating the Avian Tree of Life

10:30 O'Reilly, J., Donoghue, P., Dos Reis, M., Yang, Z. Evaluating the Performance of Node Versus Tip Based Fossil Calibration of the Molecular Clock

10:45 Friedman, M., Dornburg, A., Near, T. Morphological Clocks Close the Gap Between Ages of Teleost Fishes Estimated From Molecular Clocks and the Fossil Record

11:00 Turner, A., Pritchard, A., Matzke, N. J. 'Tip-Dating' When All You Have Are Fossils: Comparing Traditional and Bayesian Approaches To Fossil Divergence Times

11:15 Wright, A., Lloyd, G., Matzke, N. Fossils-Only Tip-Dating of Deinonychosaurian Theropods: a Comparison of Methods and Models

11:30 Brochu, C. Gharial Biogeography, Conflicting Signals, and Phylogenetic Entmoots

11:45 Gorscak, E., O'Connor, P. Re-Evaluation of Cretaceous Paleobiogeographical Patterns Using Morphological Clock and Model-Based Approaches: a Case Study Utilizing Titanosaurian Sauropods With Evidence for a More Centralized Role for Continental Africa

12:00 Lloyd, G., Bapst, D., Davis, K., Friedman, M. A Probabilistically Time-Scaled 1000-Taxon Phylogenetic Hypothesis for Mesozoic Dinosaurs and the Origins of Flight and Crown-Birds

WORKSHOP: Tip-Dating: Estimating Dated Phylogenies Using Fossils as Terminal Taxa


Note: See the BEASTmasteR code and example scripts!

Tip-Dating: Estimating Dated Phylogenies Using Fossils as Terminal Taxa - FULL

This workshop will introduce participants to new computational methods that allow joint inference of phylogenetic relationships and divergence times. In older dating methods, fossil relationships were estimated with an undated cladistic or Bayesian analysis, and then these fossils were converted, usually subjectively, into prior probability distributions on the dates of certain nodes. These calibrations were then used in molecular clock analyses to date molecular trees. This procedure essentially “threw away” hard-won fossil data (and any living morphology data as well) once the dating calibration was produced.

However, in the last two years, several methods have become available that allow the addition of fossil and living morphology, as well as fossil dates, to dating analyses. In these methods, the phylogenetic relationships of the fossils and living taxa are estimated simultaneously with the dating of the tree. These methods have the potential to be revolutionary for paleontologists. First, because character and dating data from fossil specimens are a requirement for the method, paleontologists and morphologists will have an increased role to play in future divergence time analyses, previously the domain of molecular biologists. Second, the joint estimation of fossil relationships and the divergence times of fossil taxa is of intrinsic interest, and many phylogenetic comparative methods can be applied to fossil data once statistically-estimated, time-scaled trees of fossil taxa are available.

The two main methods currently in use are BEAST (Pyron 2011; Wood, Matzke et al. 2013; Alexandrou et al. 2013) and MrBayes 3.3 (Ronquist et al. 2012). Both take more skill and background than traditional phylogeny-estimation and dating methods. Therefore we will guide participants through tutorials and then help them to set up analyses of their own data.

Date: Tuesday, November 4

Time: 10:00am - 4:00pm

Location: The Leibniz Headquarters (Chausseestr. 111, 150 meters away from the Museum für Naturkunde and next to the U-Bahn station Naturkundemuseum)

Cost: Free (FULL!)
Minimum Number of Participants: 10
Maximum Number of Participants: 40

Leaders:

Nicholas J. Matzke
National Institute for Mathematical and Biological Synthesis
University of Tennessee
matzke@nimbios.org

April Wright
University of Texas, Austin
wright.aprilm@gmail.com

SYMPOSIUM: Putting fossils in trees: new methods for combining morphology, time, and molecules to estimate phylogenetic position and divergence times of living and fossil taxa (Society of Vertebrate Paleontology Annual Meeting, Friday, Nov. 7, Berlin)


Note: See the BEASTmasteR code and example scripts!

Putting fossils in trees: new methods for combining morphology, time, and molecules to estimate phylogenetic position and divergence times of living and fossil taxa

Co-Convenors: Nicholas J. Matzke, April Wright, Graeme Lloyd, David W. Bapst

Fossil data are crucial to correct estimation of phylogeny and divergence times. However, most traditional methods artificially separate the analysis of fossil relationships and divergence time analysis. For example, it is common for paleontologists to estimate the topological position of fossils using cladistic or Bayesian methods, either in a morphology-only or “total evidence” analysis. This tree, which is undated, may then be used by molecular biologists to supply calibration distributions for dating a molecules-only tree of living taxa. Such trees form the starting point for various comparative methods which require dated phylogenies, e.g., model-based ancestral state analyses, diversification analyses, or historical biogeography.

Such procedures “throw away” most of the fossil data, treating paleontology as merely a source of calibration points for molecular analyses, and separate the questions of estimating relationships and dating, when in fact they may be linked. However, increasing collaboration between paleontologists, biologists, statisticians and computer scientists has been fruitful in yielding new technologies and techniques that attempt to combine fossil and living morphology, fossil dates, and molecular data in joint analyses. This symposium will be devoted to reviewing, discussing, and critiquing new methods and models for estimating phylogenetic trees and for incorporating fossils in the derivation of divergence times.

The three foci of the symposium are:

1. "Model-based methods: advantages and limitations." This will focus on the assumptions behind the current probabilistic models for morphological and fossil data, the resulting advantages and limitations, and suggestions for improvements.

2. "Fossils as terminal taxa in dating analyses: prospects and challenges." Methods using fossils as terminal taxa in dating analyses are new and mostly unevaluated, so participants will present case studies that give insight into the practical benefits and problems encountered in the use of such methods.

3. "Fossils as dual information sources: morphology and stratigraphy." The stratigraphic range and sampling frequency of clades also give important information about the timing of clade origins. Stratocladistics was an early attempt to take this information into account, but was not widely adopted. Probabilistic methods, as well as advances in fossil databases, may allow improved approaches. Participants will review and critique recent developments in this area.

TALK #1 — Matzke, Wright & Bapst (2014): Incorporation of Absolute and Relative Fossil Dating Information In Bayesian Tip-Dating Analyses Using the R Package BEASTmasteR: Examples From Assassin Spiders, Salmonids, and Hominids


pp. 182-183

Symposium 3 (Friday, November 7, 2014, 8:00 AM)

INCORPORATION OF ABSOLUTE AND RELATIVE FOSSIL DATING INFORMATION IN BAYESIAN TIP-DATING ANALYSES USING THE R PACKAGE BEASTMASTER: EXAMPLES FROM ASSASSIN SPIDERS, SALMONIDS, AND HOMINIDS

MATZKE, Nicholas, University of Tennessee, Knoxville, Knoxville, TN, United States of America, 37919
WRIGHT, April, The University of Texas at Austin, Austin, TX, United States of America
BAPST, David, South Dakota School of Mines, Rapid City, SD, United States of America

Until recently, Bayesian phylogenetic dating analyses (e.g., in the program BEAST) used fossils only to inform prior distributions on the dates of certain nodes ('node-dating') in molecular phylogenies; the fossil data was effectively 'thrown away' in subsequent analysis of the dated, molecules-only tree. However, recent advances allow simultaneous inference of dating and the phylogenetic position of dated fossils ('tip-dating').

Tip-dating has great potential to increase the use of hard-won paleontological data in phylogenetics: dated fossil tips represent direct observation of character states at particular times, and these observations inform the estimation of rates of character evolution and divergence times, and allow direct inclusion of fossils in phylogenetic comparative methods. Tip-dating raises numerous theoretical issues concerning priors and models, and exploration of these issues has been limited by the practical difficulty of implementing different models in BEAST. To aid this research, we present BEASTmasteR, an R package that can convert standard NEXUS files into BEAST XML files. BEASTmasteR also produces XML Bayesian hierarchical models encoding absolute or relative dating information, including: (1) fossil tips with uncertain dates (e.g., date-ranges based on stratigraphic bins, or distributions derived from radiometric dates); (2) relative dating information for tips (e.g., in some cases two fossils from the same deposit have approximately the same date, despite their absolute date being uncertain; or one fossil may be known to be older than another); and (3) relative dating information for nodes with linked dates. These approaches are demonstrated on several invertebrate and vertebrate datasets. In assassin spiders, inclusion of amber-preserved fossils as tips supports divergences consistent with ancient Gondwanan vicariance. In salmonids, inclusion of Eosalmo as a tip suggests that genome duplication preceded the evolution of anadromy by 45-60 My. In hominids, linking nodes of a gene tree/species tree analysis to a fossil tip-dated phylogeny inferred a human-chimp divergence at 4.38-5.54 Ma, while a morphology-only analysis yielded 4.5-8.95 Ma.
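Note: The dating information described in (1) and (2) above can be illustrated with a small sketch. The Python below is not BEASTmasteR code and does not reproduce its XML output. It is a minimal illustration, assuming uniform date ranges for two hypothetical fossil tips and one hypothetical "older than" constraint, of the kind of joint prior on tip dates that such information implies. The tip names, ranges, and the function sample_tip_ages are all made up for the example, and simple rejection sampling stands in for the MCMC that BEAST would actually use.

```python
import random


def sample_tip_ages(ranges, older_than=(), rng=None, max_tries=10_000):
    """Draw one age (in Ma) per fossil tip.

    ranges     : dict mapping tip name -> (youngest, oldest) bounds in Ma,
                 e.g. the bounds of a stratigraphic bin (uniform prior).
    older_than : iterable of (a, b) pairs meaning "tip a is at least as old as tip b".
    Draws that violate a relative-age constraint are rejected and redrawn.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        ages = {tip: rng.uniform(lo, hi) for tip, (lo, hi) in ranges.items()}
        if all(ages[a] >= ages[b] for a, b in older_than):
            return ages
    raise ValueError("constraints could not be satisfied within max_tries")


# Hypothetical example: two tips with overlapping stratigraphic ranges,
# where fossil_A is known to lie below (be older than) fossil_B.
ranges = {"fossil_A": (45.0, 55.0), "fossil_B": (40.0, 50.0)}
constraints = [("fossil_A", "fossil_B")]
draw = sample_tip_ages(ranges, older_than=constraints, rng=random.Random(1))
```

Repeating the draw many times gives samples from the constrained prior. In a real tip-dating analysis the tip dates would be sampled jointly with the tree and the other parameters instead of being drawn independently like this.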

TALK #2 — Irmis, Parham, & Ksepka (2014): Understanding and Incorporating Geologic Information In Divergence Dating Analyses


p. 152

Symposium 3 (Friday, November 7, 2014, 8:15 AM)

UNDERSTANDING AND INCORPORATING GEOLOGIC INFORMATION IN DIVERGENCE DATING ANALYSES

IRMIS, Randall, University of Utah, Salt Lake City, UT, United States of America, 84108-1214
PARHAM, James, California State University, Fullerton, CA, United States of America
KSEPKA, Daniel, NESCent, Durham, NC, United States of America

The temporal calibration of phylogenetic trees is a necessary prerequisite to investigating the tempo and mode of evolution. Regardless of whether fossils are included directly as terminal taxa or used to formulate node calibrations, the interpretation of geologic data is necessary to assign a numeric (absolute) age to the tips and nodes of a phylogenetic tree. Both paleontological and molecular divergence-dating analyses often ignore uncertainty in the interpretation and application of this geologic information. A major source of uncertainty is the conversion of relative geologic ages (e.g., middle Miocene or Barstovian) to a numeric age (e.g., 16.3-13.6 Ma); these values must reflect both the full age range and the uncertainty in the age of the boundary of the interval in question. Other uncertainties include the analytical uncertainty of a radioisotopic age, which can vary by an order of magnitude depending on the method, the geologic uncertainty of those ages (e.g., crystal residence times and detrital signatures), cross-correlation of different isotopic systems (e.g., systematic bias of Ar/Ar when compared with U-Pb) and the stratigraphic distance between geochronologic constraints and the fossiliferous stratum. All potential sources of uncertainty must be incorporated when justifying the ages of both the tips and internal nodes of a phylogenetic tree.

In many cases, these geochronologic data are used to determine hard minima and soft maxima in divergence-dating analyses; these bounds must reflect the full uncertainties for the ages of the calibrating fossils. Increasingly, such analyses are conducted in a Bayesian framework, which means that the ages and their uncertainties are converted to prior probability curves. If the geologic and paleoenvironmental biases affecting the fossil preservation of a clade are known, models of fossil occurrence data could be used to generate a prior curve that reflects available paleontological information. However, the data to support this methodology are not always readily accessible, so it has not been widely attempted. Consequently, the construction of prior probability curves is a 'black box', with little justification for widely used prior shapes (e.g., linear, parabolic, or logistic). We propose that if a clade's fossil record is well-sampled, these data should dictate the shape of such curves, by using existing methods for calculating confidence intervals for biostratigraphic ranges. In the absence of such data, flat curves between hard minima and soft maxima are the safest assumption.
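Note: The abstract does not say which confidence-interval method the authors have in mind. One existing option is the classical confidence interval on the end of a stratigraphic range (Strauss and Sadler 1989; Marshall 1990), and the sketch below assumes that choice. It also assumes what that method assumes: fossil horizons are distributed randomly and independently through the true range. The occurrence ages, the number of horizons, and the function name range_extension are hypothetical, chosen only to show how such an interval could supply a soft maximum above a hard minimum.

```python
def range_extension(observed_range, n_horizons, confidence=0.95):
    """One-sided range extension beyond an observed stratigraphic range.

    Classical result for randomly distributed fossil horizons:
        extension = observed_range * ((1 - C) ** (-1 / (H - 1)) - 1)
    where C is the confidence level and H the number of fossil horizons.
    """
    if n_horizons < 2:
        raise ValueError("need at least two fossil horizons")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = (1.0 - confidence) ** (-1.0 / (n_horizons - 1)) - 1.0
    return alpha * observed_range


# Hypothetical clade: oldest occurrence 16.0 Ma, youngest 10.0 Ma, 10 horizons.
oldest, youngest, horizons = 16.0, 10.0, 10

hard_min = oldest  # the clade is at least as old as its oldest fossil
soft_max = oldest + range_extension(oldest - youngest, horizons, confidence=0.95)
```

With only 10 horizons the 95% extension is roughly 40% of the observed range, so a sparsely sampled record gives a wide gap between the hard minimum and the soft maximum. A real calibration would also need to fold in the dating uncertainties discussed in the first paragraph of the abstract.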

TALK #3 — Wagner & Marcot (2014): Macroevolutionary Models and Tip-Dating: Turning Putative Assumptions Into Testable Hypotheses


p. 251

Symposium 3 (Friday, November 7, 2014, 8:30 AM)

MACROEVOLUTIONARY MODELS AND TIP-DATING: TURNING PUTATIVE ASSUMPTIONS INTO TESTABLE HYPOTHESES

WAGNER, Peter, Smithsonian, Washington, DC, United States of America, 20013-7012
MARCOT, Jon, University of Illinois at Urbana-Champaign, Urbana, IL, United States of America

Tip-dating affects macroevolutionary issues as broad as possible morphological and phylogenetic 'explosions' down to those as fine as which speciation models predominate within a clade by affecting how much time we allot to evolutionary change. Consider three basic 'speciation' models that paleontologists might encounter: (1) continuous morphological change (i.e., 'Darwinian' speciation) with continuous cladogenesis; (2) pulsed morphological change in which rates of change are elevated during events paleontologists recognize as speciation (e.g., punctuated equilibrium and related models), but where speciation events are continuous through time; and (3) pulsed turnovers, in which morphological change is pulsed and these pulses are concentrated in particular intervals of time. Model 1 is the simplest (one distribution of morphological rates and one distribution of cladogenetic rates) and it typically will imply the deepest divergence times for any given dataset. Model 2 adds complexity (separate pulse and background rates of change instead of a shared rate for pulses and background time as in Model 1) and increases the probability of change over short intervals of time. Model 3 is the most complex (pulsed vs. background rates for both character change and speciation) and can predict considerable morphological change over short intervals of time even with uniform per-speciation rates of change. Thus, a key issue when contrasting these models is: given apparent frequencies of change within lineages and hypothesized changes between lineages, can we allot as much time as Model 1 requires for divergences?

Tip-dating procedures usually assume Model 1, but it is easy to modify the underlying Mk methods to allow for Models 2 and 3. These can be combined with divergence-time likelihoods given stratigraphic data, using occurrence data to estimate distributions of preservation rates; the divergence-time likelihoods now give us not simply likely divergence times, but the relative likelihoods of the basic speciation models. Case studies with two groups of Ordovician gastropods show that tip-dating with pulsed change (Model 2) yields vastly better likelihoods than does continuous morphological change (Model 1). In one of the two gastropod cases, the pulsed speciation hypothesis (Model 3) increases model likelihoods still more. Thus, evolutionary models that might seem to invalidate tip-dating procedures actually are testable hypotheses given a tip-dating framework.

TALK #4 — Guillerme & Cooper (2014): Combining Living and Fossil Taxa Into Phylogenies: the Missing Data Issue


p. 142

Symposium 3 (Friday, November 7, 2014, 8:45 AM)

COMBINING LIVING AND FOSSIL TAXA INTO PHYLOGENIES: THE MISSING DATA ISSUE

GUILLERME, Thomas, Trinity College Dublin, Dublin, Ireland
COOPER, Natalie, Trinity College Dublin, Dublin, Ireland

Living species represent less than 1% of all species that have ever lived. Ignoring fossil taxa may lead to misinterpretation of macroevolutionary patterns and processes such as trends in species richness, biogeographical history, or paleoecology. This fact has led to an increasing consensus among scientists that both fossil and living taxa must be included in macroevolutionary studies. One approach, the Total Evidence Method, uses molecular data from living taxa and morphological data from both living and fossil taxa to infer phylogenies with both fossil and living taxa at the tips. Although the Total Evidence Method seems very promising, it requires a lot of data and is therefore likely to suffer from missing data issues which may affect its ability to infer correct phylogenies.

In this study we assess the effect of missing data on tree topologies inferred from total evidence supermatrices. Using simulations we investigate three major factors that directly affect the completeness of the morphological part of the supermatrix: (1) the proportion of living taxa with no morphological data; (2) the amount of missing data in the fossil record; and (3) the overall number of morphological characters in the supermatrix. We find that, in a Bayesian framework, difficulties in recovering a stable topology are mainly driven by the missing data in the molecular part of the matrix (for which fossil taxa have no data). In a Maximum Likelihood framework, however, topology is not directly affected by missing data per se, but by the number of morphological characters shared among the taxa. Therefore, the two main drivers of incorrect topologies are the overall number of morphological characters and the number of living species with no morphological data.

Our results suggest that, in order to use total evidence methods, one should reduce the missing data in the morphological part of the supermatrix for living species and use a Maximum Likelihood framework to fix the topology prior to the overall Bayesian phylogenetic inference process. We apply our method to a comprehensive data set of both living and fossil primates. We find that using this integrative method modifies previous estimates of rates of body mass evolution within primates, emphasizing the importance of using such methods.

TALK #5 — Pol & Xu (2014): Effects of non-randomly distributed missing data in parsimony and Bayesian analyses


p. 206

Symposium 3 (Friday, November 7, 2014, 9:00 AM)

EFFECTS OF NON-RANDOMLY DISTRIBUTED MISSING DATA IN PARSIMONY AND BAYESIAN ANALYSES

POL, Diego, Museo Paleontologico E. Feruglio, Trelew, Argentina
XU, Xing, Institute of Vertebrate Paleontology & Paleoanthropology, Beijing, China

The use of Bayesian analyses of paleontological data matrices has increased in recent years and the potential advantages of this approach have been advocated in the literature, such as statistical properties of the estimates and its natural integration with Bayesian molecular clock estimates. Sample cases have been discussed because they yielded topological results that differ from those of parsimony analyses, such as the recently discussed phylogenetic position of Archaeopteryx and its affinities with basal avialans. All these applications of Bayesian phylogenetic analyses of morphological data are based on the assumption that all characters evolve through a homogeneous Markov model, the Mk model, which is a generalization of the simplest model used for nucleotide substitutions (the Jukes-Cantor model).
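Note: To make the relationship between the Mk model and the Jukes-Cantor model concrete, here is a minimal sketch of the Mk transition probabilities for a character with k states, all changes equally likely. It is illustrative only and is not taken from the abstract or from any particular program. The function name mk_transition_prob and the parameterization (rate is the expected number of changes per unit time) are assumptions made for this example.

```python
import math


def mk_transition_prob(k, rate, t, same_state):
    """Transition probability under the Mk model (k states, equal rates).

    rate * t is the expected number of character changes along the branch.
    same_state=True  -> probability of ending in the starting state
    same_state=False -> probability of ending in one particular other state
    """
    if k < 2:
        raise ValueError("Mk needs at least two states")
    decay = math.exp(-k * rate * t / (k - 1))
    if same_state:
        return 1.0 / k + (k - 1) / k * decay
    return (1.0 - decay) / k


# With k = 4 these are the Jukes-Cantor probabilities for nucleotides.
p_same = mk_transition_prob(4, rate=0.1, t=2.0, same_state=True)
p_diff = mk_transition_prob(4, rate=0.1, t=2.0, same_state=False)

# A binary morphological character (k = 2) over the same branch.
p_same_binary = mk_transition_prob(2, rate=0.1, t=2.0, same_state=True)
```

The probability of staying in the same state plus (k - 1) times the probability of moving to one particular other state is always 1, and on very long branches every state approaches probability 1/k.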

Despite the adequacy of this model for treating morphological data, paleontological datasets are characterized by the presence of abundant missing data. The distribution of missing data in paleontological data matrices is non-random, and is usually concentrated on highly incompletely scored taxa and highly incompletely scored characters. Recent studies using both empirical and simulated data matrices have shown that probability-based methods (including Bayesian analysis) can be affected by the presence of abundant missing entries. However, the impact of these problems for paleontological matrices has not been thoroughly studied yet.

Here I present a study on the effect that non-randomly distributed missing entries have on a set of empirical data matrices of morphological characters and assess the impact of the type and quantity of missing data on Bayesian analysis in comparison with parsimony analysis. The sensitivity of both methods is compared in terms of the topological results obtained under different regimes of quantity and distribution of missing entries, as well as on their support measures (posterior probabilities in Bayesian analysis and bootstrap frequencies for parsimony analysis). The results of these analyses show that both methods can be highly sensitive to the presence of non-randomly distributed missing entries, in particular for the case of highly incompletely scored taxa. However, a major difference in the results of both methods is found in the obtained support measures, which indicate an overestimation of credibility measures for the position of highly incomplete taxa in Bayesian analyses.

TALK #6 — Clarke & Boyd (2014): Methods for the Quantitative Comparison of Molecular Estimates of Clade Age and the Fossil Record


p. 110

Symposium 3 (Friday, November 7, 2014, 9:15 AM)

METHODS FOR THE QUANTITATIVE COMPARISON OF MOLECULAR ESTIMATES OF CLADE AGE AND THE FOSSIL RECORD

CLARKE, Julia, The University of Texas at Austin, Austin, TX, United States of America, 78712
BOYD, Clint, South Dakota School of Mines and Technology, Rapid City, SD, United States of America

Approaches quantifying relative congruence, or incongruence, of molecular divergence estimates and the fossil record have been limited. Previously proposed methods are largely node specific, assessing incongruence at particular nodes for which both fossil data and molecular divergence estimates are available. These existing metrics, and other methods that quantify incongruence across topologies including entirely extinct clades, have so far not taken into account uncertainty surrounding both the divergence estimates and the ages of fossils. They have also treated molecular divergence estimates younger than previously assessed fossil minimum estimates of clade age as if they were the same as cases in which they were older. However, these cases are not the same. Recovered divergence dates younger than compared oldest known occurrences require prior hypotheses regarding the phylogenetic position of the compared fossil record and standard assumptions about the relative timing of morphological and molecular change to be incorrect. Older molecular dates, by contrast, are consistent with an incomplete fossil record and do not require prior assessments of the fossil record to be unreliable in some way.

Here, we compare previous approaches and introduce two new descriptive metrics. Both metrics explicitly incorporate information on uncertainty by utilizing the 95% confidence intervals on estimated divergence dates and data on stratigraphic uncertainty concerning the age of the compared fossils. Metric scores are maximized when these ranges are overlapping. MDI (minimum divergence incongruence) discriminates between situations where molecular estimates are younger or older than known fossils, reporting both absolute fit values and a number score for incompatible nodes. DIG range (divergence implied gap range) allows quantification of the minimum increase in implied missing fossil record induced by enforcing a given set of molecular-based estimates.

These metrics are used together to describe the relationship between time trees and a set of fossil data, which we recommend be phylogenetically-vetted and referred on the basis of apomorphy. Differences from previously proposed metrics and the utility of MDI and DIG range are illustrated in three empirical case studies from angiosperms, ostracods, and birds. These case studies also illustrate the ways in which MDI and DIG range may be used to assess time trees resultant from analyses varying in calibration regime, divergence dating approach or molecular sequence data analyzed.

TALK #7 — Puttick, Benton, & Thomas (2014): Origin of Mammals: Molecular Versus Morphological Clocks


p. 210

Symposium 3 (Friday, November 7, 2014, 9:30 AM)

ORIGIN OF MAMMALS: MOLECULAR VERSUS MORPHOLOGICAL CLOCKS

PUTTICK, Mark, University of Bristol, Bristol, United Kingdom
BENTON, Michael, University of Bristol, Bristol, United Kingdom
THOMAS, Gavin, University of Sheffield, Sheffield, United Kingdom

The origin of mammals has received a huge amount of attention in recent years, and has been the source of much debate. A particular controversy is that the vast majority of molecular-dated phylogenies place the origin of placentals within the Cretaceous, but no certifiable crown placental fossils are known until the Cenozoic. One way to investigate this issue is to perform a morphological-clock analysis that allows for the dating of nodes with the direct incorporation of fossils by using fossils as tip dates in phylogenies. The resulting dates can then be directly compared to dates of placental nodes from a traditional molecular-clock analysis of extant mammals and so elucidate how the incorporation of fossil data impacts our understanding of the evolution of a major group of extant animals.

Using a recently-published extensive morphological matrix of extinct and extant mammals allows for a direct comparison of ages of the origin of major clades of placental mammals to the largest molecular study of mammalian origins. Whilst a molecular date for the origin of placentals is around 90 Ma, the use of the morphological clock alone based on tip-dating pushes these results back further into the Mesozoic. These results have important implications for both how we understand the timing of mammalian evolution, and for how we employ the morphological clock.

TALK #8 — Warnock & Donoghue (2014): Testing the Molecular Clock Using Simulated Trees, Fossils, and Sequences


p. 252

Symposium 3 (Friday, November 7, 2014, 9:45 AM)

TESTING THE MOLECULAR CLOCK USING SIMULATED TREES, FOSSILS, AND SEQUENCES

WARNOCK, Rachel, Smithsonian NMNH, Washington, DC, United States of America, 20013-7012
DONOGHUE, Philip, University of Bristol, Bristol, United Kingdom

The molecular clock provides the most powerful means of establishing an evolutionary timescale. Approaches to calibrating the molecular substitution rate vary in their assumptions and complexity, differ in their use of geological evidence, and invariably yield different divergence estimates. Surprisingly, competing approaches to calibration have never been tested because, in reality, the true evolutionary timescale is never known. Consequently, it has not been possible to assess the accuracy and precision with which divergence times can ever be known. The solution is to use simulated data, where the relationship between times of divergence, molecular rate variation, and fossil evidence is known. In this study, we develop simulations that combine realistic models of speciation, molecular evolution, and fossil preservation. We use a non-random stratigraphic model of preservation, based on the well-defined depositional cycles that have been documented for the past 250 Ma. We first test the accuracy and precision of four quantitative and probabilistic methods of deriving temporal constraints from the fossil record. We implement these as bespoke calibration priors in Bayesian molecular clock analyses and assess the accuracy and precision of posterior divergence estimates; these are compared to the use of arbitrary priors. Finally, we present a case study using primates. The results demonstrate that paleontological constraints can be accurate but will typically be imprecise. Accurate molecular divergence estimates require both accurate and precise fossil-based constraints. However, the accuracy of posterior estimates is not determined by the accuracy of the specified fossil-based calibrations. Instead, the accuracy is determined by the way in which the calibrations are effectively implemented by contemporary Bayesian models of divergence time estimation. The analysis of the primate fossil and molecular data illustrates that this has material consequences for understanding the evolution of this group: reliable hypothesis testing surrounding the K-Pg boundary requires a higher degree of precision than is obtainable from current knowledge of the primate fossil record.

TALK #9 — Ksepka & Phillips (2014): Putting Fossil Birds In Trees: Empirical Evidence for Biases In Dating the Avian Tree of Life


p. 163

Symposium 3 (Friday, November 7, 2014, 10:15 AM)

PUTTING FOSSIL BIRDS IN TREES: EMPIRICAL EVIDENCE FOR BIASES IN DATING THE AVIAN TREE OF LIFE

KSEPKA, Daniel, NESCent, Durham, NC, United States of America, 27705
PHILLIPS, Matthew, Queensland University of Technology, Brisbane, Australia

Birds are the most diverse extant tetrapod clade, but the timing of the crown avian radiation remains controversial. In general, the fossil record supports a primarily Cenozoic radiation, whereas divergence dating analyses push many splits into the Cretaceous. One underappreciated phenomenon is that disparity between fossil ages and molecular dates tends to be proportionally greater for shallower nodes in the Avian Tree of Life. However, because individual analyses differ in variables such as model choice, taxonomic representation and calibration strategy, it has previously not been possible to test for patterns of disparity in a controlled setting.

We conducted a series of divergence dating analyses in BEAST 1.75 to test the effects of calibration depth and gene type on divergence estimates. Complete mitochondrial genomes (11 193 bp) and a 27-gene nuclear dataset (7 208 protein coding bp and 14 691 non-coding bp) were aligned for 72 taxa. We formulated 17 node age priors that meet recently proposed Best Practices for fossil calibrations. Each calibration was based on an individual fossil specimen that has been phylogenetically and stratigraphically vetted. Priors included hard minimum ages based on the minimum possible age of each calibrating specimen (inclusive of dating error) and soft maximum ages based on global preservation patterns.

All analyses supported a mid-Cretaceous origin of Aves and placed several neoavian divergences in the latest Cretaceous. However, the major early diversification phase placed in the Campanian by some studies is instead associated with recovery after the Cretaceous-Paleogene in our results. Several interesting patterns emerged. We found that when mitochondrial sequences were analysed using purine/pyrimidine (RY) rather than standard nucleotide (NT) coding strategies, mean node age of the tree as a whole decreased. Critically, this pattern was strongest for shallow nodes and reversed for the deepest nodes in the tree. This provides empirical support for simulations that suggest 'tree compression' due to model misspecification will tend to overestimate shallow node ages. It offers a plausible explanation for some of the disparity between the fossil record and molecular dates in birds. The results also support anecdotal observations that mitochondrial genes have yielded older dates than nuclear genes for birds. Vetting fossil calibrations and accounting for model biases offers the potential for substantial reconciliation between molecular and fossil interpretations of the radiation of modern birds.
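
To make the recoding concrete: RY coding collapses the four nucleotides into purines (A, G become R) and pyrimidines (C, T become Y), so transitions no longer register as state changes and only transversions remain visible. A minimal sketch follows. The function name is hypothetical, and passing gaps and ambiguity codes through unchanged is an assumption, not something stated in the abstract.

```python
# Purines (A, G) map to R; pyrimidines (C, T, and RNA U) map to Y.
RY_MAP = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})

def ry_recode(seq):
    """Recode a nucleotide sequence into purine/pyrimidine (RY) states.

    Any symbol other than A, C, G, T, U (gaps, N, ambiguity codes) is
    left as it is.
    """
    return seq.upper().translate(RY_MAP)

print(ry_recode("ACGTTGCA-N"))
```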

TALK #10 —O'Reilly, Donoghue, Dos Reis, & Yang (2014): Evaluating the Performance of Node Versus Tip Based Fossil Calibration of the Molecular Clock

p. 199

Symposium 3 (Friday, November 7, 2014, 10:30 AM)

EVALUATING THE PERFORMANCE OF NODE VERSUS TIP BASED FOSSIL CALIBRATION OF THE MOLECULAR CLOCK

O'REILLY, Joe, University of Bristol, Bristol, United Kingdom
DONOGHUE, Philip, University of Bristol, Bristol, United Kingdom
DOS REIS, Mario, UCL, London, United Kingdom
YANG, Ziheng, UCL, London, United Kingdom

Molecular clocks have long usurped the role of palaeontology in establishing a timescale for evolutionary history. However, the introduction of Total Evidence Dating (TED), which incorporates fossil taxa directly in divergence time estimation as dated tips, rather than indirectly as dated nodes constraining the age of ancestors of living lineages, promises to deliver greater accuracy and precision. Unfortunately, such analyses performed to date have failed to transfer the lessons learned from dating fossils for node calibration of molecular clocks. In most instances, individual fossils cannot be dated with any greater precision than can node-based calibrations. In accommodating these errors, it is not clear whether TED affords greater accuracy or precision than node-based calibration. We attempted to evaluate the performance of these competing approaches through analysis of both empirical and simulated data. Empirical work has centred on testing the convention that TED allows for higher precision estimates of divergence time. This has been achieved by re-analysing a well-known TED dataset for Hymenoptera, with the inclusion of a more realistic measure of chronological uncertainty in the construction of fossil tip calibrations. Our results demonstrate that TED may not necessarily provide more precise age estimates than node dating when established calibration design methodology is utilised. Simulated data has allowed us to test what effect the inclusion of increasing quantities of fossil morphological data exerts on the precision of divergence time estimates. Our approach builds on the infinite-sites theory for molecular dating which predicts that uncertainty in divergence time estimation can only be reduced through the inclusion of more precise time priors.

Our results suggest that TED is not a panacea for the perceived inaccuracy and imprecision of the molecular clock. Indeed, TED and traditional node-based calibration are not mutually incompatible and we suggest that a combination of node calibrated and TED approaches to divergence time estimation is the best approach to improving the accuracy and precision of evolutionary timescales.

TALK #11 —Friedman, Dornburg, & Near (2014): Morphological Clocks Close the Gap Between Ages of Teleost Fishes Estimated From Molecular Clocks and the Fossil Record

p. 133

Symposium 3 (Friday, November 7, 2014, 10:45 AM)

MORPHOLOGICAL CLOCKS CLOSE THE GAP BETWEEN AGES OF TELEOST FISHES ESTIMATED FROM MOLECULAR CLOCKS AND THE FOSSIL RECORD

FRIEDMAN, Matt, University of Oxford, Oxford, United Kingdom
DORNBURG, Alex, Yale University, New Haven, CT, United States of America
NEAR, Thomas, Yale University, New Haven, CT, United States of America

Many iconic vertebrate clades exhibit substantial disagreement between times of evolutionary origin estimated from the fossil record and those inferred using relaxed molecular clocks. The sudden appearance in the fossil record of clades with a rich diversity of lineages, such as angiosperms and teleost fishes, is a well-documented pattern. The earliest fossils of crown teleosts date to the Late Jurassic and include three of the four earliest diverging lineages: Elopomorpha, Otocephala, and Euteleostei. Literal readings of this paleontological pattern have led to conclusions that the modern teleost radiation dates to approximately 150 Ma and was characterized by an explosive diversification of fully differentiated lineages early in its history. However, this same pattern has also been interpreted as a sampling artifact driven by rich marine Lagerstaetten of Late Jurassic age (i.e., Cerin and the lithographic limestones of southern Germany) that yield abundant articulated fishes. Fossil age-calibrated relaxed molecular clock analyses suggest Paleozoic roots for modern teleost biodiversity, consistently estimating a late Carboniferous-Permian age (ca. 320-280 Ma) for the teleost crown node. We investigated the use of new relaxed morphological clocks to determine if datasets of discretely coded phenotypic characters that include a number of fossil taxa would result in age estimates for crown teleosts that were similar to either the earliest appearance in the fossil record or the ages derived from molecular clock analyses. Our analyses were performed on a morphological dataset targeting the phylogenetic relationships of stem and crown lineage teleosts that included 194 characters scored for 51 taxa, of which only 14 are extant. The Lewis Mkv model was employed in BEAST v. 1.8 and the rock ages of fossil taxa were used for non-contemporaneous sampling or tip dating. 
The mean posterior age estimate from the relaxed morphological clock analyses for the crown teleost lineage is 279.8 Ma with a 95% highest posterior density (244.1, 314.2 Ma) that overlaps with recent relaxed molecular clock estimates for this lineage. This morphological timescale for teleost evolution strongly contradicts classical paleontological models that posit rapid diversification in the Late Jurassic, and implies an extensive unsampled early history of this successful vertebrate radiation.

TALK #12 —Turner, Pritchard, & Matzke (2014): 'Tip-Dating' When All You Have Are Fossils: Comparing Traditional and Bayesian Approaches To Fossil Divergence Times

pp. 242-243

Symposium 3 (Friday, November 7, 2014, 11:00 AM)

'TIP-DATING' WHEN ALL YOU HAVE ARE FOSSILS: COMPARING TRADITIONAL AND BAYESIAN APPROACHES TO FOSSIL DIVERGENCE TIMES

TURNER, Alan, Stony Brook University, Stony Brook, NY, United States of America, 11794-8081
PRITCHARD, Adam, Stony Brook University, Stony Brook, NY, United States of America
MATZKE, Nicholas J., University of Tennessee, Knoxville, Knoxville, TN, United States of America

Divergence timing is critical in paleontological and neontological studies. Chronostratigraphically constrained fossils are the only direct evidence of absolute timing of species divergence. Strict temporal calibration of fossil-only phylogenies provides minimum divergence estimates, with various methods proposed to estimate divergences beyond these minimum values. We explore the utility of simultaneous estimation of tree topology and divergence times using BEAST tip-dating on datasets consisting only of fossils, a technique that has become available by combining relaxed morphological clocks and birth-death tree priors that include serial sampling (BDSS) at a constant rate through time. We compare BEAST results to those from the traditional maximum parsimony (MP) and undated Bayesian inference (BI) methods. Three overlapping datasets were used that span 250 million years of archosauromorph evolution leading to crocodylians. The first dataset focuses on early Sauria (~40 taxa, 240 characters), the second on early Archosauria (~75 taxa, 400 characters) and the third on Crocodyliformes (~100 taxa, 340 characters). For each dataset, three time-calibrated trees (timetrees) were calculated: a 'null' timetree with node ages based on earliest occurrences in the fossil record; a 'smoothed' timetree using a range of time added to the root that is then averaged over zero-length internodes; and a BEAST timetree. As expected, both the smoothed timetrees and the BEAST timetrees provide node-age estimates older than the minimum ages of the null timetrees. Comparisons within datasets show that the smoothed and BEAST timetrees provide remarkably similar estimates. Only near the root node do BEAST estimates fall outside the smoothed timetree range. The BEAST model is not able to overcome limited sampling to correctly estimate divergences considerably older than sampled fossil occurrence dates.
Conversely, the smoothed timetrees consistently provide node-ages far older than the strict dates or BEAST estimates for morphologically conservative sister-taxa when they sit on long ghost lineages. In this latter case, the Bayesian model appears to be correctly moderating the node-age estimate based on the limited morphological divergence. Topologies are generally similar across analyses, but BEAST trees for crocodyliforms differ when clades are deeply nested but contain very old taxa. It appears that the constant-rate sampling assumption of the BDSS tree prior influences topology inference by disfavoring long, unsampled branches.
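
One common way to build a minimum-age ('null') timetree of the kind described above is to give each node the age of the oldest first appearance among its descendant tips. The sketch below shows that rule on a toy tree. The taxon names, the ages, and the function name are hypothetical, and this is not claimed to be the authors' exact procedure.

```python
def null_node_ages(tree, first_appearance):
    """Return {clade: minimum age in Ma} for every internal node of `tree`.

    `tree` is a nested tuple of tip names; `first_appearance` maps each tip
    to its oldest known occurrence (Ma). A node must be at least as old as
    the oldest fossil among its descendants.
    """
    ages = {}

    def visit(node):
        if isinstance(node, str):      # a tip contributes its own first appearance
            return first_appearance[node]
        age = max(visit(child) for child in node)
        ages[node] = age
        return age

    visit(tree)
    return ages

# Hypothetical taxa and first-appearance dates, for illustration only.
tree = (("A", "B"), ("C", ("D", "E")))
fad = {"A": 230.0, "B": 245.0, "C": 210.0, "D": 200.0, "E": 240.0}

node_ages = null_node_ages(tree, fad)
for clade, age in node_ages.items():
    print(clade, age)
```

In this toy case the clade (C, (D, E)) and its daughter clade (D, E) receive the same age, so the branch between them has zero length. Zero-length internodes of this kind are what the 'smoothed' approach redistributes time across.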

TALK #13 —Wright, Lloyd, & Matzke (2014): Fossils-Only Tip-Dating of Deinonychosaurian Theropods: a Comparison of Methods and Models

p. 258

Symposium 3 (Friday, November 7, 2014, 11:15 AM)

FOSSILS-ONLY TIP-DATING OF DEINONYCHOSAURIAN THEROPODS: A COMPARISON OF METHODS AND MODELS

WRIGHT, April, University of Texas at Austin, Austin, TX, United States of America, 78702
LLOYD, Graeme, University of Oxford, Oxford, United Kingdom
MATZKE, Nicholas, National Institute for Mathematical and Biological Synthesis, Knoxville, TN, United States of America

Fossil calibrations are commonly used to attach dates to occurrences on molecular phylogenetic trees. In recent years, Bayesian methods have emerged for treating the fossils as terminal branches in divergence dating analyses, allowing researchers to use likelihood-based methods to co-estimate the phylogenetic tree and divergence dates from combined molecular and morphological data. Because Bayesian methods have carried the assumption of a molecular clock, traditionally molecular data have been included in any studies, even those with abundant fossil data. Likelihood models for estimating phylogenetic trees from fossil data are also compatible with divergence dating analyses and, in this study, we explore the efficacy of these methods in all-fossil analyses.

Here, we perform a novel analysis of divergence dates in the deinonychosaurian theropod dinosaurs and avian birds, including Archaeopteryx. In this analysis, we reexamined a published data set of 89 taxa and 374 morphological characters. Using relaxed-clock models implemented in BEAST and MrBayes, we co-estimated phylogenetic trees and divergence dates from this dataset. In both cases, Archaeopteryx is supported as a basal bird. BEAST returns a far more well-resolved topology. However, MrBayes returns a tree that aligns better with suggested dates from the fossil record, placing the Archaeopteryx divergence from the rest of the avian taxa at about 167.5 Ma [152-183 Ma 95% Highest Posterior Density interval]. Divergences obtained in this analysis are largely younger than dates suggested in recent analyses of molecular data on the origin of the Aves clade.

TALK #14 —Brochu (2014): Gharial Biogeography, Conflicting Signals, and Phylogenetic Entmoots

p. 98

Symposium 3 (Friday, November 7, 2014, 11:30 AM)

GHARIAL BIOGEOGRAPHY, CONFLICTING SIGNALS, AND PHYLOGENETIC ENTMOOTS

BROCHU, Christopher, Univ of Iowa, Iowa City, IA, United States of America, 52242-1319

Although morphological and molecular data sets are strongly congruent over most aspects of crocodylian phylogeny, they conflict over the relationships and divergence timing of the living gharials Gavialis gangeticus and Tomistoma schlegelii, both of which are currently found only in fresh water in Asia. These support very different historical biogeographic scenarios; either the two gharials share an Asian freshwater-dwelling ancestor during the Cenozoic, or they last shared a common ancestor in the Late Cretaceous and independently became limited to nonmarine settings from North Atlantic shallow marine ancestors. Combined-data and molecular scaffold analyses including fossils both support the close relationship between Gavialis and Tomistoma preferred by molecular data, but they also put the Late Cretaceous through Paleocene 'thoracosaurs' on the Gavialis line, as supported by morphological data, suggesting that the common gharial ancestor was a coastal Laurasian crocodylian. But whether these reflect actual relationships or a topological compromise between strongly conflicting signals is an open question — they support a clade recovered in molecular analyses, but with a divergence time 20 or more million years older than the dates supported by the same data. Analyses with fossils that constrain Gavialis and Tomistoma to a late Paleogene split may more precisely reflect molecular evidence, but they require arbitrary decisions about clade membership not independently supported by either data set. They do, however, continue to support a common gharial ancestor that was at least capable of crossing substantial marine barriers, which is consistent with the presence of oral and glandular features in both gharials that allow tolerance of salt water. 
Studies using the results of phylogenetic analyses often use one tree, but in cases like this, it might be more advisable to consider trees supported by different data sets and combined-data trees independently, and to regard different scenarios supported by these trees as equally viable, while further information is collected to resolve remaining conflicts intrinsic to the data.

TALK #15 —Gorscak & O'Connor (2014): Re-Evaluation of Cretaceous Paleobiogeographical Patterns Using Morphological Clock and Model-Based Approaches: a Case Study Utilizing Titanosaurian Sauropods With Evidence for a More Centralized Role for Continental Africa

p. 140

Symposium 3 (Friday, November 7, 2014, 11:45 AM)

RE-EVALUATION OF CRETACEOUS PALEOBIOGEOGRAPHICAL PATTERNS USING MORPHOLOGICAL CLOCK AND MODEL-BASED APPROACHES: A CASE STUDY UTILIZING TITANOSAURIAN SAUROPODS WITH EVIDENCE FOR A MORE CENTRALIZED ROLE FOR CONTINENTAL AFRICA

GORSCAK, Eric, Ohio Univ, Athens, OH, United States of America, 45701
O'CONNOR, Patrick, Ohio Univ, Athens, OH, United States of America

Titanosaurian sauropod dinosaurs were highly successful during the Cretaceous. Yet, certain aspects of their evolutionary history remain obscure, including: (1) their phylogenetic relationships; (2) when and where they originated; and (3) how continental break-up influenced their global distribution patterns. We examined these issues using recent advances in phylogenetic and paleobiogeographic analyses that utilize a morphological clock and tip-dating of non-contemporaneous taxa. Using a modified morphological data set and explicit information on taxon ages, we tested varying phylogenetic models (parameters include character and clock rates) within a Bayesian framework to estimate the best-fit tree. Uncorrelated clock models fit better than strict clocks and, interestingly, equal-rate character models were slightly favored over variable-rate character models. Overall, model topologies are congruent with one another with slight variations in taxon placement and estimated branch lengths and nodal dates. The best-fit phylogeny estimated the mean divergence date of Titanosauria at 136.26 Ma (95% highest posterior density: 152.34-124.04 Ma), supporting a likely Early Cretaceous origination. Using the R package BioGeoBEARS, we employed two paleobiogeographical models over the best-fit phylogeny: (1) a model with range expansion and contraction parameters and (2) the same model but with an additional dispersal parameter. Both models favor an African origin for Titanosauria and Lithostrotia, and an Aptian-Albian South American origin for the younger 'aeolosaurid' and 'saltasaurid' clades.

Furthermore, several Late Cretaceous immigrant taxa diverged earlier from South American clades during the middle Cretaceous: Alamosaurus (North America), Isisaurus (India), Rapetosaurus (Madagascar), an Asian clade, and a European clade. Consistent with our models, recent discoveries of several caudal vertebral elements that express aeolosaurid affinities from the middle Cretaceous Galula Formation of eastern Africa support a widespread and middle Cretaceous divergence of aeolosaurid-related titanosaurians. Though too fragmentary to formally include in the present analyses, the material may potentially link both the Late Cretaceous European and eastern Gondwanan aeolosaurid-relatives with those from South America. These analyses suggest a more centralized paleobiogeographic role for Cretaceous continental Africa despite the current under-sampled and poorly-documented state of much of this critical landmass.

TALK #16 —Lloyd, Bapst, Davis, & Friedman (2014): A Probabilistically Time-Scaled 1000-Taxon Phylogenetic Hypothesis for Mesozoic Dinosaurs and the Origins of Flight and Crown-Birds

pp. 169-170

Symposium 3 (Friday, November 7, 2014, 12:00 PM)

A PROBABILISTICALLY TIME-SCALED 1000-TAXON PHYLOGENETIC HYPOTHESIS FOR MESOZOIC DINOSAURS AND THE ORIGINS OF FLIGHT AND CROWN-BIRDS

LLOYD, Graeme, University of Oxford, Oxford, United Kingdom
BAPST, David, South Dakota School of Mines and Technology and University of California Davis, Rapid City, SD, United States of America
DAVIS, Katie, University of Bath, Bath, United Kingdom
FRIEDMAN, Matt, University of Oxford, Oxford, United Kingdom

Fossil-calibrated molecular clocks are often used to broadly bracket the timing of many key branching events in the history of life, such as the origin of avian flight, in contrast to literal interpretations based on the timing of fossil occurrences. However, two new approaches allow for probabilistic dating of phylogenetic nodes based on appearance times in the fossil record, without reference to a molecular or morphological clock. The first of these is a modification of a simple Bayesian approach that utilizes the sequence of sister group ages leading to each node. The second is a stochastic method that uses prior estimates of sampling, origination and extinction rates in a branching model before producing a set of time-scaled trees. Here we apply both approaches to attempt to answer three questions. 1) What is the probability that dinosaurs emerged prior to the end-Permian mass extinction? 2) When did avian flight first evolve? 3) When did crown birds originate?

Our phylogenetic hypothesis contains over 1000 unique taxa and is a novel formal supertree based on over 1500 data sets. We implemented several methodological improvements over previous approaches, including: 1) increased automation, reducing both labor and human error, 2) increased information content of input trees (all equally optimal topologies are retained), 3) inclusion of taxonomy as an additional input tree to increase coverage, 4) pruning of fixed outgroup taxa from input trees, 5) numerical determination of a cut-off point that maximizes coverage while minimizing redundancy, 6) automated removal of superseded data sets and shared weighting of sets of equally dependent input trees, 7) up-weighting of more recent studies over older studies, and 8) implementation of safe taxonomic reduction.

Parallelized tree searches in TNT produced 11 087 equally optimal topologies and the two dating approaches were applied (results given in respective order of their introduction above). In both cases a pre-Mesozoic origin for dinosaurs cannot be rejected at an alpha of 0.05 (p = 0.098 or 0.331). Other estimates agree closely, with avian flight estimated at 152.39-172.69 Ma or 156.6-167.9 Ma, and crown-birds estimated at 70.12-107.52 Ma or 70.7-97.8 Ma (all ranges are 95% CIs). These last sets of dates can be compared to molecular estimates, which are broadly older. We propose that elevated rates of molecular evolution at the base of the extant dinosaur radiation may be necessary to reconcile these differences between molecular and model-based paleontological estimates of branching times.

POSTER — Holder & Heath (2014): The Effects of Using Filtered Data for Branch Length and Divergence Time Estimation

Poster #29

Holder, M., Heath, T. A. THE EFFECTS OF USING FILTERED DATA FOR BRANCH LENGTH AND DIVERGENCE TIME ESTIMATION

Poster Session III (Friday, November 7, 2014, 4:15 - 6:15 PM)

THE EFFECTS OF USING FILTERED DATA FOR BRANCH LENGTH AND DIVERGENCE TIME ESTIMATION

HOLDER, Mark, Univ. of Kansas, Lawrence, KS, United States of America, 66045
HEATH, Tracy A., University of California, Berkeley, Berkeley, CA, United States of America

Typical morphological character matrices can be considered to be a form of filtered data because the process of scoring characters entails looking for variable traits (or only parsimony-informative characters). The importance of correcting this filtering when one estimates a phylogeny from these data has been understood for more than a decade, but applying models of ascertainment bias to paleontological data is more complex because of the relatively large number of missing data cells in a character matrix. The calculations must account for the fact that the data filtering only applies to the subset of species that are scored for a particular character. Theoretical results predict that conducting likelihood-based inference on filtered data can be reliable if corrections for ascertainment bias are used. However, the effects of data filtering on branch length estimation and divergence time estimation are poorly understood. We will present results from a computer simulation study and from the analysis of three paleontological data sets (from ursids, dinosaurs, and trilobites) to characterize the biases and loss of power induced by data filtering. The authors thank the NSF and HITS for funding.
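
The correction described above can be sketched for the simplest case: a binary, equal-rates Mk character, with the likelihood conditioned on the character being variable among the taxa actually scored for it. The code below is an illustrative sketch under those assumptions, not the authors' implementation. It conditions on variability only, not on parsimony-informativeness, and the function names, tree encoding, and branch lengths are all hypothetical.

```python
import math

def p_change(t):
    """Probability that a binary Mk character differs across a branch of length t."""
    return 0.5 - 0.5 * math.exp(-2.0 * t)

def partials(node, states):
    """Conditional likelihoods (state 0, state 1) at `node` by Felsenstein pruning.

    A tip is a name; an internal node is a tuple of (child, branch_length)
    pairs. Tips scored '?' or absent from `states` are treated as missing.
    """
    if isinstance(node, str):
        s = states.get(node, "?")
        if s == "?":
            return (1.0, 1.0)
        return (1.0, 0.0) if s == "0" else (0.0, 1.0)
    like = [1.0, 1.0]
    for child, t in node:
        c = partials(child, states)
        d = p_change(t)
        like[0] *= (1 - d) * c[0] + d * c[1]
        like[1] *= d * c[0] + (1 - d) * c[1]
    return tuple(like)

def site_likelihood(tree, states):
    """Uncorrected likelihood with equal root frequencies."""
    l0, l1 = partials(tree, states)
    return 0.5 * l0 + 0.5 * l1

def corrected_likelihood(tree, states):
    """Likelihood conditional on the character varying among the scored taxa."""
    scored = [tip for tip, s in states.items() if s != "?"]
    # Probability that the scored taxa all share state 0, or all share state 1.
    p_const = sum(site_likelihood(tree, {tip: s for tip in scored}) for s in "01")
    return site_likelihood(tree, states) / (1.0 - p_const)

# Hypothetical three-taxon tree: ((A:0.1, B:0.2):0.05, C:0.3)
ab = (("A", 0.1), ("B", 0.2))
tree = ((ab, 0.05), ("C", 0.3))

full = {"A": "0", "B": "1", "C": "1"}
partial = {"A": "0", "B": "1", "C": "?"}
print(site_likelihood(tree, full), corrected_likelihood(tree, full))
print(site_likelihood(tree, partial), corrected_likelihood(tree, partial))
```

The denominator is computed from the scored taxa only, which is the point made in the abstract about missing data. In the second example, where only two taxa are scored, the corrected likelihood is the same whatever the branch lengths, so under this correction such a character carries no information about them.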

Other relevant talks and posters (if people suggest them)

Please send suggestions to matzke@nimbios.org.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License