Table of Contents

Validation tests of BioGeoBEARS — Introduction
^{(link to this section)}
Here I post some validation examples for BioGeoBEARS.
I have left out comparisons to Lagrange, which are documented in Matzke (2014, Systematic Biology), and which test:
 the downpass loglikelihood calculations under the DEC model,
 the parameter inferences for d and e, and
 the maximum likelihood (ML) optimizer.
Although under the DEC model, the ML loglikelihood and parameter estimates are identical, the ancestral states estimates are not directly comparable between Lagrange and BioGeoBEARS, because:
 Lagrange estimated node state/range probabilities under "local optimization", i.e. for every possible ancestral range at every node, Lagrange runs a new ML optimization of the parameters while fixing that range at that node.
 BioGeoBEARS estimates ancestral states/ranges under the globally optimum ML model.
Local and global ancestral state/range probabilities will typically be similar but not identical.
In this page I focus on:
 Comparison of BioGeoBEARS ancestral states to Liam Revell's phytools, for a case where a BioGeoBEARS model can be converted into the (simple) model used by phytools
 Comparison of BioGeoBEARS Biogeographical Stochastic Mapping (BSM) to Liam Revell's phytools stochastic mapping, for a case where a BioGeoBEARS model can be converted into the (simple) model used by phytools
 Comparison of BioGeoBEARS loglikelihoods to ClaSSE (in diversitree), showing how the ClaSSE model can be simplified into a model identical with DEC or DEC+J, etc.
 Comparison of BioGeoBEARS ancestral state probabilities in timestratified analyses to BioGeoBEARS ancestral state probabilities in nontimestratified analyses. The latter are much simpler to implement. By setting up a timestratified analysis, but set up the constraints matrices so that there is no actual change in geography between time slices, we can test that timestratified analyses retrieve the same loglikelihoods, ML inferences, ancestral state probabilities, and stochastic mapping results
Note: I use the terms "state", "range", and "geographic range" interchangeably throughout, because in the framework of discrete historical biogeography, each possible geographic range is a state in the state space. Transitions between states are controlled by the anagenetic and cladogenetic transition matrices, which themselves are determined by the model parameters of the BioGeoBEARS supermodel.
Abbreviations:
 ML = maximum likelihood
 BSM = Biogeographical Stochastic Map(ping)(s)
BioGeoBEARS DEC and DEC+J — ancestral state/range probabilities versus BSMs
^{(link to this section)}
NOTE: For an introduction to Biogeographical Stochastic Mapping (BSM), please see the section introducing Biogeographical Stochastic Mapping, and the Biogeographical Stochastic Mapping Example Script.
The input/output/script files for this section can be found in: Psychotria_M0_unconstrained_BSMs.zip
Here, we check that the ancestral state probabilities estimated under BioGeoBEARS DEC and DEC+J (nonstratified, unconstrained) are approximated by averaging many Biogeographical Stochastic Maps (BSMs).
DEC
^{(link to this section)}
Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Unconstrained DEC model (nonstratified) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
DEC+J
^{(link to this section)}
Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Unconstrained DEC+J model (nonstratified) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
The BSMs line up well with the analytic probabilities for both DEC and DEC+J. This is also the case for timestratified analyses and other models, although I am not posting these for the moment due to the lateness of the hour.
BioGeoBEARS stratified analyses — ancestral state/range probabilities versus BSMs
^{(link to this section)}
The input/output/script files for this section can be found in: Psychotria_M0_stratified_BSMs.zip (6 subdirectories, one for each model)
Here, for the 6 standard [[BioGeoBEARS]] models, analyses that are timestratified (but with constant geography) are run, to check that the ancestral state probabilities match the mean of 200 Biogeographic Stochastic Maps (BSMs). This is a doublecheck on the ancestral state probabilities and the BSM algorithms in a timestratified analysis.
Timestratified DEC and DEC+J
^{(link to this section)}
Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Timestratified DEC model (but unconstrained, constant geography) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package).  Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Timestratified DEC+J model (but unconstrained, constant geography) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
Timestratified DIVALIKE and DIVALIKE+J
^{(link to this section)}
Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Timestratified DIVALIKE model (but unconstrained, constant geography) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package).  Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Timestratified DIVALIKE+J model (but unconstrained, constant geography) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
Timestratified BAYAREALIKE and BAYAREALIKE+J
^{(link to this section)}
Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Timestratified BAYAREALIKE model (but unconstrained, constant geography) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package).  Figure Caption: Ancestral state/range probabilities versus mean of stochastically mapped states. Timestratified BAYAREALIKE+J model (but unconstrained, constant geography) on the Psychotria dataset. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. yaxis: mean of 200 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
BioGeoBEARS stratified vs. nonstratified analyses (ancestral state/range probabilities)
^{(link to this section)}
The input/output/script files for this section can be found in: Psychotria_M0_unconstrained.zip and Psychotria_M0_stratified.zip
Programming a timestratified analysis is much more complicated than a nonstratified analysis: imagine taking a timetree, slicing it into pieces wherever it crosses a stratum border, and then keeping track of all of the pieces as probabilities are passed around during the downpass and uppass. Then imagine a changing state space, changing constraints, etc. This is the kind of thing where a real objectoriented programming language with a real phylogeny structure based on node objects would have a major advantage over R (In R, I have to construct a big table and continually query it to keep track of all of the tree pieces and what time slice the calculations are taking place in.) It gets even more fun if there are fossil branches with tips that do not reach up to the present (0 million years ago).
To check that this has all been done correctly, we can compare nonstratified BioGeoBEARS analysis results to timestratified analyses where each timeperiod has the same constraints as the recent. Here, I used the standard Hawaii island ages to do the timestratification, but in each time period the manual dispersal matrix was all 1s.
ML, LnL, parameters
^{(link to this section)}
Here are the parameter inferences for the 6 standard models, in the timestratified analysis:
LnL numparams d e j
DEC 34.54313 2 3.501898e02 2.840040e02 0.0000000
DEC+J 20.94759 3 1.000000e12 1.000000e12 0.1142670
DIVALIKE 33.15172 2 4.472860e02 1.000000e12 0.0000000
DIVALIKE+J 21.08621 3 1.000000e12 1.000000e12 0.1157244
BAYAREALIKE 40.33461 2 1.863147e02 3.060137e01 0.0000000
BAYAREALIKE+J 21.55265 3 1.000000e07 1.000000e07 0.1081164
Here are the inferences in the nonstratified analysis (specifically, the analysis in the standard example script):
LnL numparams d e j
DEC 34.54196 2 3.504546e02 2.835632e02 0.0000000
DEC+J 20.94759 3 1.000000e12 1.000000e12 0.1142757
DIVALIKE 33.14968 2 4.474416e02 1.000000e12 0.0000000
DIVALIKE+J 21.08621 3 1.000000e12 1.000000e12 0.1156998
BAYAREALIKE 40.33704 2 1.738085e02 3.040188e01 0.0000000
BAYAREALIKE+J 21.55265 3 1.000000e07 1.000000e07 0.1081169
Pretty good. This just tests the downpass calculations and the ML optimizer on stratified vs. nonstratified analyses. Let's also compare the ancestral state probabilities under stratified and nonstratified analyses:
DEC
^{(link to this section)}
Left: Nonstratified DEC M0 unconstrained; Right: Stratified DEC M0 unconstrained
DEC+J
^{(link to this section)}
Left: Nonstratified DEC+J M0 unconstrained; Right: Stratified DEC+J M0 unconstrained
DIVALIKE
^{(link to this section)}
Left: Nonstratified DIVALIKE M0 unconstrained; Right: Stratified DIVALIKE M0 unconstrained
DIVALIKE+J
^{(link to this section)}
Left: Nonstratified DIVALIKE+J M0 unconstrained; Right: Stratified DIVALIKE+J M0 unconstrained
BAYAREALIKE
^{(link to this section)}
Left: Nonstratified BAYAREALIKE M0 unconstrained; Right: Stratified BAYAREALIKE M0 unconstrained
BAYAREALIKE+J
^{(link to this section)}
Left: Nonstratified BAYAREALIKE+J M0 unconstrained; Right: Stratified BAYAREALIKE+J M0 unconstrained
Phytools rerootingMethod vs. BioGeoBEARS BAYAREALIKE+ade model (nonstratified BSMs and ancestral state/range probabilities)
^{(link to this section)}
The input/output/script files for this section can be found in: Psychotria_M0_unconstrained_BAYAREALIKE+a.zip
To facilitate comparison with the ancestral states estimation and stochastic mapping algorithms in phytools, SIMMAP, etc., we need to create a "standard unordered character model with equal rates" in the BioGeoBEARS model object. This is simply done with the following code. The model ends up being the BAYAREALIKE model, with parameters d and e fixed to 0, and a (the "anagenetic rangeswitching" parameter) set to be freely estimated.
Specification of BAYAREALIKE+ade model
^{(link to this section)}
Specification of the BAYAREALIKE+ade model is done as follows:
# Set up BAYAREALIKE model
# No subset sympatry
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["s","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["s","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["s","est"] = 0.0
# No vicariance
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["v","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["v","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["v","est"] = 0.0
# No jump dispersal/founderevent speciation
# BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["j","type"] = "free"
# BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["j","init"] = 0.01
# BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["j","est"] = 0.01
# Adjust linkage between parameters
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["ysv","type"] = "1j"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["ys","type"] = "ysv*1/1"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["y","type"] = "1j"
# Only sympatric/rangecopying (y) events allowed, and with
# exact copying (both descendants always the same size as the ancestor)
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["mx01y","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["mx01y","init"] = 0.9999
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["mx01y","est"] = 0.9999
# Turn off d and e; turn on a
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["d","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["d","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["d","est"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["e","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["e","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["e","est"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["a","type"] = "free"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["a","init"] = 0.03
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["a","est"] = 0.03
The model is thus really "BAYAREALIKE+ade", i.e. "the BAYAREALIKE model, plus a, minus d, minus e". Alternatively we could call it the "instantaneous area switching model", the "anagenetic island model", a Markovk (Mk model), the BEAST discrete phylogeography model with equal rates, the JukesCantor DNA model, or an unordered character model — these are all equivalent. This is, by the way, the model that is being assumed when biogeography is treated as a standard unordered character. If the rates of different rangeswitches were different free parameters, this would essentially be a DNA GTR model, as in Sanmartin's islandswitching paper.
Comparison of nonstratified BGB state probabilities and nonstratified BGB BSMs
^{(link to this section)}
Figure Caption: Comparison of BioGeoBEARS analytic ancestral state/range probability estimates (xaxis), and the probabilities by taking the mean of 200 stochastic maps (yaxis). BioGeoBEARS implementation of the BAYAREALIKE+ade model on the Psychotria dataset, nonstratified. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
Comparison of phytools state probabilities and nonstratified BGB BSMs
^{(link to this section)}
We can also compare the average of many stochastic maps to the analytic ancestral state probability estimates as estimated by the phytools function rerootingMethod.
Figure Caption: Comparison of phytools rerootingMethod analytic ancestral state/range probability estimates (xaxis), and the probabilities by taking the mean of 200 stochastic maps (yaxis). BioGeoBEARS implementation of the BAYAREALIKE+ade model on the Psychotria dataset, nonstratified. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
This graphic is the same as the previous one, because the same set of stochastic maps is being used for the yaxis, and it turns out that the phytools rerootingMethod ancestral state probabilities are nearly identical to the BioGeoBEARS ancestral state probabilities.
Comparison of phytools state probabilities and BGB nonstratified state probabilities
^{(link to this section)}
We can also compare the loglikelihoods and parameter inferences between phytools rerootingMethod and BioGeoBEARS ancestral states:
#######################################################
# phytools ancestral states (ER model)
#######################################################
library(ape)
library(phytools)
library(BioGeoBEARS)
trfn = "tree.newick"
# Look at the raw Newick file:
moref(trfn)
# Look at your phylogeny:
tr = read.tree(trfn)
tr
tipnodenums = 1:length(tr$tip.label)
rootnodenum = length(tr$tip.label) + 1
intnodenums = (length(tr$tip.label)+1):(length(tr$tip.label) + tr$Nnode)
intnodenums
geogfn = "geog_data.txt"
# Look at the raw geography text file:
moref(geogfn)
# Look at your geographic range data:
tipranges = getranges_from_LagrangePHYLIP(lgdata_fn=geogfn)
tipranges
areanames = names(tipranges@df)
areanums = 1:length(areanames)
tipvals = rep(NA, nrow(tipranges@df))
for (i in 1:nrow(tipranges@df))
{
tipvals[i] = areanums[unlist(tipranges@df[i,])==1]
}
names(tipvals) = row.names(tipranges@df)
tipvals
phytools_res = rerootingMethod(tree=tr, x=tipvals, model="ER")
phytools_res
phytools_ML = phytools_res$marginal.anc
#######################################################
# Store in a BioGeoBEARStype format
#######################################################
resBAYAREALIKEcopy = resBAYAREALIKE
resBAYAREALIKEcopy$ML_marginal_prob_each_state_at_branch_top_AT_node[intnodenums, 2:5] = phytools_ML
#######################################################
# START VALIDATION CHECK
#######################################################
# Loglikelihood DOES match:
phytools_res$loglik
# 22.97566
resBAYAREALIKE$total_loglikelihood
# 22.97566
# The rate parameter DOES match
phytools_res$Q[1,2]
# 0.07572288
resBAYAREALIKE$outputs@params_table["a","est"]
# 0.07575708
#######################################################
# END VALIDATION CHECK
#######################################################
We can plot the analytic estimates from phytools and the BioGeoBEARS nonstratified analysis:
Figure Caption: Comparison of analytic ancestral state/range probabilities under phytools rerootingMethod (xaxis) or a BioGeoBEARS nonstratified analysis with the BAYAREALIKE+ade model (yaxis). Psychotria dataset, nonstratified. 
The correlation between the two algorithms is high (R^{2}=1), basically identical.
Finally, as expected, if we do more stochastic maps, the mean state probabilities will (slowly) approach the analytic estimates. Here, I do this for the DEC model.
Figure Caption: Psychotria dataset, nonstratified, DEC model. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. Yaxis: mean of 10,000 stochastic maps for each state/range at each node. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
Phytools rerootingMethod vs. BioGeoBEARS BAYAREALIKE+ade model (timestratified BSMs and ancestral state/range probabilities)
^{(link to this section)}
The input/output/script files for this section can be found in: Psychotria_M0_stratified_BAYAREALIKE+a.zip
To facilitate comparison with the ancestral states estimation and stochastic mapping algorithms in phytools, SIMMAP, etc., we need to create a "standard unordered character model with equal rates" in the BioGeoBEARS model object. This is simply done with the following code. The model ends up being the BAYAREALIKE model, with parameters d and e fixed to 0, and a (the "anagenetic rangeswitching" parameter) set to be freely estimated.
Specification of BAYAREALIKE+ade model
^{(link to this section)}
Specification of the BAYAREALIKE+ade model is done as follows:
# Set up BAYAREALIKE model
# No subset sympatry
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["s","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["s","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["s","est"] = 0.0
# No vicariance
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["v","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["v","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["v","est"] = 0.0
# No jump dispersal/founderevent speciation
# BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["j","type"] = "free"
# BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["j","init"] = 0.01
# BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["j","est"] = 0.01
# Adjust linkage between parameters
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["ysv","type"] = "1j"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["ys","type"] = "ysv*1/1"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["y","type"] = "1j"
# Only sympatric/rangecopying (y) events allowed, and with
# exact copying (both descendants always the same size as the ancestor)
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["mx01y","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["mx01y","init"] = 0.9999
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["mx01y","est"] = 0.9999
# Turn off d and e; turn on a
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["d","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["d","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["d","est"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["e","type"] = "fixed"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["e","init"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["e","est"] = 0.0
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["a","type"] = "free"
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["a","init"] = 0.03
BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table["a","est"] = 0.03
The model is thus really "BAYAREALIKE+ade", i.e. "the BAYAREALIKE model, plus a, minus d, minus e". Alternatively we could call it the "instantaneous area switching model", the "anagenetic island model", a Markovk (Mk model), the BEAST discrete phylogeography model with equal rates, the JukesCantor DNA model, or an unordered character model — these are all equivalent. This is, by the way, the model that is being assumed when biogeography is treated as a standard unordered character. If the rates of different rangeswitches were different free parameters, this would essentially be a DNA GTR model, as in Sanmartin's islandswitching paper.
Comparison of stratified BGB state probabilities and stratified BGB BSMs
^{(link to this section)}
Figure Caption: Comparison of BioGeoBEARS analytic ancestral state/range probability estimates (xaxis), and the probabilities by taking the mean of 200 stochastic maps (yaxis). BioGeoBEARS implementation of the BAYAREALIKE+ade model on the Psychotria dataset, timestratified with constant geography. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
Comparison of phytools state probabilities and stratified BGB BSMs
^{(link to this section)}
We can also compare the average of many stochastic maps to the analytic ancestral state probability estimates as estimated by the phytools function rerootingMethod.
Figure Caption: Comparison of phytools rerootingMethod analytic ancestral state/range probability estimates (xaxis), and the probabilities by taking the mean of 200 stochastic maps (yaxis). BioGeoBEARS implementation of the BAYAREALIKE+ade model on the Psychotria dataset, timestratified with constant geography. Vertical lines are the 95% confidence intervals under the multinomial distribution (calculated with the multinomialCI function of the MultinomialCI package). 
This graphic is the same as the previous one, because the same set of stochastic maps is being used for the yaxis, and it turns out that the phytools rerootingMethod ancestral state probabilities are nearly identical to the BioGeoBEARS ancestral state probabilities.
Comparison of phytools state probabilities and BGB stratified state probabilities
^{(link to this section)}
We can also compare the loglikelihoods and parameter inferences between phytools rerootingMethod and BioGeoBEARS ancestral states:
#######################################################
# phytools ancestral states (ER model)
#######################################################
library(ape)
library(phytools)
library(BioGeoBEARS)
trfn = "tree.newick"
# Look at the raw Newick file:
moref(trfn)
# Look at your phylogeny:
tr = read.tree(trfn)
tr
tipnodenums = 1:length(tr$tip.label)
rootnodenum = length(tr$tip.label) + 1
intnodenums = (length(tr$tip.label)+1):(length(tr$tip.label) + tr$Nnode)
intnodenums
geogfn = "geog_data.txt"
# Look at the raw geography text file:
moref(geogfn)
# Look at your geographic range data:
tipranges = getranges_from_LagrangePHYLIP(lgdata_fn=geogfn)
tipranges
areanames = names(tipranges@df)
areanums = 1:length(areanames)
tipvals = rep(NA, nrow(tipranges@df))
for (i in 1:nrow(tipranges@df))
{
tipvals[i] = areanums[unlist(tipranges@df[i,])==1]
}
names(tipvals) = row.names(tipranges@df)
tipvals
phytools_res = rerootingMethod(tree=tr, x=tipvals, model="ER")
phytools_res
phytools_ML = phytools_res$marginal.anc
#######################################################
# Store in a BioGeoBEARStype format
#######################################################
resBAYAREALIKEcopy = resBAYAREALIKE
resBAYAREALIKEcopy$ML_marginal_prob_each_state_at_branch_top_AT_node[intnodenums, 2:5] = phytools_ML
#######################################################
# START VALIDATION CHECK
#######################################################
# Loglikelihood DOES match:
phytools_res$loglik
# 22.97551
resBAYAREALIKE$total_loglikelihood
# 22.97551
# The rate parameter DOES match
phytools_res$Q[1,2]
# 0.07572161
resBAYAREALIKE$outputs@params_table["a","est"]
# 0.07570332
#######################################################
# END VALIDATION CHECK
#######################################################
We can plot the analytic estimates from phytools and the BioGeoBEARS timestratified analysis:
Figure Caption: Comparison of analytic ancestral state/range probabilities under phytools rerootingMethod (xaxis) or a BioGeoBEARS timestratified analysis with the BAYAREALIKE+ade model (yaxis). Psychotria dataset, timestratified with constant geography. 
Clearly the correlation between the two algorithms is high (R^{2}=1). I consider it "good enough." The probabilities are not quite identical for a few state/node combinations. This might be due to the slight differences in the ML parameter estimates between the two methods, and/or rounding and exponentiation errors (in particular, the BioGeoBEARS timestratified analysis adds a bunch of extra exponentiations due to the fact that the phylogeny is sectioned into many pieces when the timeslicing is done).
Finally, as expected, if we do more stochastic maps, the mean state probabilities will (slowly) approach the analytic estimates.
Figure Caption: Psychotria dataset, timestratified with constant geography. xaxis: analytic ancestral state/range probabilities of each state/range at each internal node. Yaxis: mean of 5000 stochastic maps for each state/range at each node. 
BioGeoBEARS DEC and DEC+J as special cases of ClaSSE
^{(link to this section)}
(this was originally posted in 2015 — I showed this graphic at Evolution 2015)
The BioGeoBEARS models can be shown to be special cases of Goldberg & Igic's ClaSSE model. ClaSSE is implemented in Rich Fitzjohn's diversitree package.
Here, I made a simple phylogeny and a 4state geographic range space (2 areas, states are null, A, B, AB). I then set up a 4state model for ClaSSE, and set ClaSSE parameters to be specified by values of d, e, and/or j (this requires writing some functions). Then, for various values of these parameters, I calculated the likelihood of the data in BioGeoBEARS and ClaSSE. To make the likelihoods comparable, the (stateindependent) yule tree likelihood has to be added to the BioGeoBEARS likelihood, and the rootstate likelihoods have to be removed from the ClaSSE likelihood. If this is done, it can be seen that there is very good agreement between the likelihoods produced by BioGeoBEARS DEC or DEC+J and ClaSSE:
(Note: This is NOT a demonstration that DEC, DEC+J etc. are "equivalent" to ClaSSE or GeoSSE! DEC is a special case of GeoSSE and ClaSSE, and GeoSSE, DEC, and DEC+J are all special cases of ClaSSE. "X is a special case of Y" means that by fixing the values of certain parameters in model X, you produce model Y. In the case below, if you take ClaSSE, fix extinction to 0 (eliminating the "E" in "SSE") and allow statedependence in speciation only in the specific (equal pereventweights) way that DEC or DEC+J specify the conditional probability of cladogenesis events (strongly constraining the "SS" part of "SSE"), then you get the same likelihoods.)
Figure Caption: Loglikelihoods of a simple dataset (2 areas, 4 states/ranges) calculated under BioGeoBEARS (xaxis) and ClaSSE (yaxis). In the plot, blue "d" points indicate "DECe" models (d is positive, e=0, j=0). Black "e" points indicate traditional DEC models (d is positive, e is positive, j=0). Green "j" points represent the DEC+J model (all 3 parameters positive). A grid of possible d, e, and j parameter values was input into the BioGeoBEARS and ClaSSE likelihood calculators. These parameter value are not displayed, but the parameter combinations conferring the highest likelihood on the data, i.e. approaching the ML solution, are the points in the upper right of the plot. 
The remaining small differences are due to the numeric integration steps in the ClaSSE ODE calculations. This is strong evidence that BioGeoBEARS and ClaSSE are doing the same math under these constraints, which is encouraging as they were programmed completely independently.
Why don't we just use ClaSSE for everything? As explained in Matzke (2013, Frontiers in Biogeography), as the state space increases at numstates=2^{areas}, the number of parameters for the cladogenetic process in ClaSSE increases at something like (numstates^{3})/2. Even if many of these parameters were set to zero or equal to each other, I believe they still have to be represented in memory, and used in the numeric calculation. This probably means there are fairly strict limits to how many states can be used in a ClaSSE model, but the problem has not been thoroughly explored.
Note, 20170624: the script for this graphic is here: attached script
DEC and DEC+J can be statistically compared
^{(link to this section)}
Below is a brief response to the abstract of Ree and Sanmartín, scheduled at Evolution 2017. I have not seen the talk yet, of course, but the abstract contains sufficiently specific statements to offer a rebuttal.
I do, of course, welcome any and all criticisms of BioGeoBEARS models — I have offered a number of criticisms of these models myself! — but I do think they should be accurate. I hope there will be a more complete discussion of these issues in the literature, in the future.
I paste the abstract below, as it is impossible to give a direct link to abstracts on the meeting website:
Model selection in historical biogeography: why DEC and DEC+J likelihoods are not statistically comparable
Day: Saturday
Time: 4:15 PM  4:29 PM
Location:_ A106
Back
Talk type:
RegularTitle:
Model selection in historical biogeography: why DEC and DEC+J likelihoods are not statistically comparableAbstract:
The DEC+J model of geographic range evolution is a variant of DEC that allows "jump dispersal" or "founder event" speciation scenarios at internal phylogenetic nodes. It has become increasingly common for studies to conduct statistical model selection between DEC and DEC+J, even when jump dispersal is not integral to the hypotheses of interest. This is unfortunate because (a) DEC+J is a poor model for jump dispersal as a stochastic process, and (b) likelihoods of DEC and DEC+J are not statistically comparable. The primary purpose of modeling range evolution is to explain observed data via stochastic processes that generate biogeographic events with welldefined probabilities with respect to time. However, DEC and DEC+J do not describe distinct stochastic timedependent processes. Instead, they differ in their ability to explain the data via cladogenetic events that have no timedependent probabilities. Cladogenetic and anagenetic events of range evolution are not equivalent, and changing their relative contributions to the likelihood violates assumptions of model selection. Hypotheses of dispersaldriven speciation are much better served by models that treat cladogenetic events at phylogenetic nodes as stochastic outcomes of a lineage birthdeath process, such as ClaSSE.
Keyword 1:
BiogeographyKeyword 2:
Phylogenetic comparative methodsKeyword 3:
ModelingTalk Recorded:
YesAuthors:
Richard Ree
Field Museum of Natural History
Isabel Sanmartín
As I explain in my talk (pdf available), DEC and DEC+J can be statistically compared because both models put valid conditional probabilities (likelihoods) on the same dataset (or, technically, they produce probabilities times (number of states1), but this is a constant so it can be ignored in comparing models).
Conditional probabilities can be compared, as they represent the different fit of each model to the dataset.
I demonstrate the validity of these likelihoods in two ways:
1. Likelihoods for discrete data should add up to 1, when summed across all possible datasets (aka all possible data patterns). (If base frequencies are left out of the likelihood calculation, as is the case in Lagrange/BioGeoBEARS, it will be 1 times the number of possible ancestral states at the root.) I show that this is true for DEC and DEC+J in this script (zipfile).
2. DEC and DEC+J are special cases of the ClaSSE model. To show this, the DEC/DEC+J likelihood needs to be added to the likelihood of a Yule process on the input tree (again, this is an identical constant between the DEC and DEC+J models, so can be ignored in model comparisons) — and, the state frequencies at the root have to be set to be equal in ClaSSE, and then taken out of the likelihood (DEC and DEC+J assume equal state frequencies at the root, and, again, this is a constant that can be ignored when comparing the models).
I have demonstrated this in the attached script. (See also the ClaSSE section above, posted in 2015.) Likelihoods are not exactly identical between BioGeoBEARS and diversitree's ClaSSE — but ClaSSE is using a numerical approximation to calculate the likelihood, so this is expected. The correlation is extremely high (see graphic).
Now, there are numerous other issues that are worth discussing regarding DEC, DEC+J, and other biogeographical models. The biggest — as I have mentioned in Matzke (2014) and in many places on the BioGeoBEARS help pages — is that extinction is ignored in these models. They are effectively assuming a Yule process. This assumption is obviously wrong (Marshall 2017), but if we started ruling out methods that assumed or inferred a Yule process (many methods in phylogenetics underestimate extinction, even when they allow it), this area of science would have never gotten started.
In any case, a Yule process is a statistically valid model, even if it is a model that misses an important aspect of reality. The interesting question is this: what kind of data (or priors), and computationally feasible models, will it take to allow us to routinely infer the role of extinction in an nottoobiased way?
It would be great, of course, if we could snap our fingers and make a full ClaSSEtype model easy to use routinely for typical biogeographical problems. But the numerical calculations involved seem to get very slow for realistic problems. The best attempt I've seen at ClaSSE on a large state space is by Will Freyman and Sebastian Hoehna — some of the best programmers around — who implement a special case of a ClaSSE model to model polyploidy, in RevBayes. I believe their state space was of size 61 (51 chromosomes was the maximum observed, +10) — but this would still be a small biogeographical program (a 6area problem would have 64 possible ranges in the state space).
However, improving our model realism, and our computational abilities, is what we are doing science for! I would just emphasize that we are not well served by trying to rule certain model comparisons invalid, if their internal math is correct. All models are wrong, but we can do a lot of good science by comparing imperfect models by their relative fit to the data — the idea that any model devised by biologists is going to be the "true" model is a phantom (Burnham and Anderson 2002). The whole point of AIC and related procedures is to quantify the relative degree of fit to an unknown reality, and try to devise models that capture key features of interest with high explanatory value — i.e., find the large effects (Burnham and Anderson 2002). I think jump dispersal/founderevent speciation is likely to be one of these, for many datasets (not all datasets, of course — several cases are known where nonJ models are a better fit to a particular dataset).
Finally, as I have said many times, I view the DEC/DEC+J model comparison as the beginning of statistical model comparison in historical biogeography, not the end of it! In subsequent work, collaborators and I have explored the utility of using parameters for distancedependent dispersal (the +x model variant, Van Dam and Matzke 2016, J. Biogeography), parameterizing the influence of manualdispersal multipliers, rather than relying on subjectivelydetermined fixed multipliers (the +w model variant, Dupin, Matzke et al. 2016, J. Biogeography), traitdependent dispersal (various talks and papers in submission), fossils and fossil detection models (my Evolution 2017 talk), etc.
Updates added posttalks
^{(link to this section)}
Updates, posttalks: Nothing I saw in the talks or in posttalk discussions, so far at least, has rebutted the fundamental points about the likelihoods being valid probabilities summing to 1, or DEC and DEC+J being submodels of ClaSSE. Unless something changes on that score, it's hard to see how the claim that it is invalid to statistically compare the models can be sustained. Other kinds of discussions, e.g. whether there are better models out there, whether we should consider priors on models and how those should be justified, etc., are interesting but basically routine discussions in statistical model comparison.
Once we agree that statistical likelihood — Prob(datamodel) — serves as the basis of statistical comparison of models against a common dataset (perhaps with a prior added), the rest follows.
See also
Advice on Statistical Model Comparison In BioGeoBEARS
References
Burnham, Kenneth P.; Anderson, David R. (1998, first edition; 2002, second edition). Model Selection and Multimodel Inference: A Practical InformationTheoretic Approach. New York, Springer. doi: [10.1007/b97636 http://dx.doi.org/10.1007/b97636]
Link: http://www.springer.com/gb/book/9780387953649
 See also: Key Reference: Burnham and Anderson (2002)  The "Bible" of AIC/AICc (which has 37659 citations as of 20170626)
Dupin, Julia; Matzke, Nicholas J.; Sarkinen, Tiina; Knapp, Sandra; Olmstead, Richard; Bohs, Lynn; Smith, Stacey (2016). Bayesian estimation of the global biogeographic history of the Solanaceae. Journal of Biogeography, 44(4), 887899. http://dx.doi.org/10.1111/jbi.12898
Marshall, Charles R. (2017). Five palaeobiological laws needed to understand the evolution of the living biota. Nature Ecology & Evolution, 1, 0165. doi: 10.1038/s415590170165
Matzke, Nicholas J. (2014). "Model Selection in Historical Biogeography Reveals that Founderevent Speciation is a Crucial Process in Island Clades." Systematic Biology, 63(6), 951–970. doi: 10.1093/sysbio/syu056
Van Dam, Matthew; Matzke, Nicholas J. (2016). Evaluating the influence of connectivity and distance on biogeographic patterns in the southwestern deserts of North America. Journal of Biogeography. 43(8):1514–1532. http://dx.doi.org/10.1111/jbi.12727