On this page, I will accumulate mistakes that I spot in the literature, especially those pertaining to likelihood, the Likelihood Ratio Test (LRT), AIC, and optimization problems.
In most cases, these mistakes are not fatal to interpretation, but they do indicate a worrisome unfamiliarity with the basics of statistical model choice. They could be avoided by consulting Burnham & Anderson's Model selection and multimodel inference: a practical information-theoretic approach, which has 30,000+ citations on Google Scholar.
A shorter introduction to the basics of statistical model choice may be found on my PhyloWiki pages on BioGeoBEARS Mistakes To Avoid and Advice On Statistical Model Comparison In BioGeoBEARS.
I may write a review on the topic at some point for a journal where these mistakes seem particularly frequent.
LRT / AIC
Viruel, J., Segarra-Moragues, J. G., Raz, L., Forest, F., Wilkin, P., Sanmartín, I. and Catalán, P. (2015), Late Cretaceous–Early Eocene origin of yams (Dioscorea, Dioscoreaceae) in the Laurasian Palaearctic and their subsequent Oligocene–Miocene diversification. Journal of Biogeography, published online December 18, 2015.
The two biogeographical DEC models with fossil AA constraints (M0 and M1) gave similar results (Fig. 4, Appendices S1, S3 Fig. S3), but reconstructed distinct biogeographical scenarios for the most recent common ancestor (MRCA) of Dioscorea (Node 132). Because the stratified M1 model showed a better fit to the data than the unconstrained M0 model (−ln likelihood 281.4 versus 316.9, respectively; likelihood ratio test, P = 0.001), we will refer to the results from this model hereafter (but see comments below and in the Discussion). The global estimated dispersal rate for the M1 model (dis: 0.0136947) was three times higher than the estimated extinction rate (ext: 0.00465982).
Mistake: The Likelihood Ratio Test can only be used to compare two models when one model is "nested" inside the other. "Nesting" means, specifically and exactly, that the simpler model is obtained by fixing the value of one or more parameters that are free in the more complex model. In other words, the simpler model is a special case of the more complex model.
In the example above, a time-stratified biogeographical model is being compared to an unconstrained biogeographical model. Both models have the same number of free parameters (namely, two: d and e, representing "dispersal"/range expansion and "extinction"/local extirpation).
The models differ only in the fixed multipliers applied to dispersal probabilities between regions. In the "unconstrained" model, the dispersal multipliers between all regions are 1 during all time periods (obviously, this does not need to be set explicitly in the program, because multiplying by 1 changes nothing). In the time-stratified model, the multipliers are set to other fixed values. Two models that differ only in the values of their fixed parameters are non-nested: neither is a special case of the other. Because they also have the same number of free parameters, an LRT between them would have zero degrees of freedom, so no valid p-value can be computed. The models can be compared with AIC, AICc, etc., or even by just eyeballing the log-likelihood values, but the Likelihood Ratio Test is not appropriate.
This is because the whole point of the LRT is to ask whether adding a free parameter improves the likelihood of the data significantly more than would be expected by chance. Adding a free parameter can never decrease the maximized likelihood, and will almost always improve it at least slightly. (If you do see a decrease when adding a free parameter, this means either that your program has a math mistake or, more likely, that your maximum-likelihood algorithm has failed to find the maximum likelihood. See ML optimization routines and their pitfalls.)
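As a minimal sketch of this logic (in Python, standard library only), an LRT for nested models might look like the following. The function names chi2_sf and likelihood_ratio_test are my own illustrative choices, not part of any package, and the nested example values are hypothetical. Note that code cannot verify nesting for you; it can only check that the more complex model has more free parameters.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable, integer df >= 1."""
    if df < 1:
        raise ValueError("chi-squared needs at least 1 degree of freedom")
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 case, erfc(sqrt(y)), and step up by 2.
    total = math.erfc(math.sqrt(y))
    for j in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (j - 0.5) / math.gamma(j + 0.5)
    return total


def likelihood_ratio_test(lnL_simple, k_simple, lnL_complex, k_complex):
    """Return (statistic, df, p-value). Assumes the simple model is nested in the complex one."""
    df = k_complex - k_simple
    if df <= 0:
        raise ValueError("models have the same number of free parameters, "
                         "or are in the wrong order; the LRT does not apply")
    stat = 2.0 * (lnL_complex - lnL_simple)
    if stat < 0:
        raise ValueError("the more complex nested model has a lower lnL; "
                         "the optimizer probably failed to find the maximum")
    return stat, df, chi2_sf(stat, df)


# Hypothetical nested pair: 2 versus 3 free parameters.
stat, df, p = likelihood_ratio_test(-100.0, 2, -97.0, 3)
print(f"LRT statistic = {stat:.2f}, df = {df}, p = {p:.4f}")

# The log-likelihoods quoted above for M0 and M1, each with 2 free parameters (d and e).
try:
    likelihood_ratio_test(-316.9, 2, -281.4, 2)
except ValueError as err:
    print("Not a valid LRT:", err)
```

With the values quoted from the paper, the function refuses to return a p-value, because the difference in the number of free parameters is zero.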
(Of course, AIC and related criteria do not yield p-values, but if your model weights strongly favor the more complex model, this is comparable evidence that it is a better explanation of the data.)
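For comparison, here is a small sketch of the AIC calculation applied to the two log-likelihoods quoted above, using AIC = 2k − 2 lnL and the usual Akaike weights. The function names are my own, and the parameter count k = 2 is the count of free parameters (d and e) described above. AICc also needs a sample size n, which is not given in the quoted passage, so the aicc function is included only for completeness and is not applied to these data.

```python
import math


def aic(lnL, k):
    """Akaike Information Criterion from a log-likelihood and a free-parameter count."""
    return 2.0 * k - 2.0 * lnL


def aicc(lnL, k, n):
    """Small-sample corrected AIC; n is the sample size and must exceed k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(lnL, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Convert a list of AIC (or AICc) scores into relative model weights that sum to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


# (lnL, number of free parameters) for the two models in the quoted passage.
models = {
    "M0 (unconstrained)": (-316.9, 2),
    "M1 (time-stratified)": (-281.4, 2),
}

scores = {name: aic(lnL, k) for name, (lnL, k) in models.items()}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))

for name in models:
    print(f"{name}: AIC = {scores[name]:.1f}, weight = {weights[name]:.3g}")
```

Because both models have the same number of free parameters, the AIC difference is simply twice the difference in log-likelihood, and essentially all of the model weight goes to M1. This supports the authors' choice of model without requiring an LRT.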
Is the sword moss (Bryoxiphium) a preglacial Tertiary relict?
Since different ancestral area reconstructions are based on different assumptions and can produce conflicting results (Pirie et al., 2012, Matzke, 2013 and Matzke, 2014), we compared these two versions of the DEC model with a likelihood version of the Dispersal-Vicariance Analysis (DIVALIKE), and a likelihood version of the range evolution model of the Bayesian Binary Model (BAYAREA) of RASP (Yu et al., 2015).
Short version: BayArea is the model of Landis et al. (2013). The Bayesian Binary Model (BBM) is not the same model: it simply treats every area as a binary character, and RASP ran it using the MrBayes library. However, I think the new version of RASP has abandoned BBM, which was deeply flawed (for example, it allowed ancestors that lived nowhere), and may run the BayArea library instead. BioGeoBEARS implements BAYAREALIKE, a likelihood interpretation of BayArea. The similarities and differences between BAYAREALIKE and BayArea are discussed in the example script on the main BioGeoBEARS page.
The one thing BayArea, BBM, and BAYAREALIKE all share is that there is no special cladogenesis process: under these models, the ancestral range, whether narrow or widespread, is copied to both descendants at speciation with 100% probability. This seems less plausible for most taxa, and these models usually confer a much lower likelihood on the data than models with a cladogenesis process, but not always (e.g., clownfish).
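To make the "no cladogenesis process" point concrete, here is a toy sketch of that copying rule. It is not code from BayArea, RASP, or BioGeoBEARS; the function name copy_cladogenesis and the representation of a range as a set of area labels are my own assumptions for illustration.

```python
def copy_cladogenesis(ancestral_range):
    """Return {(left_range, right_range): probability} under pure range copying.

    Both daughters inherit the ancestral range unchanged, so there is
    exactly one possible outcome and it has probability 1.
    """
    rng = frozenset(ancestral_range)
    if not rng:
        raise ValueError("ancestral range must contain at least one area")
    return {(rng, rng): 1.0}


# A widespread ancestor living in areas A and B.
events = copy_cladogenesis({"A", "B"})
for (left, right), prob in events.items():
    print(sorted(left), sorted(right), prob)
```

However widespread the ancestor, both daughters start out with the identical range; any differences between sister lineages have to arise later, along the branches.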
Key References to Avoid Mistakes
Burnham, K.P.; Anderson, D.R. (2004). Multimodel Inference: Understanding AIC and BIC in Model Selection. Sociological Methods & Research, 33(2), 261-304. http://dx.doi.org/10.1177/0049124104268644
Burnham, K.P.; Anderson, D.R. (2002). Model selection and multimodel inference: a practical information-theoretic approach. Springer.