BioGeoBEARS Warnings to Ignore Mostly

Introduction

When running any R program/script, whether using BioGeoBEARS or not, there are three sorts of problems that R itself might give you. These are:

  1. R crash: The entirety of R crashes and you exit to Terminal or the operating system. These are caused by memory misallocation and other severe problems. Sometimes closing a plot window while the plot is still being drawn by R will cause this problem.
  2. Error/script crash: An error in a function/script causes R to stop executing your program, but R itself remains running. This can be caused by a file not being found where you've told R it should be found, bad file formatting, user error, or many other issues.
  3. Warnings: R thinks something is weird, but managed to proceed with the analysis.
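As a minimal base-R sketch of the difference between the last two categories (the inputs are made up for illustration): a warning lets the expression finish and return a value, whereas an error aborts the expression while R itself keeps running.

```r
# A warning: R flags something odd but still returns a value.
warn_result <- withCallingHandlers(
  as.numeric("abc"),                  # not a number, so R returns NA with a warning
  warning = function(w) {
    message("Warning caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")    # resume where the warning was raised
  }
)

# An error: evaluation of the expression stops, but R keeps running.
error_result <- tryCatch(
  stop("hypothetical file not found"),
  error = function(e) {
    message("Error caught: ", conditionMessage(e))
    NULL                              # value returned in place of the failed call
  }
)
```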

While #1 and #2 always require serious attention, #3 sometimes does not. You should of course read warnings and try to figure out their cause, and then decide whether or not they are problematic. Part of the genius/madness of R is that it is very good at "guessing" what you meant to do when you say, for example,

if (c(1,1) == TRUE) {print("yep")}

Here R first converts the logical (TRUE/FALSE) to numeric (1/0), and then compares each element of the vector of 1s (c(1,1)) to it. This evaluates to TRUE TRUE. However, this causes a problem: you have two logical values inside the if statement, but the if statement wants just one. In R versions before 4.2.0, R uses the first value and throws a warning, because R's programmers know you probably didn't really want more than one logical inside the if statement. Since R 4.2.0, a condition of length greater than one is an error rather than a warning.

This is the sort of warning you should probably fix in your code.
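One way to fix it, sketched here in base R, is to collapse the comparison to a single logical with all() or any(), depending on which one you actually meant:

```r
x <- c(1, 1)

# The comparison is vectorized: it yields one logical per element of x.
cmp <- (x == TRUE)
print(cmp)

# Collapse to a single TRUE/FALSE before handing it to if().
if (all(cmp)) {
  print("yep: every element equals TRUE")
}
if (any(cmp)) {
  print("yep: at least one element equals TRUE")
}
```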

However, other warnings are less important. The following regularly appear in BioGeoBEARS analyses, but you can basically ignore them:

Note: no visible global function definition for 'calc_independent_likelihoods_on_each_branch'

> source("http://phylo.wdfiles.com/local--files/biogeobears/cladoRcpp.R") # (needed now that traits model added; source FIRST!)
Loading required package: roxygen2
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_add_fossils_randomly_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_basics_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_calc_transition_matrices_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_classes_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_detection_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_DNA_cladogenesis_sim_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_extract_Qmat_COOmat_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_generics_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_models_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_on_multiple_trees_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_plots_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_readwrite_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_simulate_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_SSEsim_makePlots_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_SSEsim_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_stochastic_mapping_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_stratified_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/BioGeoBEARS_univ_model_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/calc_uppass_probs_v1.R")
> source("http://phylo.wdfiles.com/local--files/biogeobears/calc_loglike_sp_v01.R")

Note: no visible global function definition for 'calc_independent_likelihoods_on_each_branch'

This note just says that R does not yet see a function that has not been loaded at that point. You can ignore it.

Warning: unused control arguments ignored

After the ML search is concluded (on defaults, this is a search with optimx), this warning is thrown, but the program proceeds apace:

Warning message:
In (function (npt = min(n + 2L, 2L * n), rhobeg = NA, rhoend = NA,  :
  unused control arguments ignored

This warning is due to some issue in optimx or one of its dependencies (optimx wraps other ML algorithms) claiming there is an unrecognized control argument for optimx. I have double-checked the control arguments, taken them in and out, and followed the optimx help, but no matter what I do I get this warning nonetheless.

Short version: ignore this warning. If your script / search crashes, it's not because of this warning, even if it is printed close to other warnings/errors at the crash.
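If the known-harmless warnings make it hard to spot the ones that matter, one option is to muffle only those messages and let everything else through. This is a minimal base-R sketch; run_ignoring_known_warnings() and demo_search() are hypothetical names used only for illustration and are not part of BioGeoBEARS or optimx.

```r
# Messages judged harmless on this page; extend the vector as needed.
harmless_patterns <- c("unused control arguments ignored")

# Hypothetical helper: evaluate expr, muffling only warnings whose message
# contains one of the harmless patterns. All other warnings pass through.
run_ignoring_known_warnings <- function(expr, patterns = harmless_patterns) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      msg <- conditionMessage(w)
      is_harmless <- any(vapply(patterns, grepl, logical(1), x = msg, fixed = TRUE))
      if (is_harmless) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Hypothetical stand-in for an ML search that emits two warnings.
demo_search <- function() {
  warning("unused control arguments ignored")
  warning("something else worth reading")
  42
}

# Only the second warning is reported; the return value is unaffected.
result <- run_ignoring_known_warnings(demo_search())
```

In a real script, the call wrapped by the helper would be whatever function runs your search.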

Warning: parameters or bounds appear to have different scalings

Warning messages:
1: In optimx.check(par, optcfg$ufn, optcfg$ugr, optcfg$uhess, lower,  :
  Parameters or bounds appear to have different scalings.
  This can cause poor performance in optimization. 
  It is important for derivative free methods like BOBYQA, UOBYQA, NEWUOA.

This warning occurs if a starting parameter value (specified in the "init" column of BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table) is outside, or close to, a bound.

E.g., when you start a 3-parameter search with the results of a 2-parameter model, starting parameters at 0 or very small values (smaller than the lower bound) could result in this warning.

We are indeed using bobyqa for optimx (a search with bounded limits for each parameter; L-BFGS-B is used for optim), so conceivably there could be an issue.

However, in my experience so far, ML searches seem to reach the same results whether or not this warning is thrown, as long as the ML parameter values actually are within the user-specified bounds.

If you remain worried, just change the "init" starting parameters to be within the search bounds, and the warning should go away.
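A minimal sketch of such a check, run on a mock table rather than a real run object. It assumes the parameter table has "type", "min", and "max" columns alongside "init"; the values and the size of the margin are arbitrary illustrations. In a real analysis you would apply the same logic to BioGeoBEARS_run_object$BioGeoBEARS_model_object@params_table.

```r
# Mock parameter table (assumed layout; values are made up).
params_table <- data.frame(
  type = c("free", "free", "free"),
  init = c(0.01, 0.01, 0.0),
  min  = c(1e-12, 1e-12, 1e-5),
  max  = c(5, 5, 3),
  row.names = c("d", "e", "j"),
  stringsAsFactors = FALSE
)

free <- params_table$type == "free"

# Keep starting values a small distance inside the bounds (arbitrary margin).
margin   <- 1e-3 * (params_table$max - params_table$min)
lower_ok <- params_table$min + margin
upper_ok <- params_table$max - margin

# Flag free parameters whose start is outside, or very close to, a bound.
needs_fix <- free & (params_table$init < lower_ok | params_table$init > upper_ok)
print(rownames(params_table)[needs_fix])

# Move the flagged starting values just inside the bounds.
clamped <- pmin(pmax(params_table$init, lower_ok), upper_ok)
params_table$init[needs_fix] <- clamped[needs_fix]
print(params_table)
```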

Another possibility is that the ranges of some of the parameters are very different. E.g., perhaps you've got x ranging from -10 to 10 and j ranging from 0 to 1. Some optimizers may complain about this. In my experience (I have not systematically explored this), it seems to make little difference to the optimization. I have also added the option BioGeoBEARS_run_object$rescale_params=TRUE, which will automatically rescale all free parameters to the same range during optimization (un-scaling them back for the likelihood calculations, of course). However, in my (again unsystematic) experience, this again seems to make little difference.
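To make the idea of rescaling concrete, here is a small sketch of mapping parameters with different ranges onto a common 0-1 scale and back. It illustrates the general technique only and is not the BioGeoBEARS implementation; the function names and values are hypothetical.

```r
# Map a value from [lower, upper] onto [0, 1].
rescale_to_unit <- function(value, lower, upper) {
  (value - lower) / (upper - lower)
}

# Map a value on [0, 1] back to [lower, upper].
unscale_from_unit <- function(scaled, lower, upper) {
  lower + scaled * (upper - lower)
}

# x ranges from -10 to 10 and j from 0 to 1, as in the example above.
lower  <- c(x = -10, j = 0)
upper  <- c(x = 10,  j = 1)
values <- c(x = -2.5, j = 0.25)

scaled   <- rescale_to_unit(values, lower, upper)    # what the optimizer would see
restored <- unscale_from_unit(scaled, lower, upper)  # what the likelihood would see

print(scaled)
print(restored)
```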

As with all Maximum Likelihood analyses in all programs, users should always be alert for signs of optimization failure in their analyses, especially in more complex models (models with more free parameters). See Advice On Statistical Model Comparison In BioGeoBEARS: ML optimization routines and their pitfalls and BioGeoBEARS Mistakes To Avoid: Issues with Maximum Likelihood (ML) optimization.
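One common way to look for optimization failure is to repeat the search from several starting values and compare the results. The sketch below uses base R's optim() on a toy one-parameter function with two local minima, standing in for a negative log-likelihood. The function and starting values are made up for illustration and have nothing to do with a real BioGeoBEARS likelihood.

```r
# Toy objective with two local minima (stand-in for a negative log-likelihood).
toy_neg_loglike <- function(p) (p^2 - 1)^2 + 0.3 * p

starts <- c(-1.5, -0.5, 0.5, 1.5)

# Run a bounded search from each starting value.
fits <- lapply(starts, function(s) {
  optim(par = s, fn = toy_neg_loglike, method = "L-BFGS-B", lower = -2, upper = 2)
})

# Collect the results side by side for comparison.
summary_table <- data.frame(
  start       = starts,
  estimate    = vapply(fits, function(f) f$par, numeric(1)),
  value       = vapply(fits, function(f) f$value, numeric(1)),
  convergence = vapply(fits, function(f) as.numeric(f$convergence), numeric(1))
)
print(summary_table)

# The run with the lowest objective value is the best candidate found.
best <- summary_table[which.min(summary_table$value), ]
print(best)
```

If different starts end at noticeably different objective values, at least some of the searches stopped at a local optimum, and the best value found should be treated with corresponding caution.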

The GenSA optimizer (Generalized Simulated Annealing) seems to do a more thorough search than optim or optimx, although it is slower.

Found more than one class "tipranges" in cache; using the first, from namespace 'BioGeoBEARS'

This warning:

Found more than one class "tipranges" in cache; using the first, from namespace 'BioGeoBEARS'

…just results from the order in which functions are loaded. You can ignore it.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License