Home > U.S. National Institutes of Health literature > Molecular Biology and Evolution > Assessment of Substitution Model Adequacy Using Frequentist and Bayesian Methods

Assessment of Substitution Model Adequacy Using Frequentist and Bayesian Methods

Abstract

In order to have confidence in model-based phylogenetic methods, such as maximum likelihood (ML) and Bayesian analyses, one must use an appropriate model of molecular evolution identified using statistically rigorous criteria. Although model selection methods such as the likelihood ratio test and Akaike information criterion are widely used in the phylogenetic literature, model selection methods lack the ability to reject all models if they provide an inadequate fit to the data. There are two methods, however, that assess absolute model adequacy, the frequentist Goldman–Cox (GC) test and Bayesian posterior predictive simulations (PPSs), which are commonly used in conjunction with the multinomial log likelihood test statistic. In this study, we use empirical and simulated data to evaluate the adequacy of common substitution models using both frequentist and Bayesian methods and compare the results with those obtained with model selection methods. In addition, we investigate the relationship between model adequacy and performance in ML and Bayesian analyses in terms of topology, branch lengths, and bipartition support. We show that tests of model adequacy based on the multinomial likelihood often fail to reject simple substitution models, especially when the models incorporate among-site rate variation (ASRV), and normally fail to reject less complex models than those chosen by model selection methods. In addition, we find that PPSs often fail to reject simpler models than the GC test. Use of the simplest substitution models not rejected based on fit normally results in similar but divergent estimates of tree topology and branch lengths. In addition, use of the simplest adequate substitution models can affect estimates of bipartition support, although these differences are often small with the largest differences confined to poorly supported nodes. We also find that alternative assumptions about ASRV can affect tree topology, tree length, and bipartition support. 
Our results suggest that using the simplest substitution models not rejected based on fit may be a valid alternative to implementing more complex models identified by model selection methods. However, all common substitution models may fail to recover the correct topology and assign appropriate bipartition support if the true tree shape is difficult to estimate regardless of model adequacy.
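
As a rough illustration of the multinomial log likelihood test statistic mentioned in the abstract, the Python sketch below computes the unconstrained multinomial log likelihood of an alignment, T(X) = sum_i n_i ln(n_i) - N ln(N), where n_i is the count of the i-th distinct site pattern and N is the number of sites. It also computes a simple predictive p-value from a list of statistics obtained on simulated data sets. The function names, the input format (one string per taxon), and the choice of the upper tail for the p-value are assumptions made for illustration; they are not taken from the paper. The Goldman–Cox statistic additionally requires the maximized likelihood under the substitution model, which this sketch does not compute.

```python
import math
from collections import Counter


def multinomial_log_likelihood(alignment):
    """Unconstrained multinomial log likelihood of an alignment.

    alignment: list of equal-length strings, one per taxon (assumed format).
    Returns sum_i n_i * ln(n_i) - N * ln(N), where n_i is the number of
    times the i-th distinct column (site pattern) occurs and N is the
    total number of columns.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same length")

    columns = list(zip(*alignment))  # one tuple of characters per site
    n_sites = len(columns)
    if n_sites == 0:
        raise ValueError("alignment must contain at least one site")

    pattern_counts = Counter(columns)
    pattern_term = sum(n * math.log(n) for n in pattern_counts.values())
    return pattern_term - n_sites * math.log(n_sites)


def predictive_p_value(observed, simulated):
    """Fraction of simulated statistics at least as large as the observed one.

    Using the upper tail here is an assumption of this sketch.
    """
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    return sum(1 for t in simulated if t >= observed) / len(simulated)
```

In a posterior predictive workflow, `simulated` would hold the statistic computed on data sets simulated from parameter values drawn from the posterior; in a parametric bootstrap it would come from data sets simulated under the maximum likelihood estimates. Generating those data sets requires phylogenetic simulation software and is outside the scope of this sketch.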