Journal: Molecular Biology and Evolution

Approximating Model Probabilities in Bayesian Information Criterion and Decision-Theoretic Approaches to Model Selection in Phylogenetics



Abstract

A priori selection of models for use in phylogeny estimation from molecular sequence data is increasingly important as the number and complexity of available models increase. The Bayesian information criterion (BIC) and the derivative decision-theoretic (DT) approaches rely on a conservative approximation to estimate the posterior probability of a given model. Here, we extended the DT method by using reversible jump Markov chain Monte Carlo approaches to directly estimate model probabilities for an extended candidate pool of all 406 special cases of the general time reversible + Γ family. We analyzed 250 diverse data sets to evaluate the effectiveness of the BIC approximation for model selection under the BIC and DT approaches. Model choice under DT differed between the BIC approximation and direct estimation methods for 45% of the data sets (113/250), and differing model choice resulted in significantly different sets of trees in the posterior distributions for 26% of the data sets (64/250). The model with the lowest BIC score differed from the model with the highest posterior probability in 30% of the data sets (76/250). When the data indicate a clear model preference, the BIC approximation works well enough to result in the same model selection as with directly estimated model probabilities, but a substantial proportion of biological data sets lack this characteristic, which leads to selection of underparametrized models.
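To make the BIC approximation mentioned in the abstract concrete, the following is a minimal sketch of the standard textbook form: BIC = -2 ln L + k ln n, with approximate posterior model probabilities proportional to exp(-BIC/2) under equal model priors. It is not the authors' code. The model names, log-likelihoods, parameter counts, and alignment length below are hypothetical values chosen only for illustration.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Standard BIC: -2 ln L + k ln n (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def bic_model_probabilities(bic_scores):
    """Approximate posterior model probabilities from BIC scores.

    Assumes equal prior probability for every candidate model.
    Scores are shifted by the minimum before exponentiating so that
    large BIC values do not underflow to zero.
    """
    best = min(bic_scores.values())
    weights = {m: math.exp(-0.5 * (s - best)) for m, s in bic_scores.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


# Hypothetical candidates: (maximized log-likelihood, free substitution-model
# parameters). These numbers are illustrative, not taken from the paper.
candidates = {
    "JC": (-5230.0, 0),
    "HKY+G": (-5105.0, 5),
    "GTR+G": (-5098.0, 9),
}
n_sites = 1000  # hypothetical alignment length used as the sample size n

scores = {name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
probs = bic_model_probabilities(scores)
best_model = min(scores, key=scores.get)

for name in candidates:
    print(f"{name}: BIC={scores[name]:.2f}  approx. P(model | data)={probs[name]:.4f}")
print("Lowest-BIC model:", best_model)
```

In this sketch the lowest-BIC model and the highest-probability model coincide by construction, because both come from the same approximation. The disagreement reported in the abstract arises when the probabilities are instead estimated directly, for example by reversible jump MCMC.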
