Journal: Bioinformatics

Site-specific evolutionary rates in proteins are better modeled as non-independent and strictly relative


Abstract

Motivation: In a nucleotide or amino acid sequence, not all sites evolve at the same rate, because selective constraints differ from site to site. In computational molecular evolution, current models that incorporate rate heterogeneity always share two assumptions. First, the rate of evolution at each site is assumed to be independent of the rate at every other site. Second, the values of these rates are assumed to be drawn from a known prior distribution. Although the effect of these assumptions is often assumed to be small, it has not previously been quantified in the literature.

Results: Herein we describe an algorithm to simultaneously infer the set of n-1 relative rates that parameterize the likelihood of an n-site alignment. Unlike previous work, (a) these relative rates are completely identifiable and distinct from the branch-length parameters, and (b) a far more general class of rate priors can be used, and their effects quantified. Although the algorithm is described in a Bayesian framework, we discuss a future maximum likelihood extension.

Conclusions: Using both synthetic data and alignments from the Myc, Max and p53 protein families, we find that inferring relative rather than absolute rates has several advantages. First, both empirical likelihoods and Bayes factors show a strong preference for the relative-rate model, with a mean Δ ln P = -0.458 per alignment site. Second, the computed likelihoods and Bayes factors were essentially independent of the relative-rate prior, indicating that good estimates of the posterior rate distribution are not required a priori. Third, a novel finding is that rates can be accurately inferred even when up to approximately 4 substitutions per site have occurred. Thus biologically relevant putative hypervariable sites can be identified as easily as conserved sites. Lastly, our model treats rates and tree branch-lengths as completely identifiable, allowing for the first time coherent simultaneous inference of branch-lengths and site-specific evolutionary rates.
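The identifiability point can be illustrated with a toy calculation. The sketch below is not the algorithm from the paper: it assumes a two-sequence alignment under the Jukes-Cantor (JC69) nucleotide model, chosen only because its likelihood has a closed form, and it assumes that "relative" means the n rates are rescaled to average 1, which leaves n-1 free values. The abstract does not state which constraint the authors use. All function names and data values are hypothetical.

```python
import math

def jc69_site_loglik(same: bool, rate: float, t: float) -> float:
    """Log-likelihood of one site in a two-sequence alignment under JC69.

    Only the product rate * t (expected substitutions at this site)
    enters the formula, which is the source of the non-identifiability.
    """
    d = rate * t
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # both sequences show the same nucleotide
    p_diff = 0.25 - 0.25 * e   # one specific differing nucleotide
    return math.log(0.25 * (p_same if same else p_diff))

def alignment_loglik(matches, rates, t):
    """Sum of per-site log-likelihoods (sites treated independently here)."""
    return sum(jc69_site_loglik(m, r, t) for m, r in zip(matches, rates))

def to_relative(rates, t):
    """Rescale rates to mean 1 and absorb the scale into the branch length."""
    mean = sum(rates) / len(rates)
    return [r / mean for r in rates], t * mean

# Hypothetical data: True marks a site where the two sequences agree.
matches = [True, True, False, True, False]
abs_rates = [0.2, 0.5, 3.0, 0.1, 2.2]   # made-up "absolute" rates
t = 0.4                                  # made-up branch length

ll_abs = alignment_loglik(matches, abs_rates, t)

# Multiplying every rate by 10 and dividing t by 10 changes nothing,
# so absolute rates and branch length cannot be separated by the data.
ll_scaled = alignment_loglik(matches, [r * 10 for r in abs_rates], t / 10)

# Fixing the mean rate at 1 removes that freedom without changing the fit.
rel_rates, t_rel = to_relative(abs_rates, t)
ll_rel = alignment_loglik(matches, rel_rates, t_rel)

print(ll_abs, ll_scaled, ll_rel)
```

All three log-likelihoods agree up to floating-point error. Under the mean-one constraint assumed here, the last rate is determined by the other n-1, and the overall scale is carried by the branch length alone.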
