Site-to-site rate variation in protein coding genes.

Abstract

The ability to model gene evolution realistically improved dramatically once the assumption that rates are constant across sites was rejected. Rate heterogeneity models allow for better parameter estimates and for site-specific inferences such as the detection of positive selection. Recently developed models of codon evolution allow both synonymous and nonsynonymous rates to vary independently according to discretized gamma distributions. I applied this model to mitochondrial genomes and concluded that synonymous rate variation is present in many genes and is of appreciable magnitude relative to the amount of nonsynonymous heterogeneity. I then extended this model to allow the two rates to vary according to a dependent bivariate distribution, permitting tests for the significance of the correlation between rates within a gene. I present here the algorithm to discretize this bivariate distribution and the application of the model to many real data sets. Significant correlation between synonymous and nonsynonymous rates exists in roughly half of the data sets that I examined, and the correlation is typically positive. These data sets span a wide range of taxa and genes, implying that the trend of correlation is general. Finally, I performed a thorough investigation of the statistical properties of using discretized gamma distributions to model rate variation, looking at the bias and variance of the parameter estimates. These discretized distributions are common in modeling heterogeneity, but they have weaknesses that must be well understood before inferences are made.
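
The discretized gamma distributions mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's code. It assumes the widely used equal-probability scheme, in which a gamma distribution with shape alpha and mean one is cut into k categories of probability 1/k and each category is represented by its conditional mean. All function names are hypothetical, and only the Python standard library is used.

```python
import math

def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), computed from its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 0
    # Sum x^n / (a (a+1) ... (a+n)) until the terms stop contributing.
    while abs(term) > 1e-16 * abs(total) and n < 10000:
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def gamma_cdf(x, shape, rate):
    """CDF of a gamma distribution with the given shape and rate."""
    return reg_lower_gamma(shape, rate * x)

def gamma_quantile(p, shape, rate):
    """Quantile found by bisection on the CDF (slow but dependency-free)."""
    lo, hi = 0.0, 1.0
    while gamma_cdf(hi, shape, rate) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def discrete_gamma_rates(alpha, k):
    """Conditional means of k equal-probability categories of Gamma(alpha, rate=alpha)."""
    cuts = [0.0] + [gamma_quantile(i / k, alpha, alpha) for i in range(1, k)] + [math.inf]
    rates = []
    for lo, hi in zip(cuts[:-1], cuts[1:]):
        # Integral of x f(x) over [lo, hi] equals P(alpha+1, alpha*hi) - P(alpha+1, alpha*lo)
        # when the distribution has mean one; dividing by 1/k gives the category mean.
        upper = 1.0 if math.isinf(hi) else reg_lower_gamma(alpha + 1.0, alpha * hi)
        lower = reg_lower_gamma(alpha + 1.0, alpha * lo)
        rates.append(k * (upper - lower))
    return rates

if __name__ == "__main__":
    # Four rate categories for a strongly skewed distribution (alpha = 0.5).
    print([round(r, 4) for r in discrete_gamma_rates(0.5, 4)])
```

Because every category has probability 1/k, the category rates average to one by construction. With a small k, the largest category compresses the whole upper tail into a single value, which is one way such a discretization could affect parameter estimates.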

Bibliographic record

  • Author

    Mannino, Frank Vincent

  • Author affiliation

    North Carolina State University

  • Degree-granting institution: North Carolina State University
  • Subject: Biology, Genetics; Biology, Biostatistics
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 155 p.
  • Total pages: 155
  • Original format: PDF
  • Language of text: English
