Journal: Molecular Biology and Evolution

Inferring Demography from Runs of Homozygosity in Whole-Genome Sequence with Correction for Sequence Errors

Abstract

Whole-genome sequence is potentially the richest source of genetic data for inferring ancestral demography. However, full sequence also presents significant challenges to fully utilize such large data sets and to ensure that sequencing errors do not introduce bias into the inferred demography. Using whole-genome sequence data from two Holstein cattle, we demonstrate a new method to correct for bias caused by hidden errors and then infer stepwise changes in ancestral demography up to present. There was a strong upward bias in estimates of recent effective population size (Ne) if the correction method was not applied to the data, both for our method and the Li and Durbin (Inference of human population history from individual whole-genome sequences. Nature 475:493–496) pairwise sequentially Markovian coalescent method. To infer demography, we use an analytical predictor of multiloci linkage disequilibrium (LD) based on a simple coalescent model that allows for changes in Ne. The LD statistic summarizes the distribution of runs of homozygosity for any given demography. We infer a best fit demography as one that predicts a match with the observed distribution of runs of homozygosity in the corrected sequence data. We use multiloci LD because it potentially holds more information about ancestral demography than pairwise LD. The inferred demography indicates a strong reduction in the Ne around 170,000 years ago, possibly related to the divergence of African and European Bos taurus cattle. This is followed by a further reduction coinciding with the period of cattle domestication, with Ne of between 3,500 and 6,000. The most recent reduction of Ne to approximately 100 in the Holstein breed agrees well with estimates from pedigrees. Our approach can be applied to whole-genome sequence from any diploid species and can be scaled up to use sequence from multiple individuals.
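
As a rough illustration of the observed quantity the abstract refers to, the sketch below extracts runs of homozygosity from one diploid chromosome and bins their lengths. It is not the authors' implementation: the function names, the 1-based coordinate convention, and the simplification that a run is any stretch of bases between consecutive heterozygous sites are all assumptions made for this example. It also does not implement the paper's error correction or its LD predictor.

```python
from bisect import bisect_right
from typing import Iterable, List, Sequence


def homozygous_run_lengths(het_positions: Iterable[int], chrom_length: int) -> List[int]:
    """Return lengths (in bp) of homozygous stretches between heterozygous sites.

    Assumption for this sketch: positions are 1-based, and the stretches at
    both chromosome ends count as runs.
    """
    runs: List[int] = []
    previous = 0  # sentinel position just before the first base
    for pos in sorted(set(het_positions)):
        if not 1 <= pos <= chrom_length:
            raise ValueError(f"position {pos} is outside 1..{chrom_length}")
        gap = pos - previous - 1  # homozygous bases strictly between two sites
        if gap > 0:
            runs.append(gap)
        previous = pos
    tail = chrom_length - previous  # bases after the last heterozygous site
    if tail > 0:
        runs.append(tail)
    return runs


def run_length_histogram(runs: Iterable[int], bin_edges: Sequence[int]) -> List[int]:
    """Count runs in half-open bins [edge_i, edge_{i+1}); other runs are ignored."""
    counts = [0] * (len(bin_edges) - 1)
    for length in runs:
        idx = bisect_right(bin_edges, length) - 1
        if 0 <= idx < len(counts):
            counts[idx] += 1
    return counts


# Hypothetical toy data: heterozygous calls on a 20 bp chromosome.
true_runs = homozygous_run_lengths([5, 10, 11], 20)
# One spurious heterozygous call at position 15 splits the final run in two.
noisy_runs = homozygous_run_lengths([5, 10, 11, 15], 20)
```

In this toy case the spurious call replaces one run of 9 bp with runs of 3 bp and 5 bp. Shortening of long runs by false heterozygous calls is one plausible way that uncorrected sequencing errors could distort the run-length distribution that the inference is fitted to.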