Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

Accuracy Assessment of Diploid Consensus Sequences


Abstract

If the origin of each fragment were known in a genome sequencing project, reconstructing diploid consensus sequences would be straightforward. In practice, however, the origins are not known. Although methods have been proposed to reconstruct haplotypes from genome sequencing projects, an accuracy assessment is required to evaluate the confidence of the estimated diploid consensus sequences. In this paper, we define a confidence score for diploid consensus sequences. The score requires calculating the likelihood of an assembly, for which we propose an algorithm that runs in time linear in the number of polymorphic sites. The likelihood calculation and confidence score are then used to improve haplotype estimation in two ways. First, low-scoring phases are disconnected. Second, instead of assuming the nominal frequency of 1/2, the haplotype frequency is estimated to reflect the actual contribution of each haplotype. Our method was evaluated on simulated data whose polymorphism rate (1.2 percent) was based on Ciona intestinalis. The results indicate that the algorithm is highly accurate: the true positive rate of the haplotype estimation was greater than 97 percent.
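
The abstract does not spell out the likelihood model, so the following is only an illustrative sketch of the general idea of estimating a haplotype frequency rather than fixing it at 1/2. It assumes a simple two-haplotype mixture in which each fragment comes from haplotype A with probability `freq_a`, and each observed allele disagrees with its source haplotype independently with probability `error_rate`. The function names, the mixture model, the error rate, and the EM update are all assumptions made for illustration; this is not the paper's linear-time algorithm.

```python
import math

def fragment_prob(fragment, haplotype, error_rate):
    """P(fragment | haplotype) under an assumed independent per-site error model.

    fragment maps a polymorphic-site index to the observed allele;
    haplotype is a string of alleles indexed by site.
    """
    p = 1.0
    for site, allele in fragment.items():
        p *= (1.0 - error_rate) if haplotype[site] == allele else error_rate
    return p

def log_likelihood(fragments, hap_a, hap_b, freq_a=0.5, error_rate=0.01):
    """Log-likelihood of all fragments under the assumed two-haplotype mixture."""
    total = 0.0
    for frag in fragments:
        pa = freq_a * fragment_prob(frag, hap_a, error_rate)
        pb = (1.0 - freq_a) * fragment_prob(frag, hap_b, error_rate)
        total += math.log(pa + pb)
    return total

def estimate_frequency(fragments, hap_a, hap_b, error_rate=0.01, iterations=50):
    """Estimate freq_a by EM, starting from the nominal value 1/2."""
    freq_a = 0.5
    for _ in range(iterations):
        # E-step: posterior probability that each fragment came from haplotype A.
        responsibilities = []
        for frag in fragments:
            pa = freq_a * fragment_prob(frag, hap_a, error_rate)
            pb = (1.0 - freq_a) * fragment_prob(frag, hap_b, error_rate)
            responsibilities.append(pa / (pa + pb))
        # M-step: the new frequency is the mean posterior membership.
        freq_a = sum(responsibilities) / len(responsibilities)
    return freq_a

if __name__ == "__main__":
    hap_a, hap_b = "0101", "1010"  # hypothetical haplotypes over four sites
    frags = [{0: "0", 1: "1"}, {1: "1", 2: "0"}, {2: "0", 3: "1"}, {0: "1", 1: "0"}]
    f = estimate_frequency(frags, hap_a, hap_b)
    print(f, log_likelihood(frags, hap_a, hap_b, f))
```

In this toy input three of the four fragments agree with haplotype A, so the estimated frequency moves away from 1/2 toward the observed share of A, and the log-likelihood at the estimate is at least as high as at 1/2.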
