首页> 中文期刊> 《软件学报》 >最大节约原则下单倍型推导问题的实用算法

最大节约原则下单倍型推导问题的实用算法

         

摘要

在疾病的易感基因研究和药物反应实验中,常常需要知道单倍型,而不仅仅是基因型数据.但是直接通过生物学实验手段来测定单倍型在时间和成本上消耗过大,所以在实验室里往往仅测得基因型,而通过一些计算手段来推导出单倍型.不同于Clark著名的单倍型推导模型,Gusfield和Wang等人提出了一种通过基因型样本推导单倍型的新模型.这种模型试图按照最大节约原则去寻找可以解释基因型样本的最小单倍型集合.这种基于节约原则的模型克服了Clark模型的一些缺陷.提出了节约原则模型的一个多项式时间的贪心算法以及一种把贪心策略和分支限界策略集合在统一框架下的复合算法.相对于Wang原来提出的分支限界完全算法,贪心的近似算法运行快得多,而且同时保持了比较准确的推导结果.新的复合算法也是一种完全算法.实验结果表明,与原来的分支限界算法相比,复合算法可以极大地提高运行效率以及可应用的实例规模.%Haplotypes, rather than genotypes are required in some disease susceptibilities and drug response tests.However, it is both time-consuming and expensive to obtain haplotypes experimentally. Therefore usually genotype data are collected in the laboratory at first, then, haplotype data are inferred from them resorting to some computational approaches. Different from Clark's well-known haplotype inference method, Gusfield and Wang et al.proposed a new model according to the maximum parsimony principle. It tries to find a minimum set of haplotypes that can explain the genotype samples. This parsimony model overcomes some weaknesses of Clark's method. For the parsimony this paper presents model a polynomial time greedy algorithm and a compound algorithm that combines the greedy policy with the branch-and-bound strategy in a uniform framework. Compared with the original complete algorithm proposed by Wang et al., the greedy approximation algorithm runs much faster, and in the meanwhile, produces relatively higher accurate results. The compound algorithm is also a complete algorithm.Simulation results show that it is much more efficient and can be applied to instances of much larger scales than the original complete algorithm.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号