International Conference on Information and Computer Technologies

Multiple Anchor Staged Alignment Algorithm – Sensitive (MASAA – S)

Abstract

Sequence alignment is widely used in computational biology and bioinformatics to determine how similar two sequences are. Many algorithms have been developed over time to align two sequences. The earliest were based on dynamic programming, which made them slow but produced optimal alignments. Today, heuristic algorithms are popular because they are faster and still produce near-optimal alignments. In this paper, we improve on MASAA (Multiple Anchor Staged Local Sequence Alignment Algorithm), a heuristic algorithm we published previously. The new algorithm is called MASAA - S, for MASAA Sensitive. It uses a suffix tree to identify anchors first; to improve sensitivity, it then applies adaptive seeds and shorter perfect-match seeds between the anchors already identified. Anchors separated by a distance greater than a threshold 'd' are excluded. We tested the algorithm on randomly generated sequences and on the Rosetta dataset, with sequence lengths of up to 500 thousand.
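The staged idea in the abstract (long exact anchors first, a distance threshold 'd', then shorter perfect-match seeds between anchors) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It swaps the suffix tree for a plain k-mer index to stay short, omits adaptive seeds entirely, and reads the threshold rule as "drop an anchor whose gap from the previously kept anchor exceeds d", which is only one possible interpretation. The function names `exact_seeds`, `filter_by_gap`, and `seeds_between`, and the parameters `k` and `k_short`, are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple

# An anchor is (start in a, start in b, match length). Hypothetical layout.
Anchor = Tuple[int, int, int]


def exact_seeds(a: str, b: str, k: int) -> List[Anchor]:
    """Exact matches of length >= k, found with a k-mer index.

    Assumption: a k-mer index stands in for the suffix tree named in the abstract.
    """
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    seeds = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            # Skip hits that are not left-maximal; the longer match that
            # starts further left already covers them.
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            n = k
            # Extend the match to the right while characters agree.
            while i + n < len(a) and j + n < len(b) and a[i + n] == b[j + n]:
                n += 1
            seeds.append((i, j, n))
    return sorted(seeds)


def filter_by_gap(anchors: List[Anchor], d: int) -> List[Anchor]:
    """Greedy colinear chain that drops anchors more than d past the last kept one.

    Assumption: this is one reading of the threshold 'd' in the abstract.
    """
    kept: List[Anchor] = []
    for i, j, n in sorted(anchors):
        if kept:
            pi, pj, pn = kept[-1]
            gap_a, gap_b = i - (pi + pn), j - (pj + pn)
            if gap_a < 0 or gap_b < 0:  # overlapping or out of order
                continue
            if max(gap_a, gap_b) > d:   # too far from the previous anchor
                continue
        kept.append((i, j, n))
    return kept


def seeds_between(a: str, b: str, anchors: List[Anchor], k_short: int) -> List[Anchor]:
    """Search the regions between consecutive anchors for shorter exact seeds."""
    extra = []
    for (i1, j1, n1), (i2, j2, _) in zip(anchors, anchors[1:]):
        sa, sb = i1 + n1, j1 + n1
        for i, j, n in exact_seeds(a[sa:i2], b[sb:j2], k_short):
            extra.append((sa + i, sb + j, n))
    return extra


if __name__ == "__main__":
    a = "ACGTTGCAAG" + "TTGCA" + "GGATCCTAGC"
    b = "ACGTTGCAAG" + "CTGCG" + "GGATCCTAGC"
    anchors = filter_by_gap(exact_seeds(a, b, 8), d=5)
    print(anchors)
    print(seeds_between(a, b, anchors, k_short=3))
```

In this sketch the long seeds fix the overall frame of the alignment, and the shorter seeds are searched only inside the gaps, which keeps the second stage cheap while recovering matches that the longer seed length would miss.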
