...
首页> 外文期刊>Discrete Applied Mathematics >Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree
【24h】

Approximation algorithms for multiple sequence alignment under a fixed evolutionary tree

机译:固定进化树下多序列比对的近似算法

获取原文
获取原文并翻译 | 示例
           

摘要

We consider the problem of multiple sequence alignment under a fixed evolutionary tree: given a tree whose leaves are labeled by sequences, find ancestral sequences to label its internal nodes so as to minimize the total length of the tree, where the length of an edge is the edit distance between the sequences labeling its endpoints. We present a new polynomial-time approximation algorithm for this problem, and analyze its performance on regular d-ary trees with d a constant. On such a tree, the algorithm finds a solution within a factor (d + 1)/(d - 1) of the minimum in O(k(d)T(d, n) + k(2d)n(2)) time, where k is the number of leaves in the tree, n is the length of the longest sequence labeling a leaf, and T(d, n) is the time to compute a Steiner point for d sequences of length at most n. (A Steiner point for a set Y of sequences is a sequence P that minimizes the sum of the edit distances from P to each sequence in Y. The time T(d,n) is O(d2(d)n(d)), given O(ds(d+1))-time preprocessing for an alphabet of size s.) The approximation algorithm is conceptually simple and easy to implement, and actually applies to any metric space in which a Steiner point for any fixed-sized set can be computed in polynomial time. We also introduce a new problem, bottleneck tree-alignment, in which the objective is to label the internal nodes of the tree so as to minimize the length of the longest edge. We describe an exponential-time exact algorithm for the case of unit-cost edit operations, and show there is a simple linear-time approximation algorithm for the general case that finds a solution within a factor O(log k) of the minimum. 1998 Published by Elsevier Science B.V. All rights reserved. [References: 21]
机译:我们考虑在固定的进化树下进行多序列比对的问题:给定一棵树的叶子被序列标记的树,找到祖先序列以标记其内部节点,以最小化树的总长度,其中边的长度为标记其端点的序列之间的编辑距离。我们针对此问题提出了一种新的多项式时间近似算法,并分析了其在具有d常数的规则d元树上的性能。在这样的树上,算法在O(k(d)T(d,n)+ k(2d)n(2))的最小值的(d +1)/(d-1)内找到一个解。时间,其中k是树中叶子的数量,n是标记叶子的最长序列的长度,T(d,n)是计算最多n个d序列的Steiner点的时间。 (用于一组序列Y的Steiner点是序列P,它使从P到Y中每个序列的编辑距离之和最小。时间T(d,n)为O(d2(d)n(d)) ,给定大小为s的字母的O(ds(d + 1))时间预处理。)近似算法在概念上简单易实现,并且实际上适用于其中任何斯坦纳点都为固定大小的任何度量空间集合可以用多项式时间计算。我们还引入了一个新的问题,瓶颈树对齐,其目的是标记树的内部节点,以最大程度地减少最长边的长度。我们针对单位成本编辑操作的情况描述了一种指数时间精确算法,并针对一般情况显示了一种简单的线性时间近似算法,该算法在最小因数O(log k)内找到了一种解决方案。 1998由Elsevier Science B.V.出版,保留所有权利。 [参考:21]

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号