A New Progressive Algorithm for a Multiple Longest Common Subsequences Problem and Its Efficient Parallelization

Yang Jiaoyun; Xu Yun; Sun Guangzhong; Shang Yi


Abstract

The multiple longest common subsequence (MLCS) problem, which is related to the measurement of sequence similarity, is a fundamental problem in many fields. Because the problem is NP-hard, finding a good approximate solution within a reasonable time is important for solving large instances in practice. In this paper, we present a new progressive algorithm, Pro-MLCS, based on the dominant point approach. Pro-MLCS finds an approximate solution quickly and then progressively generates better solutions until it obtains the optimal one. Pro-MLCS employs three new techniques: 1) a new heuristic function for prioritizing candidate points; 2) a novel $d$-index-tree data structure for efficient computation of dominant points; and 3) a new pruning method using an upper bound function and approximate solutions. Experimental results show that Pro-MLCS can obtain the first approximate solution almost instantly and needs only a very small fraction, e.g., 3 percent, of the entire running time to get the optimal solution. Compared to existing state-of-the-art algorithms, Pro-MLCS can find better solutions in much shorter time, one to two orders of magnitude faster. In addition, two parallel versions of Pro-MLCS are developed: DPro-MLCS for distributed memory architecture and DSDPro-MLCS for hierarchical distributed shared memory architecture. Both parallel algorithms can efficiently utilize parallel computing resources and achieve nearly linear speedups. They also have a desirable progressiveness property: given more hardware resources, they find better solutions in shorter time.
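The progressive idea in the abstract (report a quick approximate solution, then keep improving it while an upper bound prunes hopeless candidates) can be illustrated with a minimal Python sketch. This is not the authors' Pro-MLCS: it uses a plain depth-first branch-and-bound over match points, a simple character-count upper bound, and no $d$-index-tree or dominant-point filtering. The names `progressive_mlcs`, `successors`, and `upper_bound` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, Iterator, List, Tuple

Point = Tuple[int, ...]  # one index per sequence; -1 means "before the start"


def successors(seqs: List[str], alphabet: List[str], point: Point) -> List[Tuple[str, Point]]:
    """Match points reachable by appending one more common character after `point`."""
    result = []
    for ch in alphabet:
        nxt = tuple(s.find(ch, p + 1) for s, p in zip(seqs, point))
        if -1 not in nxt:  # the character must still occur in every sequence
            result.append((ch, nxt))
    return result


def upper_bound(seqs: List[str], alphabet: List[str], point: Point) -> int:
    """Upper bound on how many characters can still follow `point`.

    Assumption of this sketch: each character can be used at most as often
    as it occurs in the scarcest remaining suffix.
    """
    return sum(min(s.count(ch, p + 1) for s, p in zip(seqs, point)) for ch in alphabet)


def progressive_mlcs(seqs: List[str]) -> Iterator[str]:
    """Yield common subsequences of strictly increasing length.

    The last yielded string is a longest common subsequence of all inputs;
    nothing is yielded if the sequences share no character.
    """
    alphabet = sorted(set.intersection(*(set(s) for s in seqs)))
    best = ""
    best_level: Dict[Point, int] = {}  # longest prefix known to reach each point
    stack = [(tuple(-1 for _ in seqs), "")]
    while stack:
        point, prefix = stack.pop()
        if len(prefix) > len(best):
            best = prefix
            yield best  # a better approximate solution is available now
        # Prune: even the optimistic bound cannot beat the current best.
        if len(prefix) + upper_bound(seqs, alphabet, point) <= len(best):
            continue
        children = successors(seqs, alphabet, point)
        # Push the least promising child first so the most promising is popped first.
        children.sort(key=lambda child: upper_bound(seqs, alphabet, child[1]))
        for ch, nxt in children:
            if best_level.get(nxt, -1) >= len(prefix) + 1:
                continue  # this point was already reached by an equally long prefix
            best_level[nxt] = len(prefix) + 1
            stack.append((nxt, prefix + ch))


if __name__ == "__main__":
    for solution in progressive_mlcs(["ACGTAGC", "CAGTGAC", "GACTAGT"]):
        print(len(solution), solution)
```

Because the function is a generator, a caller can stop consuming it at any time and keep the best solution seen so far, which is the anytime behaviour the abstract describes. Running the generator to exhaustion yields an optimal solution under the assumptions above.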

Bibliographic details

Source: Parallel and Distributed Systems, IEEE Transactions on, 2013, No. 5, pp. 862-870 (9 pages)
Authors: Yang Jiaoyun; Xu Yun; Sun Guangzhong; Shang Yi
Affiliation: University of Science and Technology of China, Hefei
Original format: PDF
Language: English
Keywords: Approximation algorithms; Complexity theory; DNA; Data structures; Heuristic algorithms; Memory architecture; Parallel algorithms; Multiple longest common subsequence problem (MLCS); SMP cluster; branch-and-bound search; distributed memory architecture; progressive algorithm; skyline problem
