Fast alignment of large genome databases: a demonstration


Abstract

We demonstrate an efficient algorithm for aligning large genome strings. With the help of the MRS index structure, the algorithm constructs a Boolean match table for a given query string and database string. The MRS index structure occupies approximately 1-2% of the size of the database. Each entry of the match table corresponds to a query/database substring pair: it is marked True if the corresponding query substring and database substring potentially contain similar patterns, and False otherwise. The size of the match table is negligible compared to that of the database. Once the match table is computed, we build hash tables on these strings. After the hash table of one string is constructed, the marked substrings of the other string are read sequentially, and exactly matching substrings of a prespecified size are found using this hash table. We call this technique MAP (match table based pruning). Experimental results show that MAP runs up to 97 times faster than BLAST.
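The hash-table step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the match table is already available as a list of Boolean rows (the MRS index that produces it is not reproduced here), that both strings are cut into fixed-size blocks, and that a match is attributed to the block containing its start position. The names `build_kmer_table`, `find_seeds`, `block`, and `k` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_kmer_table(s, k):
    """Map every length-k substring of s to the list of its start positions."""
    table = defaultdict(list)
    for i in range(len(s) - k + 1):
        table[s[i:i + k]].append(i)
    return table


def find_seeds(query, database, match_table, block, k):
    """Return sorted (query_pos, db_pos) pairs of exact length-k matches.

    match_table[qi][dj] is True when query block qi and database block dj
    may contain similar patterns. Here the table is simply an input; in the
    paper it is derived from the MRS index (assumption: fixed-size blocks).
    """
    table = build_kmer_table(query, k)  # hash table on one string
    seeds = set()
    for dj in range(len(match_table[0])):
        # Query blocks that are marked against database block dj.
        marked = {qi for qi in range(len(match_table)) if match_table[qi][dj]}
        if not marked:
            continue  # the whole database block is pruned
        start = dj * block
        end = min(start + block, len(database) - k + 1)
        # Read the marked database block sequentially and probe the table.
        for j in range(start, end):
            for i in table.get(database[j:j + k], ()):
                if i // block in marked:
                    seeds.add((i, j))
    return sorted(seeds)
```

For example, with `query = "ACGTACGT"`, `database = "TTTTACGTGGGG"`, `block = 4`, `k = 4`, and a match table in which only the pair (query block 0, database block 1) is True, `find_seeds` returns `[(0, 4)]`; the occurrence of `ACGT` at query position 4 is skipped because its block is not marked. The pruning pays off when most entries of the table are False, since unmarked database blocks are never scanned.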


Bibliographic record

Source: Knowledge-Based Systems for Safety Critical Applications | 1994 | pp. 768-770 | 3 pages
Authors: Kahveci T.; Singh A.K.
Affiliation: Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
Original format: PDF
Language: English
CLC classification: Automation technology, computer technology

