Multiprocessing Implementation for Building a DNA q-gram Index Hash Table

Abstract

Over the past few years, next-generation sequencing has become an invaluable technology for numerous applications in the field of genomics. The success of these applications depends on the performance of each phase in the genomic sequence pipeline, which starts with read mapping. However, read mapping is computationally intensive, since it requires mapping billions of reads to numerous locations in a large reference genome. Building a q-gram index hash table has proven to be an efficient alternative that reduces the repetitive scanning of the reference during the verification step. A q-gram index hash table stores the locations of each q-gram in the reference genome. To accelerate the construction of this data structure and to exploit multi-core architectures, the work can be distributed across multiple CPU cores and executed in parallel. This paper compares the index build times of the sequential and multiprocessing implementations of three methods for building a q-gram index hash table. The implementation results show that all multiprocessing versions are faster than their sequential counterparts, with speedups ranging from 1.53 to 2.57. Although the open addressing method yields the fastest index build time, the best speedup is achieved by the minimizer-based method.
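
The idea of a q-gram index and of distributing its construction across CPU cores can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not describe their three methods in detail, so the sketch simply uses a built-in dictionary as the hash table and Python's standard `multiprocessing.Pool`. The function names (`index_chunk`, `make_tasks`, `merge_partials`, `build_index_sequential`, `build_index_parallel`) and the chunking scheme are assumptions made for illustration only.

```python
from collections import defaultdict
from multiprocessing import Pool


def index_chunk(task):
    """Index one slice of the reference (hypothetical helper).

    `task` is (chunk, offset, q); positions are reported in the
    coordinates of the full reference by adding `offset`.
    """
    chunk, offset, q = task
    local = defaultdict(list)
    for i in range(len(chunk) - q + 1):
        local[chunk[i:i + q]].append(offset + i)
    return dict(local)


def make_tasks(reference, q, workers):
    """Split the reference into slices that overlap by q - 1 characters,
    so that every q-gram start position lands in exactly one slice."""
    n_starts = len(reference) - q + 1
    if n_starts <= 0:
        return []
    step = (n_starts + workers - 1) // workers  # ceiling division
    tasks = []
    for start in range(0, n_starts, step):
        end = min(start + step, n_starts)
        tasks.append((reference[start:end + q - 1], start, q))
    return tasks


def merge_partials(partials):
    """Merge per-slice indexes; slices arrive in ascending order,
    so each position list stays sorted."""
    merged = defaultdict(list)
    for partial in partials:
        for gram, positions in partial.items():
            merged[gram].extend(positions)
    return dict(merged)


def build_index_sequential(reference, q):
    return index_chunk((reference, 0, q))


def build_index_parallel(reference, q, workers=2):
    tasks = make_tasks(reference, q, workers)
    if not tasks:
        return {}
    with Pool(workers) as pool:
        # Pool.map returns results in the same order as the tasks.
        return merge_partials(pool.map(index_chunk, tasks))


if __name__ == "__main__":
    index = build_index_parallel("ACGTACGTTACG", q=3, workers=2)
    print(index["ACG"])  # [0, 4, 9]
```

In this sketch each worker builds a private index and the parent process merges the results, which avoids any locking on a shared table at the cost of a merge step. Whether that trade-off matches the approach taken in the paper is not stated in the abstract.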


Bibliographic details

Source
International Conference on Computational Science and Technology, 2020, pp. 179-191 (13 pages)
Authors
Candace Claire Mercado; Aaron Russell Fajardo; Saira Kaye Manalili; Raphael Zapanta; Roger Luis Uy
Original format: PDF
Keywords
Bioinformatics; Read mapping; Index-based hash table

