Conference: International Conference on Bioinformatics and Biomedicine Workshops

A modularized MapReduce framework to support RNA secondary structure prediction and analysis workflows

Abstract

Ribonucleic acid (RNA) molecules play important roles in many biological processes, including gene expression and regulation. Their secondary structures are crucial for RNA functionality, and the prediction of these secondary structures is widely studied. Previous research shows that cutting long sequences into shorter chunks, predicting the secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures tend to yield better accuracy than predicting the secondary structure from the entire RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. The RNA sequence can be cut into chunks using different cutting methods and chunk lengths. Several prediction methods, with different degrees of accuracy and computing requirements, can be used. The reconstruction of the chunk predictions into a structure for the entire sequence can rely on simply gluing the parts together or on more sophisticated merging algorithms. To allow scientists to perform a systematic analysis of the impact of these methods and parameters, we propose a modularized framework using MapReduce. The framework enables scientists to automatically and efficiently explore large parametric spaces of chunking, prediction, reconstruction, and analysis methods. This paper shows how the MapReduce framework allows scientists to gain insights about different chunking strategies easily, accurately, and efficiently.
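The chunk, predict, and reconstruct workflow described in the abstract can be illustrated with a minimal, single-process Python sketch. This is not the authors' framework: every function name here (`cut_fixed`, `predict_placeholder`, `map_phase`, `reduce_glue`, `explore`) is hypothetical, the cutting method is assumed to be non-overlapping fixed-length chunks, and the predictor is a stand-in that marks every base as unpaired in dot-bracket notation rather than a thermodynamic method. The reduce step uses the simple "gluing" reconstruction the abstract mentions.

```python
from itertools import product


def cut_fixed(seq, chunk_len):
    """Hypothetical cutting method: non-overlapping fixed-length chunks.

    Returns (start_offset, chunk) pairs so chunks can be reordered later.
    """
    return [(i, seq[i:i + chunk_len]) for i in range(0, len(seq), chunk_len)]


def predict_placeholder(chunk):
    """Stand-in predictor (assumption): every base is unpaired ('.').

    A real workflow would call a thermodynamic folding tool here.
    """
    return "." * len(chunk)


def map_phase(seq, chunk_len, predictor):
    # Map: predict each chunk independently, keyed by its start offset.
    return [(start, predictor(chunk)) for start, chunk in cut_fixed(seq, chunk_len)]


def reduce_glue(pairs):
    # Reduce: glue the chunk structures together in sequence order.
    ordered = sorted(pairs, key=lambda pair: pair[0])
    return "".join(structure for _, structure in ordered)


def explore(seq, chunk_lens, predictors):
    """Sweep every (chunk length, predictor) combination for one sequence."""
    results = {}
    for chunk_len, (name, predictor) in product(chunk_lens, predictors.items()):
        results[(chunk_len, name)] = reduce_glue(map_phase(seq, chunk_len, predictor))
    return results


if __name__ == "__main__":
    sweep = explore("GGGAAACCCUUU", [4, 5], {"placeholder": predict_placeholder})
    for params, structure in sweep.items():
        print(params, structure)
```

Because the cutting method, the predictor, and the reconstruction step are separate functions, each can be swapped without touching the others, which is the kind of modularity the abstract attributes to the framework. In an actual MapReduce deployment the map and reduce steps would run as distributed tasks rather than as local function calls.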
