Conference: International Conference on Bioinformatics and Computational Biology

Bayesian Approach for Identifying Short Adjacent Repeats in Multiple Sequences

Abstract

For the problem of identifying short adjacent repetitive patterns in long biological sequences, such as tandem repeats in DNA sequences, traditional methods have relied largely on the periodicity of a short segment within a single long sequence. In this paper, we introduce a fully probabilistic generative model and a binary vector data structure to formulate this problem. Our model allows intra-unit mismatches and inter-unit insertions, and it can identify a repetitive pattern shared across multiple input sequences. A Bayesian approach is used to fit the model de novo, and a collapsing technique improves computational efficiency. Experiments on both synthetic and real data demonstrate the effectiveness of the proposed MCMC algorithm.
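
To make the problem setting concrete, the following is a minimal sketch of how synthetic data of this kind might be generated. It is not the paper's model: the function name `generate_sequence`, its parameters, the uniform background, and the reading of the binary vector as "1 marks the start of a repeat unit" are all assumptions made for illustration only.

```python
import random

ALPHABET = "ACGT"

def generate_sequence(motif, n_units, length, mismatch_prob=0.1, max_gap=2, rng=None):
    """Hypothetical generator: returns (sequence, indicator).

    indicator[i] == 1 iff a repeat unit starts at position i
    (an assumed interpretation of a binary vector representation).
    """
    rng = rng or random.Random()
    segment = []  # noisy adjacent copies of the motif
    starts = []   # offsets of unit starts within the segment
    for k in range(n_units):
        starts.append(len(segment))
        for base in motif:
            if rng.random() < mismatch_prob:
                # intra-unit mismatch: substitute a different base
                base = rng.choice([b for b in ALPHABET if b != base])
            segment.append(base)
        if k < n_units - 1:
            # inter-unit insertion: 0..max_gap random bases between units
            gap = rng.randint(0, max_gap)
            segment.extend(rng.choice(ALPHABET) for _ in range(gap))
    if len(segment) > length:
        raise ValueError("length too small for the repeat segment")
    # Embed the repeat segment at a random offset in a uniform background.
    offset = rng.randint(0, length - len(segment))
    seq = [rng.choice(ALPHABET) for _ in range(length)]
    seq[offset:offset + len(segment)] = segment
    indicator = [0] * length
    for s in starts:
        indicator[offset + s] = 1
    return "".join(seq), indicator
```

Calling `generate_sequence("ACGTA", 4, 60)` several times would yield multiple sequences sharing one noisy repeated pattern, which is the kind of input the abstract describes. An inference procedure would then aim to recover both the motif and the indicator vectors from the sequences alone.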