Journal: Journal of Biomolecular Structure and Dynamics

A novel approach for accurate identification of splice junctions based on hybrid algorithms

Abstract

The precise prediction of splice junctions as 'exon-intron' or 'intron-exon' boundaries in a given DNA sequence is an important task in bioinformatics. The main challenge is to determine the splice sites in the coding region. Owing to the intrinsic complexity and uncertainty of gene sequences, data mining methods are becoming increasingly popular for this task, and various methods based on different strategies have been developed. This article focuses on the construction of new hybrid machine learning ensembles that solve the splice junction task more effectively. A novel supervised feature reduction technique is developed using entropy-based fuzzy rough set theory optimized by a greedy hill-climbing algorithm. The average prediction accuracy achieved is above 98% with a 95% confidence interval. The performance of the proposed methods is evaluated using various metrics to establish the statistical significance of the results. The experiments are conducted using various schemes with human DNA sequence data. The results obtained are highly promising compared with state-of-the-art approaches in the literature.
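
To illustrate the general idea of supervised feature reduction by greedy hill-climbing, here is a minimal sketch. It is not the authors' method: the abstract does not give the details of the entropy-based fuzzy rough set criterion, so the sketch substitutes ordinary (crisp) conditional entropy of the class label given the selected sequence positions. The function names, the toy sequences, and the labels (`EI`, `IE`, `N`) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, labels, features):
    """Return H(label | values at the selected positions), in bits."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in features)
        groups[key][label] += 1
    n = len(rows)
    h = 0.0
    for counts in groups.values():
        total = sum(counts.values())
        for c in counts.values():
            p = c / total
            h -= (total / n) * p * log2(p)
    return h


def greedy_select(rows, labels, max_features=None, tol=1e-12):
    """Greedy hill-climbing: repeatedly add the position that lowers
    the conditional entropy the most; stop when nothing improves."""
    n_features = len(rows[0])
    limit = n_features if max_features is None else max_features
    selected = []
    best = conditional_entropy(rows, labels, selected)
    while len(selected) < limit:
        candidates = [f for f in range(n_features) if f not in selected]
        scores = {
            f: conditional_entropy(rows, labels, selected + [f])
            for f in candidates
        }
        f_best = min(scores, key=scores.get)
        if best - scores[f_best] <= tol:
            break  # local optimum: no candidate reduces the entropy
        selected.append(f_best)
        best = scores[f_best]
    return selected, best


# Hypothetical toy data: 5-nucleotide windows with made-up class labels.
rows = ["ACGTA", "TCGTC", "GAGTT", "CTAGA", "AAAGC", "TGAGG", "ACCCA", "GTTAC"]
labels = ["EI", "EI", "EI", "IE", "IE", "IE", "N", "N"]

selected, remaining_entropy = greedy_select(rows, labels)
print(selected, remaining_entropy)
```

On this toy data a single position already determines the label, so the search stops after one step. A fuzzy rough set criterion would replace `conditional_entropy` with a measure built on fuzzy similarity relations between samples; the greedy loop itself would stay the same.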
