2012 International Conference on Future Communication and Computer Technology

Classification of Splice Junction DNA sequence through Data mining techniques

Abstract

Data mining on DNA sequences is gaining importance in current research as researchers and clinicians place more emphasis on detecting genetic markers for disease prediction and on developing new drugs for therapeutic purposes. This paper highlights the role of machine learning algorithms in classifying a given splice gene sequence into three classes (Intron-Exon, Exon-Intron, Neither), which distinguish the DNA needed for protein creation from the superfluous DNA that is removed during protein generation. The work runs nine classification algorithms on the splice junctions of 3190 DNA sequences taken from the KEEL data repository, each 60 nucleotides long, to detect the boundaries between introns and exons; this in turn aids the analysis of genetic markers and the understanding of the mechanism of protein synthesis. Quinlan's C4.5 algorithm and the Random Tree classification algorithm achieve 99.97% classification accuracy on this dataset. The validity of the results has been verified by classifying test data sets with the resulting classification framework. This work will enable accurate prediction of splice junctions in a DNA sequence whose class label is unknown.
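
To make the setup in the abstract concrete, the following is a minimal Python sketch of how a 60-nucleotide window might be presented to a classifier as 60 nominal attributes (one per position), and how classifier accuracy is computed. The function names `to_attributes` and `accuracy`, the restriction to the bases A, C, G and T, and the exact label strings are assumptions made for illustration; the abstract does not describe the authors' preprocessing or tooling, and this is not their implementation.

```python
from typing import List, Sequence

SEQ_LEN = 60  # window length stated in the abstract
CLASSES = ("Intron-Exon", "Exon-Intron", "Neither")  # the three classes named in the abstract
ALPHABET = set("ACGT")  # assumption: only unambiguous bases are accepted


def to_attributes(seq: str) -> List[str]:
    """Turn one DNA window into 60 nominal attribute values, one per position."""
    seq = seq.strip().upper()
    if len(seq) != SEQ_LEN:
        raise ValueError(f"expected {SEQ_LEN} nucleotides, got {len(seq)}")
    unexpected = set(seq) - ALPHABET
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return list(seq)


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of sequences whose predicted class equals the true class."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)
```

A decision-tree learner such as C4.5 would then split on individual positions, for example on the nucleotide at a given offset from the candidate junction; that learning step is not shown here.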