Conference: IEEE International Conference on Bioinformatics and Biomedicine

TexAnASD: Text Analytics for ASD Risk Gene Predictions

Abstract

Autism Spectrum Disorder (ASD) is an extreme neurodevelopmental disease affecting 1 in every 59 children in the United States and approximately 1% of the US population. The clinical traits of the disorder include noticeable deficits in social interaction and language development and, in many cases, very narrow and repetitive interests and behaviors. ASD is a highly heritable genetic disease, but the known causes, including the biomarkers associated with ASD, form only the tip of the iceberg. Over the past decade, extensive research on exome sequences has revealed only about one hundred genetic biomarkers with very high confidence. The number of putative ASD-causing genes is growing rapidly with the advent of new technologies, while researchers struggle to assess which genes are truly causal. Manually curating each gene in this long list is a cumbersome process that requires a huge number of expert work-hours and is therefore expensive. An in silico prediction method can help human experts by letting them check only a short list of genes that have been filtered through a machine learning system. Most existing ASD gene prediction algorithms require a high-performance computing platform to analyze large-scale genetic data, which undercuts the cost benefit of using an in silico method in the first place. We propose TexAnASD, a text-analytics-based ASD gene prediction algorithm that uses only what is known about each gene from the published literature. The proposed method outperforms most state-of-the-art ASD-associated gene prediction methods. Moreover, the model is less expensive than competing solutions in terms of computational complexity and running time. All source code, datasets, predictions, and functional insights are available at http://ml.cse.ucdenver.edu/research/TexAnASD.
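
The abstract does not say which text features or classifier TexAnASD uses, so the following is only a minimal sketch of the general idea of ranking candidate genes from literature text, not the authors' method. It assumes TF-IDF features and a simple centroid-similarity score; the gene names (`GENE_A`, `GENE_X`, and so on), the text snippets, and all function names are hypothetical placeholders, not real data.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; keeps purely alphabetic tokens only.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: tf-idf weight} dict."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokens.values() for t in set(toks))
    vectors = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors[name] = {
            t: (c / total) * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        }
    return vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    # Average of a list of sparse vectors.
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {t: w / len(vectors) for t, w in total.items()}


# Hypothetical toy corpus: invented text standing in for literature per gene.
known_risk = {
    "GENE_A": "synaptic signaling neuron development cortex",
    "GENE_B": "chromatin remodeling neuron synaptic plasticity",
}
known_non_risk = {
    "GENE_C": "muscle contraction fiber metabolism",
    "GENE_D": "liver enzyme lipid metabolism",
}
candidates = {
    "GENE_X": "synaptic neuron cortex signaling",
    "GENE_Y": "lipid metabolism muscle enzyme",
}

# Fit TF-IDF on all documents so every gene shares one vocabulary.
vecs = tfidf_vectors({**known_risk, **known_non_risk, **candidates})
risk_centroid = centroid([vecs[g] for g in known_risk])
non_risk_centroid = centroid([vecs[g] for g in known_non_risk])

# Score = similarity to risk genes minus similarity to non-risk genes.
scores = {
    g: cosine(vecs[g], risk_centroid) - cosine(vecs[g], non_risk_centroid)
    for g in candidates
}
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

A ranking of this kind is what would let curators review only the top-scoring short list. It also runs on an ordinary laptop, in keeping with the abstract's point about avoiding high-performance computing.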