Journal: Molecules

ProLanGO: Protein Function Prediction Using Neural Machine Translation Based on a Recurrent Neural Network



Abstract

With the development of next-generation sequencing techniques, it is fast and cheap to determine protein sequences, but relatively slow and expensive to extract useful information from them because of the limitations of traditional biological experimental techniques. Protein function prediction has been a long-standing challenge in filling the gap between the huge number of protein sequences and the known functions. In this paper, we propose a novel method that converts the protein function prediction problem into a language translation problem: we introduce a protein sequence language, “ProLan”, and a protein function language, “GOLan”, and build a neural machine translation model based on recurrent neural networks to translate “ProLan” into “GOLan”. We blindly tested our method by participating in the third Critical Assessment of Function Annotation (CAFA 3) in 2016, and also evaluated its performance on selected proteins whose functions were released after the CAFA competition. The good performance on the training and testing datasets demonstrates that the proposed method is a promising direction for protein function prediction. In summary, we propose, for the first time, a method that converts the protein function prediction problem into a language translation problem and applies a neural machine translation model to protein function prediction.
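To make the translation framing concrete, the sketch below shows one way a protein and its annotations could be turned into a source and target "sentence" pair for a sequence-to-sequence model. The abstract does not specify how “ProLan” or “GOLan” words are built, so everything here is an assumption for illustration: fixed-length, non-overlapping k-mers as source words, Gene Ontology identifiers with the colon removed as target words, and the hypothetical function names `protein_to_sentence`, `go_terms_to_sentence`, and `make_pair`.

```python
from typing import List, Tuple


def protein_to_sentence(sequence: str, k: int = 3) -> str:
    """Split a protein sequence into non-overlapping k-mer 'words'.

    Assumption: fixed-length k-mers; the abstract does not state the
    actual ProLan vocabulary.
    """
    seq = sequence.strip().upper()
    words = [seq[i:i + k] for i in range(0, len(seq), k)]
    return " ".join(words)


def go_terms_to_sentence(go_ids: List[str]) -> str:
    """Turn a list of GO identifiers into a space-separated target sentence.

    Assumption: each GO term becomes one token with the colon removed;
    the abstract does not state the actual GOLan encoding.
    """
    return " ".join(sorted(g.replace(":", "") for g in go_ids))


def make_pair(sequence: str, go_ids: List[str], k: int = 3) -> Tuple[str, str]:
    """Build a (source, target) sentence pair for a translation model."""
    return protein_to_sentence(sequence, k), go_terms_to_sentence(go_ids)


if __name__ == "__main__":
    # Toy input, not taken from the paper's datasets.
    src, tgt = make_pair("MKTAYIAK", ["GO:0005515", "GO:0003677"])
    print(src)
    print(tgt)
```

Pairs of this shape are what a recurrent encoder-decoder would consume: the encoder reads the source words and the decoder emits the function tokens one at a time.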
