Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics

PRBP: Prediction of RNA-Binding Proteins Using a Random Forest Algorithm Combined with an RNA-Binding Residue Predictor



Abstract

The prediction of RNA-binding proteins is a challenging problem in computational biology. Although great progress has been made using various machine learning approaches with numerous features, the problem is still far from solved. In this study, we attempt to predict RNA-binding proteins directly from amino acid sequences. A novel approach, PRBP, predicts RNA-binding proteins using information from predicted RNA-binding residues in conjunction with a random forest based method. For a given protein, we first predict its RNA-binding residues and then judge whether the protein binds RNA based on information from that prediction. If the protein cannot be identified from the information associated with its predicted RNA-binding residues, a novel random forest predictor is used to determine whether the query protein is an RNA-binding protein. To build the random forest predictor, we combined features of evolutionary information combined with physicochemical features (EIPP) with amino acid composition features. Feature analysis showed that EIPP contributed the most to the prediction of RNA-binding proteins. The results also showed that the information from the RNA-binding residue prediction improved the overall performance of our RNA-binding protein prediction. It is anticipated that the PRBP method will become a useful tool for identifying RNA-binding proteins. A PRBP Web server implementation is freely available at http://www.cbi.seu.edu.cn/PRBP/.
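
The two-stage decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the residue-level prediction is turned into a protein-level call, so the rule used here (the fraction of predicted binding residues compared against two cutoffs, `low` and `high`) and the cutoff values are assumptions. The names `predict_rbp`, `residue_predictor`, and `fallback_classifier` are hypothetical. The fallback stands in for the random forest and receives only amino acid composition; the EIPP features are omitted because the abstract does not describe how they are computed.

```python
from typing import Callable, List, Sequence, Tuple

# The 20 standard amino acids, in a fixed order used for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in the sequence."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [c / total for c in counts]


def predict_rbp(
    sequence: str,
    residue_predictor: Callable[[str], Sequence[bool]],
    fallback_classifier: Callable[[List[float]], bool],
    low: float = 0.05,   # assumed cutoff, not taken from the paper
    high: float = 0.30,  # assumed cutoff, not taken from the paper
) -> Tuple[bool, str]:
    """Return (is_rna_binding, stage_that_decided)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    flags = residue_predictor(sequence)
    if len(flags) != len(sequence):
        raise ValueError("residue predictor must return one flag per residue")

    # Stage 1: decide from the predicted RNA-binding residues when the
    # signal is clear (assumed rule: fraction of flagged residues).
    fraction = sum(1 for f in flags if f) / len(flags)
    if fraction >= high:
        return True, "residue"
    if fraction <= low:
        return False, "residue"

    # Stage 2: ambiguous case, defer to the sequence-level classifier
    # (a random forest in the paper; any callable here).
    return fallback_classifier(amino_acid_composition(sequence)), "fallback"


# Toy stand-ins for illustration only; real predictors would be trained models.
def toy_residue_predictor(sequence: str) -> List[bool]:
    return [aa in "KR" for aa in sequence.upper()]


def toy_fallback(composition: List[float]) -> bool:
    k = composition[AMINO_ACIDS.index("K")]
    r = composition[AMINO_ACIDS.index("R")]
    return (k + r) > 0.08


if __name__ == "__main__":
    for seq in ("KKKRRRAAAA", "AAAAAAAAAA", "KAAAAAAAAA"):
        print(seq, predict_rbp(seq, toy_residue_predictor, toy_fallback))
```

Passing the two predictors in as callables keeps the control flow separate from the models, so a trained residue predictor and a trained random forest could be substituted without changing the decision logic.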
