Journal of Theoretical Biology

Predicting membrane protein types by incorporating protein topology, domains, signal peptides, and physicochemical properties into the general form of Chou's pseudo amino acid composition

Abstract

The type of an un-annotated membrane protein provides an important hint about its biological function. Experimental determination of membrane protein types, although more accurate and reliable, is not always feasible because of costly laboratory procedures, which creates a need for bioinformatics methods. This article describes a novel computational classifier that predicts membrane protein types from protein sequences. The classifier, comprising a collection of one-versus-one support vector machines, makes use of the following sequence attributes: (1) the cationic patch sizes, the orientation, and the topology of transmembrane segments; (2) the amino acid physicochemical properties; (3) the presence of signal peptides or anchors; and (4) the specific protein motifs. A new voting scheme was implemented to cope with the multi-class prediction. Both the training and the testing sequences were collected from SwissProt. Homologous proteins were removed so that no pair of sequences remaining in the datasets shares a sequence identity higher than 40%. The performance of the classifier was evaluated by a jackknife cross-validation and an independent testing experiment. Results show that the proposed classifier outperforms earlier predictors in prediction accuracy in seven of the eight membrane protein types. The overall accuracy increased from 78.3% to 88.2%. Unlike earlier approaches, which largely depend on position-specific substitution matrices and amino acid compositions, most of the sequence attributes implemented in the proposed classifier are supported by evidence in the literature. The classifier has been deployed as a web server and can be accessed at http://bsaltools.ym.edu.tw/predmpt.
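The combination of one-versus-one classification, voting, and jackknife (leave-one-out) evaluation described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: each pairwise model is a simple nearest-centroid rule standing in for a support vector machine, the vote is a plain majority vote rather than the authors' new voting scheme (which the abstract does not specify), and the feature vectors, class labels, and function names are invented for illustration.

```python
from itertools import combinations
from collections import Counter


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(X, y):
    """Fit one binary model per unordered pair of classes (one-versus-one).

    Assumption: each model is just the two class centroids, a stand-in
    for the SVMs named in the abstract.
    """
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    models = {}
    for a, b in combinations(sorted(by_class), 2):
        models[(a, b)] = (centroid(by_class[a]), centroid(by_class[b]))
    return models


def predict(models, x):
    """Plain majority vote over all pairwise decisions (not the paper's scheme)."""
    votes = Counter()
    for (a, b), (ca, cb) in models.items():
        votes[a if sq_dist(x, ca) <= sq_dist(x, cb) else b] += 1
    # Ties are broken by label order, a hypothetical choice for this sketch.
    best = max(votes.values())
    return min(label for label, count in votes.items() if count == best)


def jackknife_accuracy(X, y):
    """Leave-one-out: hold out each sample once and train on the rest."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        if predict(train_pairwise(X_train, y_train), X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy, made-up 2-D feature vectors for three hypothetical classes.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
     [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
     [0.0, 5.0], [0.2, 5.1], [0.1, 4.8]]
y = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

models = train_pairwise(X, y)
print(predict(models, [5.1, 5.0]))
print(jackknife_accuracy(X, y))
```

With k classes this trains k(k-1)/2 pairwise models, so the eight membrane protein types mentioned in the abstract would correspond to 28 binary classifiers under a one-versus-one design.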
