Journal of Molecular Biology

PROTEIN TOPOLOGY RECOGNITION FROM SECONDARY STRUCTURE SEQUENCES - APPLICATION OF THE HIDDEN MARKOV MODELS TO THE ALPHA CLASS PROTEINS

Abstract

The three-dimensional fold of a protein is described by the organization of its secondary structure elements in 3D space, i.e. its "topology". We find that the protein topology can be recognized from the 1D sequence of secondary structure states of the residues alone. Automated recognition is facilitated by the use of hidden Markov models (HMMs) to represent topology families of proteins. Such models can be trained on the experimentally observed secondary structure sequences of family members using well-established algorithms. Here, we model various topology groups in the alpha class of proteins and identify, from a large database, those proteins having the topology described by each model. The correct topology family for observed protein secondary structure sequences could be recognized in 12 of 14 cases. When the observed secondary structure sequences are replaced with predicted sequences, recognition is still achieved in 8 of 14 cases. The success rate for observed sequences indicates that our approach will become increasingly useful as the accuracy of secondary structure prediction algorithms improves. Our study indicates that HMMs are useful for protein topology recognition even when no detectable primary amino acid sequence similarity is present. To illustrate the potential utility of our method, protein topology recognition is attempted on leptin, the obese gene product, and on the human interleukin-6 sequence, for which fold predictions have been previously published. (C) 1997 Academic Press Limited. [References: 62]
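
The recognition step described in the abstract, scoring a secondary structure sequence against one HMM per topology family and picking the best-scoring family, might be sketched as follows. This is a minimal illustration, not the authors' implementation: the two family names (`helix_rich`, `strand_rich`), the two-state architecture, and every probability below are hypothetical placeholders, and the three-letter alphabet H (helix), E (strand), C (coil) is an assumption. In the paper the parameters are trained from observed sequences rather than written by hand.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_forward(seq, model):
    """Return log P(seq | model) using the forward algorithm in log space."""
    start, trans, emit = model["start"], model["trans"], model["emit"]
    states = list(start)
    # Initialise with the first observed secondary structure symbol.
    alpha = {s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}
    # Recurse over the remaining symbols, summing over predecessor states.
    for sym in seq[1:]:
        alpha = {
            t: logsumexp([alpha[s] + math.log(trans[s][t]) for s in states])
            + math.log(emit[t][sym])
            for t in states
        }
    return logsumexp(list(alpha.values()))


# Hypothetical toy models; all numbers are illustrative assumptions.
LOOP_EMIT = {"H": 0.1, "E": 0.1, "C": 0.8}
MODELS = {
    "helix_rich": {
        "start": {"elem": 0.5, "loop": 0.5},
        "trans": {"elem": {"elem": 0.9, "loop": 0.1},
                  "loop": {"elem": 0.3, "loop": 0.7}},
        "emit": {"elem": {"H": 0.9, "E": 0.02, "C": 0.08}, "loop": LOOP_EMIT},
    },
    "strand_rich": {
        "start": {"elem": 0.5, "loop": 0.5},
        "trans": {"elem": {"elem": 0.8, "loop": 0.2},
                  "loop": {"elem": 0.3, "loop": 0.7}},
        "emit": {"elem": {"H": 0.02, "E": 0.9, "C": 0.08}, "loop": LOOP_EMIT},
    },
}


def recognize(seq, models=MODELS):
    """Return the name of the family model with the highest log-likelihood."""
    return max(models, key=lambda name: log_forward(seq, models[name]))


if __name__ == "__main__":
    query = "CCHHHHHHHHCCCHHHHHHHHHCC"  # made-up secondary structure string
    for name, model in MODELS.items():
        print(name, round(log_forward(query, model), 2))
    print("best family:", recognize(query))
```

Comparing raw log-likelihoods across models is the simplest possible decision rule. A database search of the kind the abstract mentions would more likely rank sequences by a score normalised for length or against a background model, which this sketch does not attempt.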