Reduced alphabet motif methodology for GPCR annotation.

Gangal R; Kumar KK

Abstract

Identification and classification of G-protein coupled receptors (GPCRs) from protein sequences is an important computational challenge, given that experimental screening of thousands of ligands is expensive. There are two distinct but complementary approaches to GPCR classification: machine learning and sequence motif analysis. Machine learning methodologies typically suffer from class imbalance and a lack of multi-class classification. Many sequence motif methods, meanwhile, depend too heavily on the similarity of the primary sequence alignments. It is desirable to have a motif discovery and application methodology that is not strongly dependent on primary sequence similarity and that also overcomes these limitations of machine learning. We propose and evaluate a simple methodology that uses a reduced protein functional alphabet representation, in which functionally similar residues have similar symbols. Regular expression motifs can then be obtained by ClustalW-based multiple sequence alignment using an identity matrix. Since evolutionary matrices such as BLOSUM and PAM are not used, this method can be useful for any set of sequences, including sets that do not necessarily share a common ancestry. Reduced alphabet motifs can accurately classify known GPCR proteins, and the results are comparable to PRINTS and PROSITE. For well-known GPCR proteins from SWISSPROT, there were no false negatives and only a few false positives. This methodology covers most currently known classes of GPCRs, even those with very few representative sequences. It also predicts more than one class for certain sequences, thus overcoming this limitation of machine learning methods. We also annotated 695 orphan receptors, of which 121 were identified as belonging to Family A. A simple JavaScript-based web interface has been developed to predict GPCR families and subfamilies (www.insilico-consulting.com/gpcrmotif.html).
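
The pipeline described in the abstract (reduce the alphabet, align, read off regular expression motifs, then match) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the six-group alphabet in GROUPS is hypothetical and is not the functional alphabet used in the paper, the function names are invented, the alignment is assumed to have been produced elsewhere (for example by ClustalW with an identity matrix), and the sequences are toy strings rather than real GPCRs.

```python
from __future__ import annotations

import re

# Hypothetical grouping of the 20 amino acids by broad physicochemical class.
# The abstract does not list the paper's actual functional alphabet.
GROUPS = {
    "a": "AVLIMC",  # aliphatic / hydrophobic
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "k": "KR",      # positively charged
    "d": "DE",      # negatively charged
    "g": "GP",      # small / conformationally special
}
REDUCE = {aa: sym for sym, members in GROUPS.items() for aa in members}


def reduce_sequence(seq: str) -> str:
    """Map each residue to its group symbol; unknown residues become 'x'."""
    return "".join(REDUCE.get(aa, "x") for aa in seq.upper())


def motif_from_alignment(aligned: list[str]) -> str:
    """Build a regex from equal-length, already aligned reduced sequences.

    Fully conserved columns are kept as literals, columns containing a
    gap ('-') become an optional wildcard, and all other columns become '.'.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    parts = []
    for column in zip(*aligned):
        symbols = set(column)
        if "-" in symbols:
            parts.append(".?")
        elif len(symbols) == 1:
            parts.append(column[0])
        else:
            parts.append(".")
    return "".join(parts)


def classify(seq: str, motifs: dict[str, str]) -> list[str]:
    """Return every class whose motif occurs in the reduced sequence."""
    reduced = reduce_sequence(seq)
    return [name for name, pat in motifs.items() if re.search(pat, reduced)]


# Toy usage: two short "aligned" fragments define one motif.
aligned = [reduce_sequence("DRYLAV"), reduce_sequence("ERYMSI")]
toy_motifs = {
    "toy_A": motif_from_alignment(aligned),  # motif derived from the alignment
    "toy_B": "gg$",                          # hand-written motif, also hypothetical
}
print(toy_motifs["toy_A"])
print(classify("MKDRYLTVGG", toy_motifs))
```

Because classify returns a list, a sequence that matches several motifs is assigned to several classes, which mirrors the multi-class behaviour the abstract mentions. Treating every non-conserved column as a wildcard is a deliberate simplification; a real motif would more likely list the permitted symbols at each position.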


Bibliographic details

Source: Journal of Biomolecular Structure and Dynamics, 2007, Issue 3, 12 pages
Authors: Gangal R; Kumar KK
Original format: PDF
Language: English
Chinese Library Classification: Molecular biology
Keywords: methodology; learning; reduced; GPCR protein


