Mining of protein-protein interfacial residues from massive protein sequential and spatial data

Debby D. Wang; Weiqiang Zhou; Hong Yan

Abstract

Processing big data is a major challenge in bioinformatics. In this paper, we address the problem of identifying protein-protein interfacial residues from massive protein structural data. We analyzed a protein set comprising 154,993 residues. We applied three-dimensional alpha shape modeling to search for surface and interfacial residues in this set, and used spatially neighboring residue profiles to characterize each residue. These residue profiles, which capture the sequential and spatial information of the proteins, translate the original data into a large matrix. After refining this matrix vertically and horizontally, we implemented and compared a series of popular learning procedures, namely neuro-fuzzy classifiers (NFCs), CART, neighborhood classifiers (NECs), extreme learning machines (ELMs) and naive Bayesian classifiers (NBCs), to predict the interfacial residues, with the aim of investigating how sensitive these massive structural data are to different learning mechanisms. ELMs, CART and NFCs performed better in terms of computational cost, while NFCs, NBCs and ELMs provided favorable prediction accuracies. Overall, NFCs, NBCs and ELMs are favorable choices for handling this type of data quickly and accurately. More importantly, the marginal differences between the prediction performances of these methods imply that this type of data is insensitive to the choice of learning mechanism.
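The abstract does not say how the spatially neighboring residue profiles are assembled into a matrix, so the following is only an illustrative sketch under stated assumptions: each residue is represented by one 3D coordinate, carries a small per-residue feature vector, and is described by its own features followed by those of its k nearest spatial neighbors. The function name `neighbor_profiles`, the parameter `k` and the toy data are hypothetical and are not taken from the paper.

```python
import math

def neighbor_profiles(coords, features, k=2):
    """Return one row per residue: its own features followed by the
    features of its k nearest spatial neighbors (closest first).
    Assumed construction for illustration, not the paper's exact method."""
    rows = []
    for i, ci in enumerate(coords):
        # Euclidean distance from residue i to every other residue.
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(coords) if j != i
        )
        row = list(features[i])
        for _, j in dists[:k]:
            row.extend(features[j])
        rows.append(row)
    return rows

# Toy data (hypothetical): four residues on a line, one feature value each.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
features = [[0.1], [0.2], [0.3], [0.4]]
matrix = neighbor_profiles(coords, features, k=2)
```

Each row of `matrix` then has a fixed length of (k + 1) times the per-residue feature length, which is what lets the profiles be stacked into a single large matrix for the classifiers. The brute-force neighbor search is quadratic in the number of residues; for a set of the size described in the abstract, a spatial index would be the more realistic choice.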

Bibliographic details

Source: Fuzzy Sets and Systems, 2015, Issue 1, pp. 101-116 (16 pages)
Authors: Debby D. Wang; Weiqiang Zhou; Hong Yan
Author affiliations: Department of Electronic Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong (listed for each of the three authors)
Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords: Protein-protein interface prediction; 3D alpha shape modeling; Residue sequence profile; Joint mutual information (JMI); Neuro-fuzzy classifiers (NFCs); Neighborhood classifiers (NECs); CART; Extreme learning machines (ELMs); Naive Bayesian classifiers (NBCs)
