Sparse Markov chain-based semi-supervised multi-instance multi-label method for protein function prediction

Han Chao; Chen Jian; Wu Qingyao; Mu Shuai; Min Huaqing

Abstract

Automated assignment of protein function has received considerable attention in recent years for genome-wide study. With the rapid accumulation of genome sequencing data produced by high-throughput experimental techniques, the process of manually predicting functional properties of proteins has become increasingly cumbersome. Such large genomics data sets can only be annotated computationally. However, automated assignment of functions to unknown proteins is challenging due to its inherent difficulty and complexity. Previous studies have revealed that the multi-instance multi-label (MIML) framework is effective for solving problems involving complicated objects with multiple semantic meanings. In protein function prediction, each protein object may be associated with distinct structural units and multiple functional properties, where each unit is described by an instance and each functional property is considered a class label. Thus, it is convenient and natural to tackle the protein function prediction problem by using the MIML framework. In this paper, we propose a sparse Markov chain-based semi-supervised MIML method, called Sparse-Markov. A sparse transductive probability graph is constructed to encode the affinity information of the data based on an ensemble of Hausdorff distance metrics. Our goal is to exploit the affinity between protein objects in the sparse transductive probability graph to seek a sparse steady-state probability of the Markov chain model for protein function prediction, such that two proteins are given similar functional labels if they are close to each other in terms of an ensemble Hausdorff distance in the graph. Experimental results on seven real-world organism data sets covering three biological domains show that our proposed Sparse-Markov method is able to achieve better performance than four state-of-the-art MIML learning algorithms.
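
The abstract names three ingredients: a Hausdorff distance between proteins viewed as bags of instances, a sparse probability graph built from those distances, and a steady-state probability of a Markov chain on that graph. The following is a minimal Python sketch of how such pieces could fit together, not the authors' Sparse-Markov algorithm. Everything beyond those three ideas is an assumption made for illustration: only the maximal Hausdorff distance is used (no ensemble), the graph is sparsified by keeping k nearest neighbours with a Gaussian kernel, and the steady state is approximated by a random walk with restart. The function names and the parameters `k`, `sigma`, and `alpha` are hypothetical.

```python
import math

def hausdorff(bag_a, bag_b):
    """Maximal Hausdorff distance between two bags of instance vectors."""
    def directed(src, dst):
        # Farthest any point of src lies from its nearest point in dst.
        return max(min(math.dist(x, y) for y in dst) for x in src)
    return max(directed(bag_a, bag_b), directed(bag_b, bag_a))

def transition_matrix(bags, k=2, sigma=1.0):
    """Row-stochastic matrix that keeps only each bag's k nearest neighbours.

    Assumption: Gaussian-kernel affinities and kNN sparsification; the
    abstract does not say how the sparse graph is actually built.
    """
    n = len(bags)
    rows = []
    for i in range(n):
        nearest = sorted(
            (hausdorff(bags[i], bags[j]), j) for j in range(n) if j != i
        )[:k]
        row = [0.0] * n
        for d, j in nearest:
            row[j] = math.exp(-d * d / (2.0 * sigma ** 2))
        total = sum(row)
        if total == 0.0:
            # All affinities underflowed: fall back to uniform weights.
            for _, j in nearest:
                row[j] = 1.0 / len(nearest)
            total = 1.0
        rows.append([w / total for w in row])
    return rows

def propagate(P, seed, alpha=0.85, iters=200):
    """Random walk with restart: x <- alpha * P^T x + (1 - alpha) * seed."""
    n = len(seed)
    x = list(seed)
    for _ in range(iters):
        x = [
            alpha * sum(P[j][i] * x[j] for j in range(n))
            + (1.0 - alpha) * seed[i]
            for i in range(n)
        ]
    return x

# Toy data: four "proteins", each a bag of 2-D instance vectors.
bags = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(0.1, 0.0), (0.0, 0.3)],
    [(5.0, 5.0), (5.2, 4.9)],
    [(4.9, 5.1)],
]
seed = [1.0, 0.0, 0.0, 0.0]  # only protein 0 is known to carry the label
scores = propagate(transition_matrix(bags, k=2), seed)
print(scores)
```

In this toy setting the label mass started at protein 0 flows mostly to protein 1, its close neighbour under the Hausdorff distance, which mirrors the stated goal that nearby proteins receive similar functional labels. A multi-label version would run one such propagation per class label.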

Bibliographic details

Source: Journal of Bioinformatics and Computational Biology, 2015, Issue 5, 20 pages
Authors: Han Chao; Chen Jian; Wu Qingyao; Mu Shuai; Min Huaqing
Original format: PDF
Language: English
Chinese Library Classification: Cell biology
Keywords: Protein function prediction; multi-instance multi-label learning; Markov chain; Hausdorff distance; semi-supervised learning
