Multi-label text classification based on the label correlation mixture model

He Zhiyang; Wu Ji; Lv Ping


Abstract

In this paper, we propose a probabilistic generative model, the label correlation mixture model (LCMM), to describe multi-labeled document data; the model can be used for multi-label text classification. LCMM assumes two stochastic generative processes, which correspond to two submodels: 1) a label correlation model; and 2) a label mixture model. The first submodel formulates the generative process of labels, in which a label correlation network is created to capture the dependencies between labels. For this submodel, we present an efficient inference algorithm for calculating the generative probability of a multi-label class, and, in order to optimize the label correlation network, we propose a parameter-learning algorithm based on gradient descent. The second submodel describes the generative process of the words in a document given its labels. Various conventional mixture models can be adopted in this generative process, such as a mixture of language models or topic models. In the multi-label classification stage, we propose a two-step strategy, based on the framework of Bayes decision theory, to make the most efficient use of the LCMM. We conduct extensive multi-label classification experiments on three standard text data sets. The experimental results show significant performance improvements compared with existing approaches. For example, the improvements in accuracy and macro F-score on the OHSUMED data set reach 28.3% and 37.0%, respectively. These performance gains demonstrate the effectiveness of the proposed models and solutions.
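
The general idea of combining a label-set prior with a mixture of per-label language models under a Bayes decision rule can be illustrated with a minimal sketch. This is not the authors' LCMM: the abstract does not specify the label correlation network, the inference algorithm, or the two-step strategy, so the sketch substitutes a plain prior table over label sets (`LABEL_SET_PRIOR`) and a uniform mixture of unigram models (`WORD_PROBS`). All names and probability values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy parameters; none of these values come from the paper.
# Per-label unigram language models P(word | label) over a closed vocabulary.
WORD_PROBS = {
    "sports":  {"goal": 0.5, "team": 0.3, "market": 0.1, "stock": 0.1},
    "finance": {"goal": 0.1, "team": 0.1, "market": 0.4, "stock": 0.4},
}

# Assumed stand-in for a label correlation model: a prior table over label sets.
LABEL_SET_PRIOR = {
    ("sports",): 0.4,
    ("finance",): 0.4,
    ("finance", "sports"): 0.2,
}


def log_likelihood(words, labels):
    """log P(words | labels) under a uniform mixture of the labels' unigram models."""
    total = 0.0
    for w in words:
        # Each word is drawn from one of the document's labels, chosen uniformly.
        p = sum(WORD_PROBS[label][w] for label in labels) / len(labels)
        total += math.log(p)
    return total


def classify(words):
    """Return the label set maximizing log P(labels) + log P(words | labels)."""
    best, best_score = None, -math.inf
    for labels, prior in LABEL_SET_PRIOR.items():
        score = math.log(prior) + log_likelihood(words, labels)
        if score > best_score:
            best, best_score = labels, score
    return best
```

With these toy numbers, a document dominated by one label's words, such as `classify(["goal", "team", "goal"])`, is assigned the single label set `("sports",)`, while a document mixing both vocabularies, such as `classify(["goal", "stock", "team", "market"])`, is assigned `("finance", "sports")`. Enumerating every label set is only feasible for very small label spaces; a larger one would need a more efficient search.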


Bibliographic details

Source
Intelligent Data Analysis | 2017, Issue 6 | pp. 1371-1392 | 22 pages
Authors
He Zhiyang; Wu Ji; Lv Ping
Author affiliations

Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China;

Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China;

Tsinghua iFlytek Joint Lab Speech Technol, Beijing, Peoples R China;

Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language: English
Keywords
Label correlation mixture model; probabilistic generative model; multi-label text classification; label correlation model; label correlation network; Bayes decision theory;


Similar literature

1. Fuzzy Modeling for Multi-Label Text Classification Supported by Classification Algorithms [J]. Beatriz Wilges, Gustavo Mateus, Silvia Nassar. Journal of computer sciences, 2016, Issue 7.
2. Fuzzy Modeling for Multi-Label Text Classification Supported by Classification Algorithms | Science Publications [J]. Beatriz Wilges, Gustavo Mateus, Renato Cislaghi. Journal of computer sciences, 2016, Issue 7.
3. History-based attention in Seq2Seq model for multi-label text classification [J]. Xiao Yaoqiang, Li Yi, Yuan Jin. Knowledge-Based Systems, 2021, Issue Jula19.
4. Label correlation mixture model for multi-label text categorization [C]. Zhiyang He, Ji Wu, Ping Lv. IEEE Workshop on Spoken Language Technology, 2014.
5. Exploiting Label Correlations for Multi-label Classification [D]. Li, Cheng-Xian. 2011.
6. ML-Net: multi-label classification of biomedical texts with deep neural networks [O]. Jingcheng Du, Qingyu Chen, Yifan Peng. 2019.
7. New Multi-Label Correlation-Based Feature Selection Methods for Multi-Label Classification and Application in Bioinformatics [O]. Jungjit Suwimol. 2016.
