Journal: IEEE Transactions on Knowledge and Data Engineering

Hierarchical Taxonomy-Aware and Attentional Graph Capsule RCNNs for Large-Scale Multi-Label Text Classification



Abstract

CNNs, RNNs, GCNs, and CapsNets have yielded significant insights into representation learning and are widely used in text mining tasks such as large-scale multi-label text classification. Most existing deep models for multi-label text classification consider either non-consecutive, long-distance semantics or sequential semantics, but how to account for both coherently remains largely unstudied. In addition, most existing methods treat output labels as independent medoids, ignoring the hierarchical relationships among them, which discards useful semantic information. In this paper, we propose a novel hierarchical taxonomy-aware and attentional graph capsule recurrent CNN framework for large-scale multi-label text classification. Specifically, we first model each document as a word-order-preserving graph-of-words and normalize it into a corresponding word-matrix representation that preserves non-consecutive, long-distance, and local sequential semantics. The word matrix is then fed into the proposed attentional graph capsule recurrent CNN to learn semantic features effectively. To leverage the hierarchical relations among class labels, we propose a hierarchical taxonomy embedding method to learn label representations and define a novel weighted margin loss that incorporates label-representation similarity. Extensive evaluations on three datasets show that our model significantly outperforms state-of-the-art approaches on large-scale multi-label text classification.
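The order-preserving graph-of-words construction mentioned in the abstract can be illustrated with a minimal sketch. The sliding-window co-occurrence criterion and binary edge weights below are illustrative assumptions, not the paper's exact normalization procedure:

```python
import numpy as np

def graph_of_words(tokens, window=3):
    """Build a word-order-preserving graph-of-words.

    Nodes are unique words indexed in first-occurrence order (so word
    order is preserved in the node ordering); an undirected edge links
    two words that co-occur within `window` consecutive tokens.
    Window size and 0/1 edge weights are assumptions for illustration.
    """
    vocab, index = [], {}
    for t in tokens:
        if t not in index:
            index[t] = len(vocab)
            vocab.append(t)
    n = len(vocab)
    adj = np.zeros((n, n), dtype=int)
    for i, t in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            a, b = index[t], index[tokens[j]]
            if a != b:
                adj[a, b] = adj[b, a] = 1
    return vocab, adj
```

The resulting adjacency matrix captures both local sequential links (adjacent words always fall in the same window) and non-consecutive links created when a word recurs later in the document.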
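The weighted margin loss described in the abstract can also be sketched. The exact weighting form below (penalizing score-margin violations less when the negative label's embedding is similar to the true label's) is a hypothetical instantiation, not the paper's definition:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two label embeddings."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def weighted_margin_loss(scores, positives, negatives, label_emb, margin=1.0):
    """Margin ranking loss weighted by label-embedding dissimilarity.

    For each (positive, negative) label pair, the hinge violation
    max(0, margin - (s_pos - s_neg)) is scaled by 1 - cos(e_pos, e_neg),
    so confusing two semantically close labels in the taxonomy costs
    less than confusing distant ones. Illustrative sketch only.
    """
    loss = 0.0
    for p in positives:
        for n in negatives:
            w = 1.0 - cosine(label_emb[p], label_emb[n])
            loss += w * max(0.0, margin - (scores[p] - scores[n]))
    return loss
```

With taxonomy-derived embeddings, sibling labels in the hierarchy get high cosine similarity and hence a small weight, which is one plausible way to inject hierarchical label structure into the training objective.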


