Breaking the Closed World Assumption in Text Classification


Abstract

Existing research on multiclass text classification mostly makes the closed world assumption, which focuses on designing accurate classifiers under the assumption that all test classes are known at training time. A more realistic scenario is to expect unseen classes during testing (open world). In this case, the goal is to design a learning system that classifies documents of the known classes into their respective classes and also rejects documents from unknown classes. This problem is called open (world) classification. This paper approaches the problem by reducing the open space risk while balancing the empirical risk. It proposes to use a new learning strategy, called center-based similarity (CBS) space learning (or CBS learning), to provide a novel solution to the problem. Extensive experiments across two datasets show that CBS learning gives promising results on multiclass open text classification compared to state-of-the-art baselines.
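
To make the open (world) classification task concrete, the following is a minimal sketch of a classifier that assigns a document to a known class or rejects it as unknown. It is not the CBS learning method proposed in the paper: it swaps in a plain nearest-center rule with a cosine-similarity cutoff, purely for illustration. The function names (`fit_centers`, `predict_open`), the toy documents, and the `threshold` value of 0.5 are all assumptions.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit_centers(docs, labels):
    """Sum the term counts of each known class into one center vector."""
    centers = {}
    for text, label in zip(docs, labels):
        centers.setdefault(label, Counter()).update(vectorize(text))
    return centers


def predict_open(text, centers, threshold=0.5):
    """Return the most similar known class, or None to reject as unknown.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    vec = vectorize(text)
    best_label, best_sim = None, 0.0
    for label, center in centers.items():
        sim = cosine(vec, center)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label if best_sim >= threshold else None


# Toy training set with two known classes (hypothetical data).
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the senate passed the budget bill",
    "voters elected a new president",
]
train_labels = ["sports", "sports", "politics", "politics"]
centers = fit_centers(train_docs, train_labels)

print(predict_open("the striker scored in the match", centers))  # sports
print(predict_open("fresh pasta with tomato sauce", centers))    # None
```

A closed world classifier would be forced to label the cooking document as either sports or politics; the similarity cutoff is what lets this sketch reject it instead.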


Bibliographic details

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2016 | pp. 506-514 | 9 pages
Authors
Geli Fei; Bing Liu
Original format: PDF

