Ensemble of Binary Learners for Reliable Text Categorization with a Reject Option

Abstract

Text categorization is a key task in information retrieval and natural language processing. Attaching a reliability measure to the assignment of a text document to a particular category can improve the recognition rate and better inform the user about how much confidence to place in the output. A novel reliability measure is proposed, derived from running different binary classifiers in the Error-Correcting Output Codes (ECOC) framework. A document assigned to a particular category receives a higher reliability when the ECOC-computed distance separating that assignment from the next-ranked category is larger. This is the main idea explored in the proposed ECOC-based text classifier with a reject option. Experiments on several commonly used text categorization benchmark datasets demonstrate the potential of the proposed method.
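The idea of ranking categories by ECOC distance and rejecting low-reliability decisions can be illustrated with a minimal sketch. The abstract does not give the exact measure, so the following choices are assumptions made only for illustration: Hamming distance for decoding, reliability defined as the gap between the distances to the best and second-best codewords, and a fixed rejection threshold. The function name `ecoc_decode_with_reject` and the label `-1` for rejected documents are hypothetical.

```python
import numpy as np

def ecoc_decode_with_reject(code_matrix, bit_predictions, threshold):
    """Decode ECOC bit strings and reject low-margin decisions.

    code_matrix:     (n_classes, n_bits) array of 0/1 codewords.
    bit_predictions: (n_docs, n_bits) array of 0/1 binary-learner outputs.
    threshold:       minimum distance gap required to accept a decision
                     (assumed rule, not taken from the paper).
    Returns (labels, reliability); a label of -1 marks a rejected document.
    """
    code_matrix = np.asarray(code_matrix)
    bit_predictions = np.asarray(bit_predictions)

    # Hamming distance from every predicted bit string to every codeword.
    distances = (bit_predictions[:, None, :] != code_matrix[None, :, :]).sum(axis=2)

    # Rank categories by distance; a stable sort keeps ties in class order.
    order = np.argsort(distances, axis=1, kind="stable")
    rows = np.arange(distances.shape[0])
    best, runner_up = order[:, 0], order[:, 1]

    # Assumed reliability: how much farther the next-ranked category is.
    reliability = distances[rows, runner_up] - distances[rows, best]

    labels = np.where(reliability >= threshold, best, -1)
    return labels, reliability
```

For example, with three 6-bit codewords that are pairwise 4 bits apart, a bit string that matches one codeword exactly gets a gap of 4 and is accepted, while a string equidistant from two codewords gets a gap of 0 and is rejected. Raising the threshold trades coverage for accuracy on the documents that are kept.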