Published in: AI Communications

An approach for outlier and novelty detection for text data based on classifier confidence



Abstract

In this paper we present an approach to novelty detection in text data. The approach can also be regarded as semi-supervised anomaly detection, because it operates on a training dataset that contains labelled instances of the known classes only. A classification model is learned during the training phase; at least two known classes are assumed to exist in the available training dataset. In the testing phase, instances are classified as normal or anomalous based on the classifier's confidence: if the classifier cannot assign any of the known class labels to a given instance with sufficiently high confidence (probability), the instance is declared a novelty (anomaly). We propose two procedures to objectively measure the classifier's confidence. Experimental results show that the proposed approach is comparable to methods known in the literature.
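
The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the classifier or the two confidence procedures, so the sketch assumes a tiny multinomial Naive Bayes model, takes the highest class posterior as the confidence, and uses an arbitrary threshold of 0.8. The names `NaiveBayesText`, `classify_or_flag` and the toy training sentences are hypothetical.

```python
import math
from collections import Counter


class NaiveBayesText:
    """Tiny multinomial Naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        # Vocabulary shared by all known classes.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict_proba(self, doc):
        """Return {class: posterior probability} for a single document."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            n_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for word in doc.lower().split():
                smoothed = (self.word_counts[c][word] + 1) / (n_words + len(self.vocab))
                score += math.log(smoothed)
            log_scores[c] = score
        # Normalise in log space to avoid underflow.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: v / norm for c, v in exp_scores.items()}


def classify_or_flag(model, doc, threshold=0.8):
    """Return (label, confidence); label is "novelty" below the threshold.

    Assumption: confidence is the highest class posterior. The paper's own
    two confidence procedures are not described in the abstract.
    """
    proba = model.predict_proba(doc)
    label, confidence = max(proba.items(), key=lambda kv: kv[1])
    return (label if confidence >= threshold else "novelty"), confidence


# Toy training set with two known classes (hypothetical data).
train_docs = [
    "the team won the football match",
    "the striker scored a late goal",
    "the bank raised its interest rates",
    "the market closed with stock gains",
]
train_labels = ["sports", "sports", "finance", "finance"]
model = NaiveBayesText().fit(train_docs, train_labels)

print(classify_or_flag(model, "the team scored a goal"))
print(classify_or_flag(model, "quantum photons entangle"))
```

In this sketch a document built from words that neither known class has seen gets near-uniform posteriors, so its confidence falls below the threshold and it is flagged as a novelty. In practice the threshold would have to be tuned on held-out data.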
