Conference: IEEE International Conference on Artificial Intelligence and Computer Applications

An Ensemble Classification Algorithm for Imbalanced Text Data Streams


Abstract

This paper proposes an ensemble classification algorithm for unbalanced text data streams. First, an improved resampling method is used to build balanced data subsets. Second, a topic model is applied to the balanced subsets to produce document-topic training subsets. Finally, an ensemble classifier is constructed using the WE ensemble model. The criterion for updating the classifier is the difference in F-value between neighboring data blocks, compared against a set threshold. When the ensemble classifier is updated, misclassified positive instances are added to an error set and the base classifiers are then retrained. Experimental results show that the proposed algorithm achieves good classification performance both on the positive instances and on all instances, making it an effective classification algorithm for unbalanced text data streams.
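The update criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not say how the F-value difference is compared with the threshold, so the sketch assumes an absolute difference that must exceed it, and it reads "error positive instance" as a positive instance the classifier got wrong. The function names `f_value`, `needs_update`, and `misclassified_positives`, and the threshold value in the usage lines, are hypothetical.

```python
from typing import List, Sequence


def f_value(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F-value (F1) of the positive class on one data block; 0.0 when undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def needs_update(prev_f: float, curr_f: float, threshold: float) -> bool:
    """Assumed rule: update when neighboring blocks differ by more than the threshold."""
    return abs(curr_f - prev_f) > threshold


def misclassified_positives(
    docs: Sequence[str], y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> List[str]:
    """Collect positive documents that were predicted as non-positive (assumed error set)."""
    return [d for d, t, p in zip(docs, y_true, y_pred) if t == positive and p != positive]


if __name__ == "__main__":
    # Toy labels for two neighboring data blocks (1 = positive/minority class).
    prev_f = f_value([1, 1, 0, 0], [1, 1, 0, 0])
    curr_f = f_value([1, 1, 0, 0], [1, 0, 0, 0])
    if needs_update(prev_f, curr_f, threshold=0.1):  # threshold is a hypothetical value
        error_set = misclassified_positives(
            ["doc a", "doc b", "doc c", "doc d"], [1, 1, 0, 0], [1, 0, 0, 0]
        )
        print("retrain base classifiers with error set:", error_set)
```

In a full pipeline the error set would be merged into the training data before the base classifiers are retrained. The resampling, topic modeling, and WE ensemble steps are left out because the abstract does not specify them in enough detail to sketch.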
