Journal of Intelligent Information Systems

Optimizing text classification through efficient feature selection based on quality metric

Abstract

Feature maximization is a cluster quality metric that favors clusters with maximum feature representation with regard to their associated data. In this paper we show that a simple adaptation of this metric can provide a highly efficient feature selection and feature contrasting model in the context of supervised classification. The method is evaluated on different types of textual datasets. The paper shows that the proposed method provides a very significant performance increase over state-of-the-art methods in all the studied cases, even when a single bag-of-words model is used for data description. Interestingly, the most significant performance gain is obtained when classifying highly unbalanced, high-dimensional, and noisy data with a high degree of similarity between the classes.
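
The abstract does not give the formulas behind the metric, so the following is only an illustrative sketch of how a feature-maximization-style selection step might look, not the authors' implementation. It assumes the commonly described formulation in which each feature gets a per-class feature recall and feature precision, combined into a feature F-measure; a feature is kept for a class when its F-measure there exceeds both that feature's average over classes and the global average, and the ratio to the feature's average serves as a contrast weight. The function names (`feature_f_measure`, `select_features`), the toy data, and the choice to average over all classes are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def feature_f_measure(
    X: Sequence[Sequence[float]], y: Sequence[str]
) -> Dict[str, List[float]]:
    """Per-class feature F-measure for a bag-of-words matrix X (non-negative weights)."""
    n_features = len(X[0])
    labels = sorted(set(y))
    # class_sum[c][f]: total weight of feature f over the documents of class c
    class_sum = {c: [0.0] * n_features for c in labels}
    for row, c in zip(X, y):
        for f, w in enumerate(row):
            class_sum[c][f] += w
    # Total weight of each feature over all classes
    feature_total = [sum(class_sum[c][f] for c in labels) for f in range(n_features)]

    ff: Dict[str, List[float]] = {}
    for c in labels:
        class_total = sum(class_sum[c])  # total weight of all features in class c
        scores = []
        for f in range(n_features):
            s = class_sum[c][f]
            recall = s / feature_total[f] if feature_total[f] else 0.0
            precision = s / class_total if class_total else 0.0
            denom = recall + precision
            scores.append(2 * recall * precision / denom if denom else 0.0)
        ff[c] = scores
    return ff


def select_features(ff: Dict[str, List[float]]) -> Dict[str, Dict[int, float]]:
    """Return, per class, the selected feature indices mapped to a contrast weight."""
    labels = list(ff)
    n_features = len(ff[labels[0]])
    # Assumption: average over all classes (a simplification chosen for this sketch)
    per_feature_mean = [
        sum(ff[c][f] for c in labels) / len(labels) for f in range(n_features)
    ]
    overall_mean = sum(per_feature_mean) / n_features
    selected: Dict[str, Dict[int, float]] = {}
    for c in labels:
        selected[c] = {
            f: ff[c][f] / per_feature_mean[f]  # contrast of feature f in class c
            for f in range(n_features)
            if ff[c][f] > per_feature_mean[f] and ff[c][f] > overall_mean
        }
    return selected


# Toy data (hypothetical): 4 documents, 3 features, 2 classes
X = [[3, 0, 1], [2, 0, 1], [0, 4, 1], [0, 3, 1]]
y = ["a", "a", "b", "b"]
selected = select_features(feature_f_measure(X, y))
print(selected)
```

On this toy input, feature 0 is kept for class `a` and feature 1 for class `b`, while feature 2, which is spread evenly across both classes, is dropped. The selection needs no threshold or target feature count, which is one reason such a scheme could suit the unbalanced data the abstract mentions.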