An Innovative Graph-Based Approach to Advance Feature Selection from Multiple Textual Documents

Abstract

This paper introduces a novel graph-based approach to selecting features from multiple textual documents. The proposed solution makes it possible to assess the importance of a term within a whole corpus of documents by using contemporary graph theory methods, such as community detection algorithms and node centrality measures. Compared with well-established existing solutions, evaluation results show that the proposed approach increases the accuracy of most of the text classifiers employed and decreases the number of features required to achieve 'state-of-the-art' accuracy. The well-known datasets used for the experiments reported in this paper include 20Newsgroups, LingSpam, Amazon Reviews and Reuters.
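
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of ranking terms by node centrality in a corpus-wide term graph. Everything in it is an assumption made for illustration: the sliding-window co-occurrence graph, the use of plain degree centrality, the whitespace tokenizer, and the hypothetical helper names `build_cooccurrence_graph`, `degree_centrality` and `select_features`. Community detection, which the abstract also mentions, is left out here.

```python
from collections import defaultdict


def build_cooccurrence_graph(documents, window=2):
    """Build an undirected term graph as {term: {neighbor: weight}}.

    Assumption: two terms are linked when they appear within `window`
    tokens of each other in any document of the corpus.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for doc in documents:
        tokens = doc.lower().split()  # naive whitespace tokenizer
        for i, term in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != term:
                    graph[term][other] += 1
                    graph[other][term] += 1
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each term is directly linked to."""
    n = len(graph)
    if n <= 1:
        return {term: 0.0 for term in graph}
    return {term: len(neighbors) / (n - 1) for term, neighbors in graph.items()}


def select_features(documents, k=3, window=2):
    """Return the k most central terms; ties are broken alphabetically."""
    graph = build_cooccurrence_graph(documents, window)
    scores = degree_centrality(graph)
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


docs = [
    "graph methods rank terms",
    "graph centrality rank features",
    "spam filters rank terms",
]
top_terms = select_features(docs, k=2)
print(top_terms)
```

In this toy corpus the term "rank" is linked to every other term, so it is selected first. A real pipeline would add stop-word removal and a proper tokenizer before building the graph.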
