Source: Journal of Computers

A Novel Document Weighted Approach for Text Classification



Abstract

Textual data on the internet is growing exponentially through blogs, Twitter, and other social media sites. Users do not specify the type of text they upload, so many researchers are looking for automated tools that classify this data, that is, assign a class label to an unknown document. Text classification is the research area that addresses this task, and researchers have proposed several solutions. A text classification approach generally consists of collecting training data, preprocessing the text, extracting features, reducing features, representing documents, and finally applying classification algorithms to build a model that predicts the class label of a new document. Among these phases, document representation is an important step for improving classification accuracy. In this work, a new document representation approach is proposed. Experiments were conducted on the 20-Newsgroup and Reuters-21578 datasets with different types of classification algorithms. Our approach attained the best accuracy results in these experiments, and the results are more promising than those of most popular approaches to text classification.
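The phases named in the abstract can be sketched as a small pipeline. The code below is only an illustration under stated assumptions: the abstract does not describe the proposed document weighting, so a standard TF-IDF weighting stands in for the document representation step, document-frequency ranking stands in for feature reduction, and a nearest-centroid rule stands in for the classification algorithm. The toy documents, the stop-word list, and all function names are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not from the paper).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "to", "for"}


def preprocess(text):
    """Preprocessing: lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def select_features(token_lists, max_features):
    """Feature extraction and reduction: keep the terms with the highest document frequency."""
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    # Sort by descending document frequency, then alphabetically for determinism.
    ranked = sorted(df, key=lambda term: (-df[term], term))
    kept = ranked[:max_features]
    return kept, {term: df[term] for term in kept}


def represent(tokens, vocab, df, n_docs):
    """Document representation: L2-normalised TF-IDF vector (stand-in weighting)."""
    tf = Counter(tokens)
    vec = [tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def train_centroids(vectors, labels):
    """Classification (training): one mean vector per class."""
    sums, counts = {}, Counter(labels)
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(vec, centroids):
    """Classification (prediction): class whose centroid has the largest dot product."""
    def score(label):
        return sum(a * b for a, b in zip(vec, centroids[label]))
    return max(sorted(centroids), key=score)


# Hypothetical toy training set; real experiments would use a labelled corpus.
train_docs = [
    ("The team won the hockey game in overtime", "sports"),
    ("The pitcher threw a perfect baseball game", "sports"),
    ("The new graphics card renders images faster", "computers"),
    ("Install the driver for the graphics card on windows", "computers"),
]
tokens = [preprocess(text) for text, _ in train_docs]
labels = [label for _, label in train_docs]
vocab, df = select_features(tokens, max_features=10)
vectors = [represent(t, vocab, df, len(tokens)) for t in tokens]
centroids = train_centroids(vectors, labels)


def classify(text):
    """Run a new document through the same phases and return a class label."""
    return predict(represent(preprocess(text), vocab, df, len(tokens)), centroids)


print(classify("the hockey team played a great game"))
print(classify("graphics card driver"))
```

Only the `represent` function would need to change to try a different document weighting scheme, which is why the sketch keeps that step separate from feature selection and classification.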
