Feature Selection Based on Sampling and C4.5 Algorithm to Improve the Quality of Text Classification Using Naïve Bayes

Abstract

Automatic text classification into predefined categories is an increasingly important task given the vast number of electronic documents available on the Internet and enterprise servers. Successful text classification relies heavily on the vital task of dimensionality reduction, which aims to improve classification accuracy, give greater expression to the classification process, and improve the computational efficiency of classification. In this paper, two algorithms for feature selection are presented, based on sampling and weighted sampling, that build on the C4.5 algorithm. The results demonstrate considerable improvements in classification accuracy, up to 10%, compared to traditional algorithms such as C4.5, Naïve Bayes and Support Vector Machines. The classification process is performed using the Naïve Bayes model in the space of reduced dimensionality. Experiments were carried out using data sets based on the Reuters-21578 collection.
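The abstract does not spell out the two algorithms, so the following is only a rough sketch of the general idea of sampling-based feature selection with a C4.5-style criterion, not the authors' method. It assumes documents are represented as sets of terms, draws uniform random subsamples (the weighted variant is not shown), scores each term by gain ratio (the split criterion C4.5 uses), and keeps the union of the best-scoring terms. The function names, parameters, and toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(docs, labels, term):
    """C4.5-style gain ratio for the binary test 'term occurs in the document'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without = [y for d, y in zip(docs, labels) if term not in d]
    if not with_term or not without:
        return 0.0  # the term does not split this sample at all
    p = len(with_term) / len(labels)
    info_gain = entropy(labels) - p * entropy(with_term) - (1 - p) * entropy(without)
    split_info = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return info_gain / split_info


def select_features(docs, labels, n_samples=20, sample_frac=0.6, top_k=2, seed=0):
    """Union of the top_k terms by gain ratio over uniform random subsamples.

    Assumption: the top-ranked terms stand in for the attributes a C4.5 tree
    built on each subsample would test; no actual tree is constructed here.
    """
    rng = random.Random(seed)
    size = max(2, int(sample_frac * len(docs)))
    selected = set()
    for _ in range(n_samples):
        idx = rng.sample(range(len(docs)), size)
        s_docs = [docs[i] for i in idx]
        s_labels = [labels[i] for i in idx]
        vocab = set().union(*s_docs)
        scores = {t: gain_ratio(s_docs, s_labels, t) for t in vocab}
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        selected.update(t for t in ranked[:top_k] if scores[t] > 0)
    return selected


# Toy data (invented for illustration, loosely styled after Reuters topics).
docs = [
    {"oil", "price", "barrel"},
    {"oil", "crude", "export"},
    {"wheat", "harvest", "export"},
    {"wheat", "grain", "price"},
    {"oil", "opec", "price"},
    {"grain", "harvest", "crop"},
]
labels = ["crude", "crude", "grain", "grain", "crude", "grain"]

features = select_features(docs, labels)
print(sorted(features))
```

In a full pipeline, each document would then be restricted to the selected terms before training a Naïve Bayes classifier in that reduced space, as the abstract describes.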

Bibliographic details

Source: Mexican International Conference on Artificial Intelligence | 2014 | 12 pages
Authors: Viviana Molano; Carlos Cobos; Martha Mendoza; Enrique Herrera-Viedma; Milos Manic
Original format: PDF
CLC classification: Artificial intelligence theory

Similar literature

1. Feature Selection For Text Classification With Naïve Bayes [J] . Jingnian Chen, Houkuan Huang, Shengfeng Tian, Expert systems with applications . 2009, issue 3p1

2. Privacy Preserving Feature Selection for Vertically Distributed Medical Data based on Genetic Algorithms and Naïve Bayes [J] . Boudheb Tarik, Elberrichi Zakaria, International journal of information system modeling and design . 2018, issue 3

3. Feature selection algorithm for text classification based on improved mutual information [J] . CONG Shuai, ZHANG Ji-bin, XU Zhi-ming, Journal of Harbin Institute of Technology (English Edition) . 2011, issue 3

4. Feature Selection Based on Sampling and C4.5 Algorithm to Improve the Quality of Text Classification Using Naïve Bayes [C] . Viviana Molano, Carlos Cobos, Martha Mendoza, Mexican international conference on artificial intelligence . 2014

5. Automation of Feature Selection and Generation of Optimal Feature Subsets for Beehive Audio Sample Classification [D] . Bhouraskar, Aditya. 2020

6. A Novel Feature Selection Technique for Text Classification Using Naïve Bayes [O] . Subhajit Dey Sarkar, Saptarsi Goswami, Aman Agarwal, 2014

7. Different Classification Algorithms Based on Arabic Text Classification: Feature Selection Comparative Study [O] . Ghazi Raho, Ghassan Kanaan, Riyad Al-shalabi, 2015

8. Rough Set Feature Selection Algorithms for Textual Case-Based Classification. [R] . Gupta, K. M., Aha, D. W., Moore, P. 2006

