Optimizing text classification through efficient feature selection based on quality metric

Lamirel Jean-Charles; Cuxac Pascal; Chivukula Aneesh Sreevallabh; Hajlaoui Kafil

Abstract

Feature maximization is a cluster quality metric that favors clusters with maximum feature representation with regard to their associated data. In this paper we show that a simple adaptation of this metric can provide a highly efficient feature selection and feature contrasting model in the context of supervised classification. The method is evaluated on different types of textual datasets. The paper shows that the proposed method provides a very significant performance increase over state-of-the-art methods in all the studied cases, even when a single bag-of-words model is used for data description. Interestingly, the most significant performance gain is obtained when classifying highly unbalanced, highly multidimensional, and noisy data with a high degree of similarity between the classes.
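
The abstract does not give the formulas, so the following is only a minimal sketch of how a feature-maximization style selection could look, under stated assumptions: each feature gets a per-class "feature recall" (its weight in the class relative to its weight in all classes) and "feature precision" (its weight in the class relative to all feature weight in that class), combined into a harmonic mean. A feature is kept for a class when its score exceeds both that feature's mean score across classes and the global mean score. The function names, the dense list-of-lists input, and the exact thresholds are illustrative assumptions, not the authors' implementation.

```python
def feature_f_measure(X, y):
    """Per-class feature F-measure (assumed formulation, see lead-in).

    X: list of rows, each a list of non-negative feature weights (e.g. term counts).
    y: list of class labels, one per row.
    Returns a dict mapping class label -> list of scores, one per feature.
    """
    classes = sorted(set(y))
    n_feat = len(X[0])
    # S[c][f] = total weight of feature f over the documents of class c
    S = {c: [0.0] * n_feat for c in classes}
    for row, label in zip(X, y):
        for f, w in enumerate(row):
            S[label][f] += w
    # Total weight of each feature over all classes
    col = [sum(S[c][f] for c in classes) for f in range(n_feat)]
    FF = {}
    for c in classes:
        row_total = sum(S[c])  # total weight of all features within class c
        scores = []
        for f in range(n_feat):
            recall = S[c][f] / col[f] if col[f] > 0 else 0.0
            precision = S[c][f] / row_total if row_total > 0 else 0.0
            s = recall + precision
            scores.append(2 * recall * precision / s if s > 0 else 0.0)
        FF[c] = scores
    return FF


def select_features(FF):
    """Keep, per class, the features scoring above both averages (assumed rule)."""
    classes = list(FF)
    n_feat = len(FF[classes[0]])
    # Mean score of each feature across classes, and the mean of those means
    feat_mean = [sum(FF[c][f] for c in classes) / len(classes) for f in range(n_feat)]
    global_mean = sum(feat_mean) / n_feat
    return {
        c: [f for f in range(n_feat)
            if FF[c][f] > feat_mean[f] and FF[c][f] > global_mean]
        for c in classes
    }


# Toy data: feature 0 is typical of class 0, feature 1 of class 1,
# feature 2 occurs evenly in both classes and should not be selected.
X = [[3, 0, 1], [2, 0, 1], [0, 4, 1], [0, 3, 1]]
y = [0, 0, 1, 1]
selected = select_features(feature_f_measure(X, y))
```

On this toy input, `selected` is `{0: [0], 1: [1]}`: the evenly spread feature is discarded without any tuned threshold, which is the kind of parameter-free behavior the abstract suggests. A contrasting step could then reweight each kept feature by the ratio of its class score to its mean score, but that is again an assumption rather than something stated here.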

Bibliographic record

Source
Journal of Intelligent Information Systems, 2015, No. 3, pp. 379-396 (18 pages)
Authors
Lamirel Jean-Charles; Cuxac Pascal; Chivukula Aneesh Sreevallabh; Hajlaoui Kafil;
Author affiliations

LORIA, INRIA Nancy Grand Est, SYNALP Team, Vandoeuvre Les Nancy, France;

INIST CNRS, Vandoeuvre Les Nancy, France;

Int Inst Informat Technol, Ctr Data Engn, Gachibowli Hyderabad, Andhra Pradesh, India;

Int Inst Informat Technol, Ctr Data Engn, Gachibowli Hyderabad, Andhra Pradesh, India;

Original format: PDF
Language: English
Keywords
Feature maximization; Clustering quality index; Feature selection; Supervised learning; Unbalanced data; Text;

Similar literature

1. The Feature Selection Method based on Genetic Algorithm for Efficient of Text Clustering and Text Classification [J]. Sung-Sam Hong, Wanhee Lee, Myung-Mook Han. International Journal of Advances in Soft Computing and Its Applications, 2015, No. 1a (Special issue)
2. An efficient automatic multiple objectives optimization feature selection strategy for internet text classification [J]. Huang Changqin, Zhu Jia, Liang Yuzhi. International Journal of Machine Learning and Cybernetics, 2019, No. 5
3. An optimized feature selection technique based on incremental feature analysis for bio-metric gait data classification [J]. Semwal Vijay Bhaskar, Singha Joyeeta, Sharma Pinki Kumari. Multimedia Tools and Applications, 2017, No. 22
4. A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data [C]. Jean-Charles Lamirel, Pascal Cuxac, Aneesh Sreevallabh Chivukula. Trends and Applications in Knowledge Discovery and Data Mining, 2013
5. A quality metric to improve wrapper feature selection in multiclass subject invariant brain computer interfaces [D]. Sherwood, Jesse. 2011
6. Relevance popularity: A term event model based feature selection scheme for text classification [O]. Guozhong Feng, Baiguo An, Fengqin Yang. (year not given)
7. A New Feature Selection and Feature Contrasting Approach Based on Quality Metric: Application to Efficient Classification of Complex Textual Data [O]. Cuxac Pascal, Chivukula Aneesh Sreevallabh, Hajlaoui Kafil. 2013
