Fast hybrid dimensionality reduction method for classification based on feature selection and grouped feature extraction

Abstract

Dimensionality reduction is a basic and critical technology for data mining, especially in the current "big data" era. Feature selection and feature extraction are two different types of dimensionality reduction methods, each with its own pros and cons. In this paper, we combine multi-strategy feature selection with grouped feature extraction and propose a novel fast hybrid dimensionality reduction method that draws on the strengths of both in removing irrelevant and redundant information. First, the intrinsic dimensionality of the data set is estimated by maximum likelihood estimation. Feature selection based on Fisher Score and Information Gain is then applied as a multi-strategy step to remove irrelevant features. The selected features are grouped into a certain number of clusters, using the redundancy among them as the clustering criterion. Within each cluster, feature extraction based on Principal Component Analysis (PCA) is carried out to remove redundant information. Four classical classifiers and representation entropy are used to evaluate the classification performance and the information loss of the reduced set. The runtime results show that the proposed hybrid method is consistently much faster than the other three methods on almost all of the data sets used. Meanwhile, its classification performance is competitive, with essentially no significant difference from the other methods. In short, the proposed method reduces the dimensionality of the raw data quickly, combining excellent efficiency with classification performance that is competitive with the compared methods. (c) 2020 Elsevier Ltd. All rights reserved.
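The stages named in the abstract (score-based selection, redundancy-based grouping, PCA per group) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not say how the two rankings are merged, how redundancy is measured, or how many components are kept per cluster. The sketch assumes a union of the top `n_keep` features under each score, absolute Pearson correlation as the redundancy measure, a simple farthest-first grouping, and one principal component per group. The names `hybrid_reduce`, `n_keep`, `n_groups`, and `bins` are hypothetical, and `n_groups` is passed in by hand because the abstract does not state how the estimated intrinsic dimensionality is used.

```python
import numpy as np


def fisher_score(X, y):
    """Per-feature Fisher score: between-class over within-class variance."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def information_gain(X, y, bins=10):
    """H(y) - H(y | feature), with each feature cut into equal-width bins
    (the binning scheme is an assumption of this sketch)."""
    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log(p))

    h_y = entropy(y)
    gains = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        edges = np.histogram_bin_edges(X[:, j], bins=bins)
        codes = np.digitize(X[:, j], edges[1:-1])  # bin index per sample
        h_cond = sum(np.mean(codes == c) * entropy(y[codes == c])
                     for c in np.unique(codes))
        gains[j] = h_y - h_cond
    return gains


def group_by_redundancy(X, n_groups):
    """Farthest-first grouping on |Pearson correlation| (illustrative choice)."""
    corr = np.nan_to_num(np.abs(np.atleast_2d(np.corrcoef(X, rowvar=False))))
    reps = [0]  # arbitrary first representative
    while len(reps) < n_groups:
        nearest = corr[:, reps].max(axis=1)
        nearest[reps] = np.inf  # never pick a representative twice
        reps.append(int(np.argmin(nearest)))  # least redundant w.r.t. reps
    labels = corr[:, reps].argmax(axis=1)  # join the most correlated rep
    labels[reps] = np.arange(n_groups)  # each rep stays in its own group
    return labels


def first_principal_component(Xg):
    """Project a feature block onto its first principal axis."""
    Xc = Xg - Xg.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]


def hybrid_reduce(X, y, n_keep, n_groups):
    """Select by two scores, group the survivors, extract one PC per group."""
    top_fs = np.argsort(fisher_score(X, y))[::-1][:n_keep]
    top_ig = np.argsort(information_gain(X, y))[::-1][:n_keep]
    selected = np.union1d(top_fs, top_ig)  # assumed merge rule: union
    Xs = X[:, selected]
    n_groups = min(n_groups, Xs.shape[1])
    labels = group_by_redundancy(Xs, n_groups)
    Z = np.column_stack([first_principal_component(Xs[:, labels == g])
                         for g in range(n_groups)])
    return Z, selected, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    X = rng.normal(size=(200, 12))
    X[:, :4] += 3.0 * y[:, None]  # only the first four features carry signal
    Z, selected, labels = hybrid_reduce(X, y, n_keep=4, n_groups=2)
    print(Z.shape, selected, labels)
```

Running PCA on small groups rather than on the full feature matrix keeps each decomposition cheap, which is one plausible reading of why a grouped scheme can be fast; the abstract itself reports only the measured runtimes.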
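The abstract uses representation entropy to assess information loss but does not define it. A common definition in the feature selection literature, assumed here, normalizes the eigenvalues of the feature covariance matrix so that they sum to one and takes their Shannon entropy. The function name `representation_entropy` is hypothetical, and the sketch may differ in detail from the measure used in the paper.

```python
import numpy as np


def representation_entropy(X):
    """Entropy of the normalized covariance eigenvalues of a feature set.

    Low values mean the variance is concentrated in a few directions
    (the features are redundant); high values mean it is spread evenly.
    """
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # drop round-off negatives
    p = eigvals / eigvals.sum()
    p = p[p > 0]  # treat 0 * log(0) as 0
    return float(-np.sum(p * np.log(p)))


if __name__ == "__main__":
    uncorrelated = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
    duplicated = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
    print(representation_entropy(uncorrelated))  # two equal eigenvalues
    print(representation_entropy(duplicated))    # one dominant eigenvalue
```

Under this definition, two uncorrelated features of equal variance reach the maximum of log 2, while two identical features score close to zero.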

Bibliographic details

  • Source
    Expert Systems with Applications | 2020, Issue 7 | 113277.1-113277.10 | 10 pages
  • Author affiliations

    Zhengzhou Univ Sch Elect Engn Zhengzhou 450001 Henan Peoples R China|Zhengzhou Univ Ind Technol Res Inst Zhengzhou 450001 Henan Peoples R China;

    Zhengzhou Univ Sch Elect Engn Zhengzhou 450001 Henan Peoples R China|Zhengzhou Univ Ind Technol Res Inst Zhengzhou 450001 Henan Peoples R China;

    Zhengzhou Univ Sch Elect Engn Zhengzhou 450001 Henan Peoples R China|Zhengzhou Univ Ind Technol Res Inst Zhengzhou 450001 Henan Peoples R China;

    Zhengzhou Univ Sch Elect Engn Zhengzhou 450001 Henan Peoples R China|Zhengzhou Univ Ind Technol Res Inst Zhengzhou 450001 Henan Peoples R China;

    Zhengzhou Univ Sch Elect Engn Zhengzhou 450001 Henan Peoples R China|Zhengzhou Univ Ind Technol Res Inst Zhengzhou 450001 Henan Peoples R China|Henan Key Lab Brain Sci & Brain Comp Interface Te Zhengzhou 450001 Henan Peoples R China;

    Zhengzhou Univ Sch Elect Engn Zhengzhou 450001 Henan Peoples R China|Zhengzhou Univ Ind Technol Res Inst Zhengzhou 450001 Henan Peoples R China|Henan Key Lab Brain Sci & Brain Comp Interface Te Zhengzhou 450001 Henan Peoples R China;

  • Original format: PDF
  • Text language: English
  • Keywords

    Dimensionality Reduction; Intrinsic Dimensionality; Feature Selection; Feature Cluster; PCA;
