Hybrid feature selection methods for high-dimensional multi-class datasets

Amit Kumar Saxena; Vimal Kumar Dubey; John Wang

Abstract

Hybrid methods are important for feature selection when classifying high-dimensional datasets. In this paper, we propose two hybrid methods that combine filter-based feature selection with a genetic algorithm and with sequential random search. The first method is a hybridisation of information gain and a genetic algorithm. The features are first ranked by information gain, and a user-defined number of top-ranked features is retained. A genetic algorithm is then applied to these retained features to select an optimal feature subset. The genetic algorithm is run with two types of fitness function, one single-objective and one multi-objective. The second method is a hybridisation of information gain and sequential random K-nearest neighbour (SRKNN). Here, information gain is again used to rank the features, and a user-defined number of top-ranked features is retained. A population of binary strings, each spanning all of the retained features, is generated, and a sequential search is applied to each member of the population to maximise classification accuracy. Both methods are applied to 21 high-dimensional multi-class datasets. The results show that the first method performs well on some datasets and the second method performs well on others. The results obtained by the proposed methods are compared with results reported for other methods.
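
The filter-then-search idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names, the use of leave-one-out 1-nearest-neighbour accuracy, and the bit-flip acceptance rule are all assumptions made for illustration, since the abstract does not specify these details. The sketch ranks discrete features by information gain, keeps the top k, and then runs one sequential pass over a binary mask, keeping a flip only when accuracy improves.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """IG of one discrete feature: H(Y) - sum_v P(v) * H(Y | feature = v)."""
    n = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_by_information_gain(X, y, k):
    """Indices of the k features with the highest information gain."""
    scores = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-NN classifier restricted to `features`."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def selected_features(candidates, mask):
    """Feature indices whose bit is set in the binary mask."""
    return [f for f, bit in zip(candidates, mask) if bit]


def sequential_flip_search(X, y, candidates, mask):
    """Flip each bit once, in order; keep a flip only if accuracy improves."""
    mask = list(mask)
    best = loo_1nn_accuracy(X, y, selected_features(candidates, mask))
    for pos in range(len(mask)):
        mask[pos] ^= 1
        acc = loo_1nn_accuracy(X, y, selected_features(candidates, mask))
        if acc > best:
            best = acc
        else:
            mask[pos] ^= 1  # revert the flip
    return mask, best


# Toy data (an assumption): 9 samples, 3 classes, 4 discrete features.
# f0 equals the class, f1 is noise, f2 separates class 2 only, f3 is constant.
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
X = [[y[i], i % 3, int(y[i] == 2), 5] for i in range(9)]

candidates = top_k_by_information_gain(X, y, k=2)
mask, accuracy = sequential_flip_search(X, y, candidates, mask=[0, 1])
print(candidates, mask, accuracy)
```

In the paper's setting the search would start from each member of a generated population rather than from a single hand-picked mask, and a genetic algorithm would replace the sequential pass in the first method.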

Bibliographic details

Source
International Journal of Data Mining, Modelling and Management | 2017, Issue 4 | pp. 315-339 | 25 pages
Authors
Amit Kumar Saxena; Vimal Kumar Dubey; John Wang
Author affiliations

Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, 495009, India;

Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, 495009, India;

Department of Information Management and Business Analysis, Montclair State University, Montclair, NJ 07043, USA;

Original format: PDF
Language: English
Keywords
intelligent mining; high-dimensional dataset; genetic algorithm; filter approach; information gain; classification;


