首页> 外文期刊>International journal of swarm intelligence research >An Efficient Parallel Hybrid Feature Selection Approach for Big Data Analysis
【24h】

An Efficient Parallel Hybrid Feature Selection Approach for Big Data Analysis

机译:An Efficient Parallel Hybrid Feature Selection Approach for Big Data Analysis

获取原文
获取原文并翻译 | 示例
           

摘要

Classification algorithms face runtime complexity due to high data dimension, especially in the context of big data. Feature selection (FS) is a technique for reducing dimensions and improving learning performance. In this paper, the authors proposed a hybrid FS algorithm for classification in the context of big data. Firstly, only the most relevant features are selected using symmetric uncertainty (SU) as a measure of correlation. The features are distributed into subsets using Apache Spark to calculate SU between each feature and target class in parallel. Then a Binary PSO (BPSO) algorithm is used to find the optimal FS. The BPSO has limited convergence and restricted inertial weight adjustment, so the authors suggested using a multiple inertia weight strategy to influence the changes in particle motions so that the search process is more varied. Also, the authors proposed a parallel fitness evaluation for particles under Spark to accelerate the algorithm. The results showed that the proposed FS achieved higher classification performance with a smaller size in reasonable time.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号