Journal: International Journal of Applied Engineering Research

Parallel Frequent Dataset Mining and Feature Subset Selection for High Dimensional Data on Hadoop using Map-Reduce

Abstract

Data mining is mostly used for information analysis and for finding frequent datasets. Nowadays, cloud computing is used for information storage and for many other data processes such as data mining, data retrieval, and data distribution. As the data held on servers grows rapidly day by day, many complications are introduced; the most common problems are load balancing on the server and time optimization. Parallel frequent dataset mining is a very effective method for overcoming these limitations. FiDoop, a parallel frequent dataset mining algorithm based on the MapReduce framework, helps to improve load balancing, and FiDoop-HD speeds up mining performance for high-dimensional data analysis. FiDoop is a very efficient and scalable algorithm for large clusters of data. We use the Fast Clustering Based Feature Selection Algorithm for High Dimensional Data, which uses a minimum spanning tree (MST) to divide the data into different clusters, separates out unrelated sets, and gives accurate and efficient results with similar sets.
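
To make the MapReduce idea in the abstract concrete, the following is a minimal single-process sketch of frequent itemset counting written in a map/reduce style. It is not FiDoop or FiDoop-HD, whose details the abstract does not give; the function names (`map_phase`, `reduce_phase`, `frequent_itemsets`), the `max_size` cap, and the sample transactions are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, max_size):
    """Map step: emit (itemset, 1) for every itemset of up to max_size items."""
    items = sorted(set(transaction))  # de-duplicate and fix a canonical order
    for k in range(1, max_size + 1):
        for itemset in combinations(items, k):
            yield itemset, 1


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts per itemset key."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, min_support, max_size=2):
    """Keep the itemsets that occur in at least min_support transactions."""
    pairs = (pair for t in transactions for pair in map_phase(t, max_size))
    counts = reduce_phase(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy data, not taken from the paper.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
result = frequent_itemsets(transactions, min_support=3)
```

On a real Hadoop cluster the map step would run on separate partitions of the transactions and the framework would group the emitted pairs by key before the reduce step. Enumerating every itemset as done here grows combinatorially with transaction width, which is why the sketch caps the itemset size with `max_size`.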