Research on Hadoop-Based Association Rule Mining Algorithms: Taking the Apriori Algorithm as an Example

刘木林 (Liu Mulin); 朱庆华 (Zhu Qinghua)

Abstract

Traditional association rule mining algorithms cannot meet the efficiency and scalability demands of mining large volumes of data. Taking the classic association rule mining algorithm Apriori as an example, this paper first parallelizes the algorithm on the Hadoop platform using the MapReduce programming model, and then optimizes it with the transaction reduction technique to further improve mining efficiency. A Hadoop cluster was set up to evaluate both the mining results and the mining efficiency in four groups of experiments: verification of the parallel mining results, an efficiency comparison between the serial and parallel versions, the relationship between mining time and the number of nodes, and the relationship between mining time and data volume. The results show that the implemented Apriori algorithm mines frequent itemsets accurately and achieves higher mining performance and better scalability than the traditional serial algorithm. It is therefore better suited to large datasets and can efficiently mine frequent itemsets and association rules from large-scale data.
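The abstract does not reproduce the authors' implementation. As a rough illustration of the two ideas it names, the sketch below simulates a MapReduce-style Apriori counting pass in plain Python and applies transaction reduction between passes. It runs in a single process rather than on Hadoop, and every name in it (`map_count`, `reduce_count`, `gen_candidates`, `apriori_mapreduce`, `demo_splits`) and the sample data are assumptions made for illustration, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def map_count(split, candidates):
    """Map phase (simulated): emit (candidate, 1) for each candidate in a transaction."""
    for transaction in split:
        for cand in candidates:
            if cand <= transaction:
                yield cand, 1


def reduce_count(pairs, min_count):
    """Reduce phase (simulated): sum counts per candidate, keep those >= min_count."""
    totals = Counter()
    for cand, n in pairs:
        totals[cand] += n
    return {cand: n for cand, n in totals.items() if n >= min_count}


def gen_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any candidate
    that has an infrequent (k-1)-subset (the Apriori property)."""
    joined = {a | b for a in frequent for b in frequent if len(a | b) == k}
    return {
        c for c in joined
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }


def apriori_mapreduce(splits, min_count):
    """Level-wise Apriori over data splits; returns {itemset: support count}."""
    candidates = {frozenset([i]) for split in splits for t in split for i in t}
    result = {}
    k = 1
    while candidates:
        pairs = (p for split in splits for p in map_count(split, candidates))
        frequent = reduce_count(pairs, min_count)
        if not frequent:
            break
        result.update(frequent)
        # Transaction reduction: a transaction with at most k items, or with no
        # frequent k-itemset, cannot contain a frequent (k+1)-itemset, so it is
        # dropped before the next pass.
        splits = [
            [t for t in split if len(t) > k and any(f <= t for f in frequent)]
            for split in splits
        ]
        k += 1
        candidates = gen_candidates(set(frequent), k)
    return result


# Hypothetical sample data: two splits standing in for HDFS input blocks.
demo_splits = [
    [
        frozenset({"bread", "milk"}),
        frozenset({"bread", "diaper", "beer", "eggs"}),
        frozenset({"milk", "diaper", "beer", "cola"}),
    ],
    [
        frozenset({"bread", "milk", "diaper", "beer"}),
        frozenset({"bread", "milk", "diaper", "cola"}),
    ],
]

if __name__ == "__main__":
    for itemset, count in sorted(
        apriori_mapreduce(demo_splits, min_count=3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), count)
```

On a real cluster each pass would typically be one MapReduce job, with the candidate set distributed to the mappers; here the splits are just Python lists so that the control flow stays visible.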

Bibliographic information

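Source
《计算机技术与发展》 (Computer Technology and Development) | 2016, No. 7 | pp. 1-5 | 5 pages
Authors
刘木林 (Liu Mulin); 朱庆华 (Zhu Qinghua)
Affiliation
School of Information Management, Nanjing University, Nanjing, Jiangsu 210023 (both authors)
Original format: PDF
Language of full text: Chinese
CLC classification: Computer networks
Keywords
Data mining; association rules; Hadoop; Apriori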

Similar literature

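1. Research on Hadoop-based association rule mining algorithms [J]. 田建勇. 电脑编程技巧与维护, 2020, No. 7
2. Research and application of a Hadoop-based multidimensional association rule mining algorithm [J]. 杨青, 张亚文, 张琴. 计算机工程与科学, 2019, No. 12
3. A survey of research on the Apriori algorithm for association rule mining [J]. 饶正婵, 范年柏. 计算机时代, 2012, No. 9
4. Research on the Apriori algorithm for association rule mining [J]. 孙英慧, 孙英娟. 吉林师范大学学报（自然科学版）, 2009, No. 4
5. Research and optimization of the Hadoop-based Apriori algorithm [J]. 孙学波, 石飞达. 计算机工程与设计, 2018, No. 1
6. Association rule mining of target features based on the Apriori algorithm [C]. Yu Xiao hong, 于小红, Liu Zhicheng. 第六届中国信息融合大会, 2014
7. Research and application of Hadoop-based association rule mining algorithms [A]. 冯世祥. 2020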
