Conference: International Conference on Contemporary Computing

Proposed algorithm for frequent item set generation

Abstract

Data mining is an efficient technology for discovering patterns in large databases. Association rule mining techniques find correlations between the item sets in a database, and these correlations are used in decision making and pattern analysis. In recent years, many researchers have studied the problem of finding frequent items and association rules in large datasets. Various research papers on association rule mining (ARM) are first studied and analyzed to understand the existing algorithms. The Apriori algorithm is the basic ARM algorithm, but it requires many database scans to find frequent items. The Dynamic Itemset Counting (DIC) algorithm needs fewer database scans, but it uses a complex lattice data structure. The main focus of this paper is to propose a new optimized algorithm (FI-generator) and to compare its performance with the existing algorithms. A secondary data set is used to find frequent item sets and association rules with both the existing and the proposed algorithms. We observed that the proposed algorithm finds the frequent item sets and association rules in fewer database scans than the existing algorithms. The proposed algorithm uses an optimized data structure, an adjacency matrix. It reduces the size of the candidate k-item set in each successive iteration. Pruning is also done at two stages, which reduces memory usage.
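The abstract does not spell out how FI-generator works, so the following is only a minimal sketch of the general idea it names: an adjacency (co-occurrence) matrix filled in one pass over the transactions, from which frequent 1-item and 2-item sets can be read without rescanning the database. The function names, the assumption that the item vocabulary is known in advance, and the toy transactions are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def build_adjacency_matrix(items, transactions):
    """Count item and item-pair occurrences in a single pass.

    Assumption (not from the paper): `items` is a known, ordered vocabulary.
    matrix[i][i] holds the support of item i; matrix[i][j] with i < j holds
    the support of the pair (item i, item j).
    """
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for transaction in transactions:
        # Deduplicate and order by matrix index so pairs land in the upper triangle.
        present = sorted({index[x] for x in transaction if x in index})
        for i in present:
            matrix[i][i] += 1
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
    return matrix


def frequent_sets_from_matrix(items, matrix, min_support):
    """Read frequent 1-item and 2-item sets off the matrix, with no rescan."""
    n = len(items)
    # First pruning step: keep only items that meet the support threshold.
    kept = [i for i in range(n) if matrix[i][i] >= min_support]
    frequent_1 = [items[i] for i in kept]
    # Second step: only pairs of surviving items are considered as candidates.
    frequent_2 = [
        (items[i], items[j])
        for i, j in combinations(kept, 2)
        if matrix[i][j] >= min_support
    ]
    return frequent_1, frequent_2


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    vocabulary = ["bread", "butter", "milk"]
    baskets = [
        ["bread", "milk"],
        ["bread", "butter", "milk"],
        ["butter", "milk"],
        ["bread", "butter"],
        ["bread", "milk"],
    ]
    m = build_adjacency_matrix(vocabulary, baskets)
    print(frequent_sets_from_matrix(vocabulary, m, min_support=3))
```

In this sketch the transactions are read once, whereas a plain Apriori implementation would typically make one pass per item-set size. How the proposed algorithm extends this to larger item sets and where exactly its two pruning stages occur is not stated in the abstract.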
