Journal: Journal of Computers

AT-Mine: An Efficient Algorithm of Frequent Itemset Mining on Uncertain Dataset


Abstract

Frequent itemset/pattern mining (FIM) over uncertain transaction datasets is a fundamental task in data mining. In this paper, we study the problem of FIM over uncertain datasets. There are two main approaches to FIM: the level-wise approach and the pattern-growth approach. The level-wise approach requires multiple scans of the dataset and generates candidate itemsets. The pattern-growth approach requires a large amount of memory and computation time to process tree nodes, because current algorithms for uncertain datasets cannot build a tree as compact as the original FP-Tree. We propose an array-based tail-node tree structure, the AT-Tree, to maintain transaction itemsets, and a pattern-growth algorithm named AT-Mine for FIM over uncertain datasets. The AT-Tree is built with two scans of the dataset and is as compact as the original FP-Tree. AT-Mine then mines frequent itemsets from the AT-Tree without any additional scan of the dataset. We evaluate the algorithm on sparse and dense datasets; the experimental results show that it outperforms state-of-the-art FIM algorithms on uncertain transaction datasets, especially when the minimum expected support number is small.
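The abstract refers to a minimum expected support number without defining it. As a rough illustration, under the commonly used model in which each item in a transaction carries an independent existence probability, the expected support of an itemset is the sum, over all transactions, of the product of its items' probabilities. The sketch below assumes that model; the function name, the toy dataset, and the threshold are hypothetical and are not taken from the paper, and the code does not implement the AT-Tree or AT-Mine.

```python
from math import prod


def expected_support(itemset, transactions):
    """Expected support of `itemset`, assuming independent item probabilities.

    Each transaction is a dict mapping an item to its existence probability.
    """
    total = 0.0
    for transaction in transactions:
        # An item absent from the transaction contributes probability 0,
        # so the whole product for that transaction becomes 0.
        total += prod(transaction.get(item, 0.0) for item in itemset)
    return total


# Hypothetical toy dataset (not from the paper).
transactions = [
    {"a": 0.8, "b": 0.5},
    {"a": 0.6, "c": 0.9},
    {"a": 1.0, "b": 0.4, "c": 0.5},
]

min_expected_support = 0.5  # assumed threshold, for illustration only

esup_ab = expected_support({"a", "b"}, transactions)
is_frequent = esup_ab >= min_expected_support
```

Here `{"a", "b"}` appears with probability 0.8 × 0.5 in the first transaction and 1.0 × 0.4 in the third, so it would count as frequent under the assumed threshold. A level-wise or pattern-growth miner evaluates this quantity for many candidate itemsets, which is why the number of dataset scans and the compactness of the tree matter.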
