...
首页> 外文期刊>Knowledge and Data Engineering, IEEE Transactions on >An UpDown Directed Acyclic Graph Approach for Sequential Pattern Mining
【24h】

An UpDown Directed Acyclic Graph Approach for Sequential Pattern Mining

机译:用于顺序模式挖掘的UpDown有向无环图方法

获取原文
获取原文并翻译 | 示例
           

摘要

Traditional pattern growth-based approaches for sequential pattern mining derive length-(k+1) patterns based on the projected databases of length-k patterns recursively. At each level of recursion, they unidirectionally grow the length of detected patterns by one along the suffix of detected patterns, which needs k levels of recursion to find a length-k pattern. In this paper, a novel data structure, UpDown Directed Acyclic Graph (UDDAG), is invented for efficient sequential pattern mining. UDDAG allows bidirectional pattern growth along both ends of detected patterns. Thus, a length-k pattern can be detected in lfloor log_{2}k+1rfloor levels of recursion at best, which results in fewer levels of recursion and faster pattern growth. When minSup is large such that the average pattern length is close to 1, UDDAG and PrefixSpan have similar performance because the problem degrades into frequent item counting problem. However, UDDAG scales up much better. It often outperforms PrefixSpan by almost one order of magnitude in scalability tests. UDDAG is also considerably faster than Spade and LapinSpam. Except for extreme cases, UDDAG uses comparable memory to that of PrefixSpan and less memory than Spade and LapinSpam. Additionally, the special feature of UDDAG enables its extension toward applications involving searching in large spaces.
机译:传统的基于模式增长的顺序模式挖掘方法基于长度为k的模式的计划数据库来递归得出长度为(k + 1)的模式。在递归的每个级别,它们都沿检测到的模式的后缀单向地将检测到的模式的长度增加一个,这需要k个递归级别才能找到长度为k的模式。在本文中,为有效的顺序模式挖掘发明了一种新颖的数据结构,即UpDown有向无环图(UDDAG)。 UDDAG允许沿着检测到的图案的两端进行双向图案生长。因此,可以在最低水平log_ {2} k + 1rfloor递归级别中检测到长度为k的图案,这将导致较少的递归级别和更快的图案增长。当minSup很大以至于平均图案长度接近1时,UDDAG和PrefixSpan具有相似的性能,因为该问题会降级为频繁的项目计数问题。但是,UDDAG的扩展规模要好得多。在可伸缩性测试中,它通常比PrefixSpan高出近一个数量级。 UDDAG还比Spade和LapinSpam快得多。除极端情况外,UDDAG使用与PrefixSpan相当的内存,并且使用的内存少于Spade和LapinSpam。此外,UDDAG的特殊功能使其能够扩展到涉及在大空间中搜索的应用程序。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号