Finding Frequent Patterns from Compressed Tree-Structured Data


Abstract

In this paper we present a new method for finding frequent patterns in tree-structured data, where a frequent pattern is a subgraph that occurs frequently in the given tree-structured data. We make use of TGCA, a data compression method for tree-structured data. Compressing large-scale data to make it easier to manipulate has been investigated in previous studies, for example for keyword search in plain text and for frequent itemset mining from transaction data, but to the best of our knowledge it has not been applied to finding frequent patterns in tree-structured data. The TGCA algorithm is obtained by modifying the SEQUITUR algorithm for plain text so that it can compress tree-structured data, and we show that occurrences of patterns in the original data can be counted from the data compressed by TGCA without expanding it. This is why our method improves the efficiency of finding frequent patterns. Experiments show the advantage of our method in cases where the data can be compressed with a good compression ratio.
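
The idea of counting on compressed data without expanding it can be illustrated with a minimal sketch. This is not TGCA and does not handle trees: it assumes a SEQUITUR-style grammar over plain text and counts only single-symbol patterns. The grammar `GRAMMAR`, the helper names `count_terminals` and `expand`, and the rule names are all hypothetical and chosen for illustration only.

```python
from collections import Counter
from functools import lru_cache

# Hypothetical grammar (an assumption, not taken from the paper).
# Keys are nonterminals; any symbol that is not a key is a terminal.
GRAMMAR = {
    "S": ["A", "B", "A", "c"],
    "A": ["a", "b"],
    "B": ["A", "A", "d"],
}


def count_terminals(grammar, start="S"):
    """Count each terminal in the expansion of `start` without expanding it.

    Each rule is evaluated once and its result is reused wherever the
    rule is referenced, so the work depends on the grammar size rather
    than on the size of the expanded text.
    """

    @lru_cache(maxsize=None)
    def counts(symbol):
        if symbol not in grammar:
            # A terminal contributes exactly one occurrence of itself.
            return Counter({symbol: 1})
        total = Counter()
        for child in grammar[symbol]:
            total.update(counts(child))
        return total

    return dict(counts(start))


def expand(grammar, symbol="S"):
    """Fully expand a symbol; used here only to cross-check the counts."""
    if symbol not in grammar:
        return [symbol]
    return [t for child in grammar[symbol] for t in expand(grammar, child)]


if __name__ == "__main__":
    print(count_terminals(GRAMMAR))
    print("".join(expand(GRAMMAR)))
```

Memoizing per rule is what makes the compressed form useful here: rule `A` is referenced four times in the expansion but is evaluated only once. Patterns larger than one symbol would also need to account for occurrences that span rule boundaries, which this sketch does not attempt.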

Bibliographic record

  • Source
    Discovery science | 2008 | pp. 284-295 | 12 pages
  • Conference location: Budapest (HU)
  • Author affiliations

    Graduate School of Informatics, Kyoto University Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan;

    Graduate School of Informatics, Kyoto University Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan;

    Graduate School of Informatics, Kyoto University Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan;

  • Conference organizer
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification: Artificial intelligence theory
  • Keywords
