首页> 外文会议>Database Systems for Advanced Applications >Summarization Graph Indexing: Beyond Frequent Structure-Based Approach
【24h】

Summarization Graph Indexing: Beyond Frequent Structure-Based Approach

机译:摘要图索引:超越基于频繁结构的方法

获取原文
获取原文并翻译 | 示例

摘要

Graph is an important data structure to model complex structural data, such as chemical compounds, proteins, and XML documents. Among many graph data-based applications, sub-graph search is a key problem, which is defined as given a query Q, retrieving all graphs containing Q as a sub-graph in the graph database. Most existing sub-graph search methods try to filter out false positives (graphs that are not possible in the results) as many as possible by indexing some frequent sub-structures in graph database, such as [20,22,4,23]. However, due to ignoring the relationships between sub-structures, these methods still admit a high percentage of false positives. In this paper, we propose a novel concept, Summarization Graph, which is a complete graph and captures most topology information of the original graph, such as sub-structures and their relationships. Based on Summarization Graphs, we convert the filtering problem into retrieving objects with set-valued attributes. Moreover, we build an efficient signature file-based index to improve the filtering process. We prove theoretically that the pruning power of our method is larger than existing structure-based approaches. Finally, we show by extensive experimental study on real and synthetic data sets that the size of candidate set generated by Summarization Graph-based approach is only about 50% of that left by existing graph indexing methods, and the total response time of our method is reduced 2-10 times.
机译:图形是一种重要的数据结构,用于对复杂的结构数据进行建模,例如化学化合物,蛋白质和XML文档。在许多基于图数据的应用程序中,子图搜索是一个关键问题,其定义为给定查询Q,在图数据库中检索所有包含Q作为子图的图。大多数现有的子图搜索方法都试图通过索引图数据库中的一些频繁的子结构来尽可能多地过滤掉误报(在结果中不可能出现的图),例如[20,22,4,23]。但是,由于忽略了子结构之间的关系,因此这些方法仍然会接受较高百分比的误报。在本文中,我们提出了一个新颖的概念,即“摘要图”,它是一个完整的图,可以捕获原始图的大多数拓扑信息,例如子结构及其关系。基于摘要图,我们将过滤问题转换为具有集值属性的对象。此外,我们建立了一个有效的基于签名文件的索引来改善过滤过程。我们从理论上证明,我们的方法的修剪能力比现有的基于结构的方法大。最后,我们通过对真实和合成数据集的大量实验研究表明,基于摘要图的方法生成的候选集的大小仅是现有图索引方法剩下的约50%,并且该方法的总响应时间为减少了2-10倍。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号