Index support for frequent itemset mining in a relational DBMS

Abstract

Many efforts have been devoted to coupling data mining activities with relational DBMSs, but true integration into the relational DBMS kernel has rarely been achieved. This paper presents a novel indexing technique that represents transactions in a succinct form, suitable for tightly integrating frequent itemset mining into a relational DBMS. The data representation is complete, i.e., no support threshold is enforced, so the index can be reused for mining itemsets with any support threshold. Furthermore, the stored information is structured to allow selective access to only those index blocks needed for the current extraction phase. The index has been implemented in the PostgreSQL open source DBMS and exploits its physical-level access methods. Experiments have been run on various datasets characterized by different data distributions. The execution time of frequent itemset extraction using the index is always comparable with, and sometimes faster than, that of a C++ implementation of the FP-growth algorithm accessing data stored in a flat file.
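
The abstract's point that a complete index (one built without a support threshold) can be reused for any threshold can be illustrated with a small sketch. The code below is not the paper's index structure or its PostgreSQL implementation. It substitutes a simple in-memory vertical layout (item to transaction-id set) mined by set intersection in the style of Eclat, and the names `build_index` and `mine` are hypothetical.

```python
from collections import defaultdict


def build_index(transactions):
    """Map each item to the set of ids of the transactions containing it.

    Assumption for illustration: a plain in-memory dict stands in for the
    on-disk index. No support threshold is applied, so nothing is discarded.
    """
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return dict(index)


def mine(index, min_support):
    """Return {frozenset(itemset): support count} for support >= min_support."""
    # Only items that are frequent on their own can occur in a frequent itemset.
    candidates = sorted(
        ((item, tids) for item, tids in index.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    results = {}

    def extend(prefix, prefix_tids, remaining):
        for i, (item, tids) in enumerate(remaining):
            # Transactions containing the prefix and the new item.
            new_tids = tids if prefix_tids is None else prefix_tids & tids
            if len(new_tids) >= min_support:
                itemset = prefix + (item,)
                results[frozenset(itemset)] = len(new_tids)
                extend(itemset, new_tids, remaining[i + 1:])

    extend((), None, candidates)
    return results


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

# The index is built once and then queried with two different thresholds.
index = build_index(transactions)
for threshold in (3, 2):
    found = mine(index, threshold)
    print(threshold, sorted("".join(sorted(s)) for s in found))
```

Because `build_index` drops nothing, lowering the threshold from 3 to 2 needs no rebuild: the second call simply reads more of the same index.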

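Bibliographic record

Source
Proof of Designed Reliability | 1994 | pp. 754-765 | 12 pages
Conference location
Authors
Baralis E.; Cerquitelli T.; Chiusano S.
Author affiliation

Dipt. di Autom. e Inf., Politecnico di Torino, Italy

Conference organizer
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology
Keywords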

Similar literature

1. Frequent Itemset Mining Using LP-Growth Algorithm Based on Multiple Minimum Support Threshold Value (Multiple Item Support Frequent Pattern Growth) [J]. M. Sinthuja, N. Puviarasan, P. Aruna. Journal of computational and theoretical nanoscience. 2019, no. 4
2. Research on Frequent Itemsets Mining Algorithm based on Relational Database [J]. Jingyang Wang, Huiyong Wang, Dongwen Zhang. Journal of software. 2013, no. 8
3. Index support for frequent itemset mining in a relational DBMS [C]. Baralis, E., Cerquitelli, . 2005
4. Mining Frequent Itemsets from Uncertain Data: Extensions to Constrained Mining and Stream Mining [D]. Hao, Boyu. 2010
5. Unravelling associations between unassigned mass spectrometry peaks with frequent itemset mining techniques [O]. Trung Nghia Vu, Aida Mrzic, Dirk Valkenborg. 2014
6. Approximation to expected support of frequent itemsets in mining probabilistic sets of uncertain data [O]. Cuzzocrea Alfredo, Leung Carson K., Mackinnon Richard Kyle. 2015
