An improved approach for automatic selection of multi-tables indexes in relational data warehouses using maximal frequent itemsets

B. Ziani; Y. Ouinten


Abstract

The performance of a data warehouse depends crucially on its physical design, in which one of the most challenging tasks is selecting an appropriate set of indexes for a representative workload under a storage constraint. The problem becomes even more complex for multi-table indexes such as bitmap join indexes, since it involves searching a vast space of possible configurations. The attributes that queries reference, and how frequently they reference them, play an important role in determining the efficiency of the selected indexes. In this paper, we treat index selection as a typical frequent itemset mining problem. The indexes are built from combinations of attributes, viewed as items. The queries in the workload, viewed as transactions, are described by the attributes they involve. The foundation of our approach is the concept of maximal frequent itemsets. This data mining technique helps to discover strong correlations among attributes, such that the presence of some attributes in a query implies the presence of certain others. Moreover, by avoiding the generation of redundant indexes, the proposed approach leads to a solution that expresses the set of relevant indexes more succinctly. Consequently, it guarantees a reduction in storage space requirements. Unlike previous approaches, which focus on the configuration with the minimum workload cost, we suggest considering a set of optimized solutions, and we propose a profit-effectiveness metric that helps to pick the most promising one. Through a set of experiments on the APB-1 benchmark, we show that our approach achieves better performance than similar methods, with significant savings in index storage.
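The mapping described in the abstract (queries as transactions, attributes as items) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the attribute names, the toy workload, the absolute support threshold `min_count`, and the simple level-wise enumeration are not taken from the paper, whose actual mining algorithm and cost model are not given here.

```python
from itertools import combinations

# Hypothetical workload: each query is reduced to the set of
# dimension attributes it references (names are illustrative only).
workload = [
    {"time.year", "product.family", "customer.city"},
    {"time.year", "product.family"},
    {"time.year", "product.family", "customer.city"},
    {"channel.name", "customer.city"},
    {"time.year", "channel.name"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


def frequent_itemsets(transactions, min_count):
    """Level-wise (Apriori-style) enumeration of all frequent itemsets."""
    items = sorted(set().union(*transactions))
    level = [
        frozenset([i])
        for i in items
        if support_count(frozenset([i]), transactions) >= min_count
    ]
    frequent = []
    while level:
        frequent.extend(level)
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        candidates = {
            a | b for a, b in combinations(level, 2) if len(a | b) == len(a) + 1
        }
        level = [
            c for c in candidates if support_count(c, transactions) >= min_count
        ]
    return frequent


def maximal_frequent_itemsets(transactions, min_count):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_count)
    return [s for s in frequent if not any(s < other for other in frequent)]


if __name__ == "__main__":
    for candidate in maximal_frequent_itemsets(workload, min_count=2):
        print(sorted(candidate))
```

With this toy workload and `min_count=2`, eight itemsets are frequent but only two are maximal, which is the sense in which the maximal sets describe the candidate index attributes more succinctly. How each maximal itemset is then turned into a bitmap join index and scored is left to the paper.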


Bibliographic details

Source
Intelligent Decision Technologies | 2013, Issue 4 | pp. 279-292 | 14 pages
Authors
B. Ziani; Y. Ouinten
Author affiliations

LIM, Department of Mathematics and Computer Science, Laghouat 03000, Algeria;

Department of Mathematics and Computer Science, LIM, Laghouat, Algeria;

Original format: PDF
Text language: English
Keywords
Data warehouse; database physical design; bitmap join index; data mining; maximal frequent itemsets

