Query-Log Aware Replicated Declustering

Turk Ata; Yasin Oktay Kerim; Aykanat Cevdet

Abstract

Data declustering and replication can be used to reduce the I/O time associated with processing data-intensive queries. Declustering parallelizes query retrieval by distributing the data items requested by queries among several disks. Replication enables alternative disk choices for individual data items and thus provides better query parallelism options. In general, existing replicated declustering schemes do not consider query-log information and try to optimize all possible queries of a specific query type, such as range or spatial queries. Such schemes assume that two or more copies of every data item are generated, and they discuss how these copies are scheduled onto disks. However, in some applications, generating even two copies of every data item is not feasible, since data items tend to be very large. In this work, we assume that there is a given limit on disk capacities and thus on the amount of replication. We utilize existing query-log information to propose a selective replicated declustering scheme, in which we select the data items to be replicated and decide on their scheduling onto disks while respecting disk capacities. We propose and implement an iterative improvement algorithm to obtain a two-way replicated declustering and use this algorithm in a recursive framework to generate a multiway replicated declustering. We then improve the resulting multiway replicated declustering with efficient refinement heuristics. Experiments conducted on realistic data sets show that the proposed scheme yields better performance than existing replicated declustering schemes.
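
The effect of selective replication on a query log can be illustrated with a small sketch. It is not the iterative improvement algorithm described in the abstract. It only evaluates a given placement, under two assumptions that the abstract does not state: every data item takes one unit of time to read, and a query's response time is the largest number of items read from any single disk. The names query_cost, log_cost, base, and replicated are hypothetical, and the exhaustive search over replica choices is exponential, so it suits toy inputs only.

```python
from itertools import product


def query_cost(query, placement):
    """Smallest achievable maximum per-disk load for one query.

    query:     set of item ids requested together
    placement: dict mapping item id -> set of disks holding a copy
    Assumes unit read time per item (an illustrative assumption).
    """
    items = sorted(query)
    if not items:
        return 0
    best = len(items)  # worst case: every item read from the same disk
    # Try every way of picking one replica (disk) per requested item.
    for choice in product(*(sorted(placement[i]) for i in items)):
        load = {}
        for disk in choice:
            load[disk] = load.get(disk, 0) + 1
        best = min(best, max(load.values()))
    return best


def log_cost(query_log, placement):
    """Total cost of all queries in the log under the given placement."""
    return sum(query_cost(q, placement) for q in query_log)


# Toy query log over four items stored on two disks (0 and 1).
query_log = [{"a", "b"}, {"c", "d"}, {"a", "c"}]

# No replication: each item has exactly one copy.
base = {"a": {0}, "b": {0}, "c": {1}, "d": {1}}

# Selective replication: only item "b" gets a second copy, on disk 1.
replicated = {"a": {0}, "b": {0, 1}, "c": {1}, "d": {1}}

print(log_cost(query_log, base))        # 5
print(log_cost(query_log, replicated))  # 4
```

In this toy log, replicating the single item "b" lets the query {"a", "b"} be served from both disks at once, which lowers the total cost without copying every item. Choosing which items to replicate within the disk capacity limits is the problem the proposed scheme addresses.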


Bibliographic details

Source
Parallel and Distributed Systems, IEEE Transactions on, 2013, No. 5, pp. 987-995 (9 pages)
Authors
Turk Ata; Yasin Oktay Kerim; Aykanat Cevdet
Affiliation
Bilkent University, Ankara
Original format: PDF
Language: English
Keywords
Computer architecture; Distributed databases; Equations; Gain; Optimal scheduling; Query processing; Time factors; Declustering; iterative improvement heuristics; parallel disk architectures; replication

