Journal: Knowledge and Data Engineering, IEEE Transactions on

Exploring Correlated Subspaces for Efficient Query Processing in Sparse Databases


Abstract

Sparse data are becoming increasingly common in many real-life applications. However, relatively little attention has been paid to modeling sparse data effectively, and existing approaches such as the conventional "horizontal" and "vertical" representations fail to provide satisfactory performance for both storage and query processing, because they are too rigid and generally do not consider dimension correlations. In this paper, we propose a new approach, named HoVer, to store and query sparse data sets in an unmodified RDBMS, where HoVer stands for horizontal representation over vertically partitioned subspaces. Based on the dimension correlations of a sparse data set, a novel mechanism vertically partitions the high-dimensional data set into multiple lower-dimensional subspaces, such that dimensions are highly correlated within each subspace and largely unrelated across subspaces. Original data objects can therefore be represented in the horizontal format within their respective subspaces. With the HoVer representation, users write SQL queries over the original horizontal view, and these queries are easily rewritten into queries over the subspace tables. Experiments over synthetic and real-life data sets show that our approach is effective in finding correlated subspaces and yields superior performance for the storage and querying of sparse data.
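The idea described in the abstract can be illustrated with a small sketch. It is not the paper's partitioning mechanism: the Jaccard co-occurrence measure, the greedy grouping, the threshold of 0.3, the toy data, and the names `sub0`, `sub1`, `obj_id`, `partition_dimensions`, `build_hover_tables`, and `rewrite_query` are all assumptions made for illustration. The sketch groups dimensions that tend to be non-null together, stores each group as its own horizontal table in SQLite, and rewrites a single-predicate query on the horizontal view into a query over the subspace tables.

```python
import sqlite3

# Toy sparse data set: each object lists only its non-null dimensions.
objects = {
    1: {"isbn": "111", "author": "Lee"},
    2: {"isbn": "222", "author": "Kim", "pages": 300},
    3: {"megapixels": 12, "zoom": 3},
    4: {"megapixels": 20, "zoom": 5},
    5: {"pages": 120, "isbn": "333"},
}

def jaccard(a, b):
    """Co-occurrence of two dimensions, computed over the ids of objects that set them."""
    return len(a & b) / len(a | b)

def partition_dimensions(objects, threshold=0.3):
    """Greedy grouping (an assumed stand-in for the paper's mechanism):
    a dimension joins the first subspace holding a sufficiently co-occurring dimension."""
    support = {}
    for oid, attrs in objects.items():
        for dim in attrs:
            support.setdefault(dim, set()).add(oid)
    subspaces = []
    for dim in sorted(support):
        for group in subspaces:
            if any(jaccard(support[dim], support[d]) >= threshold for d in group):
                group.append(dim)
                break
        else:
            subspaces.append([dim])
    return subspaces

def build_hover_tables(conn, objects, subspaces):
    """Create one horizontal table per subspace; objects with no value in a subspace get no row."""
    for i, dims in enumerate(subspaces):
        cols = ", ".join(f'"{d}"' for d in dims)
        conn.execute(f"CREATE TABLE sub{i} (obj_id INTEGER PRIMARY KEY, {cols})")
        for oid, attrs in objects.items():
            if any(d in attrs for d in dims):
                row = [oid] + [attrs.get(d) for d in dims]
                marks = ", ".join("?" * len(row))
                conn.execute(f"INSERT INTO sub{i} VALUES ({marks})", row)

def rewrite_query(columns, pred_dim, pred_sql, subspaces):
    """Rewrite 'SELECT columns WHERE pred_dim <pred_sql>' on the horizontal view
    into SQL over the subspace tables. Toy code: pred_sql is trusted input."""
    home = {d: f"sub{i}" for i, dims in enumerate(subspaces) for d in dims}
    base = home[pred_dim]
    others = sorted({home[c] for c in columns} - {base})
    select = ", ".join(f'{home[c]}."{c}"' for c in columns)
    sql = f"SELECT {base}.obj_id, {select} FROM {base}"
    for table in others:
        sql += f" LEFT JOIN {table} ON {table}.obj_id = {base}.obj_id"
    return sql + f' WHERE {base}."{pred_dim}" {pred_sql}'

subspaces = partition_dimensions(objects)
conn = sqlite3.connect(":memory:")
build_hover_tables(conn, objects, subspaces)
sql = rewrite_query(["zoom"], "megapixels", "> 15", subspaces)
rows = conn.execute(sql).fetchall()
print(subspaces)  # [['author', 'isbn', 'pages'], ['megapixels', 'zoom']]
print(sql)
print(rows)       # [(4, 5)]
```

In this toy run the book-like and camera-like dimensions end up in separate tables, so the query on `megapixels` and `zoom` touches only one narrow table and never scans rows for the book objects.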
