Journal of Computers

Dynamic Nonuniform Data Approximation in Databases with Haar Wavelet


Abstract

A data synopsis is a lossy compressed representation of data stored in a database that helps the query optimizer speed up query processing, e.g. the time to retrieve data from the database. An efficient data synopsis must provide the query optimizer with accurate information about the distribution of the data at any point in time. Because some data are queried more often than others, a good data synopsis should support nonuniform accuracy, e.g. provide a better approximation of the data that are queried the most. Although generating the data synopsis is a critical step in achieving a good approximation of the initial data representation, the synopsis must also be updated over time when dealing with time-varying data. In this paper, we introduce new Haar wavelet synopses for nonuniform accuracy and time-varying data that can be generated in linear time and space and updated in sublinear time. We further introduce two linear algorithms, called 2-Step and M-Step, for the point-wise approximation problem that clearly outperform previous algorithms known in the literature, and two new algorithms, called DataMapping and WeightMapping, for the range-sum approximation problem that, to the best of our knowledge, represent a key research milestone as the first linear algorithms for arbitrary weights. For both scenarios, we focus not only on the generation of the data synopses but also on their updates over time. The efficiency of our new data synopses is validated against other linear methods using both synthetic and real data sets.
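
For readers unfamiliar with the term, the following minimal Python sketch shows what a Haar wavelet synopsis looks like in its simplest textbook form. It is not the 2-Step, M-Step, DataMapping, or WeightMapping algorithm from the paper. The function names (`haar_decompose`, `haar_reconstruct`, `build_synopsis`) are hypothetical. It assumes the input length is a power of two and uniform accuracy, i.e. it keeps the coefficients with the largest normalized magnitude, whereas the paper targets nonuniform weights.

```python
import heapq


def haar_decompose(data):
    """Unnormalized Haar transform of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [0.0] * n
    current = [float(x) for x in data]
    while len(current) > 1:
        half = len(current) // 2
        # Pairwise averages go on to the next (coarser) level.
        averages = [(current[2 * i] + current[2 * i + 1]) / 2 for i in range(half)]
        # Pairwise half-differences are the detail coefficients of this level.
        details = [(current[2 * i] - current[2 * i + 1]) / 2 for i in range(half)]
        coeffs[half:2 * half] = details
        current = averages
    coeffs[0] = current[0]
    return coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose (exactly, if no coefficient was dropped)."""
    n = len(coeffs)
    current = [coeffs[0]]
    while len(current) < n:
        half = len(current)
        details = coeffs[half:2 * half]
        expanded = []
        for average, detail in zip(current, details):
            expanded.extend((average + detail, average - detail))
        current = expanded
    return current


def build_synopsis(data, budget):
    """Keep the `budget` coefficients with the largest normalized magnitude.

    Assumption: uniform accuracy (plain L2 error), not the weighted
    setting studied in the paper. Dropped coefficients are set to zero.
    """
    coeffs = haar_decompose(data)
    n = len(coeffs)

    def importance(i):
        # A coefficient at level l covers n / 2**l data points.
        support = n if i == 0 else n >> (i.bit_length() - 1)
        return abs(coeffs[i]) * support ** 0.5

    keep = set(heapq.nlargest(budget, range(n), key=importance))
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


data = [2, 2, 0, 2, 3, 5, 4, 4]
synopsis = build_synopsis(data, budget=2)
approximation = haar_reconstruct(synopsis)
```

With a budget of two coefficients, the approximation replaces each half of the data with its mean. A larger budget restores finer detail, and keeping all eight coefficients reproduces the data exactly.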
