Venue: ACM SIGMOD International Conference on Management of Data

Sequential sampling procedures for query size estimation

Abstract

We provide a procedure, based on random sampling, for estimation of the size of a query result. The procedure is sequential in that sampling terminates after a random number of steps according to a stopping rule that depends upon the observations obtained so far. Enough observations are obtained so that, with a prespecified probability, the estimate differs from the true size of the query result by no more than a prespecified amount. Unlike previous sequential estimation procedures for queries, our procedure is asymptotically efficient and requires no ad hoc pilot sample or a priori assumptions about data characteristics. In addition to establishing the asymptotic properties of the estimation procedure, we provide techniques for reducing undercoverage at small sample sizes and show that the sampling cost of the procedure can be reduced through stratified sampling techniques.
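
The general idea of a sequential stopping rule can be illustrated with a minimal sketch. This is not the procedure from the paper: it assumes a simple selection query over an in-memory list, uniform sampling with replacement, and a normal-approximation confidence interval as the stopping criterion. The function name `sequential_size_estimate` and the parameters `epsilon`, `confidence`, `min_samples`, and `max_samples` are hypothetical. The `min_samples` floor is only a crude guard against stopping too early on a tiny sample and is not one of the paper's undercoverage techniques.

```python
import math
import random
from statistics import NormalDist


def sequential_size_estimate(rows, predicate, epsilon, confidence=0.95,
                             min_samples=30, max_samples=1_000_000, seed=None):
    """Estimate how many items in `rows` satisfy `predicate`.

    Sampling continues until the approximate confidence-interval half-width
    for the estimated count is at most `epsilon` (an absolute error bound),
    or until `max_samples` draws have been made.
    Returns (estimated_count, number_of_samples_drawn).
    """
    rng = random.Random(seed)
    total = len(rows)
    # Two-sided standard normal quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)

    n = 0
    hits = 0
    while n < max_samples:
        row = rows[rng.randrange(total)]  # uniform draw with replacement
        n += 1
        if predicate(row):
            hits += 1

        if n < min_samples:
            continue  # assumed floor: do not test the rule on tiny samples

        p = hits / n
        # Sample variance of the 0/1 indicator observations seen so far.
        var = p * (1.0 - p) * n / (n - 1)
        # Half-width of the interval for the count (total * proportion).
        half_width = z * total * math.sqrt(var / n)
        if half_width <= epsilon:
            break  # stopping rule depends only on observations so far

    return total * hits / n, n


if __name__ == "__main__":
    data = list(range(10_000))
    estimate, draws = sequential_size_estimate(
        data, lambda x: x % 2 == 0, epsilon=200, seed=0)
    print(f"estimated size {estimate:.0f} after {draws} draws")
```

The number of draws is itself random, because the variance estimate that drives the stopping test changes with each observation. A rule of this naive form can stop too soon when the early observations happen to show little variance, which is the kind of small-sample undercoverage the abstract refers to.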

