Source: Proof of Designed Reliability (foreign-language conference listing)

Modeling and managing content changes in text databases


Abstract

Large amounts of (often valuable) information are stored in Web-accessible text databases. "Metasearchers" provide unified interfaces to query multiple such databases at once. For efficiency, metasearchers rely on succinct statistical summaries of the database contents to select the best databases for each query. So far, database selection research has largely assumed that databases are static, so the associated statistical summaries do not need to change over time. However, databases are rarely static and the statistical summaries that describe their contents need to be updated periodically to reflect content changes. In this paper, we first report the results of a study showing how the content summaries of 152 real Web databases evolved over a period of 52 weeks. Then, we show how to use "survival analysis" techniques in general, and Cox's proportional hazards regression in particular, to model database changes over time and predict when we should update each content summary. Finally, we exploit our change model to devise update schedules that keep the summaries up to date by contacting databases only when needed, and then we evaluate the quality of our schedules experimentally over real Web databases.
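The abstract does not spell out the model, so the following is only a minimal sketch of how a proportional hazards model could drive an update schedule. It uses the textbook form of Cox's model, in which a database's hazard of "changing enough to need a new summary" is a baseline hazard multiplied by exp(beta . x). The constant baseline rate, the covariate vector `x`, the coefficients `beta`, and the survival threshold are all illustrative assumptions, not values or choices taken from the paper.

```python
import math


def hazard_ratio(beta, x):
    """Multiplicative effect exp(beta . x) of covariates x on the baseline hazard."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))


def survival(t, baseline_rate, beta, x):
    """Probability that a summary is still valid after t weeks.

    Assumes (for illustration only) a constant baseline hazard, so that
    S(t | x) = exp(-baseline_rate * t * exp(beta . x)).
    Cox's model itself leaves the baseline hazard unspecified.
    """
    return math.exp(-baseline_rate * t * hazard_ratio(beta, x))


def weeks_until_update(threshold, baseline_rate, beta, x):
    """Time t at which S(t | x) falls to the given threshold (0 < threshold < 1)."""
    return -math.log(threshold) / (baseline_rate * hazard_ratio(beta, x))


if __name__ == "__main__":
    # Hypothetical coefficient and covariates: one feature per database.
    beta = [0.7]
    slow_db, fast_db = [0.0], [2.0]
    for name, x in (("slow_db", slow_db), ("fast_db", fast_db)):
        t = weeks_until_update(0.5, baseline_rate=0.1, beta=beta, x=x)
        print(f"{name}: refresh summary after about {t:.1f} weeks")
```

Under these assumptions, a database whose covariates raise its hazard is scheduled for an earlier refresh, which is the intuition behind contacting databases only when needed.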
