IEEE International Conference on Fuzzy Systems

Efficient Generation of Reliable Estimated Linguistic Summaries


Abstract

Summarizing data with linguistic statements is a crucial and topical issue that has been widely addressed by the soft computing community. The goal of summarization is to generate statements that linguistically describe the properties observed in a dataset. This paper addresses the issue of efficiently extracting these summaries and rendering them to the end user when the data to be summarized are stored in a relational database: it proposes a novel strategy that leverages the statistics about the data distribution maintained by the database system. The paper shows that reliable summaries can be estimated very efficiently from these statistics alone, without any costly data access. Additionally, it proposes a visualization of the set of extracted summaries that gives the user an interactive exploration tool. Experiments performed on two real databases show the relevance and efficiency of the proposed approach: with a negligible loss of accuracy, we provide the first linguistic summarization approach whose processing time does not depend on the size of the dataset. The generation of estimated linguistic summaries takes less than one second, even for datasets containing millions of tuples.
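The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the database statistics are available as a histogram of (lower bound, upper bound, tuple count) buckets, that values are uniformly distributed inside each bucket, and that the summary has the classical form "Q tuples are A" with a trapezoidal fuzzy set A. The names `Bucket`, `trapezoid`, `estimated_cover`, and `most`, as well as the thresholds of the quantifier, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    """One histogram bucket, as a DBMS might keep it (assumed layout)."""
    lo: float
    hi: float
    count: int


def trapezoid(x, a, b, c, d):
    """Membership of x in a trapezoidal fuzzy set: support [a, d], core [b, c]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def estimated_cover(buckets, membership, samples_per_bucket=16):
    """Estimate the fraction of tuples satisfying a fuzzy term.

    Only the histogram is read, never the tuples. Values are assumed
    uniform inside each bucket; the mean membership of a bucket is
    approximated with the midpoint rule.
    """
    total = sum(b.count for b in buckets)
    if total == 0:
        return 0.0
    sigma_count = 0.0
    for b in buckets:
        step = (b.hi - b.lo) / samples_per_bucket
        mean_mu = sum(
            membership(b.lo + (i + 0.5) * step)
            for i in range(samples_per_bucket)
        ) / samples_per_bucket
        sigma_count += b.count * mean_mu
    return sigma_count / total


def most(p):
    """Hypothetical 'most' quantifier: 0 below 0.3, 1 above 0.8, linear between."""
    return min(1.0, max(0.0, (p - 0.3) / 0.5))


# Illustrative histogram of an 'age' column (made-up numbers).
age_histogram = [
    Bucket(0, 20, 100),
    Bucket(20, 40, 600),
    Bucket(40, 60, 200),
    Bucket(60, 80, 100),
]


def young(x):
    """Illustrative fuzzy term: fully 'young' up to 20, not at all beyond 40."""
    return trapezoid(x, 0, 0, 20, 40)


cover = estimated_cover(age_histogram, young)
truth_most_young = most(cover)
print(f"estimated cover of 'young': {cover:.3f}")
print(f"truth of 'most tuples are young': {truth_most_young:.3f}")
```

The cost of this loop depends on the number of buckets, not on the number of tuples, which is the property the abstract highlights. The accuracy of the estimate depends on how well the uniform-within-bucket assumption holds.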