Conference: International Conference on Very Large Data Bases

Auto-Join: Joining Tables by Leveraging Transformations


Abstract

Traditional equi-join relies solely on string equality comparisons to perform joins. However, in scenarios such as ad-hoc data analysis in spreadsheets, users increasingly need to join tables whose join-columns are from the same semantic domain but use different textual representations, for which transformations are needed before equi-join can be performed. We developed Auto-Join, a system that can automatically search over a rich space of operators to compose a transformation program, whose execution makes input tables equi-join-able. We developed an optimal sampling strategy that allows Auto-Join to scale to large datasets efficiently, while ensuring joins succeed with high probability. Our evaluation using real test cases collected from both public web tables and proprietary enterprise tables shows that the proposed system performs the desired transformation joins efficiently and with high quality.
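
To make the idea of a transformation join concrete, the following is a minimal sketch under stated assumptions, not the Auto-Join system itself. Auto-Join searches for the transformation program automatically; here the program (`split_concat_program`) is written by hand, and the function names, column names, and toy rows are all hypothetical. The sketch only shows how applying a transformation to one join column can make two tables equi-join-able.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def split_concat_program(value: str) -> str:
    """Hand-written, hypothetical transformation: 'First Last' -> 'Last, First'."""
    first, _, last = value.partition(" ")
    if not last:
        # No space found: leave the value unchanged.
        return value
    return f"{last}, {first}"


def transform_join(
    left: List[Row],
    right: List[Row],
    left_key: str,
    right_key: str,
    program: Callable[[str], str],
) -> List[Row]:
    """Apply `program` to the left key column, then do a plain equi-join."""
    # Hash index over the right table's key column.
    index: Dict[str, List[Row]] = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    joined: List[Row] = []
    for row in left:
        derived_key = program(row[left_key])
        # String equality on the derived key is an ordinary equi-join.
        for match in index.get(derived_key, []):
            joined.append({**row, **match})
    return joined


# Toy rows, invented for illustration only.
people = [
    {"name": "Ada Lovelace", "team": "Analytics"},
    {"name": "Alan Turing", "team": "Research"},
]
offices = [
    {"employee": "Lovelace, Ada", "office": "London"},
    {"employee": "Turing, Alan", "office": "Manchester"},
]

rows = transform_join(people, offices, "name", "employee", split_concat_program)
for r in rows:
    print(r)
```

A direct equi-join on `name` and `employee` would return no rows here, because the two columns represent the same values in different textual forms. The join only succeeds after the transformation is applied.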
