Published in: IEEE International Smart Cities Conference

Large-Scale Distributed Linkage of Records Containing Spatio-Temporal Information



Abstract

Spatio-temporal information is increasingly available in modern data sets, alongside traditional numerical and categorical attributes. Such information can play a vital role in deciding whether two records from disparate data sources correspond to the same real-world entity. Linking records that contain spatio-temporal information requires novel linkage methods and usually incurs significant computational overhead. To reduce computational costs, we propose in this paper the first Spark-based approach for distributed, on-demand, spatio-temporal linkage. Our experimental evaluation shows that the Spark-based approach achieves, on average, a 35% performance improvement over the corresponding Map/Reduce-based implementation.
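
The abstract does not describe the linkage method itself. Purely as an illustration of the kind of decision involved (whether two records from different sources are close enough in space and time to refer to the same entity), here is a minimal single-machine sketch in Python. The record fields (id, lat, lon, t), the thresholds MAX_DIST_M and MAX_GAP_S, and the time-bucket blocking step are all assumptions made for this example; they are not taken from the paper, and the sketch does not use Spark.

```python
import math
from collections import defaultdict

# Hypothetical thresholds; the abstract gives no values.
MAX_DIST_M = 100.0   # maximum spatial distance, in metres
MAX_GAP_S = 600.0    # maximum time difference, in seconds


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def is_match(a, b):
    """Two records match if they are close in both space and time."""
    close_in_space = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"]) <= MAX_DIST_M
    close_in_time = abs(a["t"] - b["t"]) <= MAX_GAP_S
    return close_in_space and close_in_time


def link(left, right):
    """Return (left_id, right_id) pairs that satisfy is_match.

    Records are grouped into time buckets of width MAX_GAP_S so that each
    left record is only compared with right records in the same or an
    adjacent bucket. Any pair within MAX_GAP_S of each other always falls
    into buckets that differ by at most one, so no match is missed.
    """
    buckets = defaultdict(list)
    for rec in right:
        buckets[int(rec["t"] // MAX_GAP_S)].append(rec)

    pairs = []
    for a in left:
        k = int(a["t"] // MAX_GAP_S)
        for kk in (k - 1, k, k + 1):
            for b in buckets.get(kk, []):
                if is_match(a, b):
                    pairs.append((a["id"], b["id"]))
    return pairs
```

The bucketing step is one simple way to avoid comparing every pair of records. In a distributed setting, a similar key could serve as a partitioning or join key, but how the paper's approach actually partitions the work is not stated in the abstract.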
