Semantic-based intelligent data clean framework for big data


Abstract

To overcome the limitations of existing data cleansing methods when applied to massive data, this paper proposes a generic semantic-based framework that uses a parallelized processing model for effective big data cleansing. We also use an improved Semantic-Based Keyword Matching Algorithm to handle duplicate data. Experimental results show that the parallelized framework with the improved Semantic-Based Keyword Matching Algorithm identifies duplicates with high recall and precision and performs well for big data cleansing.


Bibliographic record

Source
2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics | 2014 | pp. 448-453 | 6 pages
Conference location: Wuhan (CN)
Authors
Wang Jia; Song Zhijun; Li Qian; Yu Jun; Chen Fei;
Author affiliation

The 28th Research Institute of China Electronics Technology Group Corporation, Nanjing, China;

Conference organizer
Original format: PDF
Language of text: eng
Chinese Library Classification
Keywords
Big data; Cleaning; Companies; Data models; Encoding; Real-time systems; Semantics; Semantic-based Keyword Matching; big data; data cleansing; parallelized processing; semantic-based framework;


Similar literature


1. Object Oriented Intelligent Multi-Agent System Data Cleaning Architecture To Clean Email Data [J]. Dr. G. Arumugam, T. Joshva Devadas. International Journal of Engineering Science and Technology. 2010, No. 11

2. Semantic-based Big Data integration framework using scalable distributed ontology matching strategy [J]. Mountasser Imadeddine, Ouhbi Brahim, Hdioud Ferdaous. Distributed and Parallel Databases. 2021, No. 4

3. An Intelligent Data Service Framework for Heterogeneous Data Sources [J]. Khan Fakhri Alam, Rehman Mujeeb Ur, Khalid Afsheen. Journal of Grid Computing. 2019, No. 3

4. Cleaning Framework for BigData: An Interactive Approach for Data Cleaning [C]. Hong Liu, Ashwin Kumar Tk, Johnson P Thomas. IEEE International Conference on Big Data Computing Service and Applications. 2016

5. Cleaning Framework for Big Data [D]. Liu, Hong. 2017

6. TORNADO: Intermediate Results Orchestration Based Service-Oriented Data Curation Framework for Intelligent Video Big Data Analytics in the Cloud [O]. Aftab Alam, Young-Koo Lee. 2020

7. Object Oriented Intelligent Multi-Agent System Data Cleaning Architecture to clean Preference based Text Data [O]. Dr. G. Arumugam, T. Joshva Devadas, Madurai Kamaraj. 2011

8. Semantic-Based Concurrency Control for Object-Oriented Database Systems Supporting Real-Time Applications [R]. Lee, J., Son, S. H. 1994
