COVERED: Content-Version based Removal of Duplicates

Jyoti Malhotra; Jagdish Bakal

Abstract

Deduplication is a promising way to reclaim storage space by eliminating unwanted data, particularly duplicate and near-duplicate copies. Similar copies, an integral part of data versioning, are found everywhere. This paper presents a post-process approach that integrates data versioning, deduplication, and data archiving. Version-based and content-based similarity detection is performed by computing similarity scores with shingles and cosine similarity. Primary storage is relieved by moving older versions to the archive as ghost entries. A novel probability model decides when ghost entries are permanently removed, based on their access probabilities. The work is evaluated on real and synthetic datasets, and active storage space was successfully released by diverting unwanted data to the archive.
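The abstract names shingles and cosine similarity as the basis of the similarity score but gives no further detail. The following is a minimal sketch of that general technique, not the authors' implementation: the word-level shingle size `k=3`, the function names, and the use of shingle counts as vector components are all assumptions made for illustration.

```python
import math
from collections import Counter


def shingles(text, k=3):
    """Count the overlapping k-word shingles in text (k=3 is an assumed default)."""
    words = text.lower().split()
    if not words:
        return Counter()
    if len(words) < k:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return Counter([" ".join(words)])
    return Counter(" ".join(words[i:i + k]) for i in range(len(words) - k + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two shingle-count vectors; 0.0 if either is empty."""
    dot = sum(a[s] * b[s] for s in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_score(text_a, text_b, k=3):
    """Hypothetical helper: similarity of two document versions, in [0, 1]."""
    return cosine_similarity(shingles(text_a, k), shingles(text_b, k))


if __name__ == "__main__":
    v1 = "the quick brown fox jumps over the lazy dog"
    v2 = "the quick brown fox jumps over the lazy cat"
    print(similarity_score(v1, v2))
```

A score near 1 would mark two files as likely versions of the same content; the threshold at which a version is treated as a duplicate or sent to the archive is not stated in the abstract and would have to be chosen separately.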

Bibliographic details

Source
International Journal of Applied Engineering Research | 2017, Issue 3 | 8 pages
Authors
Jyoti Malhotra; Jagdish Bakal
Author affiliations

Department of Computer Science & Engineering, G. H. Raisoni College of Engineering, Nagpur, RTM University, Nagpur;

Department of Computer Science & Engineering, G. H. Raisoni College of Engineering, Nagpur, RTM University, Nagpur

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Basic engineering sciences
Keywords
Duplicate; Post-process; Version; Checksum; Similarity; Archive;

Similar literature

1. COVERED: Content-Version based Removal of Duplicates [J]. Jyoti Malhotra, Jagdish Bakal. International Journal of Applied Engineering Research. 2017, Issue 13aPta3
2. The Shape Interaction Matrix-Based Affine Invariant Mismatch Removal for Partial-Duplicate Image Search [J]. Yang Lin, Zhouchen Lin, Hongbin Zha. IEEE Transactions on Image Processing. 2017, Issue 2
3. MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud [J]. Bioinformatics. 2017, Issue 17
4. Duplication for the Removal of Duplication [C]. Ran Ettinger, Shmuel Tyszberowicz. 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering. 2016
5. Ammonia removal in simulated aquaculture wastewater by biofilm-covered zeolite [D]. Mosquera Cadiz, Tomas Santiago. 2016
6. Surgical repair in case of covered exstrophy of bladder with complete duplication of lower genitourinary tract and visceral sequestration [O]. Sachin Sarode, Sunil Mhaske, Vinayak G. Wagaskar. 2018
7. Research on Duplicated Documentation Removal Model Based on Information Entropy and Decision Classification Techniques [O]. 2016
