Conference: International Conference on Computer Design

RepEC-Duet: Ensure High Reliability and Performance for Deduplicated and Delta-Compressed Storage Systems

Abstract

Data deduplication is a widely deployed technique that removes duplicate content to save storage space, but it cannot eliminate the redundancy between non-identical but similar data blocks. To achieve further space savings in deduplicated storage systems, delta compression is employed to compress post-deduplication data. Both deduplication and delta compression introduce content references among blocks, which inevitably undermines the reliability of deduplicated and delta-compressed storage systems. To improve reliability, existing approaches use either replication or erasure codes to distribute data redundantly across multiple nodes. In deduplicated and delta-compressed storage systems, we observe that delta-compressed chunks (DCCs) are far smaller than regular chunks, called non-DCCs. Motivated by this observation, we suggest a straightforward approach in which replication protects DCCs and erasure coding protects non-DCCs. However, two critical challenges must be addressed to make this solution effective. First, the random placement of DCC replicas destroys cache locality. Second, keeping the recovery cache and the restore cache separate and independent can cause storage containers to be accessed repeatedly. To address these two challenges, we propose RepEC-Duet, which employs both replication and erasure codes to ensure high reliability and performance for deduplicated and delta-compressed storage systems. RepEC-Duet introduces a delta-utilization-aware filter that selects and replicates containers based on the percentage of DCCs they contain, in order to maintain cache locality. Moreover, to avoid unnecessary container reads, we design a cooperative cache scheme that is aware of both failure recovery and the regular restore cache. Our experimental results on three real-world datasets demonstrate that RepEC-Duet improves restore performance by 26%-59% and reduces storage overhead by 54%-98% compared with existing approaches.
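The hybrid protection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Container` fields, the function names, the 0.5 threshold, 3-way replication, and the (k=6, m=3) erasure-code parameters are all hypothetical choices made for illustration. The abstract also does not say whether the DCC percentage is measured by chunk count or by bytes; the sketch assumes chunk count.

```python
from dataclasses import dataclass


@dataclass
class Container:
    # All fields are hypothetical; the abstract does not define a container layout.
    container_id: int
    dcc_chunks: int       # number of delta-compressed chunks (DCCs)
    non_dcc_chunks: int   # number of regular chunks (non-DCCs)
    size_bytes: int       # total stored size of the container

    @property
    def dcc_ratio(self) -> float:
        """Fraction of chunks in this container that are DCCs (assumed metric)."""
        total = self.dcc_chunks + self.non_dcc_chunks
        return self.dcc_chunks / total if total else 0.0


def choose_protection(containers, threshold=0.5):
    """Map each container id to 'replication' or 'erasure_code'.

    Containers whose DCC ratio reaches the (assumed) threshold are replicated
    as a whole, so their chunks stay together; the rest are erasure coded.
    """
    return {
        c.container_id: "replication" if c.dcc_ratio >= threshold else "erasure_code"
        for c in containers
    }


def redundancy_bytes(containers, plan, replicas=3, k=6, m=3):
    """Extra bytes stored beyond one copy of the data.

    r-way replication adds (r - 1) full copies; a (k, m) erasure code
    adds m parity units per k data units, i.e. size * m / k.
    """
    extra = 0.0
    for c in containers:
        if plan[c.container_id] == "replication":
            extra += c.size_bytes * (replicas - 1)
        else:
            extra += c.size_bytes * m / k
    return extra


if __name__ == "__main__":
    containers = [
        Container(1, dcc_chunks=90, non_dcc_chunks=10, size_bytes=4096),
        Container(2, dcc_chunks=5, non_dcc_chunks=95, size_bytes=4096),
    ]
    plan = choose_protection(containers)
    print(plan)
    print(redundancy_bytes(containers, plan))
```

Selecting whole containers, rather than individual DCCs, is what keeps replicated chunks co-located in this sketch; that mirrors the cache-locality motivation stated in the abstract, although the actual selection policy may differ.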