Journal: International Journal of Semantic Computing

Towards Update-Efficient and Parallel-Friendly Content-Based Indexing Scheme in Cloud Computing


Abstract

A vast volume of content generated by today's Internet services is stored in the cloud, and an effective indexing method is essential for delivering that content to users on demand. Indexing methods that associate user-generated metadata with the content are vulnerable to inaccuracy caused by low-quality metadata. Content-based indexing does not depend on error-prone metadata, but state-of-the-art research focuses on developing descriptive features and overlooks the system-oriented considerations that arise when these features are incorporated into practical cloud computing systems. We propose an update-efficient and parallel-friendly content-based indexing system called Partitioned Hash Forest (PHF). PHF incorporates state-of-the-art content-based indexing models and multiple system-oriented optimizations. It contains an approximate content-based index and leverages the hierarchical memory system to support a high volume of updates. Additionally, its content-aware data partitioning and lock-free concurrency management modules enable parallel processing of concurrent user requests. We evaluate PHF in terms of indexing accuracy and system efficiency by comparing it with a state-of-the-art content-based indexing algorithm and its variants. PHF achieves significantly better accuracy with less resource consumption, processes updates around 37% faster, and delivers up to 2.5× throughput speedup on a multi-core platform compared with other parallel-friendly designs.
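
To make the idea of an approximate content-based index with content-aware partitioning more concrete, here is a minimal Python sketch. It is not the PHF design, which the abstract does not spell out. It swaps in a generic technique, random-hyperplane locality-sensitive hashing, and routes each item to a partition by the leading bits of its signature. The class name `PartitionedHashIndex` and the parameters `num_bits` and `partition_bits` are hypothetical, and the sketch omits the hierarchical memory and lock-free concurrency pieces.

```python
import random
from collections import defaultdict


class PartitionedHashIndex:
    """Toy index (illustrative, not PHF): random-hyperplane LSH
    plus partitioning by signature prefix."""

    def __init__(self, dim, num_bits=16, partition_bits=2, seed=0):
        rng = random.Random(seed)
        # One random hyperplane per signature bit (hypothetical parameters).
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)
        ]
        self.partition_bits = partition_bits
        # Each partition owns its own bucket table, so one worker per
        # partition could apply updates without touching the others.
        self.partitions = [defaultdict(set) for _ in range(1 << partition_bits)]

    def signature(self, vec):
        # Bit i is 1 when vec lies on the non-negative side of hyperplane i.
        return tuple(
            1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
            for plane in self.planes
        )

    def partition_of(self, sig):
        # Content-aware routing: the leading signature bits pick the partition.
        pid = 0
        for bit in sig[: self.partition_bits]:
            pid = (pid << 1) | bit
        return pid

    def insert(self, item_id, vec):
        sig = self.signature(vec)
        self.partitions[self.partition_of(sig)][sig].add(item_id)

    def query(self, vec):
        # Return a copy of the bucket whose signature matches the query.
        sig = self.signature(vec)
        return set(self.partitions[self.partition_of(sig)].get(sig, set()))


# Usage: feature vectors here are made-up stand-ins for content descriptors.
index = PartitionedHashIndex(dim=4)
index.insert("img-1", [0.9, 0.1, 0.0, 0.2])
index.insert("img-2", [-0.5, 0.7, -0.3, 0.1])
print(index.query([0.9, 0.1, 0.0, 0.2]))
```

Because the partition is derived from the content signature itself, similar items tend to land in the same partition, and requests for different partitions never share a bucket table. That is one plausible reason a content-aware split can help parallel processing, though the abstract does not say how PHF partitions its data.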
