Performance Optimization of Large Files Writes to Ceph Based on Multiple Pipelines Algorithm

Abstract

As cloud storage platform software, Ceph can be used to build petabyte-scale storage systems from commodity hardware [1] [2] [3]. In previous work, we optimized the performance of reads from and writes to the Ceph storage cluster using multi-threaded algorithms [4]. The experimental results indicated that small-file reads and writes and large-file reads improved markedly, but large-file writes to Ceph using the single-pipeline algorithm showed no evident improvement. To address this problem, we use a multiple-pipelines algorithm to optimize the performance of large-file writes to the Ceph storage cluster. The experimental results show that, with the data block size set to 10 MB, the maximum performance improvement of the multiple-pipelines algorithm on a machine with two logical CPUs is 100.70%. However, because of limitations of Python's multi-threading mechanism, such as the Python GIL and thread thrashing, the performance of the multiple-pipelines algorithm on multi-core machines does not increase linearly. In future work, we intend to optimize the large-file write algorithm using the C++ application programming interface of Ceph.
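
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it names: a large file is cut into fixed-size data blocks and several writer threads ("pipelines") drain those blocks concurrently. The function name `pipelined_write`, the `write_block(offset, data)` callback, and the bounded queue are illustrative assumptions, not the authors' implementation. Against a real cluster, the callback could plausibly wrap a call such as `Ioctx.write(key, data, offset)` from Ceph's Python librados binding.

```python
import io
import queue
import threading
from typing import BinaryIO, Callable

# 10 MB, the data block size quoted in the abstract.
BLOCK_SIZE = 10 * 1024 * 1024


def pipelined_write(
    src: BinaryIO,
    write_block: Callable[[int, bytes], None],
    num_pipelines: int = 2,
    block_size: int = BLOCK_SIZE,
) -> int:
    """Split src into blocks and write them through several worker threads.

    write_block is a hypothetical callback that stores `data` at `offset`
    in the target object. Returns the total number of bytes handed out.
    """
    # Bounded queue so the reader cannot run far ahead of the writers.
    tasks: queue.Queue = queue.Queue(maxsize=num_pipelines * 2)
    errors = []

    def worker() -> None:
        while True:
            item = tasks.get()
            if item is None:  # sentinel: no more blocks
                break
            offset, data = item
            try:
                write_block(offset, data)
            except Exception as exc:  # keep draining so the reader never blocks
                errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(num_pipelines)]
    for t in threads:
        t.start()

    offset = 0
    while True:
        data = src.read(block_size)
        if not data:
            break
        tasks.put((offset, data))
        offset += len(data)

    for _ in threads:  # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()

    if errors:
        raise errors[0]
    return offset


if __name__ == "__main__":
    # In-memory stand-in for the storage cluster, for demonstration only.
    payload = bytes(range(256)) * 64
    target = bytearray(len(payload))

    def fake_write(off: int, data: bytes) -> None:
        target[off:off + len(data)] = data

    written = pipelined_write(io.BytesIO(payload), fake_write, block_size=1024)
    print(written, bytes(target) == payload)
```

Because every block carries its own offset, the workers may finish in any order. In CPython the threads still share one GIL, which is consistent with the abstract's remark that the speedup does not scale linearly with the number of cores.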