FIPIP: A novel fine-grained parallel partition based intra-frame prediction on heterogeneous many-core systems

Wenbin Jiang; Min Long; Laurence T. Yang; Xiaobai Liu; Hai Jin; Alan L. Yuille; Ye Chi


Abstract

Intra-frame prediction is an important and time-consuming component of the widely used H.264/AVC encoder. One promising way to speed up prediction is to introduce parallelism, and many approaches based on heterogeneous many-core systems have been proposed. Most of these approaches, however, are limited by their use of highly irregular prediction formulas, which require a significant number of branch instructions. They also use only coarse-grained parallel partition, which treats blocks or sub-regions of images as the parallel processing units. In this paper, by contrast, we propose a fine-grained intra-frame prediction approach based on parallel partition (FIPIP) and implement it on Graphics Processing Unit (GPU) based heterogeneous many-core systems. The approach has four main characteristics. First, it takes individual pixels, instead of blocks, as the parallel processing units. Pixel-level parallelism can fully exploit the computational power of heterogeneous GPU-based systems and hence greatly reduces the encoding time. Second, we unify the irregular prediction formulas of intra-frame prediction into a single uniform formula and propose a table-lookup method to perform intra-frame prediction efficiently. The formula eliminates unnecessary branch instructions by using a unified predictor array, which significantly improves the efficiency of the fine-grained parallel partition. Third, two optimized encoding orders, assisted by an improved combined frame strategy, are adopted to implement multi-level parallelism. Finally, an efficient self-synchronizing method is realized for fine-grained task scheduling on heterogeneous CPU-GPU architectures. We apply FIPIP to encode a set of benchmark videos under varying conditions and compare it with other popular intra-frame prediction methods. Results show that FIPIP outperforms existing state-of-the-art work, with speedup factors of 2 to 6.
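The abstract does not give the unified formula itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical predictor-array layout `P` (top and top-right neighbours, left neighbours, top-left neighbour, plus a precomputed DC slot) and a hypothetical three-tap form `(P[i0] + 2*P[i1] + P[i2] + 2) >> 2`, with the indices looked up per (mode, row, column). Only four of the H.264 Intra 4x4 modes are covered (vertical, horizontal, DC, diagonal down-left). Every pixel is computed by the same branch-free expression, which is what would let one GPU thread handle one pixel; the Python loop here is serial and only illustrates the arithmetic.

```python
# Hypothetical layout of the predictor array P for one 4x4 luma block
# (an assumption for illustration, not taken from the paper):
#   P[0..7]  = top and top-right neighbours
#   P[8..11] = left neighbours
#   P[12]    = top-left neighbour (unused by the four modes below)
#   P[13]    = precomputed DC value
TOP, LEFT, DC_SLOT = 0, 8, 13

# H.264 Intra 4x4 mode numbers for the modes covered here.
VERTICAL, HORIZONTAL, DC, DIAG_DOWN_LEFT = 0, 1, 2, 3


def build_table():
    """Map (mode, y, x) to the three predictor indices used by that pixel."""
    table = {}
    for y in range(4):
        for x in range(4):
            # Repeating one index makes the 3-tap filter return P[i] unchanged.
            table[(VERTICAL, y, x)] = (TOP + x,) * 3
            table[(HORIZONTAL, y, x)] = (LEFT + y,) * 3
            table[(DC, y, x)] = (DC_SLOT,) * 3
            # Diagonal down-left: clamping the last index to 7 reproduces
            # the special case for the bottom-right pixel.
            s = x + y
            table[(DIAG_DOWN_LEFT, y, x)] = (s, s + 1, min(s + 2, 7))
    return table


TABLE = build_table()


def make_predictor_array(top8, left4, top_left):
    """Pack neighbours and the DC value into one flat array."""
    dc = (sum(top8[:4]) + sum(left4) + 4) >> 3
    return list(top8) + list(left4) + [top_left, dc]


def predict_pixel(P, mode, y, x):
    """Same expression for every mode and pixel: no per-mode branching."""
    i0, i1, i2 = TABLE[(mode, y, x)]
    return (P[i0] + 2 * P[i1] + P[i2] + 2) >> 2


def predict_block(P, mode):
    """Serial stand-in for launching one thread per pixel."""
    return [[predict_pixel(P, mode, y, x) for x in range(4)] for y in range(4)]
```

For example, `predict_block(make_predictor_array(top8, left4, top_left), VERTICAL)` copies the four top neighbours into every row. The remaining directional modes would need their own table rows and, for some of them, a second filter form, which this sketch does not attempt.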


Bibliographic record

Source
Future Generation Computer Systems | 2018, Issue 1 | pp. 316-329 | 14 pages
Authors
Wenbin Jiang; Min Long; Laurence T. Yang; Xiaobai Liu; Hai Jin; Alan L. Yuille; Ye Chi
Author affiliations

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China;

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China;

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China, Department of Computer Science, College of Sciences, San Diego State University, San Diego, CA, United States;

Department of Computer Science, College of Sciences, San Diego State University, San Diego, CA, United States;

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China;

Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, United States, Department of Cognitive Science, Johns Hopkins University, United States, Department of Computer Science, Johns Hopkins University, United States;

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China;

Indexed in: Science Citation Index (SCI), United States; Engineering Index (EI), United States
Original format: PDF
Language: English
Keywords
Parallelism; Fine-grained partition; Intra-frame prediction; Fast mode decision; GPU; H.264/AVC;


Similar literature

1. Towards high performance data analytic on heterogeneous many-core systems: A study on Bayesian Sequential Partitioning [J]. Lai Bo-Cheng, Wu Tung-Yu, Chiu Tsou-Han. Journal of Parallel and Distributed Computing, 2018, Issue DECa.
2. Wedge template optimization and parallelization of depth map in intra-frame prediction algorithms [J]. Xie Xiaoyan, Wang Yu, Shi Pengfei. High Technology Letters (English edition), 2021, Issue 004.
3. Optimized Parallel Implementation of Face Detection Based on Embedded Heterogeneous Many-Core Architecture [J]. Gao Fang, Huang Zhangqin, Wang Shulong. International Journal of Pattern Recognition and Artificial Intelligence, 2017, Issue 7.
4. A high-performance parallel CAVLC encoder on a fine-grained many-core system [C]. Xiao Zhibin, Baas Bevan. IEEE International Conference on Computer Design, 2008.
5. A Fine-Grain Parallel Execution Model for Homogeneous/Heterogeneous Many-Core Systems [D]. Geng, Tongsheng. 2018.
6. Fine-grained parallel RNAalifold algorithm for RNA secondary structure prediction on FPGA [O]. Fei Xia, Yong Dou, Xingming Zhou. 2009.
7. Towards high performance data analytic on heterogeneous many-core systems: A study on Bayesian Sequential Partitioning [O]. Bo-Cheng Lai, Tung-Yu Wu, Tsou-Han Chiu. 2018.
