Efficient SIMD code generation for runtime alignment and length conversion

Abstract

When generating code for today's multimedia extensions, one of the major challenges is dealing with memory alignment. While hand programming still yields the best-performing SIMD code, it is both time-consuming and error-prone. Compiler technology has greatly improved, including techniques that simdize loops with misaligned accesses by automatically rearranging misaligned memory streams in registers. Current techniques are applicable to runtime alignments, but they aggressively reduce the alignment overhead only when all alignments are known at compile time. This paper presents two major enhancements to the state of the art, improving both performance and coverage. First, we propose a novel technique to simdize loops with runtime alignment nearly as efficiently as those with compile-time misalignment. Runtime alignment is pervasive in real applications because it is either part of the algorithm or an artifact of the compiler's inability to extract accurate alignment information from complex applications. Second, we incorporate length conversion operations, i.e., conversions between data of different sizes, into the alignment handling framework. Length conversions are pervasive in multimedia applications, where mixed integer types are often used. Supporting length conversion can greatly improve the coverage of simdizable loops. Experimental results indicate that our runtime alignment technique achieves a 19% to 32% speedup increase over prior art for a benchmark stressing the impact of misaligned data. We also demonstrate speedup factors of up to 8.11 over sequential execution for real benchmarks.
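As an illustration only, the two ideas named in the abstract can be emulated in portable C: rearranging a misaligned stream in registers when the misalignment is known only at run time, and widening data during a length conversion. This is a minimal sketch under stated assumptions, not the paper's code generation scheme. The 4-lane register model and every name in it (`vec4`, `vec8h`, `load_aligned`, `shift_pair`, `copy_realigned`, `unpack_half`) are hypothetical and chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLEN 4 /* assumption: 4 x 32-bit lanes, i.e. one 16-byte register */

typedef struct { int32_t lane[VLEN]; } vec4;      /* hypothetical register model */
typedef struct { int16_t lane[2 * VLEN]; } vec8h; /* same width, 16-bit lanes */

/* Aligned load: may only start at element indices that are multiples of VLEN. */
static vec4 load_aligned(const int32_t *mem, size_t idx) {
    assert(idx % VLEN == 0);
    vec4 v;
    for (size_t k = 0; k < VLEN; k++) v.lane[k] = mem[idx + k];
    return v;
}

/* Concatenate lo:hi and extract VLEN lanes starting at lane `shift`.
 * `shift` is an ordinary runtime value, not a compile-time constant. */
static vec4 shift_pair(vec4 lo, vec4 hi, size_t shift) {
    assert(shift < VLEN);
    vec4 r;
    for (size_t k = 0; k < VLEN; k++) {
        size_t src = shift + k;
        r.lane[k] = (src < VLEN) ? lo.lane[src] : hi.lane[src - VLEN];
    }
    return r;
}

/* Copy n elements (n a multiple of VLEN) from the stream mem[off..] to dst
 * using only aligned loads. The previous aligned chunk is reused, so each
 * iteration issues one new load. mem must be readable up to index
 * (off - off % VLEN) + n + VLEN - 1. */
static void copy_realigned(int32_t *dst, const int32_t *mem, size_t off, size_t n) {
    size_t shift = off % VLEN; /* runtime misalignment, in elements */
    size_t base = off - shift; /* aligned chunk containing mem[off] */
    vec4 lo = load_aligned(mem, base);
    for (size_t i = 0; i < n; i += VLEN) {
        vec4 hi = load_aligned(mem, base + i + VLEN);
        vec4 r = shift_pair(lo, hi, shift);
        for (size_t k = 0; k < VLEN; k++) dst[i + k] = r.lane[k];
        lo = hi;
    }
}

/* Length conversion: one register of eight 16-bit lanes widens into two
 * registers of four 32-bit lanes (low half, then high half), sign-extended. */
static vec4 unpack_half(vec8h v, int high) {
    vec4 r;
    for (size_t k = 0; k < VLEN; k++)
        r.lane[k] = (int32_t)v.lane[(high ? VLEN : 0) + k];
    return r;
}
```

Calling `copy_realigned(dst, mem, off, n)` with any `off` yields the same result as a scalar copy of `mem[off..off+n-1]`, and `unpack_half` shows why a length conversion changes the number of registers that a loop iteration consumes and produces.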

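Bibliographic details

Source: Practical Experience with SMDS | 1995 | pp. 153-164 | 12 pages
Author affiliation: IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology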

Similar literature

1. Efficient SIMD Code Generation for Irregular Kernels [J]. Seonggun Kim, Hwansoo Han. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2012, No. 8
2. Conventional-band and long-wavelength-band efficient wavelength conversion by difference-frequency generation in sinusoidally chirped optical superlattice waveguides [J]. Gao SM, Yang CX, Jin GF. Optics Communications: A Journal Devoted to the Rapid Publication of Short Contributions in the Field of Optics and Interaction of Light with Matter. 2004, No. 4a6
3. Agile and Highly Efficient Wavelength Conversion Using Highly Nonlinear Fiber for Optical Code-Labeled Packets [J]. Kiyoshi Onohara, Yoshinari Awaji, Naoya Wada. IEEE Photonics Technology Letters. 2005, No. 3
4. Efficient SIMD code generation for runtime alignment and length conversion [C]. Wu, P., Eichenberger, . 2005
5. Bandwidth-efficient communication systems based on finite-length low density parity check codes [D]. Vu, Huy G. 2006
6. Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments [O]. Jeff Daily. 2016
7. Efficient SIMD Code Generation for Runtime Alignment and Length Conversion [O]. Peng Wu, Alexandre E. Eichenberger, Amy Wang. 2005
