This paper introduces a VLIW architecture and its optimizing compiler, both currently under development. Based on the URPR software pipelining approach, the architecture integrates nine identically structured processing elements (PEs) on a single chip. A pipeline register file reduces the inter-body dependence distance, which increases the overlap between adjacent loop iterations and shortens the optimized loop body. The pipeline register file also increases the bandwidth between PEs. The optimizing compiler is likewise based on the URPR approach. It uses a two-level software pipelining method to perform phase-coupled resource allocation and code optimization, and it achieves good results in both time and space. A compilation example for an FFT innermost loop is discussed. Simulation results indicate that the architecture can reach high performance with the aid of the optimizing compiler.
Dept. of Computer Science and Technology, Tsinghua University, Beijing 100084, China
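For readers unfamiliar with the kind of loop the abstract refers to, the following is a minimal sketch of a textbook radix-2 decimation-in-time FFT, with the innermost (butterfly) loop marked. It is an illustration only: the abstract does not show the paper's actual FFT loop or the code produced by the URPR-based compiler, and the function name `fft_radix2` and all variable names are assumptions made for this example.

```python
import cmath


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT (textbook form, hypothetical helper).

    The input length must be a power of two. Returns a new list of complex values.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = [complex(v) for v in x]

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    span = 2
    while span <= n:
        half = span // 2
        w_step = cmath.exp(-2j * cmath.pi / span)
        for start in range(0, n, span):
            w = 1 + 0j
            # Innermost loop: one butterfly per iteration. Each iteration reads
            # and writes its own pair of elements; the only value carried from
            # one iteration to the next is the twiddle factor w.
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_step
        span *= 2
    return a
```

In this sketch each butterfly consists of one complex multiply, one add, and one subtract on a pair of elements that no other iteration of the same inner loop touches, so a loop of this shape is a plausible candidate for overlapping adjacent iterations. How the paper's compiler actually schedules its loop is not stated in the abstract.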