This paper proposes an ultra-reduced processing unit architecture and its implementation. The architecture is based on a one-instruction-set computer (OISC) whose single instruction combines an operation with a conditional jump. Through instruction optimization and dedicated hardware accelerators on the internal bus, the processing unit efficiently executes bitwise operations, which traditional arithmetic-based OISCs struggle to perform, and data transfer operations, which they execute inefficiently. An on-chip massively parallel computing array built from this processing unit suits computation tasks with strong locality and strict real-time requirements, such as most low-level image processing algorithms. A 16×16 prototype array containing the processing unit has been implemented on an FPGA, delivering 30.7 GOPS at 120 MHz with an average power consumption of 39.5 mW.
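To illustrate why data transfer is costly on a traditional arithmetic-based OISC, the following is a minimal sketch that assumes the classic subleq instruction (subtract and branch if less than or equal to zero). The abstract does not specify the proposed instruction, so `run_subleq`, `MOVE_PROGRAM`, and the memory layout are hypothetical and stand in for the baseline only, not for the proposed architecture.

```python
def run_subleq(mem, pc=0, max_steps=10_000):
    """Interpret subleq code: each instruction is three words (a, b, c).

    Semantics (assumed baseline, not the paper's instruction):
    mem[b] -= mem[a]; jump to c if the result is <= 0, else go to the
    next instruction. A negative jump target halts the machine.
    """
    steps = 0
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem, steps


# Hypothetical layout: code in words 0..11, data in words 12..14.
SRC, DST, ZERO = 12, 13, 14

# Copying one word (DST = SRC) takes four subleq instructions.
MOVE_PROGRAM = [
    DST, DST, 3,     # DST = 0
    SRC, ZERO, 6,    # ZERO = -SRC
    ZERO, DST, 9,    # DST = 0 - (-SRC) = SRC
    ZERO, ZERO, -1,  # restore ZERO = 0, then halt
    42, 7, 0,        # data: SRC = 42, DST = 7, ZERO = 0
]

if __name__ == "__main__":
    mem, steps = run_subleq(list(MOVE_PROGRAM))
    print(mem[DST], steps)
```

Under this assumed baseline, a single-word move costs four instructions, which is the kind of overhead the abstract says the instruction optimization and bus accelerators are meant to reduce.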