Journal: Microprocessors and Microsystems

Scalable matrix decompositions with multiple cores on FPGAs


Abstract

Hardware accelerators are becoming increasingly important in heterogeneous systems for many applications, including those that employ matrix decompositions. In recent years, a class of tiled matrix decomposition algorithms has been proposed for out-of-memory computations and multi-core architectures, including GPU-based heterogeneous systems. On FPGAs, however, such scalable solutions for large matrices are rarely found. In this paper, we use the latest tiled decomposition algorithms from high-performance linear algebra for off-chip memory access, and loop mapping onto multiple processing cores for on-chip computation, to perform scalable, high-performance QR and LU matrix decompositions on FPGAs.
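
To make the idea of a tiled decomposition concrete, here is a minimal software sketch of a right-looking tiled LU factorization. It is an illustration only, not the paper's FPGA design: the abstract does not give the algorithm's details, so the function names `tiled_lu` and `reconstruct`, the tile size parameter `b`, and the omission of pivoting are all assumptions made for this example. The sketch assumes a square matrix whose leading blocks are nonsingular (for instance, a strictly diagonally dominant matrix).

```python
def tiled_lu(a, b):
    """Factor the square matrix `a` (list of lists of floats) in place, tile by tile.

    Hypothetical illustration: no pivoting, tile size `b`.
    On return, the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle, including the diagonal, holds U.
    """
    n = len(a)
    tiles = [(s, min(s + b, n)) for s in range(0, n, b)]
    for t, (k0, k1) in enumerate(tiles):
        # Step 1: unblocked LU of the diagonal tile.
        for j in range(k0, k1):
            for i in range(j + 1, k1):
                a[i][j] /= a[j][j]
                for c in range(j + 1, k1):
                    a[i][c] -= a[i][j] * a[j][c]
        rest = tiles[t + 1:]
        # Step 2: tiles to the right, solve L_kk * U_kj = A_kj.
        for c0, c1 in rest:
            for c in range(c0, c1):
                for i in range(k0, k1):
                    for p in range(k0, i):
                        a[i][c] -= a[i][p] * a[p][c]
        # Step 3: tiles below, solve L_ik * U_kk = A_ik.
        for r0, r1 in rest:
            for r in range(r0, r1):
                for j in range(k0, k1):
                    for p in range(k0, j):
                        a[r][j] -= a[r][p] * a[p][j]
                    a[r][j] /= a[j][j]
        # Step 4: trailing tiles, A_ij -= L_ik * U_kj.
        # Each (row tile, column tile) pair is independent of the others.
        for r0, r1 in rest:
            for c0, c1 in rest:
                for r in range(r0, r1):
                    for c in range(c0, c1):
                        a[r][c] -= sum(a[r][p] * a[p][c] for p in range(k0, k1))
    return a


def reconstruct(lu):
    """Multiply the packed factors back together to check the factorization."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for p in range(min(i, j) + 1):
                l_ip = 1.0 if p == i else lu[i][p]
                out[i][j] += l_ip * lu[p][j]
    return out


if __name__ == "__main__":
    m = [[20.0 if i == j else float(1 + (i * j) % 3) for j in range(5)]
         for i in range(5)]
    packed = tiled_lu([row[:] for row in m], 2)
    print(reconstruct(packed))
```

In this sketch only one diagonal tile, one row of tiles, and one column of tiles are touched before the trailing update, which is the property that lets tiled algorithms stream data from a larger memory in blocks. The independent trailing-tile updates in Step 4 are the kind of work that could be spread across several processing cores.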
