Journal: Microprocessors and Microsystems

A low latency minimum distance searching unit of the SOM based hardware quantizer


Abstract

Parts of a SOM (Self-Organizing Map) based quantizer can be performed in parallel, namely the distance calculation between an input pixel and a group of codewords, or processing elements (PEs), and the weight update of the PEs. To find the best matching unit (BMU), the PE whose distance is the minimum, all distances must be compared with each other. Conventionally, the minimum distance searching unit is built from a group of comparators connected in multiple stages to produce the final single minimum distance and its index. The overall latency of the unit is therefore linearly proportional to the number of comparator stages, log2(C), where C is the number of distances. In this paper, we propose a novel hardware-centric algorithm that aims to reduce the latency of the minimum distance searching unit. In its simplest form, the algorithm uses a memory of K addresses with a 1-bit word size, where K is equal to the maximum value of the distance. During operation, each distance is used as a memory address whose state is changed from 'unoccupied' to 'occupied'. A look-up table is then employed to efficiently find the first address whose state is 'occupied', which corresponds to the minimum distance. The algorithm is also adapted to make it more feasible to realize on an FPGA platform. Synthesis results show that, compared with the conventional minimum distance search, the algorithm requires twice the FPGA resources in terms of slice and LUT usage. In terms of latency, the implementation takes only 0.62 times that of the conventional one for a PE size of 256. After integrating the unit into the SOM based quantizer, the obtained frame rate was found to be 1.50 times that of the conventional one for a PE size of 256, an image size of 512 x 512 and a clock speed of 66.67 MHz. The latency could be reduced further if the FPGA supported combining all the 'occupied' states in a single stage instead of using a group of internal LUTs of limited input size. © 2015 Elsevier B.V. All rights reserved.
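The two search strategies contrasted in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the function names `comparator_tree_min` and `occupancy_min` are hypothetical, distances are assumed to be integers in the range 0 to K-1, the first 'occupied' address is found with a plain scan standing in for the look-up table, and the way the BMU index is recovered in the memory-based variant is an assumption, since the abstract does not describe it.

```python
def comparator_tree_min(distances):
    """Conventional search: pairwise comparators arranged in stages.

    Returns (min_distance, index, stages_used).
    """
    # Each entry carries (distance, index) so the winner's index survives.
    level = [(d, i) for i, d in enumerate(distances)]
    stages = 0
    while len(level) > 1:
        nxt = []
        for j in range(0, len(level), 2):
            pair = level[j:j + 2]
            # One comparator: keep the smaller distance (lower index on ties).
            nxt.append(min(pair))
        level = nxt
        stages += 1
    best_d, best_i = level[0]
    return best_d, best_i, stages


def occupancy_min(distances, k):
    """Memory-based search: mark each distance as an 'occupied' address.

    Assumes every distance is an integer in range(k).
    """
    occupied = [0] * k          # K addresses, 1-bit word each
    for d in distances:
        occupied[d] = 1         # 'unoccupied' -> 'occupied'
    # The first occupied address equals the minimum distance. Hardware would
    # use a look-up table here; this model uses a plain software scan.
    best_d = occupied.index(1)
    # Index recovery is an assumption of this sketch (not in the abstract):
    # take the first PE whose distance matches the minimum.
    best_i = distances.index(best_d)
    return best_d, best_i


if __name__ == "__main__":
    dists = [37, 12, 200, 12, 91, 5, 64, 250]
    print(comparator_tree_min(dists))
    print(occupancy_min(dists, k=256))
```

In this model, eight distances pass through three comparator stages, and 256 distances through eight, which matches the log2(C) stage count stated in the abstract. The memory-based variant replaces those stages with one marking pass and one first-occupied lookup, at the cost of a K-entry bit memory.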
