Conference: IEEE International Symposium on Computer Architecture and High Performance Computing

Building a Low Latency, Highly Associative DRAM Cache with the Buffered Way Predictor

Abstract

The emerging die-stacked DRAM technology allows computer architects to design a last-level cache (LLC) with high memory bandwidth and large capacity. There are four key requirements for DRAM cache design: minimizing on-chip tag storage overhead, optimizing access latency, improving hit rate, and reducing off-chip traffic. These requirements seem mutually incompatible. For example, to reduce the tag storage overhead, the recently proposed LH-Cache co-locates tags and data in the same DRAM cache row, and the Alloy Cache alloys data and tags into the same cache line in a direct-mapped design. However, these ideas either incur significant tag-lookup latency or sacrifice hit rate for hit latency. To optimize all four key requirements, we propose the Buffered Way Predictor (BWP). The BWP predicts the way ID of a DRAM cache request with high accuracy and coverage, allowing the tag and data to be fetched back to back. Thus, the data read latency can be completely hidden, so requests that hit in the DRAM cache have low access latency. The BWP technique is designed for highly associative block-based DRAM caches and achieves a low miss rate and low off-chip traffic. Our evaluation with multi-programmed workloads and a 128MB DRAM cache shows that a 128KB BWP achieves a 76.2% hit rate. The BWP improves performance by 8.8% and 12.3% compared to the LH-Cache and the Alloy Cache, respectively.
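To make the way-prediction idea concrete, the following is a minimal C++ sketch of a prediction table that maps a block address to a predicted way and is corrected whenever the DRAM cache tag check resolves the true way. The class name WayPredictor, the table size, the partial-tag filtering, and the 29-way associativity are illustrative assumptions for exposition, not details taken from the paper.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Illustrative way-prediction table for a highly associative DRAM cache.
// All sizes and field widths below are assumptions, not the paper's design.
class WayPredictor {
public:
    static constexpr std::size_t kEntries = 16 * 1024;  // assumed table size
    static constexpr unsigned    kWays    = 29;         // assumed set associativity

    // Predict which way of the DRAM cache set holds the block, if any.
    std::optional<unsigned> predict(uint64_t block_addr) const {
        const Entry& e = table_[index(block_addr)];
        if (e.valid && e.partial_tag == partialTag(block_addr))
            return e.way;          // caller can issue the data read for this way
        return std::nullopt;       // fall back to a tag-then-data access
    }

    // Update the table once the tag check has resolved the true way
    // (correct_way is in [0, kWays)).
    void update(uint64_t block_addr, unsigned correct_way) {
        Entry& e      = table_[index(block_addr)];
        e.valid       = true;
        e.partial_tag = partialTag(block_addr);
        e.way         = static_cast<uint8_t>(correct_way);
    }

private:
    struct Entry {
        bool     valid       = false;
        uint16_t partial_tag = 0;   // short tag to filter address aliasing
        uint8_t  way         = 0;
    };

    static std::size_t index(uint64_t addr)   { return (addr >> 6) % kEntries; }
    static uint16_t    partialTag(uint64_t a) { return static_cast<uint16_t>(a >> 20); }

    std::array<Entry, kEntries> table_{};
};
```

On a confident prediction, the cache controller can schedule the tag read and the data read for the predicted way back to back, which is how the abstract's hidden data read latency would be realized; if the tag check later disagrees with the prediction, the speculatively fetched data is discarded and the access falls back to the normal path, so correctness never depends on the predictor.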
