Journal: IEEE Transactions on Parallel and Distributed Systems

qcAffin: A Hardware Topology Aware Interrupt Affinitizing and Balancing Scheme for Multi-Core and Multi-Queue Packet Processing Systems


Abstract

Interrupt affinitization of multi-queue network interface cards (NICs) is a fundamental configuration that determines which CPU cores process the packets from each queue on multi-core platforms. In this paper, we propose to attain an optimal queue-to-core affinitization for packet processing systems based on a numerical cost model derived from hardware topology and runtime system workloads. Static architectural characteristics, comprising the memory hierarchy and the topology of hardware components, are first analyzed to calculate static interrupt affinitization costs. We then apply dynamic interrupt affinitization to balance workloads across CPU cores and improve overall performance. Classical networking applications, ranging from bridging, routing, and access control list (ACL) matching to deep packet inspection (DPI), are tested extensively with different frame sizes to compare the performance of the proposed scheme with that of existing approaches. The comparison results show that qcAffin achieves performance similar to that of the best affinitization approach and outperforms the Linux default affinitizer by averages of 102, 278, 248, and 131 percent on 1G NICs for the four applications. On 10G NICs, dramatic boosts of 1,424 and 1,343 percent are measured for the bridging and routing applications, respectively. Moreover, the effectiveness of dynamic interrupt balancing is demonstrated by up to 150 percent higher system utilization and 1.2 Mpps more throughput compared with the fixed affinitization approach in a simulated setup with unbalanced traffic load.
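
The abstract does not give the cost model itself, so the following is only a rough sketch of the general idea of combining a static topology cost with a runtime load term when mapping queues to cores. It is not the qcAffin algorithm. The function names, the cost values, the `load_weight` parameter, and the greedy strategy are all assumptions made for illustration. The one detail taken from Linux itself is that an interrupt's CPU affinity is expressed as a hexadecimal CPU bitmask written to `/proc/irq/<irq>/smp_affinity`.

```python
from typing import Dict, List


def assign_queues(static_cost: List[List[float]],
                  queue_load: List[float],
                  core_load: List[float],
                  load_weight: float = 1.0) -> Dict[int, int]:
    """Greedy queue-to-core assignment (illustrative, not the paper's method).

    static_cost[q][c]: hypothetical topology cost of serving queue q on core c
                       (for example, higher for a core on a remote NUMA node).
    queue_load[q]:     hypothetical estimate of the work queue q generates.
    core_load[c]:      load already present on core c.
    load_weight:       hypothetical knob trading topology cost against balance.
    """
    load = list(core_load)  # copy so the caller's list is not modified
    assignment: Dict[int, int] = {}
    # Place the heaviest queues first so they get the widest choice of cores.
    for q in sorted(range(len(queue_load)), key=lambda i: -queue_load[i]):
        # Total cost = static topology cost + weighted current load.
        # Ties are broken by the lower core index.
        best = min(range(len(load)),
                   key=lambda c: (static_cost[q][c] + load_weight * load[c], c))
        assignment[q] = best
        load[best] += queue_load[q]
    return assignment


def smp_affinity_mask(core: int) -> str:
    """Hex bitmask selecting a single core, for cores 0-31 only."""
    if not 0 <= core < 32:
        raise ValueError("this sketch only handles cores 0-31")
    return format(1 << core, "x")


# Hypothetical machine: cores 0-1 share a NUMA node with the NIC (cost 1.0),
# cores 2-3 sit on the remote node (cost 2.0).
cost = [[1.0, 1.0, 2.0, 2.0] for _ in range(4)]
loads = [0.5, 0.5, 0.5, 0.5]
local_only = assign_queues(cost, loads, [0.0] * 4, load_weight=1.0)
spread_out = assign_queues(cost, loads, [0.0] * 4, load_weight=4.0)
```

With the small weight, the four queues stay on the two cores local to the NIC. With the larger weight, the load term dominates and the queues spread across all four cores. A real deployment would then write `smp_affinity_mask(core)` to the `smp_affinity` file of each queue's IRQ, which requires root privileges.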
