What Is a DPU?

Published: 2023-10-16 17:49:54, Author: ImreW

1. Wikipedia Introduction

A data processing unit (DPU) is a programmable computer processor that tightly integrates a general-purpose CPU with network interface hardware. Sometimes they are called "IPUs" (for "infrastructure processing unit") or "SmartNICs". They can be used in place of traditional NICs to relieve the main CPU of complex networking responsibilities and other "infrastructural" duties; although their features vary, they may be used to perform encryption/decryption, serve as a firewall, handle TCP/IP, process HTTP requests, or even function as a hypervisor or storage controller. These devices can be attractive to cloud computing providers whose servers might otherwise spend a significant amount of CPU time on these tasks, cutting into the cycles they can provide to guests.

2. What Is a DPU?

… And what’s the difference between a DPU, a CPU and a GPU?

Specialists in moving data in data centers, DPUs, or data processing units, are a new class of programmable processor and will join CPUs and GPUs as one of the three pillars of computing.

Of course, you’re probably already familiar with the central processing unit. Flexible and responsive, for many years CPUs were the sole programmable element in most computers.

More recently the GPU, or graphics processing unit, has taken a central role. GPUs were originally used to deliver rich, real-time graphics, but their parallel processing capabilities make them ideal for accelerated computing tasks of all kinds. Thanks to these capabilities, GPUs are essential to artificial intelligence, deep learning and big data analytics applications.

Over the past decade, however, computing has broken out of the boxy confines of PCs and servers — with CPUs and GPUs powering sprawling new hyperscale data centers.

These data centers are knit together with a powerful new category of processors. The DPU has become the third member of the data-centric accelerated computing model.

“This is going to represent one of the three major pillars of computing going forward,” NVIDIA CEO Jensen Huang said during a talk earlier this month.

“The CPU is for general-purpose computing, the GPU is for accelerated computing, and the DPU, which moves data around the data center, does data processing.”

What's a DPU?

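A system on a chip that combines:

  • An industry-standard, high-performance, software-programmable multi-core CPU
  • A high-performance network interface
  • Flexible and programmable acceleration engines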

CPU v GPU v DPU: What Makes a DPU Different? 

A DPU is a new class of programmable processor. It is a system on a chip, or SoC, that combines three key elements:

  1. An industry-standard, high-performance, software-programmable, multi-core CPU, typically based on the widely used Arm architecture, tightly coupled to the other SoC components.
  2. A high-performance network interface capable of parsing, processing and efficiently transferring data at line rate, or the speed of the rest of the network, to GPUs and CPUs.
  3. A rich set of flexible and programmable acceleration engines that offload and improve application performance for AI and machine learning, zero-trust security, telecommunications and storage, among others.

All these DPU capabilities are critical to enable an isolated, bare-metal, cloud-native computing platform that will define the next generation of cloud-scale computing.

DPUs Incorporated into SmartNICs

The DPU can be used as a stand-alone embedded processor. But it’s more often incorporated into a SmartNIC, a network interface controller used as a critical component in a next-generation server.

Other devices that claim to be DPUs miss significant elements of these three critical capabilities.

For example, some vendors use proprietary processors that don’t benefit from the broad Arm CPU ecosystem’s rich development and application infrastructure.

Others claim to have DPUs but make the mistake of focusing solely on the embedded CPU to perform data path processing.

A Focus on Data Processing

That approach isn’t competitive and doesn’t scale, because trying to beat the traditional x86 CPU with a brute-force performance attack is a losing battle. If 100 Gigabit/sec packet processing brings an x86 to its knees, why would an embedded CPU perform better?
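
To make the 100 Gigabit/sec figure concrete, the sketch below works out the per-packet time budget at line rate. It relies on the standard Ethernet wire overhead of 20 bytes per frame (preamble, start-of-frame delimiter and inter-frame gap). The 3 GHz clock is an assumption chosen only for illustration, and the function names are hypothetical; none of them come from the original article.

```python
# Per-packet budget at Ethernet line rate: a back-of-the-envelope sketch.

# Every frame occupies 20 extra bytes on the wire:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def packets_per_second(line_rate_bps: float, frame_bytes: int) -> float:
    """Maximum packet rate for a given line rate and Ethernet frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire


def cycles_per_packet(line_rate_bps: float, frame_bytes: int, cpu_hz: float) -> float:
    """CPU cycles a single core can spend per packet while keeping up."""
    return cpu_hz / packets_per_second(line_rate_bps, frame_bytes)


if __name__ == "__main__":
    LINE_RATE_BPS = 100e9  # 100 Gigabit/sec, as in the text
    CPU_HZ = 3e9           # assumed clock speed, for illustration only

    for frame in (64, 512, 1518):
        pps = packets_per_second(LINE_RATE_BPS, frame)
        ns = 1e9 / pps
        cycles = cycles_per_packet(LINE_RATE_BPS, frame, CPU_HZ)
        print(f"{frame:>5} B frames: {pps / 1e6:8.2f} Mpps, "
              f"{ns:7.2f} ns/packet, {cycles:8.1f} cycles/packet")
```

With minimum-size 64-byte frames, a single core at the assumed clock has roughly 20 cycles per packet, which is less than the cost of one cache miss. That is the arithmetic behind the claim that a general-purpose core, x86 or embedded, cannot win this on raw speed.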

Instead, the network interface needs to be powerful and flexible enough to handle all network data path processing. The embedded CPU should be used for control path initialization and exception processing, nothing more.
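
The split described above can be sketched as a flow table with an exception path. This is a software toy model, not any vendor's API: `DataPath` stands in for the hardware match-action table in the network interface, `simple_policy` stands in for control software on the embedded CPU, and all names here are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, NamedTuple


class FlowKey(NamedTuple):
    """Fields a packet is matched on (a simplified, assumed key)."""
    src_ip: str
    dst_ip: str
    dst_port: int


@dataclass
class DataPath:
    """Toy stand-in for the hardware flow table in the network interface."""
    control_plane: Callable[[FlowKey], str]   # runs on the embedded CPU
    flow_table: Dict[FlowKey, str] = field(default_factory=dict)
    fast_hits: int = 0
    exceptions: int = 0

    def process(self, key: FlowKey) -> str:
        action = self.flow_table.get(key)
        if action is not None:
            # Fast path: the packet matches an installed rule and the
            # embedded CPU never sees it.
            self.fast_hits += 1
            return action
        # Exception path: the first packet of an unknown flow goes to the
        # embedded CPU, which decides once and installs a rule.
        self.exceptions += 1
        action = self.control_plane(key)
        self.flow_table[key] = action
        return action


def simple_policy(key: FlowKey) -> str:
    """Hypothetical control-plane policy: allow web traffic, drop the rest."""
    return "forward" if key.dst_port in (80, 443) else "drop"


if __name__ == "__main__":
    dp = DataPath(control_plane=simple_policy)
    web = FlowKey("10.0.0.1", "10.0.0.2", 443)
    for _ in range(1000):
        dp.process(web)
    print(dp.exceptions, dp.fast_hits)
```

In this model the embedded CPU handles one packet per flow and the table handles the rest, so CPU load scales with the number of new flows rather than with the packet rate.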

At a minimum, there are 10 capabilities the network data path acceleration engines need to be able to deliver:

  1. Data packet parsing, matching and manipulation to implement Open vSwitch (OVS)
  2. RDMA data transport acceleration for Zero Touch RoCE
  3. GPUDirect accelerators to bypass the CPU and feed networked data directly to GPUs (both from storage and from other GPUs)
  4. TCP acceleration including RSS, LRO, checksum, etc.
  5. Network virtualization for VXLAN and Geneve overlays and VTEP offload
  6. Traffic shaping “packet pacing” accelerator to enable multimedia streaming, content distribution networks and the new 4K/8K Video over IP (RiverMax for ST 2110)
  7. Precision timing accelerators for telco cloud RAN such as 5T for 5G capabilities
  8. Crypto acceleration for IPsec and TLS performed inline, so all other accelerations are still operational
  9. Virtualization support for SR-IOV, VirtIO and para-virtualization
  10. Secure Isolation: root of trust, secure boot, secure firmware upgrades, and authenticated containers and application lifecycle management
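
As one small illustration of the work in item 4, the sketch below computes the 16-bit Internet checksum (RFC 1071) used by IPv4, TCP and UDP. It is a plain software version for reference only; how a given DPU implements checksum offload in hardware is not described in the source, and the function name is hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as defined in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        # Sum the data as big-endian 16-bit words.
        total += (data[i] << 8) | data[i + 1]

    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


if __name__ == "__main__":
    # A 20-byte IPv4 header with its checksum field set to zero.
    header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
    print(hex(internet_checksum(header)))
```

A receiver verifies a header by running the same sum over it with the checksum field filled in; a valid header yields zero. Doing this per packet in software is cheap in isolation, but at line rate it competes for the same few cycles as everything else, which is why it is usually offloaded.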

These are just 10 of the acceleration and hardware capabilities that are critical to being able to answer yes to the question: “What is a DPU?”

So what is a DPU? This is a DPU:

Many so-called DPUs focus on delivering just one or two of these functions.

The worst try to offload the datapath in proprietary processors.

While good for prototyping, this is a fool’s errand because of the scale, scope and breadth of data centers.
