DPU development is expected to enter the fast lane. The DPU (Data Processing Unit) is regarded as the "third main chip" after the CPU and GPU. Driven by the gradual maturing of smart NIC solutions, steady growth in global server shipments, the deployment of L3-and-above intelligent driving technology, and rising demand in industrial control, both the global and the domestic DPU industries are expected to grow rapidly.
The global DPU market continues to boom, and the domestic DPU market is accelerating to catch up. According to CCID Consulting's "China DPU Industry Development White Paper", the global DPU market reached US$3.05 billion in 2020 and is expected to exceed US$24.53 billion by 2025, a compound annual growth rate of 51.73%.
In 2020, China's DPU market reached 390 million yuan. It is expected to exceed 56.59 billion yuan by 2025, a compound annual growth rate of 170.6%.
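The growth rates quoted above can be checked with a few lines of Python. This is a minimal sketch that assumes five compounding periods (2020 to 2025); the function name `cagr` is illustrative and does not come from the white paper.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values that are `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes quoted in the text (assumed period: 2020 -> 2025, i.e. 5 years).
global_cagr = cagr(3.05, 24.53, 5)   # US$ billion
china_cagr = cagr(0.39, 56.59, 5)    # billion yuan (390 million = 0.39 billion)

print(f"Global: {global_cagr:.2%}")
print(f"China:  {china_cagr:.1%}")
```

Under that five-year assumption, both results agree with the quoted figures of 51.73% and 170.6%.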
DPU industry chain analysis:
DPU midstream (DPU chip manufacturers): overseas giants currently lead, and domestic manufacturers are poised to follow. According to data from the Toubao Research Institute, the three international giants NVIDIA, Broadcom, and Intel held 55%, 36%, and 9% of the domestic DPU market in 2020, respectively. Among domestic companies, Huawei, Alibaba, Baidu, and Tencent have in recent years developed DPUs in-house or sourced them externally for their own servers, mainly for data, storage, and security functions.
DPU upstream: EDA, IP, and related tools are all important foundations for research and development. The domestic EDA market has long been dominated by three international giants, and a breakthrough is expected in the future. Supply and demand are together nurturing the IP core industry, and future demand will open new channels. As semiconductor localization continues, domestic IP suppliers such as Cambricon and VeriSilicon will hold the high ground of scarcity value.
DPU downstream applications: demand is flourishing in many fields, and prospects are bright. The core DPU market revolves around data centers, with servers as the hardware carrier, and downstream scenarios cover cloud computing, high-performance computing, network security, edge computing, and other fields. Domestically, diverse computing power demand scenarios such as high technology, digital transformation, and end-user consumption keep emerging, and the enabling effect of computing power is prominent.
The development of ChatGPT and other AI technologies has highlighted the demand for computing power, and the DPU is expected to enter a golden period of development. The global and domestic DPU markets have grown year by year, and core enterprises are expected to benefit from this trend.
The DPU (Data Processing Unit) is regarded as the "third main chip" after the CPU and GPU. It is a newly developed special-purpose processor. In the DPU product strategy released by NVIDIA in 2020, it is positioned as the "third main chip" in the data center after the CPU and GPU. With the continuous improvement of chip manufacturing processes and the development of digital technologies such as AI, the chip industry continues to innovate. As a new type of chip, the DPU marks a milestone in heterogeneous computing.
1. A scissors gap between the growth of computing power and the growth of data highlights the demand for DPUs
The DPU is a dedicated data processing core: an offload platform that takes network, security, and storage work away from traditional computing resources. Traditional data centers use the CPU as the main data processing unit, and running the large infrastructure stack usually occupies a considerable share of the CPU cores, which poses a great challenge to data processing tasks.
The DPU has in fact existed in the industry for a long time, evolving from early network protocol offload to later network, storage, and virtualization offload.
According to Ferris wheel data, Amazon's AWS developed the Nitro product as early as 2013, moving all data center overhead (service programs such as providing remote resources for virtual machines, encryption and decryption, fault tracking, and security policies) onto dedicated accelerators. The Nitro architecture uses a lightweight hypervisor and custom hardware to separate the virtual machine's computing subsystem (mainly CPU and memory) from its I/O subsystem (mainly network and storage) and connects them over the PCIe bus, saving 30% of CPU resources.
In 2016-2017, Alibaba Cloud proposed the X-Dragon system architecture. Its core is the MOC card, which has relatively rich external interfaces and includes computing, storage, and network resources. The X-Dragon SoC at the heart of the MOC card uniformly supports the virtualization of network, I/O, storage, and peripherals, and provides a unified resource pool for virtual machines, bare metal, and container clouds.
According to data from NetEase and Xinxixi, the American start-up Fungible launched its F1 DPU product in 2019 and proposed the concept of the DPU for the first time. In October 2020, NVIDIA named its Mellanox-based Smart NIC a DPU, redefining the concept. The DPU product strategy that NVIDIA released in 2020 positioned it as the "third main chip" in the data center after the CPU and GPU, setting off an industry craze.
2. With the goal of reducing costs and increasing efficiency, DPU directly addresses the pain points of the industry
The core problem the DPU aims to solve is "cost reduction and efficiency improvement" for infrastructure: loads that the CPU handles inefficiently and that the GPU cannot handle are offloaded to a dedicated DPU, which improves the efficiency of the whole computing system and reduces its total cost of ownership (TCO).
Excessive CPU load is an industry pain point, and the Smart NIC is the predecessor of the DPU. In communications, with the arrival of 5G and cloud-network integration and the introduction of technologies such as virtual switching, the complexity of the server-based network data plane has increased dramatically. As network interface bandwidth rises sharply, the massive data movement work borne by the CPU overloads it and greatly limits the computing power it can release to applications. To improve the processing performance of the host CPU, the Smart NIC (intelligent network card) moves some of the CPU's network functions (such as IP fragmentation and TCP segmentation) onto the NIC hardware to accelerate them, so it can be regarded as the predecessor of the DPU. The advantage of the new generation of DPUs is that they serve not only as an acceleration engine but also as a control plane, completing tasks such as network virtualization, I/O virtualization, and storage virtualization more efficiently and fully releasing the CPU's computing power to applications.
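To make the offloaded work concrete, the sketch below shows in software the kind of per-packet chore that TCP segmentation involves: cutting a large send buffer into pieces no larger than the maximum segment size (MSS). It is illustrative only and is not NIC or driver code; the function name `segment_payload` is hypothetical, and the default of 1460 bytes assumes a standard 1500-byte Ethernet MTU minus 20-byte IPv4 and 20-byte TCP headers.

```python
def segment_payload(payload: bytes, mss: int = 1460) -> list[bytes]:
    """Split a payload into chunks of at most `mss` bytes (software illustration only)."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


# Hypothetical 64 KiB send buffer; a CPU without offload does this split itself.
payload = bytes(64 * 1024)
segments = segment_payload(payload)

print(f"{len(payload)} bytes -> {len(segments)} segments, "
      f"last segment {len(segments[-1])} bytes")
```

With segmentation offload, the host hands the whole buffer to the NIC and the hardware performs this split, along with the header work for each segment, so the CPU no longer spends cycles on every packet.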
In terms of functions, the DPU integrates basic services, accelerates network data, provides zero-trust protection, and separates computing from storage. It can effectively address problems such as CPU computing power that cannot be fully devoted to applications, slow data processing, data leakage caused by misplaced trust, and poor compatibility of storage solutions. Specifically:
1. DPU realizes the operational separation of business and infrastructure. DPU transfers infrastructure tasks from the CPU to the DPU, freeing up CPU resources so that more server CPU cores can be used to run applications and complete business calculations, thus improving the efficiency of servers and data centers.
2. DPU offloads network data to improve performance. The DPU is optimized for cloud-native environments and provides data-center-class software-defined, hardware-accelerated networking, storage, security, and management services. According to Programmer Inn data, Red Hat's containerized platform as a service (PaaS), OpenShift, uses DPUs to optimize data center resource utilization by offloading network-related data processing (such as VXLAN and IPsec) to the DPU for accelerated execution. Under 25Gb/s network conditions, OpenShift with DPU acceleration reaches 25Gb/s with only 1/3 of the CPU usage. Under 100Gb/s network conditions, a deployment without a DPU falls short of the 100Gb/s line rate, and the DPU can bring a 10-fold performance advantage.
3. DPU can provide zero-trust security protection. Zero trust is a security-centered model based on the idea that an enterprise should not grant default trust to anything inside or outside its perimeter. Zero trust reduces data leakage and denies unauthorized access, so it is of great value for data security.
Method: the DPU provides zero-trust protection by moving the control plane from the host down to the DPU, which completely isolates the host's business workloads from the control plane so that data cannot leak across the boundary, ensuring security.
The DPU in effect equips each server with a "computer in front of the computer" that provides independent, secure infrastructure services isolated from the server's application domain. If a host is compromised, the DPU isolation layer between the security control agent and the compromised host prevents the attack from spreading throughout the data center. In this way, the DPU addresses enterprises' reluctance to deploy security agents directly on the computing platform. By deploying security agents on DPUs that are completely isolated from the application domain, enterprises gain visibility into application workloads and can enforce consistent security policies across their infrastructure.
4. DPU helps realize "computing and storage separation". The BlueField SNAP technology solution introduces computing resources at the data entrance of the server system and independently implements, on the DPU, storage solutions that meet application needs, helping storage vendors deploy and upgrade advanced storage protocols in the data center flexibly and at low cost without any change to the existing software stack. Storage vendors can take the open-system direct-attached storage (DAS), scale-up, scale-out, hyperconverged, and other storage solutions that their own teams have developed for various industry applications and extend them with zero overhead to existing business processing platforms and data center infrastructure in each application field, while the DPU offloads all complex but necessary functions such as security encryption, data compression, and load balancing completely transparently. The storage industry's innovative algorithms and implementations can be deployed in the DPU architecture independently of the server operating system. DPU technology helps storage vendors achieve true "computing and storage separation", fully leverage the technical advantages of their own products, and open the most efficient channel for serving application needs.
3. Relying on smart network cards, FPGA and hybrid architecture routes have become mainstream
The Smart NIC can be regarded as the predecessor of the DPU and comes in several types, including ASIC-based cards built around multiple CPU cores and FPGA-based cards. As technology evolves, FPGAs, ASICs, and SoCs are merging, and the lines between them are increasingly blurred. For example, many FPGAs now integrate hard cores, which are ASICs in the traditional sense; and from the perspective of hardware programmability, SoCs are the opposite of FPGAs and can be regarded as ASICs. Here, ASIC refers mainly to hardware that is not programmable rather than solely to a chip with a specific function.
NIC stands for Network Interface Card. Essentially, a NIC is a PCIe card that plugs into a server or storage box to connect it to an Ethernet network. A DPU-based Smart NIC goes beyond simple connectivity and implements on the NIC the network traffic processing that the CPU would have to perform with a basic NIC.
DPU-based Smart NICs can be built on ASICs, FPGAs, or SoCs, and these routes trade off cost, ease of programming, and flexibility. 1) ASICs are cost-effective and may offer the best price/performance, but their flexibility is limited. ASIC-based NICs, such as the NVIDIA ConnectX-5, can have relatively simple programmable data paths, but their functionality is ultimately bounded by the capabilities defined in the ASIC, which may prevent support for certain workloads. 2) In contrast, FPGA NICs (such as the NVIDIA Innova-2 Flex) are highly programmable. Given enough time and effort, almost any feature can be supported relatively efficiently within the constraints of the available gates. However, FPGAs are notoriously difficult and expensive to program. 3) For more complex use cases, SoCs such as the Mellanox BlueField DPU (a programmable Smart NIC) provide what appears to be the best DPU-based Smart NIC implementation.
4. The core value of DPU lies in the offloading, release and expansion of computing power. The interconnection of heterogeneous computing power promotes the rapid development of DPU in multiple fields
1. Computing power offloading: the DPU integrates some basic data processing functions and offloads them from the CPU, leaving more CPU computing power for applications. Part of the DPU's value is therefore the cost of the computing power it saves minus the cost of the DPU itself: the more computing power the DPU saves, or the lower its cost, the greater the value it brings. In addition, because the DPU is specialized, business performance improves once it takes over control functions related to network, storage, security, and management. Another part of the DPU's value thus lies in the time it saves the business and the better user experience that results.
According to Technology Neighbor data, in large data center scenarios the DPU's offloading capability can be used to reduce the data center tax. Traffic processing in the data center accounts for 30% of computing resources, and AWS calls these resources, which are consumed by handling network data before any business program runs, the "Data Center Tax".
In data security scenarios, the DPU's independent and secure architecture allows some encryption and decryption algorithms to be hardened in DPU hardware, addressing users' data security problems amid massive data volumes through physical isolation and providing an additional layer of security between external network business tenants.
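The offloading value described above (cost of the computing power saved minus the cost of the DPU) can be written out as a small calculation. This is a rough sketch, not a pricing model: the `Server` class, the function names, and the core count, per-core cost, and DPU cost are all hypothetical. Only the 30% infrastructure share comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Server:
    cpu_cores: int
    cost_per_core: float   # hypothetical cost attributed to one CPU core
    infra_share: float     # fraction of cores consumed by infrastructure work


def cores_freed(server: Server, offload_ratio: float = 1.0) -> float:
    """Cores returned to applications when `offload_ratio` of infrastructure work moves to the DPU."""
    return server.cpu_cores * server.infra_share * offload_ratio


def net_offload_value(server: Server, dpu_cost: float, offload_ratio: float = 1.0) -> float:
    """Value of the saved computing power minus the cost of the DPU itself."""
    return cores_freed(server, offload_ratio) * server.cost_per_core - dpu_cost


# Hypothetical 64-core server paying the 30% "data center tax" cited in the text.
server = Server(cpu_cores=64, cost_per_core=100.0, infra_share=0.30)

print(f"Cores freed: {cores_freed(server):.1f}")
print(f"Net value:   {net_offload_value(server, dpu_cost=1000.0):.1f}")
```

The model reflects the two levers named in the text: the net value rises when the DPU frees more cores or when the DPU itself costs less.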
2. Computing power release: according to the China Academy of Information and Communications Technology, computing power release means that data is processed directly on the NIC hardware and delivered to the application that finally consumes it, without CPU intervention and without repeated accesses to memory and peripherals, so unnecessary data movement, copying, and context switching are avoided. The traditional CPU-centered architecture has to copy and access data between the kernel and the application several times during processing, which causes large performance losses. The data-centered DPU architecture mitigates this excessive CPU involvement: the CPU does not need to take part in the data path, and data is sent directly to the application, the relevant GPU, or the storage device, which avoids the performance bottlenecks and failures caused by excessive CPU load.
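As a loose host-side analogy only (this is not DPU code), the difference between copying data and sharing it in place can be seen with Python's built-in `memoryview`. The buffer contents and variable names below are made up for illustration.

```python
# Hypothetical receive buffer: a 6-byte header followed by a payload.
buffer = bytearray(b"HEADERpayload-bytes")

copied = bytes(buffer[6:])       # slicing then converting allocates and copies the payload
view = memoryview(buffer)[6:]    # a memoryview slice references the same memory, no copy

# Change the first payload byte in the underlying buffer.
buffer[6] = ord("P")

print(copied)           # the copy is a separate object and does not see the change
print(view.tobytes())   # the view reflects the change because it shares the buffer
```

The copy costs an extra allocation and memory pass per buffer, while the view does not. The DPU data path described above aims to remove that kind of repeated copying between kernel and application, but in hardware rather than in application code.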
DPU architecture and technology allow business applications and operating system kernels running on servers to access distributed, hyperconverged, or software-defined storage systems efficiently and transparently through simple local storage access APIs. Storage vendors can roll out the direct-attached storage (DAS), scale-up, scale-out, hyperconverged, and other storage solutions they have developed for various industry applications to existing business processing platforms and data center infrastructure at zero cost, while the DPU transparently offloads all complex but necessary functions such as security encryption, data compression, and load balancing. The storage industry's innovative algorithms and implementations can be deployed in the DPU architecture independently of the server operating system.
DPU technology helps storage manufacturers achieve true "computing and storage separation", fully leverage the technical advantages of their own products, and open up the most efficient way to serve application needs.
3. Computing power expansion: by effectively avoiding congestion, computing power expansion removes cross-node network communication bottlenecks, significantly reduces the share of communication time in the task cycle of distributed applications, and raises the overall computing power of large-scale clusters. The industry keeps pursuing higher computing power along several paths. General-purpose CPUs can no longer raise computing power significantly by improving single-core, single-thread performance or by adding on-chip cores. Progress slowed once chip processes reached 3nm, and when computing power is raised by stacking cores, power consumption per unit of computing power rises significantly as the core count grows: going from 128 to 256 cores does not raise total computing power linearly. Process evolution for computing units is approaching its limits. To meet the demand for large-scale computing power, expanding computing clusters through distributed systems, increasing network bandwidth, and reducing network latency have become the main means of raising the computing power of data center clusters.
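A toy model helps show why the share of communication time matters for cluster computing power. It assumes, purely for illustration, that each node spends a fixed fraction of every task cycle waiting on cross-node communication and computes for the rest. The function name, the 256-node cluster, and the 40% and 10% fractions are hypothetical and are not measurements from the source.

```python
def effective_cluster_power(nodes: int, per_node_power: float, comm_fraction: float) -> float:
    """Useful computing power when `comm_fraction` of each task cycle is spent on communication."""
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    return nodes * per_node_power * (1.0 - comm_fraction)


# Hypothetical 256-node cluster, one unit of computing power per node.
baseline = effective_cluster_power(256, 1.0, comm_fraction=0.40)
improved = effective_cluster_power(256, 1.0, comm_fraction=0.10)

print(f"Baseline: {baseline:.1f} units")
print(f"Improved: {improved:.1f} units ({improved / baseline:.2f}x)")
```

In this simplified model, cutting the communication share from 40% to 10% raises useful computing power by half without adding a single node, which is the effect the text attributes to removing cross-node communication bottlenecks.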
5. DPU drives heterogeneous computing power interconnection, and the application market covers many fields of high-tech industries
Heterogeneous computing power interconnection is the data connection between the CPU and GPUs, FPGAs, ASICs, or other accelerator cards. Chip interconnect technologies between the CPU and accelerator cards, and among accelerator cards themselves, are increasingly adopted. Although PCIe is a widely used standardized design, its limited bandwidth creates bottlenecks, and next-generation interconnect technologies such as CXL and Gen-Z have developed rapidly. As a sandbox for integrating various high-speed interconnect protocols, the DPU is well suited to become a flexible high-speed interconnect carrier. Adopting and extending "memory-centric" interconnect protocols will bring the opportunity to extend sub-microsecond latency beyond a single chassis, creating possibilities for innovation in next-generation computing architectures.
With the deepening of informatization construction and application, the market continues to rise, and the demand for DPU industry applications in telecommunications, Internet, intelligent driving, AI servers and other industries continues to grow.
1) In the field of telecommunications, the three major operators are actively deploying, promoting product verification, and are willing to cooperate with manufacturers in the industry chain to promote the development of the DPU industry.
2) In the Internet field, with the development needs of cloud computing, cloud native and other business scenarios, DPU, as the focus of data center evolution, has received widespread attention from major cloud vendors. Leading manufacturers have invested resources in self-research or strategic cooperation to reduce costs and increase efficiency to maximize benefits.
3) In the field of intelligent driving, domestic and foreign chip manufacturers are accelerating the deployment of intelligent driving, continuously improving research and development efficiency, and laying the foundation for the market development of DPU.
4) In AI servers and other fields, driven by policies such as the digital economy and "East Data, West Computing", China's AI server, finance, and government and enterprise end-user sectors have continued to develop rapidly, and their demand for computing power keeps increasing. Traditional technology can no longer meet current business needs. The DPU can provide mature hardware acceleration solutions, improve the efficiency of the whole system, give technical support to the development of AI servers, finance, and other fields, and comprehensively advance the future development of the DPU industry.