As artificial intelligence sees wider enterprise adoption, it is consuming a growing share of data center workloads.
Artificial intelligence will not only accelerate demand for data centers and create new incentives for investment; it will also shape data center sustainability strategies and the nature of the infrastructure being deployed.
For example, Tirias Research predicts that, on the current trajectory, generative AI data center server infrastructure plus operating costs will exceed $76 billion by 2028, more than twice the estimated annual operating cost of Amazon AWS, which today accounts for one-third of the global cloud services market.
Hardware computing performance is expected to increase by 400%, a gain dwarfed by Tirias's estimated 50x increase in processing workloads.
According to a new Schneider Electric white paper, the explosion of large training clusters and small edge inference servers will also mean a shift to higher rack power density.
The white paper states: "Artificial intelligence startups, enterprises, colocation providers and Internet giants must now consider the impact of these densities on the design and management of data center physical infrastructure."
Schneider's Energy Management Research Center has modeled the impact of artificial intelligence on energy demand. It estimates that AI currently represents 4.3 GW of electricity demand, expected to grow at a CAGR of 26% to 36% through 2028.
That would put total AI demand at 13.5 GW to 20 GW, two to three times the growth rate of overall data center power demand. By 2028, AI workloads will account for 20% of total data center energy consumption.
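The 13.5 GW to 20 GW range follows directly from compounding the 4.3 GW baseline at the stated growth rates; a minimal sketch of that arithmetic, assuming a 2023 baseline and a five-year horizon:

```python
# Sketch of the Schneider CAGR arithmetic (assumed 2023 baseline, 5-year horizon).
def project_demand(base_gw: float, cagr: float, years: int) -> float:
    """Compound a baseline power demand at a constant annual growth rate."""
    return base_gw * (1 + cagr) ** years

low = project_demand(4.3, 0.26, 5)   # ~13.7 GW
high = project_demand(4.3, 0.36, 5)  # ~20.0 GW
print(f"2028 AI demand: {low:.1f}-{high:.1f} GW")
```

Compounding at 26% yields roughly 13.7 GW and at 36% roughly 20 GW, matching the white paper's range.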
Schneider noted that while inference workloads are expected to consume more power in aggregate than training clusters, they can run at a wide variety of rack densities.
"AI training workloads, on the other hand, have been running at very high densities, with 20-100 kW or more per rack."
Network requirements and costs are what drive these training racks to be grouped together. These high-power-density clusters pose fundamental challenges to data center power, cooling, rack, and software-management design.
Schneider outlines four key areas of impact: power, cooling, racks, and software management.
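To get a feel for why 20-100 kW racks stress existing designs, one can tally the power draw of a hypothetical training cluster; the rack count here is illustrative, not taken from the white paper:

```python
# Illustrative power budget for a training cluster (hypothetical rack count).
def cluster_power_kw(racks: int, kw_per_rack: float) -> float:
    """Total IT load for a uniform cluster of high-density racks."""
    return racks * kw_per_rack

# A modest 16-rack cluster at the densities Schneider cites for AI training:
low_density = cluster_power_kw(16, 20)    # 320 kW at 20 kW/rack
high_density = cluster_power_kw(16, 100)  # 1600 kW at 100 kW/rack
print(f"16-rack cluster: {low_density:.0f}-{high_density:.0f} kW of IT load")
```

Even a small cluster can demand more than a megawatt of power and matching heat rejection, which is the load that forces the distribution and cooling changes discussed below.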
On the power side, AI workloads pose challenges to switchgear and distribution systems.
Some voltages currently in use may prove impractical to deploy, while smaller distribution block sizes may waste IT space. Higher rack temperatures also increase the likelihood of failures and hazards.
Cooling is one of the areas that will require the most significant changes as data centers transition to liquid cooling, a technology that has been used in specialized high-performance computing for more than half a century.
Schneider said: "While air cooling will still be around in the near future, it is predicted that the transition from air cooling to liquid cooling will become the preferred or necessary solution for data centers with artificial intelligence clusters."
Liquid cooling has several advantages over air cooling. First, it improves processor reliability and performance. Second, it saves space and allows higher rack density. In addition, the coolant's greater thermal inertia can reduce overall water consumption.
For artificial intelligence clusters, servers need to be deeper, power requirements are greater, and cooling is more complex.
To meet this demand, racks must offer higher density and greater load-bearing capacity.
Finally, software tools such as DCIM, BMS, and electrical design tools will become key to managing artificial intelligence clusters.
Properly configured and deployed, such software can create a digital twin of the data center that identifies power constraints, evaluates the performance of cooling resources, and informs optimal layout decisions.
In an increasingly dynamic environment, the smaller the margin for error, the higher the operational risk. Therefore, Schneider recommends creating a digital twin of the entire IT space, including equipment in racks and virtual machines.
By digitally adding or moving IT loads, operators can verify that there is sufficient power, cooling, and floor load-bearing capacity to support them. This informs decisions that avoid stranded resources and minimizes the human error that can lead to downtime.
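The capacity check described above can be sketched as a simple headroom test against a rack's digital-twin record. The class, fields, and thresholds below are hypothetical, meant only to illustrate the idea, not any DCIM product's actual API:

```python
from dataclasses import dataclass

@dataclass
class RackTwin:
    """Hypothetical digital-twin record for one rack (illustrative only)."""
    power_capacity_kw: float
    cooling_capacity_kw: float
    floor_limit_kg: float
    power_used_kw: float = 0.0
    heat_load_kw: float = 0.0
    weight_kg: float = 0.0

    def can_place(self, load_kw: float, weight_kg: float) -> bool:
        """Check power, cooling, and floor-load headroom before placing a load."""
        return (self.power_used_kw + load_kw <= self.power_capacity_kw
                and self.heat_load_kw + load_kw <= self.cooling_capacity_kw
                and self.weight_kg + weight_kg <= self.floor_limit_kg)

rack = RackTwin(power_capacity_kw=40, cooling_capacity_kw=40, floor_limit_kg=1360)
print(rack.can_place(load_kw=10, weight_kg=300))   # True: headroom on all three axes
print(rack.can_place(load_kw=50, weight_kg=300))   # False: exceeds power capacity
```

A real DCIM digital twin would model thermal airflow and electrical topology rather than simple sums, but the placement decision reduces to the same three-way headroom check.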