Launched in late September 2022, the Nvidia Hopper H100 GPU currently powers the world's most potent AI training systems. Last weekend, Elon Musk's AI firm xAI pulled ahead of its competitors by bringing the Colossus 100k H100 training system online, an effort that took 122 days from the start of construction to going live. As its name suggests, the cluster uses no fewer than 100,000 H100 GPUs. Its closest rivals are systems operated by Google (90,000 GPUs), OpenAI (80,000), and Meta (70,000), with Microsoft and Nvidia rounding out the top of the ranking at 60,000 and 50,000 GPUs, respectively.
However, this achievement is not enough for Elon Musk, who has already laid out plans for xAI's future: he wants to double the capacity of the training system as soon as possible. In his post announcing Colossus, Musk mentioned that "it will double in size to 200k (50k H200s) in a few months," without specifying the goal of this rapid expansion.
Back in 2023, when xAI was founded, Elon Musk said the company's goal was "to understand the true nature of the universe," and it remains to be seen what will come of this effort. The H200 chips likely to be used in xAI's next wave of expansion already have a successor, Nvidia's Blackwell. Compared to the H200, Blackwell offers 36.2% more top-end memory capacity (192 GB of HBM3e versus 141 GB) and a 66.7% improvement in total memory bandwidth (8 TB/s versus 4.8 TB/s).
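The quoted percentages follow directly from those memory specs. Here is a minimal sketch of the arithmetic, assuming the commonly cited figures of 141 GB / 4.8 TB/s for the H200 and 192 GB / 8 TB/s for the Blackwell B200 (spec values are assumptions consistent with the article's percentages, not taken from the article itself):

```python
# Worked check of the H200-vs-Blackwell comparison quoted above.
# Spec figures below are assumptions based on commonly cited numbers.

h200_memory_gb = 141      # H200: 141 GB HBM3e
b200_memory_gb = 192      # Blackwell B200: 192 GB HBM3e
h200_bandwidth_tbs = 4.8  # H200: 4.8 TB/s memory bandwidth
b200_bandwidth_tbs = 8.0  # B200: 8 TB/s memory bandwidth

# Percentage gain = (new / old - 1) * 100
capacity_gain = (b200_memory_gb / h200_memory_gb - 1) * 100
bandwidth_gain = (b200_bandwidth_tbs / h200_bandwidth_tbs - 1) * 100

print(f"Memory capacity gain:  {capacity_gain:.1f}%")   # -> 36.2%
print(f"Memory bandwidth gain: {bandwidth_gain:.1f}%")  # -> 66.7%
```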
For now, xAI can use the Colossus AI training system freely, but that might change as soon as next month if California Governor Gavin Newsom signs the state's AI safety bill (SB 1047) into law. Those who want to learn more about AI safety can check out Chris Ategeka's Safeguarding Humanity: A Comprehensive Guide to AI Safety, available in paperback for $19.99.