On July 14, Huawei launched new AI storage products for the era of large models. The products offer storage solutions for foundation model training, industry model training, and training and inference for models in segmented scenarios, which the company says will unleash new AI momentum.
In the process of developing and implementing large model applications, enterprises face four major challenges:
First, data preparation takes a long time: data sources are scattered and collection is slow, so preprocessing 100 TB of data takes about 10 days. Second, multi-modal large models use massive volumes of text and images as training sets, but massive numbers of small files currently load at less than 100 MB/s, so training set loading is inefficient. Third, large model parameters are tuned frequently and training platforms are unstable: training is interrupted about once every two days on average, a checkpoint mechanism is needed to resume it, and failure recovery takes more than a day. Finally, the barrier to implementing large models is high: system construction is complex, resource scheduling is difficult, and GPU resource utilization is usually below 40%.
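To put the first figure in perspective, the short Python sketch below converts "100 TB in about 10 days" into an average throughput. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and continuous, around-the-clock processing. Neither assumption is stated in the announcement, and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_throughput_mb_s(terabytes: float, days: float) -> float:
    """Average throughput in MB/s, assuming decimal units and 24/7 processing."""
    total_bytes = terabytes * 1e12        # assumption: 1 TB = 10^12 bytes
    seconds = days * SECONDS_PER_DAY      # assumption: no idle time
    return total_bytes / seconds / 1e6    # assumption: 1 MB = 10^6 bytes


# Figures quoted in the article: 100 TB preprocessed in about 10 days.
preprocess_mb_s = average_throughput_mb_s(100, 10)
print(f"{preprocess_mb_s:.1f} MB/s")
```

Under these assumptions the average works out to roughly 116 MB/s, the same order of magnitude as the sub-100 MB/s small-file loading speed cited as the second challenge.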
In step with the development of AI in the era of large models, Huawei has launched the OceanStor A310 deep learning data lake storage and the FusionCube A3000 training/inference hyper-converged appliance for large model applications across different industries and scenarios.
Zhou Yuefeng, President of Huawei Data Storage Product Line
The OceanStor A310 deep learning data lake storage targets data lake scenarios for foundation and industry large models. It manages massive data across the full AI process, from data collection and preprocessing to model training and inference.
A single 5U OceanStor A310 enclosure supports what Huawei describes as the industry's highest bandwidth, 400 GB/s, and highest performance, 12 million IOPS. It scales linearly to 4,096 nodes and supports lossless multi-protocol interworking. Its Global File System (GFS) provides an intelligent cross-region data fabric that simplifies data collection, and near-memory computing enables near-data preprocessing, which reduces data movement and improves preprocessing efficiency by 30%.
The FusionCube A3000 training/inference hyper-converged appliance targets industry large model training and inference scenarios and applications of models with tens of billions of parameters. It integrates OceanStor A300 high-performance storage nodes, training/inference nodes, switching equipment, AI platform software, and management and O&M software, giving large model partners a turnkey deployment experience with one-stop delivery. It works out of the box and can be deployed within 2 hours. To suit models of different sizes, training nodes and storage nodes can be scaled out independently. FusionCube A3000 also uses high-performance containers so that multiple model training and inference tasks can share GPUs, raising resource utilization from 40% to more than 70%. It supports two flexible business models: Huawei's Ascend one-stop solution, and a one-stop solution from third-party partners with open computing, networking, and AI platform software.
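As a rough illustration of what the utilization figures imply, the Python sketch below compares useful GPU capacity at 40% and 70% utilization. It assumes that useful work scales linearly with utilization, a simplification the announcement does not make, and the cluster size of 100 GPUs is hypothetical.

```python
def effective_gpus(total_gpus: int, utilization: float) -> float:
    """GPUs' worth of useful work, assuming work scales linearly with utilization."""
    return total_gpus * utilization


TOTAL_GPUS = 100  # hypothetical cluster size, chosen only for illustration

before = effective_gpus(TOTAL_GPUS, 0.40)  # utilization cited as typical
after = effective_gpus(TOTAL_GPUS, 0.70)   # utilization cited for FusionCube A3000

# Relative gain in useful work on the same hardware.
gain = after / before

# GPUs needed at 70% utilization to match the work of 100 GPUs at 40%.
gpus_for_same_work = before / 0.70

print(f"Useful capacity: {before:.0f} -> {after:.0f} GPU-equivalents ({gain:.2f}x)")
print(f"GPUs needed for the same work: {gpus_for_same_work:.1f}")
```

Under this linear model, the same hardware delivers about 1.75 times the useful work, or equivalently the same workload fits on roughly 57 of the 100 GPUs.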
Zhou Yuefeng said that in the era of big data, data determines the level of artificial intelligence. As the carrier of data, data storage has become key infrastructure for large AI models. Huawei Data Storage will continue to innovate, providing diverse solutions and products for the era of large AI models and working with partners to bring AI to a wide range of industries.