Thanks to netizen Mr. Aviation for submitting the tip!

July 2 news: Tencent yesterday released version 2.0 of its self-developed Xingmai network. The upgraded network supports a single cluster of 100,000 cards. Compared with the previous generation, network communication efficiency is 60% higher, large model training efficiency is 20% higher, and fault localization time is reduced from days to minutes.

On the hardware side, Tencent's self-developed switches have reportedly been upgraded from 25.6T to 51.2T, doubling capacity, and its self-developed silicon photonic modules have been upgraded from 200G to 400G, doubling speed. The machines are also equipped with self-developed computing power network cards, and the communication bandwidth of a whole machine reaches 3.2T, which is described as the highest in the industry.

TiTa2.0, Tencent's new self-developed communication protocol, is now deployed on the network cards rather than on the switches, and its congestion handling has been upgraded from a passive congestion algorithm to an active congestion control algorithm. This raises the communication efficiency of the Xingmai network by 30% and the training efficiency of large models by 10%.

TCCL2.0, Tencent's new high-performance collective communication library, uses NVLINK+NET heterogeneous parallel communication to transmit data in parallel. Its Auto-Tune Network Expert adaptive algorithm automatically adjusts parameters such as packet segmentation size and matching algorithm according to differences in the model, network size, model algorithm, and so on. This improves the communication performance of the Xingmai network by another 30% and the training efficiency of large models by another 10%.
▲ Parallel transmission of data (Tencent Cloud)

Combined, the TiTa and TCCL upgrades raise the communication efficiency of the Xingmai network by a total of 60% and large model training efficiency by a total of 20%.
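The quoted totals appear to be simple sums of the gains attributed to the two upgrades (30% + 30% for communication, 10% + 10% for training). The short sketch below reproduces that arithmetic under the assumption that the gains are added as percentage points. The function and variable names are illustrative and do not come from Tencent. A compounded figure is shown only for comparison, since the article does not say how the gains were measured or combined.

```python
# Gains reported in the article, in percent: TiTa2.0 first, TCCL2.0 second.
comm_gains = [30, 30]   # communication efficiency
train_gains = [10, 10]  # large model training efficiency


def additive_total(gains):
    """Sum the gains as percentage points (matches the article's totals)."""
    return sum(gains)


def compounded_total(gains):
    """Multiply the gains as successive speedups (hypothetical alternative)."""
    factor = 1.0
    for g in gains:
        factor *= 1 + g / 100
    return round((factor - 1) * 100, 1)


print("communication, additive:", additive_total(comm_gains))
print("training, additive:", additive_total(train_gains))
print("communication, compounded:", compounded_total(comm_gains))
print("training, compounded:", compounded_total(train_gains))
```

The additive results match the 60% and 20% totals in the article, which suggests the headline numbers are sums of the per-upgrade figures.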