experience,” Jin said on X. “With hand-written CUNN kernels and optimizations, the performance is higher.” Jin also noted that the 910C could be used for training as well, although R1 was officially trained on H800 chips; that doesn’t mean DeepSeek will continue to use H800s forever.
Performance is a significant problem for Nvidia in China, as Biden-era sanctions issued by the US government prevent companies from selling processors that are deemed too fast. Many of Nvidia’s best data center GPUs, like its H200 and B200, can’t be legally exported to China, forcing Nvidia to develop new models specifically for China that just barely meet the performance limit.
In fact, the H800, which DeepSeek claimed to use to train the R1 LLM, was launched after the Biden administration’s initial round of GPU export restrictions on China, in order to offer an alternative to the banned H100. However, the H800 and other Nvidia GPUs for the Chinese market were banned after the next round of sanctions, which lowered the performance cap of chips that could be legally sold in China.
Because of the US government’s export restrictions, Nvidia is forced to compete in China with weaker hardware; the chip company’s flagship for China, the H20, has much less memory, lower memory bandwidth, and fewer TFLOPS than the H200, the top-end Hopper-based card.
This has apparently had a very real impact on Nvidia’s fortunes in China: in May 2024, the company was selling the H20 for less than Huawei’s Ascend 910B. However, H20 sales were apparently much better in the second half of last year, with H20 revenue growing by 50% in Q4 compared to Q3, after back-to-back quarters of healthy growth. Either way, Nvidia would certainly be in a better position against its Chinese competitors if it could sell its most powerful GPUs to China.
It’s not just about Nvidia being able to compete in China, though. Being able to run a Chinese LLM with cutting-edge performance on Chinese processors could be a major milestone on the country’s path to technological autarky. If the Ascend 910C or another Chinese GPU proves sufficient for training and inference, there will probably be even less need for processors like the H20. Of course, China won’t be ready to completely ditch Western chips until it progresses in chip manufacturing, but companies like Huawei are working on it.