Nvidia claims leadership in performance and versatility


Nvidia leads this year’s MLPerf inference benchmark. New data shows performance leaps with Hopper and new hardware.

In the MLPerf benchmark, hardware vendors and service providers compete with their AI systems. The test is hosted by MLCommons and aims to transparently compare different chip architectures and system variants.

Today MLCommons released new results for the MLPerf Inference 3.0 benchmark. A new feature is a network environment that tests the AI performance of different systems under more realistic conditions: data is streamed to an inference server. The test is designed to reflect more accurately how data reaches the AI accelerator and how results are returned in real-world deployments, thus revealing bottlenecks in the network.

Nvidia Hopper makes significant year-over-year gains

According to Nvidia, the H100 Tensor Core GPUs in the DGX H100 systems deliver up to 54 percent more inference performance than last year thanks to software optimizations. The largest jump is in RetinaNet inference. Other models improve as well: BERT at the 99 percent accuracy target runs 12 percent faster, ResNet-50 runs 13 percent faster, and 3D U-Net, which is used in medical applications, runs 31 percent faster.


Nvidia’s H100 delivers significantly higher performance in some cases this year due to software optimizations. | Picture: Nvidia

In presenting the results, Nvidia emphasized that it sees itself as the clear leader in performance and, equally important, in the versatility of its architecture. Nvidia is the only company to present results for all tasks in MLPerf Inference 3.0.

A comparison with some accelerators that participated in MLPerf Inference 3.0. | Picture: Nvidia

The H100 GPU is also significantly faster than the A100 GPU at inference for transformer models. On BERT at the 99.9 percent accuracy target, the H100 delivers more than four times the performance of the A100 thanks to its Transformer Engine.

As a result, the card promises to deliver big performance gains for many generative AI models, such as those that generate text, images, or 3D models.

New L4 card up to 3 times faster than predecessor

New to the benchmark is Nvidia’s L4 Tensor Core GPU, which the company recently unveiled at GTC. The card is already available from some cloud providers and delivers 2.2 to 3.1 times the inference performance of its predecessor, the T4, in the benchmark.

Nvidia’s L4 is significantly faster than the predecessor T4. | Picture: Nvidia

Nvidia’s Jetson AGX Orin for edge applications is also up to 63 percent more energy efficient and delivers up to 81 percent more performance than last year thanks to numerous improvements.
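
The figures above mix two ways of stating a gain: "X percent more" and "N times the performance". To compare them on one scale, the conversion can be sketched as follows. This is a minimal illustration only; the function names are hypothetical, and the numbers are the vendor-reported "up to" figures quoted in this article, not independent measurements.

```python
def percent_gain_to_factor(percent_gain: float) -> float:
    """Convert 'X percent more performance' into a multiplier (54 -> 1.54)."""
    return 1.0 + percent_gain / 100.0


def factor_to_percent_gain(factor: float) -> float:
    """Convert 'N times the performance' into a percent gain (3.1 -> about 210)."""
    return (factor - 1.0) * 100.0


# Year-over-year H100 gains as quoted in the article (percent).
h100_gains = {"RetinaNet": 54, "3D U-Net": 31, "ResNet-50": 13, "BERT 99": 12}

# L4 versus T4 range as quoted in the article (multiples).
l4_vs_t4 = (2.2, 3.1)

if __name__ == "__main__":
    for model, gain in h100_gains.items():
        print(f"H100 {model}: {percent_gain_to_factor(gain):.2f}x last year's result")
    low, high = (factor_to_percent_gain(f) for f in l4_vs_t4)
    print(f"L4 vs. T4: {low:.0f} to {high:.0f} percent more performance")
```

Read this way, the 54 percent RetinaNet gain is a factor of 1.54, well below the L4's 2.2 to 3.1 times over the T4. The two are not directly comparable, though: the first is a software-only improvement on the same hardware, the second a change of hardware generation.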

