NVIDIA and OpenAI Launch Fastest Open Reasoning Models
NVIDIA and OpenAI released fast open AI models using NVFP4 and CUDA, making advanced reasoning easier and more accessible.

Quick Take
NVIDIA and OpenAI released two open-weight reasoning models: gpt-oss-120b and gpt-oss-20b
The 120b model processes 1.5M tokens/sec using NVIDIA’s GB200 NVL72 system
NVFP4 precision format enables faster, energy-efficient inference without accuracy loss
Models run on CUDA-compatible hardware, from cloud servers to RTX desktops
Open-source release helps startups, researchers, and developers to build custom solutions
NVIDIA and OpenAI have just released two new open-weight reasoning models, gpt-oss-120b and gpt-oss-20b. The 120b model can process 1.5 million tokens per second on a single NVIDIA GB200 NVL72 system. It’s made possible by a mix of NVIDIA’s Blackwell architecture and a new 4-bit precision format called NVFP4. That format helps strike a balance between accuracy and efficiency.
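For scale, a quick back-of-the-envelope split of that headline number (the 72-GPU count comes from the NVL72 system design, not from the announcement itself):

```python
# Rough sanity check: a GB200 NVL72 rack links 72 Blackwell GPUs,
# so the aggregate 1.5M tokens/sec figure averages out per GPU.
total_tokens_per_sec = 1_500_000   # reported aggregate throughput for gpt-oss-120b
gpus_in_nvl72 = 72                 # GPUs in one NVL72 rack
per_gpu = total_tokens_per_sec / gpus_in_nvl72
print(f"~{per_gpu:,.0f} tokens/sec per GPU")  # ~20,833
```

That per-GPU average is only an illustration; real workloads will not divide evenly across the rack.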
What Powers the Models
What helps these models run so efficiently is a mix of new hardware and smart software. They were trained on NVIDIA’s powerful H100 GPUs and are designed to work smoothly across a wide range of devices, from big cloud systems to regular desktop PCs with NVIDIA RTX cards. If you already use CUDA, you can probably run these models without much extra work.
Both models are also packaged as what NVIDIA calls “Inference Microservices,” which makes them faster and easier to deploy. You don’t need to build everything from scratch. And if you’re already using popular AI tools like Hugging Face or llama.cpp, these models will plug right in.
NVIDIA’s newer Blackwell hardware plays a big role here, too. It supports a 4-bit precision format called NVFP4, which helps the models run faster and more efficiently by using lower-precision numbers without losing accuracy. That might sound technical, but the outcome is simple: faster AI that uses less power and memory. For businesses, that can mean lower costs.
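To make the lower-precision idea concrete, here is a toy Python sketch of block-wise 4-bit quantization in the spirit of NVFP4: values are snapped to the magnitudes a 4-bit E2M1 float can represent, with one shared scale per small block. This is an illustration of the general technique, not NVIDIA’s implementation; real NVFP4 uses FP8 block scales and dedicated Blackwell hardware.

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit),
# the element format NVFP4 builds on.
FP4_LEVELS = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_blockwise(x, block=16):
    """Toy quantizer: 4-bit values with one scale per block of 16."""
    blocks = x.reshape(-1, block)
    # Scale each block so its largest magnitude maps to the top FP4 level (6.0).
    scale = np.abs(blocks).max(axis=1, keepdims=True) / FP4_LEVELS[-1]
    scale[scale == 0] = 1.0                      # avoid divide-by-zero on all-zero blocks
    scaled = blocks / scale
    # Snap each magnitude to the nearest representable FP4 level, keeping the sign.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_LEVELS).argmin(axis=-1)
    q = np.sign(scaled) * FP4_LEVELS[idx]
    return (q * scale).reshape(x.shape)          # dequantized approximation of x

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)  # stand-in for model weights
w_q = quantize_fp4_blockwise(w)
print(f"max absolute error: {np.abs(w - w_q).max():.3f}")
# Storage: 4 bits per value plus one scale per 16 values, vs 16 bits per value for FP16.
```

The memory saving is roughly 4x versus FP16 storage, which is why a format like this lets bigger models fit on less hardware.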
There’s also a long-running relationship between NVIDIA and OpenAI that helped make this possible, going back to when Jensen Huang personally delivered the first DGX-1 to OpenAI. What’s happening now with the gpt-oss series feels like the next logical step in that collaboration. Models at this scale demand orders of magnitude more computing power, polish, and operational readiness, and here hardware, software, and services are all working together, which is rare to see at this level.
Open for Everyone to Build
One of the most important things about this release is that the models are open. Anyone, from startups to universities, can build on them, customize them, and use them in their own systems. OpenAI now has over 4 million lifetime developers building on its platform. NVIDIA, on its side, has more than 6.5 million developers using its software tools. They’ve been working together for nearly a decade, and the reach is massive: hundreds of millions of GPUs worldwide run on the NVIDIA CUDA platform. When technology like this gets released into an ecosystem that large and experienced, adoption tends to move quickly. And that’s where this starts to feel less like a launch and more like a turning point.