Nvidia Nemotron Nano 9B V2

Nvidia Nemotron Nano 9B V2 is a dense hybrid Mamba-Transformer reasoning model that matches or exceeds Qwen3-8B accuracy at up to 6x the throughput, with built-in thinking budget control.

Reasoning, Tool Use
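index.ts
import { streamText } from 'ai'

const result = streamText({
  model: 'nvidia/nemotron-nano-9b-v2',
  prompt: 'Why is the sky blue?',
})

// Print the response text as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}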

More models by NVIDIA

| Context | Latency | Throughput | Input | Output | Cache read | Cache write | Providers | Release date |
|---|---|---|---|---|---|---|---|---|
| 1M | 3.1s | 228 tps | $0.37/M | $1.08/M | $0.12/M | - | baseten, blackbox, deepinfra, +1 | 06/04/2026 |
| 256K | 0.9s | 552 tps | $0.15/M | $0.65/M | - | $0.06/M | baseten, bedrock | 03/18/2026 |
| 262K | 0.2s | 107 tps | $0.05/M | $0.24/M | - | - | deepinfra | 12/01/2024 |
| 131K | 0.2s | 152 tps | $0.20/M | $0.60/M | - | - | bedrock, deepinfra | 12/01/2024 |
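The Input and Output prices in the listing are quoted per million tokens, so a request's cost scales linearly with its token counts. The sketch below shows that arithmetic. The `TokenPricing` type and the `estimateCostUSD` function are hypothetical names introduced for illustration and are not part of any SDK. The sample prices ($0.05/M input, $0.24/M output) are taken from one row of the listing, and cache and per-query charges are ignored.

```typescript
// Hypothetical helper: per-million-token prices in USD.
interface TokenPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

// Estimate the cost of a request from its token counts.
// Assumes simple linear pricing with no cache discounts or per-query fees.
function estimateCostUSD(
  pricing: TokenPricing,
  inputTokens: number,
  outputTokens: number,
): number {
  if (inputTokens < 0 || outputTokens < 0) {
    throw new RangeError('token counts must be non-negative');
  }
  return (
    (inputTokens * pricing.inputPerMillion +
      outputTokens * pricing.outputPerMillion) /
    1_000_000
  );
}

// Sample prices copied from one row of the listing.
const samplePricing: TokenPricing = { inputPerMillion: 0.05, outputPerMillion: 0.24 };

// 2M input tokens and 500K output tokens.
const cost = estimateCostUSD(samplePricing, 2_000_000, 500_000);
console.log(`Estimated cost: $${cost.toFixed(2)}`); // prints "Estimated cost: $0.22"
```

With these figures, input accounts for $0.10 and output for $0.12 of the total, which illustrates why output-heavy workloads such as long reasoning traces tend to dominate cost even when input prices look low.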