DeepSeek V4 Flash

DeepSeek V4 Flash is the efficiency-tier model in DeepSeek's V4 series, released on April 23, 2026. It pairs a hybrid attention architecture with a 1M-token context window and supports reasoning, tool use, and implicit caching.

Reasoning, Tool Use, Implicit Caching
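index.ts
import { streamText } from 'ai';

const result = streamText({
  model: 'deepseek/deepseek-v4-flash',
  prompt: 'Why is the sky blue?',
});

// Print each text chunk as it arrives.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}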

More models by DeepSeek

Context | Latency | Throughput | Input | Output | Cache | Capabilities | Providers | Release Date
1M | 0.4s | 106 tps | $1.74/M, $0.43/M | $3.48/M, $0.87/M | Read: $0.0/M | +1 | baseten, deepinfra, deepseek, +3 | 04/23/2026
164K | 0.7s | 101 tps | $0.28/M | $0.42/M | Read: $0.03/M | +3 | bedrock, deepinfra, deepseek, +2 | 12/01/2025
164K | 0.6s | 124 tps | $0.28/M | $0.42/M | Read: $0.03/M | +3 | bedrock, deepinfra, deepseek, +2 | 12/01/2025
131K | 1.7s | 27 tps | $0.27/M | $1.00/M | Read: $0.14/M | | novita | 09/22/2025
164K | 0.9s | 38 tps | $0.21/M | $0.79/M | Read: $0.13/M | | deepinfra, fireworks, novita, +2 | 08/21/2025
164K | 1.1s | 38 tps | $0.27/M | $1.12/M | Read: $0.14/M | | novita | 12/26/2024
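
The per-million-token prices listed above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: `Pricing`, `Usage`, and `estimateCostUSD` are hypothetical names, not part of any SDK, and it assumes that cached input tokens are billed at the cache-read rate in place of the regular input rate. The example rates are copied from one row of the listing, reading its two fused prices as input ($0.28/M) and output ($0.42/M), with a $0.03/M cache read.

```typescript
// Hypothetical helper (not part of any SDK): estimate the cost of a single
// request from per-million-token prices.
interface Pricing {
  inputPerM: number;     // USD per 1M uncached input tokens
  outputPerM: number;    // USD per 1M output tokens
  cacheReadPerM: number; // USD per 1M input tokens served from cache
}

interface Usage {
  inputTokens: number;       // all input tokens, cached or not
  cachedInputTokens: number; // subset of inputTokens read from cache
  outputTokens: number;
}

function estimateCostUSD(p: Pricing, u: Usage): number {
  if (u.cachedInputTokens > u.inputTokens) {
    throw new RangeError('cachedInputTokens cannot exceed inputTokens');
  }
  // Assumption: cached tokens are charged the cache-read rate only.
  const uncachedInput = u.inputTokens - u.cachedInputTokens;
  const total =
    uncachedInput * p.inputPerM +
    u.cachedInputTokens * p.cacheReadPerM +
    u.outputTokens * p.outputPerM;
  return total / 1e6;
}

// Example rates taken from one row of the listing above.
const examplePricing: Pricing = {
  inputPerM: 0.28,
  outputPerM: 0.42,
  cacheReadPerM: 0.03,
};

const exampleCost = estimateCostUSD(examplePricing, {
  inputTokens: 100000,
  cachedInputTokens: 80000,
  outputTokens: 2000,
});

console.log(exampleCost.toFixed(4)); // prints 0.0088
```

With 80% of the prompt served from cache, the cached portion contributes less to the total than the 20,000 uncached tokens do, which is why implicit caching matters most for long, repeated prompts.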