Latency-optimized Grok 4 variant built for real-time and high-volume workloads. Delivers very high output throughput and low time-to-first-token at the lowest price in the Grok 4 line, with a 256K-token context window and text+image input.
profile normalized against the 47-model field
Context window· 256K of 1.05M24%
Max output· 16K of 272K6%
Output speed· 260 tok/s81%
Affordability—
Capability breadth· 7 of 1164%
Capability switches · 7 of 11
Reasoning mode
Tool / function use
Streaming
JSON mode
Structured outputs
Prompt caching
Fine-tuning
Web search
Code execution
Vision input
Audio input
Specifications
Every value carries a primary source and a verification date.
Capacity
Context window
256K
Max output
16K
Performance
Output speed
260 tok/s tok/s
Time to first token
190 ms ms
Capabilities
Input modalities
Text, image
Output modalities
Text
Reasoning mode
No
Tool / function use
Yes
Streaming
Yes
JSON mode
Yes
Structured outputs
Yes
Prompt caching
Yes
Web search
Yes
Vision input
Yes (image input)
API
API model ID
grok-4-fast
Batch API
Yes
General
Release date
March 12, 2026
Benchmarks
Sourced evaluation scores, each verified against its primary source.