
NVIDIA releases Nemotron 3.5 Content Safety, unifying multimodal safety, multilingual coverage, and custom policy enforcement in a single 4B model

Jun 4, 2026, published Jun 5, 2026

NVIDIA has released Nemotron 3.5 Content Safety, a 4-billion-parameter model that combines multimodal input evaluation, support for 12 languages, custom enterprise policy enforcement, and auditable reasoning in a single API call. The model was published June 4, 2026, on Hugging Face, where it is available as open weights alongside NVIDIA's hosted inference endpoint.

What's new

Built on Google's Gemma 3 4B IT base model and fine-tuned on multimodal and multilingual content-safety datasets, Nemotron 3.5 Content Safety accepts:

A user prompt (text)
An optional image
An optional assistant response
An optional user-defined safety policy

It returns a safe/unsafe classification, applicable safety-category labels, and—in THINK mode—a step-by-step reasoning trace before the verdict.
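As a rough illustration of that input and output shape, the sketch below models the four inputs and the three-part result as plain Python data classes. Every name in it (SafetyRequest, SafetyVerdict, to_payload, summarize, and all field names) is hypothetical and chosen for readability. It is not NVIDIA's request schema or the model's actual output format; the model card is the place to check those.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SafetyRequest:
    """The four inputs listed above. Field names are illustrative only."""
    user_prompt: str                          # required text prompt
    image_url: Optional[str] = None           # optional image
    assistant_response: Optional[str] = None  # optional assistant reply
    custom_policy: Optional[str] = None       # optional user-defined policy
    think: bool = False                       # ask for a reasoning trace


@dataclass
class SafetyVerdict:
    """The result described above. The shape is an assumption."""
    safe: bool
    categories: List[str] = field(default_factory=list)
    reasoning: Optional[str] = None           # only populated in THINK mode


def to_payload(req: SafetyRequest) -> Dict[str, Any]:
    """Build a request dict, leaving out optional inputs that were not set."""
    payload: Dict[str, Any] = {"user_prompt": req.user_prompt, "think": req.think}
    for name in ("image_url", "assistant_response", "custom_policy"):
        value = getattr(req, name)
        if value is not None:
            payload[name] = value
    return payload


def summarize(verdict: SafetyVerdict) -> str:
    """Render a verdict as one line, appending the trace when present."""
    label = "safe" if verdict.safe else "unsafe"
    line = f"{label} [{', '.join(verdict.categories)}]"
    if verdict.reasoning:
        line += f" because: {verdict.reasoning}"
    return line
```

Only the user prompt is required in this sketch, mirroring the "optional" wording in the list above. A caller would populate image_url, assistant_response, or custom_policy only when that input exists, and would read the reasoning field only when THINK mode was requested.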

Key specifications:

Parameters: 4B total, 128K context window
Languages: 12 with explicit training (English, French, Spanish, German, Chinese, Japanese, Korean, Arabic, Hindi, Russian, Portuguese, Italian); approximately 140 via Gemma 3 base-model transfer
Output modes: Binary verdict / verdict with categories / THINK mode (auditable reasoning trace)
Safety taxonomy: 13 core categories plus 10 fine-grained subcategories under the Aegis 2.0 framework
Hardware: Runs in 4-bit quantization at approximately 6.7 GB VRAM
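A quick back-of-the-envelope check helps put the hardware figure in context. The sketch below computes only the memory taken by the weights at a given precision, using the nominal 4B parameter count from the list above. The helper name weight_memory_gb is hypothetical, and the source does not break down the rest of the quoted 6.7 GB; attributing it to activations, cache, or components kept at higher precision is an assumption, not a stated fact.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


PARAMS = 4e9             # nominal "4B total" from the spec list
REPORTED_VRAM_GB = 6.7   # approximate figure quoted in the article

four_bit_gb = weight_memory_gb(PARAMS, 4)      # weights at 4-bit precision
sixteen_bit_gb = weight_memory_gb(PARAMS, 16)  # same weights at 16-bit, for comparison

# Whatever is left after the 4-bit weights; the source does not itemize it.
unexplained_gb = REPORTED_VRAM_GB - four_bit_gb

print(f"4-bit weights:  {four_bit_gb:.1f} GB")
print(f"16-bit weights: {sixteen_bit_gb:.1f} GB")
print(f"Remainder of the quoted figure: {unexplained_gb:.1f} GB")
```

Under these assumptions the 4-bit weights account for roughly 2 GB of the quoted footprint, so most of the 6.7 GB figure reflects runtime overhead rather than the weights themselves.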

The THINK mode is the most significant new addition. It generates a chain-of-thought trace that explains why content was flagged, which NVIDIA positions as supporting enterprise audit requirements.

Context

NVIDIA's content-safety stack has expanded steadily since 2024. Nemotron 3 Content Safety, released in March 2026, was the previous generation and marked the first time NVIDIA combined multimodal evaluation and multilingual coverage in a single 4B model. A separate model, Nemotron Content Safety Reasoning 4B, added custom policy enforcement and reasoning traces.

Nemotron 3.5 merges all four capabilities. The consolidation is deliberate: capabilities that were previously split across separate model deployments are now handled in one inference call. The model is available on NVIDIA's build.nvidia.com API, via DeepInfra and OpenRouter, and as open weights at huggingface.co/nvidia/Nemotron-3.5-Content-Safety.

Why it matters

Enterprise AI deployments increasingly run fragmented safety tooling—separate models for text moderation, image moderation, multilingual coverage, and policy enforcement. Each additional model adds latency, cost, and integration work. A single model that handles all four in one call simplifies the stack meaningfully.

The Aegis 2.0 taxonomy covers the categories most targeted by regulation, including the EU AI Act and emerging U.S. state AI laws. The custom-policy input lets enterprises define their own restrictions rather than relying on a fixed taxonomy, which matters for specialized verticals (legal, medical, financial) with domain-specific requirements.

The auditable THINK mode has direct compliance implications. Several emerging AI governance frameworks require that safety decisions be explainable. A reasoning trace that shows why content was flagged provides documentation that a binary safe/unsafe verdict cannot.

For teams building agentic pipelines, a compact, high-throughput safety layer that handles text and images in one model, without routing between separate models, reduces both latency and operational complexity. At 4B parameters, with hardware requirements low enough for commodity GPUs, Nemotron 3.5 is a practical candidate for inline moderation in production systems.

Corroborating sources

huggingface.co
https://huggingface.co/blog/nvidia/nemotron-3-5-content-safety
“Today, we are releasing Nemotron 3.5 Content Safety, which completes that arc: a single model that unifies multimodal input, multilingual reach, custom enterprise policy enforcement, and auditable reasoning into one inference call.”
build.nvidia.com
https://build.nvidia.com/nvidia/nemotron-3.5-content-safety
