Google DeepMind's Gemma 4 family arrives on Amazon Bedrock in three variants with multimodal and reasoning support

No audio yetJun 10, 2026published Jun 15, 2026

Amazon Web Services announced the availability of Google DeepMind's Gemma 4 open-weight model family on Amazon Bedrock on June 10, 2026. Three model variants are now available through Bedrock's managed inference service, each targeting different workload types — from high-reasoning and coding tasks to low-latency interactive applications.

What's new

The three variants available on Bedrock:

Gemma 4 31B — a 30.7-billion-parameter dense model optimized for reasoning and coding-heavy workloads, with a 256,000-token context window.
Gemma 4 26B-A4B — a mixture-of-experts architecture targeting cost- and latency-sensitive workloads, with 4 billion active parameters.
Gemma 4 E2B — the smallest variant in the family, designed for low-latency interactive use cases.

All three models include "built-in reasoning, native function calling, support for 35+ languages and multimodal input across text, image, video and audio." The models use an encoder-free architecture that processes visual and audio tokens directly through the language model backbone, reducing overhead compared to traditional multimodal pipelines.

With Gemma 4 on Bedrock, teams can build applications for "reasoning, multimodal understanding, agentic, and software engineering workflows" using AWS's managed inference layer — including Bedrock's guardrails, cross-region inference, and streaming — without managing their own GPU infrastructure.

Context

The Gemma 4 family was released by Google DeepMind in early June 2026 under Apache 2.0 license, first available as downloadable weights on Hugging Face and Kaggle for local deployment. The Bedrock launch extends the family's reach into managed enterprise cloud deployment, where organizations want access to capable open-weight models without the operational complexity of self-hosting.

Gemma 4 was previously available on Amazon SageMaker JumpStart as of April 2026. The Bedrock launch adds a different deployment profile: managed inference at scale, with the billing and security posture that enterprise teams expect from Bedrock.

Why it matters

For organizations running multimodal AI applications on AWS, Gemma 4 on Bedrock adds a Google DeepMind-produced option alongside Anthropic's Claude series, Meta's Llama variants, and Amazon's own Titan models — without changing the API or billing infrastructure. Teams that need vendor diversity as a policy or risk-management requirement gain a capable alternative within the same framework.

The mixture-of-experts 26B-A4B variant is particularly relevant for cost-conscious deployments: MoE architectures route tokens through only a subset of parameters per step, substantially reducing compute cost compared to a dense model of equivalent size.

With 35+ language support and native multimodal inputs across text, image, video, and audio, Gemma 4 on Bedrock positions the family as viable infrastructure for international product teams building multimodal pipelines at enterprise scale.

Corroborating sources

Aws.amazon
https://aws.amazon.com/about-aws/whats-new/2026/06/gemma-4-amazon-bedrock/
“built-in reasoning, native function calling, support for 35+ languages and multimodal input across text, image, video and audio.”