FeatureGoogleVerified

Google launches AI Edge Gallery and Eloquent dictation on macOS, powered by Gemma 4 12B

ListenJun 3, 2026published Jun 4, 2026

Google quietly opened a new front in the on-device-AI race today, bringing two Mac-native apps out of the lab and onto everyday laptops. The Google Developers Blog announced that Google AI Edge Gallery and Google AI Edge Eloquent are now available on macOS, both running locally on a user's machine and both powered by the newly released Gemma 4 12B open model.

What's new

Google AI Edge Gallery on macOS. Google's local-AI showcase app, previously limited to Android and iOS, is now a Mac-native experience. Google describes it as letting users "generate and execute scripts on the fly for tasks such as data analysis." The Mac build currently exposes five of Google's own Gemma models.
Google AI Edge Eloquent on macOS. A free, on-device voice-dictation app that transcribes speech and cleans it up in place, removing filler words and disfluencies. Google added the ability "to interactively polish and rewrite text through voice commands, entirely on-device, powered by the new Gemma 4 12B model."
LiteRT-LM serve command. Google's LiteRT-LM CLI gained a serve subcommand for spinning up a local OpenAI-compatible endpoint, lowering the lift to swap a hosted model for an on-device one inside existing code.
Gemma 4 12B as the engine. The 12B-parameter, multimodal Gemma 4 release is the model Google is leaning on across all three surfaces — it is designed to run on consumer MacBooks with roughly 16 GB of RAM, and Google reports "a 60%+ jump in overall quality" over earlier Gemma generations.

Context

Google released Gemma 4 12B earlier the same day as an open, encoder-free multimodal model sized for laptops. That release covered the model. Today's developer-blog post covers the delivery channel: the polished, Mac-native apps that put Gemma 4 12B in the hands of users who would never download a .gguf file or stand up Ollama. The Edge Gallery has existed on mobile for some time, but the macOS port — and the addition of a real productivity app in Eloquent — is the first time Google's on-device stack has shown up on the OS where most developers actually work. It also lands directly opposite Apple's own on-device foundation-model push, on Apple's own hardware.

Why it matters

On-device AI has spent the last 18 months mostly being a story about chips, not apps. Google is doing the opposite here: shipping two end-user surfaces that demonstrate concretely what a mid-sized local model is good for. Eloquent in particular is an interesting tell — voice-to-clean-text is a workflow where latency and privacy both reward local execution, and where the perceived quality gap to a frontier hosted model is small. If Gemma 4 12B is genuinely a 60%+ quality jump over its predecessors, Google has plausibly crossed the threshold where a 16-GB MacBook can replace a cloud round-trip for a meaningful slice of everyday tasks. The presence of an OpenAI-compatible local endpoint in LiteRT-LM signals where this is going: Google wants Gemma to be a drop-in target inside developer codebases that today point at hosted APIs, with the laptop quietly absorbing more of the workload.

Corroborating sources

Developers.googleblog
https://developers.googleblog.com/bringing-gemma-4-12b-to-your-laptop-unlocking-local-agentic-workflows-with-google-ai-edge/
“With the 12B model you can generate and execute scripts on the fly for tasks such as data analysis.”

What's new

Google AI Edge Gallery on macOS. Google's local-AI showcase app, previously limited to Android and iOS, is now a Mac-native experience. Google describes it as letting users "generate and execute scripts on the fly for tasks such as data analysis." The Mac build currently exposes five of Google's own Gemma models.

Google AI Edge Eloquent on macOS. A free, on-device voice-dictation app that transcribes speech and cleans it up in place, removing filler words and disfluencies. Google added the ability "to interactively polish and rewrite text through voice commands, entirely on-device, powered by the new Gemma 4 12B model."

LiteRT-LM serve command. Google's LiteRT-LM CLI gained a serve subcommand for spinning up a local OpenAI-compatible endpoint, lowering the lift to swap a hosted model for an on-device one inside existing code.

Gemma 4 12B as the engine. The 12B-parameter, multimodal Gemma 4 release is the model Google is leaning on across all three surfaces — it is designed to run on consumer MacBooks with roughly 16 GB of RAM, and Google reports "a 60%+ jump in overall quality" over earlier Gemma generations.

Context

Why it matters