Models & Releases
New models, versions, and capabilities shipping across the frontier and open-model labs.
New models, versions, and capabilities shipping across the frontier and open-model labs.
Amazon Web Services has launched Web Search on Amazon Bedrock AgentCore as a generally available feature, giving AI agents access to a continuously updated proprietary web index without routing…
OpenAI announced on June 18 that GPT-5.5 Instant — the model now powering health responses for all free ChatGPT users — outperformed physician-written answers on a set of 3,500 real-world health…
Google announced on June 15 the deprecation of six media generation models across its Gemini API — all three Imagen 4 image generation variants and all three Veo video generation models in the 2.0…
xAI's Grok 4.3 became available on Amazon Bedrock on June 15, 2026, making the reasoning-first model accessible to enterprise developers through AWS's managed AI infrastructure. The launch gives Grok…
Z.ai released GLM-5.2 on June 16, 2026, an open-source model with a 1M-token lossless context window and benchmark scores that place it at the front of available open-source coding models. The model…
Amazon has updated SageMaker AI Async Inference to accept raw inference payloads directly in the request body, removing the long-standing requirement to stage input data in Amazon S3 before…
Google updated the Gemini API on June 17, 2026 to support streaming for text-to-speech output via the gemini-3.1-flash-tts-preview model. Developers can now receive audio chunks as they are generated…
Google announced on June 15, 2026 the deprecation of its Imagen 4 image generation family and Veo 2/3 video generation models from the Gemini API. Veo users face a June 30 deadline — less than two…
Amazon Web Services has added container caching to Amazon SageMaker AI, removing the container image download step from the scale-out critical path and reducing model startup latency by 51% in tested…
Amazon Web Services has made P-EAGLE, a parallelized speculative decoding technique, available through Amazon SageMaker JumpStart. The method accelerates large language model inference by generating…
Meta introduced AI Mode on Facebook on June 16, 2026, a new search capability that replaces traditional link-based results with AI-generated answers drawn from public conversations happening across…
Cursor shipped a significant update to Bugbot on June 10, 2026, making its automated code review agent more than three times faster, 22% cheaper to run, and 10% more effective at catching bugs. The…
xAI has shipped a series of updates to its Grok Imagine API in June 2026, adding deep integration with the Files API, public URL sharing for stored assets, and a new image-to-video model — Grok…
Databricks completed its native AI SQL function suite in June 2026, promoting aiextract, aiclassify, and aiquery to general availability in a series of releases between June 11 and June 15 — giving…
Google announced on June 15 that it is deprecating all three Imagen 4 image generation models and three Veo video generation models from the Gemini API. The Imagen 4 family shuts down August 17,…
Anthropic retired two original Claude 4 models — Claude Sonnet 4 and Claude Opus 4 — effective June 15, 2026. All API requests to the model IDs claude-sonnet-4-20250514 and claude-opus-4-20250514 now…
Google released Gemini 3.5 Flash as a generally available model on May 19, 2026, and simultaneously launched Managed Agents in the Gemini API in public preview — a hosted agent execution service that…
Anthropic has shipped a /fork command to Claude Code, enabling developers to branch an AI coding session into a parallel subagent that inherits the full context of the existing conversation — a…
Midjourney set V8.1 as its default model on June 11, 2026, replacing V7 after user testing and feedback. The transition marks V8.1's general availability as the standard experience for all Midjourney…
Google formally ended API access to all four Gemini 2.0 Flash models on June 1, 2026. Requests to gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, and gemini-2.0-flash-lite-001 now…
xAI updated its developer platform on June 10, 2026, closing the loop between the Files API and the Imagine image and video generation endpoints. Three interlocking changes let applications…
Alongside the June 9, 2026, launch of Claude Fable 5 and Claude Mythos 5, Anthropic shipped a new beta API parameter called fallbacks that lets developers automatically reroute refused requests to a…
xAI launched Grok Build in May 2026, entering the agentic coding tool space alongside Anthropic's Claude Code, OpenAI's Codex CLI, and GitHub Copilot Workspace. The release packages two connected…
Google is integrating Gemini directly with Google Business Profile, giving small business owners an AI assistant that has access to their real operational data — customer reviews, search impressions,…
Anthropic announced on June 5, 2026 that Claude Sonnet 4 (claude-sonnet-4-20250514) and Claude Opus 4 (claude-opus-4-20250514) will be retired from the Claude API on June 15, 2026—five days from…
Anthropic has shipped three developer-facing additions to its Claude Managed Agents platform alongside the June 9 Claude Fable 5 launch. The most significant is scheduled deployments: developers can…
Anthropic on June 9, 2026, launched Claude Fable 5 and Claude Mythos 5, its most capable publicly released models to date. Fable 5 is immediately available to all users across Claude.ai and the API;…
OpenAI updated its Responses API on June 9, 2026, enabling web search to return image results alongside text. The change is live in the v1/responses endpoint and targets applications where a…
Google has launched Gemini 3.5 Live Translate, a speech-to-speech translation model that handles more than 70 languages and 2,000 language pair combinations in real time. The model enters public…
At London Tech Week on June 7, NVIDIA founder Jensen Huang and UK Prime Minister Keir Starmer declared that "the U.K. would be an AI maker, not an AI taker" — a commitment backed by a cluster of…
OpenAI on June 4, 2026 added moderation scoring to both the Responses API and the Chat Completions API, enabling developers to check the safety of prompts and model outputs in a single generation…
NVIDIA and LG Group announced on June 7, 2026 a broad collaboration to build an AI factory designed to accelerate LG's businesses across robotics, autonomous driving, data center infrastructure, and…
NVIDIA and Doosan Group announced an expanded collaboration on June 7, 2026, covering physical AI systems, industrial robotics, power infrastructure for AI factories, and substrate materials for AI…
Anthropic shipped two API changes on June 2, 2026 that reduce billing costs and improve control for developers running inference at scale. What's New No billing for zero-output refusals: On the…
Google has moved two of its native Gemini image generation models from preview to general availability on the Gemini API: gemini-3.1-flash-image and gemini-3-pro-image. The GA release also introduces…
ElevenLabs on May 28, 2026, released Dubbing v2, an AI dubbing model designed to preserve the emotional performance and vocal characteristics of the original speaker across more than 90 languages.…
ElevenLabs on May 26, 2026, released Music v2, the next version of its AI music generation model. The update adds genre-transition support, embedded sound effects, an improved multilingual engine,…
OpenAI on June 7, 2026 introduced Trusted Contact — an optional, opt-in safety feature in ChatGPT that can notify a designated contact if automated systems and trained human reviewers determine a…
OpenAI on June 7, 2026 published a detailed overview of three new realtime voice models now available in the API: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. The models advance…
Google moved two native visual generation models to general availability on May 28, 2026: Gemini 3.1 Flash Image (internally branded "Nano Banana 2") and Gemini 3 Pro Image ("Nano Banana Pro"). The…
Google retired four Gemini 2.0 API models on June 1, 2026, ending support for the Flash and Flash Lite variants that anchored the 2.0 generation. Developers still hitting these endpoints receive…
Amazon Web Services on June 4, 2026 added NVIDIA Nemotron 3 Ultra to SageMaker JumpStart, giving cloud developers one-click access to a 550-billion-parameter reasoning model designed specifically for…
Anthropic on June 2 shipped two developer-facing improvements to the Claude API: a new parameter for capping the advisor tool's output per call, and a billing change that eliminates charges on API…
Google used its annual I/O developer conference on May 19, 2026 to announce Gemini 3.5 Flash — a frontier-class model designed for fast, low-cost agentic inference — and Gemini Spark, a 24/7 personal…
OpenAI's GPT-5.4 and GPT-5.5 became available on Amazon Bedrock on June 1, 2026, through an OpenAI-compatible Responses API endpoint. The move lets AWS customers access frontier OpenAI models within…
Alibaba's Qwen team released Qwen3.6-27B on April 21, 2026, an open-weight vision-language model optimized for agentic coding. Despite its 27B parameter count — dense, not mixture-of-experts — it…
DeepSeek released V4-Pro and V4-Flash via its API on April 24, 2026, marking a new generation of its language model lineup. The update also sets a hard deprecation deadline: the long-standing…
OpenAI has launched Lockdown Mode, a new security feature for ChatGPT designed to reduce the risk of sensitive data being exfiltrated through prompt injection attacks. The feature is now rolling out…
OpenAI announced a round of platform deprecations in early June 2026, setting a November 30 deadline for the Evals Platform, Agent Builder, and reusable prompts, while also retiring three older image…
Mistral launched Vibe on May 28, 2026, a unified agent platform that consolidates productivity, developer, and conversational AI tools under a single interface at chat.mistral.ai. What's new Vibe…
NVIDIA unveiled RTX Spark on May 31, 2026, a purpose-built superchip that combines a Blackwell RTX GPU with a custom 20-core Grace CPU to deliver 1 petaflop of AI performance in slim laptops and…
Google unveiled Gemini for Science at Google I/O 2026 on May 19, 2026, announcing a suite of AI tools designed to accelerate scientific research workflows. Rather than a single purpose-trained model,…
xAI launched Grok Build in May 2026, entering the coding agent market with a purpose-trained model — grok-build-0.1 — built specifically for software engineering workflows. The release positions xAI…
xAI released Context Compaction for the Grok API in late May 2026, providing developers with a server-side mechanism for trimming accumulated conversation histories without losing the context that…
Google introduced Gemini Omni at Google I/O 2026, a new multimodal model that accepts images, audio, video, and text in any combination and generates video as its primary output. It is Google's first…
Anthropic added two enterprise-oriented features on May 19, 2026: MCP Tunnels as a Research Preview, enabling connections to Model Context Protocol servers running inside private networks, and…
Anthropic shipped two developer-facing improvements to the Claude API on June 2, 2026: a billing policy change that eliminates charges for zero-output refusals, and a new maxtokens parameter on the…
OpenAI updated its default prompt caching behavior on May 29, 2026, changing the default promptcacheretention setting from inmemory to 24h for all organizations that do not have Zero Data Retention…
OpenAI introduced Workload Identity Federation on May 26, 2026, a security capability that lets enterprise workloads authenticate to the OpenAI API without storing long-lived API keys. Trusted…
Google launched Gemini 3.5 Flash on May 19, 2026 at Google I/O 2026, the first model in a new series it describes as combining frontier-level intelligence with Flash-tier speed. Alongside it, Google…
On May 18, 2026, Anthropic updated the web search tool in the Claude API to return richer SEC filing data. The change improves the tool's usefulness for financial research agents, earnings analysis…
On May 27, 2026, Anthropic shipped a small but practically useful addition to the Claude API: responses now include a usage.outputtokensdetails.thinkingtokens field that reports exactly how many…
Anthropic on June 2, 2026, shipped two developer-facing API updates: a new cost-control parameter for the advisor tool, and a billing change that eliminates charges for requests the model refuses…
Anthropic shipped two developer-facing Claude API changes on June 2, 2026: a maxtokens parameter for the advisor tool that caps advisor model output per call, and a billing policy change that…
OpenAI on May 29, 2026 launched Rosalind Biodefense, a program that expands trusted access to GPT-Rosalind for vetted developers and U.S. government partners focused on biodefense, public health, and…
xAI updated its web search capability on May 27, 2026 to support explicit image search. The addition allows Grok-powered applications to retrieve and embed relevant images as part of model responses,…
xAI released Grok Build 0.1 on May 19, 2026, a model trained specifically for agentic coding tasks rather than general assistance. The release followed a May 14 beta launch of the Grok Build product,…
Anthropic announced on June 5, 2026 that Claude Opus 4.1 (claude-opus-4-1-20250805) is deprecated, with retirement from the Claude API scheduled for August 5, 2026. Teams still using Opus 4.1 have…
xAI shipped three API features on May 29, 2026 targeting production agentic workloads: a Context Compaction API for compressing long conversations into reusable shorter contexts, a WebSocket…
Google made three connected announcements on May 19, 2026: Gemini 3.5 Flash reached general availability, Managed Agents launched in the Gemini API as a public preview, and Google shipped Antigravity…
NVIDIA has released Nemotron 3.5 Content Safety, a 4-billion-parameter model that brings together multimodal input evaluation, support for 12 languages, custom enterprise policy enforcement, and…
Google moved gemini-3.1-flash-image and gemini-3-pro-image to general availability on May 28, 2026, bringing native visual generation to the Gemini API's stable tier. The flash model gains an…
Mistral AI on May 22, 2026 released Mistral Medium 3.5, a 128B open-weight dense model designed for coding, reasoning, and instruction-following in a single set of weights, alongside a new…
Google launched Gemini Omni on May 29, 2026 — a new model built specifically for video generation from any input type, including images, audio, existing video, and text. Its primary differentiator is…
OpenAI updated its Responses API and Chat Completions API on June 4, 2026 to support inline moderation scoring — letting developers receive policy-violation checks on both their input prompt and the…
Anthropic on May 29, 2026 extended its Claude Platform on AWS with three significant capabilities: webhook support for Claude Managed Agents, multiagent orchestration, and self-hosted sandboxes. The…
Anthropic on June 3 formally structured the services side of its Claude Partner Network, introducing a tiered Services Track and a public-facing Partner Hub — two moves that shift the company's…
Anthropic on May 19, 2026 launched two infrastructure features for Claude Managed Agents: self-hosted sandboxes and MCP tunnels. Both are in research preview and address enterprise concerns about…
Anthropic closed a $65 billion Series H funding round on May 28, 2026, at a $965 billion post-money valuation — the highest valuation ever recorded for a private AI company. The round was led by…
Google on June 1 shut down the entire Gemini 2.0 Flash family on the Gemini API, retiring gemini-2.0-flash, gemini-2.0-flash-001, gemini-2.0-flash-lite, and gemini-2.0-flash-lite-001 in a single…
OpenAI began rolling out a new long-term memory architecture for ChatGPT on June 4, 2026. The system, which the company calls Dreaming V3, replaces the saved-memories plus Dreaming V0 combination…
Google quietly opened a new front in the on-device-AI race today, bringing two Mac-native apps out of the lab and onto everyday laptops. The Google Developers Blog announced that Google AI Edge…
OpenAI on June 2, 2026 launched six role-specific Codex plugins, for data analytics, creative production, sales, product design, public equity investing, and investment banking, alongside Sites, a…
OpenAI broke ground on June 1, 2026 on a 1 gigawatt data center campus called The Barn in Saline, Michigan, the first publicly disclosed Stargate site east of Texas. Governor Gretchen Whitmer…
Anthropic on May 28, 2026 introduced Dynamic Workflows in Claude Code, a research preview that lets Claude plan and execute long-horizon coding tasks by orchestrating tens to hundreds of subagents…
OpenAI's two frontier chat models and its Codex coding agent are now generally available on Amazon Bedrock. AWS announced the move on June 1, 2026, one month after the two companies expanded a…
NVIDIA shipped JetPack 7.2 with NemoClaw agent-skill support on 2026-06-01, framing the release as Jetson's transition from a developer board into a production-grade platform for agentic AI at the…
Fundamental's NEXUS tabular foundation model became available on Amazon SageMaker JumpStart on 2026-06-03, giving AWS customers a one-click deployment path to a pre-trained model designed…
Mayo Clinic and Microsoft announced a strategic collaboration on 2026-06-02 to build and deploy a frontier AI model purpose-built for healthcare. The model will be owned by Mayo Clinic and made…
On May 31, 2026, NVIDIA announced Vera, its first standalone CPU positioned for the agentic-AI era. Vera pairs 88 NVIDIA-designed Olympus cores with high-bandwidth LPDDR5X memory and is being…
NVIDIA's largest open Nemotron model, Nemotron 3 Ultra, is set to land on June 4, 2026 as a NIM microservice and on the major open-weights hubs. NVIDIA framed Ultra at its May 31 software-and-agents…
Anthropic shipped two developer-facing ergonomics changes to the Claude API on June 2, 2026, both posted in the company's official Claude Platform release notes. The advisor tool — the…
Microsoft on June 2, 2026 launched Scout, an always-on personal AI agent that lives inside Microsoft 365 and acts on a user's behalf without being prompted each turn. Microsoft is positioning Scout…
OpenAI on June 3, 2026 shipped a major update to GPT-Rosalind, its life-sciences reasoning model, alongside two production plugins for Codex that turn the model's intelligence into repeatable…
xAI has added Smart Turn end-of-turn detection to its streaming Speech-to-Text API, a feature aimed at one of the most stubborn problems in voice interfaces — knowing when a speaker has actually…
OpenAI announced on June 3, 2026 the deprecation of three developer-platform products — reusable prompt objects (the v1/prompts API), the Evals platform, and Agent Builder — with a coordinated…
Google introduced Gemma 4 12B on June 3, 2026, a new open-weights multimodal model from Google DeepMind that is designed to run on consumer hardware while keeping benchmark performance close to a…
Together AI on June 2, 2026 published an engineering deep-dive on serving MiniMax's forthcoming M3 model on its inference cloud, claiming order-of-magnitude speedups on the model's signature…
Microsoft on June 2, 2026 used its Build developer conference to unveil seven foundation models built from scratch by its in-house Microsoft AI division — the first time the company has rolled out a…
Mistral released Search Toolkit on May 28, 2026, an open-source framework for building production search pipelines for AI applications. The framework ships with BM25 sparse retrieval, dense…
H-company released Holo3.1 on June 2, 2026, a family of computer-use agents in four sizes — 0.8B, 4B, 9B, and 35B-A3B (a 35-billion-parameter Mixture-of-Experts with roughly 3 billion active per…
JetBrains released Mellum2 on June 1, 2026, a 12-billion-parameter Mixture-of-Experts (MoE) model trained from scratch on natural language and code, with only 2.5 billion parameters active per token.…
At its GTC Taipei keynote at COMPUTEX, NVIDIA showcased more than a dozen industrial-software companies — including Cadence, Dassault Systèmes, Siemens and Synopsys — building autonomous AI engineers…
At CVPR 2026, NVIDIA unveiled a sweeping set of physical AI agent skills — runtime tools that let researchers stitch together world models, simulators, training and evaluation into a single…
NVIDIA and Microsoft used Microsoft Build on June 2, 2026 to announce a coordinated agentic-AI stack that spans Windows PCs, Azure cloud, and local enterprise deployment, plus a concrete hardware…
Mistral launched Vibe on May 28, 2026 as a unified agent for long-horizon productivity and coding, packaging two modes — Work and Code — alongside a new VS Code extension. Pricing starts free, with a…
Mistral introduced a new class of models on May 27, 2026 called Physics AI — data-driven models that predict physical behavior directly from geometry and boundary conditions, running on a single GPU…
Anthropic launched Claude Platform on AWS on 2026-05-11, a new AWS-native surface for the Claude API that runs on Anthropic-managed infrastructure with AWS billing and IAM authentication. The…
Anthropic released Claude Opus 4.8 on May 28, 2026, 41 days after Opus 4.7. The company says it improves on Opus 4.7 across benchmarks and ships at the same API price: $5 per million input tokens and…
xAI opened Grok 4.3 on its public API on April 30, 2026, after a SuperGrok Heavy beta that began April 17. xAI emphasizes improved agentic performance and a lower cost to run benchmark suites. On…
OpenAI released GPT-5.5 and GPT-5.5 Pro on April 23, 2026, calling GPT-5.5 its "smartest and most intuitive to use model yet," with gains in agentic coding, knowledge work, math, and scientific…
Anthropic announced general availability of Claude Opus 4.7 on April 16, 2026, with stronger coding and agentic performance than Opus 4.6 and higher-resolution vision. It reached 64.3% on SWE-bench…
Google released Gemini 3.1 Pro in preview on February 19, 2026 - its first ".1" mid-cycle Gemini update - built for complex problem-solving and advanced reasoning. It scored 77.1% on ARC-AGI-2, which…
On December 17, 2025, Google launched Gemini 3 Flash and made it the default model in the Gemini app and AI Mode in Search globally, replacing Gemini 2.5 Flash. Google positions it as Pro-grade…