OpenAI adds inline moderation scores to the Responses and Chat Completions APIs
OpenAI updated its Responses API and Chat Completions API on June 4, 2026 to support inline moderation scoring — letting developers receive policy-violation checks on both their input prompt and the model's output within a single API call, rather than making a separate request to the standalone moderation endpoint.
What's new
Developers can now pass a moderation object in any standard generation request. The API returns moderation results alongside the model response, covering:
- Input moderation — the prompt or conversation sent to the model
- Output moderation — the text generated in response
The feature is available on both the Responses API and Chat Completions API, OpenAI's two primary text-generation endpoints. Full usage details are in OpenAI's Moderation guide at developers.openai.com.
Context
OpenAI has offered a standalone /v1/moderations endpoint for years, but integrating it into production pipelines required a separate API call per request — adding latency and making it easy to accidentally ship unmoderated output if the secondary call was skipped or failed.
Inline moderation removes that architectural seam. The check arrives with the generation, meaning the developer either gets both results or neither — there is no path where a response is returned without a moderation decision attached.
The June 4 changelog entry follows a productive stretch for the OpenAI developer platform: June 3 saw the deprecation of reusable prompt objects, the Evals platform, and Agent Builder; June 2 brought a billing simplification for container sessions; and June 1 brought OpenAI model availability in Amazon Bedrock.
Why it matters
Combining moderation with generation cuts at least one round-trip from any content-safety pipeline, which is significant for latency-sensitive applications like chat interfaces and voice agents. For enterprise deployments, it also simplifies compliance documentation: the moderation signal is structurally tied to every generation, not dependent on a developer remembering to wire it in.
Regulatory pressure is a tailwind. The EU AI Act's high-risk application requirements and several U.S. state AI laws are increasing scrutiny of content safeguards in customer-facing AI products. Having inline moderation available as a single-parameter addition lowers the engineering cost of meeting those requirements, which may accelerate adoption among teams that have been deferring the integration.
Corroborating sources
- Developers.openai
https://developers.openai.com/api/docs/changelog
“Added moderation scores to the Responses API and Chat Completions API. Pass a `moderation` object in a generation request to receive moderation results for both model input and generated output in the same response.”