Models
October 29, 2025

Open-weight “gpt-oss” models release

OpenAI released gpt-oss-safeguard, open-weight reasoning models (20B and 120B) enabling developers to apply custom policies at inference, classify messages, completions and chats while explaining decision logic.

OpenAI introduced the gpt-oss-safeguard model series (gpt-oss-safeguard-20B and -120B) as open-weight reasoning engines tailored for safety and trust-and-safety classification tasks.

Developers supply their own policy text at runtime and the model reasons over input accordingly, classifies conversation elements (user messages, completions, full chats) and emits chain-of-thought explanations of how decisions are made.

OpenAI positions them as alternatives to rigid classifiers: they permit iterative policy changes without retraining. Limitations noted include higher compute/latency and that traditional classifiers may still win in ultra-high precision contexts.

#
OpenAI

Read Our Content

See All Blogs
Gen AI

Anthropic’s Claude Managed Agents platform accelerates AI agent deployment for teams

Deveshi Dabbawala

April 9, 2026
Read more
AI safety

Everything you need to know about Anthropic's Project Glasswing

Deveshi Dabbawala

April 8, 2026
Read more