The launch
GPT-5.5 was announced on April 23, 2026 for Plus, Pro, Business, and Enterprise subscribers. On May 5, GPT-5.5 Instant launched as the ChatGPT default, replacing GPT-5.3 Instant; it is also available via the API as chat-latest.
"Instant" is the tier optimized for low latency and low cost, designed for high-volume conversational use. It's what free users and most Plus users see on simple queries.
The number that matters
The headline figure: a 52.5% reduction in hallucinations on high-stakes prompts in medicine, legal, and finance, at a minimal cost in time-to-first-token.
Hallucinations are incorrect claims presented with confidence by the model. In critical domains — medical diagnoses, legal citations, financial data — a hallucination can have serious consequences.
OpenAI measured the reduction internally on a proprietary dataset of adversarial prompts in these three domains. The 52.5% drop is the largest improvement in an Instant iteration since GPT-4 → GPT-4 Turbo.
How they did it
OpenAI didn't publish technical details, but the description implies three mechanisms: better default RAG (the model consults verified sources before responding in these domains), stricter RLHF with domain-expert annotators, and an internal chain-of-thought step where the model "doubts" before asserting.
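Since OpenAI hasn't published the mechanism, the following is an illustration only, not the actual pipeline: a toy retrieve-then-answer loop in which the model may only assert what it finds in a verified corpus, and abstains otherwise. The corpus entries and keyword matching here are invented for the sketch.

```python
# Toy retrieve-then-answer sketch (illustration only; NOT OpenAI's
# actual RAG pipeline). A small "verified corpus" is searched by
# keyword, and the answer is constrained to retrieved snippets.

VERIFIED_CORPUS = {
    "metformin": "Metformin is a first-line treatment for type 2 diabetes.",
    "miranda": "Miranda v. Arizona (1966) requires police to inform suspects of their rights.",
    "libor": "LIBOR was phased out as a reference rate in June 2023.",
}

def retrieve(question: str) -> list[str]:
    """Return snippets whose key appears in the question."""
    q = question.lower()
    return [text for key, text in VERIFIED_CORPUS.items() if key in q]

def answer(question: str) -> str:
    """Answer only from retrieved sources; abstain otherwise."""
    sources = retrieve(question)
    if not sources:
        return "I don't have a verified source for that."
    return " ".join(sources)

print(answer("What is metformin used for?"))
print(answer("Tell me about drug XYZ-123"))
```

The design point is the abstention branch: consulting sources first trades a retrieval step for the ability to refuse rather than invent.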
The last mechanism is the notable one: GPT-5.5 Instant went from "respond quickly" to "respond quickly, but verify internally". The latency cost is low (~50ms extra) but the reliability gain is large.
And also: Codex and voice
OpenAI simultaneously launched GPT-5.3-Codex, the first model combining the Codex and GPT-5 training stacks: stronger code generation plus general-purpose reasoning, all in one model.
Realtime models were also updated: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. Voice latency drops to 180ms TTFB, with support for 50+ languages.
GPT-5.5-Cyber for defenders
On May 7, OpenAI broadened access to GPT-5.5-Cyber, a model specialized for vetted cybersecurity teams. It arrived a month after Anthropic's Mythos (a similar model). Tasks: vulnerability analysis, threat hunting, writing detection rules, malware analysis.
Access is controlled and limited to verified teams, for a simple reason: the same models can be used for offense or defense.
Product implications
For companies building on GPT via the API, the improvement is transparent: if you call the chat-latest alias rather than a pinned snapshot, you're already receiving GPT-5.5 Instant. The change shows up in production as fewer manual corrections and fewer cases where the model "invents".
The use case that benefits most: assistants in regulated domains such as healthcare, banking, and legal services. Where an integration previously required a strong human validation layer, that layer can now be lighter.
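What "a lighter validation layer" means in practice can be made concrete with a hypothetical routing rule (the function, thresholds, and confidence scores below are all invented for illustration): answers under a confidence threshold go to human review, and a more reliable model lets you lower that threshold safely, so more answers ship automatically.

```python
# Hypothetical validation layer for a regulated-domain assistant
# (a sketch, not any vendor's API): low-confidence answers are
# routed to human review. A more reliable model lets the threshold
# drop, shrinking the expensive human-review queue.

def route(answer: str, confidence: float, threshold: float) -> str:
    """Decide whether an answer ships directly or goes to review."""
    return "auto-send" if confidence >= threshold else "human-review"

# Same answer, same confidence; only the trust in the model changes:
print(route("Dosage is 500 mg twice daily.", confidence=0.91, threshold=0.95))
print(route("Dosage is 500 mg twice daily.", confidence=0.91, threshold=0.85))
```

The first call routes to review, the second ships directly; the whole economic argument of the section lives in that threshold.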
Conclusion
The 52.5% is not a marketing number; it's an internal benchmark OpenAI says is reproducible. If it holds in production, GPT-5.5 Instant will be the first "Instant" model usable directly in high-stakes flows without strong guardrails. That changes the economics of many products that today spend more on validation than on inference.
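A back-of-the-envelope calculation shows why "more on validation than on inference" matters; every number here is hypothetical and chosen only to make the shape of the tradeoff visible:

```python
# Back-of-the-envelope economics (ALL numbers hypothetical, for
# illustration only): when human review costs orders of magnitude
# more per request than inference, the review rate dominates cost.

INFERENCE_COST = 0.002   # $ per request (hypothetical)
REVIEW_COST = 0.50       # $ per human-reviewed request (hypothetical)

def cost_per_request(review_rate: float) -> float:
    """Expected cost when a fraction of answers goes to human review."""
    return INFERENCE_COST + review_rate * REVIEW_COST

before = cost_per_request(0.30)  # 30% of answers reviewed
after = cost_per_request(0.14)   # roughly halved, in line with the 52.5% figure
print(round(before, 4), round(after, 4))
```

Under these made-up numbers, inference is a rounding error; halving the review rate roughly halves the total cost per request, which is the "changes the economics" claim in miniature.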