The launch
GPT-5.5 was announced on April 23, 2026 for Plus, Pro, Business, and Enterprise subscribers. On May 5, GPT-5.5 Instant launched as the ChatGPT default, replacing GPT-5.3 Instant; it is also available via the API as chat-latest.
"Instant" is the tier optimized for low latency and low cost, designed for high-volume conversational use. It's what free users and most Plus users see on simple queries.
The number that matters
The headline figure: a 52.5% reduction in hallucinations on high-stakes prompts in medicine, legal, and finance, at a minimal cost in time-to-first-token.
Hallucinations are incorrect claims presented with confidence by the model. In critical domains — medical diagnoses, legal citations, financial data — a hallucination can have serious consequences.
OpenAI measured the reduction internally on a proprietary dataset of adversarial prompts in these three domains. The 52.5% drop is the largest improvement in an Instant iteration since GPT-4 → GPT-4 Turbo.
How they did it
OpenAI didn't publish technical details, but the description implies three mechanisms: better default RAG (the model consults verified sources before responding in these domains), stricter RLHF with domain-expert annotators, and an internal chain-of-thought step where the model "doubts" before asserting.
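Since OpenAI hasn't published the mechanism, the following is an illustration only, not the actual pipeline: a toy retrieve-then-answer loop in which the model may only assert what it finds in a verified corpus, and abstains otherwise. The corpus entries and keyword matching here are invented for the sketch.

```python
# Toy retrieve-then-answer sketch (illustration only; NOT OpenAI's
# actual RAG pipeline). A small "verified corpus" is searched by
# keyword, and the answer is constrained to retrieved snippets.

VERIFIED_CORPUS = {
    "metformin": "Metformin is a first-line treatment for type 2 diabetes.",
    "miranda": "Miranda v. Arizona (1966) requires police to inform suspects of their rights.",
    "libor": "LIBOR was phased out as a reference rate in June 2023.",
}

def retrieve(question: str) -> list[str]:
    """Return snippets whose key appears in the question."""
    q = question.lower()
    return [text for key, text in VERIFIED_CORPUS.items() if key in q]

def answer(question: str) -> str:
    """Answer only from retrieved sources; abstain otherwise."""
    sources = retrieve(question)
    if not sources:
        return "I don't have a verified source for that."
    return " ".join(sources)

print(answer("What is metformin used for?"))
print(answer("Tell me about drug XYZ-123"))
```

The design point is the abstention branch: consulting sources first trades a retrieval step for the ability to refuse rather than invent.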
The last mechanism is the notable one: GPT-5.5 Instant went from "respond quickly" to "respond quickly, but verify internally". The latency cost is low (~50ms extra) but the reliability gain is large.
And also: Codex and voice
OpenAI simultaneously launched GPT-5.3-Codex, the first model combining the Codex and GPT-5 training stacks: stronger code generation plus general-purpose reasoning, all in one model.
Realtime models were also updated: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. Voice latency drops to 180ms TTFB, with support for 50+ languages.
GPT-5.5-Cyber for defenders
On May 7, OpenAI broadened access to GPT-5.5-Cyber, a model specialized for vetted cybersecurity teams. It arrived a month after Anthropic's Mythos (a similar model). Tasks: vulnerability analysis, threat hunting, writing detection rules, malware analysis.
Access is controlled and limited to verified teams, for a simple reason: the same models can be used for offense or defense.
Product implications
For companies building on GPT via the API, the improvement is transparent: if you call the chat-latest alias rather than a pinned snapshot, you're already receiving GPT-5.5 Instant. The change shows up in production as fewer manual corrections and fewer cases where the model "invents".
The use case that benefits most: assistants in regulated domains such as healthcare, banking, and legal services. Where an integration previously required a strong human validation layer, that layer can now be lighter.
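What "a lighter validation layer" means in practice can be made concrete with a hypothetical routing rule (the function, thresholds, and confidence scores below are all invented for illustration): answers under a confidence threshold go to human review, and a more reliable model lets you lower that threshold safely, so more answers ship automatically.

```python
# Hypothetical validation layer for a regulated-domain assistant
# (a sketch, not any vendor's API): low-confidence answers are
# routed to human review. A more reliable model lets the threshold
# drop, shrinking the expensive human-review queue.

def route(answer: str, confidence: float, threshold: float) -> str:
    """Decide whether an answer ships directly or goes to review."""
    return "auto-send" if confidence >= threshold else "human-review"

# Same answer, same confidence; only the trust in the model changes:
print(route("Dosage is 500 mg twice daily.", confidence=0.91, threshold=0.95))
print(route("Dosage is 500 mg twice daily.", confidence=0.91, threshold=0.85))
```

The first call routes to review, the second ships directly; the whole economic argument of the section lives in that threshold.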
Conclusion
The 52.5% is not a marketing number; it's an internal benchmark OpenAI says is reproducible. If it holds in production, GPT-5.5 Instant will be the first "Instant" model usable directly in high-stakes flows without strong guardrails. That changes the economics of many products that today spend more on validation than on inference.
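A back-of-the-envelope calculation shows why "more on validation than on inference" matters; every number here is hypothetical and chosen only to make the shape of the tradeoff visible:

```python
# Back-of-the-envelope economics (ALL numbers hypothetical, for
# illustration only): when human review costs orders of magnitude
# more per request than inference, the review rate dominates cost.

INFERENCE_COST = 0.002   # $ per request (hypothetical)
REVIEW_COST = 0.50       # $ per human-reviewed request (hypothetical)

def cost_per_request(review_rate: float) -> float:
    """Expected cost when a fraction of answers goes to human review."""
    return INFERENCE_COST + review_rate * REVIEW_COST

before = cost_per_request(0.30)  # 30% of answers reviewed
after = cost_per_request(0.14)   # roughly halved, in line with the 52.5% figure
print(round(before, 4), round(after, 4))
```

Under these made-up numbers, inference is a rounding error; halving the review rate roughly halves the total cost per request, which is the "changes the economics" claim in miniature.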