OpenAI says new GPT-5.5-Cyber outperforms Anthropic's Mythos on cybersecurity benchmark

The Decoder·Matthias Bastian

6/23/2026

Quick Take

OpenAI says its new GPT-5.5-Cyber model beats Anthropic's Mythos on the CyberGym cybersecurity benchmark. The model arrives as part of an expanded Daybreak initiative that also includes an updated Codex Security plugin. The focus now shifts to automatic vulnerability patching, supported by a network of more than 25 security firms and several governments.

Key Points

GPT-5.5-Cyber outperforms Anthropic's Mythos on the CyberGym benchmark, according to OpenAI.
OpenAI expands its Daybreak initiative with the full GPT-5.5-Cyber model.
The updated Codex Security plugin covers the pipeline from vulnerability discovery to patch generation.
The partner network includes more than 25 security firms and several governments.

OpenAI is expanding its Daybreak cybersecurity initiative with an updated Codex Security plugin, the full GPT-5.5-Cyber model, and a partner network of more than 25 security firms and several governments.

OpenAI agrees with a point Anthropic recently made: the real bottleneck in cybersecurity has moved from finding flaws to actually patching them. To close that gap, OpenAI is shipping an updated Codex Security plugin that covers the full pipeline from discovery to patch generation, along with the full release of GPT-5.5-Cyber, a specialized model that sets new highs on security benchmarks. OpenAI also launched an open-source patching initiative and a partner program with more than 25 security firms.

Codex Security update closes the loop from discovery to patch

The Codex Security plugin shipped as a research preview back in March. Since then, it's scanned over 30 million commits across more than 30,000 codebases, OpenAI says. Over 500,000 findings were automatically flagged as fixed, and human reviewers manually confirmed another 70,000.

OpenAI wants the updated plugin to act like a security engineer sitting next to every developer. It analyzes code alongside a threat model, spots flaws, checks whether affected code is actually reachable, builds a targeted patch, and verifies the result.
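The reachability step in that pipeline can be illustrated with a small sketch. Nothing here reflects how Codex Security actually works. It assumes a hypothetical call graph stored as a dictionary and asks, via breadth-first search, whether a flawed function can be reached from an entry point. All function names and the graph itself are invented for illustration.

```python
from collections import deque


def is_reachable(call_graph, entry_points, target):
    """Return True if `target` can be reached from any entry point.

    `call_graph` maps a function name to the functions it calls.
    This representation is an assumption, not a Codex Security format.
    """
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        if current == target:
            return True
        for callee in call_graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


# Hypothetical call graph: `parse_header` is called from an entry point,
# while `legacy_decode` is never called by anything.
CALL_GRAPH = {
    "handle_request": ["authenticate", "parse_header"],
    "authenticate": ["check_token"],
    "parse_header": [],
    "legacy_decode": ["parse_header"],
}

print(is_reachable(CALL_GRAPH, ["handle_request"], "parse_header"))   # True
print(is_reachable(CALL_GRAPH, ["handle_request"], "legacy_decode"))  # False
```

In this toy model, a flaw in `legacy_decode` would rank below one in `parse_header`, because no entry point ever calls it. Real reachability analysis also has to handle dynamic dispatch, reflection, and data-dependent paths, which this sketch ignores.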

New in this update are deep scans of entire codebases, attack path analysis, and export to existing vulnerability management systems through SARIF files or CodeQL queries. The plugin can also triage findings from other scanners or bug bounty reports and automate patch generation in batch mode. Humans still sign off on every change, OpenAI says.
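For readers unfamiliar with SARIF, the sketch below shows roughly what a single finding looks like in that format. It follows the general shape of a SARIF 2.1.0 log (a `runs` array containing a `tool` and its `results`), but it is not output from Codex Security. The tool name, rule ID, file path, and message are hypothetical placeholders.

```python
import json


def finding_to_sarif(tool_name, rule_id, message, file_uri, start_line, level="warning"):
    """Wrap a single finding in a minimal SARIF 2.1.0 log (as a dict)."""
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name, "rules": [{"id": rule_id}]}},
                "results": [
                    {
                        "ruleId": rule_id,
                        "level": level,  # SARIF levels: none, note, warning, error
                        "message": {"text": message},
                        "locations": [
                            {
                                "physicalLocation": {
                                    "artifactLocation": {"uri": file_uri},
                                    "region": {"startLine": start_line},
                                }
                            }
                        ],
                    }
                ],
            }
        ],
    }


# All values below are hypothetical placeholders.
sarif_log = finding_to_sarif(
    tool_name="example-scanner",
    rule_id="EX001",
    message="Unchecked length passed to buffer copy.",
    file_uri="src/parser.c",
    start_line=42,
    level="error",
)

print(json.dumps(sarif_log, indent=2))
```

Because SARIF is a shared interchange format, a file like this can be loaded by any vulnerability management tool that accepts SARIF, regardless of which scanner produced it.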

GPT-5.5-Cyber stays locked to vetted defenders

The full version of GPT-5.5-Cyber replaces an earlier preview that mostly aimed to cut unnecessary refusals in security workflows. OpenAI calls the updated model the most capable single model for finding and patching software flaws.

GPT-5.5-Cyber leads on all key cybersecurity benchmarks, according to OpenAI. CyberGym measures whether an agent can reproduce known flaws in software environments. ExploitGym tests whether agents can turn vulnerabilities into working exploits. SEC-bench Pro evaluates long-term vulnerability discovery.

Model	CyberGym	ExploitGym	SEC-bench Pro
GPT-5.5-Cyber	85.6%	39.5%	69.8%
Mythos 5	83.8%	–	–
GPT-5.5	81.8%	25.95%	63.1%
GPT-5.4	79.0%	–	–
Claude Opus 4	73.1%	–	–

The latest version of GPT-5.5-Cyber is deliberately more permissive than standard models and refuses fewer requests, OpenAI says. But access is limited to verified defenders and comes with monitoring and guardrails. Most users should stick with GPT-5.5 paired with Trusted Access for Cyber and Codex Security, OpenAI says.

Over 25 security firms and several governments join the program

Through the Daybreak Cyber Partner Program, security companies can plug GPT-5.5 with Trusted Access for Cyber into their own products. Partners include Cisco, CrowdStrike, Cloudflare, Palo Alto Networks, IBM, Fortinet, Wiz, SentinelOne, Darktrace, Palantir, Accenture, PwC, and KPMG.

OpenAI is also expanding its government work. The company says it has Trusted Access partnerships with Australia, Canada, France, Germany, Japan, South Korea, the EU agency ENISA, and the UK. In the US, OpenAI is working to carry out a recently issued executive order on AI security and plans to collaborate directly with critical infrastructure operators.

OpenAI also launched Patch the Planet together with Trail of Bits, HackerOne, and Calif to bring the same patching tools to open-source software. More than 30 open-source projects have signed on, including cURL, Go, Python, Sigstore, and pyca/cryptography. Security researchers work with maintainers to validate and deduplicate flaws and patches before anything gets merged. A first five-day sprint turned up hundreds of issues and led to dozens of merged patches, OpenAI says.

— Originally published at the-decoder.com
