GPT-5.5-Cyber is OpenAI’s most capable security model yet, and on June 22, 2026 the company released it in full as the centrepiece of an expanded Daybreak initiative — pairing a vendor-stated 85.6% on the CyberGym benchmark with Patch the Planet, a program that ships AI-discovered fixes to more than 30 open-source projects under Trail of Bits review.

The headline number is real but heavily caveated: OpenAI scored its own model, and the most capable tier is not available to the public. The more durable story sits underneath the benchmarks. OpenAI argues the security bottleneck has inverted — for years the hard part was finding vulnerabilities; now defenders are buried in findings and the hard part is patching them fast enough. Daybreak is built around that inversion.

This guide separates what actually shipped from what is marketing, recomputes the benchmark deltas, maps the three-tier access model so you can see where your organisation sits, and translates the SMB implication that most coverage skips: for small and mid-sized businesses, this capability arrives through your existing security vendor, not a direct OpenAI key.

Key takeaways

01
Daybreak expanded on June 22, 2026. OpenAI shipped four additions: the full release of GPT-5.5-Cyber (beyond its earlier permissive-only preview), an updated Codex Security plugin, the Daybreak Cyber Partner Program, and Patch the Planet. The initiative first launched May 12, 2026.
02
The bottleneck moved from finding to patching. OpenAI’s framing is that AI now surfaces vulnerabilities faster than teams can remediate them, so the constraint is no longer discovery; it is landing the fix. That reframing, not the raw benchmark, is the argument worth taking seriously.
03
Every benchmark figure here is OpenAI’s own. The 85.6% CyberGym, 39.5% ExploitGym, and 69.8% SEC-bench Pro scores are self-reported and not independently audited. CyberGym is a UC Berkeley benchmark, but OpenAI ran the evaluation. Treat all figures as vendor-stated.
04
The most capable models are gated, not public. GPT-5.5-Cyber sits behind Trusted Access for Cyber, and OpenAI’s comparison places Anthropic’s Mythos 5 (also restricted) close behind. The open frontier defenders can actually reach is base GPT-5.5 at a vendor-stated 81.8%.
05
SMBs reach Daybreak through their security vendor. Direct model access stays with the 28 named partners, including CrowdStrike, Sophos, SentinelOne, and Cloudflare. Smaller businesses encounter these capabilities embedded in the products they already run, not via a direct OpenAI API.

01 — What Shipped

A May launch, a June expansion.

Daybreak first launched on May 12, 2026 as an initiative combining GPT-5.5, Codex Security, and Trusted Access for Cyber (TAC) to help organisations find and patch vulnerabilities before attackers exploit them. By May, OpenAI said hundreds of organisations and thousands of individual defenders were already enrolled in the TAC program.

June 22, 2026 was the substantive expansion. OpenAI shipped four things at once: the full release of GPT-5.5-Cyber — moving beyond the initial “permissive-only preview” — an updated Codex Security plugin, the Daybreak Cyber Partner Program with 28 named corporate partners, and Patch the Planet, an open-source patching effort run with Trail of Bits.

Model: GPT-5.5-Cyber (full release · restricted access · Trusted Access for Cyber)

OpenAI's most capable security model, now released in full beyond the earlier permissive-only preview. Distributed through continued limited release to trusted defenders, not the general public.

Tooling: Codex Security plugin (updated · GPT-5.5 + TAC · human-in-control)

Understands a team's code and threat model, flags plausible vulnerabilities, checks reachability, develops a targeted patch, and verifies the result, with humans deciding what to investigate and apply.

Ecosystem: Cyber Partner Program (28 named partners · indirect SMB path)

Accenture, Akamai, Cisco, Cloudflare, CrowdStrike, Darktrace, IBM, Okta, Palo Alto Networks, SentinelOne, Sophos, Wiz, Zscaler and more embed GPT-5.5 with Trusted Access for Cyber inside their own security products.

Open source: Patch the Planet (with Trail of Bits · 30+ projects committed)

Co-founded with Trail of Bits, alongside HackerOne and Calif, to move open-source maintainers from findings to fixes. Expert human review precedes every finding that reaches a maintainer.

Keep the timeline straight

Three dates anchor this story. Codex Security went to research preview in March 2026. The Daybreak initiative launched on May 12, 2026. The GPT-5.5-Cyber full release, partner program, and Patch the Planet are the June 22, 2026 expansion. They are distinct milestones, not a single launch.

02 — The Inversion

The bottleneck moved from finding to patching.

The single most useful idea in the whole announcement is not a number. It is a reframing of where security work actually gets stuck. For most of the last decade, the scarce skill was finding vulnerabilities — fuzzing harnesses, manual review, bug bounties. AI has changed the slope of that curve sharply enough that, in OpenAI’s telling, defenders are now drowning in findings and the real constraint has shifted downstream to remediation.

OpenAI’s reframing

“The bottleneck historically has been finding vulnerabilities, but now defenders are overwhelmed with the number of vulnerabilities found. Instead, the bottleneck is now patching vulnerabilities.” — OpenAI’s Daybreak announcement, June 22, 2026.

That claim is plausible and partly self-serving — a company selling patching tooling has every reason to declare patching the new frontier. But the supporting scale numbers are genuinely large. Since the Codex Security research preview in March 2026, OpenAI says the plugin has scanned more than 30 million commits across over 30,000 codebases. Human reviewers manually marked more than 70,000 findings as fixed, and a further 500,000-plus findings were automatically determined to be fixed. Whatever discount you apply for vendor framing, that is a volume of findings no human triage queue absorbs without help.

The forward-looking question this raises is uncomfortable for defenders. If AI keeps compounding the discovery rate while patch velocity stays human-paced, the gap between “known vulnerable” and “actually fixed” widens rather than closes — and that window is exactly where attackers operate. The bet embedded in Daybreak is that the same models can be pointed at the fix side fast enough to keep that window from blowing open. It is a bet, not a settled result.
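The widening gap described above can be illustrated with a toy model. The sketch below assumes a discovery rate that compounds weekly against a flat patch capacity. The function name `backlog_over_time` and every number in it are hypothetical, chosen only to show the shape of the curve, and are not drawn from OpenAI's data.

```python
def backlog_over_time(discovered_per_week: float, growth: float,
                      patched_per_week: float, weeks: int) -> list:
    """Return the open-finding backlog at the end of each week.

    The discovery rate grows by `growth` (0.10 means 10%) every week,
    while patch capacity stays flat. All inputs are hypothetical.
    """
    backlog = 0.0
    rate = discovered_per_week
    history = []
    for _ in range(weeks):
        # New findings arrive, a fixed number get patched; backlog never goes negative.
        backlog = max(0.0, backlog + rate - patched_per_week)
        history.append(backlog)
        rate *= 1.0 + growth
    return history


# Illustrative inputs only: 100 findings/week, 100 patches/week.
flat = backlog_over_time(100, 0.0, 100, 4)       # discovery stays level
growing = backlog_over_time(100, 0.10, 100, 4)   # discovery compounds 10% weekly
print(flat)
print([round(b, 1) for b in growing])
```

With these made-up inputs the flat case never accumulates a backlog, while the compounding case grows every week and reaches about 64 open findings after four weeks. The point is the shape, not the numbers: any sustained growth in discovery against human-paced patching produces a backlog that accelerates.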

"AI is already good and about to get super good at cybersecurity."— Sam Altman, CEO, OpenAI

03 — Benchmarks

The numbers, and why every one is vendor-stated.

OpenAI reports three benchmark results for GPT-5.5-Cyber, all measured against its own base model. On CyberGym — a UC Berkeley benchmark that tests whether an agent can reproduce 1,507 known software vulnerabilities from 188 open-source projects — the cyber variant posts a vendor-stated 85.6% against 81.8% for base GPT-5.5. On ExploitGym, which tests turning known vulnerabilities into working exploits, it reaches 39.5% versus 25.95%. On SEC-bench Pro, covering long-horizon vulnerability discovery and proof-of-concept generation, it scores 69.8% versus 63.1%.

CyberGym score · single-model, vendor-stated
Model	CyberGym	Access
GPT-5.5-Cyber	85.6% (OpenAI-stated SOTA)	Restricted: Trusted Access for Cyber
Claude Mythos 5	83.8%	Restricted, per OpenAI's comparison
GPT-5.5 (baseline)	81.8%	Public: standard OpenAI API
Claude Opus 4.7	73.1%	Public, per OpenAI's comparison

Source: OpenAI (vendor-stated); Mythos 5 / Opus 4.7 figures per OpenAI's comparison, not independently audited

The table below isolates the like-for-like comparison (cyber variant versus base, both scored by OpenAI on the same benchmarks) and recomputes the uplift in percentage points so the gains are not inflated by relative-percentage framing.

GPT-5.5-Cyber versus base GPT-5.5 across CyberGym, ExploitGym, and SEC-bench Pro, with the absolute uplift in percentage points and what each benchmark measures. All figures OpenAI vendor-stated.
Benchmark	GPT-5.5 (base)	GPT-5.5-Cyber	Uplift (pp)	What it measures
CyberGym	81.8%	85.6%	+3.8	Reproducing known vulnerabilities
ExploitGym	25.95%	39.5%	+13.55	Turning vulnerabilities into working exploits
SEC-bench Pro	63.1%	69.8%	+6.7	Long-horizon discovery and PoC generation

Read the ExploitGym number honestly

The ExploitGym jump from 25.95% to 39.5% is the largest gain in the table: 13.55 percentage points, or roughly 52% above base GPT-5.5 in relative terms. But a 39.5% score still means the model fails about 60% of exploit-generation tasks. This is meaningful assistance, not near-autonomous exploitation. OpenAI itself notes that the benchmark figures came from its own testing, and says that testing is continuing on real-world fixes.
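The uplift column and the relative-gain figure can be rechecked with a few lines of arithmetic. This sketch uses only the vendor-stated scores quoted in this section; the `uplift` helper is an illustrative name, not part of any benchmark tooling.

```python
# Vendor-stated scores in percent: (base GPT-5.5, GPT-5.5-Cyber).
SCORES = {
    "CyberGym": (81.8, 85.6),
    "ExploitGym": (25.95, 39.5),
    "SEC-bench Pro": (63.1, 69.8),
}


def uplift(base: float, cyber: float) -> tuple:
    """Return (absolute uplift in percentage points, relative gain in percent)."""
    pp = round(cyber - base, 2)
    relative = round((cyber - base) / base * 100, 1)
    return pp, relative


for name, (base, cyber) in SCORES.items():
    pp, rel = uplift(base, cyber)
    failure = round(100 - cyber, 1)  # share of tasks the cyber variant does not solve
    print(f"{name}: +{pp} pp, +{rel}% relative, {failure}% of tasks unsolved")
```

Reporting both figures side by side shows why the framing matters: ExploitGym's 13.55-point gain is a 52.2% relative improvement, while the model still leaves 60.5% of those tasks unsolved.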

04 — The Field

Beating Mythos 5 — but both models stay gated.

OpenAI is making a public comparison: per its own CyberGym numbers, GPT-5.5-Cyber’s 85.6% edges Anthropic’s Claude Mythos 5 at 83.8%, while base GPT-5.5 (81.8%) and Claude Opus 4.7 (73.1%) sit below. Two cautions matter here. First, these cross-vendor figures are OpenAI-stated, not independently audited — the Mythos 5 and Opus 4.7 scores come from OpenAI’s comparison, not from Anthropic. Second, Mythos is not generally available; it is restricted to a small number of organisations under Anthropic’s rival Project Glasswing.

Multi-benchmark comparison of GPT-5.5-Cyber, Claude Mythos 5, GPT-5.5 baseline, and Claude Opus 4.7 across CyberGym, ExploitGym, and SEC-bench Pro, with public-access status and intended access path. All figures OpenAI vendor-stated; cross-vendor figures per OpenAI’s comparison.
Model	CyberGym	ExploitGym	SEC-bench Pro	Public access	Access path
GPT-5.5-Cyber	85.6%*	39.5%*	69.8%*	No	Trusted Access for Cyber — verified defenders only
Claude Mythos 5	83.8%*	N/A	N/A	No	Project Glasswing — small set of cyber orgs
GPT-5.5 (baseline)	81.8%*	25.95%*	63.1%*	Yes (API)	Standard OpenAI access
Claude Opus 4.7	73.1%*	N/A	N/A	Yes (API)	Standard Anthropic access

* All figures OpenAI-stated; Mythos 5 and Opus 4.7 scores come from OpenAI’s comparison, not independently audited. ExploitGym and SEC-bench Pro results for the Anthropic models have not been published, so those cells are left as N/A rather than estimated.

This produces a genuine paradox. The two most capable security models on the board are both gated — GPT-5.5-Cyber behind Trusted Access for Cyber, Mythos behind Project Glasswing — converging on the same restricted-access philosophy. The frontier defenders and attackers can actually reach today is base GPT-5.5 at 81.8% and Opus 4.7 at 73.1%. Not everyone is convinced the gated tier changes the underlying picture. As SpecterOps CTO Jared Atkinson put it, AI will accelerate offensive security operations, but it does not fundamentally change the underlying problems defenders face. The capability is moving fast; the structural problems of patching, ownership, and coordination are not.

05 — Access Tiers

Three tiers of access — who gets what.

OpenAI describes the access model across several pages, but never as a single map. There are three tiers, and the distinction that trips people up is that GPT-5.5-Cyber is not the same thing as GPT-5.5 with Trusted Access for Cyber. The cyber variant is the most permissive, most tightly gated tier; TAC is the middle tier for standard enterprise defensive work.

The three Daybreak access tiers — default GPT-5.5, GPT-5.5 with Trusted Access for Cyber, and GPT-5.5-Cyber — with the gate, who it is for, and how SMBs reach each.
Tier	Gate	Who it’s for	SMB access path
GPT-5.5 (default)	None — standard OpenAI account	All developers — secure coding, review, triage, patch validation	Direct, via the Codex Security plugin
GPT-5.5 + Trusted Access	Application + identity verification; phishing-resistant auth required from June 1, 2026	Cyber teams, security vendors, integrators, DevSecOps	Indirect — through partner vendor products
GPT-5.5-Cyber	Stricter verification, scoping, logging, ongoing review	Authorized red teams and penetration testers	Not typically available — enterprise/government path only
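One way to read the table is as a lookup. The sketch below encodes the three rows as data and adds a hypothetical `highest_tier` helper. Its two boolean flags are a simplification made for illustration and do not represent OpenAI's actual eligibility checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    gate: str
    audience: str
    smb_path: str


# Rows condensed from the access-tier table above.
TIERS = [
    Tier("GPT-5.5 (default)",
         "None (standard OpenAI account)",
         "All developers",
         "Direct, via the Codex Security plugin"),
    Tier("GPT-5.5 + Trusted Access",
         "Application + identity verification",
         "Cyber teams, security vendors, integrators, DevSecOps",
         "Indirect, through partner vendor products"),
    Tier("GPT-5.5-Cyber",
         "Stricter verification, scoping, logging, ongoing review",
         "Authorized red teams and penetration testers",
         "Not typically available"),
]


def highest_tier(tac_approved: bool, authorized_offensive_scope: bool) -> Tier:
    """Hypothetical helper: the most permissive tier a team could plausibly hold.

    Assumes (as a simplification) that the cyber variant requires both
    Trusted Access approval and an authorized offensive-testing scope.
    """
    if tac_approved and authorized_offensive_scope:
        return TIERS[2]
    if tac_approved:
        return TIERS[1]
    return TIERS[0]


tier = highest_tier(tac_approved=True, authorized_offensive_scope=False)
print(tier.name, "|", tier.smb_path)
```

Modelling the tiers as cumulative gates makes the common confusion explicit: Trusted Access approval alone lands a team on the middle tier, not on GPT-5.5-Cyber.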

A gate with a deadline

Trusted Access for Cyber members must enable Advanced Account Security from June 1, 2026, or attest to phishing-resistant single sign-on. OpenAI’s stated guidance: for most defenders, GPT-5.5 with Trusted Access for Cyber and Codex Security remains the right starting point — the cyber variant is not the default recommendation.

OpenAI’s framing of why it gates at all is worth quoting in its own words: the company says it does not think it is practical or appropriate to centrally decide who gets to defend themselves. The tiering is the attempt to square broad defensive access with limiting the most offense-capable behaviour to verified, scoped, logged users.

06 — Patch the Planet

From findings to fixes, with humans in front.

Patch the Planet is the part of the announcement with the most concrete, checkable detail, and one design choice sets it apart. Co-founded with Trail of Bits, in collaboration with HackerOne and Calif, the program helps open-source maintainers move from findings to fixes. More than 30 projects have committed, including cURL, Go, Python, Sigstore, NATS Server, aiohttp, and python.org. The design principle is deliberate: expert human review precedes every finding that reaches a maintainer. Trail of Bits engineers manually deduplicate findings, correct severity ratings, and filter false positives before anything is submitted, the opposite of a raw AI bug-dump that floods maintainers faster than they can respond.

That matters because the people on the receiving end are stretched thin. OpenAI cites Harvard and Linux Foundation research finding that 94% of widely used open-source projects studied had fewer than ten developers responsible for more than 90% of the code added in a year. An AI that surfaces hundreds of issues into a one-maintainer project is a denial-of-service on attention unless something filters first.
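The review-before-submission principle can be sketched as a filter. This is not Trail of Bits' tooling: the `Finding` fields and the `ready_for_maintainer` function are hypothetical, and the human judgments (reviewed severity, false-positive verdict) are modelled as fields a reviewer has already filled in.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Finding:
    fingerprint: str                   # hypothetical dedup key, e.g. file + bug class
    model_severity: str                # severity proposed by the model
    reviewed_severity: Optional[str]   # set by a human reviewer; None means unreviewed
    false_positive: bool = False       # the reviewer's verdict


def ready_for_maintainer(findings: List[Finding]) -> List[Finding]:
    """Keep only reviewed, non-duplicate findings that are not false positives."""
    seen = set()
    kept = []
    for f in findings:
        if f.reviewed_severity is None or f.false_positive:
            continue  # nothing reaches a maintainer without passing human review
        if f.fingerprint in seen:
            continue  # duplicate of a finding already queued
        seen.add(f.fingerprint)
        kept.append(f)
    return kept


# Made-up findings for illustration.
raw = [
    Finding("parser.c:overflow", "critical", "high"),
    Finding("parser.c:overflow", "critical", "high"),       # duplicate
    Finding("auth.py:injection", "high", None),              # not yet reviewed
    Finding("util.go:nil-deref", "medium", "low", True),     # false positive
]
queue = ready_for_maintainer(raw)
print([(f.fingerprint, f.reviewed_severity) for f in queue])
```

Four raw findings collapse to one submission, carrying the reviewer's corrected severity rather than the model's. For a project with a handful of maintainers, that reduction is the difference between a usable report and a flood.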

Linux kernel: 24 LPE exploits + 8 leak PoCs (vendor-stated)

GPT-5.5-Cyber analyzed 30M+ lines of kernel code, generating 24 local privilege-escalation exploits and 8 pointer information-leak proof-of-concepts from hundreds of flagged potential issues.

FreeBSD: 34 vulnerabilities confirmed (7 LPE PoCs)

OpenAI researchers confirmed 34 FreeBSD vulnerabilities and produced 7 local privilege-escalation PoCs, with CVE disclosures documented on freebsd.org.

Chrome V8: 5 exploitable bugs reported (3 fixed in days)

Five exploitable vulnerabilities found in Chrome's V8 JavaScript engine; three were identified and remediated within days of being introduced into the codebase.

Safari WebKit: 10+ bugs in roughly a week (vendor-stated)

More than ten exploitable WebKit vulnerabilities found and reported during roughly one week of focused work, a pace that is the whole point of the patching-bottleneck argument.

The single most vivid example does not require parsing a benchmark. During its own safety evaluations, GPT-5.5 — the base model, not even the cyber variant — identified a WebAssembly vulnerability in Firefox, recorded as CVE-2026-8390. Mozilla patched it two days before Pwn2Own Berlin. Five of the six registered Firefox competition entries withdrew, and no Firefox exploit was successfully demonstrated at the event. A model found a real, exploitable browser bug during routine testing and quietly took an entire competition track off the board.

Other findings round out the picture. AI models identified a 23-year-old use-after-free in OpenBSD’s kernel implementation of System V semaphores, confirmed to let an unprivileged local user escalate to root. Calif used Codex to discover an HTTP/2 denial-of-service technique affecting major server software including NGINX, Apache, IIS, and Pingora; its analysis estimated more than 880,000 internet-facing websites were running affected software with HTTP/2 enabled. And Codex Security independently identified vulnerable patterns corresponding to four of the six dnsmasq CVEs fixed in release 2.92rel2. Trail of Bits also reported building a complete fuzzing lab in less than a day using GPT-5.5-Cyber — work it estimates would ordinarily take at least several weeks by hand.

Assistive, not autonomous

OpenAI is explicit that this is validated remediation with a human in control: the system identifies plausible vulnerabilities, checks reachability, gathers evidence, develops a targeted patch, and verifies it — but humans remain in control of which findings to investigate, which changes to apply, and what information to share. Patch the Planet is an open-source initiative run with Trail of Bits, not a self-serve enterprise patching service.
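The human-in-control rule described above can be expressed as a simple gate. This is a minimal sketch under assumptions: the `Stage` names paraphrase the steps listed in the text, and `may_apply_patch` is a hypothetical function, not an interface of the Codex Security plugin.

```python
from enum import Enum


class Stage(Enum):
    FLAGGED = "flagged as plausible"
    REACHABLE = "reachability checked"
    EVIDENCE = "evidence gathered"
    PATCH_DRAFTED = "targeted patch developed"
    PATCH_VERIFIED = "patch verified"


def may_apply_patch(stage: Stage, human_approved: bool) -> bool:
    """A change lands only if the patch is verified AND a human has signed off."""
    return stage is Stage.PATCH_VERIFIED and human_approved


print(may_apply_patch(Stage.PATCH_VERIFIED, human_approved=False))
print(may_apply_patch(Stage.PATCH_VERIFIED, human_approved=True))
```

Neither condition is sufficient alone: a verified patch without approval stays unapplied, and approval cannot skip verification.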

07 — Implications

What it means for SMBs and agencies.

The practical takeaway most coverage skips: for small and mid-sized businesses, Daybreak is not something you buy directly. The Cyber Partner Program is a deliberate architectural choice: OpenAI routes capabilities through 28 partners who embed them inside their own products, which keeps direct model access in the hands of those partners. If you run CrowdStrike, Sophos, SentinelOne, Cloudflare, or one of the other named vendors, you will encounter this AI as a feature of tools you already pay for, not as an API key you provision. That is the same build-vs-buy decision for AI-assisted workflows playing out in security: the realistic path for most teams is to buy, through the stack they already operate.

Most developers: Start with the default tier (use GPT-5.5 default)

OpenAI's guidance for most defenders is GPT-5.5 with Trusted Access for Cyber and Codex Security, but developers who have not applied can begin on the default tier. It needs no special gate: secure coding, review, and patch validation are available now.

SMBs: Access arrives through your security vendor (buy through your stack)

Direct model access stays with the 28 partners. If you use CrowdStrike, Sophos, or SentinelOne, the capability reaches you embedded in those products, not via a direct OpenAI key. Ask your vendor what they have integrated.

Security teams: Apply for Trusted Access for Cyber (apply for TAC)

Advanced defensive work (triage, malware analysis, detection engineering, incident response) needs the middle tier, which requires application, identity verification, and phishing-resistant auth from June 1, 2026.

Red teams: GPT-5.5-Cyber is gated tightest (enterprise/gov only)

The cyber variant is reserved for authorized red teams and penetration testers under stricter verification, scoping, and logging. There is no self-serve path; this is an enterprise and government channel.

For agencies and engineering teams, the strategic read is that defensive tooling is becoming a model-routing question rather than a single-vendor choice. The same discipline we bring to agentic security risks applies here: decide which workloads justify gated access, which are well served by base GPT-5.5, and which belong inside a partner product you already trust. Pair that with hands-on hygiene — the kind of account security audit practices that close the gaps no model patches for you. If you are weighing where AI-assisted security fits in your own builds, our secure web development engagements and AI digital transformation programs start with exactly this kind of routing and governance decision. The named partners — including Accenture’s AI security partnerships — show how the integration layer is already forming.

08 — Conclusion

The capability is here; the distribution is the question.

The shape of defensive AI, June 2026

Finding bugs got cheap. Landing the fix is the new frontier.

GPT-5.5-Cyber and the expanded Daybreak initiative are a real step in AI-assisted security — but the durable insight is the reframing, not the leaderboard. When models surface vulnerabilities faster than teams can remediate them, the constraint moves to patching, and Patch the Planet is OpenAI’s attempt to put a human-reviewed pipeline around that shift.

Hold the benchmarks at arm’s length. The 85.6% CyberGym, 39.5% ExploitGym, and 69.8% SEC-bench Pro figures are OpenAI’s own, unaudited, and the cross-vendor comparisons against Mythos 5 and Opus 4.7 come from OpenAI rather than Anthropic. Even taken at face value, a 39.5% exploit-generation score means the model fails most of those tasks. This is a force multiplier for defenders, in Cisco’s framing — not an autonomous patching machine.

The forward signal is about distribution, not capability. The two strongest security models on the board are both gated, the open frontier defenders can reach is base GPT-5.5, and for most businesses the capability arrives indirectly through a security vendor. The winning move is not chasing the gated tier — it is deciding, workload by workload, where AI-assisted security belongs in a stack you already run, and making sure the patch actually ships.

GPT-5.5-Cyber & Daybreak: AI That Now Patches Code
