The Codex CLI v1 to v2 migration is the largest schema change the OpenAI Codex CLI has shipped since launch — four breaking-change axes that touch every team's configuration, authentication, profile setup, and sandbox flags simultaneously. Done well, the cut-over is a three- to five-day project; done ad-hoc, it eats a sprint and produces a config sprawl that haunts the team for quarters.
The good news is that the v2 design isn't arbitrary — every breaking change is in service of a coherent goal. Per-profile settings unlock genuine separation between developer laptops, CI workers, and production agents; the auth surface gains dedicated headless patterns for unattended pipelines; sandbox flag renames bring naming in line with how teams actually deploy. The migration is mostly mechanical once you understand the shape.
This playbook walks through every change a team needs to plan, sequence, and verify: what ships in v2, the config.toml schema migration, the auth surface and its three modes, the profile API revamp, the sandbox flag renames, a phased rollout pattern that minimises blast radius, and the four common upgrade failures with their diagnostic signals. The closing FAQ covers the questions teams ask before bumping the CLI in production.
- 01 · config.toml schema migration is the bulk of the work. Most teams need a one-time sweep across every checked-in config file. Deprecated keys still parse with warnings during the rollback window; new section headers must be added explicitly.
- 02 · Headless agent auth now has dedicated patterns. v2 separates interactive OAuth, long-lived service tokens, and short-lived CI-issued credentials into three distinct auth modes. Adopt the CI mode for unattended pipelines; it avoids the credential-rotation footgun.
- 03 · Per-profile settings unlock dev / CI / prod separation. The profile API revamp lets one config file describe many environments. Use named profiles deliberately rather than overloading environment variables: there are fewer footguns and the activation is explicit.
- 04 · Sandbox flag renames are mechanical, and codemods cover them. Most renamed flags have direct equivalents. Run the official codemod across your repos before the cut-over date; the remaining manual cases will be visible in the codemod's report.
- 05 · The 72-hour rollback window is the right safety net. v2 ships a documented rollback path that accepts v1 configs during the first 72 hours after upgrade. Use the window deliberately rather than treating the cut-over as one-way; it's the cheapest insurance you have.
01 · What's New
v2 ships in four axes: config, auth, profiles, sandbox.
The v2 release is unusual because it bundles four breaking changes into one cut-over rather than spreading them across point releases. The reasoning is sound: every axis depends on the others (the new profile API needs the new config schema, the new sandbox flags assume per-profile activation, the new auth modes assume profile-scoped credentials), so shipping them sequentially would force every team through several painful migrations instead of one.
For most teams the breaking changes are mechanical: rename a section header, swap a flag, point an environment variable at a new file location. The non-mechanical parts are the auth surface (where CI workflows need their own credential pattern) and the profile design (where teams need to think about which environments deserve their own profile versus sharing one). Plan a half-day of human thinking on top of whatever the codemod produces.
config.toml schema
new sections · deprecated keys
Top-level [model], [sandbox], and [profiles.*] sections replace the v1 flat layout. Deprecated keys continue to parse with a warning during the 72-hour rollback window. Most teams need a one-time sweep across every checked-in config file.
Where most teams spend their time

Auth surface
OAuth · long-lived token · headless CI
v2 splits authentication into three explicit modes rather than relying on a single token field: interactive OAuth for developer laptops, long-lived service tokens for trusted agents, and short-lived CI-issued credentials for unattended pipelines.
Three modes, one config

Profile API
per-profile model · sandbox · approval
Named profiles in [profiles.dev], [profiles.ci], [profiles.prod] each carry their own model, sandbox configuration, and approval policy. Activation is explicit (CODEX_PROFILE env var or --profile flag) rather than inferred.
Deliberate environment separation

Sandbox flags
renamed · new defaults
Sandbox flag names align with the new profile model: old standalone flags either move under [sandbox] or are renamed for clarity. Defaults also shifted: write access is opt-in rather than opt-out, and network access is profile-scoped.
Mechanical with codemod coverage

Three of the four axes have codemods or documented migration steps. The fourth, the profile API, is the one that benefits from human thinking, because the right profile layout for your team isn't something a tool can infer. Most teams land on three profiles (dev, ci, prod) and stop there; a minority with multiple production surfaces add a fourth or fifth deliberately.
02 · config.toml
Schema migration: new sections, deprecated keys.
The config schema migration is where most teams spend their time, because every checked-in config.toml file needs touching and many repos have several (developer defaults, CI overrides, per-package configs in monorepos). The v2 schema is structurally cleaner — flat key-value pairs from v1 move into named sections — but the migration itself is a mechanical sweep.
The simplest way to think about it: v1 had one namespace and relied on key prefixes to organise keys (sandbox_, model_, approval_ and so on). v2 has explicit sections ([sandbox], [model], [approval]) and per-profile sub-sections ([profiles.dev.sandbox] and so on). The information is the same; the layout is more legible and easier to override.
```toml
# v1 config.toml: flat layout
model = "gpt-5.5-codex"
sandbox_mode = "workspace-write"
sandbox_network = false
approval_policy = "untrusted"
auth_token = "sk-codex-..."
```

```toml
# v2 config.toml: sectioned layout
[model]
name = "gpt-5.5-codex"

[sandbox]
mode = "workspace-write"
network = false

[approval]
policy = "untrusted"

[auth]
mode = "long-lived-token"
token_env = "CODEX_AUTH_TOKEN"
```
The token field stops being a literal string by default — v2 prefers an environment-variable reference (token_env) so configs can be safely committed without leaking credentials. Inline tokens still work for back-compat but emit a deprecation warning; teams should remove them before the 72-hour rollback window closes to avoid having to revisit the file later.
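To make the flat-to-sectioned mapping concrete, here is a minimal Python sketch that reshapes an already-parsed v1 config into the v2 layout. It is not the official codemod: the function name `migrate_v1_config`, the `V1_TO_V2` table, and the choice to park unrecognised keys under `extra` are assumptions for illustration, and only the key names shown in the two example configs above are covered.

```python
# Sketch only: this is NOT the official `codex migrate` codemod.
# The key mapping mirrors the v1/v2 example configs; the rest is assumed.

# v1 flat key -> (v2 section, v2 key)
V1_TO_V2 = {
    "model": ("model", "name"),
    "sandbox_mode": ("sandbox", "mode"),
    "sandbox_network": ("sandbox", "network"),
    "approval_policy": ("approval", "policy"),
}


def migrate_v1_config(v1: dict, token_env: str = "CODEX_AUTH_TOKEN") -> dict:
    """Return a v2-shaped dict for an already-parsed v1 config."""
    v2: dict = {}
    for key, value in v1.items():
        if key in V1_TO_V2:
            section, new_key = V1_TO_V2[key]
            v2.setdefault(section, {})[new_key] = value
        elif key == "auth_token":
            # Never copy the literal token; reference an environment variable.
            v2["auth"] = {"mode": "long-lived-token", "token_env": token_env}
        else:
            # Assumption: unrecognised custom keys are parked under [extra].
            v2.setdefault("extra", {})[key] = value
    return v2


v1_config = {
    "model": "gpt-5.5-codex",
    "sandbox_mode": "workspace-write",
    "sandbox_network": False,
    "approval_policy": "untrusted",
    "auth_token": "sk-codex-...",
}
v2_config = migrate_v1_config(v1_config)
```

The inline token is deliberately dropped rather than carried over, which matches the advice to reference credentials through `token_env` so the file can be committed safely.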
model → [model] section
Old top-level model = '...' moves into [model] name = '...'. Optional [model] sub-keys (temperature, max_output_tokens) live in the same section. The codemod handles this rename automatically.
Run codemod

sandbox_* keys → [sandbox] block
Every sandbox_-prefixed key collapses into a single [sandbox] section. New defaults shipped at the same time: write access is opt-in rather than the v1 opt-out default. Read section 05 before adopting.
Run codemod + audit

auth_token → [auth] mode + token_env
Inline tokens still parse with a deprecation warning. The right replacement is the [auth] section with mode = 'long-lived-token' and token_env pointing at an environment variable. Drop inline tokens before the rollback window closes.
Manual review

approval_policy → [approval] section
Direct rename. Per-profile overrides become possible: [profiles.dev.approval] can relax to 'never' for trusted developer laptops while [profiles.ci.approval] stays 'untrusted'. Codemod handles the top-level rename.
Run codemod

The codemod ships as part of the v2 CLI: codex migrate scans every config file under a given root, prints a diff, and (with --write) applies the changes. Run it twice: once with --dry-run to review the diff and capture any manual cases the codemod flags, then again with --write after spot-checking the output. Commit the codemod's changes in their own PR so reviewers can audit the mechanical sweep separately from any human edits.
Three classes of cases the codemod can't handle on its own: configs assembled at runtime via shell scripts (the codemod only touches files), configs nested inside multi-tool files (e.g. pyproject.toml with a Codex table), and configs where comments document the v1 semantics and need re-writing for v2 readers. Plan a short review pass for each of those cases; the codemod's report names every file it skipped and why.
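Before the first dry run, it can help to inventory which files fall into which class. The sketch below walks a directory tree and sorts files into codemod targets and manual-review candidates. The `[tool.codex` marker for a nested Codex table and the plain `codex` substring check for shell scripts are assumptions for illustration, not documented conventions, and the comment-rewrite class still needs a human read.

```python
from pathlib import Path

# Sketch only. The "[tool.codex" marker and the "codex" substring check are
# assumptions; adjust them to how your repos actually embed Codex config.


def classify_config_sources(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by how they will likely need to be migrated."""
    found: dict[str, list[Path]] = {
        "codemod": [],         # standalone config files
        "multi_tool": [],      # Codex table nested in a shared file
        "runtime_script": [],  # scripts that may assemble config at runtime
    }
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if path.name == "config.toml":
            found["codemod"].append(path)
        elif path.name == "pyproject.toml":
            text = path.read_text(encoding="utf-8", errors="ignore")
            if "[tool.codex" in text:
                found["multi_tool"].append(path)
        elif path.suffix == ".sh":
            text = path.read_text(encoding="utf-8", errors="ignore")
            if "codex" in text:
                found["runtime_script"].append(path)
    return found
```

Calling `classify_config_sources(Path("."))` from a repo root gives a rough worklist; the codemod's own skip report remains the authoritative source for what it did not touch.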
Custom keys that your own tooling still reads should move under [extra] (which v2 reserves for tool-specific extensions). Audit for custom keys before the cut-over; the codemod's strict-mode warning will list them.

03 · Auth Surface
Token rotation, OAuth flows, headless agent auth.
v2's auth surface is the redesign that benefits unattended workflows most. v1 had a single token concept that everyone stretched: developer laptops, CI workers, and production agents all used the same long-lived token, and the only way to distinguish them was at the IAM layer. v2 splits the auth surface into three explicit modes, each with its own rotation story and audit trail.
Pick the mode that matches the workload, not the one that matches the credential you already have. Developer laptops should use interactive OAuth; trusted agents that run unattended should use long-lived service tokens; CI workers should use short-lived credentials issued per-run. Mixing them (a long-lived token in CI for convenience) is the credential-rotation footgun that v2 is trying to eliminate.
Mode 01 · Interactive OAuth (developer laptops)
The Codex CLI runs a browser flow on first use. Tokens are stored in the OS keychain and refreshed automatically. No credentials in config files, no rotation duty for the developer. The right mode for any environment where a human is present at first use.
Pick for laptops

Mode 02 · Long-lived service token
A long-lived token issued from the OpenAI org dashboard, stored in a secrets manager and read via environment variable. Suitable for trusted agents that run unattended and where rotation is owned by a security team rather than the CI pipeline. Quarterly rotation is the sane cadence.
Pick for trusted agents

Mode 03 · Short-lived CI-issued credential
v2's headless mode: the CI runtime requests a per-run credential from the OpenAI auth endpoint using a workload identity (GitHub OIDC, GitLab ID token, AWS IAM). The credential is valid only for the duration of the job, eliminating the credential-rotation problem entirely.
Pick for CI

The CI mode is the one that pays back the migration cost most quickly. In v1, CI workflows held long-lived tokens in repository secrets; a leak meant rotating the token across every consumer. In v2, the workflow requests a fresh credential at the start of every run using its existing workload identity: no shared secret, no rotation duty, and a much smaller blast radius if a log file accidentally captures a token. The migration is a ten-line change to the workflow YAML and a one-time configuration in the OpenAI org dashboard.
For teams currently running v1 in CI with a shared long-lived token, the migration sequence is: enable workload identity federation in the OpenAI org dashboard, update each CI workflow to use mode 03, run both modes in parallel for a week to verify, then revoke the shared long-lived token. The parallel-run window is the cheap insurance that catches edge cases — a workflow that doesn't have workload identity available, for instance — before they become a 2am firefight.
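During the parallel-run week, a small lint over the parsed profiles can catch the shortcut this section warns against. The sketch below is illustrative only: the mode strings `long-lived-token` and `ci-issued` come from the config examples in this playbook, while the function name `lint_auth`, the assumption that an inline credential appears under a `token` key, and the rule that the CI profile is literally named `ci` are hypothetical.

```python
# Sketch only. Mode strings come from this playbook's config examples;
# the lint rules and the inline "token" key name are assumptions.


def lint_auth(profiles: dict) -> list[str]:
    """Return warnings for auth settings that defeat the v2 auth modes."""
    warnings = []
    for name, profile in profiles.items():
        auth = profile.get("auth", {})
        if "token" in auth:
            warnings.append(f"{name}: inline token; use token_env instead")
        if name == "ci" and auth.get("mode") == "long-lived-token":
            warnings.append(f"{name}: long-lived token in CI; prefer ci-issued")
    return warnings


profiles = {
    "ci": {"auth": {"mode": "long-lived-token", "token_env": "CODEX_AUTH_TOKEN"}},
    "prod": {"auth": {"mode": "long-lived-token", "token_env": "CODEX_PROD_TOKEN"}},
}
ci_warnings = lint_auth(profiles)
```

Here only the `ci` profile is flagged; the `prod` profile passes because a long-lived token read from an environment variable is the intended pattern for trusted agents.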
"v2's headless CI auth mode is the single feature that pays back the entire migration. A long-lived shared token in CI is a credential-rotation footgun; a per-run workload identity is just a workflow."— Internal note, Digital Applied agentic engineering team
04 · Profile API
Per-profile model, sandbox, approval settings.
The profile API is the v2 feature that takes the most thought to use well. The mechanics are simple: a config file can define multiple named profiles, each with its own model, sandbox configuration, approval policy, and auth mode. The judgement call is which environments deserve their own profile, and which can share.
Three profiles is the most common landing point — dev, ci, prod — and most teams should start there before adding more. Profiles aren't free: every profile is a configuration surface that someone has to maintain, and the surface compounds with the number of repos that consume it. The right question is "does this environment have meaningfully different requirements" rather than "could we make a profile for this?".
```toml
# v2 config.toml with three profiles
[profiles.dev]
[profiles.dev.model]
name = "gpt-5.5-codex"
[profiles.dev.sandbox]
mode = "workspace-write"
network = true
[profiles.dev.approval]
policy = "never"

[profiles.ci]
[profiles.ci.model]
name = "gpt-5.5-codex"
[profiles.ci.sandbox]
mode = "workspace-write"
network = false
[profiles.ci.approval]
policy = "untrusted"
[profiles.ci.auth]
mode = "ci-issued"

[profiles.prod]
[profiles.prod.model]
name = "gpt-5.5-codex"
[profiles.prod.sandbox]
mode = "read-only"
network = false
[profiles.prod.approval]
policy = "untrusted"
[profiles.prod.auth]
mode = "long-lived-token"
token_env = "CODEX_PROD_TOKEN"
```
Activation is explicit, either via the --profile flag or the CODEX_PROFILE environment variable, rather than inferred from the environment. This is a deliberate design choice: implicit activation in v1 led to confusing incidents where the wrong profile was active and nobody noticed until a generated commit landed with the wrong auth. v2's explicit activation means a missing or wrong profile fails loudly at startup rather than silently mid-run.
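Wrapper scripts can enforce the same discipline before the CLI is ever invoked. The following sketch shows one way such a guard might look. The precedence (flag over environment variable), the `ProfileError` type, and the function name `resolve_profile` are assumptions; the source states only that activation is explicit and happens through `--profile` or `CODEX_PROFILE`.

```python
from typing import Mapping, Optional

# Sketch only. Flag-over-environment precedence and the error type are
# assumptions; the source only says activation must be explicit.


class ProfileError(Exception):
    """Raised when no valid profile is explicitly selected."""


def resolve_profile(
    flag: Optional[str],
    env: Mapping[str, str],
    profiles: Mapping[str, dict],
) -> str:
    """Pick the active profile name, failing loudly if none is selected."""
    name = flag or env.get("CODEX_PROFILE")
    if not name:
        raise ProfileError("no profile selected: pass --profile or set CODEX_PROFILE")
    if name not in profiles:
        raise ProfileError(f"unknown profile {name!r}; defined: {sorted(profiles)}")
    return name
```

A wrapper would call `resolve_profile(args.profile, os.environ, config["profiles"])` and exit non-zero on `ProfileError`, so a forgotten export stops the job instead of running with the wrong settings.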
Developer laptops
Workspace-write sandbox, network enabled, approval policy 'never' for trusted developers. Interactive OAuth for auth. The most permissive profile, appropriate because a human is at the keyboard.
Interactive OAuth

CI workers
Workspace-write sandbox but network disabled; generated tests don't need outbound calls. Approval policy 'untrusted'. Short-lived CI-issued credential. The right balance of capability and containment for unattended runs.
ci-issued auth

Production agents
Read-only sandbox, network disabled, approval 'untrusted', long-lived token rotated quarterly. For agents that run in production but only need to read code or write to a tightly scoped output channel, which covers most observability and triage agents.
Long-lived token

Two practical rules for designing profiles. First, name them after environments rather than people or teams: dev, ci, prod age better than alice, frontend-team, migration-project. Second, default to inheriting from a base profile rather than duplicating settings; v2 supports profile extension via extends = "base" at the top of any profile section, which keeps shared defaults in one place.
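To reason about what a profile ends up with after inheritance, it can help to model the resolution yourself. The sketch below resolves an `extends` chain over parsed profiles. The `extends` key is named in the text; the merge semantics shown here (child values override parent values, nested tables merge key by key, cycles are rejected) are an assumption for illustration and may not match the CLI's actual behaviour.

```python
import copy

# Sketch only. The `extends` key comes from the text; the merge semantics
# (child overrides parent, nested tables merged key by key) are assumed.


def deep_merge(parent: dict, child: dict) -> dict:
    """Return a new dict with `child` layered over `parent`."""
    merged = copy.deepcopy(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_profile_settings(profiles: dict, name: str, _seen: tuple = ()) -> dict:
    """Follow the `extends` chain for `name` and return the merged settings."""
    if name in _seen:
        raise ValueError(f"circular extends chain: {' -> '.join(_seen + (name,))}")
    profile = dict(profiles[name])
    parent_name = profile.pop("extends", None)
    if parent_name is None:
        return copy.deepcopy(profile)
    parent = resolve_profile_settings(profiles, parent_name, _seen + (name,))
    return deep_merge(parent, profile)


profiles = {
    "base": {
        "model": {"name": "gpt-5.5-codex"},
        "sandbox": {"mode": "workspace-write", "network": False},
    },
    "dev": {
        "extends": "base",
        "sandbox": {"network": True},
        "approval": {"policy": "never"},
    },
}
dev = resolve_profile_settings(profiles, "dev")
```

In this example `dev` keeps the base model and sandbox mode, overrides only `network`, and adds its own approval policy, which is the "shared defaults in one place" outcome the second rule is after.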
For teams currently using environment variables to switch behaviour (the v1 pattern), the migration is straightforward: create profiles for each environment, move the variable-driven settings into the corresponding profile, and replace the variable-switching shell wrapper with a single CODEX_PROFILE export. The result is a config surface that's easier to read, easier to review, and harder to misconfigure silently.
05 · Sandbox Flags
Renamed flags and the new defaults.
Sandbox flag renames are the most mechanical part of the migration and the part where the codemod does the most work. Most v1 flags have direct v2 equivalents; the codemod swaps them automatically and prints a report of anything it couldn't resolve. What does benefit from human attention is the shift in defaults — v2 is more conservative than v1, and a few workloads need explicit re-permissioning.
[Chart: Sandbox flag migration · share of cases by handling (codemod-handled 82%, re-scoped 12%, removed 6%). Source: Digital Applied internal benchmark, May 2026 · n = 18 repos · 47 v1 configs]

The default-change worth knowing in detail: v1 sandboxes defaulted to workspace-write with network access enabled; v2 defaults to workspace-write with network access disabled. Teams that relied on the implicit network access (for example, scripts that pulled dependencies during a run) will see those operations fail until they explicitly enable network in the appropriate profile. The fix is one line per profile, but it's a line that needs to be added deliberately.
The second default-change worth knowing: v2's read-only sandbox mode is genuinely read-only, where v1's "read-only" allowed writes to a temporary directory by default. Production profiles using v1 read-only with temp writes need to either upgrade to v2's read-only-with-tmp mode or refactor to avoid the temp directory. Most production agents don't actually need temp writes (they were inheriting a v1 default they never thought about), but a few do.
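Both default changes can be checked against existing v1 configs before the upgrade. The sketch below flags a parsed v1 config that leans on either implicit behaviour. The key names follow the v1 example config earlier in this playbook; the function name `audit_default_changes` and the wording of the findings are hypothetical.

```python
# Sketch only. Key names follow the v1 example config in this playbook;
# the finding messages are illustrative.


def audit_default_changes(v1: dict) -> list[str]:
    """Flag v1 settings whose behaviour changes under the v2 defaults."""
    findings = []
    if "sandbox_network" not in v1:
        # Nothing set explicitly, so the workload inherited v1's default.
        findings.append("network: relied on v1 default (on); v2 default is off")
    if v1.get("sandbox_mode") == "read-only":
        findings.append("read-only: v1 allowed temp-dir writes; v2 does not")
    return findings
```

A config that sets `sandbox_network` explicitly and uses `workspace-write` produces no findings; one that only sets `sandbox_mode = "read-only"` produces both.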
--sandbox-mode → --sandbox.mode
Direct rename. Old --sandbox-mode workspace-write becomes --sandbox.mode workspace-write or [sandbox] mode = 'workspace-write' in config. Codemod handles both the CLI flag and the config form.
Codemod

--network-enabled → explicit network flag
v1's --network-enabled (default true) becomes v2's --sandbox.network (default false). The flag name is similar but the default flipped: workloads that didn't pass the flag explicitly now need to.
Audit + codemod

--approval → --approval.policy
Moved under the approval section to allow per-profile override. The codemod handles the rename. Profiles can now relax or tighten approval independently: the developer profile can run with 'never' while CI stays 'untrusted'.
Codemod

--writable-paths → removed
v1's --writable-paths is removed in favour of profile-scoped paths in the [sandbox.paths] block. The codemod migrates known patterns; bespoke path lists need manual review. Most teams find the new structured form clearer than the v1 comma-separated string.
Manual review

For the codemod-handled 82% of cases, the human work is just reviewing the diff. For the 12% re-scoped cases, plan a short audit: run the codemod with --report-changes and walk through each flagged case to confirm the semantics are what you want. For the 6% removed cases, the codemod prints a replacement suggestion and the migration is a focused per-case decision.
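For command lines that live in wrapper scripts rather than config files, the same rename table can be applied by hand or with a small helper. The sketch below rewrites a list of v1 arguments using the four flag mappings listed above and collects notes for the cases that need a human decision. It is not the official codemod; the function name `rewrite_flags` and the support for a `--flag=value` spelling are assumptions.

```python
# Sketch only: not the official codemod. The rename table is taken from the
# flag list in this section; flag values are passed through untouched.

RENAMES = {
    "--sandbox-mode": "--sandbox.mode",
    "--approval": "--approval.policy",
    "--network-enabled": "--sandbox.network",
}
NEEDS_REVIEW = {
    "--network-enabled": "default flipped from on to off; confirm intent",
    "--writable-paths": "removed; move paths into [sandbox.paths]",
}


def rewrite_flags(argv: list[str]) -> tuple[list[str], list[str]]:
    """Return (rewritten argv, review notes) for a v1 command line."""
    out, notes = [], []
    for arg in argv:
        # partition also copes with the assumed --flag=value spelling
        flag, sep, value = arg.partition("=")
        if flag in NEEDS_REVIEW:
            notes.append(f"{flag}: {NEEDS_REVIEW[flag]}")
        out.append(RENAMES.get(flag, flag) + sep + value)
    return out, notes
```

The removed `--writable-paths` flag is left in place in the output and only reported, since its replacement is a structured config block rather than another flag.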
06 · Phased Rollout
Pilot → wave 1 → wave 2 → cut over.
The phased rollout pattern below is what we recommend for any team with more than a handful of repos using Codex CLI. Big-bang migrations are tempting because they're conceptually simple, but they concentrate every failure mode into one window and leave no room to learn. A four-phase rollout spreads the risk across two to three weeks, lets each wave inform the next, and keeps a working v1 fallback through the entire process.
Pilot · one repo
1 repo · 1 team · 3 days
Pick a low-traffic repo with a small team. Run the codemod, set up profiles, migrate the CI workflow to mode 03 auth. Document every issue. The pilot's job is to surface the unknowns before they cost a wave.
Goal: find the gotchas

Wave 1 · ~30% of repos
non-production · 5–7 days
Apply the pilot's lessons to the next batch, typically internal tools, documentation repos, and other non-critical surfaces. Two engineers shepherd the wave; one owns codemod runs, one owns CI workflow updates.
Goal: prove the pattern

Wave 2 · production
production repos · 5–7 days
Migrate the production repos. Keep v1 fallback configurations in place. Communicate the cut-over to dependent teams a week in advance; agentic pipelines breaking unexpectedly is the avoidable outage.
Goal: ship the value

Cut over · retire v1
post-window cleanup · 1–2 days
After the 72-hour rollback window closes on the last wave, remove v1 fallback configs, revoke v1 long-lived tokens, and run codex migrate --strict across the fleet to confirm zero deprecated keys remain. The migration is done.
Goal: leave it clean

Two practical operating rules for the phased rollout. First, keep the codemod and the human edits in separate commits, so a reviewer reading the PR can see the mechanical changes at a glance and focus their attention on the non-mechanical ones. Second, every wave should produce a short retrospective note (what worked, what didn't, what changed for the next wave); the later waves are cheaper precisely because the pilot's lessons compound.
One temptation to resist: combining the v2 migration with other related changes (a model upgrade, a sandbox tightening, a profile reorganisation) into one cut-over. Each of those changes is worth doing on its own merits but bundling them into the migration window makes diagnosis harder when something breaks. Ship the v2 migration first, prove it's stable for a week, then make the other changes as deliberate follow-up PRs.
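The wave assignment described in this section can be drafted mechanically and then adjusted by hand. The sketch below is one possible planner under stated assumptions: the `Repo` fields, the `weekly_runs` traffic measure, and the rule that the pilot is simply the lowest-traffic non-production repo are all hypothetical, and it does not try to enforce the rough 30% share for wave 1.

```python
from dataclasses import dataclass

# Sketch only. The Repo fields and the pilot-selection rule are assumptions
# layered on the pilot / wave 1 / wave 2 phases described in this section.


@dataclass
class Repo:
    name: str
    production: bool
    weekly_runs: int  # hypothetical traffic measure


def plan_waves(repos: list[Repo]) -> dict[str, list[str]]:
    """Assign repos to pilot, wave 1 (non-production), and wave 2 (production)."""
    non_prod = sorted(
        (r for r in repos if not r.production), key=lambda r: r.weekly_runs
    )
    prod = [r for r in repos if r.production]
    if not non_prod:
        raise ValueError("need at least one non-production repo for the pilot")
    return {
        "pilot": [non_prod[0].name],
        "wave_1": [r.name for r in non_prod[1:]],
        "wave_2": [r.name for r in prod],
    }
```

Treat the output as a starting draft: team size, ownership, and dependencies between repos usually justify moving a few names between waves.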
07 · Common Pitfalls
Four upgrade failures with diagnostic signals.
The migration failures below are the ones we see most often across teams — each has a clear diagnostic signal that points at the cause. None are catastrophic if caught early; all are painful if caught late.
[Chart: Most common v1 → v2 migration failures · share. Source: Digital Applied migration support, May 2026 · n = 23 incidents]

Failure 01 · Silent v1 key still in config. The most common failure by a wide margin. A custom v1 key (often added years ago for in-house tooling) survives the codemod sweep because the codemod doesn't know about it, and v2's strict-mode parser rejects it on first run. The diagnostic signal is a startup error mentioning "unknown key" with the offending key name. The fix is either removing the key (if obsolete), moving it under [extra] (if a custom tool reads it), or filing a feature request if the key represents a v1 capability v2 dropped.

Failure 02 · CI workflow auth not migrated. The team migrates configs and profiles in a sweep but forgets that CI workflows still hold the v1 long-lived token. v2 still accepts it (auth back-compat is generous), so nothing breaks, but the team loses the security benefit of mode 03 entirely. The diagnostic signal is a long-lived token still listed in the OpenAI org dashboard's active-tokens view a week after the cut-over. Audit on day seven and remediate any workflows still using v1 auth.

Failure 03 · Sandbox network default surprise. A workload that relied on v1's default-on network silently breaks because v2 defaults network off. The diagnostic signal is a workload failure with no obvious network-related error: Codex itself runs fine, but the generated code can't reach an outbound service it used to reach. The fix is adding network = true to the relevant [profiles.*.sandbox] block, but the more structural fix is questioning whether the workload should need outbound network at all.

Failure 04 · Profile activation missing. A team sets up named profiles but a wrapper script or CI workflow forgets to export CODEX_PROFILE or pass --profile. v2 falls back to the default profile (or to no profile at all, depending on config), and the wrong settings apply silently. The diagnostic signal is a Codex run that succeeds but produces output inconsistent with the intended profile, for example a CI run that shouldn't have network access making outbound calls. Always grep your CI workflows for the activation pattern after the migration; the explicit-activation discipline is what makes profiles trustworthy.
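The grep for Failure 04 can be scripted so it runs on every workflow file. The sketch below is a coarse heuristic, not a parser: it treats any line containing `codex` as a standalone word as an invocation, and it assumes that a `CODEX_PROFILE` mention anywhere in the file covers every call in it. The function name `find_unprofiled_calls` is hypothetical.

```python
import re

# Sketch only. Matching `codex` as a standalone word is a heuristic, and the
# file-level CODEX_PROFILE check is deliberately coarse.

CODEX_CALL = re.compile(r"(^|[\s;&|])codex(\s|$)")


def find_unprofiled_calls(workflow_text: str) -> list[int]:
    """Return 1-based line numbers of codex calls with no explicit profile."""
    if "CODEX_PROFILE" in workflow_text:
        return []  # profile exported somewhere in the file; assume it applies
    return [
        number
        for number, line in enumerate(workflow_text.splitlines(), start=1)
        if CODEX_CALL.search(line) and "--profile" not in line
    ]
```

Any non-empty result is worth a manual look; an empty result only means the heuristic found nothing, not that activation is guaranteed to be correct.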
For broader context on the Codex ecosystem and how this migration sits inside it, our Codex test-generation pipeline tutorial walks the canonical CI-side pattern that benefits most from v2's headless auth, and the Claude Code custom subagent tutorial covers the parallel pattern in the Anthropic ecosystem — the architectural shape carries across vendors. Teams running CLI migrations across many surfaces should read our AI digital transformation engagements for the longer-form playbook on coordinated rollouts.
CLI migrations are predictable when phased — pilot, wave, cut over, retire.
The Codex CLI v2 migration is the largest schema change the CLI has shipped, but it isn't a mysterious one. It comes down to four breaking-change axes, a codemod that handles the bulk of the mechanical work, a 72-hour rollback window that takes the edge off the cut-over, and a phased rollout pattern that spreads the risk across two to three weeks. Done well, it's a three- to five-day project; done ad-hoc, it eats a sprint and produces config sprawl that haunts the team for quarters.
The migration's payback is real and worth naming: per-profile separation that finally makes dev / CI / prod distinctions explicit, a headless CI auth mode that eliminates the long-lived shared-token rotation problem, and a config layout that's legibly sectioned rather than prefix-organised. Teams that adopt the new profile API deliberately (three named profiles, explicit activation, inherited base settings) find the post-migration config surface meaningfully smaller and harder to misconfigure.
The broader pattern is the one to keep. Treat every CLI migration as a phased rollout, not a big-bang cut-over. Separate codemod commits from human edits. Preserve a last-known-good config through the rollback window. Audit on day seven for migrations that look done but quietly aren't. The same shape applies to every CLI bump you'll do in the next two years — Codex, Claude Code, Gemini, whatever ships next — and the team that internalises it once stops dreading the upgrade cycle for good.