Development · Migration · 14 min read · Published May 15, 2026

config.toml migration, auth surface changes, profile API revamp, sandbox flag renames — the Codex CLI v2 upgrade that touches every team config.

Codex CLI v1 to v2 Migration Playbook: Config Changes

The Codex CLI v2 upgrade reshapes four surfaces simultaneously — config.toml schema, authentication, the profile API, and sandbox flag names. This playbook walks every change a team needs to plan, sequence, and verify before cutting over.

Digital Applied Team
Agentic engineering · Published May 15, 2026
Scope: CLI v1 → v2
  • Breaking-change axes: 4 (config · auth · profiles · sandbox)
  • Typical migration duration: 35 days end-to-end for a mid-sized team
  • Rollback window: 72h (supported safe-revert period)
  • Profile slots: Unlimited (per-profile model, sandbox, approval)

The Codex CLI v1 to v2 migration is the largest schema change the OpenAI Codex CLI has shipped since launch — four breaking-change axes that touch every team's configuration, authentication, profile setup, and sandbox flags simultaneously. Done well, the cut-over is a three- to five-day project; done ad-hoc, it eats a sprint and produces a config sprawl that haunts the team for quarters.

The good news is that the v2 design isn't arbitrary — every breaking change is in service of a coherent goal. Per-profile settings unlock genuine separation between developer laptops, CI workers, and production agents; the auth surface gains dedicated headless patterns for unattended pipelines; sandbox flag renames bring naming in line with how teams actually deploy. The migration is mostly mechanical once you understand the shape.

This playbook walks every change a team needs to plan, sequence, and verify: what ships in v2, the config.toml schema migration, the auth surface and its three modes, the profile API revamp, the sandbox flag renames, a phased rollout pattern that minimises blast radius, and the four common upgrade failures with their diagnostic signals. The closing FAQ covers the questions teams ask before bumping the CLI in production.

Key takeaways
  1. config.toml schema migration is the bulk of the work. Most teams need a one-time sweep across every checked-in config file. Deprecated keys still parse with warnings during the rollback window; new section headers must be added explicitly.
  2. Headless agent auth now has dedicated patterns. v2 separates interactive OAuth, long-lived service tokens, and short-lived CI-issued credentials into three distinct auth modes. Adopt the CI mode for unattended pipelines; it avoids the credential-rotation footgun.
  3. Per-profile settings unlock dev / CI / prod separation. The profile API revamp lets one config file describe many environments. Use named profiles deliberately rather than overloading environment variables: there are fewer footguns and the activation is explicit.
  4. Sandbox flag renames are mechanical; codemods cover them. Most renamed flags have direct equivalents. Run the official codemod across your repos before the cut-over date; the remaining manual cases will be visible in the codemod's report.
  5. The 72-hour rollback window is the right safety net. v2 ships a documented rollback path that accepts v1 configs during the first 72 hours after upgrade. Use the window deliberately rather than treating the cut-over as one-way; it's the cheapest insurance you have.

01 · What's New · v2 ships in four axes: config, auth, profiles, sandbox.

The v2 release is unusual because it bundles four breaking changes into one cut-over rather than spreading them across point releases. The reasoning is sound: every axis depends on the others (the new profile API needs the new config schema, the new sandbox flags assume per-profile activation, the new auth modes assume profile-scoped credentials), so shipping them sequentially would force every team through several painful migrations instead of one.

For most teams the breaking changes are mechanical: rename a section header, swap a flag, point an environment variable at a new file location. The non-mechanical parts are the auth surface (where CI workflows need their own credential pattern) and the profile design (where teams need to think about which environments deserve their own profile versus sharing one). Plan a half-day of human thinking on top of whatever the codemod produces.

Axis 01 · config.toml schema (new sections · deprecated keys)
Top-level [model], [sandbox], and [profiles.*] sections replace the v1 flat layout. Deprecated keys continue to parse with a warning during the 72-hour rollback window. Most teams need a one-time sweep across every checked-in config file.
Where most teams spend their time

Axis 02 · Auth surface (OAuth · long-lived token · headless CI)
v2 splits authentication into three explicit modes rather than relying on a single token field. Interactive OAuth for developer laptops, long-lived service tokens for trusted agents, short-lived CI-issued credentials for unattended pipelines.
Three modes, one config

Axis 03 · Profile API (per-profile model · sandbox · approval)
Named profiles in [profiles.dev], [profiles.ci], [profiles.prod] each carry their own model, sandbox configuration, and approval policy. Activation is explicit (CODEX_PROFILE env var or --profile flag) rather than inferred.
Deliberate environment separation

Axis 04 · Sandbox flags (renamed · new defaults)
Sandbox flag names align with the new profile model: old standalone flags either move under [sandbox] or are renamed for clarity. Defaults also shifted: write access is opt-in rather than opt-out, network access is profile-scoped.
Mechanical with codemod coverage

Three of the four axes have codemods or documented migration steps. The fourth — the profile API — is the one that benefits from human thinking, because the right profile layout for your team isn't something a tool can infer. Most teams land on three profiles (dev, ci, prod) and stop there; a minority with multiple production surfaces add a fourth or fifth deliberately.

Why a single big cut-over
The v2 release reads like four migrations stapled together because every axis depends on the others. The new sandbox flags assume profile-scoped activation; the new auth modes assume profile-scoped credentials; the profile API itself needs the new config schema. Shipping them sequentially would force every team through several painful migrations rather than one, so the OpenAI Codex team made the right call.

02 · config.toml · Schema migration: new sections, deprecated keys.

The config schema migration is where most teams spend their time, because every checked-in config.toml file needs touching and many repos have several (developer defaults, CI overrides, per-package configs in monorepos). The v2 schema is structurally cleaner — flat key-value pairs from v1 move into named sections — but the migration itself is a mechanical sweep.

The simplest way to think about it: v1 had one namespace and relied on key prefixes to organise keys (sandbox_, model_, approval_ and so on). v2 has explicit sections ([sandbox], [model], [approval]) and per-profile sub-sections ([profiles.dev.sandbox] and so on). The information is the same; the layout is more legible and easier to override.

# v1 config.toml — flat layout
model = "gpt-5.5-codex"
sandbox_mode = "workspace-write"
sandbox_network = false
approval_policy = "untrusted"
auth_token = "sk-codex-..."

# v2 config.toml — sectioned layout
[model]
name = "gpt-5.5-codex"

[sandbox]
mode = "workspace-write"
network = false

[approval]
policy = "untrusted"

[auth]
mode = "long-lived-token"
token_env = "CODEX_AUTH_TOKEN"

The token field stops being a literal string by default — v2 prefers an environment-variable reference (token_env) so configs can be safely committed without leaking credentials. Inline tokens still work for back-compat but emit a deprecation warning; teams should remove them before the 72-hour rollback window closes to avoid having to revisit the file later.

Key 01 · model → [model] section
Old top-level model = '...' moves into [model] name = '...'. Optional [model] sub-keys (temperature, max_output_tokens) live in the same section. The codemod handles this rename automatically.
Run codemod

Key 02 · sandbox_* keys → [sandbox] block
Every sandbox_-prefixed key collapses into a single [sandbox] section. New defaults shipped at the same time: write access is opt-in rather than the v1 opt-out default. Read section 05 before adopting.
Run codemod + audit

Key 03 · auth_token → [auth] mode + token_env
Inline tokens still parse with a deprecation warning. The right replacement is the [auth] section with mode = 'long-lived-token' and token_env pointing at an environment variable. Drop inline tokens before the rollback window closes.
Manual review

Key 04 · approval_policy → [approval] section
Direct rename. Per-profile overrides become possible: [profiles.dev.approval] can relax to 'never' for trusted developer laptops while [profiles.ci.approval] stays 'untrusted'. Codemod handles the top-level rename.
Run codemod
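The four key moves above can be sketched as a small conversion function. This is only an illustration of the mapping, not the codex migrate codemod itself: the function name, the decision to park unrecognised keys under [extra], and the default CODEX_AUTH_TOKEN variable name are assumptions drawn from the examples in this article.

```python
from typing import Any

# v1 flat key -> (v2 section, v2 key), taken from the before/after example above.
V1_TO_V2 = {
    "model": ("model", "name"),
    "sandbox_mode": ("sandbox", "mode"),
    "sandbox_network": ("sandbox", "network"),
    "approval_policy": ("approval", "policy"),
}


def convert_v1_to_v2(
    v1: dict[str, Any], token_env: str = "CODEX_AUTH_TOKEN"
) -> dict[str, dict[str, Any]]:
    """Map a parsed v1 flat config onto the v2 sectioned layout (illustrative)."""
    v2: dict[str, dict[str, Any]] = {}
    for key, value in v1.items():
        if key in V1_TO_V2:
            section, name = V1_TO_V2[key]
            v2.setdefault(section, {})[name] = value
        elif key == "auth_token":
            # Never copy the secret across; reference an environment variable.
            v2["auth"] = {"mode": "long-lived-token", "token_env": token_env}
        else:
            # Assumption: in-house keys are parked under [extra] for review.
            v2.setdefault("extra", {})[key] = value
    return v2
```

Working on parsed dictionaries keeps the sketch independent of any TOML writer; a real sweep would still need to preserve comments and formatting, which is one reason to prefer the shipped codemod for the actual rewrite.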

The codemod ships as part of the v2 CLI: codex migrate scans every config file under a given root, prints a diff, and (with --write) applies the changes. Run it twice: once with --dry-run to review the diff and capture any manual cases the codemod flags, then again with --write after spot-checking the output. Commit the codemod's changes in their own PR so reviewers can audit the mechanical sweep separately from any human edits.

Three classes of cases the codemod can't handle on its own: configs assembled at runtime via shell scripts (the codemod only touches files), configs nested inside multi-tool files (e.g. pyproject.toml with a Codex table), and configs where comments document the v1 semantics and need re-writing for v2 readers. Plan a short review pass for each of those cases; the codemod's report names every file it skipped and why.

The hidden gotcha
v1 silently accepted unknown keys; v2 rejects them by default. Any custom key your team added to v1 configs for in-house tooling will break the v2 parse unless you move it under [extra] (which v2 reserves for tool-specific extensions). Audit for custom keys before the cut-over; the codemod's strict-mode warning will list them.

03 · Auth Surface · Token rotation, OAuth flows, headless agent auth.

v2's auth surface is the redesign that benefits unattended workflows most. v1 had a single token concept that everyone stretched: developer laptops, CI workers, and production agents all used the same long-lived token, and the only way to distinguish them was at the IAM layer. v2 splits the auth surface into three explicit modes, each with its own rotation story and audit trail.

Pick the mode that matches the workload, not the one that matches the credential you already have. Developer laptops should use interactive OAuth; trusted agents that run unattended should use long-lived service tokens; CI workers should use short-lived credentials issued per-run. Mixing them (a long-lived token in CI for convenience) is the credential-rotation footgun that v2 is trying to eliminate.

Mode 01 · Interactive OAuth (developer laptops)
The Codex CLI runs a browser flow on first use. Tokens are stored in the OS keychain and refreshed automatically. No credentials in config files, no rotation duty for the developer. The right mode for any environment where a human is present at first use.
Pick for laptops

Mode 02 · Long-lived service token
A long-lived token issued from the OpenAI org dashboard, stored in a secrets manager and read via environment variable. Suitable for trusted agents that run unattended and where rotation is owned by a security team rather than the CI pipeline. Quarterly rotation is the sane cadence.
Pick for trusted agents

Mode 03 · Short-lived CI-issued credential
v2's headless mode: the CI runtime requests a per-run credential from the OpenAI auth endpoint using a workload identity (GitHub OIDC, GitLab ID token, AWS IAM). The credential is valid only for the duration of the job, eliminating the credential-rotation problem entirely.
Pick for CI

The CI mode is the one that pays back the migration cost most quickly. In v1, CI workflows held long-lived tokens in repository secrets; a leak meant rotating the token across every consumer. In v2, the workflow requests a fresh credential at the start of every run using its existing workload identity — no shared secret, no rotation duty, no blast radius if a log file accidentally captures a token. The migration is a ten-line change to the workflow YAML and a one-time configuration in the OpenAI org dashboard.

For teams currently running v1 in CI with a shared long-lived token, the migration sequence is: enable workload identity federation in the OpenAI org dashboard, update each CI workflow to use mode 03, run both modes in parallel for a week to verify, then revoke the shared long-lived token. The parallel-run window is the cheap insurance that catches edge cases — a workflow that doesn't have workload identity available, for instance — before they become a 2am firefight.

"v2's headless CI auth mode is the single feature that pays back the entire migration. A long-lived shared token in CI is a credential-rotation footgun; a per-run workload identity is just a workflow."— Internal note, Digital Applied agentic engineering team

04 · Profile API · Per-profile model, sandbox, approval settings.

The profile API is the v2 feature that takes the most thought to use well. The mechanics are simple: a config file can define multiple named profiles, each with its own model, sandbox configuration, approval policy, and auth mode. The judgement call is which environments deserve their own profile, and which can share.

Three profiles is the most common landing point — dev, ci, prod — and most teams should start there before adding more. Profiles aren't free: every profile is a configuration surface that someone has to maintain, and the surface compounds with the number of repos that consume it. The right question is "does this environment have meaningfully different requirements" rather than "could we make a profile for this?".

# v2 config.toml with three profiles

[profiles.dev]
[profiles.dev.model]
name = "gpt-5.5-codex"

[profiles.dev.sandbox]
mode = "workspace-write"
network = true

[profiles.dev.approval]
policy = "never"

[profiles.ci]
[profiles.ci.model]
name = "gpt-5.5-codex"

[profiles.ci.sandbox]
mode = "workspace-write"
network = false

[profiles.ci.approval]
policy = "untrusted"

[profiles.ci.auth]
mode = "ci-issued"

[profiles.prod]
[profiles.prod.model]
name = "gpt-5.5-codex"

[profiles.prod.sandbox]
mode = "read-only"
network = false

[profiles.prod.approval]
policy = "untrusted"

[profiles.prod.auth]
mode = "long-lived-token"
token_env = "CODEX_PROD_TOKEN"

Activation is explicit, via either the --profile flag or the CODEX_PROFILE environment variable, rather than inferred from the environment. This is a deliberate design choice: implicit activation in v1 led to confusing incidents where the wrong profile was active and nobody noticed until a generated commit landed with the wrong auth. With explicit activation, a misspelled or undefined profile name fails loudly at startup rather than silently mid-run. Omitting activation altogether is a separate failure mode, covered as Failure 04 in section 07.
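Teams that wrap the CLI in their own launcher can enforce the same discipline before Codex even starts. The sketch below is a hypothetical wrapper helper, not part of the CLI: the precedence of --profile over CODEX_PROFILE and the refusal to run without either are assumptions chosen to make mistakes fail early.

```python
from typing import Mapping, Optional, Sequence


class ProfileError(RuntimeError):
    """Raised when no usable profile can be resolved."""


def resolve_profile(
    flag_value: Optional[str],
    env: Mapping[str, str],
    defined_profiles: Sequence[str],
) -> str:
    """Pick the active profile name or fail loudly before anything runs."""
    # Assumption: an explicit --profile value wins over CODEX_PROFILE.
    name = flag_value or env.get("CODEX_PROFILE")
    if not name:
        raise ProfileError(
            "no profile selected: pass --profile or export CODEX_PROFILE"
        )
    if name not in defined_profiles:
        raise ProfileError(
            f"profile {name!r} is not defined; known: {sorted(defined_profiles)}"
        )
    return name
```

A launcher would call resolve_profile(args.profile, os.environ, profiles) and exit non-zero on ProfileError, so a forgotten export stops the job instead of running under unintended settings.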

Dev profile · ws-write · Developer laptops
Workspace-write sandbox, network enabled, approval policy 'never' for trusted developers. Interactive OAuth for auth. The most permissive profile, appropriate because a human is at the keyboard.
Interactive OAuth

CI profile · untrusted · CI workers
Workspace-write sandbox but network disabled (generated tests don't need outbound calls). Approval policy 'untrusted'. Short-lived CI-issued credential. The right balance of capability and containment for unattended runs.
ci-issued auth

Prod profile · ro-only · Production agents
Read-only sandbox, network disabled, approval 'untrusted', long-lived token rotated quarterly. For agents that run in production but only need to read code or write to a tightly scoped output channel, which covers most observability and triage agents.
Long-lived token

Two practical rules for designing profiles. First, name them after environments rather than people or teams — dev, ci, prod ages better than alice, frontend-team, migration-project. Second, default to inheriting from a base profile rather than duplicating settings; v2 supports profile extension via extends = "base" at the top of any profile section, which keeps shared defaults in one place.
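To reason about what a profile ends up with once it extends a base, it can help to model the resolution explicitly. The article only states that extends = "base" exists; the merge semantics below (child values override the parent key by key, nested tables merge recursively, cycles are rejected) are an assumption for illustration, and both function names are hypothetical.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where override wins, merging nested tables."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_profile_settings(
    profiles: dict[str, dict[str, Any]], name: str, _seen: tuple[str, ...] = ()
) -> dict[str, Any]:
    """Flatten a profile and its extends chain into effective settings."""
    if name in _seen:
        chain = " -> ".join(_seen + (name,))
        raise ValueError(f"circular extends chain: {chain}")
    if name not in profiles:
        raise KeyError(f"profile {name!r} is not defined")
    profile = dict(profiles[name])
    parent = profile.pop("extends", None)
    if parent is None:
        return profile
    inherited = resolve_profile_settings(profiles, parent, _seen + (name,))
    return deep_merge(inherited, profile)
```

Printing the resolved settings for dev, ci, and prod during review makes it easy to confirm that a shared base has not loosened a production default by accident.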

For teams currently using environment variables to switch behaviour (the v1 pattern), the migration is straightforward: create profiles for each environment, move the variable-driven settings into the corresponding profile, and replace the variable-switching shell wrapper with a single CODEX_PROFILE export. The result is a config surface that's easier to read, easier to review, and harder to misconfigure silently.

05 · Sandbox Flags · Renamed flags and the new defaults.

Sandbox flag renames are the most mechanical part of the migration and the part where the codemod does the most work. Most v1 flags have direct v2 equivalents; the codemod swaps them automatically and prints a report of anything it couldn't resolve. What does benefit from human attention is the shift in defaults — v2 is more conservative than v1, and a few workloads need explicit re-permissioning.

Sandbox flag migration · share of cases by handling
  • Renamed flags (automatic): 82%. Direct v1 → v2 equivalents, handled by codemod.
  • Re-scoped flags (review needed): 12%. Moved into [sandbox] section with subtle semantics change.
  • Removed flags (manual replacement): 6%. v1 features merged into profile API or dropped entirely.
Source: Digital Applied internal benchmark, May 2026 · n = 18 repos · 47 v1 configs

The default-change worth knowing in detail: v1 sandboxes defaulted to workspace-write with network access enabled; v2 defaults to workspace-write with network access disabled. Teams that relied on the implicit network access (for example, scripts that pulled dependencies during a run) will see those operations fail until they explicitly enable network in the appropriate profile. The fix is one line per profile, but it's a line that needs to be added deliberately.

The second default-change worth knowing: v2's read-only sandbox mode is genuinely read-only, where v1's "read-only" allowed writes to a temporary directory by default. Production profiles using v1 read-only with temp writes need to either switch to v2's read-only-with-tmp mode or refactor to avoid the temp directory. Most production agents don't actually need temp writes (they were inheriting a v1 default they never thought about), but a few do.

Flag 01 · --sandbox-mode → --sandbox.mode
Direct rename. Old --sandbox-mode workspace-write becomes --sandbox.mode workspace-write or [sandbox] mode = 'workspace-write' in config. Codemod handles both the CLI flag and the config form.
Codemod

Flag 02 · --network-enabled → explicit network flag
v1's --network-enabled (default true) becomes v2's --sandbox.network (default false). The flag name is similar but the default flipped, so workloads that didn't pass the flag explicitly now need to.
Audit + codemod

Flag 03 · --approval → --approval.policy
Moved under the approval section to allow per-profile override. The codemod handles the rename. Profiles can now relax or tighten approval independently: the developer profile can run with 'never' while CI stays 'untrusted'.
Codemod

Flag 04 · --writable-paths → removed
v1's --writable-paths is removed in favour of profile-scoped paths in the [sandbox.paths] block. The codemod migrates known patterns; bespoke path lists need manual review. Most teams find the new structured form clearer than the v1 comma-separated string.
Manual review

For the codemod-handled 82% of cases, the human work is just reviewing the diff. For the 12% re-scoped cases, plan a short audit — run the codemod with --report-changes and walk through each flagged case to confirm the semantics are what you want. For the 6% removed cases, the codemod prints a replacement suggestion and the migration is a focused per-case decision.
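Config files are only half the surface; wrapper scripts and CI steps also pass the old flags on the command line. As a rough sketch of that sweep, the helper below rewrites an argument list using the four flag mappings from this section. It is a hypothetical helper rather than the codemod, and its handling of the --flag=value form is an assumption about how the flags are written in your scripts.

```python
RENAMED_FLAGS = {
    "--sandbox-mode": "--sandbox.mode",
    "--network-enabled": "--sandbox.network",
    "--approval": "--approval.policy",
}
REMOVED_FLAGS = {
    "--writable-paths": "move the path list into the [sandbox.paths] block",
}


def rewrite_args(argv: list[str]) -> tuple[list[str], list[str]]:
    """Rename v1 flags in argv and collect notes for flags needing manual work."""
    rewritten: list[str] = []
    notes: list[str] = []
    for arg in argv:
        # Split "--flag=value" so the flag name can be matched on its own.
        flag, sep, value = arg.partition("=")
        if flag in RENAMED_FLAGS:
            rewritten.append(RENAMED_FLAGS[flag] + sep + value)
        else:
            rewritten.append(arg)
            if flag in REMOVED_FLAGS:
                # Left in place on purpose: removal is a per-case decision.
                notes.append(f"{flag}: {REMOVED_FLAGS[flag]}")
    return rewritten, notes
```

Renaming alone does not cover the network default flip: a command that never passed --network-enabled has nothing to rename, so those call sites still need the manual audit described above.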

06 · Phased Rollout · Pilot → wave 1 → wave 2 → cut over.

The phased rollout pattern below is what we recommend for any team with more than a handful of repos using Codex CLI. Big-bang migrations are tempting because they're conceptually simple, but they concentrate every failure mode into one window and leave no room to learn. A four-phase rollout spreads the risk across two to three weeks, lets each wave inform the next, and keeps a working v1 fallback through the entire process.

Phase 01 · Pilot · one repo (1 repo · 1 team · 3 days)
Pick a low-traffic repo with a small team. Run the codemod, set up profiles, migrate the CI workflow to mode 03 auth. Document every issue. The pilot's job is to surface the unknowns before they cost a wave.
Goal: find the gotchas

Phase 02 · Wave 1 · ~30% of repos (non-production · 5–7 days)
Apply the pilot's lessons to the next batch: typically internal tools, documentation repos, and other non-critical surfaces. Two engineers shepherd the wave; one owns codemod runs, one owns CI workflow updates.
Goal: prove the pattern

Phase 03 · Wave 2 · production (production repos · 5–7 days)
Migrate the production repos. Keep v1 fallback configurations in place. Communicate the cut-over to dependent teams a week in advance; agentic pipelines breaking unexpectedly is the avoidable outage.
Goal: ship the value

Phase 04 · Cut over · retire v1 (post-window cleanup · 1–2 days)
After the 72-hour rollback window closes on the last wave, remove v1 fallback configs, revoke v1 long-lived tokens, and run codex migrate --strict across the fleet to confirm zero deprecated keys remain. The migration is done.
Goal: leave it clean

Two practical operating rules for the phased rollout. First, keep the codemod and the human edits in separate commits — a reviewer reading the PR should be able to see the mechanical changes at a glance and focus their attention on the non-mechanical ones. Second, every wave should produce a short retrospective note (what worked, what didn't, what changed for the next wave) — the second and third waves are cheaper precisely because the pilot's lessons compound.

One temptation to resist: combining the v2 migration with other related changes (a model upgrade, a sandbox tightening, a profile reorganisation) into one cut-over. Each of those changes is worth doing on its own merits but bundling them into the migration window makes diagnosis harder when something breaks. Ship the v2 migration first, prove it's stable for a week, then make the other changes as deliberate follow-up PRs.

The rollback discipline
The 72-hour rollback window is the cheapest insurance in this migration — don't treat the cut-over as a one-way move. Each wave should preserve a last-known-good v1 config branch for the rollback duration; if a downstream consumer breaks in a way you can't diagnose in the moment, the right move is revert to v1, fix forward, retry. Heroic forward-only debugging during a cut-over is how migrations consume a week instead of a day.

07 · Common Pitfalls · Four upgrade failures with diagnostic signals.

The migration failures below are the ones we see most often across teams — each has a clear diagnostic signal that points at the cause. None are catastrophic if caught early; all are painful if caught late.

Most common v1 → v2 migration failures · share
  • Silent v1 key still in config: 38%. v2 strict-mode rejects unknown keys.
  • CI workflow auth not migrated: 27%. Stale long-lived token in repo secrets.
  • Sandbox network default surprise: 22%. Workload silently fails on network-requiring step.
  • Profile activation missing: 13%. Default profile applied instead of intended one.
Source: Digital Applied migration support, May 2026 · n = 23 incidents

Failure 01 — Silent v1 key still in config. The most common failure by a wide margin. A custom v1 key (often added years ago for in-house tooling) survives the codemod sweep because the codemod doesn't know about it, and v2's strict-mode parser rejects it on first run. The diagnostic signal is a startup error mentioning "unknown key" with the offending key name. The fix is either removing the key (if obsolete), moving it under [extra] (if a custom tool reads it), or filing a feature request if the key represents a v1 capability v2 dropped.

Failure 02 — CI workflow auth not migrated. The team migrates configs and profiles in a sweep but forgets that CI workflows still hold the v1 long-lived token. v2 still accepts it (auth back-compat is generous), so nothing breaks — but the team loses the security benefit of mode 03 entirely. The diagnostic signal is a long-lived token still listed in the OpenAI org dashboard's active-tokens view a week after the cut-over. Audit on day seven and remediate any workflows still using v1 auth.

Failure 03 — Sandbox network default surprise. A workload that relied on v1's default-on network silently breaks because v2 defaults network off. The diagnostic signal is a workload failure with no obvious network-related error — Codex itself runs fine, but the generated code can't reach an outbound service it used to reach. The fix is adding network = true to the relevant [profiles.*.sandbox] block, but the more structural fix is questioning whether the workload should need outbound network at all.

Failure 04 — Profile activation missing. A team sets up named profiles but a wrapper script or CI workflow forgets to export CODEX_PROFILE or pass --profile. v2 falls back to the default profile (or to no profile at all, depending on config), and the wrong settings apply silently. The diagnostic signal is a Codex run that succeeds but produces output inconsistent with the intended profile — for example, a CI run that shouldn't have network access making outbound calls. Always grep your CI workflows for the activation pattern after the migration; the explicit-activation discipline is what makes profiles trustworthy.
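The grep for the activation pattern can be scripted so it runs on every workflow file. The sketch below is a coarse heuristic under stated assumptions: it treats any mention of CODEX_PROFILE in a file as covering every step in it, and it recognises a Codex call purely by the word codex followed by an argument. The function name is hypothetical.

```python
import re

# A line that invokes the CLI: the word "codex" followed by at least one argument.
CODEX_CALL = re.compile(r"\bcodex\s+\S")


def find_unactivated_calls(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for codex calls with no visible activation."""
    if "CODEX_PROFILE" in workflow_text:
        # Coarse assumption: the variable is exported for every step in the file.
        return []
    hits: list[tuple[int, str]] = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        if CODEX_CALL.search(line) and "--profile" not in line:
            hits.append((number, line.strip()))
    return hits
```

False positives (a comment that mentions codex, say) are cheap to dismiss by hand; the point is that an empty result across all workflows is a quick signal that activation is at least present everywhere.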

For broader context on the Codex ecosystem and how this migration sits inside it, our Codex test-generation pipeline tutorial walks the canonical CI-side pattern that benefits most from v2's headless auth, and the Claude Code custom subagent tutorial covers the parallel pattern in the Anthropic ecosystem — the architectural shape carries across vendors. Teams running CLI migrations across many surfaces should read our AI digital transformation engagements for the longer-form playbook on coordinated rollouts.

Wrapping up

CLI migrations are predictable when phased — pilot, wave, cut over, retire.

The Codex CLI v2 migration is the largest schema change the CLI has shipped, but it isn't a mysterious one. Four breaking-change axes, a codemod that handles the bulk of the mechanical work, a 72-hour rollback window that takes the edge off the cut-over, and a phased rollout pattern that spreads the risk across two to three weeks. Done well, it's a three- to five-day project; done ad-hoc, it eats a sprint and produces config sprawl that haunts the team for quarters.

The migration's payback is real and worth naming: per-profile separation that finally makes dev / CI / prod distinctions explicit, a headless CI auth mode that eliminates the long-lived shared-token rotation problem, and a config layout that's legibly sectioned rather than prefix-organised. Teams that adopt the new profile API deliberately (three named profiles, explicit activation, inherited base settings) find the post-migration config surface meaningfully smaller and harder to misconfigure.

The broader pattern is the one to keep. Treat every CLI migration as a phased rollout, not a big-bang cut-over. Separate codemod commits from human edits. Preserve a last-known-good config through the rollback window. Audit on day seven for migrations that look done but quietly aren't. The same shape applies to every CLI bump you'll do in the next two years — Codex, Claude Code, Gemini, whatever ships next — and the team that internalises it once stops dreading the upgrade cycle for good.

FAQ · Codex CLI v2

The questions teams ask before the CLI bump.

Not immediately — v2 preserves enough back-compat in the auth surface that v1 long-lived tokens continue to work during the 72-hour rollback window and beyond. What does require attention is the sandbox network default flip (workloads that relied on v1's default-on network will silently fail until network is explicitly enabled in the relevant profile) and the strict-mode config parser (any custom v1 keys your team added will be rejected). Plan an audit of every CI workflow after the migration: confirm sandbox flags match intent, confirm the workflow has migrated to the mode 03 short-lived-credential auth pattern, confirm profile activation is explicit. The migration itself is mechanical; the discipline is in the post-cutover audit.