Codex CLI 2 is the most configurable agentic coding CLI shipping in 2026, and that configurability is its defining feature — every team that runs Codex in production touches config.toml, designs at least two profiles, picks a sandbox mode per workload, and wires at least one MCP server. Teams that treat configuration as an afterthought end up with a CLI that works on a laptop but fights them in CI; teams that invest a half-day in the config surface ship reliably.
The CLI's design philosophy is honest about the trade-off. Codex CLI is opinionated where the cost of a wrong default is high (sandbox defaults are conservative, approval policies start untrusted), and configurable where the right answer is workload-specific (model choice, network access, MCP servers, approval thresholds). The result is a CLI that can run a developer's exploratory edit, a CI test-generation pipeline, and an unattended production agent — from the same binary, with three different profiles, on the same machine.
This deep dive is the reference for engineers running Codex CLI 2 in production. It covers the config.toml schema and its six sections, the profile API and how to design slots for dev / CI / prod / agent, the three sandbox modes and which workloads each fits, the MCP-server integration surface and the agent-loop primitives it exposes, a five-axis comparison with Claude Code, and four production workflows that ship. The closing FAQ covers the questions engineers ask before standardising on Codex.
- 01 Per-profile configuration is the real leverage. Codex CLI 2's profile API lets one config file describe many environments (dev laptops, CI workers, production agents), each with its own model, sandbox, approval policy, and auth mode. Treat profiles as the unit of design and the rest of the config surface stays small.
- 02 Three sandbox modes: pick by workload risk. workspace-write for trusted developer edits, full-auto for short-lived sandboxed runs that need broad capability, read-only for production agents that observe but don't mutate. The matching of mode to workload is what separates a reliable Codex deployment from a brittle one.
- 03 MCP integration is on par with Claude Code. Codex CLI 2 supports the same Model Context Protocol surface as Claude Code: local server processes, remote endpoints, per-profile server lists. The agent loop exposes tool-call primitives that compose with MCP servers the same way Claude Code's does, so the integration story carries across the two CLIs.
- 04 Codex wins specific workflow archetypes. Test-generation pipelines, large-scale codemod runs, and CI-resident agentic refactors are the workloads where Codex CLI 2's headless auth and explicit profile activation pay back the configuration investment. For interactive editor-side work, Claude Code and Cursor are competitive.
- 05 Configuration is the differentiator, so invest in it. Teams that spend a half-day designing the profile layout, naming sandbox modes per workload, and wiring MCP servers deliberately get a CLI that runs predictably across environments. Teams that copy a config from the README and forget about it eventually pay for it in a 2am incident.
01 — Why Deep Dive
Codex CLI 2 is configurable — most teams configure it wrong.
The first time a team installs Codex CLI 2, the natural impulse is to copy the example config.toml from the README, set model = "gpt-5.5-codex", and start using the CLI. That config works — for one developer, on one laptop, doing exploratory edits. The moment the CLI moves into CI or onto a production agent, the README config becomes a liability: the sandbox is too permissive, the approval policy is wrong for unattended runs, the auth mode leaks long-lived tokens into CI logs, and the lack of profiles means every environment is fighting the same config.
The right mental model is to treat Codex CLI 2's config surface as a small, deliberately-designed declarative specification — six sections, named profiles, per-profile overrides — rather than a flat list of CLI flags transcribed to disk. Engineers familiar with Kubernetes manifests, Terraform modules, or CI workflow YAML already have the muscle memory: the schema is small, the activation is explicit, the defaults are conservative. Codex's value compounds when the config is designed; it leaks when the config is copy-pasted.
The remainder of this deep dive treats the configuration surface as the primary unit of analysis. Every section — schema, profiles, sandbox, MCP — is framed around the design decisions a team needs to make before standardising on Codex in production. The workflow archetypes at the end show how the decisions compose into shippable patterns.
"Codex CLI 2 rewards teams that design the config and punishes teams that copy-paste it. The CLI's defaults are conservative; the leverage is in the profile layout, not the flags."— Internal note, Digital Applied agentic engineering team
The deeper context — what changed between v1 and v2, why the schema migration was necessary, how teams should phase the cut-over — is covered in our Codex CLI v1 to v2 migration playbook. This deep dive assumes a team has either completed that migration or is starting fresh on v2; the focus here is the production configuration design rather than the upgrade path.
02 — config.toml
Schema, sections, defaults.
The Codex CLI 2 config.toml schema has six top-level sections. Each section is independent — a config can omit any section and the CLI applies sensible defaults — but production deployments typically use all six. The sections are [model], [sandbox], [approval], [auth], [profiles], and [mcp]; everything else lives under one of those.
The schema is structurally flat at the top level and nested under [profiles.*] for per-profile overrides. Any top-level setting can be overridden inside a profile — so the base layer establishes defaults and the profile layer specialises them. Most production configs have a small base layer (the values genuinely shared across every environment) and most of their content inside the profile sections.
# config.toml — full Codex CLI 2 schema reference
# Six top-level sections, each independently optional.
# ----- [model] -----
[model]
name = "gpt-5.5-codex"
# Optional reasoning / generation overrides
temperature = 0.2
max_output_tokens = 8192
# ----- [sandbox] -----
[sandbox]
mode = "workspace-write" # workspace-write | full-auto | read-only
network = false # default off in v2
writable_paths = ["./", "./tmp"] # explicit writable list
forbidden_paths = ["~/.ssh", "~/.aws"]
# ----- [approval] -----
[approval]
policy = "untrusted" # never | trusted | untrusted
require_confirm_on = ["git_push", "npm_publish"]
# ----- [auth] -----
[auth]
mode = "long-lived-token" # oauth | long-lived-token | ci-issued
token_env = "CODEX_AUTH_TOKEN" # env var reference, not literal
# ----- [profiles.*] -----
[profiles.dev]
extends = "base"
[profiles.dev.sandbox]
mode = "workspace-write"
network = true
[profiles.dev.approval]
policy = "never"
[profiles.dev.auth]
mode = "oauth"
[profiles.ci]
extends = "base"
[profiles.ci.sandbox]
mode = "workspace-write"
network = false
[profiles.ci.auth]
mode = "ci-issued"
[profiles.prod]
extends = "base"
[profiles.prod.sandbox]
mode = "read-only"
network = false
# ----- [mcp] -----
[mcp.servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]
[mcp.servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN_ENV = "GH_MCP_TOKEN" }
[mcp.servers.postgres]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres"]
env = { POSTGRES_URL_ENV = "DATABASE_URL" }
Two things to notice in the example above. First, the [profiles.*] entries inherit from a notional base profile via extends; the base is the top-level [model], [sandbox], [approval], and [auth] sections taken together. Second, the [mcp] section is declared once at the top level and applies to every profile by default. Individual profiles can override or extend the server list, but most teams find a single shared MCP server roster sufficient.
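One way to picture the inheritance is as a recursive merge of a profile's tables over the top-level sections. The sketch below illustrates that reading only; it is not Codex's resolver. `resolve_profile`, `deep_merge`, and the dict layout are hypothetical names, and the rule that profile values always win over base values is an assumption.

```python
from copy import deepcopy

# Hypothetical in-memory form of a trimmed version of the config above.
CONFIG = {
    "model": {"name": "gpt-5.5-codex"},
    "sandbox": {"mode": "workspace-write", "network": False},
    "approval": {"policy": "untrusted"},
    "auth": {"mode": "long-lived-token", "token_env": "CODEX_AUTH_TOKEN"},
    "profiles": {
        "dev": {
            "extends": "base",
            "sandbox": {"mode": "workspace-write", "network": True},
            "approval": {"policy": "never"},
            "auth": {"mode": "oauth"},
        },
        "prod": {
            "extends": "base",
            "sandbox": {"mode": "read-only", "network": False},
        },
    },
}

# Assumption: these top-level sections together form the notional base.
BASE_SECTIONS = ("model", "sandbox", "approval", "auth")


def deep_merge(base, override):
    """Return a new dict in which override's keys win; nested tables merge."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_profile(config, name):
    """Overlay one named profile on the top-level (base) sections."""
    profiles = config.get("profiles", {})
    if name not in profiles:
        raise KeyError(f"unknown profile: {name}")
    base = {s: config[s] for s in BASE_SECTIONS if s in config}
    overrides = {k: v for k, v in profiles[name].items() if k != "extends"}
    return deep_merge(base, overrides)


dev = resolve_profile(CONFIG, "dev")
prod = resolve_profile(CONFIG, "prod")
print(dev["auth"])
print(prod["approval"]["policy"], prod["sandbox"]["mode"])
```

Under this reading, a profile that overrides only auth.mode still carries the base token_env, which is worth checking when a dev profile switches to OAuth.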
Model configuration
Selects the underlying model and any generation overrides (temperature, max_output_tokens, top_p). Per-profile model selection is supported — a CI profile can pin a different model than the dev profile if the workload demands it.
Required in practice
Sandbox containment
Controls what the CLI is allowed to mutate. Mode is the headline setting (workspace-write / full-auto / read-only); network, writable_paths, and forbidden_paths refine the policy. v2 defaults are conservative — opt in to permissive settings explicitly.
Read section 04
Approval policy
Governs which actions require explicit confirmation. 'never' is the right setting for trusted developer laptops, 'untrusted' for any unattended run, 'trusted' for the narrow middle ground. require_confirm_on adds named-action overrides on top of the policy.
Match to workload
Authentication mode
Selects between interactive OAuth, long-lived service token, and short-lived CI-issued credential. Per-profile auth is the v2 win — dev profiles use OAuth, CI profiles use ci-issued, prod profiles use long-lived tokens read from environment variables.
Per-profile preferred
Profile definitions
Named profiles with optional inheritance. Each profile carries its own model, sandbox, approval, and auth — activated explicitly via --profile flag or CODEX_PROFILE env var. The unit of environment separation.
The leverage section
MCP server registry
Lists the Model Context Protocol servers Codex can invoke as tool-call primitives. Each server has a command, args, and optional environment-variable mappings. Servers are shared across profiles by default; per-profile overrides are supported when a profile needs a narrower or wider tool surface.
Tool-call primitives
The schema is strict: Codex CLI 2 rejects unknown top-level keys by default, which catches typos and stale v1 settings that survived a migration. Custom keys for in-house tooling live under [extra], a reserved section v2 ignores but preserves for tool-specific extensions. Production configs should keep [extra] small and documented; it is an escape hatch, not a primary surface.
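The strict top-level check is easy to reason about as a set difference. The following is a minimal sketch of that idea, assuming the allowed keys are exactly the six sections plus [extra]. `check_top_level`, `UnknownKeyError`, and the sample keys are hypothetical, and the real CLI's error format is not shown in this article.

```python
# The six sections named above, plus the reserved [extra] table.
ALLOWED_TOP_LEVEL = frozenset(
    {"model", "sandbox", "approval", "auth", "profiles", "mcp", "extra"}
)


class UnknownKeyError(ValueError):
    """Raised when a config carries a top-level key outside the schema."""


def check_top_level(config):
    """Return the config unchanged, or raise if it has unknown top-level keys."""
    unknown = sorted(set(config) - ALLOWED_TOP_LEVEL)
    if unknown:
        raise UnknownKeyError("unknown top-level keys: " + ", ".join(unknown))
    return config


candidate = {
    "model": {"name": "gpt-5.5-codex"},
    "sanbox": {"mode": "read-only"},      # typo for "sandbox"
    "extra": {"owner": "platform-team"},  # hypothetical in-house key
}

try:
    check_top_level(candidate)
    message = ""
except UnknownKeyError as err:
    message = str(err)

print(message)
```

The misspelled section is rejected while the [extra] table passes, which is the behaviour the paragraph above describes.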
One under-appreciated default worth calling out: writable_paths defaults to the workspace root only, and forbidden_paths defaults to a curated list of common credential and config locations (~/.ssh, ~/.aws, ~/.config, and similar). Teams that want broader write access need to add it explicitly; teams that want to tighten the forbidden list further can extend it without touching the defaults. The two-layer model (curated denylist plus team allowlist) is the right shape for a sandboxed CLI.
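The two-layer model can be sketched as a check where the denylist is consulted first and the allowlist second. This is an illustration under stated assumptions, not the CLI's enforcement code: `is_write_allowed` is a hypothetical helper, the comparison is purely lexical on POSIX-style paths (symlinks are not followed), and the `./secrets` entry is an invented team extension to the forbidden list.

```python
import os.path


def _normalise(path, workspace):
    """Expand ~, anchor relative paths at the workspace, and collapse '..'."""
    path = os.path.expanduser(path)
    if not os.path.isabs(path):
        path = os.path.join(workspace, path)
    return os.path.normpath(path)


def _is_within(path, root):
    """True if path equals root or sits underneath it."""
    return path == root or path.startswith(root.rstrip(os.sep) + os.sep)


def is_write_allowed(target, workspace, writable_paths, forbidden_paths):
    """Denylist wins; otherwise the target must sit under an allowlisted path."""
    target = _normalise(target, workspace)
    for entry in forbidden_paths:
        if _is_within(target, _normalise(entry, workspace)):
            return False
    return any(
        _is_within(target, _normalise(entry, workspace))
        for entry in writable_paths
    )


WORKSPACE = "/workspace"
WRITABLE = ["./"]
FORBIDDEN = ["~/.ssh", "~/.aws", "./secrets"]  # last entry is hypothetical

print(is_write_allowed("src/app.py", WORKSPACE, WRITABLE, FORBIDDEN))
print(is_write_allowed("secrets/key.pem", WORKSPACE, WRITABLE, FORBIDDEN))
print(is_write_allowed("../etc/passwd", WORKSPACE, WRITABLE, FORBIDDEN))
```

Checking the denylist first means a broad allowlist entry cannot accidentally re-open a credential directory.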
Run codex config validate in CI to catch drift, and avoid ad-hoc edits on developer laptops that don't propagate back. The config is a contract between the CLI and the team, and the contract is only useful if it's versioned and enforced.
03 — Profiles
Per-profile model, sandbox, approval, MCP.
The profile API is where Codex CLI 2 separates itself from every prior agentic CLI. A profile is a named bundle of settings — model, sandbox, approval, auth, and MCP overrides — that activates as a single unit. One config.toml file can declare any number of profiles, and activation is always explicit: either --profile dev on the command line or CODEX_PROFILE=dev in the environment.
The judgement call is which profiles to declare. The shape that ages best is profiles named after environments rather than people or projects — dev, ci, prod, and optionally agent for long-running production agents that deserve their own auth and approval policy. Teams that name profiles after people (alice, bob) discover the surface doesn't scale; teams that name them after projects (migration, refactor-2026) end up with stale profiles cluttering the config. Environments are the durable axis.
dev — developer laptops
workspace-write sandbox with network enabled, approval policy 'never' for trusted developers, interactive OAuth auth, full MCP server roster. The most permissive profile — appropriate because a human is at the keyboard and reviews every edit before it ships.
OAuth · workspace-write
ci — CI workers
workspace-write sandbox but network disabled — generated tests and codemods don't need outbound calls. Approval 'untrusted' so any sensitive operation halts the run for human review. Short-lived ci-issued credential auth. The right balance for unattended pipelines.
ci-issued · workspace-write
prod — production agents
read-only sandbox, network disabled, approval 'untrusted', long-lived service token rotated quarterly. For agents that observe and report — incident triage, log summarisation, scheduled audits — without mutating the codebase. The least-privilege profile.
long-lived-token · read-only
agent — autonomous workers
Optional fourth profile for long-running autonomous agents that genuinely need to write code in production. workspace-write narrowed to a single output directory, network scoped to a known allowlist, dedicated auth tokens with audit logging. Used by a minority of teams — most don't need a profile beyond dev / ci / prod.
Audit-logged · narrowed
Two practical rules for designing profiles. First, default to inheriting from a base profile rather than duplicating settings; extends = "base" at the top of any profile section keeps shared defaults in one place and makes the deltas obvious. Second, keep the profile count low until pressure forces an addition: three profiles is the common landing point, four is reasonable for teams running autonomous agents, five or more is usually a sign that something else (project-level config files, environment variables, command-line flags) is being misused as a profile substitute.
Activation discipline matters as much as profile design. Every CI workflow, every wrapper script, every long-running agent should set CODEX_PROFILE or pass --profile explicitly; relying on the default profile leads to confused incidents where the wrong settings apply silently. The pattern that holds up is treating the profile flag as required everywhere except the developer laptop, where shell completion or a default in the shell rc file handles it.
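A wrapper script can enforce that discipline by refusing to run when no profile is named. The sketch below is one hedged way to do it: `select_profile` and `MissingProfileError` are hypothetical names, only the `--profile NAME` form shown in this article is recognised, and the rule that the flag takes precedence over CODEX_PROFILE is an assumption rather than documented behaviour.

```python
class MissingProfileError(RuntimeError):
    """Raised when neither --profile nor CODEX_PROFILE names a profile."""


def select_profile(argv, environ):
    """Return the explicitly requested profile name, or raise."""
    # Flag form used in the article: --profile NAME
    if "--profile" in argv:
        index = argv.index("--profile")
        if index + 1 < len(argv):
            return argv[index + 1]
        raise MissingProfileError("--profile was given without a name")
    # Fall back to the environment variable, ignoring blank values.
    name = environ.get("CODEX_PROFILE", "").strip()
    if name:
        return name
    raise MissingProfileError("set CODEX_PROFILE or pass --profile explicitly")


print(select_profile(["codex", "--profile", "ci"], {"CODEX_PROFILE": "dev"}))
print(select_profile(["codex"], {"CODEX_PROFILE": "prod"}))

try:
    select_profile(["codex"], {})
except MissingProfileError as err:
    print("refused:", err)
```

In a real wrapper the inputs would be `sys.argv` and `os.environ`; passing them as arguments keeps the guard easy to test.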
A workflow that never sets CODEX_PROFILE silently runs under the default profile. The settings are wrong, the run succeeds anyway because Codex doesn't know it's misconfigured, and the output is subtly broken in ways nobody notices for a sprint. Audit every workflow for explicit profile activation after any config change.
04 — Sandbox
workspace-write, full-auto, read-only.
Codex CLI 2 ships three sandbox modes, each designed for a specific risk profile. The mode is the headline setting in the [sandbox] section; everything else (network, writable_paths, forbidden_paths) refines the policy within the chosen mode. The matching of mode to workload is what separates a reliable Codex deployment from a brittle one — a workspace-write profile running an unattended production agent is asking for an incident; a read-only profile blocking a developer's exploratory edit is asking for friction.
workspace-write
writes scoped to workspace
The CLI can read anything within the workspace and write anywhere the writable_paths allowlist permits. Forbidden paths (credentials, system config) are blocked. Network access is profile-controlled. The right mode for developer laptops and CI workers that need to mutate the codebase.
Default for dev and CI
full-auto
broad capability · short-lived
A more permissive mode for sandboxed short-lived runs: temporary directories, ephemeral containers, throwaway VMs. The CLI can write outside the workspace if the writable_paths allowlist permits, and approval policies are typically 'never' or 'trusted'. Suitable for codemod batch runs in ephemeral environments.
Sandboxed batch runs
read-only
observability and triage agents
The CLI can read any path the OS permits but cannot mutate the filesystem. Network access is profile-controlled. The right mode for production agents that observe and report (incident triage, scheduled audits, dashboard refreshes) without ever touching the codebase.
Production-default
The defaults inside each mode are conservative and worth knowing. workspace-write defaults to network off, writable_paths = ["./"], and the curated forbidden_paths denylist. full-auto defaults to network off but with a broader writable_paths default that includes /tmp and the workspace. read-only defaults to network off and ignores writable_paths entirely. Teams that want network access in any mode need to opt in explicitly per profile.
One subtle but important behaviour: read-only mode is genuinely read-only. The CLI cannot write to a temporary directory by default, where v1's read-only mode allowed temp writes silently. If a production agent legitimately needs scratch space, the right setting is read-only-with-tmp (a sub-mode of read-only), which permits writes only to the OS temp directory. Most production agents don't actually need it (they were inheriting a v1 default), but a minority do, and the explicit opt-in is the right design.
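The per-mode defaults read naturally as a lookup table with profile settings layered on top. The sketch below encodes only what the two paragraphs above state; `MODE_DEFAULTS` and `effective_sandbox` are hypothetical names, and the exact default lists inside the real CLI may differ.

```python
import tempfile

# Defaults as described in the text; the curated forbidden_paths list is omitted.
MODE_DEFAULTS = {
    "workspace-write": {"network": False, "writable_paths": ["./"]},
    "full-auto": {"network": False, "writable_paths": ["./", "/tmp"]},
    "read-only": {"network": False, "writable_paths": []},
    "read-only-with-tmp": {
        "network": False,
        "writable_paths": [tempfile.gettempdir()],
    },
}


def effective_sandbox(settings):
    """Combine a profile's [sandbox] settings with the defaults for its mode."""
    mode = settings["mode"]
    if mode not in MODE_DEFAULTS:
        raise ValueError(f"unknown sandbox mode: {mode}")
    defaults = MODE_DEFAULTS[mode]
    effective = {
        "mode": mode,
        # Network is opt-in: only an explicit setting turns it on.
        "network": settings.get("network", defaults["network"]),
        "writable_paths": list(defaults["writable_paths"]),
    }
    # The read-only modes ignore any writable_paths the profile supplies.
    if not mode.startswith("read-only") and "writable_paths" in settings:
        effective["writable_paths"] = list(settings["writable_paths"])
    return effective


prod = effective_sandbox({"mode": "read-only", "writable_paths": ["./"]})
scratch = effective_sandbox({"mode": "read-only-with-tmp"})
dev = effective_sandbox({"mode": "workspace-write", "network": True})
print(prod)
print(dev)
```

A production profile that still lists writable_paths from its v1 days resolves to an empty allowlist here, which matches the stricter v2 behaviour described above.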
Figure: Codex CLI 2 sandbox mode mix · share of profiles by mode
Source: Digital Applied internal benchmark, May 2026 · n = 31 production Codex CLI 2 deployments
The honest reading of the mix is that most teams settle on a two-mode shape: workspace-write for dev and CI, read-only for production agents, with full-auto used only for specific batch workloads. The teams that mix modes per workload tend to be running mature agentic platforms with five or more distinct Codex use cases; the rest find the two-mode shape sufficient and don't benefit from added complexity.
05 — MCP
Server config and agent-loop primitives.
Codex CLI 2 supports the Model Context Protocol as a first-class extension surface — the same protocol Claude Code popularised — which means the MCP server ecosystem is portable across both CLIs. The [mcp] section in config.toml declares the server roster, each server is launched as a child process when Codex starts, and the tools the server exposes become tool-call primitives the agent loop can invoke. The integration surface is symmetrical with Claude Code; the operational details differ in small ways.
The agent-loop primitives Codex exposes are the standard MCP shape: list_tools, call_tool, list_resources, read_resource, and the prompts surface for parameterised prompt templates. Codex adds two of its own primitives on top: a sandbox-aware file_edit that respects the profile's writable_paths, and a propose_command primitive that surfaces shell commands for approval rather than executing them blindly. Both are optional, both are off by default, and both interact with the approval policy in the obvious way.
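On the wire, list_tools and call_tool correspond to the MCP JSON-RPC methods tools/list and tools/call. The sketch below builds those two request messages to show their shape. It skips the initialisation handshake and any transport, `rpc_request` is a hypothetical helper, and the tool name and path are illustrative values rather than anything a given server is guaranteed to expose.

```python
import json
from itertools import count

_request_ids = count(1)


def rpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request dict with a fresh integer id."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = rpc_request("tools/list")

# Invoke one tool by name with a JSON object of arguments.
call_tool = rpc_request(
    "tools/call",
    {
        "name": "read_file",  # illustrative tool name
        "arguments": {"path": "/workspace/README.md"},  # illustrative path
    },
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

Because the message shape is defined by the protocol rather than the CLI, the same requests apply to any MCP client, which is what makes the server roster portable.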
# [mcp] section — common server roster
# Filesystem access, scoped to the workspace
[mcp.servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"]
# GitHub API for PR / issue manipulation
[mcp.servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN_ENV = "GH_MCP_TOKEN" }
# Postgres for data-aware code generation
[mcp.servers.postgres]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres"]
env = { POSTGRES_URL_ENV = "DATABASE_URL" }
# Per-profile override — CI gets a narrower server list
[profiles.ci.mcp.servers]
filesystem = { inherit = true }
github = { inherit = true }
# postgres explicitly omitted in CI to prevent inadvertent queries
The per-profile override pattern at the bottom of the example is the production discipline worth adopting. Production CI profiles should declare exactly which MCP servers they need, inheriting the base config's server definitions but narrowing the active set. The result is a CI run that cannot reach the production database even if the base config declares the postgres server — least-privilege applied to tool surfaces, not just filesystem paths.
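The narrowing can be modelled as selecting a subset of the base roster. This is a sketch of the semantics as the paragraph describes them, not the CLI's implementation: `active_servers` is a hypothetical function, and treating a profile with no override as inheriting the whole roster follows the shared-by-default behaviour stated earlier.

```python
# Base roster, mirroring the [mcp.servers.*] tables above (env maps omitted).
BASE_SERVERS = {
    "filesystem": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"],
    },
    "github": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-github"],
    },
    "postgres": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-postgres"],
    },
}

# Mirrors [profiles.ci.mcp.servers]: postgres is deliberately left out.
CI_OVERRIDE = {
    "filesystem": {"inherit": True},
    "github": {"inherit": True},
}


def active_servers(base, override=None):
    """Return the servers a profile may launch."""
    if override is None:
        # No per-profile table: the full roster is shared.
        return dict(base)
    active = {}
    for name, entry in override.items():
        if entry.get("inherit"):
            if name not in base:
                raise KeyError(f"cannot inherit undefined server: {name}")
            active[name] = base[name]
        else:
            # A profile-local definition replaces or adds a server.
            active[name] = entry
    return active


ci_servers = active_servers(BASE_SERVERS, CI_OVERRIDE)
print(sorted(ci_servers))
print(sorted(active_servers(BASE_SERVERS)))
```

The CI result contains only filesystem and github, so a prompt that tries to reach the database has no tool to call.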
06 — vs Claude Code
Five-axis comparison.
Codex CLI 2 and Claude Code are the two production-grade agentic CLIs in 2026, and most teams running serious agentic engineering eventually pick one as the default and use the other for specific workloads. The honest comparison is five-axis rather than headline-feature: configuration model, sandbox design, MCP surface, headless auth, and editor integration. Each axis has a winner; no CLI sweeps all five.
Configuration model
Codex CLI 2 ships a structured config.toml with explicit sections, named profiles, and per-profile overrides. Claude Code uses a flatter settings.json with project-level CLAUDE.md files for behaviour. Codex wins for teams that want declarative configuration in source control; Claude Code wins for teams that prefer behaviour-as-documentation.
Codex CLI 2
Sandbox design
Codex CLI 2 ships three explicit sandbox modes with profile-scoped activation. Claude Code uses a permission system (read / edit / run) layered on tool calls with optional --dangerously-skip-permissions for trusted sessions. Codex's mode-based design is more legible for production; Claude Code's permission model is more flexible for interactive work.
Tie — different shapes
MCP surface
Both CLIs support the Model Context Protocol natively. Codex CLI 2 has the per-profile override pattern (narrowing server lists per environment); Claude Code has tighter integration with its subagent system. The same MCP server runs unchanged under both — the ecosystem is portable.
Tie — protocol is portable
Headless auth
Codex CLI 2 ships three explicit auth modes including a short-lived CI-issued credential pattern that eliminates the long-lived-token rotation problem. Claude Code currently relies on long-lived API keys for headless use. Codex wins clearly for CI-resident agentic workloads.
Codex CLI 2
Editor integration
Claude Code ships an officially-supported VS Code extension with diff-review UI, plan mode, and subagent surfacing. Codex CLI 2 is terminal-first with optional editor plugins via MCP. Claude Code wins clearly for interactive editor-side work; Codex's terminal-first design suits CI and unattended runs better.
Claude Code
The pattern that emerges from teams running both CLIs in production: Codex CLI 2 for CI-resident agentic workloads (test generation, codemod runs, scheduled audits) where the headless auth and per-profile config pay back the configuration investment; Claude Code for interactive editor-side work (refactors, feature builds, code review) where the VS Code integration and subagent system reduce friction. The choice isn't exclusive (most mature teams use both), and the MCP-server roster carries across the boundary, so the tool-call investment is durable either way.
For broader CLI context including Windsurf, Cursor, and the agentic-editor surface, our AI transformation engagements cover the longer-form evaluation pattern. The TL;DR for most teams is: pick one CLI as the default for interactive work, run the other in CI where its strengths shine, and treat the MCP server roster as the portable layer that survives the choice.
07 — Workflows
Four production workflows that ship.
The four workflow archetypes below are the patterns we see most often across production Codex CLI 2 deployments. Each has a characteristic profile design, sandbox mode, auth pattern, and MCP-server roster. The patterns aren't mutually exclusive — most teams run two or three of them — but isolating them as archetypes makes the configuration design concrete.
Test-generation pipeline
ci profile · workspace-write · ci-issued
CI workflow runs Codex against a diff to generate or update tests. Profile: ci. Sandbox: workspace-write with network disabled. Auth: ci-issued short-lived credential. MCP: filesystem and github only. The canonical Codex CLI 2 production workflow.
Most common archetype
Codemod batch run
agent profile · full-auto · long-lived-token
Ephemeral container runs a large-scale codemod across many repos. Profile: agent. Sandbox: full-auto with broad writable_paths inside the container. Auth: long-lived service token scoped to the codemod runner. MCP: filesystem and github. Used for fleet-wide refactors and migration sweeps.
Sandboxed batch
Incident-triage agent
prod profile · read-only · long-lived-token
Always-on agent that reads logs and code in response to alerts. Profile: prod. Sandbox: read-only. Auth: long-lived token rotated quarterly. MCP: filesystem (read-only), github, and a logs server. Produces summarised triage reports without ever mutating the codebase.
Observability surface
Interactive developer session
dev profile · workspace-write · oauth
A developer runs Codex on their laptop for exploratory edits. Profile: dev. Sandbox: workspace-write with network enabled. Auth: interactive OAuth. MCP: full server roster including any in-house tools. Approval policy 'never' because the developer reviews every edit.
The local default
The test-generation pipeline is the workflow that most teams adopt first because the value is immediate and the configuration is well-trodden. Our Codex test-generation pipeline tutorial walks the full pattern end-to-end, including the CI workflow YAML, the profile definition, and the failure modes to watch for. Workflows 02 through 04 build on the same foundation with progressively different trade-offs around containment and auth.
One operating rule that holds across all four archetypes: each workflow gets its own profile, even if two workflows share most settings. The cost of an extra profile is small (a few lines in config.toml); the cost of sharing a profile across distinct workflows is high (a change to one workflow's sandbox accidentally affects the other). Profile-per-workflow is the discipline that pays back across quarters.
Codex CLI 2 is config-first — get the profiles right and the workflows follow.
Codex CLI 2 is the most configurable agentic coding CLI shipping in 2026, and the configurability is the differentiator. Six sections in config.toml, named profiles that bundle model, sandbox, approval, and auth into a single activation unit, three sandbox modes matched to workload risk, and an MCP-server roster that carries across CLIs. Teams that treat configuration as the primary unit of design ship reliably; teams that copy a README config and forget about it pay for it eventually.
The profile API is the leverage. Three profiles (dev, ci, prod) cover the majority of production deployments, and a fourth agent profile handles the minority running long-lived autonomous workers. Activation is always explicit, settings inherit from a base profile, and the sandbox-mode choice flows from the profile design rather than the other way around. The discipline isn't complicated; it's consistent.
The broader pattern is the one to keep. Treat agentic CLIs as production specifications, not editor toys. Version the config in source control, validate in CI, design profiles per environment and workflow, and invest in portable tool-call infrastructure via MCP rather than CLI-specific extensions. The CLI you standardise on today won't be the only CLI you run two years from now — the configuration discipline that survives the next CLI change is the one worth practising now.