
Claude Code vs Codex CLI: which terminal agent fits your workflow?

Marcus Reilly · June 15, 2026 · 7 min read

Two coding agents now dominate the terminal: Claude Code from Anthropic, and Codex CLI from OpenAI. They rhyme. You point either one at a repo, describe a task in plain English, and it reads files, edits them, runs commands, and reports back — no chat-window copy-paste, no leaving your shell. If you've used one, the other feels familiar within minutes.

But "familiar" hides real differences in how they behave once the work gets serious: how much they do without asking, how you steer them, and what grows up around each one. This is a balanced look at both — approach, autonomy, ecosystem, and configurability — so you can pick the one that fits how you actually work. The short version of the conclusion: you don't have to pick just one.

Same shape, different defaults: both edit code in your terminal, but they differ on autonomy, config, and how they're built.

At a glance:

Dimension	Claude Code	Codex CLI
Vendor	Anthropic	OpenAI
Approach	Planning-first (plan vs build modes)	Execution-first, fast iteration
Default autonomy	Ask-first; per-step approval	Approval modes + sandbox; set boundary up front
Instruction file	`CLAUDE.md`	`AGENTS.md` + `config.toml`
Tool extension	Hooks, subagents, SDK, MCP	MCP
Openness	Distributed by Anthropic, not open source	Open source (read, fork, self-host)
Best for	Ambiguous or risky changes; granular control	Well-scoped tasks; fast results behind a sandbox
With Backgrind	An overlay over your real CLI — run Claude Code or Codex in an always-on-top window with ambient notifications, so you can pick by the agent, not the terminal UX. Backgrind is not an agent itself.

Approach: how each one works a task

Both agents follow the same core loop — gather context, propose a change, run something, check the result — but the texture differs.

Claude Code leans into structured planning. It has a distinct read-only planning step (often called plan mode) where it maps out an approach before touching anything, separate from the build step where it executes. Using that separation well is its own small skill; if it's new to you, the plan mode vs build mode guide walks through when to use each. It also tends to narrate its reasoning as it goes, which makes it easy to interrupt and redirect mid-task.
Codex CLI is more execution-forward out of the box. Give it a task and it moves toward running code quickly, leaning on a sandbox to keep that safe (more on that below). It's built to feel like a fast pair-programmer that just does the thing, with iteration happening in the loop rather than in a separate plan-then-build handoff.

Neither approach is "better." Planning-first shines on ambiguous or risky changes where you want to see the map before the agent starts driving. Execution-first shines on well-scoped tasks where the fastest path to a running result wins.

Autonomy and permissions

This is the dimension that actually changes your day, because it decides how often the agent stops to ask you something.

Claude Code is ask-first by default. Before it edits a file or runs a shell command, it pauses for your approval, and you can grant or deny each step. You can widen that trust as you go, but the safe default is granular consent. It also exposes lifecycle hooks — events that fire when it wants attention, finishes, or is about to use a tool — which you can wire to your own scripts or guardrails.
Codex CLI centers on explicit approval modes plus a sandbox. You choose how much rope to give it — from suggest-only, to auto-editing within the workspace, to fuller autonomy — and it runs commands inside a sandboxed environment with file-write and network access constrained by the mode you picked. The mental model is "set the autonomy level once, then let it run within those walls," rather than approving each individual action.
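The lifecycle hooks mentioned above are wired to ordinary scripts, so a guardrail can be a few lines of code. What follows is a rough sketch, not the documented interface: it assumes the hook hands the script a JSON payload on stdin with `tool_name` and `tool_input.command` fields, that shell commands arrive under the tool name `Bash`, and that exit status 2 means "block this call." The deny list is hypothetical too. Check the current Claude Code hooks reference before relying on any of it.

```python
import json
import sys

# Hypothetical deny list; adjust to your own guardrails.
BLOCKED_SUBSTRINGS = ("rm -rf", "git push --force")


def decide(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one tool-use event.

    Assumed payload shape (verify against the hooks docs):
    {"tool_name": "Bash", "tool_input": {"command": "..."}}
    """
    if payload.get("tool_name") != "Bash":
        return 0, ""  # only shell commands are inspected
    command = (payload.get("tool_input") or {}).get("command", "")
    for pattern in BLOCKED_SUBSTRINGS:
        if pattern in command:
            # Assumed convention: exit status 2 blocks the tool call.
            return 2, f"Blocked by guard: command contains {pattern!r}"
    return 0, ""


def main() -> int:
    """Read one event from stdin, explain any block on stderr."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

Saved as a hook script, it would end with `sys.exit(main())` so the exit status reaches the agent. Keeping the decision in a pure function like `decide` makes the guard easy to test without running the agent at all.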

Both let you crank autonomy up to near-hands-off when you trust the task. The difference is the default posture: Claude Code starts cautious and you loosen it; Codex CLI asks you to declare the boundary up front and leans on sandboxing to make broader autonomy safer. If you're working toward a hands-off, vibe-coding-style flow, both can get there; you just configure them differently. The catch is the same with either: the more autonomous the agent, the easier it is to stop watching it, which is exactly when you miss the moment it stalls on a prompt or finishes. That's the problem behind babysitting your coding agent.
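To make the two default postures concrete, here is a toy model. It is not either tool's real permission logic: the action kinds, the mode labels (`suggest`, `auto-edit`, `full-auto`), and every function name are illustrative assumptions. It only counts how many times each posture would stop to ask during the same three-step task.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    READ = "read"
    EDIT = "edit"
    RUN = "run"


@dataclass(frozen=True)
class Action:
    kind: Kind
    target: str


def ask_first(action: Action, trusted: set[Kind]) -> str:
    """Ask-first posture: anything not yet trusted pauses for approval."""
    return "allow" if action.kind in trusted else "ask"


# Up-front posture: a mode picked once fixes what runs without asking.
# The mode labels are hypothetical stand-ins for "how much rope".
MODES = {
    "suggest": {Kind.READ},
    "auto-edit": {Kind.READ, Kind.EDIT},
    "full-auto": {Kind.READ, Kind.EDIT, Kind.RUN},
}


def boundary(action: Action, mode: str) -> str:
    return "allow" if action.kind in MODES[mode] else "ask"


def prompts(decide, actions) -> int:
    """Count how many actions would interrupt you."""
    return sum(1 for a in actions if decide(a) == "ask")


task = [
    Action(Kind.READ, "app.py"),
    Action(Kind.EDIT, "app.py"),
    Action(Kind.RUN, "pytest"),
]

trusted = {Kind.READ}  # cautious start
before = prompts(lambda a: ask_first(a, trusted), task)
trusted.add(Kind.EDIT)  # you widen trust mid-session
after = prompts(lambda a: ask_first(a, trusted), task)
fenced = prompts(lambda a: boundary(a, "full-auto"), task)
print(before, after, fenced)  # 2 1 0
```

The membership check is the same in both functions; what differs is when the allowed set gets decided. In the ask-first sketch it grows as you approve things, while in the boundary sketch it is fixed before the task starts.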

Configurability and project memory

Both agents read a project file so they pick up your conventions without you re-explaining them every session — but they use different names and config surfaces.

Claude Code uses CLAUDE.md for project and personal instructions — build commands, style rules, "don't touch this directory" — plus settings files for permissions and hooks, and MCP (Model Context Protocol) to plug in external tools and data sources. Its config story is broad: per-project, per-user, and global layers that compose.
Codex CLI uses AGENTS.md as its instruction file and a config.toml for settings like model choice, approval mode, and sandbox policy. It also speaks MCP, so the tool-extension story is comparable. AGENTS.md happens to be a more vendor-neutral convention that several tools have adopted, which is handy if you switch agents.

Practically: if you keep one instruction file per repo, you may end up with both a CLAUDE.md and an AGENTS.md. Many teams keep the real content in one and let the other point at it, so a single source of truth drives whichever agent is running.
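One way to set up that single source of truth is a small script that writes a thin CLAUDE.md pointing at AGENTS.md. This is a sketch under assumptions: the function name and pointer wording are made up for illustration, and the `@AGENTS.md` line relies on Claude Code's file-import syntax for CLAUDE.md, which you should confirm in the current docs. If that syntax is not honored, the plain-English sentence above it still tells the agent where to look.

```python
from pathlib import Path

# Hypothetical pointer content. The "@AGENTS.md" line assumes CLAUDE.md
# supports importing another file by path; verify before relying on it.
POINTER_TEXT = (
    "# Project instructions\n\n"
    "The shared instructions for this repo live in AGENTS.md.\n\n"
    "@AGENTS.md\n"
)


def link_instruction_files(repo: Path) -> bool:
    """Create CLAUDE.md as a thin pointer to AGENTS.md.

    Returns True if the pointer was written, False if nothing changed
    (AGENTS.md is missing, or a CLAUDE.md already exists and is kept).
    """
    source = repo / "AGENTS.md"
    pointer = repo / "CLAUDE.md"
    if not source.is_file() or pointer.exists():
        return False  # never overwrite hand-written instructions
    pointer.write_text(POINTER_TEXT, encoding="utf-8")
    return True
```

Calling `link_instruction_files(Path("."))` from a repo root is enough. The function refuses to overwrite an existing CLAUDE.md, so it is safe to run in repos that already have one.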

Ecosystem and how they're built

The surrounding tooling and openness differ in ways worth knowing before you commit.

Claude Code has a deep, fast-moving ecosystem: hooks, subagents, an SDK, MCP servers, and editor integrations, with Anthropic shipping changes frequently. The CLI itself is distributed by Anthropic rather than developed in the open, but the extension surface around it is large.
Codex CLI is open source, which means you can read exactly how it behaves, file issues against the actual code, and self-host or fork it. For developers who want to inspect or extend the agent itself — not just plug tools into it — that openness is a real differentiator.

Installing either is a quick terminal step. If you're starting from zero, see how to install Claude Code and how to install Codex CLI — and since both publishers revise their install commands between releases, check the official docs for the current command before you paste anything.

So which one?

Reach for Claude Code if you like a plan-then-build rhythm, want granular approval by default, and value the breadth of hooks and project config. Reach for Codex CLI if you want fast execution behind a sandbox, prefer declaring autonomy up front, or care that the agent is open source. Honestly, most people who try both end up keeping both: one for exploratory or risky work, the other for fast, well-scoped tasks. They're cheap to switch between, and a single instruction file (say, an AGENTS.md that your CLAUDE.md points to) can feed either.

Where Backgrind fits

You shouldn't have to choose your terminal agent based on which one your tooling supports. Backgrind wraps the real CLI you already use — your Claude Code, your Codex CLI, your own login and history — in an always-on-top overlay that floats over whatever you're doing and pings you only when the agent needs a decision or finishes. Run them side by side in multiple agent tabs, keep each one visible over any app, and stop babysitting either. Or skip the install and use Grindy, Backgrind's own hosted agent, with nothing to set up. See it in the live demo.