Complete Guide to GPT-5-Codex API and Prompting: System Prompt, Best Practices, and Coding Insights

OpenAI released the API for GPT-5-Codex. If you try to use it like GPT-5, you will get worse results. The point of this model is agentic and interactive coding with minimal prompting. It runs best inside Codex CLI and related tooling, and it follows a tight system prompt that is much shorter than GPT-5’s. Less prompt, more work done by the model.

What GPT-5-Codex Actually Is

GPT-5-Codex is a variant of GPT-5 trained on real software engineering workflows. It handles feature work, refactors, debugging, and proper code review. It can move quickly in interactive sessions, and it can also work for hours on complex tasks when needed. It is built to use tools like a shell and file editing, navigate repos, run tests, and present changes clearly.

It’s purpose-built for Codex CLI, the Codex IDE extension, GitHub, and cloud environments. It supports tool use, filesystem operations (subject to sandbox rules), and code execution during review and validation. You should use it for agentic and interactive coding cases, not as a general-purpose drop-in model.

Key API Differences You Must Know

  • Not a drop-in replacement for GPT-5: it needs different prompting.
  • Use the Responses API only: the model is served through Responses, not through other endpoints.
  • No verbosity parameter: the model does not accept one, and prompting for preambles or long summaries can break its flow and cause early stops.
  • Tooling: keep it to a terminal tool and apply_patch for file edits; avoid long tool descriptions.

The Prompting Rule: Less Is More

Codex was trained to do the prompting heavy lifting you used to spell out. If you dump your 2,000-word developer message into it, you’ll likely reduce quality. Start small. Add only the essential rules you truly need for your repo or environment. Adaptive reasoning is default, so you don’t need to ask it to think harder or move faster. It already calibrates effort to task difficulty.

Codex CLI System Prompt: What Matters

The Codex CLI developer message is the reference. It’s about 40% the length of the GPT-5 developer message. It sets hard rules for how the agent uses the shell, edits files, interacts with git, and asks for approvals. Read it first, then trim your own message to match its spirit.

  • Shell: pass arguments to execvp; most commands via ['bash', '-lc']; always set a workdir.
  • Fast search: prefer ripgrep; fall back if not available.
  • Editing: default to ASCII; comment only where code is non-obvious; do not revert user changes.
  • Planning: skip plans for trivial tasks; when you plan, keep it meaningful and update as you go.
  • Sandbox and approvals: know read-only vs workspace-write vs danger-full-access, and the approval policy. Request escalations correctly with a one-line justification.
  • Code review mode: list findings first, focus on bugs, risks, regressions, and missing tests. Keep the summary short.
  • Output style: concise, scannable, plain text; reference paths and key lines; avoid dumping large files or raw command output.
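If you are building a harness around these rules, the shell contract is easy to mirror. Here is a minimal sketch in Python; the function name and the timeout default are my own, not part of Codex:

```python
import subprocess

def run_shell(command: list[str], workdir: str, timeout: int = 60):
    """Run one shell tool call: the model supplies an argv list
    (usually ['bash', '-lc', '<script>']) and the harness always
    pins an explicit working directory."""
    return subprocess.run(
        command,
        cwd=workdir,           # always set a workdir; never inherit ambient CWD
        capture_output=True,
        text=True,
        timeout=timeout,
    )

# Example call in the shape Codex emits
proc = run_shell(["bash", "-lc", "echo ok"], workdir=".")
```

Passing an argv list (not a single string) keeps the execvp semantics the prompt asks for, and `cwd=` is what "always set a workdir" means in practice.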

The official prompt and guide are published in OpenAI's GPT-5-Codex Prompting Guide; read that before writing your own.

Tool Use: Keep It Tight

  • Terminal tool: the workhorse for repo discovery, running tests, installing packages, and quick checks.
  • apply_patch: use it for precise file edits that match Codex training. Avoid ad hoc file dumps.
  • Cut tool descriptions: get rid of fluff and examples the model doesn’t need.

OpenAI published a current apply_patch implementation here: apply_patch.py. If your agent writes files by concatenating big strings, expect worse quality and more mistakes compared to using apply_patch.
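For orientation, an apply_patch call wraps edits in a small text envelope, roughly like the sketch below; treat this as illustrative and check apply_patch.py for the authoritative grammar:

```
*** Begin Patch
*** Update File: src/bar.ts
@@ export function foo() {
-  return 1;
+  return 2;
*** End Patch
```

Add File and Delete File headers follow the same pattern. The point is that the model emits structured hunks against context lines instead of rewriting whole files as strings.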

No Preambles

GPT-5-Codex does not emit preambles. Do not ask for them. If you prompt for summaries or introductions, the model may stop early or cut off before finishing the real task. The Codex stack uses a separate summarizer when a summary is actually needed for UI.

Adaptive Reasoning Is Built In

Codex adjusts its effort automatically. Quick interactive commands get snappy responses. Hard tasks trigger deeper, longer reasoning, tool use, and multi-step validation. You don’t need to say “think step by step” or “be quick”; it already does that.

Code Review: A First-Class Skill

When you ask for a review, Codex switches into a strict review style:

  • Findings first: bugs, risks, regressions, missing tests.
  • Short summary after the issues, not before.
  • Use file and line references so you can jump right to the code.
  • Validate by running tests when possible.
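As a concrete, entirely hypothetical example, a review that follows these rules reads like:

```
Findings
1. src/pager.ts:42 — off-by-one in the page loop; the last page is never fetched.
2. src/pager.test.ts — no test covers the empty-result path.

Summary
One real bug plus one missing test; `npm test` reproduces the off-by-one.
```

Issues first with file:line references, validation noted, summary last and short.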

If you care about real-world agent evals, I covered a harder dataset for agents here: SWE-Bench Pro Commercial Dataset. It’s the right direction if you want to measure whether an agent can actually fix things in serious repos.

Frontend Defaults and How to Steer

Codex defaults to modern frontend taste: clean components, good structure. If you have strong opinions, state them briefly. Short, explicit guidance beats long style guides. For example:

Frontend guidance:
Framework: React + TypeScript
Styling: Tailwind CSS
Components: shadcn/ui
Icons: lucide-react
Charts: Recharts
Fonts: Inter or Geist

That’s enough. Don’t attach a 500-line UI rubric.

Sandboxing and Approvals: Use Them Correctly

Codex expects a harness with filesystem modes and an approval policy:

  • Filesystem sandbox: read-only, workspace-write, or danger-full-access.
  • Network access: restricted or enabled.
  • Approval policies: untrusted, on-failure, on-request, or never.

When a command needs elevated permissions, Codex is trained to request escalation with a short justification. If you are in never mode, it will work around constraints without asking. Don’t fight these rules; design your agent to pass the right parameters so Codex can keep moving.
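The decision surface is small enough to sketch. This is a hypothetical harness-side gate, not Codex's actual implementation; the mode and policy names come from the lists above:

```python
def gate(action: str, sandbox: str, approval: str) -> str:
    """Decide how to handle a tool call: 'run', 'ask' (surface the
    model's one-line justification), or 'deny' (model plans around it)."""
    writes = action in {"write", "delete", "install"}
    if sandbox == "danger-full-access":
        return "run"
    if sandbox == "read-only" and writes:
        # In never mode there is no one to ask, so the call is refused
        # and the model is expected to work around the constraint.
        return "deny" if approval == "never" else "ask"
    if sandbox == "workspace-write" and action == "install" \
            and approval in {"untrusted", "on-request"}:
        return "ask"
    return "run"
```

The exact escalation wire format is up to your harness; what matters is that every outcome here is one Codex is trained to expect.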

Output Rules Worth Keeping

  • Plain text, concise, scannable. The CLI or UI will handle styling.
  • Reference files by path and line; don’t paste large files or long logs.
  • For code changes: lead with a quick explanation, then context; end with next steps only if there are obvious ones.

Minimal Prompt Templates That Work

If you’re migrating from a heavy agent prompt, try this cut-down developer message structure:

You are a coding agent running on the user's machine.
- Use the shell via ['bash', '-lc'] and always set a workdir.
- Prefer rg for search; fall back if missing.
- Edit files with apply_patch; default to ASCII; add brief comments only when needed.
- Respect sandbox and approvals; request escalation with a one-line justification.
- For reviews: list findings first with file:line refs; run tests when possible.
- Keep output concise; reference file paths; avoid dumping large files or logs.

Add only the absolute minimum repo-specific guidance after that. For example, test command names, package managers, or your preferred frontend stack. That’s it.
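If you assemble the developer message programmatically, keep the repo-specific part to a few one-liners appended after the base rules. A sketch, with the function name my own:

```python
BASE_RULES = """You are a coding agent running on the user's machine.
- Use the shell via ['bash', '-lc'] and always set a workdir.
- Prefer rg for search; fall back if missing.
- Edit files with apply_patch; default to ASCII; add brief comments only when needed.
- Respect sandbox and approvals; request escalation with a one-line justification.
- For reviews: list findings first with file:line refs; run tests when possible.
- Keep output concise; reference file paths; avoid dumping large files or logs."""

def developer_message(repo_notes: list[str]) -> str:
    # Repo notes should stay a handful of one-liners, not a style guide.
    if not repo_notes:
        return BASE_RULES
    return BASE_RULES + "\n\nRepo notes:\n" + "\n".join(f"- {n}" for n in repo_notes)

msg = developer_message(["Tests: npm test", "Package manager: pnpm"])
```

If the notes list ever grows past a screen, you are back to over-prompting.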

Responses API: A Minimal Example

You don’t need a complex wrapper. Keep the call small, wire in the two tools, and set your developer message to the trimmed version above. Here’s a schematic outline you can adapt:

// Pseudocode: minimal Responses API call
client.responses.create({
  model: "gpt-5-codex",
  input: [
    { role: "developer", content: "<your trimmed developer message>" },
    { role: "user", content: "Add a unit test for foo() in bar.test.ts and fix failing CI." }
  ],
  tools: [
    { type: "shell" },
    { type: "apply_patch" }
  ]
});

Notes:

  • Always set workdir on shell calls inside tool use.
  • Keep tool descriptions short to avoid wasting context.
  • Do not pass a verbosity parameter. Codex ignores it and it can hurt behavior.

Troubleshooting: Common Pitfalls and Fixes

  • Over-prompting: Long developer messages reduce quality. Cut most of it. Keep only the essentials.
  • Early stop after a preamble request: Remove the preamble or summary request. Codex doesn’t emit preambles.
  • rg is missing: Codex prefers ripgrep. If your environment lacks it, either install it or expect fallback to slower search commands.
  • Approval loops: If your harness doesn’t pass escalation parameters when Codex requests them, it will stall. Fix the harness to honor with_escalated_permissions and justification.
  • Shallow review findings or noisy warnings: Give the model better feedback loops. Run tests and linters. Point to canonical scripts. Avoid trying to force depth with extra prompt text.
  • Dumped files or log spam: Reinforce the output rule in the dev message: reference paths and key lines only.
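When debugging the approval-loop case, it helps to see what an escalated call looks like on the wire. Schematically, with the two field names taken from above and the surrounding shape an assumption about your harness, not a documented schema:

```python
import json

# Hypothetical escalated shell call as the harness would receive it.
# `with_escalated_permissions` and `justification` mirror the parameter
# names above; verify the rest against your harness's tool schema.
escalated_call = {
    "command": ["bash", "-lc", "npm install --save-dev vitest"],
    "workdir": "/repo",
    "with_escalated_permissions": True,
    "justification": "npm install needs network access outside the sandbox",
}

payload = json.dumps(escalated_call)
```

If your harness drops either field on the way back, Codex waits on an approval that never resolves; that is the stall.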

Where GPT-5-Codex Fits

  • Codex CLI and IDE extension: interactive and long-running coding sessions.
  • GitHub and cloud environments: reading and editing files, running CI-like checks locally.
  • Agent frameworks: use Responses API, wire in shell and apply_patch, and keep the prompt short.

Performance Notes

  • It performs better than GPT-5 on agentic coding tasks and code review in public benchmarks.
  • On simple tasks, token use can drop dramatically compared to GPT-5.
  • On complex work, it will spend more time and tokens to finish and validate.

Security and Safety Defaults Worth Keeping

  • Do not revert user changes unless explicitly asked. Assume a dirty worktree and proceed carefully.
  • For destructive actions, request approval with a short justification if policy allows it.
  • If sandbox is read-only, request approval for any write action or adjust the plan to avoid writes.

Migration Checklist

  • Switch to Responses API and remove the verbosity parameter.
  • Replace your developer message with a minimal version modeled on the Codex CLI prompt.
  • Expose shell and apply_patch; remove other tools unless they are essential.
  • Wire up sandbox modes and approval policy; ensure escalations pass with a one-line justification.
  • Set default behaviors: reference file paths, avoid big dumps, and prefer ripgrep for search.
  • For frontend repos, add a short stack preference section if needed.

FAQ

Can I just switch my GPT-5 agent to GPT-5-Codex?
Not without editing your prompt and tools. Codex expects a shorter prompt and a small, focused toolset. If you keep your old GPT-5 template, you’ll likely get worse results.

How should I edit files?
Use apply_patch. It aligns with how the model was trained and produces cleaner diffs and fewer mistakes.

Why does it stop early when I ask for a preamble?
Because it doesn’t support preambles. Remove that request and let your UI do summarization if you need it.

Do I need a planning tool?
Codex can plan on its own and uses a planning tool well when provided, but skip it for the simplest tasks and avoid empty single-step plans.

Any notes on model behavior reports?
Some users report inconsistent depth on complex repos or overzealous issues during reviews. Give it stronger validation hooks: tests, linters, and project scripts. Steer with environment signals, not longer prose.

Where do I start?
Read the official guide and copy the structure: GPT-5-Codex Prompting Guide. Then remove 60% of your current developer message and re-run.

Bottom Line

GPT-5-Codex is tuned for agent coding. It expects a short prompt, one terminal, apply_patch, and a harness that handles sandboxing and approvals. If you match that setup, it will do real engineering work with less hand-holding and fewer tokens on simple tasks. That’s the point.