Codex 5.4 Best Practices: How to Get the Most Out of It

Mar 10, 2026

I recently started using OpenAI Codex 5.4, and I quickly learned that using it the “obvious” way wastes tokens and produces messy diffs. Here’s what I discovered about getting surgical, minimal changes instead of sprawling rewrites.

The Core Problem

Codex 5.4 is powerful, but that power can backfire. Without the right approach, it “fixes way more than asked” and burns through your usage. But with proper constraints, it produces +2 -0 diffs instead of +148 -146.

The difference is how you use it.

Best Practice 1: Match Thinking Mode to Task

The reasoning_effort parameter is your main control. I learned this the hard way: using xhigh for simple tasks causes overthinking and slower responses.

+------------------+---------------------------+------------------+
| Mode             | Best For                  | Cost Impact      |
+------------------+---------------------------+------------------+
| low              | Typos, comments, renames  | Lowest           |
| medium           | Standard bug fixes        | Moderate         |
| high             | Multi-file changes        | Higher           |
| xhigh            | Long autonomous tasks     | Highest          |
+------------------+---------------------------+------------------+

An OpenAI employee clarified: xhigh is meant for really long-running tasks, not everyday coding.

The trap: defaulting to high or xhigh for everything. On simple tasks, this makes the model “overthink”, and the result has more problems, not fewer.
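To make the table concrete, here is a minimal sketch of picking an effort level per task before sending a request. The task category names, the `pick_reasoning_effort` helper, and the fallback to medium are my own assumptions for illustration; only the four effort values come from the table.

```python
# Hypothetical task categories mapped to the effort levels from the table.
EFFORT_BY_TASK = {
    "typo": "low",
    "comment": "low",
    "rename": "low",
    "bug_fix": "medium",
    "multi_file_change": "high",
    "long_autonomous_task": "xhigh",
}


def pick_reasoning_effort(task_kind: str) -> str:
    """Return the effort level for a task kind.

    Unknown kinds fall back to "medium" (an assumption, not a documented default).
    """
    return EFFORT_BY_TASK.get(task_kind, "medium")


print(pick_reasoning_effort("typo"))               # low
print(pick_reasoning_effort("multi_file_change"))  # high
```

The point of a lookup like this is that the effort level becomes a deliberate choice per task rather than a global setting you forget about.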

Best Practice 2: Use Explicit Constraints

When I asked Codex to “fix the auth issue,” it rewrote my entire authentication system. The fix: scope delimiters.

Fix the null pointer exception in UserService.java.

CONSTRAINTS:
- Do NOT refactor any existing methods
- Do NOT change method signatures
- Do NOT add new utility functions
- Do NOT modify any files other than UserService.java

Negative constraints work better than positive ones. Tell it what NOT to do.
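If you send prompts like this often, a small helper keeps the constraint block consistent. This is only a sketch: `build_constrained_prompt` is a hypothetical name, and the output format simply mirrors the prompt shown above.

```python
def build_constrained_prompt(task: str, forbidden: list[str]) -> str:
    """Append a CONSTRAINTS block of "Do NOT ..." lines to a task description."""
    lines = [task.strip(), "", "CONSTRAINTS:"]
    lines += [f"- Do NOT {item}" for item in forbidden]
    return "\n".join(lines)


prompt = build_constrained_prompt(
    "Fix the null pointer exception in UserService.java.",
    [
        "refactor any existing methods",
        "change method signatures",
        "add new utility functions",
        "modify any files other than UserService.java",
    ],
)
print(prompt)
```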

Best Practice 3: Request Surgical Edits

This was my biggest breakthrough. When I started requesting minimal changes, Codex 5.4 High started behaving like a senior engineer.

Apply the MINIMAL fix to resolve this issue.
If multiple solutions exist, choose the one with the smallest diff.
Do not make any changes that are not strictly necessary.

Why this matters:

- +2 -0 changes are easier to review than +148 -146
- Less risk of new bugs
- More confidence in what changed
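One way to check whether an edit really was surgical is to count added and removed lines yourself. The sketch below uses Python's standard difflib; the `diff_stats` helper and the sample snippet are made up for illustration and are not part of any Codex tooling.

```python
import difflib


def diff_stats(before: str, after: str) -> tuple[int, int]:
    """Return (added, removed) line counts between two versions of a file."""
    old_lines = before.splitlines()
    new_lines = after.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines dropped from the old version
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines introduced in the new version
    return added, removed


before = "def name_of(user):\n    return user.name\n"
after = (
    "def name_of(user):\n"
    "    if user is None:\n"
    "        return None\n"
    "    return user.name\n"
)
added, removed = diff_stats(before, after)
print(f"+{added} -{removed}")  # +2 -0
```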

Best Practice 4: Strategic Context Usage

GPT-5.4 has a 1M token context window, but there are trade-offs:

- Above 272K tokens: 2x input pricing, 1.5x output pricing
- Quality drops at the far edge (36.6% retrieval at 512K-1M vs 97.3% at 4K-8K)

My approach:

- Estimate tokens before loading (roughly 4 chars = 1 token)
- Load only relevant files
- Avoid crossing the 272K threshold unless necessary
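The estimate step can be sketched in a few lines. This assumes the rough 4-characters-per-token rule from the list above, which is a heuristic and not a real tokenizer; the helper names and the sample file contents are hypothetical.

```python
CHARS_PER_TOKEN = 4               # rough heuristic, not an exact tokenizer
LONG_CONTEXT_THRESHOLD = 272_000  # tokens; the pricing boundary cited above


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token."""
    return len(text) // CHARS_PER_TOKEN


def fits_under_threshold(texts: list[str]) -> bool:
    """True if the combined estimate stays at or below the 272K threshold."""
    total = sum(estimate_tokens(t) for t in texts)
    return total <= LONG_CONTEXT_THRESHOLD


# Hypothetical file contents standing in for real source files.
files = {"auth.ts": "x" * 40_000, "user.ts": "y" * 8_000}
total = sum(estimate_tokens(body) for body in files.values())
print(total)                                       # 12000
print(fits_under_threshold(list(files.values())))  # True
```

Since the estimate is crude, it makes sense to leave some headroom below the threshold rather than loading right up to it.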

Best Practice 5: Project-Level Instructions

I created an instruction file that Codex respects consistently:

## Code Modification Rules
- Never refactor code that is working correctly
- Always prefer the smallest possible diff
- Do not change variable names unless explicitly requested
- Do not add new dependencies without permission

## Review Required For
- Production configuration changes
- Database schema modifications
- API endpoint changes

Common Mistakes

Mistake 1: “High is always better”

No. Match the mode to task complexity.

Mistake 2: Vague prompts

WRONG:  "Fix the auth issue"
RIGHT:  "Fix the null check in line 47 of auth.ts. Do not modify other lines."

Mistake 3: Not reviewing changes

Always review before accepting. Use approval mode: codex --approval-mode suggest

Mistake 4: Loading entire codebases

Crossing 272K tokens doubles input pricing, and quality degrades at context extremes.

Quick Decision Guide

Use Codex 5.4 (High) when:
  -> Complex bugs needing surgical precision
  -> Architecture debates (great sparring partner)
  -> Multi-file changes requiring deep reasoning

Stick with 5.3-Codex when:
  -> Pure terminal/shell-based coding
  -> Compiler or performance-critical work
  -> Cost is the primary constraint

The Bottom Line

Three things unlock Codex 5.4’s potential:

- Match reasoning_effort to task complexity
- Use explicit constraints to prevent overfixing
- Request minimal diffs explicitly

Treat Codex like an overenthusiastic junior developer: be specific about what NOT to change, not just what to change.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 OpenAI GPT-5.4 Documentation
👨‍💻 Reddit: 5.4 High is something special

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!