Codex CLI code review: a pre-merge checklist for small teams
Treat the model like a second reviewer with a checklist, not like a merge button.
Small teams often merge quickly because the change looks obvious: a landing page tweak, a content batch, a deployment script, a CSS cleanup. Those are exactly the changes where regressions slip through. Links break, routes stop exporting, environment variables drift, or an "empty" ad slot suddenly leaves a layout gap.
Codex CLI is useful before merge because it can read the diff, compare it to the goal, and run verification commands. The trick is to ask for a focused review rather than a general opinion.
Step 1 - make the goal explicit
Before asking for review, write the goal in one paragraph:
Goal: Apply the editorial Open Design theme to the existing Next.js site without changing article content, routing, AdSense review mode, or Cloudflare static export.
This keeps the review anchored. Without a goal, the model may critique naming, architecture, and visual taste instead of checking whether the change is safe to ship.
Step 2 - give it the diff and the constraints
Use a prompt like:
Review this branch as a pre-merge reviewer.
Priority:
1. Bugs or regressions
2. Missing verification
3. Accessibility or responsive layout risks
4. Over-broad changes
Do not suggest unrelated refactors. The intended change is visual only.
Run:
- git diff --stat
- git diff
- npm run lint
- npm run validate:content
- npm run build:production
- npm run verify:launch
The priority order matters. You want findings, not a design essay.
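The same command list can also be run locally, so you know which checks pass before you read the model's findings. The following is a minimal sketch, assuming the npm scripts named above exist in package.json. `CHECKS` and `run_checks` are illustrative names, not part of Codex CLI or of this project. The two git diff commands are left out because they are for reading, not for pass/fail.

```python
import subprocess

# Script names copied from the prompt above. They are assumed to be
# defined in this project's package.json.
CHECKS = [
    ["npm", "run", "lint"],
    ["npm", "run", "validate:content"],
    ["npm", "run", "build:production"],
    ["npm", "run", "verify:launch"],
]


def run_checks(commands):
    """Run each command in order and stop at the first non-zero exit code.

    Returns a list of (command, exit_code) pairs for the commands that ran.
    """
    results = []
    for command in commands:
        completed = subprocess.run(command)
        results.append((command, completed.returncode))
        if completed.returncode != 0:
            # Later checks usually depend on earlier ones, so stop here.
            break
    return results
```

Calling `run_checks(CHECKS)` from the repository root returns one pair per command that ran. Stopping at the first failure keeps a lint error from being buried under build output.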
Step 3 - ask for findings first, fixes second
Do not start with "fix everything." Ask for review findings first:
Return findings only. Include file paths, line references, severity, and why the issue matters. If there are no issues, say that clearly and list residual risks.
Then read the output. Some findings will be real. Some will be optional. Some will be taste. You choose what to apply.
This separation protects the codebase from automatic over-editing.
Step 4 - verify generated static pages
For static sites, a successful build is not enough: it does not prove that the exported routes and files exist. Serve the output folder and request them.
Run:
npm run build:production
npm run verify:launch
python3 -m http.server 4174 -d out
Then check key paths:
curl -I http://localhost:4174/
curl -I http://localhost:4174/tools/chatgpt-export-pdf/
curl -I http://localhost:4174/sitemap.xml
curl -I http://localhost:4174/ads.txt
For visual changes, also inspect the page in a browser. Automated tests do not know whether a hero title overlaps the navigation.
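The curl checks above can be collected into a small script that reports every path that does not answer 200. This is a sketch using only the Python standard library, assuming the static server from this step is listening on port 4174. `KEY_PATHS`, `head_status`, and `smoke_test` are hypothetical helper names.

```python
import urllib.error
import urllib.request

# Paths taken from the curl commands above.
KEY_PATHS = ["/", "/tools/chatgpt-export-pdf/", "/sitemap.xml", "/ads.txt"]


def head_status(url, timeout=5.0):
    """Send a HEAD request (like curl -I) and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx and 5xx responses; keep the status code.
        return error.code


def smoke_test(base_url, paths):
    """Return a dict of path -> status for every path that is not 200."""
    failures = {}
    for path in paths:
        status = head_status(base_url.rstrip("/") + path)
        if status != 200:
            failures[path] = status
    return failures
```

`smoke_test("http://localhost:4174", KEY_PATHS)` returns an empty dict when every path responds with 200. If the server is not running, `urllib` raises `URLError` instead of returning a status, which is the loud failure you want here.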
Step 5 - ask for a regression pass
After fixes, run a second prompt:
Re-review only the changed files since the last pass. Check whether the fix introduced a new bug. Do not repeat resolved findings.
This is where AI review becomes useful. The first pass catches obvious problems. The second pass catches side effects.
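To scope the second pass, it helps to hand over the exact list of files that changed since the first review. The sketch below assumes you recorded the commit that was reviewed in the first pass (called `base_ref` here). `changed_files_since` and `regression_prompt` are illustrative names, and the `run` parameter exists only so the git call can be swapped out in tests.

```python
import subprocess

REGRESSION_PROMPT = (
    "Re-review only the changed files since the last pass. "
    "Check whether the fix introduced a new bug. "
    "Do not repeat resolved findings."
)


def changed_files_since(base_ref, run=subprocess.run):
    """List files changed between base_ref and HEAD via git diff --name-only."""
    completed = run(
        ["git", "diff", "--name-only", f"{base_ref}..HEAD"],
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails, e.g. on an unknown ref
    )
    return [line for line in completed.stdout.splitlines() if line.strip()]


def regression_prompt(files):
    """Append the changed-file list to the regression prompt from this step."""
    listing = "\n".join(f"- {name}" for name in files)
    return f"{REGRESSION_PROMPT}\n\nChanged files:\n{listing}"
```

Listing the files explicitly narrows the review to the fix itself, which is the point of the second pass.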
Step 6 - keep a merge note
Write a short merge note:
Changed:
- Applied editorial theme to homepage and article pages.
- Kept existing MDX content and AdSense slots.
- Fixed nested html/body layout issue.
Verified:
- npm run lint
- npm run validate:content
- npm run build:production
- npm run verify:launch
- Static HTTP smoke test for / and /tools/chatgpt-export-pdf/
This note helps future you understand why the change was safe.
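If you already record which verification commands ran and how they exited, the note can be generated rather than typed. This is a small sketch; `merge_note` is a hypothetical helper, and it assumes results are kept as `(command, exit_code)` pairs.

```python
def merge_note(changed, results):
    """Build a merge note from change summaries and (command, exit_code) pairs.

    Only commands that exited with code 0 are listed under "Verified",
    so the note cannot claim a check that actually failed.
    """
    lines = ["Changed:"]
    lines.extend(f"- {item}" for item in changed)
    lines.append("Verified:")
    lines.extend(f"- {command}" for command, code in results if code == 0)
    return "\n".join(lines)
```

For example, `merge_note(["Applied editorial theme."], [("npm run lint", 0)])` produces a note in the same Changed/Verified layout used above.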
A practical review script
For small teams, the best review prompt is one you can reuse. Save this as a snippet:
You are reviewing this change before merge.
Context:
- This is a production website.
- Keep findings focused on bugs, regressions, missing tests, accessibility, routing, deployment, and content quality.
- Do not suggest broad refactors unless the current change makes them necessary.
Process:
1. Inspect git status and git diff.
2. Identify the intended behavior.
3. Compare changed files against that intent.
4. Run the project's verification commands.
5. Return findings ordered by severity.
Output format:
- Findings first.
- Then residual risks.
- Then verification commands run.
This gives Codex a repeatable role. It also makes the review easier to compare across branches. If every review starts from the same checklist, differences in output are more likely to reflect the code, not the prompt.
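One way to keep the snippet reusable is to store it in a file and prepend the branch-specific goal from Step 1 each time. A minimal sketch follows, assuming the snippet is saved as plain text; the file location is up to you, and `load_snippet` and `compose_prompt` are hypothetical names.

```python
from pathlib import Path


def load_snippet(path):
    """Read the saved review snippet from a plain-text file."""
    return Path(path).read_text(encoding="utf-8")


def compose_prompt(snippet, goal):
    """Put the one-paragraph goal in front of the reusable review snippet."""
    goal = goal.strip()
    if not goal:
        # The goal is what keeps the review anchored, so refuse to skip it.
        raise ValueError("Write the goal before asking for review.")
    return f"Goal: {goal}\n\n{snippet.strip()}\n"
```

Only the goal changes between branches, so the checklist itself stays identical from one review to the next.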
What a useful finding looks like
A useful finding is specific:
P1: src/app/[locale]/layout.tsx renders html/body inside the locale layout.
Why it matters: the root layout already renders html/body, so the static export contains nested document tags. Browsers recover, but the HTML is invalid and can cause hydration or styling bugs.
Suggested fix: move body classes to the root layout and make the locale layout return only providers and scripts.
That finding is better than "layout structure seems odd" because it names the file, explains the failure mode, and gives a minimal fix.
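If you collect findings across passes, the same shape can be captured as a small record and sorted by severity. The sketch below assumes a P0 to P3 scale where a lower number is more severe; the article only shows P1, so the scale, the `Finding` class, and `order_by_severity` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    severity: str  # "P0" (most severe) to "P3"; this scale is an assumption
    location: str  # file path, optionally with a line reference
    summary: str   # what is wrong
    why: str       # the failure mode
    fix: str       # the minimal suggested fix

    def render(self):
        """Format the finding in the three-line layout shown above."""
        return (
            f"{self.severity}: {self.location} {self.summary}\n"
            f"Why it matters: {self.why}\n"
            f"Suggested fix: {self.fix}"
        )


def order_by_severity(findings):
    """Sort findings so the lowest P-number comes first.

    sorted() is stable, so findings of equal severity keep their order.
    """
    return sorted(findings, key=lambda finding: int(finding.severity[1:]))
```

Requiring all five fields makes a vague finding such as "layout structure seems odd" hard to record in the first place.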
What Codex should not decide
Do not outsource these decisions:
- Whether a visual direction matches the brand
- Whether an article is legally safe
- Whether a privacy-sensitive dataset can be uploaded
- Whether a generated suggestion should be merged without reading it
Use the tool for attention and coverage. Keep judgment with the team.
FAQ
Should Codex replace human review?
No. Use it as a fast second reviewer that catches omissions before a human spends attention.
What should I give Codex first?
Start with the goal, the diff, and the verification commands. Without those, it will review too broadly.
Can I ask it to fix issues directly?
Yes, but separate review from implementation. First ask for findings, then choose which ones to apply.
What if it suggests unrelated refactors?
Reject them. A pre-merge review should focus on regressions, missing tests, and launch blockers.
How many passes are enough?
For small changes, one focused review and one verification pass are enough. Larger changes deserve one pass per area.