Security Best Practices When Using Codex: What to Trust AI With and What Not To

Mar 25, 2026

The Problem

When I started using Codex for more of my coding work, I wondered: can I trust it with security-critical code?

Then I read a Reddit thread where a developer shared this warning:

"When it comes to security (assuming you mean auth?), I would not trust an LLM
to do that properly. You'd be better off offloading auth to some third-party
service like Clerk."

That made me pause. If AI can write code, why shouldn’t it write authentication? I dug deeper into the discussion and found practical advice on what to delegate and what to handle yourself.

What Happened

I had been asking Codex to help me build various features. Most worked fine. But when I asked it to implement user authentication from scratch, I got code that looked reasonable but had subtle issues:

// What Codex generated for password verification
async function verifyPassword(plainPassword, hashedPassword) {
  // This looks correct but uses a weak hashing approach
  const hash = await bcrypt.hash(plainPassword, 8); // Low salt rounds
  return hash === hashedPassword; // Wrong comparison method!
}

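The code looked plausible. But it had two problems:

A cost factor (salt rounds) of 8 is low by current standards (12 or more is a common recommendation)
It re-hashes the password and compares the strings directly. bcrypt.hash generates a new random salt on every call, so the result never matches the stored hash, and a plain === comparison is not constant-time. The stored hash should be checked with bcrypt.compare instead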

I wouldn’t have caught these issues without manual review. That’s the core problem: AI generates plausible-looking code that may have security vulnerabilities you won’t notice until it’s too late.
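For comparison, here is a minimal sketch of a verification routine that avoids both problems. To stay dependency-free it uses Node’s built-in crypto.scrypt instead of bcrypt, which is a substitution on my part; the "salt:hash" storage format and the function names are assumptions for illustration. With bcrypt itself, the equivalent step is a single call to bcrypt.compare(plainPassword, hashedPassword).

```javascript
const crypto = require('node:crypto');

const KEY_LENGTH = 64; // bytes of derived key (assumed value for this sketch)

// Hash a password with a fresh random salt and return "saltHex:hashHex".
function hashPassword(plainPassword) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(plainPassword, salt, KEY_LENGTH);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Re-derive the key with the STORED salt, then compare in constant time.
function verifyPassword(plainPassword, stored) {
  const [saltHex, hashHex] = String(stored).split(':');
  if (!saltHex || !hashHex) return false;
  const expected = Buffer.from(hashHex, 'hex');
  if (expected.length !== KEY_LENGTH) return false; // malformed record
  const actual = crypto.scryptSync(plainPassword, Buffer.from(saltHex, 'hex'), KEY_LENGTH);
  return crypto.timingSafeEqual(actual, expected);
}
```

The key point is that verification reuses the salt stored next to the hash; hashing the candidate password again with a new salt can never produce a match.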

Why AI Struggles With Security

AI coding assistants have specific limitations when it comes to security:

They optimize for functionality, not security. When I ask Codex to “add login,” it produces working login code. But working doesn’t mean secure.

They don’t understand your threat model. AI doesn’t know who your users are, what data you’re protecting, or what attacks you’re worried about.

They can’t reason about system-wide security. Each piece of code is generated in isolation, without considering how it fits into your overall security architecture.

They may use outdated patterns. Training data includes old code with outdated security practices.

What NOT to Trust AI With

Based on the Reddit discussion and my own experience, here are the areas I never delegate to AI:

1. Authentication Implementation

BAD:  "Implement user login with JWT tokens and password hashing"
GOOD: "Integrate Clerk for authentication"

Authentication has too many edge cases:

Session management
Token rotation
Password reset flows
Multi-factor authentication
Rate limiting
Brute force protection

Use established services: Clerk, Auth0, Supabase Auth, Firebase Auth.

2. Payment Processing

BAD:  "Build a payment system with credit card processing"
GOOD: "Integrate Stripe checkout"

Payment security includes:

PCI compliance
Fraud detection
Secure card storage
Webhook verification
Refund handling

Use Stripe, PayPal, or similar services.
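Webhook verification is a good example of what those services’ SDKs handle for you. As a generic sketch (this is not Stripe’s actual signature scheme; the function names and the hex-encoded HMAC-SHA256 format are assumptions), the core idea is to recompute a signature over the raw request body and compare it in constant time:

```javascript
const crypto = require('node:crypto');

// Compute a hex HMAC-SHA256 signature over the raw request body.
function signPayload(rawBody, secret) {
  return crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
}

// Return true only if the received signature matches the expected one.
function isValidSignature(rawBody, signatureHex, secret) {
  if (typeof signatureHex !== 'string') return false;
  const expected = Buffer.from(signPayload(rawBody, secret), 'hex');
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}
```

In a real integration I would use the provider’s own helper (Stripe’s Node library has stripe.webhooks.constructEvent for this) rather than a hand-written check.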

3. API Key and Secret Management

// NEVER let AI write code like this
// const apiKey = "sk-proj-xxxxx"  // Hardcoded secret!

// ALWAYS use environment variables
const apiKey = process.env.STRIPE_SECRET_KEY
if (!apiKey) {
  throw new Error('STRIPE_SECRET_KEY not configured')
}

AI might accidentally expose secrets in:

Log statements
Error messages
Debug output
Git commits
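One way to reduce accidental exposure in logs and error output is to redact secret-looking fields before anything is written. The sketch below rests on assumptions: the key pattern and the redact function are hypothetical, and it only handles plain nested objects and arrays (no circular references).

```javascript
// Field names that should never reach a log line (assumed pattern).
const SENSITIVE_KEY = /key|secret|token|password|authorization/i;

// Return a copy of the value with sensitive fields replaced by a marker.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) =>
        SENSITIVE_KEY.test(k) ? [k, '[REDACTED]'] : [k, redact(v)]
      )
    );
  }
  return value;
}

console.log(JSON.stringify(redact({ user: 'ana', apiKey: 'sk-test-1234' })));
// prints {"user":"ana","apiKey":"[REDACTED]"}
```

This does nothing for secrets embedded in free-form strings, so it complements environment variables and review rather than replacing them.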

4. Encryption Implementation

Rolling your own encryption is dangerous even for experienced developers. AI-generated encryption code might:

Use weak algorithms
Generate predictable random values
Misuse initialization vectors
Leak information through error messages

Use libraries and services that handle encryption properly.
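As an illustration of leaning on a library, the sketch below uses Node’s built-in crypto module for authenticated encryption (AES-256-GCM) with a fresh random IV per message. The payload layout (IV, then auth tag, then ciphertext) and the function names are my assumptions; key storage and rotation are out of scope here.

```javascript
const crypto = require('node:crypto');

const IV_LENGTH = 12;  // bytes, the usual IV size for GCM
const TAG_LENGTH = 16; // bytes, default GCM auth tag size

// Encrypt a UTF-8 string with a 32-byte key; returns iv + tag + ciphertext.
function encrypt(plaintext, key) {
  const iv = crypto.randomBytes(IV_LENGTH); // never reuse an IV with the same key
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Decrypt a payload produced by encrypt(); throws if it was tampered with.
function decrypt(payload, key) {
  const iv = payload.subarray(0, IV_LENGTH);
  const tag = payload.subarray(IV_LENGTH, IV_LENGTH + TAG_LENGTH);
  const ciphertext = payload.subarray(IV_LENGTH + TAG_LENGTH);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}
```

Even here the library only covers the cipher itself. Where the key lives and who can read it is still a design decision that needs human review.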

5. Access Control Logic

// AI might generate code like this
if (user.role === 'admin') {
  return allData  // Too broad!
}

// What you actually need
if (user.role === 'admin' && hasPermission(user, 'read:all')) {
  return filterByOrganization(allData, user.orgId)
}

Access control requires understanding your business rules, data relationships, and compliance requirements.
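To make the second snippet concrete, here is a minimal runnable sketch. hasPermission and filterByOrganization are named in the snippet above but not defined there, so the implementations below, the permissions array, and the record shape (orgId, ownerId) are all assumptions.

```javascript
// Hypothetical: a user carries an explicit list of permission strings.
function hasPermission(user, permission) {
  return Array.isArray(user.permissions) && user.permissions.includes(permission);
}

// Hypothetical: every record is tagged with the organization that owns it.
function filterByOrganization(records, orgId) {
  return records.filter((record) => record.orgId === orgId);
}

// Admins with read:all see their own organization; everyone else sees
// only the records they own inside their organization.
function listRecords(user, allData) {
  if (user.role === 'admin' && hasPermission(user, 'read:all')) {
    return filterByOrganization(allData, user.orgId);
  }
  return allData.filter(
    (record) => record.orgId === user.orgId && record.ownerId === user.id
  );
}
```

The role check alone never decides what is returned; the organization filter applies on every path.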

What IS Safe to Delegate

Not everything is security-critical. I happily let AI handle:

UI Components - React components, styling, responsive layouts

Non-sensitive CRUD operations - Blog posts, comments, public content

Business logic (with review) - Validation rules, calculations, workflows

Tests - Unit tests, integration tests, test fixtures

Documentation - README files, API docs, comments

Refactoring - Code cleanup, extraction, naming improvements

The Multi-Layer Review Approach

From the Reddit discussion, I learned a practical approach:

"I also sometimes use the mid-tier models like Codex Medium or Sonnet to
complete a task, then ask Codex High or Opus to review the changes, in
addition to manually code reviewing."

This creates multiple layers of review:

+-------------------+     +-------------------+     +-------------------+
|   Codex Medium    | --> |    Codex High     | --> |   Manual Review   |
|   (writes code)   |     |   (reviews code)  |     | (security check)  |
+-------------------+     +-------------------+     +-------------------+

For security-critical paths, I add one more layer:

+-------------------+
|  Security Audit   |
|  (Snyk, npm audit)|
+-------------------+

Tools for Security Auditing

The Reddit thread mentioned tools that catch vulnerabilities:

Snyk - Scans code and dependencies for known vulnerabilities

# Install, authenticate, and run Snyk
npm install -g snyk
snyk auth
snyk test

npm audit - Checks for vulnerable dependencies

npm audit
npm audit fix

OWASP ZAP - Web application security scanner

These tools catch issues AI might introduce:

Known vulnerable dependencies
Common security misconfigurations
Exposed secrets in code
Missing security headers

A Practical Security Workflow

When I use Codex for any feature, I follow this workflow:

Step 1: Identify Security Sensitivity

Is this code handling:
- User credentials?      -> Use auth service
- Payment data?          -> Use payment service
- Sensitive data?        -> Manual review required
- Access control?        -> Manual review required
- Public data only?      -> Safe to delegate
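The same triage can be written down as a small lookup so it gets applied consistently. The category names and the triage function are hypothetical; the actions mirror the list above, and anything unrecognized falls back to manual review as a cautious default of my own choosing.

```javascript
// Action per category, mirroring the triage list.
const ACTIONS = {
  credentials: 'use auth service',
  payment: 'use payment service',
  sensitive: 'manual review required',
  accessControl: 'manual review required',
  public: 'safe to delegate',
};

// Most restrictive categories are checked first.
const PRIORITY = ['credentials', 'payment', 'sensitive', 'accessControl', 'public'];

// Return the action for the most sensitive category a change touches.
function triage(categories) {
  const hit = PRIORITY.find((category) => categories.includes(category));
  return hit ? ACTIONS[hit] : 'manual review required';
}
```

A change that touches both public content and payment data is therefore handled as a payment change.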

Step 2: Delegate or Build

For non-sensitive code, I let Codex write it. For security-critical code, I:

Research the recommended approach
Use established services or libraries
Have Codex help with integration, not implementation

Step 3: Review with Stronger Model

After Codex completes a task, I ask a stronger model to review:

Prompt: "Review this code for security issues, focusing on:
- Input validation
- Authentication/authorization
- Data exposure
- Error handling that might leak information"

Step 4: Run Security Tools

# Check for vulnerabilities
snyk test
npm audit

# Rough check for secrets in staged changes
git diff --staged | grep -iE "api_key|password|secret|token"
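The grep above matches any line of the diff, including removed and context lines. A slightly more targeted sketch, assuming the diff text is passed in as a string (the function name and the pattern are hypothetical), looks only at added lines:

```javascript
// Words that suggest a secret may have been added (assumed pattern).
const SECRET_PATTERN = /api[_-]?key|password|secret|token/i;

// Return the added lines of a unified diff that look like they contain secrets.
function findSuspectLines(diffText) {
  return diffText
    .split('\n')
    .filter((line) => line.startsWith('+') && !line.startsWith('+++')) // skip file headers
    .filter((line) => SECRET_PATTERN.test(line));
}
```

This is still a keyword match, so it produces false positives and misses secrets that do not sit near one of these words.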

Step 5: Manual Review for Critical Paths

For authentication, payments, and data access code, I always read every line myself.

Common Mistakes I’ve Made

Mistake 1: Implementing custom auth

I once asked Codex to implement password reset. The code worked, but it didn’t handle:

Token expiration
Rate limiting reset attempts
Logging for audit trails
Email template security

Now I use Clerk or Auth0 for everything auth-related.
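For illustration only, and not as a recommendation to build this yourself, the first gap (token expiration) comes down to storing a hash of a random token together with an expiry time. All names below are hypothetical and the 15-minute lifetime is an arbitrary assumption.

```javascript
const crypto = require('node:crypto');

const TOKEN_TTL_MS = 15 * 60 * 1000; // assumed lifetime: 15 minutes

// Hash tokens before storing them so a database leak does not expose them.
function sha256(value) {
  return crypto.createHash('sha256').update(value).digest();
}

// Create a token to email to the user and a record to store server-side.
function createResetToken(now = Date.now()) {
  const token = crypto.randomBytes(32).toString('hex');
  return { token, record: { tokenHash: sha256(token), expiresAt: now + TOKEN_TTL_MS } };
}

// A token is valid only if it has not expired and its hash matches the record.
function isResetTokenValid(token, record, now = Date.now()) {
  if (typeof token !== 'string' || now >= record.expiresAt) return false;
  return crypto.timingSafeEqual(sha256(token), record.tokenHash);
}
```

Rate limiting, audit logging, and single-use invalidation are all still missing from this sketch, which is exactly why I hand the whole flow to an auth service.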

Mistake 2: Hardcoded secrets in AI-generated code

Codex once added an API key directly in the code because I didn’t specify environment variables. It looked like:

const response = await fetch('https://api.example.com', {
  headers: { 'Authorization': 'Bearer sk-test-1234' }
})

I now always specify: “Use environment variables for all secrets.”

Mistake 3: Trusting AI to understand access control

I asked Codex to “add admin functionality.” It created routes that checked if a user was admin, but didn’t verify they could access specific resources. Admin from Organization A could see Organization B’s data.

Security Checklist Before Committing AI-Generated Code

## Secrets Check
- [ ] No API keys in code
- [ ] No passwords in code
- [ ] No tokens in code
- [ ] .env file is in .gitignore

## Input Validation
- [ ] All user inputs are validated
- [ ] SQL queries use parameters (not string concatenation)
- [ ] File uploads are validated and limited

## Access Control
- [ ] Sensitive routes are protected
- [ ] User can only access their own data
- [ ] Admin routes check admin status and organization scope

## Dependencies
- [ ] npm audit shows no critical vulnerabilities
- [ ] Dependencies are up to date

Summary

In this post, I explained which parts of your application you should never delegate to AI coding assistants like Codex. Authentication, payment processing, secret management, encryption, and access control are too important to trust to AI-generated code.

Instead:

Use established third-party services for auth (Clerk, Auth0) and payments (Stripe)
Run security audits with tools like Snyk
Have stronger models review code changes
Always manually review security-critical implementations

AI coding assistants are powerful for many tasks. But security requires human judgment, established best practices, and defense in depth that AI simply can’t provide on its own.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit Discussion: Best way to use Codex
👨‍💻 Clerk Authentication
👨‍💻 Auth0 Documentation
👨‍💻 Snyk Security Scanner
👨‍💻 OWASP Top 10 Security Risks

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!