How to Implement Rate Limiting Per Endpoint in Express.js (Production Guide)

Mar 4, 2026

Purpose

This post demonstrates how to implement per-endpoint rate limiting in Express.js, applying strict limits to sensitive endpoints while keeping read operations permissive.

Environment

Node.js 20.x
Express.js 4.18.x
express-rate-limit 7.x
rate-limit-redis 4.x (for distributed setups)
Redis 7.x (optional, for multi-server deployments)

The Problem

I deployed my Express API to production with a single global rate limiter set to 100 requests per 15 minutes. A few days later, I noticed brute-force attempts on my login endpoint: someone was testing thousands of password combinations. The limiter was far too permissive for authentication.

So I cranked it down to 5 requests per 15 minutes globally. Now my users couldn’t browse products without hitting the limit after a handful of page loads. One setting couldn’t handle both scenarios.

Here’s the problematic pattern I started with:

import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// One global limiter for everything
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100, // 100 requests per window
});

// Applied to ALL routes
app.use(limiter);

This approach fails because:

Authentication needs strict limits: Login, register, and password reset endpoints are targets for credential stuffing attacks.
Payments need aggressive protection: Card testing attacks can cost money and damage reputation.
Read endpoints need high throughput: Product listings, search, and content endpoints should handle burst traffic.
Write operations need moderate limits: Prevent spam and abuse without blocking legitimate users.

A single global rate limit is too blunt an instrument. I needed per-endpoint configuration.

The Solution

I learned to create multiple rate limiter instances and apply them to specific routes based on sensitivity.

Method 1: Categorize and Create Separate Limiters

First, I grouped my endpoints into four categories and created one limiter for each in middleware/rateLimiters.js:

import rateLimit from 'express-rate-limit';

// Strict limiter for authentication endpoints
// Prevents credential stuffing and brute force
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 5, // 5 attempts per window per IP
  message: {
    success: false,
    error: 'Too many attempts, please try again later',
    retryAfter: '15 minutes'
  },
  standardHeaders: true, // Return rate limit info in RateLimit-* headers
  legacyHeaders: false, // Disable X-RateLimit-* headers
});

// Aggressive limiter for payment endpoints
// Prevents card testing and fraud
const paymentLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 10, // 10 payment attempts per hour
  message: {
    success: false,
    error: 'Too many payment attempts, please contact support',
  },
});

// Moderate limiter for write operations
// Prevents spam while allowing normal usage
const writeLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100, // 100 writes per 15 minutes
  message: 'Too many requests, please slow down',
});

// Permissive limiter for read operations
// Allows high throughput for content consumption
const readLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 1000, // 1000 reads per 15 minutes
});

export { authLimiter, paymentLimiter, writeLimiter, readLimiter };

Then I applied them to specific routes:

import express from 'express';
import { authLimiter, paymentLimiter, writeLimiter, readLimiter } from './middleware/rateLimiters.js';

const app = express();

// The handlers below (loginHandler, chargeHandler, ...) are assumed to be defined elsewhere.

// Authentication - strict limits
app.post('/api/auth/login', authLimiter, loginHandler);
app.post('/api/auth/register', authLimiter, registerHandler);
app.post('/api/auth/forgot-password', authLimiter, forgotPasswordHandler);
app.post('/api/auth/reset-password', authLimiter, resetPasswordHandler);

// Payments - aggressive protection
app.post('/api/payments/charge', paymentLimiter, chargeHandler);
app.post('/api/payments/refund', paymentLimiter, refundHandler);

// Write operations - moderate limits
app.post('/api/posts', writeLimiter, createPost);
app.put('/api/posts/:id', writeLimiter, updatePost);
app.delete('/api/posts/:id', writeLimiter, deletePost);
app.post('/api/comments', writeLimiter, createComment);

// Read operations - permissive limits
app.get('/api/products', readLimiter, getProducts);
app.get('/api/products/:id', readLimiter, getProductById);
app.get('/api/posts', readLimiter, getPosts);
app.get('/api/search', readLimiter, searchHandler);
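
As the route list grows, attaching a limiter to every route by hand gets repetitive. One possible alternative, sketched below, is to derive the category from the method and path and delegate to the matching limiter from a single middleware. The categoryFor and limitByCategory helpers are hypothetical, and the path prefixes are simply taken from the example routes above.

```javascript
// Hypothetical helper: map a request to one of the four limiter categories.
// The path prefixes mirror the example routes in this post.
function categoryFor(method, path) {
  if (path.startsWith('/api/auth/')) return 'auth';
  if (path.startsWith('/api/payments/')) return 'payment';
  // Anything that is not a safe, read-only method counts as a write
  const readMethods = ['GET', 'HEAD', 'OPTIONS'];
  return readMethods.includes(method.toUpperCase()) ? 'read' : 'write';
}

// Build one middleware that delegates to the limiter for the request's category.
// `limiters` is an object of middleware functions: { auth, payment, write, read }.
function limitByCategory(limiters) {
  return (req, res, next) =>
    limiters[categoryFor(req.method, req.path)](req, res, next);
}
```

Registered once before the routes, for example app.use(limitByCategory({ auth: authLimiter, payment: paymentLimiter, write: writeLimiter, read: readLimiter })), this keeps the category rules in one place. The trade-off is that the mapping is less visible than a limiter written next to each route.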

Method 2: Redis-Based Rate Limiting for Distributed Systems

When I scaled to multiple server instances behind a load balancer, I discovered that each server tracked its own rate limits in memory. A user could make 5 login attempts on server A, then 5 more on server B, multiplying the effective limit by the number of servers.

I needed a shared store. Redis solved this:

import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';

// Create Redis client
const redisClient = createClient({
  url: process.env.REDIS_URL || 'redis://localhost:6379',
});

redisClient.connect().catch(console.error);

// Factory function to create Redis-backed limiters.
// Each limiter gets its own store and its own key prefix. With one shared
// prefix, the same IP would increment a single counter for every limiter.
const createRedisLimiter = (name, options) => {
  return rateLimit({
    store: new RedisStore({
      sendCommand: (...args) => redisClient.sendCommand(args),
      prefix: `rl:${name}:`, // Redis key prefix, unique per limiter
    }),
    ...options,
  });
};

// Now all limiters share state across servers
const authLimiter = createRedisLimiter('auth', {
  windowMs: 15 * 60 * 1000,
  max: 5,
});

const writeLimiter = createRedisLimiter('write', {
  windowMs: 15 * 60 * 1000,
  max: 100,
});

const readLimiter = createRedisLimiter('read', {
  windowMs: 15 * 60 * 1000,
  max: 1000,
});

export { authLimiter, writeLimiter, readLimiter };

With Redis, rate limit state persists across all server instances. A user hitting server A then server B still counts toward the same limit.
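
A related detail for deployments behind a load balancer or reverse proxy: unless Express is told to trust the proxy, req.ip is the proxy's address, so every client ends up sharing one counter. A minimal configuration sketch, assuming exactly one proxy hop sits in front of the app (the right value depends on your own topology):

```javascript
import express from 'express';

const app = express();

// Trust the first proxy hop so req.ip is derived from X-Forwarded-For.
// Assumption: exactly one load balancer or reverse proxy in front of the app.
app.set('trust proxy', 1);
```

Setting this to true instead of a hop count is usually not a good idea, because clients could then spoof their IP through the X-Forwarded-For header and dodge the limiter.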

Method 3: Dynamic Rate Limiting by User Role

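I later needed to give premium users higher rate limits. The max option accepts a function, so the limit can depend on the request. This only works if the authentication middleware that sets req.user runs before the limiter: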

const createDynamicLimiter = (baseOptions) => {
  return rateLimit({
    ...baseOptions,
    // Use user ID if authenticated, otherwise IP
    keyGenerator: (req) => {
      return req.user?.id || req.ip;
    },
    // Dynamic limit based on user tier
    max: (req) => {
      // Premium users get 10x the limit
      if (req.user?.tier === 'premium') {
        return baseOptions.max * 10;
      }
      // Standard users get base limit
      return baseOptions.max;
    },
  });
};

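// Premium users: 50 attempts, standard: 5. On the login route itself req.user is
// normally not set yet, so callers there get the base limit, keyed by IP.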
const authLimiter = createDynamicLimiter({
  windowMs: 15 * 60 * 1000,
  max: 5,
});

// Premium users: 10000 reads, standard: 1000
const readLimiter = createDynamicLimiter({
  windowMs: 15 * 60 * 1000,
  max: 1000,
});
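
If more than two tiers are needed, the function passed to max can read a lookup table instead of a single if. A minimal sketch: the tier names and multipliers here are made up for illustration, tieredMax is a hypothetical helper, and req.user is assumed to be set by an earlier authentication middleware.

```javascript
// Hypothetical tier multipliers; unknown or missing tiers fall back to 1x.
const TIER_MULTIPLIERS = { free: 1, premium: 10, enterprise: 50 };

// Returns a function suitable for the `max` option: (req) => number
function tieredMax(baseMax) {
  return (req) => {
    const tier = req.user?.tier ?? '';
    // Object.hasOwn avoids matching inherited keys such as "constructor"
    const multiplier = Object.hasOwn(TIER_MULTIPLIERS, tier)
      ? TIER_MULTIPLIERS[tier]
      : 1;
    return baseMax * multiplier;
  };
}
```

It would plug into a limiter as max: tieredMax(1000), and adding a tier becomes a one-line change to the table.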

Method 4: Custom Key Generation for Different Scenarios

Sometimes I needed to rate limit differently based on context:

// Rate limit by API key for public API
const apiLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 60,
  keyGenerator: (req) => {
    // Use API key from header, fallback to IP
    return req.headers['x-api-key'] || req.ip;
  },
});

// Rate limit by user for authenticated endpoints
const userActionLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 100,
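  keyGenerator: (req) => {
    // Must run after the authentication middleware. An error thrown here
    // is passed to the Express error handler instead of producing a 429.
    if (!req.user?.id) {
      throw new Error('User must be authenticated');
    }
    return `user:${req.user.id}`;
  },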
});

// Rate limit by combination of user + action for specific operations
const sensitiveActionLimiter = rateLimit({
  windowMs: 24 * 60 * 60 * 1000, // 24 hours
  max: 3,
  keyGenerator: (req) => {
    // Assumes req.user was set by an earlier authentication middleware
    return `sensitive:${req.user.id}:password-change`;
  },
});
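
An alternative to throwing for unauthenticated requests is to fall back to the IP address under a separate namespace, so anonymous and authenticated counters never collide. A sketch under that assumption; scopedKey is a hypothetical helper, not part of express-rate-limit.

```javascript
// Hypothetical helper: build a keyGenerator that namespaces keys by action
// and by identity type, falling back to the IP when req.user is missing.
function scopedKey(action) {
  return (req) => {
    const identity =
      req.user?.id != null ? `user:${req.user.id}` : `ip:${req.ip}`;
    return `${action}:${identity}`;
  };
}
```

Used as keyGenerator: scopedKey('password-change'), an authenticated request from user 42 would be counted under password-change:user:42, while an anonymous request is counted under its IP.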

How It Works

The mechanics of per-endpoint rate limiting:

Middleware ordering: Express runs middleware in the order it is registered. A limiter passed as a route argument runs before that route's handler and only for that route, so a rejected request never reaches the handler.
Key generation: By default, rate limiters use the client IP as the key. You can customize this to use user IDs, API keys, or combinations.
Window-based counting: The limiter counts requests per key within a fixed window of windowMs milliseconds. When the window expires, the count resets.
Header responses: With standardHeaders: true, the limiter returns RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers so clients can see their limits.
Shared storage: In distributed systems, Redis stores the request counts. All server instances read from and write to the same Redis keys.
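
To make the counting concrete, here is a stripped-down window counter: each key gets a count and a reset time, and the count starts over once the reset time has passed. It illustrates the idea only and is not how express-rate-limit is implemented; the FixedWindowCounter class and its now parameter are invented for this sketch.

```javascript
// Illustration only: one counter and one reset time per key.
class FixedWindowCounter {
  constructor(windowMs, max) {
    this.windowMs = windowMs;
    this.max = max;
    this.hits = new Map(); // key -> { count, resetTime }
  }

  // Record one request for `key`. `now` is injectable so the example is testable.
  hit(key, now = Date.now()) {
    let entry = this.hits.get(key);
    if (!entry || now >= entry.resetTime) {
      // First request, or the previous window expired: start a new window
      entry = { count: 0, resetTime: now + this.windowMs };
      this.hits.set(key, entry);
    }
    entry.count += 1;
    return {
      allowed: entry.count <= this.max,
      remaining: Math.max(0, this.max - entry.count),
      resetTime: entry.resetTime,
    };
  }
}
```

The Map here plays the role of the store. Moving it into Redis is what lets several server instances share the same counts.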

Common Mistakes

I made several mistakes implementing per-endpoint rate limiting:

1. Forgetting to sync across server instances

With multiple servers behind a load balancer, each server tracked its own limits. A user could make 5 attempts on each server, multiplying the effective limit. Redis storage fixed this.

2. Setting limits too low on read endpoints

I initially set 100 requests per 15 minutes across all endpoints. Users browsing product catalogs hit the limit constantly. I raised the read limit to 1000 and the complaints stopped.

3. Not differentiating anonymous vs authenticated users

Anonymous users should have stricter limits than authenticated users. Authenticated users have accountability, and you can track abuse back to an account.

const readLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: (req) => req.user ? 2000 : 500,
});

4. Ignoring rate limit headers on the client

The server returns helpful headers, but my frontend didn’t use them. I added a client-side check that warns when fewer than 10% of the window's requests remain:

fetch('/api/products')
  .then(response => {
    const limit = Number.parseInt(response.headers.get('RateLimit-Limit'), 10);
    const remaining = Number.parseInt(response.headers.get('RateLimit-Remaining'), 10);
    // Skip the check when either header is missing or not a number
    if (Number.isFinite(limit) && Number.isFinite(remaining) && remaining < limit * 0.1) {
      console.warn('Rate limit approaching:', remaining, 'requests remaining');
    }
    return response.json();
  });
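
When a request is actually rejected with a 429, the client can use the headers to decide how long to wait before retrying. A sketch, assuming the server sends Retry-After as a number of seconds (an HTTP-date value is not handled here) and that a 60-second fallback is acceptable; retryDelayMs is a hypothetical helper.

```javascript
// Hypothetical helper: how long to wait (in ms) before retrying a 429 response.
// Reads Retry-After first, then RateLimit-Reset; both are treated as seconds.
function retryDelayMs(headers, fallbackMs = 60_000) {
  for (const name of ['Retry-After', 'RateLimit-Reset']) {
    const seconds = Number.parseInt(headers.get(name) ?? '', 10);
    if (Number.isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }
  return fallbackMs;
}
```

In a fetch handler this could be used as: if (response.status === 429) setTimeout(retry, retryDelayMs(response.headers)), where retry is whatever function reissues the request.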

5. Using the same limiter instance for different purposes

I initially tried to reuse the same limiter object for unrelated routes. Because one limiter instance keeps a single counter per key, requests to every route that uses it count toward the same limit. Create a separate limiter instance for each use case.

Why This Matters

Per-endpoint rate limiting transformed my API security:

Protection against targeted attacks: Authentication endpoints have strict limits that slow down credential stuffing. Even if an attacker gets a password dump, a single IP can’t test thousands of combinations.
Maintained user experience: Read-heavy endpoints stay responsive. Users can browse products without hitting arbitrary limits.
Cost protection: Payment endpoints have aggressive limits. Card testing attacks that try thousands of card numbers get blocked.
Flexible scaling: Premium users get higher limits, incentivizing upgrades and improving their experience.
Observable limits: Standard headers let clients see their remaining quota, reducing support tickets about “why am I blocked?”

Summary

In this post, I showed how to implement per-endpoint rate limiting in Express.js. The key insight is that different endpoints have different security and performance requirements - a single global rate limit cannot address them all.

The approach is straightforward:

Categorize endpoints by sensitivity (auth, payments, writes, reads)
Create separate rate limiters with appropriate limits
Apply limiters as middleware to specific routes
Add Redis storage for distributed systems

Start with memory-based rate limiting in development, then add Redis when you deploy multiple server instances. Your security team and your users will both be happier.

Final Words + More Resources

My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to get in touch, you can contact me by email.
