Should the API Gateway Handle All Authentication in Microservices?

Mar 23, 2026

I was setting up authentication for my microservices architecture and kept wondering: should every service verify the JWT, or can I handle it once at the API gateway? I was worried about the security implications of trusting the gateway.

The Problem: Where Should Authentication Happen?

I had three options floating in my head:

Centralized at gateway - Gateway authenticates all requests, services trust the gateway
Distributed to services - Every service validates tokens independently
Hybrid approach - Gateway does initial validation, services re-validate

The distributed approach felt “more secure” but meant every service needed auth logic, key management, and token verification. That’s a lot of duplicated code and potential for mistakes.

The Solution: Gateway Auth with Private Services

After digging into Reddit discussions and Kubernetes patterns, I found the answer: yes, the API gateway can handle all authentication - but only when your internal services are truly private.

The key insight from the community was clear:

“It should be handled at the API gateway level. But also in this case you should ensure your microservices are not reachable from the outside and are only accessible through the gateway.”

“If your services are in a private network only reachable from the API gateway, then it’s useless to verify a gazillion times.”

In my Kubernetes setup, I have:

An Ingress that forwards requests to the API gateway
Services with ClusterIP (not LoadBalancer or NodePort)
Services only accessible through the gateway

This is the standard pattern. The gateway authenticates incoming requests and forwards user context to downstream services, which handle authorization.

Why This Matters

Simplicity. Auth logic lives in one place. New services don’t need auth boilerplate.

Performance. No need to verify the same token multiple times as a request traverses services.

Maintainability. Token format changes, key rotations, and auth updates happen at one spot.

Cost. Less CPU spent on repeated cryptographic verification.

The Anti-Pattern: Passing Raw Tokens to Services

Here’s what I initially did wrong - passing the raw JWT token to downstream services:

// WRONG: Passing raw token downstream
import jwt from 'jsonwebtoken';

export function authMiddleware(req, res, next) {
  const authHeader = req.headers.authorization;

  if (!authHeader || !authHeader.startsWith('Bearer ')) {
    return res.status(401).json({ error: 'Missing or invalid token' });
  }

  const token = authHeader.substring(7);

  try {
    // Verify the token
    const decoded = jwt.verify(token, process.env.JWT_SECRET);

    // WRONG: Attach raw token and forward to services
    req.user = {
      ...decoded,
      token: token  // This is the anti-pattern
    };

    next();
  } catch (err) {
    return res.status(401).json({ error: 'Invalid token' });
  }
}

The problem? A downstream service that receives the raw token either has to re-verify it, which defeats the purpose of verifying at the gateway, or has to trust a token it never checked. Either way, you are propagating sensitive credentials unnecessarily.

The Correct Pattern: Decode and Forward Context

Instead, the gateway should verify the token once and forward only the necessary user context:

import jwt from 'jsonwebtoken';

export function authMiddleware(req, res, next) {
  const authHeader = req.headers.authorization;

  if (!authHeader || !authHeader.startsWith('Bearer ')) {
    return res.status(401).json({ error: 'Missing or invalid token' });
  }

  const token = authHeader.substring(7);

  try {
    const decoded = jwt.verify(token, process.env.JWT_SECRET);

    // CORRECT: Forward only the user context, not the token
    req.user = {
      id: decoded.sub,
      email: decoded.email,
      roles: decoded.roles || [],
      permissions: decoded.permissions || []
    };

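    // Add user context to headers for downstream services.
    // Assigning these here also overwrites any x-user-* values the client sent.
    req.headers['x-user-id'] = decoded.sub;
    req.headers['x-user-email'] = decoded.email;
    req.headers['x-user-roles'] = (decoded.roles || []).join(',');
    // Permissions are forwarded too, because downstream services check them
    req.headers['x-user-permissions'] = (decoded.permissions || []).join(',');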

    next();
  } catch (err) {
    return res.status(401).json({ error: 'Invalid token' });
  }
}

Now downstream services receive the user context via headers and can trust it because:

They are not exposed outside the cluster (ClusterIP), and only the gateway routes external traffic to them
The gateway has already verified the token
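
That trust only holds if a client cannot set the x-user-* headers itself. The middleware above overwrites them on authenticated routes, but a route that skips authMiddleware (a public endpoint, for example) would pass client-supplied values straight through. Here is a minimal sketch of one way to close that gap, assuming an Express-style gateway. The names removeIdentityHeaders and stripIdentityHeadersMiddleware are hypothetical, and the x-user- prefix is an assumption based on the header names used in this post:

```javascript
// Assumption: every identity header set by the gateway starts with this prefix.
const IDENTITY_HEADER_PREFIX = 'x-user-';

// Hypothetical helper: deletes client-supplied identity headers in place.
// Header names are compared case-insensitively.
function removeIdentityHeaders(headers) {
  for (const name of Object.keys(headers)) {
    if (name.toLowerCase().startsWith(IDENTITY_HEADER_PREFIX)) {
      delete headers[name];
    }
  }
  return headers;
}

// Express-style middleware (assumed signature: req, res, next).
// Intended to run before any auth or proxy middleware.
function stripIdentityHeadersMiddleware(req, res, next) {
  removeIdentityHeaders(req.headers);
  next();
}
```

Registered first in the middleware chain, this leaves the gateway as the only component that can set identity headers, whether or not a given route runs authMiddleware.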

Downstream Service: Using Forwarded Context

For services communicating via gRPC, you can propagate context through metadata:

export function createAuthInterceptor(userId, userEmail, userRoles) {
  return {
    request: (request) => {
      const metadata = request.metadata || {};

      // Add user context to gRPC metadata
      metadata['x-user-id'] = userId;
      metadata['x-user-email'] = userEmail;
      metadata['x-user-roles'] = userRoles.join(',');

      return {
        ...request,
        metadata
      };
    }
  };
}

In the downstream service, extract the context:

export function createOrderService({ orderRepository, productService }) {
  return {
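    async createOrder(userId, orderData, userContext) {
      // userContext comes from headers/metadata set by the gateway;
      // reject the call if it is missing instead of assuming it is there
      if (!userContext || !userContext.id) {
        throw new Error('Unauthorized: missing user context');
      }
      const { id, permissions = [] } = userContext;

      // Authorization: check if user can create orders
      if (!permissions.includes('orders:create')) {
        throw new Error('Forbidden: insufficient permissions');
      }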

      // Proceed with business logic
      const order = await orderRepository.create({
        ...orderData,
        created_by: id
      });

      return order;
    }
  };
}
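
The service above expects a ready-made userContext, so something has to build it from the forwarded headers. Here is a minimal sketch of that step, under a few assumptions: parseList and userContextFromHeaders are hypothetical helper names, header names arrive lowercased (as Node's HTTP server delivers them), and x-user-permissions is an assumed header that the gateway would need to set alongside the id, email, and roles headers:

```javascript
// Splits a comma-separated header value into a trimmed array of strings.
// Missing or empty values produce an empty array.
function parseList(value) {
  if (typeof value !== 'string' || value.trim() === '') {
    return [];
  }
  return value
    .split(',')
    .map((item) => item.trim())
    .filter(Boolean);
}

// Hypothetical helper: builds the userContext object from gateway-set headers.
// Throws if the required x-user-id header is missing or blank.
function userContextFromHeaders(headers) {
  const id = headers['x-user-id'];
  if (typeof id !== 'string' || id.trim() === '') {
    throw new Error('Unauthorized: missing x-user-id header');
  }

  return {
    id: id.trim(),
    email: headers['x-user-email'] || null,
    roles: parseList(headers['x-user-roles']),
    // Assumption: the gateway forwards permissions in this header
    permissions: parseList(headers['x-user-permissions'])
  };
}
```

A request handler could call userContextFromHeaders(req.headers) and pass the result as the userContext argument of createOrder. Failing fast on a missing x-user-id means a request that somehow bypassed the gateway is rejected instead of being treated as an anonymous but trusted caller.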

The Architecture

┌─────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Client    │────▶│   API Gateway    │────▶│  Order Service  │
│             │     │                  │     │  (ClusterIP)    │
└─────────────┘     │  1. Verify JWT   │     └─────────────────┘
                    │  2. Extract user │              │
                    │  3. Forward ctx  │              ▼
                    └──────────────────┘     ┌─────────────────┐
                              │              │ Inventory Svc   │
                              │              │  (ClusterIP)    │
                              ▼              └─────────────────┘
                    ┌──────────────────┐
                    │  User Service    │
                    │  (ClusterIP)     │
                    └──────────────────┘

Common Mistakes to Avoid

Passing raw tokens downstream - Forces re-verification and spreads credentials
Exposing internal services - Using ClusterIP for most services does not help if any of them is also reachable through a NodePort or LoadBalancer service
Conflating auth with authz - Gateway does authentication (who are you?), services do authorization (what can you do?)
Forgetting network policies - In Kubernetes, add NetworkPolicy to restrict inter-service communication
Skipping user context validation - Services should still validate that required context is present
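
To make the NetworkPolicy point concrete, here is a minimal sketch of a policy that lets only the gateway's pods reach the order service. Everything specific in it is a placeholder assumption: the policy name, the app: order-service and app: api-gateway labels, and port 8080. NetworkPolicy is only enforced when the cluster's network plugin supports it.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: order-service-allow-gateway   # placeholder name
spec:
  # Pods this policy protects (label is an assumption)
  podSelector:
    matchLabels:
      app: order-service
  policyTypes:
    - Ingress
  ingress:
    # Only pods labeled as the gateway may connect (label is an assumption)
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 8080                  # placeholder port
```

A service that is called by other internal services, like the inventory service in the diagram, would need its own policy listing those callers instead of the gateway.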

When Gateway-Only Auth Breaks

This pattern works when internal services are isolated. But if you have:

Services exposed publicly (NodePort, LoadBalancer)
Direct database access from external tools
Multiple entry points (gateway + message queues + scheduled jobs)

Then you need one of:

Service mesh with mutual TLS (Istio, Linkerd)
Zero-trust architecture where every service validates
Different auth strategies for different entry points

Summary

In this post, I explored whether authentication should be centralized at the API gateway. The key insight is that gateway-only auth is the standard pattern when internal services are private and only reachable through the gateway. Kubernetes ClusterIP services keep them off the public network, and NetworkPolicy can restrict which pods inside the cluster may call them. The gateway handles authentication (verifying identity), extracts user context, and forwards it via headers or gRPC metadata. Downstream services handle authorization (checking permissions) based on that context. The anti-pattern to avoid is passing raw tokens downstream - verify once at the gateway, forward context, and trust the network boundary.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Kubernetes ClusterIP Services
👨‍💻 gRPC Metadata Propagation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!