Spring Boot Micrometer Monitoring Best Practices - A Practical Guide

Mar 7, 2026

The Problem

I had set up Prometheus and Grafana for my Spring Boot application, and I could see JVM memory usage and HTTP request metrics. But when my product manager asked specific questions, I couldn’t answer them:

“How many orders are being processed per minute?”
“What’s the failure rate for payment transactions?”
“How long does it take to sync data from the external API?”

The built-in metrics weren’t enough. I needed custom metrics tailored to my business logic. I tried adding metrics using System.currentTimeMillis() and log statements, but that approach was messy and didn’t integrate with my monitoring stack.

I needed a better way to track business-specific metrics that would work seamlessly with Prometheus and Grafana.

Why Micrometer?

I discovered Micrometer, the metrics facade that Spring Boot Actuator auto-configures. Think of it as “SLF4J for metrics” - you instrument your code once, and Micrometer handles translation to your monitoring backend.

Here’s what makes Micrometer the right choice:

+-------------------+
|  Your Application |
|  (Business Logic) |
+-------------------+
         |
         v
+-------------------+     +------------+     +---------+
|   Micrometer      |     |            |     |         |
|   (MeterRegistry) |---->| Prometheus |---->| Grafana |
+-------------------+     +------------+     +---------+
         |                      ^
         |                      |
         +-----> Datadog -------+
         |                      |
         +-----> InfluxDB ------+
         |                      |
         +-----> New Relic -----+

Key benefits I found:

Vendor neutrality: I can switch from Prometheus to Datadog by changing one dependency
Spring Boot integration: Auto-configured via Spring Boot Actuator
Dimensional metrics: Tags enable filtering and drill-down in dashboards
Battle-tested: Used in production at scale by major companies

Step 1: Understand Meter Types

Micrometer provides four main meter types. I needed to understand when to use each one:

Meter Type	Behavior	Use Case	Example
Counter	Only increases	Counting events	Orders placed, errors occurred
Gauge	Can go up or down	Current state	Active connections, queue size
Timer	Measures duration + count	Latency tracking	API response time, DB query time
DistributionSummary	Tracks distribution	Size metrics	Request payload size, batch size

Counter - For Counting Events

Counters only go up. I use them for monotonically increasing values like request counts or completed tasks.

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;

@Service
public class OrderService {
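    private final OrderRepository orderRepository;  // application repository, assumed to exist
    private final Counter ordersCreated;
    private final Counter ordersFailed;

    public OrderService(OrderRepository orderRepository, MeterRegistry registry) {
        this.orderRepository = orderRepository;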
        this.ordersCreated = Counter.builder("orders.created")
            .description("Number of orders created")
            .tag("type", "online")
            .baseUnit("orders")
            .register(registry);

        this.ordersFailed = Counter.builder("orders.failed")
            .description("Number of failed orders")
            .tag("type", "online")
            .register(registry);
    }

    public Order createOrder(OrderRequest request) {
        try {
            Order order = orderRepository.save(request);
            ordersCreated.increment();
            return order;
        } catch (Exception e) {
            ordersFailed.increment();
            throw e;
        }
    }
}

Important: Never use a Counter for values that can decrease. If you need to track a current value (like active users), use a Gauge instead.
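Because a counter only ever grows, a question like “how many orders per minute?” is answered by comparing two samples rather than reading the raw value. Here’s a minimal sketch of that arithmetic in plain Java, roughly what a monitoring backend’s rate function does. The class and method names are made up for illustration, and treating any drop as a restart from zero is a simplifying assumption.

```java
import java.time.Duration;

// Hypothetical helper, not part of Micrometer: derives a per-minute rate
// from two samples of a monotonically increasing counter.
class CounterRate {

    // prev and curr are counter readings taken `interval` apart.
    static double perMinute(double prev, double curr, Duration interval) {
        if (interval.isZero() || interval.isNegative()) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // A counter that went "down" was reset (for example, the app restarted),
        // so assume it restarted from zero and count only the new value.
        double increase = curr >= prev ? curr - prev : curr;
        double minutes = interval.toMillis() / 60_000.0;
        return increase / minutes;
    }

    public static void main(String[] args) {
        // 120 orders at the first sample, 180 thirty seconds later
        System.out.println(perMinute(120, 180, Duration.ofSeconds(30)));
    }
}
```

This is also why a counter must never be decremented: a backend cannot tell a deliberate decrease apart from a restart.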

Timer - For Measuring Duration

Timers measure both duration and count. Percentiles (p50, p95, p99) are not published by default; I enable them explicitly with publishPercentiles().

import io.micrometer.core.instrument.Timer;
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.stereotype.Service;
import java.time.Duration;
import java.util.concurrent.TimeUnit;

@Service
public class PaymentService {
    private final PaymentGateway paymentGateway;  // gateway client, assumed to exist
    private final Timer paymentTimer;

    public PaymentService(PaymentGateway paymentGateway, MeterRegistry registry) {
        this.paymentGateway = paymentGateway;
        this.paymentTimer = Timer.builder("payment.processing.time")
            .description("Time spent processing payments")
            .tag("gateway", "stripe")
            .publishPercentiles(0.5, 0.95, 0.99)
            .publishPercentileHistogram()
            .minimumExpectedValue(Duration.ofMillis(1))
            .maximumExpectedValue(Duration.ofSeconds(30))
            .register(registry);
    }

    // Option 1: Record with supplier
    public PaymentResult processPayment(PaymentRequest request) {
        return paymentTimer.record(() -> paymentGateway.charge(request));
    }

    // Option 2: Manual timing (recorded even if charge() throws)
    public PaymentResult processPaymentManually(PaymentRequest request) {
        long start = System.nanoTime();
        try {
            return paymentGateway.charge(request);
        } finally {
            paymentTimer.record(System.nanoTime() - start, TimeUnit.NANOSECONDS);
        }
    }
}

Timers give me both the count of operations and duration statistics in one meter.
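To make it concrete why I look at p95 and p99 rather than the average, here’s a small sketch that computes nearest-rank percentiles over raw samples. This is an illustration only: the class name is hypothetical, and it is not how Micrometer computes percentiles internally (a real library would not keep and sort every sample like this).

```java
import java.util.Arrays;

// Illustration only: nearest-rank percentile over raw latency samples.
class LatencyPercentiles {

    static long percentile(long[] durationsMillis, double quantile) {
        if (durationsMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (quantile <= 0.0 || quantile > 1.0) {
            throw new IllegalArgumentException("quantile must be in (0, 1]");
        }
        long[] sorted = durationsMillis.clone();
        Arrays.sort(sorted);
        // Nearest rank: the smallest sample with at least quantile * n samples at or below it.
        int rank = (int) Math.ceil(quantile * sorted.length);
        return sorted[rank - 1];
    }

    static double mean(long[] durationsMillis) {
        return Arrays.stream(durationsMillis).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Nine fast payments and one slow outlier
        long[] samples = {100, 200, 300, 400, 500, 600, 700, 800, 900, 5000};
        System.out.println("mean=" + mean(samples));
        System.out.println("p50=" + percentile(samples, 0.5));
        System.out.println("p95=" + percentile(samples, 0.95));
    }
}
```

With one slow request in ten, the median stays at a typical value while p95 lands on the outlier, which is the behavior a payment dashboard needs to surface.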

Gauge - For Current Values

Gauges sample a value at observation time. They’re perfect for current state metrics.

import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import java.util.concurrent.atomic.AtomicLong;
import org.springframework.stereotype.Service;

@Service
public class ConnectionPoolService {
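    private final ConnectionPool connectionPool;  // pool implementation, assumed to exist
    private final AtomicLong activeConnections = new AtomicLong(0);

    public ConnectionPoolService(ConnectionPool connectionPool, MeterRegistry registry) {
        this.connectionPool = connectionPool;
        // Gauge samples the value from AtomicLong
        Gauge.builder("connections.active", activeConnections, AtomicLong::get)
            .description("Number of active database connections")
            .tag("pool", "main")
            .register(registry);
    }

    public Connection acquireConnection() {
        // Borrow first so a failed borrow does not inflate the gauge
        Connection conn = connectionPool.borrowObject();
        activeConnections.incrementAndGet();
        return conn;
    }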

    public void releaseConnection(Connection conn) {
        activeConnections.decrementAndGet();
        connectionPool.returnObject(conn);
    }
}

Gotcha I learned: Gauges are sampled, not pushed. Prometheus scrapes the current value when it queries the endpoint.
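Here’s a tiny sketch of what “sampled, not pushed” means in practice. It uses only the JDK: the “gauge” is just a function over an AtomicLong, and a “scrape” is simply a call to that function. The class name is hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.DoubleSupplier;

// Illustration of "sampled, not pushed": the gauge is only a function that
// is evaluated when someone reads it (for example, during a scrape).
class SampledGaugeDemo {

    static double[] scrapeTwice() {
        AtomicLong active = new AtomicLong(0);
        DoubleSupplier gauge = () -> active.get();  // nothing is recorded here

        active.incrementAndGet();             // value is 1
        active.incrementAndGet();             // value is 2
        active.decrementAndGet();             // back to 1; the peak of 2 is never observed
        double first = gauge.getAsDouble();   // first "scrape" reads 1

        active.addAndGet(4);                  // value is 5
        double second = gauge.getAsDouble();  // second "scrape" reads 5
        return new double[] { first, second };
    }

    public static void main(String[] args) {
        double[] readings = scrapeTwice();
        System.out.println(readings[0] + " then " + readings[1]);
    }
}
```

Anything that happens between two scrapes is invisible to a gauge, so short spikes need a Counter or Timer instead.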

Step 2: Follow Naming Conventions

I made mistakes with metric naming initially. Here’s what I learned:

Use Lowercase Dot Notation

// GOOD - Consistent, readable
Counter.builder("http.server.requests")
    .tag("method", "GET")
    .tag("uri", "/api/users");

Counter.builder("database.queries")
    .tag("table", "users")
    .tag("operation", "select");

// BAD - Inconsistent, hard to filter
Counter.builder("HTTP_Requests");
Counter.builder("databaseQueries");
Counter.builder("user-query-count");
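One way to keep names consistent is to check them in a unit test. The sketch below is a hypothetical helper (not a Micrometer API) that accepts only lowercase words separated by dots; the exact regular expression is my assumption about what “lowercase dot notation” should allow.

```java
import java.util.regex.Pattern;

// Hypothetical naming check: lowercase alphanumeric words separated by dots.
class MetricNames {
    private static final Pattern DOT_NOTATION =
        Pattern.compile("[a-z][a-z0-9]*(\\.[a-z][a-z0-9]*)*");

    static boolean isDotNotation(String name) {
        return name != null && DOT_NOTATION.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDotNotation("http.server.requests"));  // accepted
        System.out.println(isDotNotation("HTTP_Requests"));         // rejected
    }
}
```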

Name the Thing Being Measured

The metric name should answer “what is being measured?”:

// Clear what's being measured
Timer.builder("http.server.requests");
Counter.builder("orders.created");
Gauge.builder("jvm.memory.used", () -> usedBytes);  // a Gauge also needs a value source

// Unclear - what is this?
Timer.builder("timing");
Counter.builder("count");
Gauge.builder("value", () -> someNumber);

Step 3: Use Tags Effectively

Tags (also called labels in Prometheus) enable dimensional metrics. They let me filter and group metrics in Grafana.

Tags for Granularity

// Track HTTP requests by method, endpoint, and status
registry.counter("http.server.requests",
    "method", "GET",
    "uri", "/api/users",
    "status", "200"
);

// Track database calls by operation
registry.counter("database.queries",
    "database", "users",
    "operation", "select",
    "table", "orders"
);

// Track business metrics by region
registry.counter("orders.processed",
    "region", "us-east-1",
    "service", "order-service"
);

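Now in Grafana, I can query (the Prometheus registry exports the counter orders.processed as orders_processed_total):

Total orders: sum(orders_processed_total)
Orders by region: sum by (region) (orders_processed_total)
US-east orders only: orders_processed_total{region="us-east-1"}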
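To show what grouping by a tag actually does, here’s a plain-Java sketch of a “sum by region” over a handful of series. Each series is identified by its tag set; the sample values and the billing-service name are invented for illustration, and this is not how Prometheus is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustration of dimensional aggregation: one value per unique tag combination,
// grouped and summed by a single tag key.
class TagAggregation {

    static Map<String, Double> sumBy(String tagKey, Map<Map<String, String>, Double> series) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map.Entry<Map<String, String>, Double> e : series.entrySet()) {
            String group = e.getKey().getOrDefault(tagKey, "");
            totals.merge(group, e.getValue(), Double::sum);
        }
        return totals;
    }

    // Made-up sample data: three series of the same metric.
    static Map<Map<String, String>, Double> sampleSeries() {
        Map<Map<String, String>, Double> series = new LinkedHashMap<>();
        series.put(Map.of("region", "us-east-1", "service", "order-service"), 40.0);
        series.put(Map.of("region", "us-east-1", "service", "billing-service"), 2.0);
        series.put(Map.of("region", "eu-west-1", "service", "order-service"), 15.0);
        return series;
    }

    public static void main(String[] args) {
        System.out.println(sumBy("region", sampleSeries()));
    }
}
```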

Common Tags for All Metrics

I set application-wide tags so every metric includes context:

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class MetricsConfig {

    @Bean
    public MeterRegistryCustomizer<MeterRegistry> commonTags() {
        return registry -> registry.config()
            .commonTags("application", "order-service")
            .commonTags("environment", "production")
            .commonTags("region", "us-east-1");
    }
}

Or via application properties:

spring:
  application:
    name: order-service

management:
  metrics:
    tags:
      application: ${spring.application.name}
      environment: ${DEPLOY_ENV:development}
      region: ${AWS_REGION:us-east-1}

Step 4: Avoid Cardinality Explosion

I learned this the hard way. High-cardinality tags can crash Prometheus.

The Problem: Too Many Time Series

// DANGEROUS - Creates a time series for each user
Counter.builder("api.requests")
    .tag("userId", userId)  // Could be millions of users!
    .register(registry)
    .increment();

// DANGEROUS - Creates a time series for each request ID
Counter.builder("http.requests")
    .tag("requestId", UUID.randomUUID().toString())  // Unique per request!
    .register(registry)
    .increment();

With 100,000 users, this creates 100,000 time series. With 1 million requests, that’s 1 million time series. Every series is held in memory by both the application’s MeterRegistry and Prometheus, so either one can run out of memory.
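The arithmetic behind this is simple: in the worst case, the number of time series for one metric is the product of the distinct values of each tag. Here’s a minimal sketch; the class name and the example counts are assumptions for illustration.

```java
import java.util.Map;

// Worst-case estimate: every combination of tag values occurs, so the series
// count for one metric is the product of distinct values per tag.
class CardinalityEstimate {

    static long worstCaseSeries(Map<String, Integer> distinctValuesPerTag) {
        long total = 1;
        for (int distinct : distinctValuesPerTag.values()) {
            // multiplyExact throws instead of silently overflowing
            total = Math.multiplyExact(total, (long) distinct);
        }
        return total;
    }

    public static void main(String[] args) {
        // Bounded tags only
        System.out.println(worstCaseSeries(Map.of("method", 5, "endpoint", 20, "status", 10)));
        // The same metric with a userId tag added
        System.out.println(worstCaseSeries(
            Map.of("method", 5, "endpoint", 20, "status", 10, "userId", 100_000)));
    }
}
```

A single unbounded tag multiplies everything else, which is why one userId tag is enough to turn a thousand series into a hundred million.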

The Solution: Low-Cardinality Tags

// GOOD - Low cardinality (3-4 values)
Counter.builder("api.requests")
    .tag("userType", getUserType(userId))  // "free", "premium", "enterprise"
    .tag("tier", getUserTier(userId))      // "tier1", "tier2", "tier3"
    .register(registry)
    .increment();

// GOOD - Bounded values
Counter.builder("http.requests")
    .tag("endpoint", getEndpointPattern(uri))  // "/api/users/{id}"
    .tag("method", method)                     // "GET", "POST", etc.
    .tag("status", String.valueOf(status))     // "200", "404", "500"
    .register(registry)
    .increment();

Rule of thumb: Keep cardinality under 10 for each tag, and total time series under 100,000.
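The getEndpointPattern helper in the snippet above is not a Micrometer API; it stands for whatever turns a concrete URI into a bounded template. Here’s one possible sketch, assuming IDs appear as numeric or UUID path segments. Spring MVC’s built-in http.server.requests metric already tags requests with the templated URI, so a helper like this is mainly useful for custom meters.

```java
import java.util.regex.Pattern;

// Hypothetical helper: collapses numeric and UUID path segments into "{id}"
// so the tag value stays low-cardinality.
class EndpointPatterns {
    private static final Pattern NUMERIC_ID = Pattern.compile("/\\d+(?=/|$)");
    private static final Pattern UUID_ID = Pattern.compile(
        "/[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}(?=/|$)");

    static String getEndpointPattern(String uri) {
        // Drop the query string; it is unbounded by nature
        int query = uri.indexOf('?');
        String path = query >= 0 ? uri.substring(0, query) : uri;
        String withoutUuids = UUID_ID.matcher(path).replaceAll("/{id}");
        return NUMERIC_ID.matcher(withoutUuids).replaceAll("/{id}");
    }

    public static void main(String[] args) {
        System.out.println(getEndpointPattern("/api/users/123/orders/456?page=2"));
    }
}
```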

Step 5: A Complete Example

Here’s a real-world example I use in my payment service:

import io.micrometer.core.instrument.*;
import org.springframework.stereotype.Service;
import java.util.concurrent.atomic.AtomicLong;

@Service
public class PaymentService {
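    private final StripeGateway stripeGateway;  // gateway client, assumed to exist
    private final Counter paymentsProcessed;
    private final Counter paymentsFailed;
    private final Timer paymentTimer;
    private final AtomicLong pendingPayments = new AtomicLong(0);

    public PaymentService(StripeGateway stripeGateway, MeterRegistry registry) {
        this.stripeGateway = stripeGateway;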
        // Counter for successful payments
        this.paymentsProcessed = Counter.builder("payments.processed")
            .description("Total payments processed successfully")
            .tag("service", "payment")
            .tag("gateway", "stripe")
            .register(registry);

        // Counter for failed payments
        this.paymentsFailed = Counter.builder("payments.failed")
            .description("Total failed payments")
            .tag("service", "payment")
            .tag("gateway", "stripe")
            .register(registry);

        // Timer for payment processing duration
        this.paymentTimer = Timer.builder("payments.processing.time")
            .description("Time to process payment")
            .tag("service", "payment")
            .tag("gateway", "stripe")
            .publishPercentiles(0.5, 0.95, 0.99)
            .publishPercentileHistogram()
            .register(registry);

        // Gauge for pending payments
        Gauge.builder("payments.pending", pendingPayments, AtomicLong::get)
            .description("Number of payments currently being processed")
            .tag("service", "payment")
            .register(registry);
    }

    public PaymentResult processPayment(PaymentRequest request) {
        pendingPayments.incrementAndGet();

        try {
            PaymentResult result = paymentTimer.record(() -> {
                return stripeGateway.charge(request);
            });

            paymentsProcessed.increment();
            return result;

        } catch (Exception e) {
            paymentsFailed.increment();
            throw new PaymentException("Payment failed", e);
        } finally {
            pendingPayments.decrementAndGet();
        }
    }
}

This gives me:

payments_processed_total: Total successful payments
payments_failed_total: Total failed payments
payments_processing_time_seconds: Duration statistics (p50, p95, p99)
payments_pending: Current number of payments in flight
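The mapping from the dotted names in the code to the Prometheus names in this list follows a simple pattern. Here’s a simplified sketch of it, covering only the three cases in this list; the real naming convention in Micrometer’s Prometheus registry handles more cases (other base units, reserved characters), so treat this as an approximation.

```java
// Simplified approximation of how dotted meter names show up in Prometheus:
// dots become underscores, counters get "_total", timers get "_seconds".
class PrometheusNames {
    enum MeterKind { COUNTER, TIMER, GAUGE }

    static String toPrometheus(String dottedName, MeterKind kind) {
        String base = dottedName.replace('.', '_');
        switch (kind) {
            case COUNTER:
                return base.endsWith("_total") ? base : base + "_total";
            case TIMER:
                return base.endsWith("_seconds") ? base : base + "_seconds";
            default:
                return base;
        }
    }

    public static void main(String[] args) {
        System.out.println(toPrometheus("payments.processed", MeterKind.COUNTER));
        System.out.println(toPrometheus("payments.processing.time", MeterKind.TIMER));
        System.out.println(toPrometheus("payments.pending", MeterKind.GAUGE));
    }
}
```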

Step 6: Test Metrics Locally

I always test metrics before deploying. Here’s my workflow:

Start the Application

./mvnw spring-boot:run

Check Available Metrics

# List all available metrics
curl http://localhost:8080/actuator/metrics

# Get specific metric
curl http://localhost:8080/actuator/metrics/payments.processed

# Check Prometheus format
curl http://localhost:8080/actuator/prometheus | grep payments

Expected Output

# HELP payments_processed_total Total payments processed successfully
# TYPE payments_processed_total counter
payments_processed_total{application="order-service",environment="production",gateway="stripe",service="payment"} 42.0

# HELP payments_processing_time_seconds Time to process payment
# TYPE payments_processing_time_seconds summary
payments_processing_time_seconds_count{application="order-service",gateway="stripe",service="payment"} 42.0
payments_processing_time_seconds_sum{application="order-service",gateway="stripe",service="payment"} 12.34
payments_processing_time_seconds{application="order-service",gateway="stripe",service="payment",quantile="0.5"} 0.25
payments_processing_time_seconds{application="order-service",gateway="stripe",service="payment",quantile="0.95"} 0.89
payments_processing_time_seconds{application="order-service",gateway="stripe",service="payment",quantile="0.99"} 1.23
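When I want to check this output in an automated test instead of with grep, a small parser is enough. The sketch below is a hypothetical helper that works on the response body as a string; it assumes each sample line ends with the value (no trailing timestamp), which matches the output shown above. Fetching the body over HTTP is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical test helper for the text returned by /actuator/prometheus.
class ScrapeFilter {

    // Returns sample lines (not "# HELP" / "# TYPE" comments) whose metric name starts with prefix.
    static List<String> samplesFor(String scrapeBody, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String line : scrapeBody.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#") && trimmed.startsWith(prefix)) {
                matches.add(trimmed);
            }
        }
        return matches;
    }

    // Parses the numeric value at the end of a sample line.
    static double valueOf(String sampleLine) {
        return Double.parseDouble(sampleLine.substring(sampleLine.lastIndexOf(' ') + 1));
    }

    public static void main(String[] args) {
        String body = "# TYPE payments_processed_total counter\n"
            + "payments_processed_total{gateway=\"stripe\"} 42.0\n";
        for (String sample : samplesFor(body, "payments_")) {
            System.out.println(sample + " -> " + valueOf(sample));
        }
    }
}
```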

Common Issues I Encountered

Issue 1: Metric Not Showing Up

Symptoms: Metric doesn’t appear in /actuator/prometheus

Causes I found:

Meter not registered: Call .register(registry)
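Meter never created: a registered Counter or Timer shows up with a value of 0, but a meter that is built lazily inside a code path (for example, registry.counter(...) in a handler) does not exist until that code has run once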
Filtered by management settings

Fix:

// WRONG - Builder is never registered, so no meter exists
Counter.Builder builder = Counter.builder("my.metric")
    .tag("key", "value");

// CORRECT - Registered
Counter counter = Counter.builder("my.metric")
    .tag("key", "value")
    .register(registry);  // <-- Important!

Issue 2: Wrong Metric Type

Symptoms: Using Counter for values that go down

Wrong approach:

// WRONG - Counter only increases
Counter userCount = registry.counter("users.active");
userCount.increment();  // When user logs in
// Can't decrement when user logs out!

Correct approach:

// CORRECT - Gauge can go up and down
// Keep activeUsers in a field: by default the gauge holds only a weak reference to it
AtomicLong activeUsers = new AtomicLong(0);

Gauge.builder("users.active", activeUsers, AtomicLong::get)
    .register(registry);

// Increment/decrement as needed
activeUsers.incrementAndGet();  // User logs in
activeUsers.decrementAndGet();  // User logs out

Issue 3: Prometheus Memory Issues

Symptoms: Prometheus runs out of memory, slow queries

Cause: High-cardinality tags

How to find the problem:

topk(10, count by (__name__) ({__name__=~".+"}))

This lists the 10 metric names with the most time series. A name with far more series than the rest usually points to an unbounded tag such as a user or request ID.

Issue 4: Percentiles Missing

Symptoms: No p95/p99 values in Timer metrics

Fix: Enable percentile publishing

Timer.builder("my.timer")
    .publishPercentiles(0.5, 0.95, 0.99)  // Enable percentiles
    .publishPercentileHistogram()          // For Prometheus histogram
    .register(registry);

Production Considerations

Separate Management Port

I expose metrics on a different port so the management endpoints stay off the public listener:

management:
  server:
    port: 9090
    address: 0.0.0.0

server:
  port: 8080

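The separate port does not restrict access by itself, and binding to 0.0.0.0 listens on every interface. Access to port 9090 still has to be limited with firewall rules or network policy so that only internal monitoring systems can reach the metrics endpoint.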

Disable Unnecessary Metrics

Spring Boot enables many metrics by default. I disable what I don’t need:

management:
  metrics:
    enable:
      jvm: true
      process: true
      tomcat: true
      http: true
      logback: false
      uptime: false

Resource Limits

In Kubernetes, I point the health probes at the management port and set memory requests and limits on the container:

livenessProbe:
  httpGet:
    path: /actuator/health
    port: 9090
  initialDelaySeconds: 30

readinessProbe:
  httpGet:
    path: /actuator/health
    port: 9090
  initialDelaySeconds: 10

resources:
  limits:
    memory: "512Mi"
  requests:
    memory: "256Mi"

Summary

In this post, I covered how to set up and use Micrometer for Spring Boot monitoring with best practices. I started with the problem of needing business-specific metrics beyond the defaults, then walked through the solution using Micrometer’s meter types.

The key practices I follow:

Choose the right meter type: Counter for totals, Gauge for current values, Timer for durations
Follow naming conventions: Use lowercase dot notation and descriptive names
Use tags wisely: Enable filtering without creating cardinality explosion
Set common tags: Add application and environment context to all metrics
Test locally: Verify metrics appear correctly before deploying
Configure for production: Separate management port and resource limits

Micrometer’s integration with Spring Boot Actuator means I can focus on instrumenting my business logic while Micrometer handles translation to Prometheus, Datadog, or any other monitoring backend.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!