What Monitoring Tools Do Spring Boot Developers Use in Production?
I deployed my first Spring Boot application to production without any monitoring. Three weeks later, it crashed at 3 AM on a Sunday. I had no idea why. No logs, no metrics, no clue. After that incident, I promised myself: never again.
When I started researching monitoring tools, I found dozens of options. Datadog looked slick but expensive. New Relic seemed comprehensive but complex. Then I kept seeing the same combination mentioned in forums and Reddit threads: Prometheus and Grafana.
Let me share what I learned setting up production monitoring for Spring Boot applications.
The Problem: Flying Blind in Production
Before adding monitoring, I was essentially flying blind. When users reported slowness, I could only guess. Was it the database? The API? Memory issues? CPU spikes? I had no data to diagnose problems.
What I needed was:
- Metrics: Numbers I could track over time (request latency, error rates, memory usage)
- Visualization: Graphs and dashboards to spot trends
- Alerts: Notifications when something goes wrong
The Standard Stack: Prometheus + Grafana + Micrometer
After reading through numerous discussions, one pattern emerged consistently. The de facto standard for Spring Boot monitoring is:
```text
┌────────────────────────────────────────────────────┐
│ Spring Boot Application                            │
│  ┌──────────────────────────────────────────────┐  │
│  │ Spring Boot Actuator                         │  │
│  │  ┌────────────────────────────────────────┐  │  │
│  │  │ Micrometer                             │  │  │
│  │  │ - Counters, Timers, Gauges             │  │  │
│  │  │ - Vendor-neutral metrics facade        │  │  │
│  │  │ - Prometheus registry integration      │  │  │
│  │  └────────────────────────────────────────┘  │  │
│  └──────────────────────────────────────────────┘  │
│                         ↓                          │
│                /actuator/prometheus                │
└────────────────────────────────────────────────────┘
                          ↓ Scrape
┌────────────────────────────────────────────────────┐
│ Prometheus                                         │
│ - Time-series database                             │
│ - Pull-based metrics collection                    │
│ - PromQL query language                            │
│ - Alertmanager integration                         │
└────────────────────────────────────────────────────┘
                          ↓ Query
┌────────────────────────────────────────────────────┐
│ Grafana                                            │
│ - Visualization dashboards                         │
│ - Alerting and notifications                       │
│ - Multi-data source support                        │
│ - Community dashboard sharing                      │
└────────────────────────────────────────────────────┘
```

Why this combination? Spring Boot supports it natively through Actuator and Micrometer, so the integration requires minimal code.
Step 1: Add Dependencies
First, I added the necessary dependencies to my pom.xml:
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>
```

That’s it. Two dependencies. No code changes yet.
Step 2: Expose the Prometheus Endpoint
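By default, most Actuator endpoints are not exposed over HTTP, and the Prometheus endpoint is one of them. I configured application.yml to expose it:

```yaml
management:
  endpoints:
    web:
      exposure:
        include: prometheus,health,info,metrics
  metrics:
    export:
      prometheus:
        # Spring Boot 2.x property name. In Spring Boot 3.x it moved to
        # management.prometheus.metrics.export.enabled (true by default in both).
        enabled: true
```

After restarting the application, I navigated to http://localhost:8080/actuator/prometheus. A wall of text appeared: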
```text
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="PS Eden Space",} 1.2345678E7
jvm_memory_used_bytes{area="heap",id="PS Survivor Space",} 123456.0
...
# HELP http_server_requests_seconds
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{exception="None",method="GET",outcome="SUCCESS",status="200",uri="/api/users",} 1523.0
...
```

This was exciting. Without writing any instrumentation code, I had access to JVM memory metrics, HTTP request timings, and more.
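The wall of text is easier to read once you see that every sample line has the same shape: a metric name, an optional set of labels in braces, and a numeric value. The sketch below is only an illustration of that shape, not how Prometheus itself parses the format. The class name PromSample is hypothetical, and the parser assumes label values contain no commas or escaped quotes and that lines carry no timestamp.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative parser for a single sample line of the Prometheus text format. */
final class PromSample {
    final String name;
    final Map<String, String> labels;
    final double value;

    private PromSample(String name, Map<String, String> labels, double value) {
        this.name = name;
        this.labels = labels;
        this.value = value;
    }

    /** Parses a line such as: metric_name{key="value",} 12.0 */
    static PromSample parse(String line) {
        Map<String, String> labels = new LinkedHashMap<>();
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        String name;
        String valuePart;
        if (open >= 0 && close > open) {
            name = line.substring(0, open);
            valuePart = line.substring(close + 1).trim();
            // Simplifying assumption: no commas inside label values.
            for (String pair : line.substring(open + 1, close).split(",")) {
                if (pair.isEmpty()) {
                    continue; // tolerate stray empty segments
                }
                int eq = pair.indexOf('=');
                String key = pair.substring(0, eq).trim();
                String quoted = pair.substring(eq + 1).trim();
                labels.put(key, quoted.substring(1, quoted.length() - 1)); // strip the quotes
            }
        } else {
            // A sample without labels: "name value"
            int space = line.indexOf(' ');
            name = line.substring(0, space);
            valuePart = line.substring(space + 1).trim();
        }
        return new PromSample(name, labels, Double.parseDouble(valuePart));
    }
}
```

Parsing the first jvm_memory_used_bytes line from the output above yields the name jvm_memory_used_bytes, the labels area=heap and id=PS Eden Space, and the value 1.2345678E7. Each distinct combination of name and labels is one time series in Prometheus.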
Step 3: Add Common Tags
To identify metrics from different applications, I added common tags:
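```java
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }

    // Adds an "application" tag to every meter registered with the registry.
    @Bean
    MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
        return registry -> registry.config()
                .commonTags("application", "my-spring-boot-app");
    }
}
```

This tag appears on every metric, making it easy to filter in Grafana.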
Step 4: Custom Business Metrics
Out-of-the-box metrics are useful, but I also needed to track business-specific data. For example, tracking orders created:
```java
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.springframework.stereotype.Service;

@Service
public class OrderService {
    private final Counter orderCounter;
    private final Timer orderProcessingTime;

    public OrderService(MeterRegistry registry) {
        this.orderCounter = Counter.builder("orders.created")
                .description("Total orders created")
                .tag("type", "online")
                .register(registry);

        this.orderProcessingTime = Timer.builder("orders.processing.time")
                .description("Order processing duration")
                .publishPercentiles(0.5, 0.95, 0.99)
                .register(registry);
    }

    public Order createOrder(OrderRequest request) {
        // record(Supplier) times the lambda and returns its result.
        return orderProcessingTime.record(() -> {
            Order order = processOrder(request);
            orderCounter.increment();
            return order;
        });
    }

    private Order processOrder(OrderRequest request) {
        // ... order processing logic
    }
}
```

The Timer automatically tracks how long order processing takes and publishes percentiles (p50, p95, p99). The Counter tracks the total number of orders. On the Prometheus endpoint, Micrometer's naming convention exposes them as orders_created_total and orders_processing_time_seconds.
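To make the percentile labels concrete, here is a minimal sketch of what p50, p95, and p99 mean, using the simple nearest-rank definition over a batch of recorded durations. This is an illustration only: Micrometer computes its published percentiles with its own streaming approximation rather than by sorting all samples, and the class name PercentileSketch and the sample values are hypothetical.

```java
import java.util.Arrays;

/** Illustrative nearest-rank percentile over a fixed batch of durations. */
final class PercentileSketch {

    /**
     * Returns the smallest sample such that at least a fraction q of all
     * samples are less than or equal to it (nearest-rank definition).
     */
    static double percentile(double[] samplesMillis, double q) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples recorded");
        }
        double[] sorted = samplesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.length); // 1-based rank
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

For ten hypothetical durations of 11, 12, 13, 14, 15, 16, 17, 18, 250, and 900 ms, the median is 15 ms while p95 and p99 are both 900 ms. The mean (126.6 ms) describes neither the typical request nor the slow ones, which is why latency dashboards tend to plot percentiles.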
Step 5: Configure Prometheus
I created a prometheus.yml configuration file:
```yaml
scrape_configs:
  - job_name: 'spring-boot-apps'
    metrics_path: '/actuator/prometheus'
    static_configs:
      - targets:
          - 'app1:8080'
          - 'app2:8080'
    scrape_interval: 15s
```

Prometheus “scrapes” metrics every 15 seconds by hitting the /actuator/prometheus endpoint. This pull model means my application doesn’t need to know about Prometheus.
Step 6: Run Everything with Docker Compose
For local development, I created a docker-compose.yml:
```yaml
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-storage:/var/lib/grafana

volumes:
  grafana-storage:
```

Running docker-compose up -d started both services. Prometheus was available at http://localhost:9090 and Grafana at http://localhost:3000.
Step 7: Import a Grafana Dashboard
I didn’t want to build dashboards from scratch. Thankfully, the community has created dozens of pre-built dashboards. I imported dashboard ID 4701 (JVM Micrometer) into Grafana.
The result was immediate. I had graphs for:
- JVM heap memory usage
- HTTP request rates and latency
- Garbage collection pauses
- Thread counts
- CPU usage
Setting Up Alerts
Dashboards are nice, but I needed alerts for when things go wrong. I added alerting rules to Prometheus, in a rules file that prometheus.yml references under rule_files:

```yaml
groups:
  - name: spring-boot-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_server_requests_seconds_count{status=~"5.."}[5m]))
          /
          sum(rate(http_server_requests_seconds_count[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate detected"

      - alert: HighMemoryUsage
        expr: |
          jvm_memory_used_bytes{area="heap"}
          /
          jvm_memory_max_bytes{area="heap"} > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "JVM heap memory usage above 90%"
```

The first alert fires when the share of 5xx responses stays above 5% for 5 minutes. The second warns when a heap memory pool stays above 90% of its maximum for 5 minutes. Because the expression is evaluated per series, it checks each heap pool (each id label) separately rather than the heap as a whole.
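The HighErrorRate expression is just a ratio of two per-second rates, each derived from a counter that only goes up. The sketch below works through that arithmetic with hypothetical numbers. It is a simplification: the real rate() function in PromQL also handles counter resets and extrapolates to the window edges, and the class name ErrorRatioSketch is made up for this example.

```java
/** Illustrative arithmetic behind a "5xx share of all requests" alert. */
final class ErrorRatioSketch {

    /** Average per-second increase of a counter between two readings. */
    static double ratePerSecond(double earlier, double later, double windowSeconds) {
        return (later - earlier) / windowSeconds;
    }

    /** Share of requests that failed within the window, from two counter pairs. */
    static double errorRatio(double errorsEarlier, double errorsLater,
                             double totalEarlier, double totalLater,
                             double windowSeconds) {
        double errorRate = ratePerSecond(errorsEarlier, errorsLater, windowSeconds);
        double totalRate = ratePerSecond(totalEarlier, totalLater, windowSeconds);
        // Design choice for this sketch: treat "no traffic" as a ratio of 0.
        return totalRate == 0 ? 0 : errorRate / totalRate;
    }
}
```

With a 300-second window in which the total request counter grows from 10,000 to 13,000 and the 5xx counter grows from 40 to 220, the request rate is 10 per second, the error rate is 0.6 per second, and the ratio is about 0.06. That is above the 0.05 threshold, so the alert would fire once the condition has held for the full 5 minutes.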
What About OpenTelemetry?
While setting this up, I also explored OpenTelemetry (OTel). It’s newer and provides unified observability across metrics, traces, and logs.
For new projects requiring distributed tracing, OTel might be the better choice:
```xml
<dependency>
    <groupId>io.opentelemetry.instrumentation</groupId>
    <artifactId>opentelemetry-spring-boot-starter</artifactId>
</dependency>
```

However, for my needs, which were primarily metrics, the Prometheus + Grafana stack was simpler and more mature. The community support is extensive, and I found answers to every question I had.
Lessons Learned
After running this stack in production for several months, here’s what I learned:
- Start simple. Actuator + Micrometer gives you 80% of what you need with zero code.
- Be careful with cardinality. High-cardinality tags (like user IDs) can explode your metrics storage. I made this mistake initially and Prometheus ran out of memory.
- Tune scrape intervals. 15 seconds works for most cases. Scrape too frequently and you’ll stress Prometheus.
- Use recording rules for complex queries. Pre-compute expensive PromQL queries rather than running them on every dashboard refresh.
- Import community dashboards first. Don’t reinvent the wheel. Customize only after you understand what’s available.
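The cardinality and scrape-interval lessons come down to simple multiplication. Here is a rough sketch of it: every distinct combination of tag values is its own time series, and every series gains one sample per scrape. The tag counts are hypothetical, the product is an upper bound (not every combination occurs in practice), and CardinalitySketch is a made-up name.

```java
import java.util.Map;

/** Back-of-the-envelope estimate of series and sample counts for one metric. */
final class CardinalitySketch {

    /** Upper bound on time series: the product of distinct values per tag. */
    static long maxSeries(Map<String, Integer> distinctValuesPerTag) {
        long series = 1;
        for (int distinctValues : distinctValuesPerTag.values()) {
            series *= distinctValues;
        }
        return series;
    }

    /** Samples stored per day when every series is scraped at a fixed interval. */
    static long samplesPerDay(long series, int scrapeIntervalSeconds) {
        long scrapesPerDay = 86_400L / scrapeIntervalSeconds;
        return series * scrapesPerDay;
    }
}
```

With 4 HTTP methods, 5 status codes, and 20 URIs, one metric can produce at most 400 series, or 2,304,000 samples per day at a 15-second interval. Add a user ID tag with 10,000 distinct values and the bound becomes 4,000,000 series for that single metric, which is the kind of growth that exhausts Prometheus memory.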
When to Choose What
| Stack | Best For |
|---|---|
| Prometheus + Grafana | Battle-tested metrics monitoring, Kubernetes environments, teams familiar with PromQL |
| OpenTelemetry | New projects needing unified traces/metrics/logs, vendor flexibility, distributed systems |
| Commercial (Datadog, New Relic) | Teams wanting managed solutions, all-in-one observability, budget available |
Summary
In this post, I shared my journey from zero monitoring to a complete production monitoring setup for Spring Boot applications. The Prometheus + Grafana + Micrometer stack has become the de facto standard for good reasons: native Spring Boot integration, extensive community support, and proven reliability.
If you’re starting from scratch, here’s the quickest path:
- Add the spring-boot-starter-actuator and micrometer-registry-prometheus dependencies
- Configure application.yml to expose the Prometheus endpoint
- Run Prometheus and Grafana via Docker Compose
- Import a community dashboard (ID: 4701)
- Set up alerts for critical metrics
The whole setup took me about an hour the first time. Now I can deploy any Spring Boot application with confidence, knowing I’ll have visibility into its behavior.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- Spring Boot Actuator Documentation
- Micrometer Documentation
- Prometheus Configuration
- Grafana Spring Boot Dashboard
- OpenTelemetry Java
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!