How to Export Apple Health Data to InfluxDB for Grafana Dashboards

Mar 23, 2026

Problem

I wanted to visualize my Apple Health data in custom dashboards beyond what the Health app provides. Apple’s Health app has basic charts, but I needed:

Cross-metric correlation (sleep vs. activity, HRV vs. stress)
Long-term trend analysis with advanced queries
ML-based predictions for sleep quality
A single dashboard combining all health metrics

I tried exporting data to InfluxDB and building Grafana dashboards. Then I hit a critical issue that made my queries return wrong values.

The Critical Gotcha

My step count queries were returning “5 steps today” instead of thousands. I spent hours debugging before discovering the problem:

Apple Health exports step counts as per-minute granules, not daily totals.

# What I expected:
2024-01-15, steps: 8500

# What Apple Health actually exports:
2024-01-15 09:00, steps: 15
2024-01-15 09:01, steps: 23
2024-01-15 09:02, steps: 8
... (1440 rows per day)

When I used mean() aggregation, I got the average of per-minute values, not the total. The fix:

// WRONG: Returns ~5 steps (average per minute)
from(bucket: "apple_health")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "steps")
  |> mean()  // This averages per-minute granules

// CORRECT: Returns ~8500 steps (sum of all granules)
from(bucket: "apple_health")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "steps")
  |> sum()  // Sum all per-minute values

Use sum() for cumulative metrics (steps, distance, calories). Use mean() for point-in-time metrics (heart rate, blood oxygen).
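
To make the difference concrete, here is a minimal Python sketch of the same arithmetic, outside InfluxDB. The granule values are the three made-up rows from the example above, not a real export, and the variable names are only illustrative.

```python
from statistics import mean

# Hypothetical per-minute step granules (same values as the example rows above).
granules = [
    ("2024-01-15 09:00", 15),
    ("2024-01-15 09:01", 23),
    ("2024-01-15 09:02", 8),
]

values = [steps for _, steps in granules]

total_steps = sum(values)      # what sum() reports: steps actually taken
per_minute_avg = mean(values)  # what mean() reports: size of an average granule

print(total_steps)               # 46
print(round(per_minute_avg, 1))  # 15.3
```

Over a full day the total grows with every granule while the average stays in the same small range, which is why the mean() query looked so wrong.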

Solution Architecture

I set up this pipeline:

iPhone Health App
       |
       v
Local Webhook Server (Python/Flask)
       |
       v
InfluxDB (Time-series database)
       |
       v
Grafana Dashboards + ML Pipeline

Setting Up the Webhook Server

I created a local webhook server to receive Apple Health data:

from flask import Flask, request, jsonify
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

app = Flask(__name__)

# InfluxDB configuration
client = InfluxDBClient(
    url="http://localhost:8086",
    token="your-token",
    org="health"
)
write_api = client.write_api(write_options=SYNCHRONOUS)

@app.route('/health-sync', methods=['POST'])
def health_sync():
    """Receive Apple Health data from webhook"""
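    # silent=True returns None instead of raising on a missing or malformed JSON body
    data = request.get_json(silent=True) or {}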

    points = []
    for record in data.get('data', []):
        # Determine aggregation type based on metric
        metric_type = get_metric_type(record['type'])

        point = Point(record['type']) \
            .tag("source", record.get('source', 'iphone')) \
            .tag("metric_type", metric_type) \
            .field("value", float(record['value'])) \
            .time(record['timestamp'])

        points.append(point)

    # Batch write to InfluxDB (skipped when the payload had no records)
    if points:
        write_api.write(bucket="apple_health", record=points)

    return jsonify({"status": "success", "count": len(points)})

def get_metric_type(metric_name: str) -> str:
    """Determine if metric is cumulative or instantaneous"""
    cumulative = ['steps', 'distance', 'active_energy', 'flights_climbed']
    return 'cumulative' if metric_name in cumulative else 'instantaneous'

if __name__ == '__main__':
    # ssl_context='adhoc' serves a self-signed certificate and needs the cryptography package
    app.run(host='0.0.0.0', port=5000, ssl_context='adhoc')
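
The handler above trusts every record it receives. As a rough sketch of how the payload could be checked before any points are built, the helper below drops records that lack a key or carry a non-numeric value. The name clean_records and the sample payload are my own assumptions for illustration; the payload shape simply mirrors the keys the handler reads (data, type, value, timestamp).

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("type", "value", "timestamp")

def clean_records(payload: Any) -> List[Dict[str, Any]]:
    """Return only records that have every required key and a numeric value."""
    if not isinstance(payload, dict):
        return []
    cleaned = []
    for record in payload.get("data") or []:
        if not isinstance(record, dict):
            continue
        if any(key not in record for key in REQUIRED_KEYS):
            continue
        try:
            value = float(record["value"])
        except (TypeError, ValueError):
            continue  # skip values that cannot be stored as a number
        cleaned.append({**record, "value": value})
    return cleaned

# Hypothetical payload: one valid record, one bad value, one missing value.
sample = {
    "data": [
        {"type": "steps", "value": 15, "timestamp": "2024-01-15T09:00:00Z"},
        {"type": "steps", "value": "n/a", "timestamp": "2024-01-15T09:01:00Z"},
        {"type": "heart_rate", "timestamp": "2024-01-15T09:02:00Z"},
    ]
}

print(len(clean_records(sample)))  # 1
```

Calling something like this at the top of the route would let one bad record be skipped instead of failing the whole batch.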

InfluxDB Bucket Setup

I created an InfluxDB bucket with appropriate retention:

# Create bucket with 2-year retention
influx bucket create \
  --name apple_health \
  --org health \
  --retention 17520h  # 2 years

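# Create token with write access
# Note: --write-bucket expects the bucket ID (listed by `influx bucket list`), not the bucket name
influx auth create \
  --org health \
  --write-bucket YOUR_BUCKET_ID \
  --description "Apple Health sync token"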

Flux Queries for Grafana

Here are the queries I use in my Grafana dashboards:

Daily Steps Query

from(bucket: "apple_health")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "steps")
  |> filter(fn: (r) => r._field == "value")
  |> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
  |> yield(name: "daily_steps")

Heart Rate (use mean, not sum)

from(bucket: "apple_health")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "heart_rate")
  |> filter(fn: (r) => r._field == "value")
  |> aggregateWindow(every: 5m, fn: mean, createEmpty: false)
  |> yield(name: "heart_rate")

Sleep Analysis

from(bucket: "apple_health")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "sleep_analysis")
  |> filter(fn: (r) => r._field == "value")
  |> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
  |> yield(name: "sleep_hours")

HRV Trend

from(bucket: "apple_health")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r._measurement == "heart_rate_variability")
  |> filter(fn: (r) => r._field == "value")
  |> aggregateWindow(every: 1d, fn: mean, createEmpty: false)
  |> movingAverage(n: 7)  // 7-day moving average
  |> yield(name: "hrv_trend")
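
For readers who want to see what the 7-day smoothing does to the numbers, here is a small Python sketch of a trailing moving average. It reflects my understanding that movingAverage(n: 7) emits its first value only once seven inputs are available; the HRV values are made up.

```python
def moving_average(values, n):
    """Mean of each run of n consecutive values; the first n-1 inputs yield no output."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [
        sum(values[i - n + 1 : i + 1]) / n
        for i in range(n - 1, len(values))
    ]

# Hypothetical daily HRV means in milliseconds; the last day is a sharp drop.
daily_hrv = [52.0, 48.0, 50.0, 47.0, 55.0, 51.0, 47.0, 38.0]

trend = moving_average(daily_hrv, 7)
print(trend)  # [50.0, 48.0]
```

Eight days of input produce only two trend points, and the one-day drop to 38 moves the trend by just 2 ms. The smoothing makes the panel easier to read but also delays the visibility of sudden changes.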

Grafana Dashboard Panels

I created 6 dashboards:

Sleep Dashboard - Sleep stages, duration, quality score
HRV Dashboard - Heart rate variability trends, stress indicators
Heart Rate Dashboard - Resting HR, exercise HR, zones
VO2 Max Dashboard - Cardio fitness trends
Activity Dashboard - Steps, distance, active energy, stand hours
SpO2 Dashboard - Blood oxygen levels

Here’s a sample panel configuration:

{
  "title": "Daily Steps",
  "type": "stat",
  "targets": [
    {
      "query": "from(bucket: \"apple_health\")\n  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n  |> filter(fn: (r) => r._measurement == \"steps\")\n  |> aggregateWindow(every: 1d, fn: sum)",
      "refId": "A"
    }
  ],
  "options": {
    "graphMode": "area",
    "colorMode": "value"
  },
  "fieldConfig": {
    "defaults": {
      "unit": "short",
      "thresholds": {
        "mode": "absolute",
        "steps": [
          {"color": "red", "value": 0},
          {"color": "yellow", "value": 5000},
          {"color": "green", "value": 10000}
        ]
      }
    }
  }
}
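
The thresholds in that panel can be read as "use the color of the highest step the value has reached". The sketch below expresses that rule in Python so the cutoffs are easy to check; threshold_color is a hypothetical helper of mine, not part of Grafana.

```python
# Same thresholds as the panel JSON above.
steps_thresholds = [
    {"color": "red", "value": 0},
    {"color": "yellow", "value": 5000},
    {"color": "green", "value": 10000},
]

def threshold_color(value, thresholds):
    """Return the color of the highest threshold step that value has reached."""
    ordered = sorted(thresholds, key=lambda step: step["value"])
    color = ordered[0]["color"]  # base color for anything below the first step
    for step in ordered:
        if value >= step["value"]:
            color = step["color"]
    return color

print(threshold_color(300, steps_thresholds))    # red
print(threshold_color(8500, steps_thresholds))   # yellow
print(threshold_color(12000, steps_thresholds))  # green
```

Note that a typical 8,500-step day shows up yellow with these cutoffs; green needs the full 10,000.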

ML Pipeline for Predictions

I added a RandomForest model for sleep quality prediction:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from influxdb_client import InfluxDBClient
import joblib

class SleepPredictor:
    def __init__(self, influx_url: str, token: str, org: str):
        self.client = InfluxDBClient(url=influx_url, token=token, org=org)
        self.model = None

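    def fetch_training_data(self, days: int = 90):
        """Fetch daily aggregates for the last `days` days (default 90)"""
        query = f'''
        base = from(bucket: "apple_health")
          |> range(start: -{days}d)
          |> filter(fn: (r) => r._field == "value")
          |> toFloat()

        // Cumulative metrics and sleep hours are summed per day.
        // sleep_analysis is included because train_model() derives the label from it.
        totals = base
          |> filter(fn: (r) =>
            r._measurement == "steps" or
            r._measurement == "active_energy" or
            r._measurement == "sleep_analysis"
          )
          |> aggregateWindow(every: 1d, fn: sum, createEmpty: false)

        // Heart rate is a point-in-time metric, so it is averaged, not summed
        averages = base
          |> filter(fn: (r) => r._measurement == "heart_rate")
          |> aggregateWindow(every: 1d, fn: mean, createEmpty: false)

        union(tables: [totals, averages])
        '''

        query_api = self.client.query_api()
        result = query_api.query_data_frame(query)
        # query_data_frame returns a list of DataFrames when table schemas differ
        if isinstance(result, list):
            result = pd.concat(result, ignore_index=True)
        return result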

    def train_model(self):
        """Train sleep quality predictor"""
        df = self.fetch_training_data()

        # Feature engineering
        features = df.pivot_table(
            index='_time',
            columns='_measurement',
            values='_value'
        ).fillna(0)

        features['sleep_quality'] = features['sleep_analysis'].apply(
            lambda x: 1 if x >= 7 else 0  # Binary: good sleep >= 7 hours
        )

        X = features[['steps', 'heart_rate', 'active_energy']]
        y = features['sleep_quality']

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42
        )

        self.model = RandomForestClassifier(n_estimators=100)
        self.model.fit(X_train, y_train)

        # Save model
        joblib.dump(self.model, 'sleep_model.joblib')

        return self.model.score(X_test, y_test)

# Cron job: Retrain every Sunday at 3 AM
# 0 3 * * 0 /usr/bin/python3 /path/to/sleep-predictor.py --train
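
The feature engineering step is easier to follow on a tiny example. The sketch below rebuilds the same idea in plain Python: pivot long-format rows into one feature row per day, and label a day 1 when sleep reached 7 hours. The rows are made up, build_dataset is a hypothetical helper, and unlike the pivot_table version above it skips days with no sleep record instead of treating them as 0 hours.

```python
from collections import defaultdict

# Hypothetical rows shaped like the query output: (_time, _measurement, _value).
rows = [
    ("2024-01-15", "steps", 8500.0),
    ("2024-01-15", "heart_rate", 62.0),
    ("2024-01-15", "active_energy", 450.0),
    ("2024-01-15", "sleep_analysis", 7.5),
    ("2024-01-16", "steps", 3200.0),
    ("2024-01-16", "heart_rate", 70.0),
    ("2024-01-16", "active_energy", 180.0),
    ("2024-01-16", "sleep_analysis", 5.5),
    ("2024-01-17", "steps", 6100.0),  # no sleep record for this day
]

FEATURES = ("steps", "heart_rate", "active_energy")

def build_dataset(rows, good_sleep_hours=7.0):
    """Pivot rows into per-day feature vectors X and binary sleep labels y."""
    by_day = defaultdict(dict)
    for day, measurement, value in rows:
        by_day[day][measurement] = value

    X, y = [], []
    for day in sorted(by_day):
        metrics = by_day[day]
        if "sleep_analysis" not in metrics:
            continue  # assumption: a day without a label is dropped, not counted as bad sleep
        X.append([metrics.get(name, 0.0) for name in FEATURES])
        y.append(1 if metrics["sleep_analysis"] >= good_sleep_hours else 0)
    return X, y

X, y = build_dataset(rows)
print(X)  # [[8500.0, 62.0, 450.0], [3200.0, 70.0, 180.0]]
print(y)  # [1, 0]
```

Dropping unlabeled days is a design choice on my part: filling missing sleep with 0 would teach the model that every night the watch was off was a bad night.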

Automation with Cron

I set up automated tasks:

# Sync Apple Health data every 5 minutes
*/5 * * * * /usr/bin/python3 /opt/health/sync.py

# Retrain ML model weekly (Sunday 3 AM)
0 3 * * 0 /usr/bin/python3 /opt/health/train_model.py

# Daily backup to S3
0 2 * * * /usr/bin/influx backup /tmp/backup && aws s3 sync /tmp/backup s3://my-bucket/health-backup/

Common Mistakes

Mistake 1: Using mean() for Cumulative Metrics

// WRONG: Returns average per-minute value (~5)
|> aggregateWindow(every: 1d, fn: mean)

// CORRECT: Returns total daily value (~8500)
|> aggregateWindow(every: 1d, fn: sum)

Mistake 2: Not Handling Missing Data

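// WRONG: Days without data come back as null, leaving gaps in the visualization
|> aggregateWindow(every: 1d, fn: mean)

// CORRECT: Keep the empty windows, then carry the last value forward.
// With createEmpty: false there would be no empty rows left for fill() to fill.
|> aggregateWindow(every: 1d, fn: mean, createEmpty: true)
|> fill(usePrevious: true)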
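
It helps to be clear about what fill(usePrevious: true) does: it repeats the last known value rather than interpolating between neighbors. Here is a minimal Python sketch of that forward-fill behavior, with made-up heart rate values and a hypothetical fill_previous helper.

```python
def fill_previous(values):
    """Replace each None with the most recent non-None value seen so far."""
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)  # stays None until a first real value appears
    return filled

# Hypothetical daily mean heart rate; None marks days with no readings.
daily_hr = [61.0, None, None, 64.0, None]

print(fill_previous(daily_hr))  # [61.0, 61.0, 61.0, 64.0, 64.0]
```

The filled days are flat copies, so a long gap shows up as a plateau in the panel. For metrics where that would be misleading, leaving the gap visible may be the more honest choice.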

Mistake 3: Wrong Timezone

// WRONG: Daily windows are cut at UTC midnight, so evening data lands on the wrong day
|> aggregateWindow(every: 1d, fn: sum)

// CORRECT: Use local timezone (the import and option go at the top of the query)
import "timezone"
option location = timezone.location(name: "America/New_York")
|> aggregateWindow(every: 1d, fn: sum)
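
The timezone problem is easy to reproduce with a single timestamp. In the sketch below, an evening walk is stored in UTC and then assigned to a calendar day twice, once in UTC and once in local time. I use a fixed UTC-5 offset as a stand-in for America/New_York in winter; that is a simplification, since the real zone shifts with daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-5 offset approximating America/New_York in January.
LOCAL = timezone(timedelta(hours=-5))

# A walk at 21:30 local time on Jan 15 is stored as 02:30 UTC on Jan 16.
stored_utc = datetime(2024, 1, 16, 2, 30, tzinfo=timezone.utc)

utc_day = stored_utc.date().isoformat()
local_day = stored_utc.astimezone(LOCAL).date().isoformat()

print(utc_day)    # 2024-01-16
print(local_day)  # 2024-01-15
```

With UTC day boundaries those steps are counted toward the 16th, which is why evening activity can make one day look too low and the next too high.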

Why This Matters

Building this pipeline gave me:

Holistic Health View - Correlations between sleep, activity, HRV on one screen
Proactive Health Management - Anomaly detection before issues become problems
ML Predictions - Predict sleep quality from daily activity patterns
Privacy - All data stays on my local infrastructure

The most valuable insight was discovering that my HRV drops significantly 2 days before I get sick, giving me early warning to rest.

Summary

In this post, I showed how to export Apple Health data to InfluxDB for custom Grafana dashboards. The critical gotcha is using sum() aggregation for cumulative metrics like steps, distance, and calories, not mean(). Apple Health exports per-minute granules, so you must aggregate correctly or your queries will return wrong values.

The complete pipeline includes: a local webhook server for data sync, InfluxDB for time-series storage, Grafana for visualization, and an ML model for predictions. With proper aggregation, you get accurate health dashboards that reveal patterns invisible in the Health app.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit: I gave my home a brain. Here's what 50 days of self-hosted AI looks like
👨‍💻 InfluxDB Documentation
👨‍💻 Grafana Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!