
How to Export Apple Health Data to InfluxDB for Grafana Dashboards

Problem

I wanted to visualize my Apple Health data in custom dashboards beyond what the Health app provides. Apple’s Health app has basic charts, but I needed:

  • Cross-metric correlation (sleep vs. activity, HRV vs. stress)
  • Long-term trend analysis with advanced queries
  • ML-based predictions for sleep quality
  • A single dashboard combining all health metrics

I tried exporting data to InfluxDB and building Grafana dashboards. Then I hit a critical issue that made my queries return wrong values.

The Critical Gotcha

My step count queries were returning “5 steps today” instead of thousands. I spent hours debugging before discovering the problem:

Apple Health exports step counts as per-minute granules, not daily totals.

apple-health-data-structure.txt
# What I expected:
2024-01-15, steps: 8500
# What Apple Health actually exports:
2024-01-15 09:00, steps: 15
2024-01-15 09:01, steps: 23
2024-01-15 09:02, steps: 8
... (1440 rows per day)

When I used mean() aggregation, I got the average of the per-minute values, not the total. Here is the query that went wrong, followed by the fix:

wrong-steps-query.flux
// WRONG: Returns ~5 steps (average per minute)
from(bucket: "apple_health")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "steps")
|> mean() // This averages per-minute granules

correct-steps-query.flux
// CORRECT: Returns ~8500 steps (sum of all granules)
from(bucket: "apple_health")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "steps")
|> sum() // Sum all per-minute values

Use sum() for cumulative metrics (steps, distance, calories). Use mean() for point-in-time metrics (heart rate, blood oxygen).
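
To make the difference concrete, here is a minimal sketch in plain Python. The sample values are made up (500 active minutes at 17 steps each) and only chosen so the day adds up to 8500 steps across 1440 per-minute rows.

```python
from statistics import mean

# Hypothetical per-minute step samples for one day: 1440 rows, mostly zeros.
# 500 active minutes at 17 steps each gives a daily total of 8500.
per_minute_steps = [17] * 500 + [0] * 940

daily_total = sum(per_minute_steps)      # what sum() reports
per_minute_avg = mean(per_minute_steps)  # what mean() reports

print(f"sum:  {daily_total}")            # the daily step count
print(f"mean: {per_minute_avg:.1f}")     # average steps per minute, not a total
```

The mean lands between 5 and 6, which matches the suspiciously small number the dashboard showed.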

Solution Architecture

I set up this pipeline:

architecture.txt
iPhone Health App
|
v
Local Webhook Server (Python/Flask)
|
v
InfluxDB (Time-series database)
|
v
Grafana Dashboards + ML Pipeline

Setting Up the Webhook Server

I created a local webhook server to receive Apple Health data:

health-webhook-server.py
from flask import Flask, request, jsonify
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

app = Flask(__name__)

# InfluxDB configuration
client = InfluxDBClient(
    url="http://localhost:8086",
    token="your-token",
    org="health"
)
write_api = client.write_api(write_options=SYNCHRONOUS)


@app.route('/health-sync', methods=['POST'])
def health_sync():
    """Receive Apple Health data from webhook"""
    # silent=True returns None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    points = []
    for record in data.get('data', []):
        # Determine aggregation type based on metric
        metric_type = get_metric_type(record['type'])
        point = (
            Point(record['type'])
            .tag("source", record.get('source', 'iphone'))
            .tag("metric_type", metric_type)
            # Cast to float so a field never mixes integer and float writes
            .field("value", float(record['value']))
            .time(record['timestamp'])
        )
        points.append(point)
    # Batch write to InfluxDB
    write_api.write(bucket="apple_health", record=points)
    return jsonify({"status": "success", "count": len(points)})


def get_metric_type(metric_name: str) -> str:
    """Determine if metric is cumulative or instantaneous"""
    cumulative = ['steps', 'distance', 'active_energy', 'flights_climbed']
    return 'cumulative' if metric_name in cumulative else 'instantaneous'


if __name__ == '__main__':
    # ssl_context='adhoc' serves a self-signed certificate (needs the cryptography package)
    app.run(host='0.0.0.0', port=5000, ssl_context='adhoc')
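
The handler reads a top-level "data" list whose records carry "type", "value", "timestamp", and an optional "source". A small check before building points can keep one malformed record from failing the whole batch. The sketch below is only an illustration: validate_record is a hypothetical helper, and the ISO 8601 timestamp format is an assumption about what the sender produces.

```python
from datetime import datetime

REQUIRED_KEYS = ("type", "value", "timestamp")

def validate_record(record: dict) -> bool:
    """Return True if a record has the keys and types the handler relies on."""
    if not all(key in record for key in REQUIRED_KEYS):
        return False
    value = record["value"]
    # bool is a subclass of int, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    try:
        # Assumes ISO 8601 timestamps such as "2024-01-15T09:00:00+00:00"
        datetime.fromisoformat(record["timestamp"])
    except (TypeError, ValueError):
        return False
    return True

# Made-up payload in the shape the handler expects
payload = {
    "data": [
        {"type": "steps", "value": 15, "timestamp": "2024-01-15T09:00:00+00:00", "source": "iphone"},
        {"type": "steps", "value": "n/a", "timestamp": "2024-01-15T09:01:00+00:00"},
        {"type": "heart_rate", "timestamp": "2024-01-15T09:01:00+00:00"},
    ]
}

valid = [r for r in payload["data"] if validate_record(r)]
print(f"{len(valid)} of {len(payload['data'])} records are valid")
```

In the handler, the same filter could run over data.get('data', []) before the loop that builds the points.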

InfluxDB Bucket Setup

I created an InfluxDB bucket with appropriate retention:

influxdb-setup.sh
# Create bucket with 2-year retention
influx bucket create \
--name apple_health \
--org health \
--retention 17520h # 2 years
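# Create token with write access
# --write-bucket expects the bucket ID, not the bucket name; look it up with:
#   influx bucket list --org health --name apple_health
# BUCKET_ID is a placeholder for the ID printed by that command
influx auth create \
--org health \
--write-bucket "$BUCKET_ID" \
--description "Apple Health sync token"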
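
The retention value is plain arithmetic and can be checked with a few lines of Python. This sketch assumes a year of 365 days and ignores leap days.

```python
from datetime import timedelta

# Assumes a "year" of 365 days; leap days are ignored.
retention = timedelta(days=2 * 365)
retention_hours = int(retention.total_seconds() // 3600)

print(f"--retention {retention_hours}h")
```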

Flux Queries for Grafana

Here are the queries I use in my Grafana dashboards:

Daily Steps Query

daily-steps-query.flux
from(bucket: "apple_health")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "steps")
|> filter(fn: (r) => r._field == "value")
|> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
|> yield(name: "daily_steps")

Heart Rate (use mean, not sum)

heart-rate-query.flux
from(bucket: "apple_health")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "heart_rate")
|> filter(fn: (r) => r._field == "value")
|> aggregateWindow(every: 5m, fn: mean, createEmpty: false)
|> yield(name: "heart_rate")
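
For readers less familiar with Flux, the windowing step can be sketched in plain Python. This is a rough illustration and not how InfluxDB implements it: window_mean is a hypothetical helper, the readings are made up, and the result is keyed by window start, whereas aggregateWindow stamps each window with its end time by default.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

def window_mean(samples, every=timedelta(minutes=5)):
    """Group (timestamp, value) pairs into fixed windows and average each one.

    Windows with no samples are omitted, like createEmpty: false.
    """
    epoch = datetime(1970, 1, 1)
    buckets = defaultdict(list)
    for ts, value in samples:
        # Floor the timestamp to the start of its window
        offset = (ts - epoch) // every
        buckets[epoch + offset * every].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}

# Hypothetical heart-rate readings (naive UTC timestamps, bpm)
samples = [
    (datetime(2024, 1, 15, 9, 0, 30), 62),
    (datetime(2024, 1, 15, 9, 3, 10), 66),
    (datetime(2024, 1, 15, 9, 7, 45), 90),
    (datetime(2024, 1, 15, 9, 21, 0), 71),
]

for start, bpm in window_mean(samples).items():
    print(start.strftime("%H:%M"), bpm)
```

The two readings in the 09:00 window are averaged into one point, and the windows at 09:10 and 09:15 simply do not appear.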

Sleep Analysis

sleep-query.flux
from(bucket: "apple_health")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "sleep_analysis")
|> filter(fn: (r) => r._field == "value")
|> aggregateWindow(every: 1d, fn: sum, createEmpty: false)
|> yield(name: "sleep_hours")

HRV Trend

hrv-trend-query.flux
from(bucket: "apple_health")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r._measurement == "heart_rate_variability")
|> filter(fn: (r) => r._field == "value")
|> aggregateWindow(every: 1d, fn: mean, createEmpty: false)
|> movingAverage(n: 7) // 7-day moving average
|> yield(name: "hrv_trend")
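
The smoothing step can be sketched in plain Python as well. The HRV values below are invented, and moving_average is a hypothetical helper that mirrors the idea of averaging each day with the six days before it.

```python
from collections import deque

def moving_average(values, n=7):
    """Mean of each value and the n - 1 values before it.

    The first n - 1 inputs produce no output, so the result has
    len(values) - n + 1 entries.
    """
    window = deque(maxlen=n)
    out = []
    for v in values:
        window.append(v)
        if len(window) == n:
            out.append(sum(window) / n)
    return out

daily_hrv = [48, 52, 50, 47, 55, 51, 49, 40, 38, 41]  # hypothetical daily means (ms)
print(moving_average(daily_hrv))
```

Ten daily values produce four smoothed points, so the trend line starts six days later than the raw series.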

Grafana Dashboard Panels

I created 6 dashboards:

  1. Sleep Dashboard - Sleep stages, duration, quality score
  2. HRV Dashboard - Heart rate variability trends, stress indicators
  3. Heart Rate Dashboard - Resting HR, exercise HR, zones
  4. VO2 Max Dashboard - Cardio fitness trends
  5. Activity Dashboard - Steps, distance, active energy, stand hours
  6. SpO2 Dashboard - Blood oxygen levels

Here’s a sample panel configuration:

grafana-panel-steps.json
{
"title": "Daily Steps",
"type": "stat",
"targets": [
{
"query": "from(bucket: \"apple_health\")\n |> range(start: v.timeRangeStart, stop: v.timeRangeStop)\n |> filter(fn: (r) => r._measurement == \"steps\")\n |> aggregateWindow(every: 1d, fn: sum)",
"refId": "A"
}
],
"options": {
"graphMode": "area",
"colorMode": "value"
},
"fieldConfig": {
"defaults": {
"unit": "short",
"thresholds": {
"mode": "absolute",
"steps": [
{"color": "red", "value": 0},
{"color": "yellow", "value": 5000},
{"color": "green", "value": 10000}
]
}
}
}
}
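
How the absolute thresholds translate into a color can be sketched as follows. This mirrors the three steps from the panel JSON above; threshold_color is a hypothetical helper and not part of Grafana.

```python
# Threshold steps copied from the panel JSON above, lowest first
THRESHOLDS = [(0, "red"), (5000, "yellow"), (10000, "green")]

def threshold_color(value: float) -> str:
    """Return the color of the highest threshold that the value reaches."""
    color = THRESHOLDS[0][1]
    for floor, name in THRESHOLDS:
        if value >= floor:
            color = name
    return color

for steps in (3200, 8500, 12000):
    print(steps, threshold_color(steps))
```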

ML Pipeline for Predictions

I added a RandomForest model for sleep quality prediction:

sleep-predictor.py
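import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from influxdb_client import InfluxDBClient
import joblib


class SleepPredictor:
    def __init__(self, influx_url: str, token: str, org: str):
        self.client = InfluxDBClient(url=influx_url, token=token, org=org)
        self.model = None

    def _fetch_daily(self, days: int, measurements: list, fn: str) -> pd.DataFrame:
        """Fetch one value per day and measurement, aggregated with the Flux function fn"""
        predicate = " or ".join(f'r._measurement == "{m}"' for m in measurements)
        query = f'''
        from(bucket: "apple_health")
            |> range(start: -{days}d)
            |> filter(fn: (r) => {predicate})
            |> filter(fn: (r) => r._field == "value")
            |> aggregateWindow(every: 1d, fn: {fn}, createEmpty: false)
        '''
        result = self.client.query_api().query_data_frame(query)
        # query_data_frame may return a list of DataFrames instead of a single one
        return pd.concat(result) if isinstance(result, list) else result

    def fetch_training_data(self, days: int = 90):
        """Fetch the last `days` days of health data (90 by default)"""
        # sum() for cumulative metrics, mean() for point-in-time metrics.
        # sleep_analysis is fetched too because the label is derived from it.
        cumulative = self._fetch_daily(days, ["steps", "active_energy", "sleep_analysis"], "sum")
        instantaneous = self._fetch_daily(days, ["heart_rate"], "mean")
        return pd.concat([cumulative, instantaneous])

    def train_model(self):
        """Train sleep quality predictor"""
        df = self.fetch_training_data()
        # Feature engineering: one row per day, one column per measurement
        features = df.pivot_table(
            index='_time',
            columns='_measurement',
            values='_value'
        )
        # Drop days without sleep data so they are not labeled as bad sleep
        features = features.dropna(subset=['sleep_analysis']).fillna(0)
        features['sleep_quality'] = features['sleep_analysis'].apply(
            lambda x: 1 if x >= 7 else 0  # Binary: good sleep >= 7 hours
        )
        X = features[['steps', 'heart_rate', 'active_energy']]
        y = features['sleep_quality']
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, random_state=42
        )
        self.model = RandomForestClassifier(n_estimators=100)
        self.model.fit(X_train, y_train)
        # Save model
        joblib.dump(self.model, 'sleep_model.joblib')
        return self.model.score(X_test, y_test)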
# Cron job: Retrain every Sunday at 3 AM
# 0 3 * * 0 /usr/bin/python3 /path/to/sleep-predictor.py --train
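
With only 90 days of data, it is worth checking that both classes actually occur before training, since a classifier trained on only "good" nights has nothing to learn. The sketch below shows the labeling rule from the post plus a hypothetical guard; has_both_classes, its min_per_class default, and the sample hours are all assumptions for illustration.

```python
from collections import Counter

def label_sleep(hours: float, threshold: float = 7.0) -> int:
    """Binary label used in the post: 1 for at least 7 hours, else 0."""
    return 1 if hours >= threshold else 0

def has_both_classes(labels, min_per_class: int = 5) -> bool:
    """Hypothetical guard: require a few examples of each class before training."""
    counts = Counter(labels)
    return all(counts.get(cls, 0) >= min_per_class for cls in (0, 1))

# Made-up nightly sleep durations in hours
nightly_hours = [7.5, 6.2, 8.0, 5.9, 7.1, 6.8, 7.0, 4.5, 7.9, 6.0, 6.5, 7.2]
labels = [label_sleep(h) for h in nightly_hours]

print(Counter(labels), has_both_classes(labels))
```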

Automation with Cron

I set up automated tasks:

health-cron.txt
# Sync Apple Health data every 5 minutes
*/5 * * * * /usr/bin/python3 /opt/health/sync.py
# Retrain ML model weekly (Sunday 3 AM)
0 3 * * 0 /usr/bin/python3 /opt/health/train_model.py
# Daily backup to S3
0 2 * * * /usr/bin/influx backup /tmp/backup && aws s3 sync /tmp/backup s3://my-bucket/health-backup/

Common Mistakes

Mistake 1: Using mean() for Cumulative Metrics

mistake-mean-vs-sum.flux
// WRONG: Returns average per-minute value (~5)
|> aggregateWindow(every: 1d, fn: mean)
// CORRECT: Returns total daily value (~8500)
|> aggregateWindow(every: 1d, fn: sum)

Mistake 2: Not Handling Missing Data

missing-data-handling.flux
// WRONG: Days without data come back as null and leave gaps in the visualization
|> aggregateWindow(every: 1d, fn: mean)
// CORRECT: Keep the empty windows, then carry the previous value forward
|> aggregateWindow(every: 1d, fn: mean, createEmpty: true)
|> fill(usePrevious: true)
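
What fill(usePrevious: true) does to empty daily windows can be sketched in plain Python. fill_previous is a hypothetical helper and the values are invented. Note that this carries the last reading forward and does not interpolate between readings.

```python
def fill_previous(values):
    """Replace None with the most recent non-None value.

    Leading None values stay None because there is nothing to carry forward.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

daily_hrv = [None, 48.0, None, None, 51.5, None]  # None marks a day with no readings
print(fill_previous(daily_hrv))
```

Forward filling suits slow-moving metrics such as HRV. For cumulative metrics like steps, a day without data is better shown as a gap or zero than as a copy of the previous day.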

Mistake 3: Wrong Timezone

timezone-fix.flux
// WRONG: Daily windows start at midnight UTC, not local midnight
|> aggregateWindow(every: 1d, fn: sum)
// CORRECT: Use local timezone (the import and option go at the top of the query, before from())
import "timezone"
option location = timezone.location(name: "America/New_York")
|> aggregateWindow(every: 1d, fn: sum)
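
The effect of the day boundary is easy to reproduce in plain Python. This sketch uses made-up samples and, as a simplifying assumption, a fixed UTC-5 offset in place of America/New_York. That offset is only valid in winter, and real code would use zoneinfo.ZoneInfo to handle daylight saving time.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# Assumption: a fixed UTC-5 offset stands in for America/New_York in January.
# Real code would use zoneinfo.ZoneInfo so daylight saving time is handled.
NEW_YORK_WINTER = timezone(timedelta(hours=-5))

def daily_totals(samples, tz):
    """Sum (timestamp, steps) pairs by calendar day as seen in tz."""
    totals = defaultdict(int)
    for ts, steps in samples:
        totals[ts.astimezone(tz).date().isoformat()] += steps
    return dict(totals)

# Hypothetical evening walk on Jan 15 (New York time), stored in UTC
samples = [
    (datetime(2024, 1, 15, 23, 30, tzinfo=UTC), 400),  # 18:30 local, Jan 15
    (datetime(2024, 1, 16, 1, 30, tzinfo=UTC), 600),   # 20:30 local, Jan 15
]

print("UTC days:  ", daily_totals(samples, UTC))
print("Local days:", daily_totals(samples, NEW_YORK_WINTER))
```

Grouped by UTC days, the walk is split across two dates. Grouped by local days, all 1000 steps land on January 15.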

Why This Matters

Building this pipeline gave me:

  1. Holistic Health View - Correlations between sleep, activity, HRV on one screen
  2. Proactive Health Management - Anomaly detection before issues become problems
  3. ML Predictions - Predict sleep quality from daily activity patterns
  4. Privacy - All data stays on my local infrastructure

The most valuable insight was discovering that my HRV drops significantly 2 days before I get sick, giving me early warning to rest.

Summary

In this post, I showed how to export Apple Health data to InfluxDB for custom Grafana dashboards. The critical gotcha is using sum() aggregation for cumulative metrics like steps, distance, and calories, not mean(). Apple Health exports per-minute granules, so you must aggregate correctly or your queries will return wrong values.

The complete pipeline includes: a local webhook server for data sync, InfluxDB for time-series storage, Grafana for visualization, and an ML model for predictions. With proper aggregation, you get accurate health dashboards that reveal patterns invisible in the Health app.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
