
How to Build an Autosave System with Redis Caching and Celery Background Workers

Problem

I wanted to build a collaborative document editor—something like Google Docs where multiple users can edit simultaneously. But I quickly ran into a fundamental problem: saving on every keystroke would absolutely hammer my database.

Think about it: a fast typist might type 10 characters per second. With 10 concurrent users, that’s 100 database writes per second. With 100 users? 1,000 writes per second. My PostgreSQL database was not happy.

I needed an autosave system that felt instant to users but didn’t destroy my database. The solution turned out to be a write-behind caching pattern using Redis and Celery.

Environment

I built this with the following stack:

Development Environment
Python: 3.11.x
Django: 5.x
Redis: 7.x
Celery: 5.x
PostgreSQL: 16.x

What Happened When I Tried Naive Autosave

First, I implemented the simplest approach: save on every change:

naive_autosave.py
# views.py
from django.views import View
from django.http import JsonResponse
from .models import Document


class DocumentUpdateView(View):
    def post(self, request, doc_id):
        document = Document.objects.get(id=doc_id)
        content = request.POST.get('content', '')
        # Save EVERY keystroke to database
        document.content = content
        document.save()  # Database write on every character!
        return JsonResponse({'status': 'ok'})

This worked for one user. Then I load-tested with 50 concurrent users editing different documents:

Load Test Results
Requests per second: 45 (target was 500)
Average response time: 1.2 seconds
PostgreSQL CPU: 98%
Connection pool: Exhausted
Errors: "connection refused" after 30 seconds

The database became a bottleneck. Each write requires disk I/O, index updates, and potential lock contention. For a collaborative editor, this architecture was fundamentally broken.

Then I tried throttling saves on the frontend:

frontend-throttle.js
// WRONG: Delegating throttling to frontend
let saveTimeout
editor.on('change', (content) => {
  clearTimeout(saveTimeout)
  saveTimeout = setTimeout(() => {
    fetch('/save', { method: 'POST', body: content })
  }, 2000) // Save after 2 seconds of inactivity
})

This helped, but created new problems:

  • Users lose up to 2 seconds of work if they close the tab
  • Multiple users editing the same document overwrite each other
  • No visibility into what’s saved vs. what’s in-flight

I needed a backend solution, not a frontend hack.

How to Solve It: Write-Behind Caching with Redis + Celery

The solution is a write-behind caching pattern: store active edits in Redis, then periodically flush to the database using Celery background workers.
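
Before bringing in Redis, the pattern itself can be sketched in plain Python. This is a toy illustration only: `WriteBehindBuffer` and `flush_fn` are hypothetical names, a dict stands in for the database, and nothing here is the code used in the steps that follow.

```python
class WriteBehindBuffer:
    """Toy write-behind cache: absorb writes in memory, flush them in batches."""

    def __init__(self, flush_fn):
        self._pending = {}         # doc_id -> latest content (older edits are overwritten)
        self._flush_fn = flush_fn  # callable(doc_id, content) performing the slow write

    def write(self, doc_id, content):
        # Fast path: only remember the newest content for this document
        self._pending[doc_id] = content

    def read(self, doc_id, default=None):
        return self._pending.get(doc_id, default)

    def flush(self):
        # Swap the buffer first so writes arriving during the flush are kept
        batch, self._pending = self._pending, {}
        for doc_id, content in batch.items():
            self._flush_fn(doc_id, content)
        return len(batch)


database = {}  # stand-in for PostgreSQL
buffer = WriteBehindBuffer(flush_fn=database.__setitem__)
for text in ("H", "He", "Hel", "Hell", "Hello"):
    buffer.write(42, text)   # five "keystrokes" hit only the buffer
flushed = buffer.flush()     # one slow write reaches the database
```

Five edits collapse into a single database write because only the latest content per document matters at flush time.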

Step 1: Store Document Edits in Redis

Instead of writing to the database immediately, I store edits in Redis:

redis_cache.py
import json
import time

import redis
from django.conf import settings

redis_client = redis.Redis.from_url(settings.CELERY_BROKER_URL)


def cache_document_edit(doc_id, user_id, content):
    """
    Store document edit in Redis cache.
    Key format: active_doc:{doc_id}
    """
    cache_key = f'active_doc:{doc_id}'
    # Store content and metadata
    cache_data = {
        'content': content,
        'last_modified': time.time(),
        'modified_by': user_id
    }
    # Set with 10-minute TTL (safety net for cleanup)
    redis_client.setex(
        cache_key,
        600,  # 10 minutes
        json.dumps(cache_data)
    )
    # Track this document as "active" for the Celery worker
    redis_client.sadd('active_documents', doc_id)
    return True
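
The read view in Step 4 imports a `get_document_content` helper that is not shown in this post. A minimal sketch of what it could look like follows. It assumes the same `active_doc:{doc_id}` key format and takes the client as an explicit argument (my assumption, so it can be tried without a server); in the real module it would presumably use the module-level `redis_client`. `_StubClient` is a hypothetical in-memory stand-in, not part of redis-py.

```python
import json


def get_document_content(doc_id, client):
    """Return cached content for doc_id, or None when nothing is cached.

    `client` is any object exposing redis-py's `get` method.
    """
    cached = client.get(f'active_doc:{doc_id}')
    if cached is None:
        return None
    # redis-py returns bytes unless decode_responses=True; json.loads accepts both
    return json.loads(cached)['content']


class _StubClient:
    """In-memory stand-in for a Redis client, for demonstration only."""

    def __init__(self, data):
        self._data = data

    def get(self, key):
        return self._data.get(key)


payload = json.dumps({'content': 'draft', 'modified_by': 1}).encode()
stub = _StubClient({'active_doc:7': payload})
cached_content = get_document_content(7, stub)   # entry exists in the cache
missing_content = get_document_content(8, stub)  # caller falls back to the database
```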

Step 2: Create Celery Periodic Task for Persistence

A Celery worker runs periodically to persist cached changes:

celery_tasks.py
import json

import redis
from celery import shared_task
from django.conf import settings
from django.utils import timezone

redis_client = redis.Redis.from_url(settings.CELERY_BROKER_URL)


@shared_task
def persist_active_documents():
    """
    Periodically flush Redis cache to database.
    Runs every 30 seconds.
    """
    # Imported here to avoid loading models before the app registry is ready
    from .models import Document

    # Get all active document IDs
    active_doc_ids = redis_client.smembers('active_documents')
    if not active_doc_ids:
        return {'persisted': 0, 'errors': []}
    persisted_count = 0
    errors = []
    for doc_id in active_doc_ids:
        doc_id = doc_id.decode() if isinstance(doc_id, bytes) else doc_id
        cache_key = f'active_doc:{doc_id}'
        try:
            # Get cached data
            cached = redis_client.get(cache_key)
            if not cached:
                # Document no longer in cache, remove from active set
                redis_client.srem('active_documents', doc_id)
                continue
            data = json.loads(cached)
            # Persist to database
            Document.objects.filter(id=doc_id).update(
                content=data['content'],
                last_modified=timezone.now()
            )
            persisted_count += 1
            # Clear from cache after successful persist
            redis_client.delete(cache_key)
            redis_client.srem('active_documents', doc_id)
        except Exception as e:
            # The key stays in Redis and in the active set, so the next run retries it
            errors.append({'doc_id': doc_id, 'error': str(e)})
    return {
        'persisted': persisted_count,
        'errors': errors
    }
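
One gap worth noting in the task above: an edit that arrives between the `get` and the `delete` is removed from Redis without ever reaching the database. A possible guard, sketched here as an assumption rather than the author's method, is to delete the key only if it still holds the payload that was just persisted, using a small Lua script that Redis runs atomically. `clear_if_unchanged` is a hypothetical helper name; `eval(script, numkeys, *keys_and_args)` is the redis-py call it relies on.

```python
# Lua runs atomically inside Redis: delete the key only if it still holds
# the exact payload that was just written to the database.
COMPARE_AND_DELETE = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def clear_if_unchanged(client, cache_key, persisted_payload):
    """Return True if the cache entry was removed, False if a newer edit arrived."""
    # One key (cache_key) followed by one argument (the persisted payload)
    return client.eval(COMPARE_AND_DELETE, 1, cache_key, persisted_payload) == 1
```

In the task, `persisted_payload` would be the raw value returned by `redis_client.get(cache_key)`, and the document would only be removed from `active_documents` when this helper returns True, so a newer edit is picked up by the next run.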

Step 3: Configure Celery Beat Schedule

Set up the periodic task in Celery configuration:

celery_config.py
# A plain number of seconds is enough here; crontab is only needed for calendar-style schedules
CELERY_BEAT_SCHEDULE = {
    'persist-documents-every-30-seconds': {
        'task': 'documents.tasks.persist_active_documents',
        'schedule': 30.0,  # Run every 30 seconds
    },
}

Step 4: Update the View to Use Cache

The view now writes to Redis instead of directly to the database:

views.py
from django.views import View
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from django.utils.decorators import method_decorator
from .cache import cache_document_edit


@method_decorator(csrf_exempt, name='dispatch')
class DocumentUpdateView(View):
    def post(self, request, doc_id):
        content = request.POST.get('content', '')
        user_id = request.user.id
        # Fast Redis write instead of slow DB write
        cache_document_edit(doc_id, user_id, content)
        # Return immediately - Celery will persist later
        return JsonResponse({
            'status': 'cached',
            'doc_id': doc_id
        })


class DocumentReadView(View):
    def get(self, request, doc_id):
        from .cache import get_document_content
        # Check Redis first for latest content
        content = get_document_content(doc_id)
        if content is None:
            # Fall back to database
            from .models import Document
            doc = Document.objects.get(id=doc_id)
            content = doc.content
        return JsonResponse({'content': content})

Step 5: Handle Cleanup When Users Leave

When all users leave a document, immediately persist and clean up:

cleanup.py
import json

import redis
from django.conf import settings
from django.utils import timezone

redis_client = redis.Redis.from_url(settings.CELERY_BROKER_URL)


def handle_user_leave(doc_id, user_id):
    """
    Called when a user leaves a document.
    If no users remain, persist immediately.
    """
    # Remove user from document's active users set
    active_users_key = f'doc_users:{doc_id}'
    redis_client.srem(active_users_key, user_id)
    # Check if any users remain
    remaining_users = redis_client.scard(active_users_key)
    if remaining_users == 0:
        # No more users - persist immediately
        cache_key = f'active_doc:{doc_id}'
        cached = redis_client.get(cache_key)
        if cached:
            data = json.loads(cached)
            from .models import Document
            Document.objects.filter(id=doc_id).update(
                content=data['content'],
                last_modified=timezone.now()
            )
        # Clean up Redis
        redis_client.delete(cache_key)
        redis_client.delete(active_users_key)
        redis_client.srem('active_documents', doc_id)

Step 6: Multi-Purpose Redis Setup

Following the pattern from PyTogether, I reused Redis for multiple purposes:

Redis Multi-Use Architecture
┌─────────────────────────────────────────────────────────┐
│                      Redis Server                       │
├─────────────────────────────────────────────────────────┤
│ 1. Document Cache (active_doc:*)                        │
│    - Temporary storage for unsaved edits                │
│    - 10-minute TTL for safety                           │
│                                                         │
│ 2. Active Documents Set                                 │
│    - Track which docs need persistence                  │
│    - Used by Celery worker to iterate                   │
│                                                         │
│ 3. Channel Layer (django-channels)                      │
│    - Real-time WebSocket updates                        │
│    - Redis pub/sub for multi-server scaling             │
│                                                         │
│ 4. Celery Broker                                        │
│    - Task queue for background jobs                     │
│    - Periodic task scheduling                           │
└─────────────────────────────────────────────────────────┘

This consolidation reduces infrastructure complexity while keeping everything fast.
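
As a rough illustration, the shared setup might look like this in Django settings. The host, port, and database number are assumptions for a local setup, and the channel layer entry assumes the `channels_redis` package is installed; only `CELERY_BROKER_URL` appears in the code earlier in this post.

```python
# settings.py (sketch): host, port, and database number are assumptions
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379

# Celery broker; the document cache in this post connects through the same URL
CELERY_BROKER_URL = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"

# django-channels layer backed by the same Redis server (needs channels_redis)
CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {"hosts": [(REDIS_HOST, REDIS_PORT)]},
    },
}
```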

The Reason: Why Write-Behind Works

The write-behind caching pattern works because of a simple observation: most edits happen in bursts, not evenly distributed over time.

Performance Comparison

Performance Metrics
Naive Approach:
- 100 users x 10 edits/sec = 1,000 DB writes/sec
- PostgreSQL: Overloaded, 1+ second latency
Write-Behind Approach:
- 100 users x 10 edits/sec = 1,000 cached edits/sec, about 2,000 Redis commands/sec (SETEX + SADD per edit), which is trivial for Redis
- Celery task: 1 batch write every 30 seconds
- PostgreSQL: ~3 writes/sec average (at most 100 active documents flushed per 30-second run)
- User experience: Sub-10ms response time
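
The database write rates above can be checked with a few lines of arithmetic. The sketch assumes the worst case for write-behind, where each of the 100 users edits a different document, so every flush writes 100 rows; the function names are mine.

```python
def naive_db_writes_per_sec(users, edits_per_user_per_sec):
    # Every edit becomes its own database write
    return users * edits_per_user_per_sec


def write_behind_db_writes_per_sec(active_documents, flush_interval_sec):
    # Each active document is written at most once per flush
    return active_documents / flush_interval_sec


naive = naive_db_writes_per_sec(users=100, edits_per_user_per_sec=10)
batched = write_behind_db_writes_per_sec(active_documents=100, flush_interval_sec=30)
reduction = naive / batched
```

Under these assumptions the database sees 1,000 writes per second in the naive design and about 3.3 per second with write-behind, a 300x reduction. If several users share a document, the write-behind figure drops further.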

Latency Breakdown

Operation Latency
Redis SET: 0.1-1ms (in-memory)
PostgreSQL UPDATE: 10-50ms (disk I/O, indexes, WAL)
Ratio: 10-500x faster

For autosave, users only need to know their content is “safe.” Redis provides that assurance instantly, while the database write happens asynchronously.

Data Durability Considerations

I know what you’re thinking: “What if Redis crashes before the Celery worker persists?”

Risk Analysis
Redis persistence options:
1. RDB snapshots (default): May lose last few minutes
2. AOF with appendfsync everysec: Loses roughly the last second of writes
3. AOF with appendfsync always: Best durability, noticeable performance cost
For collaborative docs: AOF everysec is acceptable
Users can tolerate redoing 1 second of work in disaster scenarios

Configure Redis for durability:

redis.conf
appendonly yes
appendfsync everysec

Scalability Benefits

Scaling Paths
Vertical: Bigger Redis instance (millions of ops/sec)
Horizontal: Redis Cluster for sharding
Celery: Add more workers for faster persistence

The architecture scales naturally: Redis handles the write spike, Celery smooths out the database load.

Common Mistakes I Made

Mistake 1: No TTL on cached documents

no-ttl-wrong.py
# WRONG: Cache without expiration
redis_client.set(cache_key, json.dumps(data))

If a user closes their browser without the cleanup running, the document stays in cache forever. Always set a TTL:

with-ttl-correct.py
# CORRECT: 10-minute TTL as safety net
redis_client.setex(cache_key, 600, json.dumps(data))

Mistake 2: Not tracking active users

Without tracking who’s editing, you can’t know when to immediately persist:

no-user-tracking-wrong.py
# WRONG: No user tracking
def document_update(doc_id, content):
    cache_document_edit(doc_id, content)
    # How do we know when to persist?

The fix is tracking active users per document:

user-tracking-correct.py
# CORRECT: Track active users
def user_join_document(doc_id, user_id):
    redis_client.sadd(f'doc_users:{doc_id}', user_id)


def user_leave_document(doc_id, user_id):
    redis_client.srem(f'doc_users:{doc_id}', user_id)
    remaining = redis_client.scard(f'doc_users:{doc_id}')
    if remaining == 0:
        persist_immediately(doc_id)
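
`persist_immediately` is not defined in this post. One way it could look, as a sketch only: the version here takes the Redis client and a `save_to_db(doc_id, content)` callable as explicit arguments (both are my assumptions, so the logic can be exercised without Redis or Django), whereas the call above passes just `doc_id`.

```python
import json


def persist_immediately(doc_id, client, save_to_db):
    """Flush one document from the cache right away.

    `client` exposes redis-py's get/delete/srem; `save_to_db` is a hypothetical
    callable wrapping the ORM update. Returns True if something was persisted.
    """
    cache_key = f'active_doc:{doc_id}'
    cached = client.get(cache_key)
    if cached is None:
        return False
    save_to_db(doc_id, json.loads(cached)['content'])
    # Clean up only after the database write has succeeded
    client.delete(cache_key)
    client.srem('active_documents', doc_id)
    return True
```

If `save_to_db` raises, the cache entry and the active-set membership are left untouched, so the periodic task can still pick the document up.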

Mistake 3: Celery task failures silently losing data

If the Celery task fails, cached content could be lost:

silent-failure-wrong.py
# WRONG: Silent failure
@shared_task
def persist_active_documents():
    for doc_id in active_docs:
        # If this fails, data is lost!
        persist_to_db(doc_id)

Always log errors and implement retry:

proper-error-handling-correct.py
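# CORRECT: Proper error handling and retry
import logging

from celery import shared_task

logger = logging.getLogger(__name__)


@shared_task(bind=True, max_retries=3)
def persist_active_documents(self):
    failed = []
    for doc_id in active_docs:
        try:
            persist_to_db(doc_id)
        except Exception as e:
            logger.error(f"Failed to persist {doc_id}: {e}")
            # Keep in active set for retry
            failed.append(doc_id)
    if failed:
        # max_retries has no effect unless the task actually asks for a retry
        raise self.retry(countdown=10)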
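
When retrying, it usually helps to wait longer after each failed attempt so a struggling database is not hit again immediately. A small sketch of an exponential backoff helper follows; `retry_countdown` and its default values are hypothetical, not something from the original setup.

```python
def retry_countdown(retries, base_seconds=5, cap_seconds=300):
    """Exponential backoff: 5s, 10s, 20s, ... never more than cap_seconds."""
    return min(cap_seconds, base_seconds * 2 ** retries)


# Delays for the first attempt and the three retries allowed by max_retries=3
delays = [retry_countdown(n) for n in range(4)]
```

Inside a task declared with `bind=True`, the attempt number is available as `self.request.retries`, so the retry call could be written as `raise self.retry(exc=e, countdown=retry_countdown(self.request.retries))`.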

Mistake 4: Reading from database when cache has newer content

Reading straight from the database ignores edits that are still waiting in Redis:

read-order-wrong.py
# WRONG: Always read from database
def get_document(doc_id):
    return Document.objects.get(id=doc_id).content

Users will see stale content after editing. The read path must check Redis first:

read-order-correct.py
# CORRECT: Check cache first
def get_document(doc_id):
    # Check Redis for latest
    cached = redis_client.get(f'active_doc:{doc_id}')
    if cached:
        return json.loads(cached)['content']
    # Fall back to database
    return Document.objects.get(id=doc_id).content

Mistake 5: Single Redis instance without backup

For production, running a single Redis instance is risky:

Production Redis Setup
Recommended:
- Redis Sentinel for automatic failover
- Or Redis Cluster for sharding + high availability
- Regular backups (RDB snapshots to S3)
Minimum for production:
- Primary + Replica
- Sentinel for monitoring and failover

Summary

In this post, I showed how to build an efficient autosave system using Redis caching and Celery background workers:

  • Write-behind caching prevents database overload: Store edits in Redis, persist periodically with Celery
  • Redis operations are 10-500x faster than database writes: Sub-millisecond latency for user actions
  • Cleanup on user exit ensures data durability: Persist immediately when all users leave
  • Multi-purpose Redis reduces infrastructure: Use the same Redis for cache, Celery broker, and real-time updates

The key insight from the Reddit discussion about PyTogether is that this is the same pattern used in production collaborative editors. Redis handles the write burst, Celery smooths the database load, and users get instant feedback with only a small, bounded window of potential data loss.

If you’re building a collaborative application, start with this pattern. It’s simpler than it appears, and it scales from dozens to thousands of concurrent users.
