How to Persist Python Data Without a Database

Mar 13, 2026

Problem

I needed to persist data between Python script runs. Not complex data - just a simple cache of user preferences and the last run timestamp for a CLI tool I was building.

My first instinct was to reach for SQLite. After all, it’s built into Python and handles persistence well. But setting up a database for a simple key-value cache felt like overkill. I’d need to design a schema, write SQL queries, manage connections… all for storing a handful of values.

Then I tried JSON files. Simple enough - just dump the data to a file and read it back. But I quickly ran into issues:

import json

# Problem 1: Must read/write entire file every time
with open('cache.json', 'r') as f:
    data = json.load(f)

data['user'] = 'Alice'
data['score'] = 42

# Problem 2: No random access - rewrite everything
with open('cache.json', 'w') as f:
    json.dump(data, f)

# Problem 3: Manual serialization for complex objects

This approach worked, but it was tedious. Every time I needed to update a single value, I had to read the entire file, modify it, and write it all back. For a simple script, this felt unnecessarily complex.

I also considered pickle, but it had the same issues as JSON - read everything, modify, write everything back. Plus, pickle files aren’t human-readable, which made debugging harder.

Solution

Then I discovered the shelve module - a hidden gem in Python’s standard library that provides persistent, dictionary-like storage backed by a file.

import shelve

# Open, write, close - that's it
with shelve.open('my_cache') as db:
    db['user_data'] = {'name': 'Alice', 'score': 42}
    db['last_run'] = '2024-01-15'

# Next script run - data is still there
with shelve.open('my_cache') as db:
    print(db['user_data'])  # {'name': 'Alice', 'score': 42}

That’s the entire API. Open it like a dictionary, write to it, close it, and your data persists. No schema design, no SQL, no connection strings.

How It Works

Shelve uses Python’s dbm module to create a simple key-value store, with pickle handling serialization automatically. Each key is a string, and values can be any picklable Python object.

import shelve

# Create or open a shelf
with shelve.open('app_data') as db:
    # Store various data types
    db['config'] = {'debug': True, 'timeout': 30}
    db['users'] = ['alice', 'bob', 'charlie']
    db['counter'] = 42

# Read back in another session
with shelve.open('app_data') as db:
    print(db['config'])    # {'debug': True, 'timeout': 30}
    print(db['users'])     # ['alice', 'bob', 'charlie']
    print(db['counter'])   # 42

The with statement ensures the shelf is properly closed, which is critical for data integrity.

The Mutable Object Trap

Here’s where I made a mistake. I tried to update a nested value in a stored dictionary:

import shelve

with shelve.open('app_data') as db:
    db['config'] = {'debug': False, 'timeout': 30}

# Later, I tried to update a nested value
with shelve.open('app_data') as db:
    config = db['config']
    config['debug'] = True  # This doesn't persist!
    # db['config'] is still {'debug': False, 'timeout': 30}

The issue is that db['config'] returns a copy, not a reference. Modifying it doesn’t update the shelf.

There are two solutions:

Solution 1: Write back the modified object

import shelve

with shelve.open('app_data') as db:
    config = db['config']
    config['debug'] = True
    db['config'] = config  # Explicitly write back

Solution 2: Use writeback mode

import shelve

# Enable writeback mode
with shelve.open('app_data', writeback=True) as db:
    config = db['config']
    config['debug'] = True  # Automatically persisted on close

Writeback mode is convenient but has a memory cost - all accessed objects are cached in memory until the shelf is closed. For most scripts, this isn’t an issue, but be aware of it for large datasets.

When to Use Shelve vs Alternatives

I’ve learned that shelve fills a specific niche. Here’s when I reach for each tool:

| Use Case                              | Recommended Tool |
|---------------------------------------|------------------|
| Simple key-value cache for scripts    | Shelve           |
| Configuration storage for CLI tools    | Shelve           |
| Prototyping before database design     | Shelve           |
| Need SQL queries                      | SQLite           |
| Multiple processes accessing same data| SQLite           |
| Human-readable data files             | JSON             |
| Cross-language compatibility          | JSON             |

Shelve is perfect for single-process scripts that need persistence without the overhead of a database. But it has limitations:

Not thread-safe: Don’t use it in multi-threaded applications without locking
Not process-safe: Multiple processes shouldn’t access the same shelf file
Platform-dependent: The underlying dbm implementation varies by platform
Not human-readable: Files are binary, so you can’t inspect them with a text editor

Real-World Example

Here’s how I use shelve in a CLI tool that tracks API rate limits:

import shelve
import time
from datetime import datetime, timedelta

class RateLimiter:
    def __init__(self, shelf_path='rate_limits'):
        self.shelf_path = shelf_path

    def can_make_request(self, api_name, max_per_hour=100):
        """Check if we can make a request without exceeding rate limit."""
        with shelve.open(self.shelf_path) as db:
            key = f'{api_name}_requests'
            requests = db.get(key, [])

            # Filter to requests in the last hour
            cutoff = datetime.now() - timedelta(hours=1)
            requests = [r for r in requests if r > cutoff]

            if len(requests) >= max_per_hour:
                return False

            # Record this request
            requests.append(datetime.now())
            db[key] = requests
            return True

    def get_remaining(self, api_name, max_per_hour=100):
        """Get remaining requests for this hour."""
        with shelve.open(self.shelf_path) as db:
            key = f'{api_name}_requests'
            requests = db.get(key, [])

            cutoff = datetime.now() - timedelta(hours=1)
            requests = [r for r in requests if r > cutoff]

            return max(0, max_per_hour - len(requests))

# Usage
limiter = RateLimiter()
if limiter.can_make_request('github_api'):
    # Make the API call
    pass
else:
    print(f"Rate limit reached. {limiter.get_remaining('github_api')} remaining")

This would have been more complex with JSON (reading/writing the entire file each time) or SQLite (schema setup, SQL queries). With shelve, it’s just a few lines.

Common Mistakes to Avoid

Forgetting to close the shelf: Always use with shelve.open() to ensure proper cleanup.
Not handling mutable object updates: Remember that db['key'] returns a copy unless you use writeback mode.
Using shelve for concurrent access: Shelve is not thread-safe or process-safe. Use SQLite for concurrent access.
Storing unpicklable objects: Some objects can’t be pickled (e.g., file handles, database connections). Stick to basic data types.
Assuming cross-platform compatibility: The underlying dbm implementation varies, so shelf files may not be portable between operating systems.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!