How Do I Prevent Partial File Writes in Python When My Process Crashes?
The Problem
I had a Python script that wrote configuration files. One day, my server crashed during a write operation. When I restarted the application, it failed with:
    JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I checked the config file and found this:

    {"database_url": "postgresql://localhost:5432/mydb", "api_k

The file was truncated mid-write. The process crashed before completing the write operation, leaving behind corrupted data. This wasn’t just annoying - it broke my entire application on restart.
The Root Cause
The standard Python open() function writes directly to the target file:
    import json

    config = {
        "database_url": "postgresql://localhost:5432/mydb",
        "api_key": "secret123",
        "settings": {"timeout": 30},
    }

    with open('config.json', 'w') as f:
        json.dump(config, f, indent=2)
        # If the process crashes here, the file is partially written

The problem is that open() in 'w' mode truncates the target file immediately and then writes into it in place. If anything interrupts the write - a crash, power failure, SIGKILL, or out-of-memory error - you get a partial file.
    Time 0ms:  File created, empty
    Time 1ms:  Writing "{"database_url": ..."
    Time 10ms: CRASH!
    Time 11ms: File contains partial content, no closing bracket
    Result:    Corrupted file that breaks JSON parsers

This happens because there’s no atomicity. The write operation has no rollback mechanism.
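To see the failure without actually crashing a server, here is a small sketch that stands in for the crash with an exception raised partway through the write. The helper name simulate_interrupted_write and the temporary directory are my own assumptions for the demo; a real crash would interrupt the process at the same point.

```python
import json
import os
import tempfile


def simulate_interrupted_write(path):
    """Overwrite path in place and fail partway through (stand-in for a crash)."""
    with open(path, "w") as f:  # the previous content is truncated right here
        f.write('{"database_url": "postgresql://localhost:5432/mydb", "api_k')
        f.flush()
        raise RuntimeError("simulated crash mid-write")


workdir = tempfile.mkdtemp()
config_path = os.path.join(workdir, "config.json")

# Start from a valid config file.
with open(config_path, "w") as f:
    json.dump({"database_url": "postgresql://localhost:5432/mydb"}, f)

try:
    simulate_interrupted_write(config_path)
except RuntimeError:
    pass

# The previous valid version is gone; only the partial write remains.
try:
    with open(config_path) as f:
        json.load(f)
    still_valid = True
except json.JSONDecodeError:
    still_valid = False

print("config still parses:", still_valid)
```

The old, valid file does not survive the failed write, because the target itself was the scratch space.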
The Solution: Atomic Writes with safer
The safer library implements the industry-standard “write to temporary, then rename” pattern:
Step 1: Create temp file in same directory (config.json.tmp)
Step 2: Write all content to temp file
Step 3: Flush and close temp file
Step 4: Atomic rename temp -> target (only if no exceptions)
Step 5: On exception: Delete temp file, target unchanged

Here’s the fix:
    import json
    import safer

    config = {
        "database_url": "postgresql://localhost:5432/mydb",
        "api_key": "secret123",
        "settings": {"timeout": 30},
    }

    with safer.open('config.json', 'w') as f:
        json.dump(config, f, indent=2)
        # File only written when context exits successfully

The key difference: safer.open() instead of open(). That’s it. A one-line change, plus the import.
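To make the write-to-temporary-then-rename steps concrete, here is a minimal standard-library sketch of the pattern. It is not safer's actual implementation; atomic_open is a hypothetical helper name I made up for illustration, and it skips details such as permissions and fsync.

```python
import contextlib
import json
import os
import tempfile


@contextlib.contextmanager
def atomic_open(path, mode="w"):
    """Hypothetical helper: write to a temp file, rename it over path on success."""
    directory = os.path.dirname(os.path.abspath(path))
    # Step 1: temp file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, mode) as f:
            yield f      # Step 2: the caller writes all content here
            f.flush()    # Step 3: flush; leaving the with-block closes the file
        os.replace(tmp_path, path)  # Step 4: atomic rename onto the target
    except BaseException:
        # Step 5: on any exception, delete the temp file; the target is untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "config.json")

with atomic_open(target) as f:
    json.dump({"version": 1}, f)

try:
    with atomic_open(target) as f:
        f.write('{"version": 2, "api_k')
        raise RuntimeError("simulated failure mid-write")
except RuntimeError:
    pass

with open(target) as f:
    survived = json.load(f)
leftovers = [name for name in os.listdir(workdir) if name.endswith(".tmp")]
print(survived, leftovers)
```

Note that step 5 only runs when Python gets a chance to handle an exception. After a SIGKILL or power loss nothing cleans up, so a stray temp file can remain, but the target still holds its last complete version.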
If the process crashes mid-write now:
    Time 0ms:  Temp file created (config.json.tmp)
    Time 1ms:  Writing to temp file...
    Time 10ms: CRASH!
    Time 11ms: Temp file deleted automatically (on a Python exception; a hard kill may leave a stray temp file behind)
    Result:    config.json UNCHANGED (original content intact)

Why This Matters
Data Integrity
Your files are never in an inconsistent state:
Either: Complete new version exists (write succeeded)
Or: Old version remains untouched (write failed/crashed)
Never: Partial content (the problem we solved)

Production Reliability
Crashes don’t cascade into data corruption. My application restarts cleanly because config files are never corrupted.
Simplified Error Handling
No complex try/except blocks needed:
    # BEFORE: Manual error handling (verbose, error-prone)
    import json
    import os
    import tempfile

    try:
        temp_file = tempfile.NamedTemporaryFile(delete=False, mode='w')
        json.dump(config, temp_file)
        temp_file.close()
        os.rename(temp_file.name, 'config.json')
    except Exception:
        os.unlink(temp_file.name)
        raise

    # AFTER: Automatic with safer (clean, reliable)
    import safer

    with safer.open('config.json', 'w') as f:
        json.dump(config, f)

Trial and Error: What I Tried First
Attempt 1: Manual fsync
I thought flushing would help:
    import json
    import os

    with open('config.json', 'w') as f:
        json.dump(config, f)
        f.flush()
        os.fsync(f.fileno())  # Force write to disk

Result: Still vulnerable. fsync ensures that data already written reaches the disk, but the file was truncated the moment it was opened, so a crash before the write finishes still leaves a partial file.
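fsync is still useful, just in a different place: it addresses durability, not atomicity. As a rough sketch (assuming a POSIX system for the directory sync; write_durably is a hypothetical helper name), you can fsync the temp file before the rename and the containing directory after it, so both the data and the rename survive a power loss:

```python
import os
import tempfile


def write_durably(path, data):
    """Hypothetical helper: temp file + fsync + rename + directory fsync."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # data is on disk before the file becomes visible
        os.replace(tmp_path, path)  # atomic swap onto the target
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    if os.name == "posix":
        # Sync the directory entry so the rename itself is persisted.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "config.json")
write_durably(target, b'{"timeout": 30}')

with open(target, "rb") as f:
    content = f.read()
print(content)
```

The extra syncs cost latency, so whether they are worth it depends on how bad it would be to lose the most recent write after a power failure.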
Attempt 2: Roll My Own Atomic Write
    import json
    import os
    import tempfile

    temp_file = tempfile.NamedTemporaryFile(delete=False, mode='w')
    try:
        json.dump(config, temp_file)
        temp_file.close()
        os.rename(temp_file.name, 'config.json')
    except Exception:
        os.unlink(temp_file.name)
        raise

Result: Works, but verbose. And I forgot edge cases:
- What if the directory doesn’t exist?
- What if rename fails on different filesystems?
- What about file permissions?
Every file write became a 10-line boilerplate. I gave up.
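For reference, the three edge cases above map onto a few standard-library calls. This is only a sketch of how a hand-rolled version might cover them on a POSIX system, not what safer does internally; the helper names are hypothetical.

```python
import os
import shutil
import stat
import tempfile


def ensure_parent_dir(path):
    """Edge case 1: create the target's directory if it does not exist."""
    parent = os.path.dirname(os.path.abspath(path))
    os.makedirs(parent, exist_ok=True)
    return parent


def on_same_filesystem(path_a, path_b):
    """Edge case 2 (heuristic): equal st_dev values mean the same device."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def temp_file_like(target):
    """Edge case 3: create a temp file next to target and copy target's mode."""
    parent = ensure_parent_dir(target)
    fd, tmp_path = tempfile.mkstemp(dir=parent, suffix=".tmp")
    os.close(fd)
    if os.path.exists(target):
        # mkstemp creates owner-only files, so restore the target's permissions.
        shutil.copymode(target, tmp_path)
    return tmp_path


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "nested", "config.json")
ensure_parent_dir(target)
with open(target, "w") as f:
    f.write("{}")
os.chmod(target, 0o640)

tmp_path = temp_file_like(target)
same_fs = on_same_filesystem(tmp_path, os.path.dirname(target))
tmp_mode = stat.S_IMODE(os.stat(tmp_path).st_mode)
print(same_fs, oct(tmp_mode))
```

Creating the temp file in the target's own directory is what keeps the later rename on a single filesystem, and copying the mode avoids silently tightening the file's permissions after the swap.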
Attempt 3: The safer Library
    import safer

    with safer.open('config.json', 'w') as f:
        json.dump(config, f)

Result: Perfect. Drop-in replacement for open(). No boilerplate. Handles all edge cases.
Advanced Usage
Stream Processing with Large Files
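    import safer

    def process_large_dataset(input_file, output_file):
        with open(input_file, 'r') as infile:
            with safer.open(output_file, 'w') as outfile:
                for line in infile:
                    processed = transform(line)  # transform() stands in for your own per-line logic
                    outfile.write(processed)
                    # Even with millions of lines, file is protected

If the process crashes after processing 1 million lines, the output file doesn’t exist yet. No partial output to deal with. For very large outputs, check the temp_file option of safer.open() in the version you have installed; as I understand the documentation, it spools writes to a temporary file on disk instead of holding them in memory.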
Custom Context Manager
    from contextlib import contextmanager
    import json
    import safer

    @contextmanager
    def safe_file_update(filename, mode='w'):
        with safer.open(filename, mode) as f:
            try:
                yield f
            except Exception as e:
                print(f"Error updating {filename}: {e}")
                raise
            # Cleanup handled automatically by safer

    # Usage
    with safe_file_update('data.json') as f:
        json.dump(data, f)

Common Mistakes to Avoid
Mistake 1: Thinking Small Files Are Safe
    # WRONG: Assuming small writes can't be interrupted
    config = json.dumps(data)  # Maybe 200 bytes
    with open('config.json', 'w') as f:
        f.write(config)  # Can still be interrupted!

Even small writes can be interrupted. A crash at the wrong moment still produces partial files. For example, a crash right after open() has truncated the file, but before the buffered data is flushed, leaves it empty.
Mistake 2: Only Using safer for “Important” Files
    # WRONG: Inconsistent approach
    with safer.open('config.json', 'w') as f:  # "Important"
        json.dump(config, f)

    with open('temp.log', 'w') as f:  # "Not important"
        f.write(log_data)
        # But what if temp.log is needed for recovery?

Use safer for ALL file writes. Consistency prevents surprises.
Mistake 3: Not Installing Before You Need It
    pip install safer

Don’t wait for your first corrupted file. Install safer now and make it your default.
Installation
    pip install safer

The library is tiny and has no dependencies beyond Python’s standard library.
Summary
Partial file writes are a silent reliability killer. They corrupt data without warning and leave systems in unrecoverable states. The safer library provides a simple, battle-tested solution:
- Drop-in replacement: Change open to safer.open
- Atomic writes: Write to temp file, then rename on success
- Automatic cleanup: On failure, temp file deleted, target unchanged
- Zero boilerplate: No manual error handling needed
The library has been tested on millions of machines through inclusion in other projects. It handles edge cases I would have missed in manual implementations.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- safer library on PyPI
- safer library GitHub repository
- Reddit discussion on safer library
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!