How to Process Large Email Batches with AI: Context Window Limits Explained
The Problem
I had 15,000 emails in my inbox and wanted AI to help clean them up. My first attempt failed spectacularly - I fed everything to Claude and hit the context window limit. The AI simply couldn’t process that much data at once.
LLMs have finite context windows (200K tokens for Claude). When you’re processing thousands of emails, you need a batching strategy. Here’s what I learned about making it work.
Why Batching Matters
Each email takes roughly 500-2000 tokens depending on length. With Claude’s 200K token limit:
Safe batch calculation:
- Context window: 200,000 tokens
- Average email: 1,500 tokens
- Room for prompt/response: 50,000 tokens
- Safe batch size: ~100 emails

Try to process more, and you’ll get truncated results or errors.
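The arithmetic above is easy to turn into a helper. Here is a minimal sketch of it, plus a greedy packer for inboxes where email length varies a lot. The function names are my own, and `estimate_tokens` uses a rough "about 4 characters per token" rule of thumb, which is an assumption and not a real tokenizer.

```python
def safe_batch_size(context_window, reserved_tokens, avg_tokens_per_email):
    """Return how many average-sized emails fit in the remaining token budget."""
    budget = context_window - reserved_tokens
    if budget <= 0 or avg_tokens_per_email <= 0:
        raise ValueError("token budget and average email size must be positive")
    return budget // avg_tokens_per_email


def estimate_tokens(text):
    # Rough heuristic (an assumption, not a real tokenizer): ~4 chars per token.
    return max(1, len(text) // 4)


def pack_by_tokens(bodies, budget):
    """Greedily group email bodies so each batch's estimated tokens fit the budget.

    A single email larger than the budget still gets a batch of its own,
    so oversized messages need separate handling (e.g. truncation).
    """
    batches, current, used = [], [], 0
    for body in bodies:
        cost = estimate_tokens(body)
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(body)
        used += cost
    if current:
        batches.append(current)
    return batches


print(safe_batch_size(200_000, 50_000, 1_500))  # 100
```

Packing by estimated tokens instead of a fixed count keeps a batch of long threads from blowing past the limit while letting short notifications fill a batch more densely.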
Batch Processing Strategies
Strategy 1: Linear Batch Processing O(n)
The simplest approach - process everything in fixed-size batches.
    def process_inbox_linear(emails, batch_size=100):
        results = []
        for i in range(0, len(emails), batch_size):
            batch = emails[i:i + batch_size]
            # Send to AI for classification
            batch_results = classify_batch(batch)
            results.extend(batch_results)
        return results

ASCII diagram of the flow:

    +---------+    +---------+    +---------+    +---------+
    | Batch 1 | -> | Batch 2 | -> | Batch 3 | -> | Batch N |
    | 100 msgs|    | 100 msgs|    | 100 msgs|    | 100 msgs|
    +---------+    +---------+    +---------+    +---------+
         |              |              |              |
         v              v              v              v
    [Classify]     [Classify]     [Classify]     [Classify]
         |              |              |              |
         +--------------+--------------+--------------+
                               |
                               v
                     [Aggregated Results]

Pros: Simple, predictable
Cons: Processes everything, including irrelevant emails
Best for: Full inbox analysis, initial classification
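Several snippets in this article call a `chunk` helper without defining it, so here is one way it could look, along with a variant of the linear loop that checks the model returned one label per email. This is a sketch: `classify_batch_stub` is a hypothetical stand-in for the real AI call, and `process_inbox_checked` is my own name.

```python
from itertools import islice


def chunk(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def classify_batch_stub(batch):
    # Hypothetical stand-in for the AI call: labels by a simple keyword.
    return ["newsletter" if "unsubscribe" in text.lower() else "other"
            for text in batch]


def process_inbox_checked(emails, batch_size=100, classify=classify_batch_stub):
    results = []
    for batch in chunk(emails, batch_size):
        labels = classify(batch)
        # A truncated response usually shows up as missing labels.
        if len(labels) != len(batch):
            raise ValueError(f"expected {len(batch)} labels, got {len(labels)}")
        results.extend(labels)
    return results
```

The length check is cheap insurance: if a batch was too big and the response got cut off, you find out immediately instead of silently misaligning labels and emails.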
Strategy 2: Filter-First Processing O(n) + filter
Reduce the dataset before batching. Gmail API lets you filter by sender, date, label, etc.
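The `build_gmail_query` helper used in this strategy is not shown in the article, so here is a minimal sketch of what it might do: map a dict of criteria onto Gmail search operators. The accepted keys (`sender`, `older_than`, `newer_than`, `label`) are my own choice for illustration; the operators they map to (`from:`, `older_than:`, `newer_than:`, `label:`) are standard Gmail search syntax.

```python
def build_gmail_query(criteria):
    """Turn a dict of filter criteria into a Gmail search string.

    Hypothetical helper: the supported keys are this sketch's own choice.
    """
    operators = {
        "sender": "from",
        "older_than": "older_than",
        "newer_than": "newer_than",
        "label": "label",
    }
    parts = []
    for key, value in criteria.items():
        if key not in operators:
            raise KeyError(f"unsupported filter: {key}")
        parts.append(f"{operators[key]}:{value}")
    # Gmail treats space-separated terms as an implicit AND.
    return " ".join(parts)


query = build_gmail_query({"sender": "newsletter@example.com", "older_than": "1y"})
print(query)  # from:newsletter@example.com older_than:1y
```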
    def process_with_filters(service, filter_criteria):
        # Step 1: Use Gmail API to reduce dataset
        query = build_gmail_query(filter_criteria)
        # Note: list() returns one page of results; follow nextPageToken for more
        filtered_ids = service.users().messages().list(
            userId='me',
            q=query  # e.g., "from:newsletter@* older_than:1y"
        ).execute()

        # Step 2: Process only matching emails
        # ('messages' is absent from the response when nothing matches)
        emails = fetch_emails_batch(service, filtered_ids.get('messages', []))
        return process_in_batches(emails)

Flow diagram:

    [Gmail Query Filter]
             |
             v
    +------------------+
    | Reduced Dataset  |  (e.g., newsletters > 1 year old)
    | ~500 emails      |
    +------------------+
             |
             v
    [Batch Processing: 5 batches of 100]
             |
             v
    [Action: Archive/Delete]

Pros: Reduces total processing
Cons: Requires knowing filter criteria upfront
Best for: Targeted cleanup (e.g., “old newsletters”)
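One practical detail with filter-first: `messages.list` returns results a page at a time, so a filter that matches more than one page needs a loop over `nextPageToken`. A sketch of that loop, assuming `service` is an authorized Gmail API client from `google-api-python-client` (the function name is my own):

```python
def list_all_message_ids(service, query, page_size=500):
    """Collect every message ID matching `query`, following nextPageToken."""
    ids = []
    page_token = None
    while True:
        response = service.users().messages().list(
            userId="me",
            q=query,
            maxResults=page_size,
            pageToken=page_token,
        ).execute()
        # The 'messages' key is absent when a page has no matches.
        ids.extend(message["id"] for message in response.get("messages", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return ids
```

The returned IDs can then be chunked into AI-sized batches just like in Strategy 1.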
Strategy 3: Sender-Based Aggregation O(n) + group
Group by sender first, then batch-delete by category.
    def process_by_sender(emails):
        # Step 1: Group emails by sender (one pass)
        senders = {}
        for email in emails:
            sender = extract_sender(email)
            senders.setdefault(sender, []).append(email)

        # Step 2: Classify senders in batches
        sender_categories = classify_senders(list(senders.keys()))

        # Step 3: Apply bulk actions by category
        for category, sender_list in sender_categories.items():
            if category == 'spam':
                delete_emails_from_senders(sender_list)

ASCII flow:

    [All Emails]
         |
         v
    [Extract Unique Senders]   <-- One pass, O(n)
         |
         v
    +-------------------+
    | Senders to Claude |      <-- Batch classify ~100 senders
    +-------------------+
         |
         v
    [Categories: spam, newsletters, promotions, important]
         |
         v
    [Bulk Delete by Category]

Pros: Efficient for spam/marketing cleanup
Cons: May miss legitimate emails from spammy senders
Best for: Bulk sender-based cleanup
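The grouping step is plain Python and needs no AI at all. Here is a sketch of how `extract_sender` could work using the standard library's `email.utils.parseaddr`. I'm assuming each email is a dict with a `from` key holding the raw From header; that shape, and the `group_by_sender` and `top_senders` helpers, are my own for illustration.

```python
from collections import defaultdict
from email.utils import parseaddr


def extract_sender(email):
    """Return the lower-cased address from a header like 'Ann <ann@example.com>'."""
    _, address = parseaddr(email["from"])
    return address.lower()


def group_by_sender(emails):
    """Group emails by normalized sender address in a single pass."""
    groups = defaultdict(list)
    for email in emails:
        groups[extract_sender(email)].append(email)
    return groups


def top_senders(groups, limit=100):
    """Return the highest-volume senders first, capped at one batch's worth."""
    return sorted(groups, key=lambda sender: len(groups[sender]), reverse=True)[:limit]
```

Sorting by volume means the first batch of senders you classify covers the largest share of the inbox, which is usually where the cleanup wins are.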
Strategy 4: Hierarchical Processing
Two-phase approach: quick metadata scan, then detailed content processing.
    def hierarchical_process(emails):
        # Phase 1: Quick metadata scan (more emails per batch)
        metadata_results = []
        for batch in chunk(emails, 200):  # Larger batches for metadata
            results = analyze_metadata_only(batch)
            metadata_results.extend(results)

        # Phase 2: Full content for flagged emails only
        flagged = [r for r in metadata_results if r.needs_review]
        detailed_results = []
        for batch in chunk(flagged, 50):  # Smaller batches for full content
            results = analyze_full_content(batch)
            detailed_results.extend(results)

        return detailed_results

Flow visualization:

    Phase 1: Metadata Scan (Fast)
    +------------------+    +------------------+
    | Batch 1: 200     |    | Batch 2: 200     |
    | [sender,subject, |    | [sender,subject, |
    |  date,label]     |    |  date,label]     |
    +------------------+    +------------------+
             |                       |
             v                       v
    [Flag: needs_review?]   [Flag: needs_review?]
             |                       |
             +-----------+-----------+
                         |
                         v
              [Flagged: ~50 emails]

    Phase 2: Full Content Analysis (Deep)
    +------------------+
    | Full Content     |
    | of 50 flagged    |
    +------------------+
             |
             v
      [Final Decision]

Pros: Minimizes expensive full-content processing
Cons: Two-phase approach, more complex
Best for: Large inboxes with mixed content types
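To make the two phases concrete, here is a runnable sketch with rule-based stand-ins for both model calls. The stubs, the `MetadataResult` shape, and the dict layout of each email (`id`, `sender`, `body`) are all assumptions for illustration. One detail it makes explicit: phase 1 only returns flags, so the flagged IDs have to be mapped back to the full emails before phase 2.

```python
from dataclasses import dataclass


@dataclass
class MetadataResult:
    email_id: str
    needs_review: bool


def analyze_metadata_only(batch):
    # Hypothetical stand-in for the cheap phase-1 call: skip obvious no-reply mail.
    return [
        MetadataResult(email["id"], not email["sender"].startswith("no-reply@"))
        for email in batch
    ]


def analyze_full_content(batch):
    # Hypothetical stand-in for the expensive phase-2 call.
    return [
        (email["id"], "keep" if "invoice" in email["body"].lower() else "archive")
        for email in batch
    ]


def hierarchical_process(emails, meta_batch=200, content_batch=50):
    by_id = {email["id"]: email for email in emails}

    # Phase 1: large batches, metadata only.
    flagged_ids = []
    for start in range(0, len(emails), meta_batch):
        results = analyze_metadata_only(emails[start:start + meta_batch])
        flagged_ids.extend(r.email_id for r in results if r.needs_review)

    # Phase 2: look the flagged emails up again and send full content.
    flagged = [by_id[email_id] for email_id in flagged_ids]
    decisions = []
    for start in range(0, len(flagged), content_batch):
        decisions.extend(analyze_full_content(flagged[start:start + content_batch]))
    return decisions
```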
Choosing Your Strategy
Here’s a quick decision guide:
                    +------------------+
                    | How many emails? |
                    +--------+---------+
                             |
              +--------------+--------------+
              |                             |
           < 1,000                       > 1,000
              |                             |
              v                             v
    +------------------+          +------------------+
    | Linear Batch     |          | Do you know what |
    | Strategy 1       |          | you're looking   |
    +------------------+          | for?             |
                                  +--------+---------+
                                           |
                            +--------------+--------------+
                            |                             |
                           YES                            NO
                            |                             |
                            v                             v
                  +------------------+          +------------------+
                  | Filter-First     |          | Sender-Based     |
                  | Strategy 2       |          | Strategy 3       |
                  +------------------+          +------------------+

Common Mistakes to Avoid
Batch size too large: You’ll hit context limits and get truncated results.
    # WRONG: Will overflow context
    batch = emails[:500]  # Too many!
    result = claude.classify(batch)

    # CORRECT: Stay within limits
    batch = emails[:100]  # Safe
    result = claude.classify(batch)

No aggregation between batches: Each batch starts fresh, duplicating analysis.
    # WRONG: No context between batches
    for batch in chunk(emails, 100):
        classify(batch)  # AI has no memory of previous batches

    # CORRECT: Aggregate state
    # (keep previous_context compact, e.g. a summary, so it fits in the window too)
    all_results = []
    for batch in chunk(emails, 100):
        results = classify(batch, previous_context=all_results)
        all_results.extend(results)

Processing full content when metadata suffices: Wastes tokens on irrelevant emails.
    # WRONG: Expensive for large inboxes
    full_content = fetch_all_email_bodies(emails)

    # CORRECT: Filter first
    metadata = fetch_metadata_only(emails)
    candidates = filter_by_metadata(metadata)
    full_content = fetch_bodies(candidates)  # Much smaller

Working with Gmail API
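On the Gmail side, the metadata-first approach does not require downloading message bodies at all: `messages.get` accepts `format='metadata'` together with a `metadataHeaders` list. Here is a sketch, assuming `service` is an authorized Gmail API client; `fetch_headers` is my own name, not part of the library.

```python
def fetch_headers(service, message_ids, headers=("From", "Subject", "Date")):
    """Fetch selected headers for each message without downloading bodies."""
    rows = []
    for msg_id in message_ids:
        message = service.users().messages().get(
            userId="me",
            id=msg_id,
            format="metadata",
            metadataHeaders=list(headers),
        ).execute()
        # Header names can vary in case between messages, so match case-insensitively.
        found = {h["name"].lower(): h["value"] for h in message["payload"]["headers"]}
        row = {"id": msg_id}
        for name in headers:
            row[name] = found.get(name.lower(), "")
        rows.append(row)
    return rows
```

Each row is a few dozen tokens instead of a few hundred or more, which is what makes the larger phase-1 batches in Strategy 4 possible.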
Gmail API has its own batching limits. You can group up to 100 API calls per batch request:
    def batch_delete_emails(service, message_ids):
        # Build the batch from the service so it targets the Gmail batch endpoint
        batch = service.new_batch_http_request()

        for msg_id in message_ids:
            batch.add(
                # delete() is permanent; use trash() if you want to be able to recover
                service.users().messages().delete(
                    userId='me',
                    id=msg_id
                )
            )

        batch.execute()

    # Match API batch size to AI batch size
    for batch_ids in chunk(message_ids, 100):
        batch_delete_emails(service, batch_ids)

Summary
Batch processing with AI requires three things:
- Know your limits - Context window size determines batch size
- Choose the right strategy - Linear, filter-first, sender-based, or hierarchical
- Aggregate state - Maintain context between batches
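For the third point, the trick is to carry state forward without letting it grow until it eats the context window itself. One way to do that is to pass a small, capped summary of earlier decisions instead of the full result list. A sketch, assuming each email is a dict with a `sender` key and `classify` is a hypothetical callable that takes a batch plus a context string and returns one label per email:

```python
from collections import Counter


def summarize_state(sender_labels, max_senders=50):
    """Render a compact, bounded summary of earlier decisions for the next prompt."""
    lines = []
    # Cap the summary at the first `max_senders` senders seen so it stays small.
    for sender, counts in list(sender_labels.items())[:max_senders]:
        label, _ = counts.most_common(1)[0]
        lines.append(f"{sender}: {label}")
    return "\n".join(lines)


def process_with_state(batches, classify):
    """Run `classify(batch, context)` per batch, carrying a small summary forward."""
    sender_labels = {}
    all_labels = []
    for batch in batches:
        labels = classify(batch, summarize_state(sender_labels))
        for email, label in zip(batch, labels):
            sender_labels.setdefault(email["sender"], Counter())[label] += 1
        all_labels.extend(labels)
    return all_labels
```

The summary costs one short line per sender, so it stays a fixed, predictable slice of the prompt budget no matter how many batches have run.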
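The Big O thinking from computer science applies here: every strategy above is still an O(n) pass over the emails it touches, so the real savings come from shrinking n before anything reaches the model (filtering, grouping by sender, or scanning metadata first) and from batching what remains to fit within AI constraints.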
Start small - test with 100 emails first. Once your batching logic works, scale up. The context window isn’t going anywhere, so design your system to respect it from day one.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.