How to Implement the Claim Check Pattern in Spring Boot for Scalable Large File Uploads
The Crash That Changed Everything
I was deploying our document processing service when I saw this in the logs:
java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:3332) at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191) at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile.getBytes(StandardMultipartHttpServletRequest.java:124) ...The system had been working fine during testing with small files. But in production, users were uploading 500MB PDFs, 2GB video files, and the backend was crashing repeatedly.
I tried increasing the heap size:
java -Xmx4g -jar application.jarWorked for a while. Then 8GB. Then 16GB. Eventually I realized: this is not a solution, it’s a band-aid.
What I Was Doing Wrong
My original approach was synchronous and memory-hungry:
@PostMapping("/upload")public ResponseEntity<String> uploadFile(@RequestParam("file") MultipartFile file) { // Loading entire file into memory - BAD! byte[] fileData = file.getBytes();
// Processing in same thread - BLOCKS! Document doc = pdfProcessor.extractText(fileData);
// Storing result synchronously - SLOW! storageService.save(doc);
return ResponseEntity.ok("Processed: " + file.getOriginalFilename());}The problems were obvious once I looked:
file.getBytes()loads the entire file into heap memory- Synchronous processing blocks the request thread
- No backpressure - any number of concurrent uploads could crash the system
I tried streaming:
@PostMapping("/upload")public ResponseEntity<String> uploadFile(@RequestParam("file") MultipartFile file) { try (InputStream is = file.getInputStream()) { // Still loads into memory when Kafka tries to serialize kafkaTemplate.send("file-uploads", file.getBytes()); return ResponseEntity.ok("Queued"); }}But Kafka has message size limits, and I hit this:
org.apache.kafka.common.errors.RecordTooLargeException:The message is 524288000 bytes when serialized which is larger than 10485760,the maximum request size you have configured with the max.request.size configurationI could increase max.request.size, but that’s just another band-aid. The real problem was architectural.
Understanding the Claim Check Pattern
The Claim Check Pattern is named after airline baggage claim checks. When you check in luggage, you get a small ticket. The luggage goes in cargo, you carry the ticket. When you need the luggage, you present the ticket.
In software terms:
- Baggage = Large file payload
- Cargo hold = Object storage (MinIO, S3)
- Ticket = Unique token (claim check)
- Conveyor belt = Message queue (Kafka)
Instead of:
[Large File] -> [Kafka (dies)] -> [Consumer]You do:
[Large File] -> [MinIO] -> returns token[token] -> [Kafka (happy)] -> [Consumer][Consumer] -> uses token -> [MinIO] -> [Large File]Setting Up the Infrastructure
I used MinIO for object storage (S3-compatible, runs locally) and Kafka for messaging.
version: '3.8'services: minio: image: minio/minio:latest ports: - "9000:9000" - "9001:9001" environment: MINIO_ROOT_USER: admin MINIO_ROOT_PASSWORD: admin123 command: server /data --console-address ":9001"
kafka: image: confluentinc/cp-kafka:latest ports: - "9092:9092" environment: KAFKA_BROKER_ID: 1 KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181 KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092 KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1Spring Boot configuration:
spring: kafka: bootstrap-servers: localhost:9092 producer: key-serializer: org.apache.kafka.common.serialization.StringSerializer value-serializer: org.apache.kafka.common.serialization.StringSerializer consumer: group-id: claim-check-group auto-offset-reset: earliest key-deserializer: org.apache.kafka.common.serialization.StringDeserializer value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
minio: endpoint: http://localhost:9000 access-key: ${MINIO_ACCESS_KEY:admin} secret-key: ${MINIO_SECRET_KEY:admin123} bucket: file-uploadsImplementing the Storage Service
First, I needed a service to handle MinIO operations:
@Service@Slf4jpublic class MinioStorageService {
private final MinioClient minioClient; private final String bucketName;
public MinioStorageService( MinioClient minioClient, @Value("${minio.bucket}") String bucketName) { this.minioClient = minioClient; this.bucketName = bucketName; initializeBucket(); }
private void initializeBucket() { try { boolean exists = minioClient.bucketExists( BucketExistsArgs.builder() .bucket(bucketName) .build() ); if (!exists) { minioClient.makeBucket( MakeBucketArgs.builder() .bucket(bucketName) .build() ); log.info("Created bucket: {}", bucketName); } } catch (Exception e) { throw new StorageException("Failed to initialize bucket", e); } }
public String storeFile(MultipartFile file) { String claimToken = UUID.randomUUID().toString(); String objectName = claimToken + "-" + file.getOriginalFilename();
try { minioClient.putObject( PutObjectArgs.builder() .bucket(bucketName) .object(objectName) .stream(file.getInputStream(), file.getSize(), -1) .contentType(file.getContentType()) .build() ); log.info("Stored file {} with token {}", objectName, claimToken); return claimToken; } catch (Exception e) { log.error("Failed to store file", e); throw new StorageException("Failed to store file", e); } }
public InputStream retrieveFile(String claimToken) { try { Iterable<Result<Item>> objects = minioClient.listObjects( ListObjectsArgs.builder() .bucket(bucketName) .prefix(claimToken) .build() );
for (Result<Item> item : objects) { String objectName = item.get().objectName(); return minioClient.getObject( GetObjectArgs.builder() .bucket(bucketName) .object(objectName) .build() ); } throw new FileNotFoundException("No file found for token: " + claimToken); } catch (Exception e) { throw new StorageException("Failed to retrieve file", e); } }
public void deleteFile(String claimToken) { try { Iterable<Result<Item>> objects = minioClient.listObjects( ListObjectsArgs.builder() .bucket(bucketName) .prefix(claimToken) .build() );
for (Result<Item> item : objects) { minioClient.removeObject( RemoveObjectArgs.builder() .bucket(bucketName) .object(item.get().objectName()) .build() ); } log.info("Deleted file with token {}", claimToken); } catch (Exception e) { log.error("Failed to delete file", e); } }}The key insight here: never load the entire file into memory. The stream is passed directly to MinIO.
The Message Model
I created a lightweight message that only contains metadata:
@Data@Builder@NoArgsConstructor@AllArgsConstructorpublic class ClaimCheckMessage { private String claimToken; private String originalFilename; private String contentType; private long fileSize; private Instant uploadedAt; private Map<String, String> metadata;}This message is tiny - maybe 500 bytes. Kafka can handle millions of these.
The Upload Endpoint
Now the controller became simple and fast:
@RestController@RequestMapping("/api/files")@Slf4jpublic class FileUploadController {
private final MinioStorageService storageService; private final KafkaTemplate<String, String> kafkaTemplate; private final ObjectMapper objectMapper;
public FileUploadController( MinioStorageService storageService, KafkaTemplate<String, String> kafkaTemplate, ObjectMapper objectMapper) { this.storageService = storageService; this.kafkaTemplate = kafkaTemplate; this.objectMapper = objectMapper; }
@PostMapping("/upload") public ResponseEntity<UploadResponse> uploadFile( @RequestParam("file") MultipartFile file) {
long startTime = System.currentTimeMillis();
// 1. Stream file to MinIO, get claim token String claimToken = storageService.storeFile(file);
// 2. Create lightweight claim check message ClaimCheckMessage message = ClaimCheckMessage.builder() .claimToken(claimToken) .originalFilename(file.getOriginalFilename()) .contentType(file.getContentType()) .fileSize(file.getSize()) .uploadedAt(Instant.now()) .metadata(Map.of("source", "api-upload")) .build();
// 3. Send only token reference to Kafka try { String messageJson = objectMapper.writeValueAsString(message); kafkaTemplate.send("file-uploads", claimToken, messageJson);
long duration = System.currentTimeMillis() - startTime; log.info("File uploaded in {}ms: {}", duration, claimToken);
return ResponseEntity.accepted() .body(UploadResponse.builder() .claimToken(claimToken) .status("QUEUED") .message("File uploaded successfully, processing queued") .processingTimeMs(duration) .build()); } catch (Exception e) { // Cleanup on failure storageService.deleteFile(claimToken); log.error("Failed to queue file", e); throw new UploadException("Failed to queue file for processing", e); } }}I tested with a 500MB file:
File uploaded in 12ms: a1b2c3d4-e5f6-7890-abcd-ef123456789012 milliseconds. The user gets an immediate response while processing happens asynchronously.
The Consumer
Processing happens in a separate service:
@Component@Slf4jpublic class FileProcessingConsumer {
private final MinioStorageService storageService; private final DocumentProcessor documentProcessor; private final ObjectMapper objectMapper;
@KafkaListener(topics = "file-uploads", groupId = "claim-check-group") public void processFile(String messageJson) { ClaimCheckMessage message; InputStream fileStream = null;
try { message = objectMapper.readValue(messageJson, ClaimCheckMessage.class); log.info("Processing file: {} ({} bytes)", message.getOriginalFilename(), message.getFileSize());
// Retrieve file using claim token fileStream = storageService.retrieveFile(message.getClaimToken());
// Process the file (streaming, not loading all into memory) documentProcessor.process(fileStream, message);
// Cleanup after successful processing storageService.deleteFile(message.getClaimToken());
log.info("File processed successfully: {}", message.getClaimToken());
} catch (Exception e) { log.error("Failed to process file", e); throw new RuntimeException("Processing failed", e); } finally { if (fileStream != null) { try { fileStream.close(); } catch (IOException e) { log.error("Failed to close stream", e); } } } }}The consumer processes at its own pace. If it crashes, Kafka will redeliver the message. The file is still in MinIO, safe and sound.
Handling Failures with Dead Letter Queue
I learned the hard way that failed messages need somewhere to go:
@Configurationpublic class KafkaConfig {
@Bean public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory( ConsumerFactory<String, String> consumerFactory, ProducerFactory<String, String> producerFactory) {
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>(); factory.setConsumerFactory(consumerFactory);
// Configure retry factory.setCommonErrorHandler( new DefaultErrorHandler( new DeadLetterPublishingRecoverer( new KafkaTemplate<>(producerFactory), (record, ex) -> new TopicPartition( "file-uploads-dlq", record.partition() ) ), new FixedBackOff(1000L, 3L) // 3 retries with 1s delay ) );
return factory; }}Now failed messages go to a dead letter queue for investigation:
@Component@Slf4jpublic class DlqConsumer {
@KafkaListener(topics = "file-uploads-dlq", groupId = "dlq-monitor") public void handleFailedMessage(String messageJson) { log.error("Message failed processing after retries: {}", messageJson); // Alert operations team, store for manual review, etc. }}Why This Works
Memory footprint stays constant. Whether the file is 1MB or 10GB, the API endpoint:
- Streams to MinIO (disk I/O, not memory)
- Sends a 500-byte message to Kafka
- Returns immediately
Horizontal scaling becomes trivial. Need more processing capacity? Add more consumer instances. Kafka handles the distribution. Each consumer only needs enough memory for one file at a time.
The system becomes resilient. If the processing service crashes:
- File is safely stored in MinIO
- Message remains in Kafka
- Processing resumes when service restarts
Related Knowledge: Other Patterns for Large Payloads
The Claim Check Pattern is one of several approaches:
-
Streaming: Process in chunks without loading entire file. Works for real-time processing but doesn’t solve the async/decoupling problem.
-
External Store: Similar to Claim Check but without the messaging component. Useful when you just need storage, not async processing.
-
Content-Based Router: Route different file types to different processors. Often combined with Claim Check.
-
Splitter/Aggregator: Break large files into smaller chunks for parallel processing. Useful when you need to process different parts independently.
The Claim Check Pattern shines when you need:
- Async processing
- Decoupled components
- Message queue reliability guarantees
- Scalable architecture
Lessons Learned
-
Don’t fight the symptoms - Increasing heap size delays the inevitable. Fix the architecture.
-
Separate concerns - Upload, storage, and processing are different concerns with different scaling needs.
-
Embrace asynchrony - Users prefer a 12ms “queued” response over waiting minutes for “processing complete.”
-
Design for failure - Files in MinIO + messages in Kafka = nothing gets lost.
-
Measure everything - The 12ms response time came from logging. Without metrics, you’re flying blind.
The Claim Check Pattern transformed our system from one that physically could run out of memory to one that physically cannot. That’s the kind of architectural change that lets you sleep at night.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 From OOM to 12ms: Scaling Large File Uploads with the Claim Check Pattern
- 👨💻 Project Aegis GitHub Repository
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments