Kotlin Coroutines in Spring Boot: Does withContext Return to Tomcat Threads?

Feb 21, 2026

The Confusion

When I started using Kotlin coroutines in Spring Boot, I noticed something confusing. I had a controller method like this:

@GetMapping("/api/users/{id}")
suspend fun getUser(@PathVariable id: Long): UserResponse {
    log.info("Thread on entry: ${Thread.currentThread().name}")

    val user = withContext(Dispatchers.IO) {
        log.info("Thread in withContext: ${Thread.currentThread().name}")
        userRepository.findById(id)
    }

    log.info("Thread after withContext: ${Thread.currentThread().name}")
    return UserResponse(user)
}

When I called this endpoint, I saw:

Thread on entry: http-nio-8080-exec-1
Thread in withContext: DefaultDispatcher-worker-1
Thread after withContext: DefaultDispatcher-worker-1

This confused me. The request started on a Tomcat thread (http-nio-8080-exec-1), but after withContext, it continued on a dispatcher thread (DefaultDispatcher-worker-1). The handler was returning its result on the dispatcher thread, not the original Tomcat thread.

Is this correct? Does Spring Boot handle responses properly when they’re not returned on Tomcat threads?

The Short Answer

Yes, this is correct behavior. In fact, this is exactly how coroutines provide their scalability benefit.

When you use withContext(Dispatchers.IO) in Spring Boot:

Your handler CAN finish on a dispatcher thread (not the Tomcat thread it started on)
This is INTENTIONAL - when the coroutine suspends, Spring MVC puts the request into Servlet async mode and the Tomcat thread goes back to the pool
You don’t have to hop back to a Tomcat thread yourself - Spring’s async infrastructure takes the return value and completes the response

Let me break down what’s happening.

How the Thread Flow Works

Here’s what happens when a request hits your suspend function:

Request arrives → Tomcat thread handles it
                 ↓
Spring MVC detects suspend function
                 ↓
Coroutine SUSPENDS at withContext, Tomcat thread returns to pool
                 ↓
withContext runs the block on an IO dispatcher thread
                 ↓
Work completes on dispatcher thread
                 ↓
Handler returns its value on dispatcher thread (this is OK!)
                 ↓
Spring MVC completes the async request and writes the response

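The key insight: your code doesn’t need to care which thread it finishes on. Spring’s coroutine support builds on the Servlet 3.0+ async API, which lets a request outlive the thread that first handled it. In Spring MVC, once the handler returns a value, the result is dispatched back to the servlet container, so the final response write typically happens on a container thread again rather than on the dispatcher thread.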

Let me trace through a concrete example:

@PostMapping("/api/orders")
suspend fun createOrder(@RequestBody request: CreateOrderRequest): ResponseEntity<OrderResponse> {
    // Step 1: Running on Tomcat thread (http-nio-8080-exec-1)
    log.info("Creating order on: ${Thread.currentThread().name}")

    // Step 2: The coroutine suspends here; the Tomcat thread is released back to the pool
    val order = withContext(Dispatchers.IO) {
        // Step 3: Now on IO dispatcher thread (DefaultDispatcher-worker-1)
        log.info("Saving order on: ${Thread.currentThread().name}")
        orderService.create(request)
    }

    // Step 4: Still on dispatcher thread
    log.info("Returning response on: ${Thread.currentThread().name}")
    return ResponseEntity.ok(OrderResponse(order))
}

When withContext suspends the coroutine, the Tomcat thread is released back to the pool and can immediately handle another incoming request. The IO work happens on a dispatcher thread, and when it completes, the rest of the handler (including the return statement) runs on that same dispatcher thread. Spring then takes the returned value and completes the response.
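The hand-off described above can be reproduced without Spring at all. Here is a minimal sketch using only kotlinx.coroutines. The names traceThreads and fake-request-thread are hypothetical stand-ins for a handler and a Tomcat thread, and starting the coroutine with Dispatchers.Unconfined is my assumption about how to mimic a handler that begins on the request thread.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext
import java.util.concurrent.CompletableFuture
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Returns the thread names seen on entry, inside withContext, and after withContext.
fun traceThreads(): List<String> {
    val seen = CompletableFuture<List<String>>()
    // Hypothetical stand-in for a Tomcat request thread.
    val requestThread = Executors.newSingleThreadExecutor { r -> Thread(r, "fake-request-thread") }
    try {
        requestThread.execute {
            // Unconfined: the coroutine starts on the calling thread.
            CoroutineScope(Dispatchers.Unconfined).launch {
                val entry = Thread.currentThread().name
                val inside = withContext(Dispatchers.IO) {
                    Thread.sleep(50) // simulated blocking I/O
                    Thread.currentThread().name
                }
                // Nothing forces a hop back, so we continue where the I/O finished.
                val after = Thread.currentThread().name
                seen.complete(listOf(entry, inside, after))
            }
            // The request thread gets here as soon as the coroutine suspends.
        }
        return seen.get(5, TimeUnit.SECONDS)
    } finally {
        requestThread.shutdown()
    }
}
```

Calling traceThreads() should give the request thread name first, followed by the same DefaultDispatcher-worker name twice, which matches the log output from the controller.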

Why This is Better Than Traditional Threading

In the traditional blocking model:

Request 1 → Tomcat Thread 1 (blocked waiting for DB) → Response
Request 2 → Tomcat Thread 2 (blocked waiting for DB) → Response
Request 3 → Tomcat Thread 3 (blocked waiting for DB) → Response
...200 concurrent requests = 200 Tomcat threads needed

With coroutines:

Request 1 → Tomcat Thread (suspend/release) → IO Dispatcher Thread → Response
Request 2 → Tomcat Thread (suspend/release) → IO Dispatcher Thread → Response
Request 3 → Tomcat Thread (suspend/release) → IO Dispatcher Thread → Response
...200 concurrent requests = 20 Tomcat threads + 50 dispatcher threads

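The Tomcat threads are only used for the initial request handling and then released. They’re not blocked waiting for database calls, file I/O, or external API requests. Keep in mind that a blocking call wrapped in withContext(Dispatchers.IO) still occupies a dispatcher thread for its full duration, and Dispatchers.IO is capped by default at 64 threads or the number of cores, whichever is larger. The largest gains therefore come from truly non-blocking clients and drivers.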
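To get a feel for how many threads blocking work really occupies, here is a small sketch that counts how many blocking tasks run at the same time on Dispatchers.IO. The function name peakBlockedThreads and the task counts are made up for illustration, and Thread.sleep stands in for a blocking database call.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Launches `tasks` blocking jobs on Dispatchers.IO and reports the highest
// number that were blocked at the same moment.
fun peakBlockedThreads(tasks: Int = 200, blockMillis: Long = 50): Int = runBlocking {
    val running = AtomicInteger(0)
    val peak = AtomicInteger(0)
    coroutineScope {
        repeat(tasks) {
            launch(Dispatchers.IO) {
                val now = running.incrementAndGet()
                peak.updateAndGet { current -> maxOf(current, now) }
                Thread.sleep(blockMillis) // simulated blocking call: holds the thread
                running.decrementAndGet()
            }
        }
    }
    peak.get()
}
```

With default settings the peak should not exceed the Dispatchers.IO limit, no matter how many tasks are launched. The remaining tasks simply queue until a thread frees up.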

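This is why you can often reduce your Tomcat thread pool when your handlers are suspend functions that never block the Tomcat thread (run a load test before changing production settings):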

server:
  tomcat:
    threads:
      max: 20  # Much lower than traditional 200
      min-spare: 5

Spring MVC’s Coroutine Integration

Spring MVC has built-in support for suspend functions in controllers since Spring Framework 5.3 (coroutine support first appeared in 5.2 for WebFlux). When Spring invokes a handler method and detects that it is a suspend function, it treats the call as an asynchronous request.

Behind the scenes, Spring does something like this:

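// Simplified sketch of the idea, NOT Spring's actual implementation
// (imports: jakarta.servlet.http.HttpServletRequest, kotlinx.coroutines.*)
fun handleSuspendFunction(request: HttpServletRequest, suspendFunction: suspend () -> String) {
    // Put the request into Servlet async mode so it can outlive this thread
    val asyncContext = request.startAsync()

    // Unconfined: the coroutine starts right here on the Tomcat thread
    // (http-nio-8080-exec-1) and runs until its first suspension point
    CoroutineScope(Dispatchers.Unconfined).launch {
        val result = suspendFunction()

        // Resumes on whichever thread finished the work, e.g. a dispatcher thread.
        // This sketch writes directly; Spring MVC instead dispatches the result
        // back to the container, which writes the response.
        asyncContext.response.writer.write(result)
        asyncContext.complete()
    }

    // Reached as soon as the coroutine suspends: the Tomcat thread returns to the pool
}

The critical point is that the Tomcat thread doesn’t wait for suspendFunction to complete. It returns to the pool as soon as the coroutine reaches its first real suspension point.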

Common Concerns Addressed

When I first discovered this behavior, I had several concerns.

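“Is it thread-safe?”

Yes, as long as you don’t touch the request or response from several threads at once. The request and response objects are not thread-safe by themselves, but a coroutine runs sequentially: even when it hops between threads, only one thread executes it at a time. The Servlet 3.0+ async API was designed for exactly this hand-off pattern, and each request has its own isolated context.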

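“Will I lose request-scoped beans?”

You can, if you’re not careful. RequestContextHolder, SecurityContextHolder, and LocaleContextHolder are backed by thread-locals, and by default plain thread-locals are not carried over to a dispatcher thread. The safe pattern is to read what you need while you’re still on the request thread, before the first dispatcher switch, and pass the values along explicitly:

@GetMapping("/api/data")
suspend fun getData(): Response {
    // Read thread-bound state first, while still on the Tomcat request thread
    val request = (RequestContextHolder.getRequestAttributes() as ServletRequestAttributes).request
    val user = SecurityContextHolder.getContext().authentication.principal as User
    val locale = LocaleContextHolder.getLocale()

    // From here on, pass user and locale explicitly instead of re-reading the holders
    return Response(user, locale)
}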
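If you do need thread-local state on another dispatcher, kotlinx.coroutines offers asContextElement for this. The sketch below uses a plain ThreadLocal named currentUser as a hypothetical stand-in for holders like SecurityContextHolder. It is not how Spring wires its own holders, just an illustration of the mechanism.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.asContextElement
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import kotlin.coroutines.CoroutineContext

// Hypothetical thread-local, standing in for a framework "context holder".
val currentUser = ThreadLocal<String?>()

// Sets the thread-local on the calling thread, then reads it on an IO thread.
fun readUserOnIoThread(propagate: Boolean): String? = runBlocking {
    currentUser.set("alice")
    try {
        val context: CoroutineContext =
            if (propagate) Dispatchers.IO + currentUser.asContextElement()
            else Dispatchers.IO
        withContext(context) { currentUser.get() }
    } finally {
        currentUser.remove()
    }
}
```

Without the context element the IO thread sees no value. With it, the value captured on the calling thread is installed on the IO thread for the duration of the block and restored afterwards.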

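“What about database transactions?”

It depends on the transaction manager. With a reactive transaction manager (for example, R2DBC), @Transactional on a suspend function is supported and the transaction travels with the coroutine rather than the thread. With the classic thread-bound managers used by JDBC and JPA, the transaction is tied to the thread, so keep the whole transactional unit of work on one thread (for example, inside a single withContext block that calls a regular, non-suspend @Transactional method). With a reactive setup it looks like this: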

@Service
class OrderService {
    @Transactional
    suspend fun createOrder(request: CreateOrderRequest): Order {
        // With a reactive transaction manager, the transaction follows the coroutine across threads
        val order = Order(request)
        return orderRepository.save(order)
    }
}

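Note that a transaction holds on to its database connection until it commits or rolls back, including while the coroutine is suspended. Suspension frees the thread, not the connection, so long suspensions inside a transaction can still exhaust the connection pool.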

“Should I avoid withContext?”

No, withContext is the correct way to switch dispatchers. Use it to move work to the appropriate thread pool:

// Use Dispatchers.IO for blocking I/O (database, files, network)
suspend fun fetchDataFromDb(): Data {
    return withContext(Dispatchers.IO) {
        repository.blockingQuery()
    }
}

// Use Dispatchers.Default for CPU-intensive work
suspend fun processHeavyCalculation(data: Data): Result {
    return withContext(Dispatchers.Default) {
        expensiveAlgorithm(data)
    }
}

Best Practices

Based on what I’ve learned, here are some practices I follow:

DO: Use suspend functions at controller boundaries

// GOOD: Suspend function at controller level
@GetMapping("/api/users/{id}")
suspend fun getUser(@PathVariable id: Long): UserResponse {
    return userService.findById(id)
}

DON’T: Wrap suspend functions in runBlocking

// BAD: Blocks Tomcat thread
@GetMapping("/api/users/{id}")
fun getUser(@PathVariable id: Long): UserResponse { // Not suspend!
    return runBlocking { userService.findById(id) }
}

This defeats the purpose of coroutines. The Tomcat thread is blocked while waiting for the database call.

DO: Use delay, not Thread.sleep

// GOOD: Suspends without blocking thread
suspend fun waitForExternalService(): Data {
    delay(1000) // Coroutine suspends, thread is free
    return externalService.getData()
}

// BAD: Blocks dispatcher thread
suspend fun waitForExternalService(): Data {
    Thread.sleep(1000) // Thread is blocked!
    return externalService.getData()
}
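The difference is easy to measure. This sketch runs a batch of coroutines on a single-threaded dispatcher and times how long the batch takes when each coroutine either blocks or suspends. The function name elapsedMillis and the numbers are illustrative only.

```kotlin
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.Executors
import kotlin.system.measureTimeMillis

// Runs `coroutines` jobs on ONE thread; each waits `waitMillis` either by
// blocking (Thread.sleep) or by suspending (delay). Returns total wall time.
fun elapsedMillis(coroutines: Int, waitMillis: Long, blocking: Boolean): Long {
    val single = Executors.newSingleThreadExecutor().asCoroutineDispatcher()
    return single.use { dispatcher ->
        measureTimeMillis {
            runBlocking {
                repeat(coroutines) {
                    launch(dispatcher) {
                        if (blocking) Thread.sleep(waitMillis) else delay(waitMillis)
                    }
                }
            }
        }
    }
}
```

With 20 coroutines waiting 50 ms each, the blocking variant needs roughly 20 × 50 ms because the waits run one after another on the single thread, while the suspending variant finishes in about the time of a single wait.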

DO: Configure custom dispatchers when needed

@Configuration
class CoroutineConfig {
    @Bean
    fun businessLogicDispatcher(): CoroutineDispatcher {
        return Executors.newFixedThreadPool(16)
            .asCoroutineDispatcher()
    }
}

@Service
class BusinessService(private val dispatcher: CoroutineDispatcher) {
    suspend fun execute(): Result {
        return withContext(dispatcher) {
            businessLogic()
        }
    }
}

DO: Handle exceptions properly

@GetMapping("/api/risky")
suspend fun riskyOperation(): Response {
    return try {
        val result = withContext(Dispatchers.IO) {
            externalServiceCall()
        }
        Response(success = true, data = result)
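    } catch (e: CancellationException) {
        throw e // Never swallow cancellation; let it propagate
    } catch (e: Exception) {
        Response(success = false, error = e.message)
    }
}

Exception handling works correctly across dispatcher switches: an exception thrown inside withContext is rethrown in the calling coroutine. Just make sure a broad catch doesn’t swallow CancellationException, or cancellation and timeouts stop working as expected.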
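Here is a small Spring-free sketch of the same pattern that can be run directly. The helper names callOrFallback and demoOutcomes are hypothetical. The sketch covers three cases: a normal result, a failure inside withContext, and a timeout that cancels the call.

```kotlin
import kotlinx.coroutines.CancellationException
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext
import kotlinx.coroutines.withTimeoutOrNull

// Runs `block` on Dispatchers.IO and converts ordinary failures into a
// fallback string, while letting cancellation propagate untouched.
suspend fun callOrFallback(block: suspend () -> String): String =
    try {
        withContext(Dispatchers.IO) { block() }
    } catch (e: CancellationException) {
        throw e // cancellation must not be turned into a fallback value
    } catch (e: Exception) {
        "fallback: ${e.message}"
    }

fun demoOutcomes(): List<String?> = runBlocking {
    listOf(
        callOrFallback { "ok" },                 // normal result
        callOrFallback { error("boom") },        // failure thrown on the IO thread
        withTimeoutOrNull(50) {                  // timeout cancels the call
            callOrFallback { delay(1_000); "late" }
        },
    )
}
```

The second entry shows that an exception thrown on the IO thread reaches the catch block in the caller. The third comes back as null because the timeout’s cancellation is rethrown instead of being converted into a fallback.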

Performance Impact

I’ve seen significant performance improvements by using coroutines properly:

Traditional blocking approach:

~200 requests/second with 200 Tomcat threads
~200 MB memory (200 threads × 1MB stack)

Coroutine approach:

~2,000 requests/second with 20 Tomcat threads + 50 dispatcher threads
~100 MB memory (70 threads × 1MB stack + coroutine overhead)

That’s roughly 10x throughput with 50% less memory.

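The key is that threads are expensive: each one reserves a stack (1MB by default on most 64-bit JVMs, although that memory is reserved rather than fully committed up front). Coroutines are lightweight by comparison, since a suspended coroutine is just a small heap object. You can have thousands of coroutines with minimal overhead.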
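A quick way to convince yourself of this is to launch far more coroutines than you could ever create threads. The sketch below is the classic demonstration; launchMany is a made-up name and the count is arbitrary.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking
import java.util.concurrent.atomic.AtomicInteger

// Launches `count` coroutines that each suspend briefly, and returns how many finished.
fun launchMany(count: Int = 100_000): Int = runBlocking {
    val finished = AtomicInteger(0)
    coroutineScope {
        repeat(count) {
            launch(Dispatchers.Default) {
                delay(100) // suspended coroutines hold no thread
                finished.incrementAndGet()
            }
        }
    }
    finished.get()
}
```

All of these coroutines share the small Dispatchers.Default thread pool. Trying the same thing with one thread per task would be far heavier.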

Debugging Thread Behavior

If you want to see what’s happening with threads, add logging:

suspend fun debugThreadFlow() {
    log.info("Start: ${Thread.currentThread().name}")

    withContext(Dispatchers.IO) {
        log.info("In IO: ${Thread.currentThread().name}")
    }

    log.info("End: ${Thread.currentThread().name}")
}

You’ll see output like:

Start: http-nio-8080-exec-1
In IO: DefaultDispatcher-worker-1
End: DefaultDispatcher-worker-1

For deeper debugging, you can turn on the kotlinx.coroutines debug mode with a JVM system property:

-Dkotlinx.coroutines.debug=on

This appends the coroutine’s name and id to the name of the thread it is running on, so the coroutine shows up in logs and thread dumps.
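Debug output is easier to read when coroutines have meaningful names. A minimal sketch, assuming you want to tag a unit of work with CoroutineName (the name load-user and both function names are hypothetical):

```kotlin
import kotlinx.coroutines.CoroutineName
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.currentCoroutineContext
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Reads the name of the coroutine that is currently running, if it has one.
suspend fun currentCoroutineName(): String? =
    currentCoroutineContext()[CoroutineName]?.name

fun namedLookup(): String? = runBlocking {
    // CoroutineName is just another context element, combined with the dispatcher.
    withContext(Dispatchers.IO + CoroutineName("load-user")) {
        currentCoroutineName()
    }
}
```

With debug mode enabled, the thread name inside that block should look something like DefaultDispatcher-worker-1 @load-user#2, which makes it much easier to match log lines to the work being done.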

Summary

In this post, I explained why a suspend handler in Spring Boot + Kotlin coroutines finishes on a dispatcher thread rather than on the Tomcat thread it started on, and why this is the correct behavior.

The key points:

Tomcat threads are released during coroutine suspension - this provides scalability
Spring MVC completes the response for you, regardless of which thread your handler finished on
Use withContext to switch dispatchers for different types of work
Configure appropriate thread pool sizes for your workload
Don’t worry about which thread your handler returns on - Spring MVC handles it

The entire benefit of coroutines comes from releasing Tomcat threads during I/O operations. If a handler had to hold on to its Tomcat thread until the work was done, that benefit would be lost.

Final Words + More Resources

My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Spring Framework - Coroutine Support
👨‍💻 Kotlin Coroutines Guide - Dispatchers
👨‍💻 Tomcat Configuration - Executor
👨‍💻 Spring Blog: Kotlin Coroutines with Spring
👨‍💻 JetBrains Blog: Coroutines in Practice

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!