Spring Boot 3 + Java 21: Virtual Threads and What Actually Changes in Production
Java 21 shipped virtual threads as a stable feature. Spring Boot 3.2 enables them with a single configuration line. The marketing says they "solve concurrency." The reality is more nuanced — and more interesting.
This post is about what actually changes in a Spring API when you enable virtual threads, what doesn't change, and what you can break if you don't understand the limits.
The problem they solve
In traditional Spring MVC, each HTTP request occupies a thread pool thread for its entire duration. If that request makes a database call that takes 50ms, the thread is blocked for those 50ms — it can't serve another request.
With a pool of 200 threads (Tomcat's default), your API can handle 200 concurrent requests before it starts queuing. If each request waits 100ms on I/O, your real throughput is 200 / 0.1s = 2000 req/sec maximum, even if CPU processing is minimal.
The solution until now was Reactor and WebFlux: reactive programming that releases the thread while waiting for I/O. It works, but has a high cognitive cost — async code, Mono/Flux, complicated debugging, useless stack traces.
Virtual threads solve the same problem differently: instead of releasing the OS thread, the JVM creates millions of lightweight virtual threads. When a virtual thread does a blocking I/O operation, the JVM "unmounts" it from the OS thread and mounts another. The OS thread never blocks. Your code stays synchronous and sequential — the JVM handles the complexity.
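To see the model outside Spring, here is a minimal standalone sketch using only the JDK: 10,000 virtual threads, each blocking for 50ms, scheduled onto a handful of carrier threads.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadsDemo {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        // One virtual thread per submitted task; close() waits for all of them
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    try {
                        // Blocking call: the virtual thread unmounts, the OS thread moves on
                        Thread.sleep(Duration.ofMillis(50));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    completed.incrementAndGet();
                });
            }
        }
        System.out.println("completed = " + completed.get());
    }
}
```

On a fixed pool of 200 platform threads the same loop would take around 2.5 seconds (10,000 × 50ms / 200); with virtual threads all the sleeps overlap.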
What changes with Spring Boot 3.2 + Java 21
Enabling it
A single property in application.properties:
spring.threads.virtual.enabled=true
With this, Tomcat uses virtual threads to serve requests. Each request runs in its own virtual thread. The 200-OS-thread pool is no longer the bottleneck.
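An easy way to confirm requests really run on virtual threads is to log Thread.currentThread().isVirtual() from a controller. Here is the same check as a standalone sketch, where the handler thread stands in for what Tomcat now creates per request:

```java
public class IsVirtualCheck {
    public static void main(String[] args) throws Exception {
        // The main thread is an ordinary platform thread
        System.out.println("main is virtual: " + Thread.currentThread().isVirtual());

        // What Tomcat does per request once the property is on, in miniature
        Thread handler = Thread.ofVirtual().name("handler").start(() ->
                System.out.println("handler is virtual: " + Thread.currentThread().isVirtual()));
        handler.join();
    }
}
```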
I/O-bound concurrency without Reactor
The most important change for typical APIs: you no longer need WebFlux to handle high I/O-bound concurrency. A classic Spring MVC API with JDBC, RestTemplate, or any blocking client can handle thousands of concurrent requests without exhausting OS threads.
Before:
// For high concurrency, you needed WebFlux
public Mono<UserDto> getUser(Long id) {
    return userRepository.findById(id)
            .map(this::toDto);
}
Now with virtual threads:
// Classic Spring MVC, same concurrency
public UserDto getUser(Long id) {
return userRepository.findById(id)
.map(this::toDto)
.orElseThrow();
}
Same throughput, synchronous code. For teams struggling with Reactor's learning curve, this is significant.
Thread-per-request is reasonable again
With virtual threads, creating a thread per request isn't expensive. The JVM can handle millions of virtual threads with minimal memory and overhead. The mental model of "one thread = one request" works at scale again.
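It also makes cheap fan-out trivial: fire two blocking calls on their own virtual threads and join the results synchronously. A sketch with hypothetical fetchProfile and fetchOrders stand-ins, where the 100ms sleeps simulate remote calls:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FanOutDemo {
    // Hypothetical stand-ins for two blocking remote calls
    static String fetchProfile(long id) throws InterruptedException {
        Thread.sleep(100);
        return "profile-" + id;
    }

    static String fetchOrders(long id) throws InterruptedException {
        Thread.sleep(100);
        return "orders-" + id;
    }

    public static void main(String[] args) throws Exception {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            // Each call gets its own virtual thread; the two 100ms waits overlap
            Future<String> profile = executor.submit(() -> fetchProfile(42));
            Future<String> orders = executor.submit(() -> fetchOrders(42));
            System.out.println(profile.get() + " " + orders.get());
        }
    }
}
```

Total latency is roughly one call rather than two, and the composition reads top to bottom instead of through zip operators.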
What virtual threads do NOT fix
CPU-bound is still CPU-bound
Virtual threads are useful when time is spent waiting for I/O. If your code spends time doing computation — image processing, cryptography, intensive calculations — virtual threads don't help. The OS thread is still busy during computation.
For CPU-bound work, the limit is still the number of cores. No magic here.
Pinning: the silent enemy
The most important virtual thread problem in production is called pinning. It happens when a virtual thread can't be unmounted from the OS thread because it's inside a synchronized block or a native method.
// This causes pinning: the virtual thread holds the monitor,
// so it cannot unmount and the carrier OS thread blocks with it
synchronized (lock) {
    result = jdbcTemplate.query(...);
}
During pinning, the OS thread blocks just like before. If you have many virtual threads pinned simultaneously, your real concurrency degrades to the number of available OS threads.
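The usual fix on Java 21 is to replace synchronized around blocking calls with java.util.concurrent.locks.ReentrantLock, which parks the virtual thread without pinning its carrier. A minimal sketch, with the blocking I/O simulated by a sleep:

```java
import java.util.concurrent.locks.ReentrantLock;

public class PinningFix {
    private final ReentrantLock lock = new ReentrantLock();

    String guardedCall() throws InterruptedException {
        // Unlike synchronized, blocking while holding a ReentrantLock
        // lets the virtual thread unmount from its carrier
        lock.lock();
        try {
            Thread.sleep(10); // stand-in for a blocking I/O call
            return "result";
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        PinningFix fix = new PinningFix();
        Thread vt = Thread.ofVirtual().start(() -> {
            try {
                System.out.println(fix.guardedCall());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        vt.join();
    }
}
```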
Where pinning appears in typical Spring code:
- Old JDBC drivers that use synchronized internally.
- Legacy code with synchronized wrapping I/O operations.
- Some connection pool implementations.
How to detect it: -Djdk.tracePinnedThreads=full in staging. When pinning occurs, the JVM logs the stack trace.
Connection pools are still necessary
A common mistake: thinking you no longer need a connection pool because "there are threads for everyone." The connection pool doesn't limit threads — it limits database connections. If you create 10,000 virtual threads that all try to connect simultaneously, the DB is still the bottleneck. HikariCP with virtual threads works well — just calibrate the pool size correctly.
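The effect is easy to model with a Semaphore standing in for a 20-connection pool (a toy model, not HikariCP's actual implementation): 1,000 virtual threads compete, but never more than 20 are "on a connection" at once.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolLimitDemo {
    public static void main(String[] args) {
        Semaphore pool = new Semaphore(20);       // models a 20-connection pool
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 1_000; i++) {
                executor.submit(() -> {
                    try {
                        pool.acquire();           // waiting here is cheap: the virtual thread just parks
                        int now = inUse.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5);          // stand-in for a query
                        inUse.decrementAndGet();
                        pool.release();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        System.out.println("pool limit respected: " + (peak.get() <= 20));
    }
}
```

Throughput stays bounded by the 20 connections no matter how many virtual threads you spawn; size the pool for the database, not for the thread count.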
Recommended production configuration
spring.threads.virtual.enabled=true
# Sized for the database's capacity, not for request concurrency
spring.datasource.hikari.maximum-pool-size=20
# No longer limits request concurrency once virtual threads are enabled
server.tomcat.threads.max=200
For detecting pinning in staging (short logs a brief report, full logs the complete stack trace):
-Djdk.tracePinnedThreads=short
Does WebFlux still make sense?
Yes, in specific cases:
- Data streaming: WebFlux with backpressure handles streaming large volumes better.
- Consolidated reactive ecosystem: if you already have R2DBC, reactive clients, and the team knows the model well, there's no reason to migrate.
- Ultra-low latency with high concurrency: in extreme scenarios, the reactive model still has per-thread overhead advantages.
For a typical CRUD API with PostgreSQL, Redis, and external services: virtual threads with Spring MVC is the simpler option and sufficiently performant.
Conclusion
Virtual threads in Java 21 + Spring Boot 3.2 make the classic synchronous model viable for high I/O-bound concurrency. For most Spring APIs, WebFlux is no longer mandatory to scale.
What doesn't change: CPU-bound still needs cores, connection pools are still necessary, and pinning with synchronized can become your new bottleneck.
Enabling virtual threads takes a minute. Understanding their limits takes this post.