Fading Coder

Diagnosing Spring Microservice Freezes: Connection Pool Exhaustion Analysis

Intermittent unresponsiveness in Spring Cloud services, particularly when downstream Feign clients begin reporting timeout failures, often triggers an immediate investigation of the HTTP client configuration. Extending the connectTimeout and readTimeout values in the Feign and Ribbon settings frequently proves insufficient when the root cause is database connection starvation. A typical timeout extension looks like this:

integration:
  http-client:
    default-timeout: 30s
    connect-duration: 30s
    read-duration: 30s
  load-balancer:
    socket-timeout: 30s
    establish-timeout: 30s

When services become completely unresponsive without emitting error logs, and administrative interfaces such as Swagger UI fail to load, the issue typically indicates thread exhaustion rather than network latency. Capturing the JVM thread state reveals the actual blocking condition:

jstack -l <process-id> > stack-trace.log
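Where attaching jstack is not practical (for example, a container image without JDK tools), a rough in-process summary can answer the same first question: how many threads are stuck, and where. The following is a minimal sketch using only the standard library; the class and method names are illustrative, and it would have to run on a thread that is not itself served by the exhausted worker pool, such as a scheduled watchdog.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.Map;

class ThreadStateSummary {

    /** Counts live threads by Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    /** Counts live threads whose stack contains at least one frame from the given class. */
    static long countThreadsInClass(String className) {
        return Thread.getAllStackTraces().values().stream()
                .filter(stack -> Arrays.stream(stack)
                        .anyMatch(frame -> frame.getClassName().equals(className)))
                .count();
    }

    public static void main(String[] args) {
        // A large WAITING count combined with many threads inside the pool class
        // points at connection starvation rather than network latency.
        System.out.println(countByState());
        System.out.println(countThreadsInClass("com.alibaba.druid.pool.DruidDataSource"));
    }
}
```

This only summarizes; a full jstack dump is still the better artifact for detailed analysis because it includes lock and synchronizer information.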

Analysis of the dump exposes multiple HTTP worker threads suspended indefinitely in a WAITING state, parked at the connection acquisition layer:

"undertow-worker-15" #487 daemon prio=5 os_prio=0 tid=0x00007f1b8c12a800 nid=0x7a3f waiting on condition [0x00007f1b6d4fc000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@11.0.9/Native Method)
        - parking to wait for  <0x00000000e8c44710> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(java.base@11.0.9/LockSupport.java:194)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11.0.9/AbstractQueuedSynchronizer.java:2081)
        at com.alibaba.druid.pool.DruidDataSource.takeLast(DruidDataSource.java:2175)
        at com.alibaba.druid.pool.DruidDataSource.getConnectionInternal(DruidDataSource.java:1672)
        at com.alibaba.druid.pool.DruidDataSource.getConnectionDirect(DruidDataSource.java:1409)
        at com.alibaba.druid.pool.DruidDataSource.getConnection(DruidDataSource.java:1389)
        ...
        at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:194)

This stack pattern indicates that every connection in the Druid pool is checked out and that threads are queuing for connections that never come back. The takeLast frame is significant: Druid takes that path when no positive max-wait is configured, so a waiting thread parks with no timeout at all. Druid 1.1.22 contains a synchronization defect in which, under specific race conditions, connections fail to recycle properly; from the pool's perspective they leak even though they have been physically closed.
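To gauge how widespread the pattern is, it helps to count the threads whose stacks contain the connection-acquisition frame. The sketch below scans a saved dump for a frame substring; it assumes the usual jstack text layout (a header line beginning with the quoted thread name, followed by that thread's frames), and the class name DumpScanner is purely illustrative.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class DumpScanner {

    /**
     * Returns the names of threads whose stack section contains frameText.
     * A line starting with a double quote is treated as the start of a new thread.
     */
    static List<String> threadsContaining(String dump, String frameText) {
        List<String> hits = new ArrayList<>();
        String current = null;
        boolean matched = false;
        for (String line : dump.split("\\R")) {
            if (line.startsWith("\"")) {
                // Thread header: extract the name between the first pair of quotes.
                int end = line.indexOf('"', 1);
                current = end > 0 ? line.substring(1, end) : line;
                matched = false;
            } else if (current != null && !matched && line.contains(frameText)) {
                hits.add(current);
                matched = true; // record each thread at most once
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the dump, e.g. stack-trace.log
        String dump = Files.readString(Path.of(args[0]));
        List<String> blocked = threadsContaining(dump, "DruidDataSource.takeLast");
        System.out.println(blocked.size() + " threads waiting for a connection: " + blocked);
    }
}
```

If the number of waiting threads is close to the size of the HTTP worker pool, the service has no threads left to answer anything, which matches the silent freeze described above.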

Remediation requires upgrading to Druid 1.2.5 or newer, coupled with explicit pool lifecycle management:

storage:
  datasource:
    provider: com.alibaba.druid.pool.DruidDataSource
    druid:
      driver-class-name: org.postgresql.Driver
      url: jdbc:postgresql://db-cluster:5432/transaction_db
      credentials:
        username: ${DB_USERNAME}
        password: ${DB_PASSWORD}
      
      pool-metrics:
        initial-size: 5
        min-idle: 10
        max-active: 25
        max-wait: 60000            # ms; a finite value bounds the wait for a free connection

      health-check:
        validation-query: SELECT 1
        test-while-idle: true
        test-on-borrow: false
        validation-interval: 30000

      scavenging:
        eviction-interval: 60000   # ms between eviction runs
        min-evictable-idle: 300000 # ms
        max-evictable-idle: 600000 # ms
        remove-abandoned: true
        abandoned-timeout: 180     # seconds, unlike the millisecond values above

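The upgrade resolves the race condition in connection recycling. The remaining settings limit the damage if connections leak again: a finite max-wait makes a starved request fail with an error after 60 seconds instead of parking forever, abandoned-connection detection reclaims any connection held longer than 180 seconds, and the eviction settings retire idle connections on a predictable schedule.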
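The two protective ideas, a bounded wait and an abandoned-lease sweep, can be illustrated with a toy pool built on a semaphore. This is a simplified sketch and not Druid's implementation: ToyPool and its methods are hypothetical names, leases are plain integers rather than connections, and the sweep is invoked by hand rather than by a background thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ToyPool {
    private final Semaphore permits;
    private final Map<Integer, Long> borrowedAtNanos = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    ToyPool(int maxActive) {
        this.permits = new Semaphore(maxActive);
    }

    /** Bounded wait: returns a positive lease id, or -1 if nothing frees up in time. */
    int borrow(long maxWaitMillis) {
        try {
            if (!permits.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                return -1;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        int id = nextId.incrementAndGet();
        borrowedAtNanos.put(id, System.nanoTime());
        return id;
    }

    /** Normal return path; a lease that was already reclaimed is ignored. */
    void giveBack(int id) {
        if (borrowedAtNanos.remove(id) != null) {
            permits.release();
        }
    }

    /** Sweep: frees the capacity of every lease held for at least timeoutMillis. */
    int reclaimAbandoned(long timeoutMillis) {
        long now = System.nanoTime();
        long limit = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        int reclaimed = 0;
        for (Map.Entry<Integer, Long> e : borrowedAtNanos.entrySet()) {
            boolean expired = now - e.getValue() >= limit;
            // remove(key, value) succeeds only if giveBack has not raced us.
            if (expired && borrowedAtNanos.remove(e.getKey(), e.getValue())) {
                permits.release();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) throws InterruptedException {
        ToyPool pool = new ToyPool(2);
        pool.borrow(100);                               // leaked: never given back
        pool.borrow(100);                               // leaked: never given back
        System.out.println(pool.borrow(100));           // prints -1 after about 100 ms
        Thread.sleep(50);
        System.out.println(pool.reclaimAbandoned(50));  // prints 2
        System.out.println(pool.borrow(100) > 0);       // prints true
    }
}
```

Without the timeout in borrow, the third caller would park indefinitely, which is the state captured in the thread dump. The sweep is a safety net rather than a fix: it frees capacity, but code that was still using a reclaimed lease will fail, so the abandoned timeout should comfortably exceed the longest legitimate transaction.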
