Fading Coder

Troubleshooting Nacos Config Client Timeouts on Linux (java.net.ConnectException: no available server)

Context

  • Architecture: Spring Cloud microservices using Spring Cloud Alibaba
  • Service discovery and configuration: Nacos
  • Nacos server OS: CentOS 7 (Linux)
  • Clients connect to Nacos at startup for configuration and service registration

On Windows-based deployments the system runs normally, but when the Nacos server is hosted on Linux, clients intermittently fail to fetch configuration with connection/timeout errors.

Symptoms

Example client-side logs during startup or long polling:

ERROR c.a.n.c.config.http.ServerHttpAgent : [NACOS SocketTimeoutException httpPost] currentServerAddr: http://10.0.0.12:8848, err: Read timed out
ERROR c.a.n.c.config.http.ServerHttpAgent : no available server, currentServerAddr : http://10.0.0.12:8848
ERROR c.a.n.client.config.impl.ClientWorker : [fixed-10.0.0.12_8848] [check-update] get changed dataId exception

java.net.ConnectException: no available server, currentServerAddr : http://10.0.0.12:8848
    at com.alibaba.nacos.client.config.http.ServerHttpAgent.httpPost(ServerHttpAgent.java:170)
    at com.alibaba.nacos.client.config.http.MetricsHttpAgent.httpPost(MetricsHttpAgent.java:64)
    at com.alibaba.nacos.client.config.impl.ClientWorker.checkUpdateConfigStr(ClientWorker.java:377)
    at com.alibaba.nacos.client.config.impl.ClientWorker.checkUpdateDataIds(ClientWorker.java:352)
    at com.alibaba.nacos.client.config.impl.ClientWorker$LongPollingRunnable.run(ClientWorker.java:512)
    ...

The error typically appears during the Nacos configuration long-polling request and repeats at intervals.

Why it happens

In Nacos client 1.1.x, the HTTP timeouts for config long polling are relatively aggressive. On some Linux deployments with higher latency, occasional network jitter pushes a read past those thresholds. The client then reports "no available server" even though the server is reachable. This behavior was tracked as a known issue and addressed in later client versions (1.2.0+). The same setup on Windows may not reproduce the problem because its network characteristics differ.

Reference: https://github.com/alibaba/nacos/issues/2206
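
The mechanism described above can be reproduced without Nacos. The following is a minimal sketch, not the Nacos client's implementation: a local stand-in server (the JDK's built-in HttpServer) holds each request the way a long-polling endpoint would, and a plain HttpURLConnection client reads it with two different read timeouts. The class name LongPollTimeoutDemo, the /poll path, and the 500 ms, 100 ms, and 5000 ms values are all assumptions chosen for illustration; they are not the real Nacos defaults.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class LongPollTimeoutDemo {

    /** Starts a local stand-in server that holds every request for holdMillis before replying. */
    static HttpServer startSlowServer(long holdMillis) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/poll", exchange -> {
            try {
                Thread.sleep(holdMillis); // simulate a held long-polling request
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            try {
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            } catch (IOException ignored) {
                // the client may already have given up and closed the connection
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }

    /** Returns the response body, or "timeout" when the client-side read timeout fires first. */
    static String poll(int port, int readTimeoutMillis) throws IOException {
        URL url = new URL("http://127.0.0.1:" + port + "/poll");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(1000);
        conn.setReadTimeout(readTimeoutMillis);
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (SocketTimeoutException e) {
            return "timeout"; // the server is up, but the client stopped waiting
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startSlowServer(500);
        int port = server.getAddress().getPort();
        try {
            System.out.println("read timeout 100 ms:  " + poll(port, 100));
            System.out.println("read timeout 5000 ms: " + poll(port, 5000));
        } finally {
            server.stop(0);
        }
    }
}
```

When the read timeout is shorter than the time the server holds the request, the client gives up while the server is still healthy and about to answer. That is the same class of false alarm as the "Read timed out" line in the logs above.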

How to diagnose

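  • Verify basic connectivity from a client host to the Nacos server (replace <nacos-host> with the server address, 10.0.0.12 in the logs above):
    • curl -m 5 http://<nacos-host>:8848/nacos
    • telnet <nacos-host> 8848 (or nc -vz <nacos-host> 8848)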
  • Check Linux firewalls/security groups and ensure port 8848 is open.
  • Inspect Nacos server logs (nacos/logs) for request errors or thread pool saturation.
  • Confirm that client and server versions are compatible with your Spring Cloud Alibaba stack.
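
The curl and telnet checks above give a single yes/no answer. Because the failures are intermittent, it can help to repeat the probe and look at connect latency. Below is a minimal sketch using only java.net.Socket. The class name NacosPortProbe, the five attempts, and the 2000 ms timeout are arbitrary assumptions, and 10.0.0.12:8848 is simply the address from the logs above. It measures TCP connect time only, not the HTTP long-polling response time.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class NacosPortProbe {

    /**
     * Opens a TCP connection and returns the connect time in milliseconds,
     * or -1 if the connection is refused, unreachable, or times out.
     */
    static long connectMillis(String host, int port, int timeoutMillis) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Defaults are the example address from the logs; pass your own host and port.
        String host = args.length > 0 ? args[0] : "10.0.0.12";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8848;
        for (int attempt = 1; attempt <= 5; attempt++) {
            long millis = connectMillis(host, port, 2000);
            System.out.println("attempt " + attempt + ": "
                    + (millis < 0 ? "failed" : millis + " ms"));
        }
    }
}
```

Consistently fast connects combined with "Read timed out" errors in the client log point away from firewalls and toward the long-polling timeout behavior described earlier.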

Fix: upgrade the Nacos client

Upgrade the Nacos client on the application side to 1.2.0 or newer. The newer releases change the timeout handling and make long polling more stable.

Maven dependency override (simple)

If you already use the spring-cloud-starter-alibaba-nacos-* starters, declare a newer nacos-client explicitly. A directly declared dependency takes precedence over the version the starters pull in transitively:

<!-- Nacos Service Discovery -->
<dependency>
  <groupId>com.alibaba.cloud</groupId>
  <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
</dependency>

<!-- Nacos Config -->
<dependency>
  <groupId>com.alibaba.cloud</groupId>
  <artifactId>spring-cloud-starter-alibaba-nacos-config</artifactId>
</dependency>

<!-- Force nacos-client >= 1.2.0 to fix timeout issues -->
<dependency>
  <groupId>com.alibaba.nacos</groupId>
  <artifactId>nacos-client</artifactId>
  <version>1.2.0</version>
</dependency>

Maven with dependencyManagement (recommended for larger projects)

Manage the exact Nacos client version in one place:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.alibaba.nacos</groupId>
      <artifactId>nacos-client</artifactId>
      <version>1.2.0</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
  </dependency>
  <dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-config</artifactId>
  </dependency>
</dependencies>

Ensure that the Nacos server version is compatible with the upgraded client. If possible, keep both client and server close to the same minor version.
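
After changing the POM, it is worth confirming that the override took effect. At build time, mvn dependency:tree -Dincludes=com.alibaba.nacos:nacos-client shows the resolved version. The sketch below is a hypothetical runtime check for the same thing, assuming the jar keeps its default nacos-client-<version>.jar file name. NacosClientVersionCheck and its helper methods are illustrative names, not part of Nacos; the class name ServerHttpAgent is taken from the stack trace above.

```java
import java.security.CodeSource;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NacosClientVersionCheck {

    // Assumes the default Maven jar name, e.g. nacos-client-1.2.0.jar
    private static final Pattern JAR_VERSION =
            Pattern.compile("nacos-client-(\\d+(?:\\.\\d+)*)");

    /** Returns where the given class was loaded from, or null if it cannot be determined. */
    static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className, false,
                    NacosClientVersionCheck.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            return source == null || source.getLocation() == null
                    ? null : source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    /** Extracts "1.2.0" from a location such as file:/repo/nacos-client-1.2.0.jar, else null. */
    static String versionFromLocation(String location) {
        Matcher matcher = JAR_VERSION.matcher(location);
        return matcher.find() ? matcher.group(1) : null;
    }

    /** Compares dotted versions numerically; non-numeric parts such as "RELEASE" count as 0. */
    static int compareVersions(String a, String b) {
        String[] left = a.split("\\.");
        String[] right = b.split("\\.");
        for (int i = 0; i < Math.max(left.length, right.length); i++) {
            int l = i < left.length ? leadingNumber(left[i]) : 0;
            int r = i < right.length ? leadingNumber(right[i]) : 0;
            if (l != r) {
                return Integer.compare(l, r);
            }
        }
        return 0;
    }

    static boolean isAtLeast(String actual, String minimum) {
        return compareVersions(actual, minimum) >= 0;
    }

    private static int leadingNumber(String segment) {
        int end = 0;
        while (end < segment.length() && Character.isDigit(segment.charAt(end))) {
            end++;
        }
        return end == 0 ? 0 : Integer.parseInt(segment.substring(0, end));
    }

    public static void main(String[] args) {
        String location = locationOf("com.alibaba.nacos.client.config.http.ServerHttpAgent");
        String version = location == null ? null : versionFromLocation(location);
        if (version == null) {
            System.out.println("nacos-client version could not be determined (location: "
                    + location + ")");
        } else {
            System.out.println("nacos-client " + version
                    + (isAtLeast(version, "1.2.0") ? " is 1.2.0 or newer" : " is older than 1.2.0"));
        }
    }
}
```

Run it on the application's classpath (for example from a startup hook or a test) so that it sees the same jars as the service.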

Environment in the reported case

  • Spring Boot: 2.1.6.RELEASE
  • Spring Cloud: Greenwich
  • Spring Cloud Alibaba: 2.1.1.RELEASE
  • Nacos Server: 1.1.4
  • Java: 1.8

With the above stack on Linux, upgrading only the nacos-client to 1.2.0 or later on the application side mitigates the long-polling timeout and the resulting "no available server" exceptions.
