Resolving Excessive TIME_WAIT Connections on Windows Systems
While a host was being monitored through its Zabbix agent, an alert reported the agent as unavailable even though the agent process was still running. Checking the network connections revealed a large number of TCP connections in the TIME_WAIT state. With socket resources exhausted, the agent could no longer open new connections to the proxy.
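The check described above can be sketched as a small script that tallies TCP states from `netstat -ano` style output. This is a minimal illustration, not the tooling used in the original investigation: the function names and the sample rows (addresses, PIDs) are made up, and the column layout is assumed to match the usual Windows netstat format.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in `netstat -ano` style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Windows layout: Proto, Local Address, Foreign Address, State, PID
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


def live_tcp_states() -> Counter:
    """Run netstat on a Windows host and count the states (not called here)."""
    result = subprocess.run(
        ["netstat", "-ano", "-p", "tcp"], capture_output=True, text=True
    )
    return count_tcp_states(result.stdout)


# Hypothetical sample output, for illustration only.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State           PID
  TCP    10.0.0.5:49201         10.0.0.9:10051         TIME_WAIT       0
  TCP    10.0.0.5:49202         10.0.0.9:10051         TIME_WAIT       0
  TCP    10.0.0.5:3389          10.0.0.7:51522         ESTABLISHED     1044
  UDP    0.0.0.0:123            *:*                                    900
"""

counts = count_tcp_states(SAMPLE)
print(counts["TIME_WAIT"], counts["ESTABLISHED"])  # prints: 2 1
```

A TIME_WAIT count in the thousands on a host with a small ephemeral port range is a strong hint that port exhaustion, rather than the agent itself, is the problem.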
In active mode, the agent initiates the connections to the proxy on port 10051, so closed connections are left in TIME_WAIT on the agent host. Restarting the agent did not release these connections, and the logs showed errors like "cannot connect to [proxy]: [0x00002747] due to insufficient system buffer space or full queue." The code 0x00002747 is Winsock error 10055 (WSAENOBUFS).
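The hexadecimal code in the log message can be converted to the decimal Winsock error number with a few lines. This is only a sketch for reading such log lines; `parse_winsock_code` is a hypothetical helper name, and the constant is defined locally so the snippet also runs on non-Windows systems.

```python
# Winsock error 10055: "No buffer space available" (WSAENOBUFS).
WSAENOBUFS = 10055


def parse_winsock_code(hex_text: str) -> int:
    """Convert a hex error code such as '0x00002747' to its decimal value."""
    return int(hex_text, 16)


code = parse_winsock_code("0x00002747")
print(code, code == WSAENOBUFS)  # prints: 10055 True
```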
TIME_WAIT is a normal TCP state that occurs during connection termination via the four-way handshake. The side that initiates the close enters TIME_WAIT and remains there for 2MSL (twice the Maximum Segment Lifetime) by default. This wait ensures reliable termination: the final ACK can be resent if it is lost, and delayed segments from the old connection are discarded before the same port is reused.
Excessive TIME_WAIT connections can deplete local ports, hindering new TCP connections. Solutions include:
- Method 1: Apply a Microsoft-supported fix for TIME_WAIT connections not closing after extended periods in Windows Vista, 7, Server 2008, and Server 2008 R2, available through official support channels.
- Method 2: Modify registry settings to manage port allocation and TIME_WAIT duration:
  - MaxUserPort: Sets the highest port number that can be used for ephemeral ports. The default is 5000 (0x1388), and the valid range is 5000-65534. Increase this value to make more ports available.
  - TcpTimedWaitDelay: Controls how long a closed connection stays in TIME_WAIT before its port can be reused, which corresponds to 2MSL. The default is 240 seconds, and the valid range is 30-300 seconds. Reduce it to 30 seconds to free resources faster.
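As a rough illustration of why these two values matter, the following sketch estimates how many new outbound connections per second a host can sustain before every ephemeral port is stuck in TIME_WAIT. It is a back-of-the-envelope model, not a measurement: it assumes each closed connection holds one local port for the full TcpTimedWaitDelay, and the lower bound of 1025 for the ephemeral range is an assumption that is not stated in the text above.

```python
def usable_ports(max_user_port: int, first_port: int = 1025) -> int:
    """Number of ephemeral ports between first_port (assumed) and MaxUserPort."""
    return max_user_port - first_port + 1


def sustainable_rate(max_user_port: int, timed_wait_delay_s: int) -> float:
    """Upper bound on new connections per second before ports run out."""
    return usable_ports(max_user_port) / timed_wait_delay_s


# Defaults from the list above versus the tuned values.
for label, port, delay in [("default", 5000, 240), ("tuned", 65534, 30)]:
    print(f"{label}: {usable_ports(port)} ports, "
          f"{sustainable_rate(port, delay):.1f} connections/s")
# prints:
# default: 3976 ports, 16.6 connections/s
# tuned: 64510 ports, 2150.3 connections/s
```

Under these assumptions the defaults cap the host at roughly 16 new connections per second, which a busy agent or any other chatty service on the same machine can exceed.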
To implement Method 2, open the Registry Editor (regedit), navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, and add or modify these DWORD values as needed. Adjusting these settings helps mitigate port exhaustion from TIME_WAIT accumulation.
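The same change can be scripted instead of being made by hand in regedit. The sketch below uses Python's standard `winreg` module and validates each value against the ranges given above before writing anything. The helper names (`validate`, `apply_settings`) are hypothetical, and the script assumes an elevated (administrator) session; changes to these TCP/IP parameters typically take effect only after a restart.

```python
import sys

TCPIP_PARAMS = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Valid ranges as stated in the text above.
RANGES = {
    "MaxUserPort": (5000, 65534),
    "TcpTimedWaitDelay": (30, 300),
}


def validate(name: str, value: int) -> int:
    """Return value unchanged if it lies inside the documented range."""
    low, high = RANGES[name]
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside {low}-{high}")
    return value


def apply_settings(settings: dict) -> None:
    """Write the DWORD values under HKLM. Windows only; needs admin rights."""
    if sys.platform != "win32":
        raise OSError("registry access is only available on Windows")
    import winreg  # standard library module, available on Windows only

    # Validate everything first so a bad value never causes a partial write.
    checked = {name: validate(name, value) for name, value in settings.items()}
    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, TCPIP_PARAMS, 0, winreg.KEY_SET_VALUE
    ) as key:
        for name, value in checked.items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)


desired = {"MaxUserPort": 65534, "TcpTimedWaitDelay": 30}
# apply_settings(desired)  # uncomment in an elevated session on Windows
```

Exporting the Parameters key before running such a script makes it easy to roll the change back.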