0. The Problem
Recently I have been working on a project that involves TCP socket programming on Linux. I frequently ran into errno 98 (address already in use) and errno 99 (cannot assign requested address), so I wrote a small test program to reproduce the issue. The test code is as below,
Save the code to test.c, then compile it with “gcc -o test test.c”. Running the program multiple times will likely give you one of the errors; enabling or disabling the line “setsockopt(socketFd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);” will give you the other.
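As a rough stand-in, here is a minimal sketch of a test along these lines. It is not the original program: the helper names bind_loopback() and rebind_after_close() are made up for illustration, it only uses the loopback address, and it only exercises the bind() side (errno 98). The server side closes its end first, so the server port is the one left behind, and the same reuse setting is applied to the first and the second socket. On Linux I would expect the rebind to fail with EADDRINUSE without SO_REUSEADDR and to succeed with it.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to 127.0.0.1:port (port 0 lets the kernel pick).
 * SO_REUSEADDR is set before bind() only when reuse is non-zero.
 * Returns the descriptor, or -1 with errno left by the failing call. */
int bind_loopback(unsigned short port, int reuse)
{
    int one = 1;
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if ((reuse && setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) < 0) ||
        bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Run one loopback connection in which the server side closes first, then
 * try to bind a new socket to the server's port.
 * Returns 0 if the rebind works, the errno of the failed rebind otherwise,
 * or -1 if the connection could not be set up at all. */
int rebind_after_close(int reuse)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int lfd, cfd = -1, sfd = -1, fd, ok = 0;

    memset(&addr, 0, sizeof addr);
    lfd = bind_loopback(0, reuse);
    if (lfd < 0)
        return -1;
    if (listen(lfd, 1) == 0 &&
        getsockname(lfd, (struct sockaddr *)&addr, &len) == 0 &&
        (cfd = socket(AF_INET, SOCK_STREAM, 0)) >= 0 &&
        connect(cfd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        (sfd = accept(lfd, NULL, NULL)) >= 0)
        ok = 1;

    if (sfd >= 0)
        close(sfd);             /* server side sends the first FIN */
    if (cfd >= 0)
        close(cfd);
    close(lfd);
    if (!ok)
        return -1;

    /* The old connection still occupies the server port at this point. */
    fd = bind_loopback(ntohs(addr.sin_port), reuse);
    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Calling rebind_after_close(0) and rebind_after_close(1) from a small main() and printing the results with strerror() should show the difference between the two settings.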
When the error occurs, run the netstat command “netstat | grep 27511” (from the program output, I know the error occurred at port 27511). Below is the screenshot,
Figure 1. Cannot Assign Requested Address
As shown in Figure 1, the TCP port is in the TIME_WAIT state.
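The same check can be done from C by reading /proc/net/tcp, which lists IPv4 TCP sockets with hexadecimal ports and a hexadecimal state code (06 is TIME_WAIT). The sketch below is only an illustration: the function names are my own, and it ignores IPv6 sockets, which live in /proc/net/tcp6.

```c
#include <stdio.h>

#define TCP_STATE_TIME_WAIT 0x06   /* state code for TIME_WAIT in /proc/net/tcp */

/* Parse one data line of /proc/net/tcp. Addresses, ports and the state are
 * hexadecimal fields. Returns 1 on success, 0 if the line does not match
 * (for example the header line). */
int parse_proc_tcp_line(const char *line, unsigned *local_port,
                        unsigned *remote_port, unsigned *state)
{
    return sscanf(line, " %*d: %*x:%x %*x:%x %x",
                  local_port, remote_port, state) == 3;
}

/* Count IPv4 sockets in TIME_WAIT whose local or remote port equals port.
 * Returns -1 if /proc/net/tcp cannot be opened. */
int count_time_wait(unsigned port)
{
    char line[512];
    int count = 0;
    FILE *fp = fopen("/proc/net/tcp", "r");
    if (!fp)
        return -1;

    while (fgets(line, sizeof line, fp)) {
        unsigned lport, rport, state;
        if (parse_proc_tcp_line(line, &lport, &rport, &state) &&
            state == TCP_STATE_TIME_WAIT &&
            (lport == port || rport == port))
            count++;
    }
    fclose(fp);
    return count;
}
```

For the case above, count_time_wait(27511) would play the role of “netstat | grep 27511” restricted to TIME_WAIT entries.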
1. The TCP State Transition at Closure
To close an established TCP connection, both endpoints send FIN packets to indicate that they have no more data to send. Each endpoint then needs to ACK the FIN it receives from the other party.
The FIN packets are sent when a program calls exit(), close() or shutdown(). The ACKs are handled by the kernel, possibly after close() has returned. Therefore, it is possible that the program finishes before the kernel releases the associated network resources, and another process won’t be able to use them until the kernel has freed them.
Below is a figure of the detailed state transitions for an endpoint when a TCP connection closes. The endpoint follows different paths depending on which side initiated the closure.
Figure 2. TCP State Transition at Closure (diagram from reference 4)
Note that TIME_WAIT only occurs at the endpoint that initiated the closure (or at both endpoints, if both happen to close at the same time).
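The closing transitions can be written down as a small table-like function. This is only a sketch of the standard TCP closing states, not kernel code; the enum and function names are my own, and a FIN and its ACK arriving in one packet are modelled as two separate events.

```c
#include <stddef.h>

enum tcp_state { ST_ESTABLISHED, ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_CLOSING,
                 ST_TIME_WAIT, ST_CLOSE_WAIT, ST_LAST_ACK, ST_CLOSED };

enum tcp_event { EV_APP_CLOSE,   /* local program calls close() */
                 EV_RECV_FIN,    /* FIN arrives from the peer */
                 EV_RECV_ACK,    /* peer ACKs our FIN */
                 EV_TIMEOUT };   /* the TIME_WAIT timer expires */

/* Return the state that follows s when event e happens. */
enum tcp_state next_state(enum tcp_state s, enum tcp_event e)
{
    switch (s) {
    case ST_ESTABLISHED:
        if (e == EV_APP_CLOSE) return ST_FIN_WAIT_1;  /* we send FIN first */
        if (e == EV_RECV_FIN)  return ST_CLOSE_WAIT;  /* peer sent FIN first */
        break;
    case ST_FIN_WAIT_1:
        if (e == EV_RECV_ACK)  return ST_FIN_WAIT_2;
        if (e == EV_RECV_FIN)  return ST_CLOSING;     /* simultaneous close */
        break;
    case ST_FIN_WAIT_2:
        if (e == EV_RECV_FIN)  return ST_TIME_WAIT;
        break;
    case ST_CLOSING:
        if (e == EV_RECV_ACK)  return ST_TIME_WAIT;
        break;
    case ST_TIME_WAIT:
        if (e == EV_TIMEOUT)   return ST_CLOSED;
        break;
    case ST_CLOSE_WAIT:
        if (e == EV_APP_CLOSE) return ST_LAST_ACK;    /* we send our FIN */
        break;
    case ST_LAST_ACK:
        if (e == EV_RECV_ACK)  return ST_CLOSED;
        break;
    case ST_CLOSED:
        break;
    }
    return s;   /* the event does not apply in this state */
}

/* Start from ESTABLISHED, feed the events in order, and report whether the
 * endpoint passed through TIME_WAIT (1) or not (0). */
int passes_through_time_wait(const enum tcp_event *events, size_t n)
{
    enum tcp_state s = ST_ESTABLISHED;
    int seen = 0;
    for (size_t i = 0; i < n; i++) {
        s = next_state(s, events[i]);
        if (s == ST_TIME_WAIT)
            seen = 1;
    }
    return seen;
}
```

The side that closes first sees the sequence EV_APP_CLOSE, EV_RECV_ACK, EV_RECV_FIN, EV_TIMEOUT and goes through TIME_WAIT. The other side sees EV_RECV_FIN, EV_APP_CLOSE, EV_RECV_ACK and goes straight from LAST_ACK to CLOSED.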
2. Why TIME_WAIT
After the TCP connection is closed, there might still be live packets in the network. If a new connection is established with the exact same (client IP, client port, server IP, server port) tuple, packets from the previous connection could be mistaken for packets of the new connection.
To avoid this, the TIME_WAIT duration is generally set to twice the maximum lifetime of a packet (the maximum segment lifetime, MSL). This is long enough that any packets from the old connection will be dead by the time it expires. Note that holding TIME_WAIT at one endpoint is enough to make sure that no two identical (client IP, client port, server IP, server port) tuples appear.
3. How to Avoid the Problem
TIME_WAIT only occurs at the side that initiates the TCP connection closure, so a natural solution is to avoid being the side that calls close() first. If you have control over both client and server, you may want to let the client close first, so the server won’t end up with lots of ports in TIME_WAIT.
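One way to let the client close first is for the server to keep reading until it sees end-of-file, which means the client's FIN has arrived, and only then call close(). Below is a minimal sketch of that idea; close_after_peer() is a name I made up, and a real server would also want a timeout so that a client that never closes cannot hold the connection open forever.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Discard incoming data until the peer closes its end (recv() returns 0),
 * then close our end. Returns 0 if the peer closed first, or -1 on a
 * receive error with errno preserved. fd is closed in both cases. */
int close_after_peer(int fd)
{
    char buf[256];

    for (;;) {
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n == 0)
            break;                  /* end-of-file: the peer has sent its FIN */
        if (n < 0) {
            int saved;
            if (errno == EINTR)
                continue;           /* interrupted by a signal, try again */
            saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        /* n > 0: leftover data from the peer, ignored in this sketch */
    }
    close(fd);
    return 0;
}
```

With this pattern the server ends up on the CLOSE_WAIT and LAST_ACK path, and the TIME_WAIT state lands on the client.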
As indicated in the test program, setsockopt() with SO_REUSEADDR allows you to bind a socket to a port that is in TIME_WAIT. If you use the socket on the client side and try connecting to the same (server address, server port) tuple, you’ll fail at the connect stage. However, connecting to a different (server address, server port) is allowed.
If you use the socket as a server socket, you can also use SO_REUSEADDR. I have not tested what happens if the same (client address, client port) tries to connect, but I guess the connection request will be denied.
It’s also possible to modify the TIME_WAIT values on some operating systems.
References
1. Setting TIME_WAIT TCP, stackoverflow: http://stackoverflow.com/questions/337115/setting-time-wait-tcp
2. TIME_WAIT and its design implications for protocols and scalable client server systems: http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html
3. The TIME_WAIT state in TCP and Its Effect on Busy Servers: http://www.isi.edu/touch/pubs/infocomm99/infocomm99-web/
4. Bind: Address Already in Use or How to Avoid this Error when Closing TCP Connections