TCP, NAT and 2MSL mismatch

We have a client that connects over the NHS internal network to a server hosted at our site. We have lots of clients like this, but this one is slightly different because they NAT all their machines to one IP before the traffic reaches us.

Recently they complained about connection problems and after lots of investigation we managed to get a packet capture of the problem (IPs changed of course):

 1  0.00 -> TCP 2268 > 80 [SYN]
 2  0.00 -> TCP 80 > 2268 [SYN, ACK]
 3  0.01 -> TCP 2268 > 80 [ACK]
 4  0.08 -> HTTP POST
 5  0.24 -> TCP 80 > 2268 [ACK]
 6  0.23 -> HTTP Continuation
 7  0.24 -> HTTP HTTP/1.1 200 OK 1365
 8  0.24 -> HTTP Continuation
 9  0.24 -> TCP 80 > 2268 [FIN, ACK]
10  0.29 -> TCP 2268 > 80 [ACK]
11  0.31 -> TCP 2268 > 80 [FIN, ACK]
12  0.31 -> TCP 80 > 2268 [ACK]
13  0.34 -> TCP 2268 > 80 [ACK]
14 68.26 -> TCP 2268 > 80 [SYN]
15 71.18 -> TCP 2268 > 80 [SYN]
16 77.13 -> TCP 2268 > 80 [SYN]
17 98.25 -> TCP 2268 > 80 [RST, CWR]

The problem starts at packet 14, where the client tries to connect to port 80 on the server. The server seems to ignore the SYN packets and eventually the client gives up and sends a RST. We were capturing the packets on a bridge just in front of the web server, so to be sure we eventually got Ethereal installed on the web server itself and confirmed the packets were actually arriving.

This is puzzling until you look at the problem in context (packets 1 to 13). This shows that the same client IP connected successfully to the same server 68 seconds previously; more importantly, it did so from the same TCP source port, 2268. So we dug out our trusty copy of “TCP/IP Illustrated” by W. Richard Stevens.

So the problem here is with something called the 2MSL wait state, also known as the TIME_WAIT state. MSL is the maximum segment lifetime: the maximum amount of time any segment can exist in the network before being discarded. When a normal connection finishes, both client and server keep a note of the (source ip, source port, destination ip, destination port) tuple for a period of time equal to 2 x MSL. (Strictly speaking, it is the side that sends the first FIN that enters TIME_WAIT; in the capture above that is the server, in packet 9.) During this time the client won’t use the tuple for new connections and the server ignores any incoming packets matching it.
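To make the 2MSL behaviour concrete, here is a minimal toy model in Python. It is only a sketch of the idea described above, not how any real TCP stack is implemented (real stacks have extra rules for segments arriving in TIME_WAIT). The class name, method names and IP addresses are made up for illustration; the port, timestamps and MSL values come from the capture and the discussion in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# (source ip, source port, destination ip, destination port)
ConnTuple = Tuple[str, int, str, int]


@dataclass
class TimeWaitTable:
    """Toy model of an endpoint that remembers closed tuples for 2 x MSL."""

    msl_seconds: float
    _expires: Dict[ConnTuple, float] = field(default_factory=dict)

    def connection_closed(self, conn: ConnTuple, now: float) -> None:
        # Remember the tuple until 2 x MSL after the connection closed.
        self._expires[conn] = now + 2 * self.msl_seconds

    def accepts_syn(self, conn: ConnTuple, now: float) -> bool:
        # A SYN for a tuple still inside its 2MSL window is ignored.
        expiry = self._expires.get(conn)
        if expiry is not None and now < expiry:
            return False
        self._expires.pop(conn, None)  # the wait is over, forget the tuple
        return True


# Hypothetical addresses (documentation ranges); port and times from the capture.
conn = ("192.0.2.10", 2268, "198.51.100.5", 80)

server = TimeWaitTable(msl_seconds=120)  # 2MSL = 240 seconds
server.connection_closed(conn, now=0.34)

nat = TimeWaitTable(msl_seconds=30)      # 2MSL = 60 seconds
nat.connection_closed(conn, now=0.34)

print("NAT would reuse the tuple at 68.26s:", nat.accepts_syn(conn, now=68.26))
print("server would accept a SYN at 68.26s:", server.accepts_syn(conn, now=68.26))
```

With an MSL of 30 seconds the tuple is free again after 60 seconds, while with an MSL of 2 minutes it is still blocked at the 68 second mark, which is the mismatch seen in the capture.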

Here, the NATting device (some Cisco thing) has a low MSL but our server has a high MSL. The result is that client connections end up NATted to new incarnations of tuples that the server is still ignoring, so the connection attempt fails.

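You can see that the new incarnation starts at packet 14, only 68 seconds after the previous one finished. The NAT device was therefore willing to reuse the tuple after less than 68 seconds, which suggests a 2MSL wait of about 60 seconds, i.e. an MSL setting of around 30 seconds. The server is IIS on Windows 2003 and has a default MSL of 2 minutes, so it carries on ignoring the tuple for 4 minutes.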
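The same arithmetic can be checked against the capture itself. The following is a rough Python sketch that parses the TCP rows of the listing above, finds the point where a port that has already seen a FIN is used for a fresh SYN, and compares the gap with 2 x MSL for the two values discussed here. The function name and the regular expression are our own invention for this listing format, and the 30 second figure is the estimate from this post, not a measured setting.

```python
import re

SERVER_PORT = 80

# Rows copied from the capture above; HTTP rows carry no port or flag details.
CAPTURE = """\
 1  0.00 -> TCP 2268 > 80 [SYN]
 2  0.00 -> TCP 80 > 2268 [SYN, ACK]
 3  0.01 -> TCP 2268 > 80 [ACK]
 4  0.08 -> HTTP POST
 5  0.24 -> TCP 80 > 2268 [ACK]
 9  0.24 -> TCP 80 > 2268 [FIN, ACK]
10  0.29 -> TCP 2268 > 80 [ACK]
11  0.31 -> TCP 2268 > 80 [FIN, ACK]
12  0.31 -> TCP 80 > 2268 [ACK]
13  0.34 -> TCP 2268 > 80 [ACK]
14 68.26 -> TCP 2268 > 80 [SYN]
15 71.18 -> TCP 2268 > 80 [SYN]
16 77.13 -> TCP 2268 > 80 [SYN]
17 98.25 -> TCP 2268 > 80 [RST, CWR]
"""

ROW = re.compile(r"^\s*(\d+)\s+([\d.]+) -> TCP (\d+) > (\d+) \[([^\]]+)\]")


def find_tuple_reuse(capture):
    """Return (client_port, last_old_packet_time, new_syn_time) per reuse."""
    state = {}   # client port -> {"last": time of last packet, "fin": FIN seen}
    reuses = []
    for line in capture.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # not a TCP row in this listing
        t = float(m.group(2))
        src, dst = int(m.group(3)), int(m.group(4))
        flags = {f.strip() for f in m.group(5).split(",")}
        port = src if dst == SERVER_PORT else dst
        entry = state.setdefault(port, {"last": t, "fin": False})
        if flags == {"SYN"} and entry["fin"]:
            # A bare SYN on a port whose previous connection already closed.
            reuses.append((port, entry["last"], t))
            entry["fin"] = False  # later SYNs are retries of this attempt
        if "FIN" in flags:
            entry["fin"] = True
        entry["last"] = t
    return reuses


for port, closed, syn in find_tuple_reuse(CAPTURE):
    gap = syn - closed
    print(f"port {port} reused {gap:.2f}s after the old connection's last packet")
    for name, msl in (("NAT device (estimated)", 30), ("server", 120)):
        verdict = "still waiting" if gap < 2 * msl else "wait expired"
        print(f"  {name}: 2MSL = {2 * msl}s -> {verdict}")
```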

The solution is to either increase the MSL on the NAT device, decrease the MSL on the server or stop NATting. There are good performance arguments for lowering a web server’s MSL, so we’ll probably do this. Additionally, the NATting is unnecessary and we’ve arranged for our client to stop doing it.
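For the server-side option, a hedged sketch: to the best of our knowledge, Windows controls the length of the TIME_WAIT state with the TcpTimedWaitDelay registry value under the Tcpip Parameters key, expressed in seconds for the whole wait (that is, 2 x MSL rather than MSL). The 30 to 300 second range used below and the need for a restart afterwards are assumptions to check against Microsoft's documentation for your Windows version; the function names are ours. The registry write only works on Windows and needs administrator rights.

```python
REG_PATH = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
VALUE_NAME = "TcpTimedWaitDelay"

# Assumed valid range in seconds; verify for your Windows version.
MIN_DELAY, MAX_DELAY = 30, 300


def clamp_delay(seconds):
    """Keep the requested TIME_WAIT duration inside the assumed valid range."""
    return max(MIN_DELAY, min(MAX_DELAY, int(seconds)))


def set_timed_wait_delay(seconds):
    """Write the TIME_WAIT duration (2 x MSL, in seconds) to the registry."""
    import winreg  # Windows-only standard library module

    value = clamp_delay(seconds)
    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, REG_PATH, 0, winreg.KEY_SET_VALUE
    ) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_DWORD, value)
    return value


# Matching the NAT device's estimated MSL of 30 seconds means a 60 second wait:
# set_timed_wait_delay(2 * 30)
```

The call is left commented out on purpose, since it changes a system-wide setting.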
