Packet Analysis for Troubleshooting: Slow Response of AD Home Directory

Symptom: Virtual desktop end users report a performance issue. They can open their AD home directory quickly the first time. After a while, they have to wait more than 30 seconds before they can reach the home directory again.

Method: Run a packet capture on one end user's desktop and capture the traffic while the user is experiencing the issue.

Finding in the packet analysis:

The capture shows TCP retransmissions.
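To illustrate what such a finding looks like in data, here is a minimal sketch that flags repeated data segments in a list of already-parsed packet records. The record layout (time, src, sport, dst, dport, seq, length) and the function name are assumptions made for this example, and the heuristic (same flow, same sequence number, non-empty payload seen again) is far simpler than what a real analyzer such as Wireshark does.

```python
def find_retransmissions(packets):
    """Return (first_time, repeat_time, flow, seq) for each repeated data segment.

    `packets` is a list of dicts with hypothetical keys:
    time, src, sport, dst, dport, seq, length (payload bytes).
    """
    first_seen = {}   # (flow, seq) -> time the segment was first seen
    repeats = []
    for pkt in packets:
        if pkt["length"] == 0:
            continue  # pure ACKs carry no data, so skip them
        flow = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
        key = (flow, pkt["seq"])
        if key in first_seen:
            # Same data segment seen again on the same flow: likely a retransmission
            repeats.append((first_seen[key], pkt["time"], flow, pkt["seq"]))
        else:
            first_seen[key] = pkt["time"]
    return repeats
```

Growing gaps between the repeats of one segment are what you would expect from the sender's retransmission timer backing off.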

Root Cause: By default, most stateful firewalls time out an entry in the firewall session table after 30 minutes. If no packet passes through the firewall for that session during this period, the firewall times the session out and removes it from the session table.

In our case, a new TCP session entry is created in the firewall session table when a virtual desktop user accesses the home directory for the first time. After that, the user often does not touch the home directory for a while. After half an hour, the idle session entry is removed from the firewall. From the end user application's point of view, however, the session is still alive, so the application tries to reuse it to access the home directory again. (Remember that the user desktop won't perform the so-called three-way handshake to establish a new TCP session, because the application layer still thinks the existing TCP session is alive.) When the application traffic hits the firewall, the firewall drops the packet because it is not a TCP SYN packet and no matching session entry exists. (Unfortunately, the Juniper SRX firewall drops the packets silently! No logging or alert.) The end user desktop therefore has to fall back on the standard TCP retransmission mechanism and retransmit the packet. In our case it takes 12 seconds before the end user device gives up and tries to initiate a new TCP session.
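The sequence of events above can be sketched with a toy model of a stateful session table. This is only an illustration of the idle-timeout behaviour described here, not how the SRX or any real firewall is implemented. The class name, method name, and return values are all made up for the example.

```python
class SessionTable:
    """Toy model of a stateful firewall session table with an idle timeout."""

    def __init__(self, idle_timeout_seconds):
        self.idle_timeout = idle_timeout_seconds
        self.last_seen = {}  # flow -> time of the last packet that matched it

    def handle(self, flow, is_syn, now):
        """Return "forward" or "drop" for a packet on `flow` at time `now`."""
        last = self.last_seen.get(flow)
        # Age out the entry if the flow has been idle for too long
        if last is not None and now - last >= self.idle_timeout:
            del self.last_seen[flow]
            last = None
        # Without a session entry, only a SYN may create a new one
        if last is None and not is_syn:
            return "drop"
        self.last_seen[flow] = now
        return "forward"


if __name__ == "__main__":
    table = SessionTable(idle_timeout_seconds=30 * 60)
    flow = ("desktop", 50000, "fileserver", 445)  # hypothetical flow
    print(table.handle(flow, is_syn=True, now=0))       # first access opens the session
    print(table.handle(flow, is_syn=False, now=60))     # data while the entry is fresh
    print(table.handle(flow, is_syn=False, now=2000))   # data after more than 30 idle minutes
    print(table.handle(flow, is_syn=True, now=2012))    # client gives up and sends a new SYN
```

In this model the third packet is dropped because the entry has aged out and the packet is not a SYN, which matches the behaviour seen in the capture.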

Solution:

There are two ways to fix the issue:

1. Infrastructure point of view:

Change the default session timeout on the firewall to a value slightly longer than the application-layer session timeout;

2. Application point of view:

Have the application send a TCP keep-alive packet periodically (e.g. every 20 minutes), so that the session entry is refreshed before it is removed from the firewall session table;

Both fixes add some overhead on the firewall, especially in terms of session table size.
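For the application-side option, here is a minimal sketch of how a program that owns its own socket could turn on TCP keep-alive with a 20-minute idle time. It assumes you control the application's code, which may not be true for the file-sharing client in our case. The function name and default values are made up for the example, and the per-socket tuning options are platform dependent, so they are only set where Python exposes them.

```python
import socket


def enable_keepalive(sock, idle_seconds=20 * 60, interval_seconds=60, probe_count=5):
    """Turn on TCP keep-alive so an idle connection still sends periodic probes.

    idle_seconds should be shorter than the firewall's session idle timeout.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The options below are not available on every platform, so guard each one
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probe_count)
    return sock
```

Where the per-socket options are missing, only SO_KEEPALIVE is set and the operating system's default keep-alive timers apply. Those defaults can be longer than the firewall timeout, so they should be checked separately.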
