Why is TCP accept() performance so bad under Xen?

Right now: Small packet performance sucks under Xen

(moved from the question itself to a separate answer instead)

According to a user on HN (a KVM developer?), this is due to small-packet performance in Xen, and also in KVM. It's a known problem with virtualization, and according to him, VMware's ESX handles it much better. He also noted that KVM is adding some new features designed to alleviate this (original post).

This info is a bit discouraging if it’s correct. Either way, I’ll try the steps below until some Xen guru comes along with a definitive answer 🙂

Iain Kay from the xen-users mailing list compiled this graph:
netperf graph
Notice the TCP_CRR bars and compare "2.6.18-239.9.1.el5" vs "2.6.39 (with Xen 4.1.0)". TCP_CRR is netperf's connect/request/response test: every transaction uses a fresh TCP connection, so it exercises exactly the accept() path in question.
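
For a sense of what that benchmark exercises, here is a minimal TCP_CRR-style loop in Python — a sketch, not netperf itself. The host, port, payload size, and duration are placeholders, and it assumes a server that echoes one byte back (a toy server along those lines is sketched after the action plan):

```python
# TCP_CRR-style client: each transaction opens a fresh connection, sends a
# tiny request, reads a tiny response, and closes -- the pattern that
# hammers accept() on the server side.
import socket
import time

HOST, PORT = "192.168.1.10", 9000   # placeholder test-server address
DURATION = 10                       # seconds to run

transactions = 0
deadline = time.time() + DURATION
while time.time() < deadline:
    with socket.create_connection((HOST, PORT)) as s:
        s.sendall(b"x")             # one-byte request
        s.recv(1)                   # one-byte response
    transactions += 1

print(f"{transactions / DURATION:.0f} transactions/sec")
```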

Current action plan based on responses/answers here and from HN:

  1. Submit this issue to a Xen-specific mailing list and xensource's bugzilla, as suggested by syneticon-dj.
    A message was posted to the xen-users list; awaiting a reply.

  2. Create a simple pathological, application-level test case and publish it.
    A test server with instructions has been created and published to GitHub. With it you should see a more real-world use case than netperf provides. A minimal stand-in sketch of such a server appears after this list.

  3. Try a 32-bit PV Xen guest instance, as 64-bit might incur more overhead in Xen. Someone mentioned this on HN. It did not make a difference.

  4. Try enabling net.ipv4.tcp_syncookies in sysctl.conf, as suggested by abofh on HN. This might improve performance, since the handshake would occur in the kernel. I had no luck with this. (The sysctl changes for this step and the next are sketched after the list.)

  5. Increase the backlog from 1024 to something much higher, also suggested by abofh on HN. This could also help, since the guest could then accept() more connections during the execution slice it gets from dom0 (the host).

  6. Double-check that conntrack is disabled on all machines as it can halve the accept rate (suggested by deubeulyou). Yes, it was disabled in all tests.

  7. Check for listen queue overflows and syncache bucket overflows in netstat -s (suggested by mike_esspe on HN); a quick filter script is sketched below.

  8. Split the interrupt handling among multiple cores (the RPS/RFS I tried enabling earlier is supposed to do this, but it could be worth trying again; see the sketch below). Suggested by adamt on HN.

  9. Turn off TCP segmentation offload and scatter/gather acceleration, as suggested by Matt Bailey (not possible on EC2 or similar VPS hosts); sketched below.
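
To make step 2 concrete: the sketch below is a minimal Python stand-in for that kind of test server, not the code actually published to GitHub. It accepts, echoes one byte, closes, and reports accepts per second; the port and backlog values are placeholders.

```python
# Minimal accept()-rate test server: accept, echo one byte, close, count.
import socket
import time

PORT, BACKLOG = 9000, 1024          # placeholders

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("", PORT))
srv.listen(BACKLOG)                 # effective value is capped by net.core.somaxconn

count, last = 0, time.time()
while True:
    conn, _ = srv.accept()
    try:
        conn.sendall(conn.recv(1) or b"x")  # echo the tiny request
    except OSError:
        pass                        # client went away mid-transaction
    finally:
        conn.close()
    count += 1
    if time.time() - last >= 1.0:
        print(f"{count} accepts/sec")
        count, last = 0, time.time()
```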
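
Steps 4 and 5 are plain sysctl changes. A sketch, applied via /proc — equivalent to putting the keys in /etc/sysctl.conf and running sysctl -p. The 65535 value is just an arbitrary "much higher" placeholder, and the application must also pass a matching backlog to listen():

```python
# Apply the step 4/5 sysctls by writing to /proc (run as root).
SETTINGS = {
    "/proc/sys/net/ipv4/tcp_syncookies": "1",           # step 4: SYN cookies
    "/proc/sys/net/core/somaxconn": "65535",            # step 5: accept-queue cap
    "/proc/sys/net/ipv4/tcp_max_syn_backlog": "65535",  # step 5: SYN-queue depth
}
for path, value in SETTINGS.items():
    with open(path, "w") as f:
        f.write(value)
```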
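
For step 7, a quick way to spot the overflow counters is to filter netstat -s. The matched phrases are approximations — the exact wording varies with kernel and netstat versions:

```python
# Grep `netstat -s` output for listen-queue / SYN-drop counters.
import subprocess

out = subprocess.run(["netstat", "-s"], capture_output=True, text=True).stdout
for line in out.splitlines():
    if "overflow" in line or "SYNs to LISTEN" in line:
        print(line.strip())
```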
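
Step 8 (RPS/RFS) comes down to a few sysfs and /proc writes. A sketch — "eth0", the CPU mask, and the flow-table sizes are placeholders to adjust for your NIC and core count:

```python
# Spread receive processing across CPUs 0-3 with RPS, plus RFS flow steering.
IFACE = "eth0"                      # placeholder interface name
RX_QUEUE = f"/sys/class/net/{IFACE}/queues/rx-0"

with open(f"{RX_QUEUE}/rps_cpus", "w") as f:
    f.write("f")                    # hex CPU mask: CPUs 0-3
with open("/proc/sys/net/core/rps_sock_flow_entries", "w") as f:
    f.write("32768")                # global RFS flow table size
with open(f"{RX_QUEUE}/rps_flow_cnt", "w") as f:
    f.write("32768")                # per-rx-queue RFS flow count
```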
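
And for step 9, disabling TSO and scatter/gather is one ethtool call per feature (root required; "eth0" is again a placeholder) — shelled out from Python here to keep the examples in one language:

```python
# Turn off TCP segmentation offload and scatter/gather on the NIC.
import subprocess

for feature in ("tso", "sg"):
    subprocess.run(["ethtool", "-K", "eth0", feature, "off"], check=True)
```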
