XCP-ng packet drops for small packets (e.g. icmp) under Xen and AWS
Open, Requires assessment, Public, BUG

Description

Hi, I’m new. I’m not sure what needs to go here, but I found a problem, and someone else on the forum confirmed it is real.

When I run netstat -i, it shows TX drops on the Ethernet interfaces. The same TX drops show up when I run ifconfig.
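For anyone who wants to watch the counter without parsing netstat or ifconfig output, here is a minimal sketch that reads the same TX drop counters straight from sysfs. It assumes a Linux guest where /sys/class/net/<interface>/statistics/tx_dropped is readable. The function name tx_drops and its optional root argument are made up for this example.

```shell
#!/bin/sh
# Print "<interface> <tx_dropped>" for every interface under a sysfs-style root.
# The root defaults to /sys/class/net; pass another directory to test on a fixture.
tx_drops() {
    root="${1:-/sys/class/net}"
    for dir in "$root"/*; do
        stat_file="$dir/statistics/tx_dropped"
        # Skip entries that are not interfaces (no readable counter file).
        [ -r "$stat_file" ] || continue
        printf '%s %s\n' "$(basename "$dir")" "$(cat "$stat_file")"
    done
}
```

Calling tx_drops before and after a test run and comparing the two outputs shows how many packets were dropped on each interface during that run.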

The drops can be reproduced by sending packets under 214 bytes in size. About 3.75% of small packets seem to be dropped. Packets over 215 bytes in the same test reliably show 0% loss.
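A rough way to repeat this test is to ping through the router with several sizes and compare the loss figures. The sketch below assumes iputils ping (the -q, -c and -s options and its "N% packet loss" summary line). The function names parse_loss and sweep are made up for this example. The report does not say whether the 214 byte figure is the ICMP payload or the full packet, so the sizes passed to -s (payload bytes) are only a starting point.

```shell
#!/bin/sh
# Read ping output on stdin and print the loss percentage from its summary line.
parse_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Usage: sweep TARGET COUNT SIZE...
# Sends COUNT echo requests per payload size and prints "<size> <loss%>".
sweep() {
    target="$1"
    count="$2"
    shift 2
    for size in "$@"; do
        loss=$(ping -q -c "$count" -s "$size" "$target" 2>/dev/null | parse_loss)
        # "n/a" means no summary line was found (e.g. the host was unreachable).
        printf '%s %s\n' "$size" "${loss:-n/a}"
    done
}
```

For example, sweep 192.0.2.10 400 100 200 214 215 300, where 192.0.2.10 is a placeholder for a host on the far side of the VyOS router, so that the pings are forwarded from Ethernet to Ethernet rather than answered by the router itself.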

Tested under XCP-ng 8, XenServer 6.5 and AWS (which runs Xen).

Doesn’t happen on VirtualBox, VMware, or Hyper-V.

I suspect it’s related to the paravirtual I/O drivers (xen_netfront) or something similar.

Also, it only drops packets that are being forwarded from Ethernet to Ethernet. It doesn’t affect traffic that originates or terminates on the VyOS router itself, and it doesn’t affect traffic from VPN to Ethernet.

I have a Xen lab available to anyone who wishes to tinker and test.

Details

Difficulty level
Hard (possibly days)
Version
1.3 rolling
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

https://phabricator.vyos.net/T935 Here’s the same thing happening in the past. I think it was resolved by doing kernel updates? Can someone do a kernel update in the rolling build?

There is no newer kernel than 4.19.124 on the 4.19.x train. Newer kernels do not work, as the out-of-tree Intel drivers for the NICs and QAT won’t compile for kernels >5.3, and that is not an LTS one.

In T2505#64889, @c-po wrote:

There is no newer kernel than 4.19.124 on the 4.19.x train. Newer kernels do not work, as the out-of-tree Intel drivers for the NICs and QAT won’t compile for kernels >5.3, and that is not an LTS one.

So if this is a kernel issue, I should have the same problem with the same kernel under Debian 10, right?

If this can be solved by a kernel update: there was talk in the past about maybe having different build "flavors", one with all the hardware NIC drivers and one without. The minimal image could then have the latest (5.x) kernel.
There's T2085, which prevents us from testing any newer kernel ourselves: the kernel is built by Jenkinsfiles in the CI, so we'd need to manually do the steps the CI does to build it. In that task I proposed a shared script solution for these repositories that could be called from both the CI and vyos-build. This would allow anyone to build all packages, including the kernel, through vyos-build, just for cases like this.

@Sonicbx @jjakob I also created https://phabricator.vyos.net/T2504 - I think we duplicated the issue here. You can close whichever issue you want.

In T2505#64896, @jjakob wrote:

If this can be solved by a kernel update: there was talk in the past about maybe having different build "flavors", one with all the hardware NIC drivers and one without. The minimal image could then have the latest (5.x) kernel.
There's T2085, which prevents us from testing any newer kernel ourselves: the kernel is built by Jenkinsfiles in the CI, so we'd need to manually do the steps the CI does to build it. In that task I proposed a shared script solution for these repositories that could be called from both the CI and vyos-build. This would allow anyone to build all packages, including the kernel, through vyos-build, just for cases like this.

vyos-build-kernel has come with dedicated build scripts for some time now, so this should no longer be an issue. I do not support the different flavour idea, as it will be a nightmare to maintain. Just give it some time until Intel decides to update their stuff.

I replaced the distributed guest utilities (vyos-xe-guest-utilities) with the ones that come with XCP-ng, but this changed nothing regarding the packet loss. Though now they get properly recognized by XCP-ng :-)

Does anyone have an idea how to test with different kernels? For now this is a deal breaker for using the 1.3.x branch, though I would really love to keep using bleeding edge in order to help test things :-)

I don't see problems with Debian Buster, kernel "4.19.0-9"
Need to check this patch. Ref. https://patchwork.kernel.org/patch/9293785/

With 4.19.123-amd64-vyos I am having the same problems. I would assume that the patch from 2016 is already in this kernel?

Also happens with 4.19.131-amd64-vyos - I guess that patch mentioned by @Viacheslav is either not included or not solving the problem.

@fetzerms the mentioned patch is not included in the mainline kernel!

@c-po Thank you for clarifying. I guess I misinterpreted what I read on patchwork. I'd be eager to test a kernel with the patch!

Oh - I'm sorry - I mixed up the lines in the kernel. The patch is actually in VyOS.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=fd07160bb7180cdd0afeb089d8cdfd66002f17e6

(21:51) cpo lnx01:~/vyos-build/packages/linux-kernel/linux # git tag --contains fd07160bb7180 | grep 4.19.131
v4.19.131
c-po renamed this task from "Major Dropping small packets under Xen and AWS" to "XCP-ng packet drops for small packets (e.g. icmp) under Xen and AWS". Aug 16 2020, 2:49 PM
c-po added a subscriber: zsdc.