
Intel ethernet driver defaults sub-optimal
Open, Wishlist, Public

Description

My hardware is based on the i211AT and a low-power CPU that is slow by today's standards (4 cores @ 1 GHz, PC Engines APU4), but the issues could affect other Intel NICs as well.

  1. Default ring buffer sizes (see "ethtool -g eth0") of 256 are sub-optimal for a fairly busy router and cause significant packet loss. It works much better after "ethtool -G eth0 rx 4096 tx 4096", which sets the maximum allowed values. Real Internet traffic (in my case, a BGP router for a small local ISP with a few hundred customers) tends to be quite bursty, so it is not immediately obvious from averaged traffic or CPU load statistics, or even from iperf tests, that there are brief moments when the router can't keep up. If the small defaults are needed because of old hardware limitations, it would at least be good to document this clearly, as it cost me some grey hair to find the cause of the packet loss (a wired router is the last place to expect it in a mostly wireless network). Anyway, igb is for fairly recent hardware (even ixgbe uses just 512 by default, and its max is 4096 too), e1000e for older (PCIe) and e1000 for much older (parallel PCI, PCI-X) hardware, right? For comparison, I checked the driver source for cheap Realtek NICs (r8169), which use 2560 RX descriptors: not tunable, but 10x more. Larger buffers mean higher latency, but I'm not seeing much of an increase, and the packet loss before increasing them had a much worse effect on customer experience. About 1% may not seem like a lot, but it matters a lot at today's high speeds: see the formula at https://en.wikipedia.org/wiki/TCP_tuning#Packet_loss (throughput scales with the inverse square root of the loss rate, so 10x more TCP speed needs 100x lower loss).
  2. On the same busy router I'm seeing lots of "igb ...: partial checksum but l4 proto=29!" messages in the dmesg logs, and also some "mixed HW and IP checksum settings". This probably has something to do with offload settings; I'm not sure which ones yet. Again, the defaults are probably tuned for a not very busy host, not for a busy router.
  3. This may be a quirk of my specific hardware or BIOS (coreboot), but of the 4 i211AT interfaces, the first two get two IRQs/queues each and the next two get only one IRQ/queue each. RPS needs to be enabled manually to distribute the load evenly between the 4 CPU cores; otherwise just one core was at 100%, especially on the box running as a PPPoE server. PPPoE seems much more CPU intensive than plain routing (my guess: each packet is copied, and since it is fresh data from the network, that means lots of cache misses and stressed memory bandwidth).
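
To make the loss argument in point 1 concrete, here is a minimal sketch of the bound from the linked Wikipedia section (throughput <= MSS / (RTT * sqrt(loss))). The MSS of 1460 bytes and the RTT of 20 ms are illustrative assumptions, not measurements from this router, and the function name is made up for the example.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss, c=1.0):
    """Approximate upper bound on single-flow TCP throughput in bits/s.

    Implements throughput <= c * MSS / (RTT * sqrt(loss)); c is a constant
    close to 1 that depends on the TCP variant (assumed 1.0 here).
    """
    if not 0 < loss <= 1:
        raise ValueError("loss must be in (0, 1]")
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    return c * (mss_bytes * 8) / (rtt_s * math.sqrt(loss))


# Illustrative values only: 1460-byte MSS, 20 ms round-trip time.
for loss in (1e-2, 1e-4):
    mbps = mathis_throughput_bps(1460, 0.020, loss) / 1e6
    print(f"loss={loss:g}: at most about {mbps:.2f} Mbit/s per flow")
```

Cutting the loss rate by a factor of 100 raises the bound by a factor of 10, which is the square-root relationship mentioned above.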

Details

Difficulty level
Unknown (require assessment)
Version
1.2.6-epa1
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Perfectly compatible
Issue type
Improvement (missing useful functionality)

Event Timeline

marekm triaged this task as Wishlist priority. Sep 9 2020, 10:00 PM
marekm created this task.
marekm created this object in space S1 VyOS Public.
  1. Totally agree with this. We had this same issue when we used to run Vyatta. Took me ages to figure out too.

However, I'm not sure what the best way to implement this would be. I read a good explanation here about when to increase and change interrupt settings.
Do you think a config option is best, e.g.
set interfaces ethernet eth0 advanced ring rx nnnn

and

What I think would be ideal, but a bit more interesting to implement, is some kind of health monitor that periodically looks at the rx_no_buffer_count and rx_missed_errors counters on the different interfaces and writes a warning or recommendation to the logs and/or console.
Maybe a daily script or something?
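
A rough sketch of what such a periodic check could look like, assuming the counters are read from "ethtool -S <iface>" and that the driver exposes rx_no_buffer_count and rx_missed_errors under those names (as mentioned above for igb). The function names and the warning text are hypothetical; this is not an existing VyOS feature.

```python
import subprocess

# Counters named in this thread; other drivers may use different names.
WATCHED = ("rx_no_buffer_count", "rx_missed_errors")


def parse_ethtool_stats(text):
    """Parse 'ethtool -S' output ("name: value" lines) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue  # e.g. the "NIC statistics:" header line
    return stats


def ring_warnings(iface, previous, current, watched=WATCHED):
    """Return one warning string per watched counter that grew since 'previous'."""
    warnings = []
    for counter in watched:
        delta = current.get(counter, 0) - previous.get(counter, 0)
        if delta > 0:
            warnings.append(
                f"{iface}: {counter} grew by {delta}; "
                f"consider larger rings (check 'ethtool -g {iface}')"
            )
    return warnings


def read_stats(iface):
    """Read the current counters for one interface (requires ethtool)."""
    result = subprocess.run(
        ["ethtool", "-S", iface], check=True, capture_output=True, text=True
    )
    return parse_ethtool_stats(result.stdout)
```

A daily or hourly job could keep the previous read_stats() result per interface, call ring_warnings() with the new one, and send any returned strings to syslog.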

I think most users will be looking at these figures after tearing their hair out for some time (like you did, and like I did!) before they realise how to fix the issue. So I think a prompt of sorts to look at the buffer settings would overall be best.

What do you think?

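  2. Protocol 29 is IPv6: the driver appears to print the value in hex, and 0x29 is 41 decimal, the IP protocol number for IPv6 encapsulation. I've noticed that there is some development work going on with kernel 5.9.9, which has built-in Intel drivers that might be a bit less buggy?
  3. Not sure about this one.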
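
For reference on point 3, RPS is enabled per receive queue by writing a hexadecimal CPU bitmask to /sys/class/net/<iface>/queues/rx-<n>/rps_cpus. The sketch below builds that mask and writes it to every RX queue of an interface. The helper names are made up for illustration, writing requires root, and the simple mask format assumes fewer than 32 CPUs (larger systems use comma-separated 32-bit groups).

```python
import glob
import os


def rps_cpu_mask(cpus):
    """Return the hex bitmask string for the given CPU numbers, e.g. 0-3 -> 'f'."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    return format(mask, "x")


def rps_paths(iface, sysfs_root="/sys/class/net"):
    """List the rps_cpus files for every RX queue of the interface."""
    pattern = os.path.join(sysfs_root, iface, "queues", "rx-*", "rps_cpus")
    return sorted(glob.glob(pattern))


def enable_rps(iface, cpus, sysfs_root="/sys/class/net"):
    """Write the CPU mask to each RX queue; returns the paths written (needs root)."""
    mask = rps_cpu_mask(cpus)
    paths = rps_paths(iface, sysfs_root)
    for path in paths:
        with open(path, "w") as handle:
            handle.write(mask)
    return paths
```

On the 4-core box described above, enable_rps("eth2", range(4)) would write "f" to each RX queue of eth2; the interface name is only an example.
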
erkin set Issue type to Improvement (missing useful functionality). Aug 29 2021, 1:09 PM
erkin removed a subscriber: Active contributors.

I also have some of those APU4 devices and they actually work pretty well. The reason for those "low" defaults lies not in VyOS but in the Linux kernel itself.

The problem with raising them in general is as follows:

  • The defaults are the Linux community's sane defaults
  • Not all hardware has the same min/max values, which makes the config part more complex
  • If I remember correctly, increasing the RX/TX ring buffers adds latency, but I might be wrong
  • What about APU boards that have the lowest available amount of memory? Will it break?
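
On the memory question, a back-of-the-envelope estimate may help. The sketch below assumes 16 bytes per descriptor and 2048 bytes of pre-allocated buffer per RX descriptor; both figures are assumptions for illustration, and the real per-descriptor allocation depends on the driver, page size and MTU. The function name is hypothetical.

```python
def ring_memory_bytes(rx_desc, tx_desc, queues=1, rx_buf_bytes=2048, desc_bytes=16):
    """Rough memory pinned per interface by its rings.

    Counts the descriptor rings plus the RX buffers that are allocated up
    front. TX buffers are not counted because they belong to packets in flight.
    """
    descriptors = (rx_desc + tx_desc) * desc_bytes
    rx_buffers = rx_desc * rx_buf_bytes
    return queues * (descriptors + rx_buffers)


MIB = 1024 * 1024
for size in (256, 4096):
    per_iface = ring_memory_bytes(size, size, queues=2)
    print(f"rings of {size}: about {per_iface / MIB:.2f} MiB per interface")
```

Under these assumptions, rings of 4096 with two queues come to roughly 16 MiB per interface, compared with about 1 MiB at the default of 256.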

To me it feels, right now, that there is no silver bullet here. Improving the documentation, on the other hand, will never hurt. Since you hit this one hard, would you care to submit a PR updating our documentation at https://docs.vyos.io?