
WAN load-balancing can't flush connections when conntrack-sync is enabled
On hold, Normal · Public · BUG

Description

When flush-connections is enabled in load-balancing, the wan_lb daemon executes the following commands whenever an interface changes state:

conntrack -F
conntrack -F expect

But when conntrackd is running, the first command never finishes and hangs. As far as I can see, conntrack sends the command to the kernel but doesn't receive any answer. This also leads to continuous CPU usage by conntrack and conntrackd:

root      7792 81.7  1.0  24416  5332 ?        S<s  17:00   0:33 /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf
root      7829  7.0  0.1  12652   888 pts/1    R+   17:00   0:02 conntrack -F
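The symptom in the ps output above can be checked for mechanically. The following is only a rough sketch, assuming `ps aux`-style columns; the function name `busy_conntrack_processes` and the 5% threshold are made up for illustration and are not part of VyOS or conntrack-tools.

```python
def busy_conntrack_processes(ps_output, cpu_threshold=5.0):
    """Return (pid, cpu, command) for conntrack/conntrackd processes
    whose %CPU is at or above cpu_threshold.

    Assumes `ps aux` column order:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)  # keep the full command in one field
        if len(fields) < 11:
            continue
        try:
            cpu = float(fields[2])
        except ValueError:
            continue  # header line or unexpected format
        command = fields[10]
        binary = command.split()[0].rsplit("/", 1)[-1]
        if binary in ("conntrack", "conntrackd") and cpu >= cpu_threshold:
            hits.append((int(fields[1]), cpu, command))
    return hits
```

Fed the two lines above, this would report both PID 7792 (conntrackd) and PID 7829 (conntrack -F).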

This problem breaks load-balancing functionality.
Instead of flushing the table, we can delete its contents. This works without problems. I propose changing the command to the following to avoid the problem:

conntrack -D

Otherwise, we need to block the ability to enable both options at the same time.
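To illustrate the first option, here is a minimal sketch of a flush that gives up on `conntrack -F` after a timeout and falls back to `conntrack -D`. This is not the actual wan_lb code; the function name `flush_conntrack`, the injectable `run` parameter, and the 5-second timeout are assumptions for illustration. Only the three conntrack commands are taken from this report.

```python
import subprocess


def flush_conntrack(run=subprocess.run, timeout=5):
    """Clear the conntrack table, falling back to `conntrack -D`
    if `conntrack -F` does not return within `timeout` seconds.

    Returns "flush" or "delete" depending on which command was used.
    Exit codes are ignored in this sketch.
    """
    try:
        run(["conntrack", "-F"], timeout=timeout)
        method = "flush"
    except subprocess.TimeoutExpired:
        # `conntrack -F` hung (the behaviour seen with conntrackd running),
        # so delete the entries instead, as proposed above.
        run(["conntrack", "-D"], timeout=timeout)
        method = "delete"
    # The expectation table is flushed in both cases.
    run(["conntrack", "-F", "expect"], timeout=timeout)
    return method
```

`subprocess.run` kills the child process when the timeout expires, so a hung `conntrack -F` would not be left spinning. Whether conntrackd itself recovers afterwards is not something this sketch addresses.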

Details

Difficulty level
Unknown (require assessment)
Version
1.2.0-rolling+201903210337
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

syncer changed the task status from Open to Confirmed. Apr 17 2019, 7:41 PM
syncer assigned this task to hagbard.
syncer triaged this task as Normal priority.

@zsdc Can you please share some config data or clarify what you mean? thx

hagbard changed the task status from Confirmed to On hold. Apr 29 2019, 4:19 PM
hagbard added a subscriber: hagbard.

@syncer
Sorry to dredge up an old bug, but I believe I've hit this today on 1.2.7-LTS myself. Per @zsdc's original description, it seems that when you configure:

service {
    conntrack-sync {

and also configure:

load-balancing {
    wan {
        flush-connections

conntrackd will peg a CPU at 100% utilization immediately after commit, forever, and this also prevents WAN load-balancing from starting. The only way I could find to recover was to delete the load-balancing flush-connections configuration entry and reboot, which took me a minute because the system was very slow with a CPU core locked up.

WAN load-balancing and conntrack-sync seem to cooperate fine as long as you do *not* enable flush-connections. But I would *really* like flush-connections to work alongside my conntrack-sync. Is that possible?
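Until the two options work together, a commit-time check along the lines of the "block both options" idea from the description would at least avoid the lock-up. A rough sketch, assuming the configuration is available as a nested dict keyed by node name; the function name `check_flush_vs_sync` and the dict layout are hypothetical and do not reflect the real VyOS config API.

```python
def check_flush_vs_sync(config):
    """Return an error message if flush-connections and conntrack-sync
    are both configured, otherwise None.

    `config` is assumed to be a nested dict mirroring the config tree,
    e.g. {"service": {"conntrack-sync": {...}},
          "load-balancing": {"wan": {"flush-connections": {}}}}.
    """
    lb_wan = config.get("load-balancing", {}).get("wan", {})
    sync = config.get("service", {}).get("conntrack-sync")
    if "flush-connections" in lb_wan and sync is not None:
        return ("load-balancing wan flush-connections cannot be used "
                "together with service conntrack-sync")
    return None
```

A validation step could raise this message as a commit error instead of letting wan_lb start with a combination known to hang.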

dmbaturin set Is it a breaking change? to Unspecified (possibly destroys the router).

I have also hit this in the latest rolling version:

[email protected]:~$ show version

Version:          VyOS 1.4-rolling-202206130217
Release train:    sagitta

Built by:         [email protected]
Built on:         Mon 13 Jun 2022 02:17 UTC
Build UUID:       6c04f7fd-b1c8-4e5d-ab90-310ccfb016d5
Build commit ID:  001451a9c514e7

Architecture:     x86_64
Boot via:         installed image
System type:      KVM guest

Hardware vendor:  QEMU
Hardware model:   Standard PC (i440FX + PIIX, 1996)
Hardware S/N:     
Hardware UUID:    70fb4a55-e1f4-4593-83d0-189dcd06ec78

Copyright:        VyOS maintainers and contributors

I had to remove flush-connections as suggested by @klipz, but this results in my slow backup internet connection being used longer than needed after the primary recovers.