
WAN load-balancing can't flush connections when conntrack-sync is enabled
Open, Normal | Public | BUG

Description

When flush-connections is enabled in load-balancing, the wan_lb daemon executes the following commands whenever an interface changes state:

conntrack -F
conntrack -F expect

However, when conntrackd is running, the first command never finishes and hangs. As far as I can see, conntrack sends the command to the kernel but never receives an answer. This also leads to continuous CPU usage by both conntrack and conntrackd:

root      7792 81.7  1.0  24416  5332 ?        S<s  17:00   0:33 /usr/sbin/conntrackd -C /etc/conntrackd/conntrackd.conf
root      7829  7.0  0.1  12652   888 pts/1    R+   17:00   0:02 conntrack -F

This problem breaks load-balancing functionality.
Instead of flushing the table, we can delete its contents, which works without problems. I propose changing the command to the following to avoid the issue:

conntrack -D

Otherwise, we need to block the ability to enable both options at the same time.
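
The proposed switch from flush to delete could be sketched roughly as follows. This is only an illustration, not the actual wan_lb code: the function names `conntrack_clear_cmds` and `conntrackd_running` are hypothetical, and detecting conntrackd by process name with `pgrep` is an assumption. The expectation-table flush is left unchanged because the report above only says that the first command hangs.

```shell
#!/bin/sh
# Hypothetical helpers (not part of VyOS or wan_lb).

# Print the commands that should clear connection tracking state.
# $1 is "yes" when conntrackd is running, anything else otherwise.
conntrack_clear_cmds() {
    if [ "$1" = "yes" ]; then
        # Delete the entries instead of flushing the table;
        # this task reports that -D works where -F hangs.
        echo "conntrack -D"
    else
        echo "conntrack -F"
    fi
    # Kept as in the original sequence (assumption: unaffected by the hang).
    echo "conntrack -F expect"
}

# Report whether a process named exactly "conntrackd" exists.
conntrackd_running() {
    if pgrep -x conntrackd >/dev/null 2>&1; then
        echo "yes"
    else
        echo "no"
    fi
}
```

To actually run the selected commands, the output could be consumed line by line, for example `conntrack_clear_cmds "$(conntrackd_running)" | while read -r cmd; do $cmd; done`.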

Details

Difficulty level
Unknown (require assessment)
Version
1.2.0-rolling+201903210337
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Unspecified (please specify)

Event Timeline

syncer changed the task status from Open to Confirmed. Apr 17 2019, 7:41 PM
syncer assigned this task to hagbard.
syncer triaged this task as Normal priority.

@zsdc Can you please share some config data or clarify what you mean? thx

hagbard changed the task status from Confirmed to On hold. Apr 29 2019, 4:19 PM
hagbard added a subscriber: hagbard.

@syncer
Sorry to dredge up an old bug, but I believe I've hit this today on 1.2.7-LTS myself. Per @zsdc's original description, it seems that when you configure:

service {
    conntrack-sync {

and also configure:

load-balancing {
    wan {
        flush-connections

conntrackd will peg a CPU at 100% utilization immediately after commit, forever, and this also prevents wan load-balancing from starting. The only way I could see to recover was to delete the load-balance flush-connections configuration entry and reboot, which took me a minute, due to the system being very slow from a locked-up CPU core.

WAN load-balancing and conntrack-sync seem to cooperate fine as long as you do *not* enable flush-connections. But I *really* would like flush-connections to work along with my conntrack-sync. Is it possible?
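
Until the two features work together, the problematic combination described above can at least be detected before it is committed. The following is a minimal sketch under stated assumptions: `check_flush_conflict` is a hypothetical helper, it reads configuration text in the brace format shown above from standard input, and it uses plain substring matching rather than any VyOS configuration API, so it can report false positives.

```shell
#!/bin/sh
# Hypothetical check (not part of VyOS): read configuration text on stdin
# and print "conflict" when both conntrack-sync and flush-connections
# appear in it, otherwise print "ok".
check_flush_conflict() {
    cfg=$(cat)
    if printf '%s\n' "$cfg" | grep -q 'conntrack-sync' &&
       printf '%s\n' "$cfg" | grep -q 'flush-connections'; then
        echo "conflict"
    else
        echo "ok"
    fi
}
```

One possible use, assuming the saved configuration lives at `/config/config.boot`, is `check_flush_conflict < /config/config.boot`.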

dmbaturin set Is it a breaking change? to Unspecified (possibly destroys the router).

I have also hit this in the latest rolling version:

vyos@vyos:~$ show version

Version:          VyOS 1.4-rolling-202206130217
Release train:    sagitta

Built by:         [email protected]
Built on:         Mon 13 Jun 2022 02:17 UTC
Build UUID:       6c04f7fd-b1c8-4e5d-ab90-310ccfb016d5
Build commit ID:  001451a9c514e7

Architecture:     x86_64
Boot via:         installed image
System type:      KVM guest

Hardware vendor:  QEMU
Hardware model:   Standard PC (i440FX + PIIX, 1996)
Hardware S/N:     
Hardware UUID:    70fb4a55-e1f4-4593-83d0-189dcd06ec78

Copyright:        VyOS maintainers and contributors

I have to remove flush-connections, as suggested by @klipz, but this results in my slow backup internet connection being used longer than needed after the primary has recovered.

Viacheslav changed the task status from On hold to Needs testing. Jan 9 2023, 8:32 AM

@Viacheslav will you backport this to 1.3?

@syncer Not sure whether it works properly; it requires more testing and feedback.

Viacheslav reopened this task as Open.
dmbaturin set Issue type to Unspecified (please specify).