
Bond doesn't survive reboot
Closed, Resolved | Public | BUG

Description

I have a test setup:

interfaces {
    bonding bond0 {
        description "Core: XXX"
        member {
            interface eth6
            interface eth7
        }
        mode 802.3ad
        mtu 9000
        vif 3331 {
            address x.y.z.a/29
        }
    }
}

and

ethernet eth6 {
    hw-id ac:1f:6b:6d:16:74
    mtu 9000
    offload-options {
        generic-receive on
        generic-segmentation on
        scatter-gather on
        tcp-segmentation on
        udp-fragmentation on
    }
}
ethernet eth7 {
    hw-id ac:1f:6b:6d:16:75
    mtu 9000
    offload-options {
        generic-receive on
        generic-segmentation on
        scatter-gather on
        tcp-segmentation on
        udp-fragmentation on
    }
}

This setup doesn't survive a reboot. After booting, the bond is down (and reports round-robin mode rather than the configured 802.3ad):

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: down
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
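
The same check can be scripted. The following is a minimal sketch and not part of VyOS: the names parse_bonding_status and bond_looks_healthy are hypothetical, and SAMPLE is simply the output quoted in this report. It assumes that the kernel prints one "Slave Interface:" section per enslaved member, which is absent in the output above.

```python
SAMPLE = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: down
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
"""


def parse_bonding_status(text):
    """Summarise the text of /proc/net/bonding/<bond> (hypothetical helper)."""
    summary = {"mode": None, "mii_status": None, "slaves": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            summary["mode"] = value
        elif key == "MII Status" and summary["mii_status"] is None:
            # The first "MII Status" line describes the bond itself;
            # later ones belong to individual slave sections.
            summary["mii_status"] = value
        elif key == "Slave Interface":
            summary["slaves"].append(value)
    return summary


def bond_looks_healthy(summary, mode_fragment, expected_slaves):
    """True if mode matches, the bond is up, and all expected members are enslaved."""
    return (
        summary["mode"] is not None
        and mode_fragment in summary["mode"]
        and summary["mii_status"] == "up"
        and set(expected_slaves) <= set(summary["slaves"])
    )


status = parse_bonding_status(SAMPLE)
print(status)
print(bond_looks_healthy(status, "802.3ad", ["eth6", "eth7"]))
```

On a live system the text would come from reading /proc/net/bonding/bond0 instead of SAMPLE.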

However, removing the member interfaces and adding them back solves the problem:

vyos@vRouter3:~$ configure
vyos@vRouter3# delete interfaces bonding bond0 member interface eth7
vyos@vRouter3# delete interfaces bonding bond0 member interface eth6
vyos@vRouter3# commit
vyos@vRouter3# set interfaces bonding bond0 member interface eth6
vyos@vRouter3# set interfaces bonding bond0 member interface eth7
vyos@vRouter3# commit

The problem occurs in 1.3-rolling-202002101915 and 1.3-rolling-202002120218. A similar setup works in 1.3-rolling-202001261918.

There are no visible errors in the log files.

Details

Difficulty level
Normal (likely a few hours)
Version
1.3-rolling-202002120218
Why the issue appeared?
Design mistake
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)

Event Timeline

Unknown Object (User) changed the task status from Open to Confirmed. Feb 12 2020, 10:38 PM
c-po changed the task status from Confirmed to In progress. Feb 15 2020, 1:11 PM
c-po claimed this task.

A faulty delta check in interfaces-bonding.py means that the physical interfaces are never enslaved. As soon as they are enslaved manually, everything works (eth2 and eth3 in my case):

ip link set dev eth2 down
echo '+eth2' > /sys/class/net/bond0/bonding/slaves
ip link set dev eth3 down
echo '+eth3' > /sys/class/net/bond0/bonding/slaves
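
The manual steps above can also be expressed in Python. This is a minimal sketch under stated assumptions, not the code from interfaces-bonding.py: the function name enslave and its sysfs_root and run parameters are hypothetical and exist only so the logic can be exercised without touching a real system.

```python
import subprocess
from pathlib import Path


def enslave(bond, ifname, sysfs_root="/sys/class/net", run=subprocess.run):
    """Repeat the two manual steps for one interface (hypothetical helper)."""
    # Same order as the manual commands: bring the link down first ...
    run(["ip", "link", "set", "dev", ifname, "down"], check=True)
    # ... then write "+<ifname>" to the bond's sysfs slaves file,
    # which is what the echo command does.
    slaves_file = Path(sysfs_root) / bond / "bonding" / "slaves"
    slaves_file.write_text(f"+{ifname}\n")
```

On a real router this would be called as root, for example enslave("bond0", "eth2") followed by enslave("bond0", "eth3").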
c-po triaged this task as High priority.
c-po changed Difficulty level from Unknown (require assessment) to Normal (likely a few hours).
c-po changed Why the issue appeared? from Will be filled on close to Design mistake.
c-po changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.

Thanks for reporting this nasty issue. The fix is included in the rolling release starting with build vyos-1.3-rolling-202002161021-amd64.iso.

This issue is not present in the 1.2 (crux) branch, as it was introduced in the new Python interface code.
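
The faulty delta check itself is not shown in this task. As an illustration only, here is a hypothetical sketch of what a member delta has to compute; member_delta is an invented name and this is not the actual VyOS fix. It assumes the comparison is made against the interfaces currently enslaved, so that an empty bond after boot results in every configured member being added.

```python
def member_delta(configured, enslaved):
    """Return (to_add, to_remove) for a bond (hypothetical helper).

    configured: member interfaces named in the configuration
    enslaved:   interfaces currently enslaved to the bond
    """
    configured, enslaved = set(configured), set(enslaved)
    to_add = sorted(configured - enslaved)
    to_remove = sorted(enslaved - configured)
    return to_add, to_remove


# After a reboot nothing is enslaved yet, so both members must be added.
print(member_delta(["eth6", "eth7"], []))
```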

erkin set Issue type to Bug (incorrect behavior). Aug 31 2021, 5:43 PM