
Bond doesn't survive reboot
Closed, Resolved | Public | BUG

Description

I have a test setup:

interfaces {

bonding bond0 {
    description "Core: XXX"
    member {
        interface eth6
        interface eth7
    }
    mode 802.3ad
    mtu 9000
    vif 3331 {
        address x.y.z.a/29
    }
}

}

and

ethernet eth6 {
    hw-id ac:1f:6b:6d:16:74
    mtu 9000
    offload-options {
        generic-receive on
        generic-segmentation on
        scatter-gather on
        tcp-segmentation on
        udp-fragmentation on
    }
}
ethernet eth7 {
    hw-id ac:1f:6b:6d:16:75
    mtu 9000
    offload-options {
        generic-receive on
        generic-segmentation on
        scatter-gather on
        tcp-segmentation on
        udp-fragmentation on
    }
}

This setup doesn't survive a reboot. The system boots with the bond in the down state:

$ cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: down
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
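
The output above lists no "Slave Interface" sections and reports round-robin rather than the configured 802.3ad mode. A minimal sketch of checking for that state programmatically, assuming the usual "Key: value" layout of /proc/net/bonding/<bond>; the function name parse_bond_status and the SAMPLE constant are hypothetical and used only for illustration:

```python
def parse_bond_status(text):
    """Extract bonding mode, bond MII status and slave names from
    the text of /proc/net/bonding/<bond>."""
    info = {"mode": None, "mii_status": None, "slaves": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "Key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "MII Status" and info["mii_status"] is None:
            # The first "MII Status" line describes the bond itself;
            # later ones belong to the per-slave sections.
            info["mii_status"] = value
        elif key == "Slave Interface":
            info["slaves"].append(value)
    return info


# Output copied from the report above.
SAMPLE = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: down
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0
"""

status = parse_bond_status(SAMPLE)
if not status["slaves"]:
    print("bond has no enslaved interfaces, mode:", status["mode"])
```

On a live system the text would come from reading /proc/net/bonding/bond0 instead of the SAMPLE string.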

But:

vyos@vRouter3:~$ configure
vyos@vRouter3# delete interfaces bonding bond0 member interface eth7
vyos@vRouter3# delete interfaces bonding bond0 member interface eth6
vyos@vRouter3# commit
vyos@vRouter3# set interfaces bonding bond0 member interface eth6
vyos@vRouter3# set interfaces bonding bond0 member interface eth7
vyos@vRouter3# commit

Solves the problem.

It doesn't work in 1.3-rolling-202002101915 or 1.3-rolling-202002120218. A similar setup does work in 1.3-rolling-202001261918.

There are no visible errors in log files.

Details

Difficulty level
Normal (likely a few hours)
Version
1.3-rolling-202002120218
Why the issue appeared?
Design mistake
Is it a breaking change?
Perfectly compatible

Event Timeline

primoz created this task. Wed, Feb 12, 10:03 PM
Dmitry changed the task status from Open to Confirmed. Wed, Feb 12, 10:38 PM
c-po changed the task status from Confirmed to In progress. Sat, Feb 15, 1:11 PM
c-po claimed this task.
c-po added a comment. Sat, Feb 15, 7:44 PM

A faulty delta check in interfaces-bonding.py means the physical interfaces are never enslaved to the bond. As soon as they are enslaved manually, everything works (eth2 and eth3 in my case):

ip link set dev eth2 down
echo '+eth2' > /sys/class/net/bond0/bonding/slaves
ip link set dev eth3 down
echo '+eth3' > /sys/class/net/bond0/bonding/slaves
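
For illustration only, a minimal sketch of a member delta computed against the kernel's view of the bond rather than against a previous configuration. This is not the code from interfaces-bonding.py; the function names read_slaves, member_delta and plan_commands are hypothetical, and comparing against /sys/class/net/<bond>/bonding/slaves is an assumption about one way to stay correct after a reboot, when the bond starts with no slaves:

```python
from pathlib import Path


def read_slaves(bond, sysfs_root="/sys/class/net"):
    """Return the set of interfaces the kernel reports as enslaved to bond."""
    path = Path(sysfs_root) / bond / "bonding" / "slaves"
    return set(path.read_text().split())


def member_delta(configured, enslaved):
    """Return (to_add, to_remove) as sorted lists of interface names."""
    configured, enslaved = set(configured), set(enslaved)
    return sorted(configured - enslaved), sorted(enslaved - configured)


def plan_commands(bond, configured, enslaved):
    """Build, without running them, the shell commands that would make
    the bond membership match the configuration."""
    to_add, to_remove = member_delta(configured, enslaved)
    slaves_file = f"/sys/class/net/{bond}/bonding/slaves"
    cmds = []
    for ifname in to_remove:
        cmds.append(f"echo '-{ifname}' > {slaves_file}")
    for ifname in to_add:
        # An interface is taken down before it is enslaved,
        # as in the manual commands above.
        cmds.append(f"ip link set dev {ifname} down")
        cmds.append(f"echo '+{ifname}' > {slaves_file}")
    return cmds


# After a reboot the bond has no slaves, so every configured member is added.
for cmd in plan_commands("bond0", ["eth6", "eth7"], set()):
    print(cmd)
```

On a running system the third argument would be read_slaves("bond0") instead of an empty set.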
c-po closed this task as Resolved. Sun, Feb 16, 9:58 AM
c-po triaged this task as High priority.
c-po changed Difficulty level from Unknown (require assessment) to Normal (likely a few hours).
c-po changed Why the issue appeared? from Will be filled on close to Design mistake.
c-po changed Is it a breaking change? from Perfectly compatible to Perfectly compatible.
c-po added a comment. Sun, Feb 16, 10:39 AM

Thanks for reporting this nasty issue. The fix is included in the rolling release starting with build vyos-1.3-rolling-202002161021-amd64.iso.

This issue is not present in the 1.2 crux branch, as it was introduced in the new Python interface code.

pasik added a subscriber: pasik. Sun, Feb 16, 8:31 PM
c-po moved this task from Need Triage to Finished on the VyOS 1.3 Equuleus board. Mon, Feb 17, 6:57 PM