
Bonded interfaces get updated with incorrect hw-id in config.
Closed, Resolved | Public | BUG

Description

When eth1 and eth2 are bonded, both hw-id entries are occasionally set to the same MAC in the config, causing eth2 to not function on the next reboot.

Where does VyOS look to get the MAC? At least with r8169, there appears to be no way to read the original MAC after the interface is bonded or the MAC is changed.
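
One possible source for the original address, offered as an assumption and not as what VyOS actually reads, is the Linux bonding driver's status file /proc/net/bonding/<bond>, which lists a "Permanent HW addr" for each slave. Below is a minimal Python sketch that parses such text. The sample content, the MAC values and the function name `permanent_addrs` are illustrative only.

```python
def permanent_addrs(bonding_text):
    """Map each slave interface to its 'Permanent HW addr' value.

    bonding_text is the content of /proc/net/bonding/<bond> (assumed layout).
    """
    result = {}
    current = None
    for line in bonding_text.splitlines():
        # Split on the first colon only, so the MAC itself stays intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "Permanent HW addr" and current is not None:
            result[current] = value.lower()
    return result


# Illustrative sample, not taken from the affected router.
SAMPLE_BONDING = """\
Bonding Mode: load balancing (round-robin)
MII Status: up

Slave Interface: eth1
MII Status: up
Permanent HW addr: 02:00:00:00:00:01

Slave Interface: eth2
MII Status: up
Permanent HW addr: 02:00:00:00:00:02
"""

if __name__ == "__main__":
    for name, mac in permanent_addrs(SAMPLE_BONDING).items():
        print(name, mac)
```

On a live system the text would come from reading the file for the bond in question, for example /proc/net/bonding/bond0.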

Made this temp fix so the router boots correctly.
vyos-preconfig-bootup.script:
sed -i.bak '/00:e0:4c:68:0a/d' /config/config.boot
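
A less destructive alternative to deleting lines might be to detect duplicated hw-id values first. The sketch below assumes the config.boot layout quoted later in this task (`ethernet ethN { hw-id ... }` blocks). The function name `duplicate_hw_ids` and the sample MACs are made up for illustration.

```python
import re
from collections import defaultdict

# Patterns for the config.boot layout assumed above.
ETH_RE = re.compile(r"^\s*ethernet\s+(\S+)\s*\{\s*$")
HW_ID_RE = re.compile(r"^\s*hw-id\s+([0-9A-Fa-f:]{17})\s*$")


def duplicate_hw_ids(config_text):
    """Return {mac: [interface, ...]} for every hw-id used more than once."""
    owners = defaultdict(list)
    current = None
    for line in config_text.splitlines():
        m = ETH_RE.match(line)
        if m:
            current = m.group(1)  # remember which ethernet block we are in
            continue
        m = HW_ID_RE.match(line)
        if m and current is not None:
            owners[m.group(1).lower()].append(current)
    return {mac: names for mac, names in owners.items() if len(names) > 1}


# Hypothetical config fragment with eth1 and eth2 sharing one hw-id.
SAMPLE_CONFIG = """\
interfaces {
    ethernet eth1 {
        hw-id 02:00:00:00:00:01
    }
    ethernet eth2 {
        hw-id 02:00:00:00:00:01
    }
    ethernet eth3 {
        hw-id 02:00:00:00:00:03
    }
}
"""

if __name__ == "__main__":
    print(duplicate_hw_ids(SAMPLE_CONFIG))
```

This only reports the conflict. Deciding which entry to correct still needs the real hardware address of each NIC.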

Details

Difficulty level
Hard (possibly days)
Version
1.2.0-rolling+201808021639
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

mb300sd created this task. Aug 7 2018, 2:48 PM
syncer triaged this task as Low priority. Sep 1 2018, 2:55 PM
pasik added a subscriber: pasik. Oct 1 2018, 9:51 AM
syncer assigned this task to hagbard. Feb 8 2019, 12:09 AM
syncer raised the priority of this task from Low to Normal.
hagbard changed the task status from Open to Confirmed. May 6 2019, 9:34 PM
hagbard changed the task status from Confirmed to On hold. Sep 4 2019, 10:26 PM

@mb300sd can you please test with the latest rolling image and see if the issue still exists?

@hagbard I no longer have the hardware the issue was found on, or anything else with identical interfaces to bond at the moment.

No worries, I checked it out, the issue still persists but is not easily fixable.

c-po added a subscriber: c-po. Sep 5 2019, 4:18 PM

As the bonding interface has been completely rewritten, this should not be an issue, since I do not touch the underlying interfaces' MAC addresses.

hagbard changed the task status from On hold to In progress. Sep 5 2019, 4:47 PM
hagbard set Is it a breaking change? to Unspecified (possibly destroys the router).

@c-po vyos config does touch it via a perl script. I have a patch ready today for it.

To reproduce:

set interfaces bonding bond0 address '10.100.100.1/24'
set interfaces bonding bond0 member interface 'eth2'
set interfaces bonding bond0 member interface 'eth0'
set interfaces bonding bond0 mode 'round-robin'

Save and reboot.

c-po added a comment. Sep 5 2019, 5:43 PM

Huh? Which perl script?

hagbard added a comment. Edited Sep 5 2019, 6:13 PM
/opt/vyatta/sbin/vyatta-interfaces.pl

It has nothing to do with your rewrite; it is the legacy code which sets up the ethernet interfaces. Bond runs first, then ethernet runs and changes the MAC address of the bond member interface, and that's the issue.

So there are 2 issues, as I found out, and I have fixed one so far. `/opt/vyatta/sbin/vyatta-interfaces.pl` has been fixed: if it's called for an interface that is a bond member, it ignores hw-id; otherwise the legacy code just continues as before.
That helps with config changes and a cold boot. A reboot, however, brings in another issue. Before the system goes down it compares MAC addresses and sorts them. The bond is still active and 2 eth interfaces have the same MAC, which confuses `/lib/udev/vyatta_net_name`.

/run/udev/log/vyatta-net-name.coldplug:

Thu Sep  5 19:34:42 2019: lookup eth2 08:00:27:05:c4:22
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:05:c4:22 in config mapped to 'eth2' <- that was eth0, a bond member interface which won't be found now anymore
Thu Sep  5 19:34:43 2019: lookup eth1 08:00:27:34:8d:72
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:34:8d:72 in config mapped to 'eth1'
Thu Sep  5 19:34:43 2019: lookup eth0 08:00:27:05:c4:22
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:05:c4:22 in config mapped to 'eth2'
Thu Sep  5 19:34:43 2019: lookup eth3 08:00:27:b9:3b:e2
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:b9:3b:e2 in config mapped to 'eth3'
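
The collision in the log above can be spotted mechanically. The following Python sketch assumes only the `lookup <name> <mac>` line format shown in this log. The function name `shared_macs` is hypothetical and is not part of vyatta_net_name.

```python
import re
from collections import defaultdict

# Matches the "<timestamp>: lookup <name> <mac>" lines shown in the log.
LOOKUP_RE = re.compile(r": lookup (\S+) ([0-9a-f:]{17})\s*$")


def shared_macs(log_text):
    """Return {mac: [kernel names]} for MACs looked up under several names."""
    seen = defaultdict(list)
    for line in log_text.splitlines():
        m = LOOKUP_RE.search(line)
        if m:
            name, mac = m.groups()
            if name not in seen[mac]:
                seen[mac].append(name)
    return {mac: names for mac, names in seen.items() if len(names) > 1}


# Lines copied from the coldplug log above (annotation removed).
LOG = """\
Thu Sep  5 19:34:42 2019: lookup eth2 08:00:27:05:c4:22
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:05:c4:22 in config mapped to 'eth2'
Thu Sep  5 19:34:43 2019: lookup eth1 08:00:27:34:8d:72
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:34:8d:72 in config mapped to 'eth1'
Thu Sep  5 19:34:43 2019: lookup eth0 08:00:27:05:c4:22
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:05:c4:22 in config mapped to 'eth2'
Thu Sep  5 19:34:43 2019: lookup eth3 08:00:27:b9:3b:e2
Thu Sep  5 19:34:43 2019: use hw-id 08:00:27:b9:3b:e2 in config mapped to 'eth3'
"""

if __name__ == "__main__":
    print(shared_macs(LOG))
```

For this log the only shared address is 08:00:27:05:c4:22, seen under both eth2 and eth0.
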
hagbard changed Difficulty level from Unknown (require assessment) to Hard (possibly days). Sep 5 2019, 9:38 PM
hagbard changed the task status from In progress to On hold. Sep 6 2019, 6:44 PM

Confirmed, same issue in 1.2.2

To test on 1.2.2, the syntax is a bit different:

set interfaces bonding bond0 address '10.100.100.1/24'
set interfaces bonding bond0 hash-policy 'layer2'
set interfaces bonding bond0 mode 'round-robin'
set interfaces ethernet eth0 bond-group 'bond0'
set interfaces ethernet eth2 bond-group 'bond0'

Then call reboot and eth0 will be gone.

Fri Sep  6 18:48:34 2019: lookup eth2 08:00:27:2d:3b:c9
Fri Sep  6 18:48:34 2019: use hw-id 08:00:27:2d:3b:c9 in config mapped to 'eth2' <-- that was eth0
Fri Sep  6 18:48:34 2019: lookup eth0 08:00:27:2d:3b:c9
Fri Sep  6 18:48:34 2019: use hw-id 08:00:27:2d:3b:c9 in config mapped to 'eth2'
Fri Sep  6 18:48:34 2019: lookup eth3 08:00:27:74:f7:14
hagbard changed the task status from On hold to In progress. Sep 6 2019, 7:03 PM
hagbard changed the task status from In progress to Confirmed.
hagbard added a project: VyOS 1.2 Crux.
hagbard changed the task status from Confirmed to In progress. Sep 11 2019, 5:16 PM
syncer changed the task status from In progress to Needs testing. Nov 16 2019, 11:44 PM
syncer reassigned this task from hagbard to Dmitry.
syncer edited projects, added VyOS 1.2 Crux (VyOS 1.2.4); removed VyOS 1.2 Crux.
syncer added a subscriber: hagbard.
Dmitry added a comment. Dec 8 2019, 7:51 PM

I can't reproduce this issue on 1.2.3 and 1.2/1.3 rolling. @hagbard can you test again too?

@Dmitry Tested it with the latest 1.2 rolling, the issue is still present.

https://downloads.vyos.io/rolling/current/amd64/vyos-1.2-rolling-201912100217-amd64.iso

Here is my entire config. eth2 just disappeared because the bonding driver had given it the same MAC as eth3.

set interfaces ethernet eth0 address 'dhcp'
set interfaces ethernet eth0 hw-id '08:00:27:79:e4:b3'
set interfaces ethernet eth1 address '10.1.1.12/24'
set interfaces ethernet eth1 hw-id '08:00:27:9d:92:e1'
set interfaces ethernet eth3 hw-id '08:00:27:e9:45:1c'
set interfaces loopback lo
set protocols static
set service ssh
set system config-management commit-revisions '100'
set system host-name '1-2-latest'
set system login user vyos authentication encrypted-password '$6$qXcOmR.AMCuZv$WXXa0HXcJaNb4fcRymUMpTzBYxb3QCoLWWlgPgiTbRGa2GaS1R1qtI4hpscSJnA/4AnkJhJ8XYj15XjeCHjL61'
set system login user vyos authentication plaintext-password ''
set system login user vyos level 'admin'
set system ntp server 0.pool.ntp.org
set system ntp server 1.pool.ntp.org
set system ntp server 2.pool.ntp.org
set system proxy port '8080'
set system proxy url 'http://159.249.136.149'
set system syslog global facility all level 'info'
set system syslog global facility protocols level 'debug'
set system time-zone 'UTC'

So the above fails when eth2 is supposed to be renamed. The boot config, however, still has all the correct data:

interfaces {
    bonding bond0 {
        address 10.100.100.1/29
        member {
            interface eth3
            interface eth2
        }
    }
    ethernet eth0 {
        address dhcp
        hw-id 08:00:27:79:e4:b3
    }
    ethernet eth1 {
        address 10.1.1.12/24
        hw-id 08:00:27:9d:92:e1
    }
    ethernet eth2 {
        hw-id 08:00:27:19:08:fd
    }
    ethernet eth3 {
        hw-id 08:00:27:e9:45:1c
    }
    loopback lo {
    }
}
protocols {
    static {
    }
}
service {
    ssh {
    }
}
system {
    config-management {
        commit-revisions 100
    }
    host-name 1-2-latest
    login {
        user vyos {
            authentication {
                encrypted-password $6$qXcOmR.AMCuZv$WXXa0HXcJaNb4fcRymUMpTzBYxb3QCoLWWlgPgiTbRGa2GaS1R1qtI4hpscSJnA/4AnkJhJ8XYj15XjeCHjL61
                plaintext-password ""
            }
            level admin
        }
    }
    ntp {
        server 0.pool.ntp.org {
        }
        server 1.pool.ntp.org {
        }
        server 2.pool.ntp.org {
        }
    }
    proxy {
        port 8080
        url http://159.249.136.149
    }
    syslog {
        global {
            facility all {
                level info
            }
            facility protocols {
                level debug
            }
        }
    }
    time-zone UTC
}


/* Warning: Do not remove the following line. */
/* === vyatta-config-version: "broadcast-relay@1:cluster@1:config-management@1:conntrack@1:conntrack-sync@1:dhcp-relay@2:dhcp-server@5:dns-forwarding@2:firewall@5:interfaces@4:ipsec@5:l2tp@1:mdns@1:nat@4:ntp@1:pptp@1:qos@1:quagga@4:snmp@1:ssh@1:system@12:vrrp@2:vyos-accel-ppp@2:wanloadbalance@3:webgui@1:webproxy@2:zone-policy@1" === */
/* Release version: 1.2-rolling-201912100217 */

When the bond driver enslaves the physical interfaces, it takes the MAC address of the first one and sets it on all enslaved NICs. /lib/udev/vyatta_net_name then finds multiple NIC names (every slave after the first one, which was processed successfully) with the same MAC address and tries to rename each of them to the name which already exists, in our example eth2. Udevd can't do that and stays in that state. I was investigating whether we can get a status back from that event, which we could use to skip processing these interfaces via /lib/udev/vyatta_net_name. That would leave everything else intact. Unfortunately, I didn't have the time to work on that.
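
The idea of skipping ambiguous interfaces can be illustrated with a small Python sketch. It is not how /lib/udev/vyatta_net_name is implemented. `plan_renames` and its inputs are hypothetical, and the addresses are the ones from the Sep 5 coldplug log in this task.

```python
from collections import Counter


def plan_renames(kernel_nics, hw_id_map):
    """Decide renames, skipping NICs whose current MAC is not unique.

    kernel_nics: list of (kernel_name, current_mac) pairs
    hw_id_map:   {mac: configured_name} taken from the hw-id entries
    Returns (renames, skipped).
    """
    mac_counts = Counter(mac for _, mac in kernel_nics)
    renames, skipped = {}, []
    for name, mac in kernel_nics:
        if mac_counts[mac] > 1:
            # Several NICs carry this MAC (bond slaves): the hw-id lookup
            # cannot tell them apart, so leave the name untouched.
            skipped.append(name)
            continue
        target = hw_id_map.get(mac)
        if target is not None:
            renames[name] = target
    return renames, skipped


NICS = [
    ("eth2", "08:00:27:05:c4:22"),
    ("eth1", "08:00:27:34:8d:72"),
    ("eth0", "08:00:27:05:c4:22"),
    ("eth3", "08:00:27:b9:3b:e2"),
]
HW_IDS = {
    "08:00:27:05:c4:22": "eth2",
    "08:00:27:34:8d:72": "eth1",
    "08:00:27:b9:3b:e2": "eth3",
}

if __name__ == "__main__":
    renames, skipped = plan_renames(NICS, HW_IDS)
    print("rename:", renames)
    print("skipped:", skipped)
```

With this input eth1 and eth3 keep their configured names, while eth2 and eth0 are left alone instead of both being mapped to eth2.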

If possible, please explain to me in a private message how I can trigger this behaviour. It does not reproduce in my lab.

vyos@vyos# run show version | grep Version
Version:          VyOS 1.2.3
[edit]
vyos@vyos# set interfaces bonding bond0 address '10.100.100.1/24'
[edit]
vyos@vyos# set interfaces bonding bond0 hash-policy 'layer2'
[edit]
vyos@vyos# set interfaces bonding bond0 mode 'round-robin'
[edit]
vyos@vyos# set interfaces ethernet eth0 bond-group 'bond0'
[edit]
vyos@vyos# set interfaces ethernet eth2 bond-group 'bond0'
[edit]
vyos@vyos# commit
[edit]
vyos@vyos# save 
Saving configuration to '/config/config.boot'...

After the reboot, all interfaces are present.

vyos@vyos:~$ show interfaces 
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
bond0            10.100.100.1/24                   u/u  
eth0             -                                 u/u  
eth1             -                                 u/u  
eth2             -                                 u/u  
eth3             -                                 u/u  
lo               127.0.0.1/8                       u/u  
                 ::1/128
                 
vyos@vyos:~$ sudo cat /run/udev/log/vyatta-net-name.coldplug 
Thu Jan 23 13:56:55 2020: lookup eth0 50:00:00:03:00:00
Thu Jan 23 13:56:55 2020: use hw-id 50:00:00:03:00:00 in config mapped to 'eth0'
Thu Jan 23 13:56:55 2020: lookup eth1 50:00:00:03:00:01
Thu Jan 23 13:56:55 2020: use hw-id 50:00:00:03:00:01 in config mapped to 'eth1'
Thu Jan 23 13:56:55 2020: lookup eth2 50:00:00:03:00:02
Thu Jan 23 13:56:55 2020: use hw-id 50:00:00:03:00:02 in config mapped to 'eth2'
Thu Jan 23 13:56:55 2020: lookup eth3 50:00:00:03:00:03
Thu Jan 23 13:56:55 2020: use hw-id 50:00:00:03:00:03 in config mapped to 'eth3'

@hagbard Can you check it with the latest rolling?

c-po added a comment. Aug 3 2020, 3:34 PM

Works as expected

c-po closed this task as Resolved. Aug 3 2020, 3:34 PM
c-po moved this task from Need Triage to Finished on the VyOS 1.3 Equuleus board. Aug 4 2020, 6:09 AM