
Upgrade from 1.2.5 to 1.3-rolling-202005261512 results in broken network config on second boot
Open, High | Public | BUG

Description

The upgrade itself works fine, and the router comes up correctly on the first boot. After any subsequent reboot, the network configuration is broken.

Snippet from config.boot pre-upgrade:

interfaces {
    bridge br0 {
        address 193.111.111.1/28
        address 2001:111:111:3::1/64
        aging 300
        hello-time 2
        ipv6 {
            dup-addr-detect-transmits 1
            router-advert {
                cur-hop-limit 64
                link-mtu 0
                managed-flag false
                max-interval 600
                other-config-flag false
                prefix 2001:111:111:3::/64 {
                    autonomous-flag true
                    on-link-flag true
                    valid-lifetime 2592000
                }
                reachable-time 0
                retrans-timer 0
                send-advert true
            }
        }
        max-age 20
        policy {
            route dmz
        }
        priority 32768
        stp false
    }
    ethernet eth1 {
        address dhcp
        duplex auto
        hw-id 00:e0:67:16:d8:ac
        smp-affinity auto
        speed auto
    }
    ethernet eth2 {
        bridge-group {
            bridge br0
        }
        duplex auto
        hw-id 00:e0:67:16:d8:ad
        smp-affinity auto
        speed auto
    }
    ethernet eth3 {
        bridge-group {
            bridge br0
        }
        duplex auto
        hw-id 00:e0:67:16:d8:ae
        smp-affinity auto
        speed auto
    }
    ethernet eth4 {
        bridge-group {
            bridge br0
        }
        duplex auto
        hw-id 00:e0:67:16:d8:af
        smp-affinity auto
        speed auto
    }
    loopback lo {
    }
    wireguard wg1 {
        address 193.111.111.242/30
        address 2001:111:111:a4::2/64
        description c1
        peer c1 {
            allowed-ips 0.0.0.0/0
            allowed-ips ::/0
            endpoint 193.111.111.227:12346
            pubkey 111
        }
    }
}

First boot into 1.3. Networking is OK:

danhusan@portabel:~$ show interfaces 
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
br0              193.111.111.1/28                   u/u  
                 2001:111:111:3::1/64
eth1             10.88.89.45/24                    u/u  
eth2             -                                 u/D  
eth3             -                                 u/u  
eth4             -                                 u/D  
lo               127.0.0.1/8                       u/u  
                 ::1/128
wg1              193.111.111.242/30                 u/u  c1 
                 2001:111:111:a4::2/64

Snippet from config.boot after the first boot. Some of the syntax has clearly changed:

interfaces {
    bridge br0 {
        member {
            interface             eth4 { }
            interface             eth3 { }
            interface             eth2 { }
        }
        address "193.111.111.1/28"
        address "2001:111:111:3::1/64"
        aging "300"
        hello-time "2"
        ipv6 {
            dup-addr-detect-transmits "1"
        }
        max-age "20"
        policy {
            route "dmz"
        }
        priority "32768"
    }
    ethernet eth1 {
        address "dhcp"
        duplex "auto"
        hw-id "00:e0:67:16:d8:ac"
        smp-affinity "auto"
        speed "auto"
    }
    ethernet eth2 {
        duplex "auto"
        hw-id "00:e0:67:16:d8:ad"
        smp-affinity "auto"
        speed "auto"
    }
    ethernet eth3 {
        duplex "auto"
        hw-id "00:e0:67:16:d8:ae"
        smp-affinity "auto"
        speed "auto"
    }
    ethernet eth4 {
        duplex "auto"
        hw-id "00:e0:67:16:d8:af"
        smp-affinity "auto"
        speed "auto"
    }
    loopback     lo { }
    wireguard wg1 {
        address "193.111.111.242/30"
        address "2001:111:111:a4::2/64"
        description "c1"
        peer c1 {
            port "12346"
            address "193.111.111.227"
            allowed-ips "0.0.0.0/0"
            allowed-ips "::/0"
            pubkey "111"
        }
    }
}

Rebooting again, without making any changes, committing, or saving, results in broken networking:

[ 104.736589] vyos-router[764]: Starting VyOS router: migrate rl-system firewall configure failed!
[ 104.991641] vyos-config[816]: Configuration error

danhusan@portabel:~$ show interfaces 
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
eth0             -                                 u/u  
eth1             -                                 u/D  
eth2             -                                 u/u  
eth3             -                                 u/D  
lo               127.0.0.1/8                       u/u  
                 ::1/128
wg1              193.111.111.242/30                 u/u  c1 
                 2001:111:111:a4::2/64

config.boot has also been corrupted. Note the duplicated hw-id lines and the added eth0.

interfaces {
    bridge br0 {
        member {
            interface             eth4
            interface             eth3
            interface             eth2
        }
        address "193.111.111.1/28"
        address "2001:111:111:3::1/64"
        aging "300"
        hello-time "2"
        ipv6 {
            dup-addr-detect-transmits "1"
        }
        max-age "20"
        policy {
            route "dmz"
        }
        priority "32768"
    }
    ethernet eth1 {
        address "dhcp"
        duplex "auto"
        hw-id "00:e0:67:16:d8:ac"
        smp-affinity "auto"
        speed "auto"
        hw-id 00:e0:67:16:d8:ad
    }
    ethernet eth2 {
        duplex "auto"
        hw-id "00:e0:67:16:d8:ad"
        smp-affinity "auto"
        speed "auto"
        hw-id 00:e0:67:16:d8:ae
    }
    ethernet eth3 {
        duplex "auto"
        hw-id "00:e0:67:16:d8:ae"
        smp-affinity "auto"
        speed "auto"
        hw-id 00:e0:67:16:d8:af
    }
    ethernet eth4 {
        duplex "auto"
        hw-id "00:e0:67:16:d8:af"
        smp-affinity "auto"
        speed "auto"
    }
    loopback     lo
    wireguard wg1 {
        address "193.111.111.242/30"
        address "2001:111:111:a4::2/64"
        description "c1"
        peer c1 {
            port "12346"
            address "193.111.111.227"
            allowed-ips "0.0.0.0/0"
            allowed-ips "::/0"
            pubkey "111"
        }
    }
    ethernet eth0 {
        hw-id 00:e0:67:16:d8:ac
    }
}
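
The duplicated hw-id lines above can be found mechanically. The following is a minimal Python sketch, not part of VyOS: the function names `find_hw_ids` and `conflicting_hw_ids` are hypothetical, and the parser only assumes the brace-delimited layout shown in the snippets in this report (it ignores braces inside quoted values).

```python
import re

# Shortened sample in the same layout as the broken config.boot above.
SAMPLE = '''
interfaces {
    ethernet eth1 {
        address "dhcp"
        hw-id "00:e0:67:16:d8:ac"
        speed "auto"
        hw-id 00:e0:67:16:d8:ad
    }
    ethernet eth0 {
        hw-id 00:e0:67:16:d8:ac
    }
}
'''

def find_hw_ids(config_text):
    """Map each ethernet interface name to the hw-id values found in its block."""
    hw_ids = {}
    current, depth = None, 0
    for raw in config_text.splitlines():
        line = raw.strip()
        if current is None:
            # Look for the start of an "ethernet ethN {" block.
            m = re.match(r'ethernet\s+(\S+)\s*\{$', line)
            if m:
                current, depth = m.group(1), 1
                hw_ids.setdefault(current, [])
            continue
        # Track nesting so that sub-blocks (e.g. "ipv6 { ... }") do not end the interface.
        depth += line.count('{') - line.count('}')
        if depth <= 0:
            current = None
            continue
        # Accept both the quoted and the unquoted form of the value.
        m = re.match(r'hw-id\s+"?([0-9A-Fa-f:]+)"?$', line)
        if m and depth == 1:
            hw_ids[current].append(m.group(1).lower())
    return hw_ids

def conflicting_hw_ids(hw_ids):
    """Return only the interfaces that carry more than one distinct hw-id."""
    return {name: ids for name, ids in hw_ids.items() if len(set(ids)) > 1}
```

Calling `conflicting_hw_ids(find_hw_ids(SAMPLE))` reports eth1, because its block carries two different MAC addresses. An interface where both hw-id lines hold the same value is not reported.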

Details

Difficulty level: Unknown (require assessment)
Version: 1.3-rolling-202005261512
Why the issue appeared? Implementation mistake
Is it a breaking change? Config syntax change (migratable)

Event Timeline

danhusan created this task. May 26 2020, 7:04 PM
c-po added a subscriber: c-po. May 26 2020, 8:56 PM

Why is there no eth0 on VyOS 1.2.5?

c-po added a comment. May 26 2020, 9:08 PM

I tried to reproduce this with your configuration and had to delete the policy statement, as I have no policy configured. Could you boot your system with the vyos-config-debug option and share the output, or send the full config.boot?

To pass the parameter, stop at the GRUB menu, press e to edit the boot entry, add vyos-config-debug to the kernel boot parameters, then press F10 to boot.

More information: https://docs.vyos.io/en/latest/contributing/development.html#kernel-boot-parameters

c-po changed the task status from Open to On hold. May 26 2020, 9:10 PM
c-po claimed this task.
danhusan added a comment. Edited May 27 2020, 6:07 AM
In T2523#65310, @c-po wrote:

Why is there no eth0 on VyOS 1.2.5?

I manually edited config.boot to use eth1->4 instead of eth0->3 so that it correctly reflects the interface naming of the appliance.
Should I run the debug boot during the first boot, the second boot, or both?

Sanitized config (pre-upgrade):

set firewall all-ping 'enable'
set firewall broadcast-ping 'disable'
set firewall config-trap 'disable'
set firewall ipv6-receive-redirects 'disable'
set firewall ipv6-src-route 'disable'
set firewall ip-src-route 'disable'
set firewall log-martians 'enable'
set firewall options interface wg1 adjust-mss '1372'
set firewall options interface wg1 adjust-mss6 '1280'
set firewall receive-redirects 'disable'
set firewall send-redirects 'enable'
set firewall source-validation 'disable'
set firewall syn-cookies 'enable'
set firewall twa-hazards-protection 'disable'
set interfaces bridge br0 address '193.111.111.1/28'
set interfaces bridge br0 address '2001:111:111:3::1/64'
set interfaces bridge br0 aging '300'
set interfaces bridge br0 hello-time '2'
set interfaces bridge br0 ipv6 dup-addr-detect-transmits '1'
set interfaces bridge br0 ipv6 router-advert cur-hop-limit '64'
set interfaces bridge br0 ipv6 router-advert link-mtu '0'
set interfaces bridge br0 ipv6 router-advert managed-flag 'false'
set interfaces bridge br0 ipv6 router-advert max-interval '600'
set interfaces bridge br0 ipv6 router-advert other-config-flag 'false'
set interfaces bridge br0 ipv6 router-advert prefix 2001:111:111:3::/64 autonomous-flag 'true'
set interfaces bridge br0 ipv6 router-advert prefix 2001:111:111:3::/64 on-link-flag 'true'
set interfaces bridge br0 ipv6 router-advert prefix 2001:111:111:3::/64 valid-lifetime '2592000'
set interfaces bridge br0 ipv6 router-advert reachable-time '0'
set interfaces bridge br0 ipv6 router-advert retrans-timer '0'
set interfaces bridge br0 ipv6 router-advert send-advert 'true'
set interfaces bridge br0 max-age '20'
set interfaces bridge br0 policy route 'dmz'
set interfaces bridge br0 priority '32768'
set interfaces bridge br0 stp 'false'
set interfaces ethernet eth1 address 'dhcp'
set interfaces ethernet eth1 duplex 'auto'
set interfaces ethernet eth1 hw-id '00:e0:67:16:d8:ac'
set interfaces ethernet eth1 smp-affinity 'auto'
set interfaces ethernet eth1 speed 'auto'
set interfaces ethernet eth2 bridge-group bridge 'br0'
set interfaces ethernet eth2 duplex 'auto'
set interfaces ethernet eth2 hw-id '00:e0:67:16:d8:ad'
set interfaces ethernet eth2 smp-affinity 'auto'
set interfaces ethernet eth2 speed 'auto'
set interfaces ethernet eth3 bridge-group bridge 'br0'
set interfaces ethernet eth3 duplex 'auto'
set interfaces ethernet eth3 hw-id '00:e0:67:16:d8:ae'
set interfaces ethernet eth3 smp-affinity 'auto'
set interfaces ethernet eth3 speed 'auto'
set interfaces ethernet eth4 bridge-group bridge 'br0'
set interfaces ethernet eth4 duplex 'auto'
set interfaces ethernet eth4 hw-id '00:e0:67:16:d8:af'
set interfaces ethernet eth4 smp-affinity 'auto'
set interfaces ethernet eth4 speed 'auto'
set interfaces loopback lo
set interfaces wireguard wg1 address '193.111.111.242/30'
set interfaces wireguard wg1 address '2001:111:111:a4::2/64'
set interfaces wireguard wg1 description 'c1'
set interfaces wireguard wg1 peer c1 allowed-ips '0.0.0.0/0'
set interfaces wireguard wg1 peer c1 allowed-ips '::/0'
set interfaces wireguard wg1 peer c1 endpoint '193.111.111.227:12346'
set interfaces wireguard wg1 peer c1 pubkey '11111111111111111111111111111111'
set policy route dmz rule 1 set table '10'
set policy route dmz rule 1 source address '193.111.111.0/28'
set protocols static route6 ::/0 next-hop 2001:111:111:a4::1
set protocols static table 10 route 0.0.0.0/0 next-hop 193.111.111.241
set service dhcp-server shared-network-name dmz subnet 193.111.111.0/28 default-router '193.111.111.1'
set service dhcp-server shared-network-name dmz subnet 193.111.111.0/28 dns-server '1.1.1.1'
set service dhcp-server shared-network-name dmz subnet 193.111.111.0/28 dns-server '1.0.0.1'
set service dhcp-server shared-network-name dmz subnet 193.111.111.0/28 range a start '193.111.111.4'
set service dhcp-server shared-network-name dmz subnet 193.111.111.0/28 range a stop '193.111.111.10'
set service ssh port '22'
set system config-management commit-revisions '100'
set system console device ttyS0 speed '9600'
set system host-name 'portabel'
set system login user vyos authentication plaintext-password 'vyos'
set system login user vyos level 'admin'
set system name-server '1.1.1.1'
set system ntp server 0.pool.ntp.org
set system ntp server 1.pool.ntp.org
set system ntp server 2.pool.ntp.org
set system syslog global facility all level 'info'
set system syslog global facility protocols level 'debug'
set system time-zone 'UTC'

Why is hw-id listed twice for eth1, eth2, and eth3?

ethernet eth1 {
    address "dhcp"
    duplex "auto"
    hw-id "00:e0:67:16:d8:ac"
    smp-affinity "auto"
    speed "auto"
    hw-id 00:e0:67:16:d8:ad
}

danhusan added a comment. Edited May 27 2020, 10:07 AM

VyOS added those during the second boot after the upgrade. I assume the bug is related to the fact that my config doesn't include eth0.

pasik added a subscriber: pasik. May 27 2020, 5:13 PM

Reworking my config so that it uses eth0->eth3 instead of eth1->eth4 makes everything work as expected. So something has clearly changed regarding the interface naming/creation logic.

c-po added a comment. May 28 2020, 3:39 PM

Yes, there have been issues with interface naming in the past. Hopefully they are finally resolved in 1.3.

Closing this. Please reopen if the problem happens again.

c-po closed this task as Invalid. May 28 2020, 3:39 PM
c-po moved this task from Need Triage to Backlog on the VyOS 1.3 Equuleus board. Jun 2 2020, 5:48 PM
c-po moved this task from Backlog to Finished on the VyOS 1.3 Equuleus board. Jun 2 2020, 7:36 PM
jjakob added a subscriber: jjakob. Thu, Jun 25, 2:18 PM

I think I ran into this today after upgrading from 1.3-rolling-202006110117 to 1.3-rolling-202006241940. My config had eth1-eth3 (those were the default names created by a previous install of 1.3 sometime around May), and those worked fine across numerous reboots before this upgrade. On the first reboot after adding the new image, everything was fine. On the second reboot (actually a power outage), the interfaces were eth0-eth2 on the system but eth1-eth3 in the config, so the config load failed.

This is after entering config mode and loading config.boot:

vyos@vyos# show interfaces 
 ethernet eth0 {
     hw-id xx:b4
 }
 ethernet eth1 {
     address dhcp
     address dhcpv6
     hw-id xx:b5
     ipv6 {
         address {
             autoconf
         }
     }
 }
 ethernet eth2 {
     hw-id xx:b6
 }
+ethernet eth3 {
+    hw-id xx:b6
+}
 loopback lo {
 }
[edit]

As can be seen, eth3 should be :b6, but eth2 has that MAC, so all interfaces were shifted by one (in config.boot, :b4 was eth1, :b5 was eth2, and :b6 was eth3).

Rebooting without doing anything about the failed load has now baked the eth0-eth2 names into config.boot. So something effectively decided to rename eth1-eth3 to eth0-eth2 and save the result to config.boot.
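
The one-position shift described above can be checked mechanically once the name-to-MAC mappings are known. A minimal Python sketch, assuming both mappings have already been extracted (from config.boot and from the running system); the function name `renamed_interfaces` is hypothetical, and the MAC values are the redacted placeholders from the output above, not real addresses.

```python
def renamed_interfaces(config_map, system_map):
    """Return {mac: (name_in_config, name_on_system)} for MACs whose name differs."""
    # Index the running system by MAC so each configured MAC can be looked up.
    system_by_mac = {mac.lower(): name for name, mac in system_map.items()}
    changes = {}
    for name, mac in config_map.items():
        actual = system_by_mac.get(mac.lower())
        # Skip MACs the system does not have; report only real name mismatches.
        if actual is not None and actual != name:
            changes[mac.lower()] = (name, actual)
    return changes

# Mapping as it was in config.boot before the second reboot.
config_map = {"eth1": "xx:b4", "eth2": "xx:b5", "eth3": "xx:b6"}
# Mapping as the system named the NICs after the second reboot.
system_map = {"eth0": "xx:b4", "eth1": "xx:b5", "eth2": "xx:b6"}

shifted = renamed_interfaces(config_map, system_map)
```

Here `shifted` contains one entry per MAC, each pairing the configured name with the lower-numbered name the system assigned, which matches the shift by one seen in the config diff.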

jjakob reopened this task as Open. Fri, Jun 26, 8:15 AM
jjakob moved this task from Finished to In Progress on the VyOS 1.3 Equuleus board.
jjakob moved this task from In Progress to Need Triage on the VyOS 1.3 Equuleus board.
jjakob added a comment. Edited Fri, Jun 26, 8:39 AM

Attached are config.boot post-upgrade(migration) and config.boot pre-migration.
Notice:

  • doubled hw-id lines
  • missing opening and closing curly braces on lines 45, 75-77 (tag nodes)
  • lines 8, 33, 35 are leaf nodes, those shouldn't have opening/closing curly braces after them
  • some things are quoted, some are not

Maybe the wrong curly braces or the wrong quoting introduced by the migration scripts confuse the config system?

This comment was removed by jjakob.
jjakob added a comment. Edited Fri, Jun 26, 9:09 AM

Migration scripts use vyos.configtree, which uses libvyosconfig, so it's probably a bug there.

Edit: or not. I hand-edited config.boot to fix all the braces (none on valueless nodes, present on tag nodes) and rebooted, and the doubled hw-id lines reappeared, as did the wrong braces, undoing all my changes. Since no migration scripts were run (there is no config.boot.pre-migration with the current date), the bug is likely not there but in vyatta-cfg, or something causes a bug in vyatta-cfg to appear.

c-po closed this task as Resolved. Fri, Jun 26, 4:29 PM
c-po reopened this task as Open.
c-po removed c-po as the assignee of this task.
c-po triaged this task as High priority.
c-po changed Why the issue appeared? from Will be filled on close to Implementation mistake.
c-po changed Is it a breaking change? from Unspecified (possibly destroys the router) to Config syntax change (migratable).

Sorry - used the wrong task

To clarify, I'm not actually sure the formatting is the reason for the bug - I just observed that it happened. It may have no impact on functionality. The real problem is that something is messing with the hw-id entries, and I don't think it's the migrator scripts, as those weren't executed when it happened. Maybe some legacy Vyatta code.

jjakob added a comment. Sun, Jul 5, 8:42 AM

I tried upgrading again, from 1.3-rolling-202006110117 to 1.3-rolling-202007050117, and the exact same error occurred. On the first reboot everything was fine (config.boot was migrated, looked correct, and loaded fine). On the second reboot, the exact same thing happened.

jjakob added a comment. Sun, Jul 5, 9:08 AM

I upgraded a different test VM with a different config that starts at eth0. On the second reboot the hw-id lines are duplicated too, but both values on a given interface are the same and no new interfaces are created, so the config loads and works fine. The duplicated hw-id lines stay in the config for all subsequent reboots.
Example:

ethernet eth0 {
    address "192.0.2.1"
    hw-id "52:54:00:2d:29:19"
    ipv6 {
        address {
        }
    }
    smp-affinity "auto"
    speed "auto"
    hw-id 52:54:00:2d:29:19
}

What I'm noticing is that the migration scripts save all nodes with quotes, but saving in config mode (through vyatta-cfg) results in most nodes not having quotes (mostly only values containing spaces are quoted). Maybe a vyatta script that runs on each boot and adds any new interfaces to config.boot doesn't handle the quoted hw-id lines that the migration scripts produce.
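
To illustrate the quoting hypothesis only (this is not taken from vyatta-cfg or from any VyOS script): a tool that compares hw-id values as raw strings would treat the quoted and unquoted forms as different values, and could then append a "new" hw-id line. A minimal Python sketch with hypothetical helper names `normalize_value` and `same_hw_id`:

```python
def normalize_value(value):
    """Strip one pair of surrounding double quotes and lower-case the value."""
    value = value.strip()
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        value = value[1:-1]
    return value.lower()

def same_hw_id(a, b):
    """Compare two hw-id values regardless of quoting and letter case."""
    return normalize_value(a) == normalize_value(b)

quoted = '"52:54:00:2d:29:19"'    # form written by the migration scripts
unquoted = '52:54:00:2d:29:19'    # form appended on the second boot

raw_equal = quoted == unquoted              # naive comparison of the raw text
normalized_equal = same_hw_id(quoted, unquoted)
```

The raw comparison is false while the normalized one is true, which is the kind of mismatch that would explain a duplicated line if the rescan logic compares the text as stored.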

jjakob added a comment. Sun, Jul 5, 9:30 AM

The most likely culprit is /opt/vyatta/sbin/vyatta_interface_rescan. I'm not sure if this should be fixed or migrated to Python.
The rewrite would need to be done together with all other vyatta interface renaming and detection scripts.
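
As a starting point for discussion only, here is a rough Python sketch of the detection half of such a rewrite. It is an assumption about what a replacement might look like, not the behaviour of vyatta_interface_rescan: the function name `read_interface_macs` is hypothetical, and it relies only on the standard Linux sysfs layout, where each interface directory under /sys/class/net has an `address` file.

```python
import os

def read_interface_macs(sysfs_root="/sys/class/net"):
    """Return {interface_name: mac} for every entry that exposes an address file."""
    macs = {}
    for name in sorted(os.listdir(sysfs_root)):
        path = os.path.join(sysfs_root, name, "address")
        try:
            with open(path) as f:
                macs[name] = f.read().strip().lower()
        except OSError:
            # Entries without a readable address file (plain files, races) are skipped.
            continue
    return macs
```

The root path is a parameter so the function can be pointed at a test directory. A real rewrite would still have to filter out non-physical interfaces and reconcile the result with the hw-id entries in config.boot, which this sketch does not attempt.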