
Failed upgrade from 1.4-rolling-202212310809 to 1.4-rolling-202309030023
Open, High, Public, BUG

Description

Hello!

I have just tried to upgrade from 1.4-rolling-202212310809 to 1.4-rolling-202309030023.

The upgrade did not succeed. I was able to plug a console cable into the router and poke around. I noticed that none of the NIC activity LEDs were lit and that my hostname wasn't set at the login screen. I'm assuming something failed early on while mounting and activating the VyOS config.

I then tried to log in anyway as the vyos user with what I assumed was the password. It didn't work, so I rebooted and reset the password, and it still didn't work. I also observed that the login prompt was asking for password + OTP, which I never set up.

A screenshot of me resetting the password is here, though it is probably not much use: https://cdn.discordapp.com/attachments/1097925444566794260/1148231466854137968/image.png

Does anyone have any idea what could have gone wrong or how I can proceed? For now I have rolled back to 1.4-rolling-202212310809 and set it as the default image.

Details

Difficulty level
Normal (likely a few hours)
Version
from 1.4-rolling-202212310809 to 1.4-rolling-202309030023
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

Adding vyos-config-debug to the boot cmdline should allow you to log in and will provide some information in /tmp/boot-config-trace. Cf.:
https://docs.vyos.io/en/latest/contributing/debugging.html
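To confirm after boot that the flag really ended up on the kernel command line, something like the following could be used. This is only a minimal sketch: the function name has_boot_flag is made up for illustration and is not part of VyOS, and it assumes nothing beyond the standard Linux /proc/cmdline file.

```shell
# Hypothetical helper (not part of VyOS): succeed if the given flag appears
# as a separate token on the kernel command line.
# Usage: has_boot_flag <flag> [cmdline-file]   (default: /proc/cmdline)
has_boot_flag() {
    tr ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Example:
#   has_boot_flag vyos-config-debug && ls -l /tmp/boot-config-trace
```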

I would like to stage this in a VM if I do try the above, as physical access to the router is tough. Does anyone know where I can find an ISO for 1.4-rolling-202212310809? It seems the old S3 endpoint doesn't resolve: https://s3.vyos.io/rolling/current/vyos-1.4-rolling-202212310809-amd64.iso


GRUB was very buggy for me over a USB console. I did finally manage to boot with vyos-config-debug.

I observed

https://gist.github.com/anthr76/90016102a0eb9481bb74d101c242288a

I captured the trace on 1.4-rolling-202306180309, as I was trying to slowly increment up. This was the path I took to get there:

1: 1.4-rolling-202303040307
2: 1.4-rolling-202305110952
3: 1.4-rolling-202306070310 (Bad!!!)
4: 1.4-rolling-202305290305
5: 1.4-rolling-202306030305
6: 1.4-rolling-202306180309 (Bad!!!) <--- Where I was able to get a log

For now, because I need some kind of stability, I'm rolling back to 1.4-rolling-202303040307.

Using the config you provided in Slack I managed to trace the error (or I think so):

root@vyos:~# /opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30 /config/config.boot.T5546
Traceback (most recent call last):
  File "/opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30", line 46, in <module>
    and is_wireguard_key_pair(private_key, peer_public_key):
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/vyos/utils/network.py", line 378, in is_wireguard_key_pair
    gen_public_key = cmd('wg pubkey', input=private_key)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/vyos/utils/process.py", line 155, in cmd
    raise OSError(code, feedback)
PermissionError: [Errno 1] failed to run command: wg pubkey
returned: 
exit code: 1

However, the above could be because the config you provided has gone through "strip-private".

So could you please do something along the lines of:

sudo cp /config/config.boot /config/config.boot.T5546
sudo /opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30 /config/config.boot.T5546

and report back with the output?

That is, manually run that particular migrate script on the last config version that didn't crash during boot?

I do not have a 29-to-30 migrate script:

ls /opt/vyatta/etc/config-migrate/migrate/interfaces/
0-to-1    11-to-12  13-to-14  15-to-16  17-to-18  19-to-20  20-to-21  22-to-23  24-to-25  2-to-3    4-to-5    6-to-7    8-to-9
10-to-11  12-to-13  14-to-15  16-to-17  18-to-19  1-to-2    21-to-22  23-to-24  25-to-26  3-to-4    5-to-6    7-to-8    9-to-10

What VyOS version did you have there?

I was assuming you had 1.4-rolling-202309030023.

The below is on VyOS 1.4-rolling-202309040919:

root@vyos:~# ls /opt/vyatta/etc/config-migrate/migrate/interfaces/
0-to-1	  11-to-12  13-to-14  15-to-16	17-to-18  19-to-20  20-to-21  22-to-23	24-to-25  26-to-27  28-to-29  2-to-3  4-to-5  6-to-7  8-to-9
10-to-11  12-to-13  14-to-15  16-to-17	18-to-19  1-to-2    21-to-22  23-to-24	25-to-26  27-to-28  29-to-30  3-to-4  5-to-6  7-to-8  9-to-10

I don't know whether something would break when running a migrate script from within another VyOS version, but you could test something like this, preferably from the last version that worked without errors ("5: 1.4-rolling-202306030305"?):

mount /usr/lib/live/mount/persistence/boot/1.4-rolling-202309030023/1.4-rolling-202309030023.squashfs /mnt
sudo cp /config/config.boot /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30 /config/config.boot.T5546

Also note that the uploaded config ended with these lines:

// Warning: Do not remove the following line.
// vyos-config-version: "bgp@3:broadcast-relay@1:cluster@1:config-management@1:conntrack@3:conntrack-sync@2:container@1:dhcp-relay@2:dhcp-server@6:dhcpv6-server@1:dns-forwarding@3:firewall@9:flow-accounting@1:https@4:ids@1:interfaces@26:ipoe-server@1:ipsec@10:isis@2:l2tp@4:lldp@1:mdns@1:monitoring@1:nat@5:nat66@1:ntp@1:openconnect@2:ospf@1:policy@5:pppoe-server@6:pptp@2:qos@1:quagga@10:rpki@1:salt@1:snmp@2:ssh@2:sstp@4:system@25:vrf@3:vrrp@3:vyos-accel-ppp@2:wanloadbalance@3:webproxy@2"
// Release version: 1.4-rolling-202212310809

meaning that if you want to run the interfaces migration manually, you would probably need to run it as follows (because of "interfaces@26" in the above):

sudo cp /config/config.boot /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/26-to-27 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/27-to-28 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/28-to-29 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30 /config/config.boot.T5546
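The starting version and the list of scripts above can also be derived from the config footer instead of being read off by eye. The following is a rough sketch under assumptions: component_version and migration_steps are hypothetical helper names (not VyOS tools), and the footer is assumed to have the "name@version" entries separated by colons as in the line quoted above.

```shell
# Hypothetical helpers (not part of VyOS).

# Print the version recorded for one component in the "vyos-config-version"
# footer of a config file.
# Usage: component_version <config-file> <component>
component_version() {
    grep 'vyos-config-version' "$1" \
        | tr ':"' '\n\n' \
        | sed -n "s/^$2@\([0-9][0-9]*\)\$/\1/p"
}

# Print the migrate script names needed to go from version $1 to version $2,
# one per line and in the order they have to be run.
# Usage: migration_steps <from> <to>
migration_steps() {
    from=$1
    while [ "$from" -lt "$2" ]; do
        echo "${from}-to-$((from + 1))"
        from=$((from + 1))
    done
}

# Example:
#   migration_steps "$(component_version /config/config.boot.T5546 interfaces)" 30
```

With the footer quoted above, component_version reports 26 for interfaces, so migration_steps 26 30 lists the same four scripts as the commands above.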

My apologies @Apachez, this was on 1.4-rolling-202212310809.

I can give https://vyos.dev/T5546#158799 a try

Thanks for the help so far @Apachez !

While my config has no mention of wireguard (besides a firewall rule), the interfaces migration is still failing:

root@-fw-1:~# cat /config/config.boot.T5546  | grep -C 3 wireguard
        }
        rule 5182 {
            action "accept"
            description "wireguard"
            destination {
                port "51820"
            }
root@-fw-1:~# /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30 /config/config.boot.T5546
Traceback (most recent call last):
  File "/mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30", line 20, in <module>
    from vyos.utils.network import is_wireguard_key_pair
ModuleNotFoundError: No module named 'vyos.utils'

Ehm, are you sure you operate on the correct config?

You sure do have a "wireguard wg0" interface over at https://gist.github.com/anthr76/4b091d952bcd69b1ac8d4c7d08aaaac6

For the migrate example I posted earlier you should use the config from 1.4-rolling-202212310809.

If you use a newer config on which migration has already failed (without failing the boot itself), there might be missing sections.

One thing to test (assuming 1.4-rolling-202306030305 booted fine but had broken config):

  1. Boot 1.4-rolling-202306030305.
  2. Delete /usr/lib/live/mount/persistence/boot/1.4-rolling-202309030023/rw/config/config.boot.
  3. Boot 1.4-rolling-202309030023. Since config.boot is now missing, it will use a default one (so no networking will work, but you can at least boot into 1.4-rolling-202309030023).
  4. Copy the config.boot from 1.4-rolling-202212310809 and call that config.boot.221231:
sudo cp /usr/lib/live/mount/persistence/boot/1.4-rolling-202212310809/rw/config/config.boot /config/config.boot.221231
  5. Now copy config.boot.221231 to config.boot.T5546 and run the migrate scripts on that copy, from version 26 up to the current version 30, like so:
sudo cp /config/config.boot.221231 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/26-to-27 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/27-to-28 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/28-to-29 /config/config.boot.T5546
sudo /mnt/opt/vyatta/etc/config-migrate/migrate/interfaces/29-to-30 /config/config.boot.T5546

The point here is that we have booted (using the default config) into 1.4-rolling-202309030023, so all migrate scripts should work, with no errors about missing modules caused by running the migration under an older VyOS. If the manual migration of the interfaces section then fails, that error is (hopefully) the true reason why your upgrade from 1.4-rolling-202212310809 to 1.4-rolling-202309030023 fails.

There is probably some other script one can run that goes through all the migrate scripts that would be run for your 221231 config, so the above is just to pinpoint whether it is the interfaces migrate scripts that have an error (which seems most likely, given the logs and observations so far).
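As a stopgap, the manual sequence can be wrapped in a small loop that stops at the first script that fails, so the failing step is obvious. This is only a sketch: run_migrations is a hypothetical helper, not an existing VyOS command, and it assumes each migrate script takes the config file as its only argument (as in the commands above) and exits non-zero on failure.

```shell
# Hypothetical helper (not part of VyOS): run the migrate scripts
# <from>-to-<from+1> ... up to <to> in order against a config copy,
# stopping at the first script that is missing or fails.
# Usage: run_migrations <script-dir> <config-copy> <from> <to>
run_migrations() {
    dir=$1 cfg=$2 ver=$3 target=$4
    while [ "$ver" -lt "$target" ]; do
        step="${ver}-to-$((ver + 1))"
        if [ ! -x "$dir/$step" ]; then
            echo "missing: $dir/$step" >&2
            return 2
        fi
        if ! "$dir/$step" "$cfg"; then
            echo "failed: $step" >&2
            return 1
        fi
        echo "ok: $step"
        ver=$((ver + 1))
    done
}

# Example (run as root, on a copy of the config, never on config.boot itself):
#   run_migrations /opt/vyatta/etc/config-migrate/migrate/interfaces \
#       /config/config.boot.T5546 26 30
```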

I can try the steps above as a last-ditch effort, but that means I need to be on site with the device, and it will require lots of time (and downtime).

:(

There is a similar case going on at the forum, with different workarounds that might help:

https://forum.vyos.io/t/migration-error-from-vyos-1-4-april-to-the-latest-release-1-4-sep/12077/

Also this case might be related to what you are seeing:

https://vyos.dev/T5520
Likely source of corruption on system update exposed by change in coreutils for Bookworm

@anthr76 T5520 would be unrelated to an upgrade from 1.4-rolling-202212310809, as the change to using Bookworm did not occur until 2023. We can take a look at the specific errors that you encountered.

Sounds great @jestabro I'm happy to assist in any way possible

In my case the upgrade from 1.4-rolling-202308060317 to vyos-1.4-rolling-202308060317 made the VRF unavailable, so I had no access to management. Booting back into the old version made it work again.