
[1.3.3->1.4.0-epa1 Migration] Most of config missing
Needs testing, High, Public, BUG

Assigned To
None
Authored By
matthewr
Feb 28 2024, 6:34 PM

Description

After upgrading from 1.3.3 to 1.4.0-epa1, most of the config did not appear. For the reboot after the upgrade, the kernel parameter "vyos-config-debug" was added.

The details (before and after) are in:-

{F4215536}

in which I have made minor redactions to /tmp/boot-config-trace.

With apologies, I have not managed to work out what it did not like in the config, although it did make a dreadful mess!

Details

Difficulty level
Unknown (require assessment)
Version
1.4.0-epa1
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Unspecified (please specify)

Event Timeline

The firewall failed migration due to an incorrect subnet; the 1.3 firewall did not correctly validate those fields. Correcting the source address on rule 30 of the TO-ROUTER chain should let the firewall migrate properly.

[ firewall ipv4 name TO-ROUTER rule 30 source address xxx.24/28 ]



Error: xxx.24/28 is not a valid IPv4 address range

Error: xxx.24/28 is not a valid IPv4 prefix

Error: xxx.24/28 is not a valid IPv4 address

I will investigate why the OSPF migration failed; the log didn't show any cause for that failure.
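To illustrate the subnet error above: with a last octet of 24 and a /28 mask, the host bits are non-zero (24 is not a multiple of 16), so a strict prefix check rejects the value. The sketch below uses Python's standard `ipaddress` module and the documentation address 192.0.2.24/28 as a stand-in for the redacted one. It is not the validator VyOS itself runs, and `check_prefix` is a hypothetical helper name.

```python
import ipaddress


def check_prefix(text: str) -> str:
    """Report whether `text` is a strictly valid IPv4 prefix.

    Hypothetical helper for illustration; not part of VyOS.
    """
    try:
        # strict=True rejects prefixes whose host bits are set.
        ipaddress.IPv4Network(text, strict=True)
        return "valid"
    except ValueError:
        try:
            # strict=False masks the host bits and yields the enclosing network.
            fixed = ipaddress.IPv4Network(text, strict=False)
        except ValueError:
            return "malformed"
        return f"host bits set; enclosing network is {fixed}"


# 192.0.2.24/28 is a placeholder (assumption) for the redacted xxx.24/28.
print(check_prefix("192.0.2.24/28"))
print(check_prefix("192.0.2.16/28"))
```

Running a check like this over the firewall source and destination addresses of a 1.3 config before upgrading may catch entries that 1.3 accepted but 1.4 rejects.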

Thanks for that.

On the off chance, I tried again having corrected the firewall typo, but it failed again (perhaps unsurprisingly) for the OSPF.

The session logs (which are probably not needed) are:-

{F4216768}

What is the content of /tmp/vyos-configd-script-stdout?

Rebooting the failed image (I did not redo the upgrade - the failed image remained on the router) produces:-

itconsult@ha-r02a:~$ cat /tmp/vyos-configd-script-stdout

WARNING: NAT interface "eth0.20" for source NAT rule "141" does not
exist!


WARNING: changing speed/duplex setting on "eth0" is unsupported!


Interface "eth0.150" does not exist!

Is this sufficient, or do I need to completely redo the upgrade?

You don't have eth0.150

set interfaces ethernet eth0 vif 141
set interfaces ethernet eth0 vif 262 

set protocols ospf passive-interface 'eth0.150'

Thanks - yes - there were in fact 3 "passive-interface" entries for old interfaces, which I should have tidied. Having deleted them, the migration completed.

Whilst labbing up a simpler config to troubleshoot a different issue, I noticed (by typo!) that in 1.3.6 one can put any string as the parameter to "set protocols ospf passive-interface" with no validation whatsoever. In 1.4.0-epa1, the same string is validated and must name a real interface.

Clearly the problem above is caused by a bad initial config in 1.3.6.

Do people think it is worth adjusting the migration script to check for invalid passive interfaces and remove them, or is this better considered a user error for having bad config?
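As a rough idea of what such a check could look like, here is a minimal pre-upgrade sketch that scans the output of "show configuration commands" for passive-interface entries naming interfaces that are not defined anywhere. It is not the VyOS migration script; the function names are hypothetical, and it assumes interfaces appear as "set interfaces <type> <name> ..." with VLANs as "vif <id>", and that "default" is a keyword rather than an interface name.

```python
import shlex


def defined_interfaces(commands):
    """Collect interface names (and ethX.N vifs) declared by 'set interfaces' lines."""
    names = set()
    for line in commands:
        words = shlex.split(line)  # also strips the quotes around 'eth0.150'
        if len(words) >= 4 and words[:2] == ["set", "interfaces"]:
            base = words[3]
            names.add(base)
            if len(words) >= 6 and words[4] == "vif":
                names.add(f"{base}.{words[5]}")
    return names


def stale_passive_interfaces(commands):
    """Return passive-interface values that match no defined interface."""
    known = defined_interfaces(commands)
    stale = []
    for line in commands:
        words = shlex.split(line)
        if len(words) >= 5 and words[:4] == ["set", "protocols", "ospf", "passive-interface"]:
            # Assumption: 'default' is a keyword here, not an interface name.
            if words[4] != "default" and words[4] not in known:
                stale.append(words[4])
    return stale


sample = [
    "set interfaces ethernet eth0 vif 141",
    "set interfaces ethernet eth0 vif 262",
    "set protocols ospf passive-interface 'eth0.150'",
    "set protocols ospf passive-interface 'eth0.141'",  # hypothetical extra line
]
print(stale_passive_interfaces(sample))
```

Whether stale entries should be silently removed or only reported is the open question; a report-only check like this one leaves that decision with the operator.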

Now trying to migrate a second router, with different config. I have corrected the mistakes I could see in the config and have removed policy route config with tcp flags.

However, the upgrade fails, and all of the OSPF config disappears. The only clues (please point me in the right direction if I have missed something) seem to be in /tmp/boot-config-trace:-

[[protocols ospf]] failed

and in vyos-configd-script-stdout:-

Interface "vtun1" does not exist!

The log of the upgrade session is:-

{F4222723}

I have tried to reproduce this on a lab router, suspecting OpenVPN and/or OSPF redistribution, but without success.

Any ideas for further troubleshooting would be appreciated...

Testing further today, I have managed to get two close configs: one migrates and the other does not.

The one which does work is shown:-
{F4228371}
which I found by accident.

The one which does not work is shown:-
{F4228372}

The only difference is that in the second one, three OpenVPN interfaces were shut down:-

set interfaces openvpn vtun2 disable
set interfaces openvpn vtun5 disable
set interfaces openvpn vtun1 disable

My suspicion is that their being shut prevents certain other config from loading:-

itconsult@ha-r01a:~$ conf
WARNING: There was a config error on boot: saving the configuration now could overwrite data.
You may want to check and reload the boot config
[edit]
itconsult@ha-r01a# load
Loading configuration from 'config.boot'
Load complete. Use 'commit' to make changes effective.
[edit]
itconsult@ha-r01a# commit

Interface "vtun1" does not exist!

[[protocols ospf]] failed
Commit failed

I wonder whether there are further issues around config bound to shut-down interfaces when booting. In 1.3, this config boots fine with the interfaces shut down.
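To narrow down which OSPF lines might be involved, a small script can list the "protocols ospf" commands that mention an interface marked "disable". This is only a troubleshooting sketch under my suspicion above, not a confirmed diagnosis. The function names are hypothetical, and the two OSPF lines in the sample are made up for illustration since the real config is in the attached files.

```python
import shlex


def disabled_interfaces(commands):
    """Names of interfaces with a 'disable' leaf directly under the interface node."""
    names = set()
    for line in commands:
        words = shlex.split(line)
        if len(words) == 5 and words[:2] == ["set", "interfaces"] and words[4] == "disable":
            names.add(words[3])
    return names


def ospf_refs_to_disabled(commands):
    """Return (interface, command) pairs where an OSPF command names a disabled interface."""
    disabled = disabled_interfaces(commands)
    hits = []
    for line in commands:
        words = shlex.split(line)
        if words[:3] == ["set", "protocols", "ospf"]:
            for word in words[3:]:
                if word in disabled:
                    hits.append((word, line))
    return hits


sample = [
    "set interfaces openvpn vtun2 disable",
    "set interfaces openvpn vtun5 disable",
    "set interfaces openvpn vtun1 disable",
    "set protocols ospf interface vtun1 cost 10",        # hypothetical line
    "set protocols ospf area 0 network 192.0.2.0/24",    # hypothetical line
]
for name, command in ospf_refs_to_disabled(sample):
    print(f"{name}: {command}")
```

Feeding it the output of "show configuration commands" from the failing config would show every OSPF statement tied to vtun1, vtun2 or vtun5.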

See T6131 for a report of the VTUN/OSPF issue with a simple lab config, which occurs separately from a migration.

Viacheslav changed the task status from Open to Needs testing. Mar 24 2024, 10:27 AM