Static route not reachable through VRRP address
Closed, Resolved / Public / BUG

Description

Hello,

I have a pair of 1.1.8 instances that fail when I try upgrading to 1.2.0-rc11. I have finally been able to isolate the problem in a test environment. The static route through a VRRP address is not being used when VRRP transitions to master. However, it is also related to having blackhole routes: without those it works. Maybe the default route doesn't get "recalculated" when the VRRP route is injected?

This works fine in 1.1.8. I only noticed it when trying to upgrade a production system to 1.2.0, and I had to roll back immediately because traffic stopped passing! I haven't found a workaround yet except disabling the blackhole routes, which I'd rather not do in this case.

Here are configs and a log of commands that might help explain better. The ping at the very end when VRRP is in master state succeeds on 1.1.8 and fails on 1.2.0.


VyOS 1.1.8 (works)

vyos@vyos:~$ show ver

Version:      VyOS 1.1.8
...

vyos@vyos:~$ show config

interfaces {
    ethernet eth0 {
        address 10.16.4.32/21
        duplex auto
        hw-id 00:0c:29:20:2a:ec
        smp_affinity auto
        speed auto
    }
    ethernet eth1 {
        duplex auto
        hw-id 00:0c:29:20:2a:f6
        smp_affinity auto
        speed auto
        vrrp {
            vrrp-group 35 {
                advertise-interval 1
                hello-source-address 10.16.4.32
                preempt false
                virtual-address 10.240.4.30/21
            }
        }
    }
    loopback lo {
    }
}
protocols {
    static {
        route 0.0.0.0/0 {
            next-hop 10.240.0.1 {
                distance 1
            }
        }
        route 10.0.0.0/8 {
            blackhole {
            }
        }
        route 169.254.0.0/16 {
            blackhole {
            }
        }
        route 172.16.0.0/12 {
            blackhole {
            }
        }
        route 192.168.0.0/16 {
            blackhole {
            }
        }
    }
}
...

vyos@vyos:~$ show vrrp

                                 RFC        Addr   Last        Sync
Interface         Group  State   Compliant  Owner  Transition  Group
---------         -----  -----   ---------  -----  ----------  -----
eth1              35     BACKUP  no         no     6m16s       <none>

vyos@vyos:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive
S>* 10.0.0.0/8 [1/0] is directly connected, Null0, bh
C>* 10.16.0.0/21 is directly connected, eth0
C>* 127.0.0.0/8 is directly connected, lo
S>* 169.254.0.0/16 [1/0] is directly connected, Null0, bh
S>* 172.16.0.0/12 [1/0] is directly connected, Null0, bh
S>* 192.168.0.0/16 [1/0] is directly connected, Null0, bh

vyos@vyos:~$ ping 8.8.8.8

connect: Network is unreachable

Reset links to force VRRP transition

vyos@vyos:~$ show vrrp

                                 RFC        Addr   Last        Sync
Interface         Group  State   Compliant  Owner  Transition  Group
---------         -----  -----   ---------  -----  ----------  -----
eth1              35     MASTER  no         no     10s         <none>

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
       I - ISIS, B - BGP, > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.240.0.1, eth1
S>* 10.0.0.0/8 [1/0] is directly connected, Null0, bh
C>* 10.16.0.0/21 is directly connected, eth0
C>* 10.240.0.0/21 is directly connected, eth1
C>* 127.0.0.0/8 is directly connected, lo
S>* 169.254.0.0/16 [1/0] is directly connected, Null0, bh
S>* 172.16.0.0/12 [1/0] is directly connected, Null0, bh
S>* 192.168.0.0/16 [1/0] is directly connected, Null0, bh

vyos@vyos:~$ ping 8.8.8.8

PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_req=1 ttl=121 time=6.92 ms
64 bytes from 8.8.8.8: icmp_req=2 ttl=121 time=6.75 ms
64 bytes from 8.8.8.8: icmp_req=3 ttl=121 time=6.96 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 6.759/6.882/6.965/0.088 ms

VyOS 1.2.0-rc11 (fails)

vyos@vyos:~$ show ver

Version:          VyOS 1.2.0-rc11
...

vyos@vyos:~$ show config

high-availability {
    vrrp {
        group eth1-35 {
            hello-source-address 10.16.4.31
            interface eth1
            no-preempt
            virtual-address 10.240.4.30/21
            vrid 35
        }
        sync-group core {
            member eth1-35
        }
    }
}
interfaces {
    ethernet eth0 {
        address 10.16.4.31/21
        duplex auto
        hw-id 00:0c:29:80:6e:50
        smp-affinity auto
        speed auto
    }
    ethernet eth1 {
        duplex auto
        hw-id 00:0c:29:80:6e:5a
        smp-affinity auto
        speed auto
    }
    loopback lo {
    }
}
protocols {
    static {
        route 0.0.0.0/0 {
            next-hop 10.240.0.1 {
                distance 1
            }
        }
        route 10.0.0.0/8 {
            blackhole {
            }
        }
        route 169.254.0.0/16 {
            blackhole {
            }
        }
        route 172.16.0.0/12 {
            blackhole {
            }
        }
        route 192.168.0.0/16 {
            blackhole {
            }
        }
    }
}
...

vyos@vyos:~$ show vrrp

Name     Interface      VRID  State    Last Transition
-------  -----------  ------  -------  -----------------
eth1-35  eth1             35  BACKUP   1m4s

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:12:09
  *                   unreachable, 00:12:09
S>* 10.0.0.0/8 [1/0] unreachable (blackhole), 00:12:09
C>* 10.16.0.0/21 is directly connected, eth0, 00:13:19
S>* 169.254.0.0/16 [1/0] unreachable (blackhole), 00:12:09
S>* 172.16.0.0/12 [1/0] unreachable (blackhole), 00:12:09
S>* 192.168.0.0/16 [1/0] unreachable (blackhole), 00:12:09

vyos@vyos:~$ ping 8.8.8.8

connect: Invalid argument

Reset links to force VRRP transition

vyos@vyos:~$ show vrrp

Name     Interface      VRID  State    Last Transition
-------  -----------  ------  -------  -----------------
eth1-35  eth1             35  MASTER   11s

vyos@vyos:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:13:04
  *                   unreachable, 00:13:04
S>* 10.0.0.0/8 [1/0] unreachable (blackhole), 00:13:04
C>* 10.16.0.0/21 is directly connected, eth0, 00:14:14
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:16
S>* 169.254.0.0/16 [1/0] unreachable (blackhole), 00:13:04
S>* 172.16.0.0/12 [1/0] unreachable (blackhole), 00:13:04
S>* 192.168.0.0/16 [1/0] unreachable (blackhole), 00:13:04

vyos@vyos:~$ ping 8.8.8.8

connect: Invalid argument

Failed! It would be nice if it worked like 1.1.8

Details

Difficulty level
Unknown (require assessment)
Version
1.2.0-rc11
Why the issue appeared?
Will be filled on close
bmtauer created this task. Jan 4 2019, 9:51 PM
bmtauer updated the task description. Jan 4 2019, 10:03 PM
Merijn added a subscriber: Merijn. Edited Jan 4 2019, 11:51 PM

I see in the config that you do not have an interface IP on the VRRP members.
This works in 1.1.8 most of the time, but can you test whether 1.2.0 works with those added? The hello source address is then no longer needed, and chances are the kernel will load the connected route this way.

This won't help in the production case, as that uses a /30 network with only 2 possible addresses. One is the floating VRRP address and the other is the destination for the static route.

Regardless, I added 10.240.4.{31,32} addresses to the config above and expected the ping to start working regardless of VRRP state. I was surprised! It doesn't work UNTIL the router is rebooted. Somehow it doesn't seem to "recalculate" routes when IP addresses are added and removed in 1.2.0?

Please let me know if there is anything else you'd like me to try.

Do you mean that .31 and .32 also couldn't ping each other?

From the 1.2.0 instance (10.240.4.31) I'm able to ping the 1.1.8 (10.240.4.32) instance immediately after adding the address, but cannot ping out to the internet until after a reboot.

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:03:22
  *                   unreachable, 00:03:22
S>* 10.0.0.0/8 [254/0] unreachable (blackhole), 00:03:22
C>* 10.16.0.0/21 is directly connected, eth0, 00:03:23
S>* 169.254.0.0/16 [254/0] unreachable (blackhole), 00:03:22
S>* 172.16.0.0/12 [254/0] unreachable (blackhole), 00:03:22
S>* 192.168.0.0/16 [254/0] unreachable (blackhole), 00:03:22

vyos@vyos:~$ configure
[edit]
vyos@vyos# set interfaces ethernet eth1 address 10.240.4.31/21
[edit]
vyos@vyos# commit
[edit]
vyos@vyos# save
Saving configuration to '/config/config.boot'...
Done
[edit]

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:04:25
  *                   unreachable, 00:04:25
S>* 10.0.0.0/8 [254/0] unreachable (blackhole), 00:04:25
C>* 10.16.0.0/21 is directly connected, eth0, 00:04:26
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:19
S>* 169.254.0.0/16 [254/0] unreachable (blackhole), 00:04:25
S>* 172.16.0.0/12 [254/0] unreachable (blackhole), 00:04:25
S>* 192.168.0.0/16 [254/0] unreachable (blackhole), 00:04:25

vyos@vyos:~$ ping 10.240.4.32

PING 10.240.4.32 (10.240.4.32) 56(84) bytes of data.
64 bytes from 10.240.4.32: icmp_seq=1 ttl=64 time=0.231 ms
64 bytes from 10.240.4.32: icmp_seq=2 ttl=64 time=0.131 ms
^C
--- 10.240.4.32 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1074ms
rtt min/avg/max/mdev = 0.131/0.181/0.231/0.050 ms

vyos@vyos:~$ ping 8.8.8.8

connect: Invalid argument

vyos@vyos:~$ reboot

Are you sure you want to reboot this system? [y/N] y
Connection to 10.16.4.31 closed by remote host.
Connection to 10.16.4.31 closed.

After reboot...

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.240.0.1, eth1, 00:00:41
S>* 10.0.0.0/8 [254/0] unreachable (blackhole), 00:00:41
C>* 10.16.0.0/21 is directly connected, eth0, 00:00:42
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:42
S>* 169.254.0.0/16 [254/0] unreachable (blackhole), 00:00:41
S>* 172.16.0.0/12 [254/0] unreachable (blackhole), 00:00:41
S>* 192.168.0.0/16 [254/0] unreachable (blackhole), 00:00:41

vyos@vyos:~$ ping 8.8.8.8

PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=121 time=7.23 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=121 time=6.53 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=121 time=6.54 ms
^C
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 6.533/6.773/7.239/0.329 ms
syncer triaged this task as Normal priority. Jan 12 2019, 6:34 PM
syncer assigned this task to zsdc.
syncer edited projects, added VyOS 1.2 Crux (VyOS 1.2.0-GA); removed VyOS 1.2 Crux.
syncer added subscribers: zsdc, syncer.

@zsdc please check this one

zsdc added a comment. Jan 15 2019, 9:49 AM

Hi, @bmtauer!
To be honest, it looks like you have been relying on a bug or some atypical behavior in 1.1.8 as a feature. Your configuration looks strange from the start, so I propose starting the investigation with a detailed description of the task you are trying to solve with all of this.
If you can, please provide information about:

  1. The connections on the eth interfaces of both (master and backup) routers. As I understand it, all interfaces are connected to the same L2 network segment? Please explain why the connections were made this way, as it is not obvious to us yet.
  2. What exactly do you want to achieve with VRRP? Just a standby router, or is this part of a more complex task?

The better we understand your task, the faster we can help solve this problem.

Hi @zsdc

The abnormal part of our setup might be that we have blackhole routes in
place to prevent any accidental leakage of private IP addresses through a
public class C network. The configurations posted above are contrived -
purely to demonstrate the problem, but it does reliably demonstrate the
regression from 1.1.8 to 1.2.0.

  1. Connections on both of the ethernet interfaces are in _separate_ L2
     networks. These are completely isolated L2 networks representing the
     "internal" and "ISP" networks in our production case. The "internal"
     network is a class C block of public IP addresses, and the "ISP" side is a
     192.168.1.x/30 network, which means only TWO addresses. One address is the
     ISP router and one address is our VRRP address.

  2. The reason for VRRP is so this can be handled by a pair of virtual
     routers for high availability. Traffic needs to switch back and forth
     between these routers seamlessly if either needs rebooting, or physical
     host needs rebooting, etc. It has worked quite well with the 1.1.x
     series. The VRRP addresses (default gateway for the internal side; and our
     192.168.1.x address for the ISP side) are in a group, so they both
     transition between master/backup together.

Now that I understand what is going on, we could work around the problem by
dropping the blackhole route for 192.168.x.x, but since that is less
specific than the 192.168.1.x/30 interface it has always worked in the
past.
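
The address arithmetic behind this can be checked with a short Python sketch. The thread only gives the ISP link as 192.168.1.x/30, so the prefix 192.168.1.0/30 below is a hypothetical stand-in chosen for illustration, not the real production network.

```python
import ipaddress

# Hypothetical stand-in for the elided "192.168.1.x/30" ISP link.
isp_link = ipaddress.ip_network("192.168.1.0/30")
# The blackhole route from the configuration above.
blackhole = ipaddress.ip_network("192.168.0.0/16")

# A /30 leaves two usable host addresses: one for the ISP router, one for VRRP.
usable = list(isp_link.hosts())
print(usable)

# The /30 sits inside the blackholed /16 but has the longer (more specific) prefix.
print(isp_link.subnet_of(blackhole))
print(isp_link.prefixlen > blackhole.prefixlen)
```

With longest-prefix matching, the connected /30 would be expected to take precedence over the /16 blackhole whenever it is present in the table.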

Something changed in the way route calculation or priority happens, and
traffic doesn't flow to the ISP's 192.168.1.x address (the router's internal
default route) when the VRRP transition happens, or when the machine boots
up. Traffic also fails to flow in my test case when I manually add an address
to the interface. The routing table shows the default route as unreachable
and never realizes when it becomes reachable, even though the route by which
it is now reachable shows in the table.

Hopefully this helps explain a bit more what is going on and why.

zsdc added a comment. Jan 21 2019, 9:42 AM

OK, things are clearer now.
If you don't have any L2 filters between the eth1 interfaces of the VyOS instances, I would first recommend changing the configuration to something like this (based on the configuration from your first message):
Router 1:

high-availability {
    vrrp {
        group eth1-35 {
            hello-source-address 10.10.35.1
            peer-address 10.10.35.2
            interface eth1
            no-preempt
            virtual-address 10.240.4.30/21
            vrid 35
        }
    }
}
interfaces {
    ethernet eth0 {
        duplex auto
        hw-id 00:0c:29:80:6e:50
        smp-affinity auto
        speed auto
    }
    ethernet eth1 {
        address 10.10.35.1/30
        duplex auto
        hw-id 00:0c:29:80:6e:5a
        smp-affinity auto
        speed auto
    }
    loopback lo {
    }
}

Router 2:
Change 10.10.35.1 to 10.10.35.2 in the eth1 address and hello-source-address, and 10.10.35.2 to 10.10.35.1 in peer-address.
Don't forget to remove the VRRP sync-group on both routers.

With these changes we move all VRRP traffic to the correct interface; then check again whether the problem still exists.
If you can, test this in a lab environment first to avoid a dual VRRP MASTER state in production.

pasik added a subscriber: pasik. Jan 21 2019, 9:41 PM

Hello,

I've tried several variations on the VRRP configuration, and it doesn't seem to make any difference. As far as I can tell, nothing is wrong with VRRP. It is only relevant as a source for change in the routing table. I can demonstrate the problem on a single instance with no VRRP.

The problem, which may have been buried in all the extra detail, is that traffic will not pass to the default gateway when the route is added "later". If the interface/address is already in the system on boot things seem to work fine, but adding the address later, either manually or by VRRP, does not work. The routing table doesn't get updated to allow traffic destined for the default route to use the newly added address. The commands below show how the routing table should allow the final ping to work, but it doesn't. Why does it show the default route 0.0.0.0/0 as recursive/unreachable when 10.240.0.1 is clearly reachable via eth1 a few lines later?

Removing the 10.0.0.0/8 blackhole route (and rebooting) allows it to work. Somehow blackhole routes seem to override normal priority and specificity rules, or prevent updates. This always worked in 1.1.x. Is it an FRR change, perhaps?

Benjamin

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:04:38
  *                   unreachable, 00:04:38
S>* 10.0.0.0/8 [254/0] unreachable (blackhole), 00:04:38
C>* 10.16.0.0/21 is directly connected, eth0, 00:04:39
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:55
S>* 169.254.0.0/16 [254/0] unreachable (blackhole), 00:04:38
S>* 172.16.0.0/12 [254/0] unreachable (blackhole), 00:04:38
S>* 192.168.0.0/16 [254/0] unreachable (blackhole), 00:04:38

vyos@vyos:~$ ping 10.240.0.1

PING 10.240.0.1 (10.240.0.1) 56(84) bytes of data.
64 bytes from 10.240.0.1: icmp_seq=1 ttl=64 time=0.360 ms
64 bytes from 10.240.0.1: icmp_seq=2 ttl=64 time=0.446 ms
64 bytes from 10.240.0.1: icmp_seq=3 ttl=64 time=0.358 ms
--- 10.240.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2058ms
rtt min/avg/max/mdev = 0.358/0.388/0.446/0.041 ms

vyos@vyos:~$ ping 8.8.8.8

connect: Invalid argument

The problem discussed here sounds remarkably similar to what I'm seeing: https://github.com/FRRouting/frr/issues/2230

The less specific blackhole route is overriding the more specific interface route that gets added later.
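
The expected behaviour can be modelled with a minimal Python sketch of longest-prefix-match next-hop resolution. This is only an illustration of the rule being described, not FRR's actual implementation; the function name `resolve_next_hop` and the route labels are made up for the example, while the prefixes are taken from the transcripts above.

```python
import ipaddress

def resolve_next_hop(next_hop, routes):
    """Return the (prefix, kind) entry with the longest prefix covering next_hop, or None."""
    addr = ipaddress.ip_address(next_hop)
    covering = [(prefix, kind) for prefix, kind in routes
                if addr in ipaddress.ip_network(prefix)]
    if not covering:
        return None
    return max(covering, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)

# Routes from the transcripts above, excluding the default route being resolved.
before_eth1 = [
    ("10.0.0.0/8", "blackhole"),
    ("10.16.0.0/21", "connected eth0"),
    ("169.254.0.0/16", "blackhole"),
    ("172.16.0.0/12", "blackhole"),
    ("192.168.0.0/16", "blackhole"),
]
# The same table once eth1 has an address (manually or via VRRP).
after_eth1 = before_eth1 + [("10.240.0.0/21", "connected eth1")]

print(resolve_next_hop("10.240.0.1", before_eth1))  # ('10.0.0.0/8', 'blackhole')
print(resolve_next_hop("10.240.0.1", after_eth1))   # ('10.240.0.0/21', 'connected eth1')
```

Before eth1 has an address, the only route covering 10.240.0.1 is the 10.0.0.0/8 blackhole, so the default route resolves as unreachable. Once 10.240.0.0/21 is connected, re-running the resolution picks eth1. The symptom in this task is consistent with that second resolution not happening until a reboot.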

dmbaturin changed the task status from Open to On hold. Jan 26 2019, 4:39 PM
dmbaturin added a subscriber: dmbaturin.

Ok, I've re-tested everything one more time to be sure. I can confirm the somewhat strange (though technically correct) behaviour of blackhole routes: they become ECMP routes if there's another route of the same distance. I would expect normal routes to override them in that case, but the kernel is following its hard and fast rule that two routes with the same distance automatically become ECMP routes. That behaviour is counter-intuitive but consistent.

Moreover, in 1.1.8 the simplest test case (add a route to a network via a gateway and a blackhole to the same network) produces an ECMP route as well.

I'm not sure why exactly that setup worked in 1.1.8, to me it feels like it should have never worked. Without different distance for the "standby" route, it clearly depends on some implementation-specific behaviour combination to function.

There's a trivial fix: don't use the same distance. Distance of 1 is the default for static routes, so you need to set the distance of the blackhole route to a higher value, e.g. 250. See a transcript from the latest 1.2.0 that shows correct behaviour in that case:

vyos@vyos-test-2# run show ip route 0.0.0.0/0
Routing entry for 0.0.0.0/0
  Known via "static", distance 210, metric 0, best
  Last update 00:01:36 ago
  * 10.217.32.254, via eth0

[edit]
vyos@vyos-test-2# set protocols static route 0.0.0.0/0 blackhole distance 240
[edit]
vyos@vyos-test-2# commit
[edit]

vyos@vyos-test-2# run show ip route 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S   0.0.0.0/0 [240/0] unreachable (blackhole), 00:00:12
S>* 0.0.0.0/0 [210/0] via 10.217.32.254, eth0, 00:02:00

If you run into cases where the route distance is not honored even though the distances differ, that would be a bug.
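
The selection rule described in this comment can be sketched in Python as follows. This is a simplified model of "lowest distance wins, equal distances become ECMP" and not the routing daemon's real code; the function name `select_routes` and the tuple layout are assumptions made for the example.

```python
from collections import defaultdict

def select_routes(routes):
    """Pick the next hops to install per prefix.

    routes: list of (prefix, distance, next_hop) tuples.
    Returns {prefix: [next_hop, ...]}; more than one entry means ECMP.
    """
    by_prefix = defaultdict(list)
    for prefix, distance, next_hop in routes:
        by_prefix[prefix].append((distance, next_hop))

    selected = {}
    for prefix, entries in by_prefix.items():
        best = min(distance for distance, _ in entries)
        # Every candidate that ties on the lowest distance is kept.
        selected[prefix] = [nh for distance, nh in entries if distance == best]
    return selected

# Same distance: both candidates are kept, blackhole included.
same = select_routes([
    ("0.0.0.0/0", 1, "via gateway"),
    ("0.0.0.0/0", 1, "blackhole"),
])

# Distances from the transcript above: only the gateway route is selected.
different = select_routes([
    ("0.0.0.0/0", 210, "via 10.217.32.254"),
    ("0.0.0.0/0", 240, "blackhole"),
])
print(same)
print(different)
```

Note that this rule only compares candidates for the same prefix. It says nothing about how a next hop is resolved through other prefixes in the table.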

syncer changed the task status from On hold to Needs testing. Jan 26 2019, 8:16 PM
syncer closed this task as Resolved. Jan 26 2019, 10:14 PM
syncer added a project: VyOS-1.2.0-GA.

Unfortunately I still see the problem when the blackhole routes are set with a distance of 240 and the 0.0.0.0 route is distance 1.

Here is the starting point:

vyos@vyos:~$ show ver

Version:          VyOS 1.2.0-epa3

...

vyos@vyos:~$ show config commands

set interfaces ethernet eth0 address '10.16.4.32/21'
set interfaces ethernet eth0 duplex 'auto'
set interfaces ethernet eth0 hw-id '00:0c:29:20:2a:ec'
set interfaces ethernet eth0 smp-affinity 'auto'
set interfaces ethernet eth0 speed 'auto'
set interfaces ethernet eth1 duplex 'auto'
set interfaces ethernet eth1 hw-id '00:0c:29:20:2a:f6'
set interfaces ethernet eth1 smp-affinity 'auto'
set interfaces ethernet eth1 speed 'auto'
set interfaces loopback lo
set protocols static
set service ssh port '22'
set system config-management commit-revisions '100'
set system console
set system host-name 'vyos'
set system login user vyos authentication encrypted-password '$1$gU100Pg/$xQUpNRtppcSQuD6bKQkGI1'
set system login user vyos authentication plaintext-password ''
set system login user vyos level 'admin'
set system ntp server 0.pool.ntp.org
set system ntp server 1.pool.ntp.org
set system ntp server 2.pool.ntp.org
set system syslog global facility all level 'notice'
set system syslog global facility protocols level 'debug'
set system time-zone 'UTC'

Now, add the blackhole routes, default route, and an interface address:

config
set protocols static route 10.0.0.0/8 blackhole distance '240'
set protocols static route 169.254.0.0/16 blackhole distance '240'
set protocols static route 172.16.0.0/12 blackhole distance '240'
set protocols static route 192.168.0.0/16 blackhole distance '240'
commit

set protocols static route 0.0.0.0/0 next-hop 10.240.0.1 distance '1'
commit

set interfaces ethernet eth1 address '10.240.4.32/21'
commit
save

Do we have a problem? Let's take a look...

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>  0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:00:29
  *                   unreachable, 00:00:29
S>* 10.0.0.0/8 [240/0] unreachable (blackhole), 00:00:34
C>* 10.16.0.0/21 is directly connected, eth0, 00:04:03
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:23
S>* 169.254.0.0/16 [240/0] unreachable (blackhole), 00:00:34
S>* 172.16.0.0/12 [240/0] unreachable (blackhole), 00:00:34
S>* 192.168.0.0/16 [240/0] unreachable (blackhole), 00:00:34

Why is the empty unreachable route selected, rather than via 10.240.0.1?

vyos@vyos:~$ ip route show

blackhole default proto static metric 20 
blackhole 10.0.0.0/8 proto static metric 20 
10.16.0.0/21 dev eth0 proto kernel scope link src 10.16.4.32 
10.240.0.0/21 dev eth1 proto kernel scope link src 10.240.4.32 
blackhole 169.254.0.0/16 proto static metric 20 
blackhole 172.16.0.0/12 proto static metric 20 
blackhole 192.168.0.0/16 proto static metric 20

This is also puzzling - why is the "via 10.240.0.1" route not shown here?

vyos@vyos:~$ show config commands

set interfaces ethernet eth0 address '10.16.4.32/21'
set interfaces ethernet eth0 duplex 'auto'
set interfaces ethernet eth0 hw-id '00:0c:29:20:2a:ec'
set interfaces ethernet eth0 smp-affinity 'auto'
set interfaces ethernet eth0 speed 'auto'
set interfaces ethernet eth1 address '10.240.4.32/21'
set interfaces ethernet eth1 duplex 'auto'
set interfaces ethernet eth1 hw-id '00:0c:29:20:2a:f6'
set interfaces ethernet eth1 smp-affinity 'auto'
set interfaces ethernet eth1 speed 'auto'
set interfaces loopback lo
set protocols static route 0.0.0.0/0 next-hop 10.240.0.1 distance '1'
set protocols static route 10.0.0.0/8 blackhole distance '240'
set protocols static route 169.254.0.0/16 blackhole distance '240'
set protocols static route 172.16.0.0/12 blackhole distance '240'
set protocols static route 192.168.0.0/16 blackhole distance '240'
set service ssh port '22'
set system config-management commit-revisions '100'
set system console
set system host-name 'vyos'
set system login user vyos authentication encrypted-password '$1$gU100Pg/$xQUpNRtppcSQuD6bKQkGI1'
set system login user vyos authentication plaintext-password ''
set system login user vyos level 'admin'
set system ntp server 0.pool.ntp.org
set system ntp server 1.pool.ntp.org
set system ntp server 2.pool.ntp.org
set system syslog global facility all level 'notice'
set system syslog global facility protocols level 'debug'
set system time-zone 'UTC'

Okay, let's reboot and see how things look...

vyos@vyos:~$ show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

S>* 0.0.0.0/0 [1/0] via 10.240.0.1, eth1, 00:00:07
S>* 10.0.0.0/8 [240/0] unreachable (blackhole), 00:00:07
C>* 10.16.0.0/21 is directly connected, eth0, 00:00:08
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:08
S>* 169.254.0.0/16 [240/0] unreachable (blackhole), 00:00:07
S>* 172.16.0.0/12 [240/0] unreachable (blackhole), 00:00:07
S>* 192.168.0.0/16 [240/0] unreachable (blackhole), 00:00:07

vyos@vyos:~$ ip route show

default via 10.240.0.1 dev eth1 proto static metric 20 
blackhole 10.0.0.0/8 proto static metric 20 
10.16.0.0/21 dev eth0 proto kernel scope link src 10.16.4.32 
10.240.0.0/21 dev eth1 proto kernel scope link src 10.240.4.32 
blackhole 169.254.0.0/16 proto static metric 20 
blackhole 172.16.0.0/12 proto static metric 20 
blackhole 192.168.0.0/16 proto static metric 20

So, it works fine after a reboot, but not when originally committed and saved.

This is important because the same problem also happens when the interface address is added through a VRRP transition. The routing table doesn't "realize" the default route is now accessible, and the blackhole route stays selected.

I can run the same sequence of commands against 1.1.8 if that would help. The routing table is correct immediately after committing the interface address and doesn't require a reboot for things to "recalculate".
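
For anyone repeating this test, the two `ip route show` outputs above can be told apart mechanically. The Python sketch below is a hypothetical helper (the name `default_route_state` is invented here) that only classifies text already captured from `ip route show`. It does not run any command itself, and it only recognises the two forms seen in this task.

```python
def default_route_state(ip_route_output):
    """Classify the kernel default route from captured `ip route show` text.

    Returns "forwarding", "blackhole", or "missing".
    """
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "default":
            return "forwarding"
        if fields[0] == "blackhole" and len(fields) > 1 and fields[1] == "default":
            return "blackhole"
    return "missing"

# Excerpts of the two outputs shown above.
before_reboot = """\
blackhole default proto static metric 20
blackhole 10.0.0.0/8 proto static metric 20
10.240.0.0/21 dev eth1 proto kernel scope link src 10.240.4.32
"""
after_reboot = """\
default via 10.240.0.1 dev eth1 proto static metric 20
blackhole 10.0.0.0/8 proto static metric 20
10.240.0.0/21 dev eth1 proto kernel scope link src 10.240.4.32
"""

print(default_route_state(before_reboot))  # blackhole
print(default_route_state(after_reboot))   # forwarding
```

Running the same check right after a commit or a VRRP transition, and again after a reboot, would make the difference between 1.1.8 and 1.2.0 easy to record.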