Hello,
I have a pair of 1.1.8 instances that fail when I try to upgrade them to 1.2.0-rc11. I have finally been able to isolate the problem in a test environment: the static default route, whose next hop is only reachable through the VRRP address, is not used after VRRP transitions to master. It is also related to having blackhole routes; without them it works. Maybe the default route doesn't get "recalculated" when the VRRP connected route is injected?
This works fine in 1.1.8. I only noticed it when I tried to upgrade a production system to 1.2.0 and had to roll back immediately because traffic stopped passing! I haven't found a workaround yet other than disabling the blackhole routes, which I'd rather not do in this case.
Here are the configs and a log of commands that might explain it better. The ping at the very end, when VRRP is in the master state, succeeds on 1.1.8 and fails on 1.2.0.
---
VyOS 1.1.8 (works)
---
vyos@vyos:~$ show ver
Version: VyOS 1.1.8
...
vyos@vyos:~$ show config
interfaces {
    ethernet eth0 {
        address 10.16.4.32/21
        duplex auto
        hw-id 00:0c:29:20:2a:ec
        smp_affinity auto
        speed auto
    }
    ethernet eth1 {
        duplex auto
        hw-id 00:0c:29:20:2a:f6
        smp_affinity auto
        speed auto
        vrrp {
            vrrp-group 35 {
                advertise-interval 1
                hello-source-address 10.16.4.32
                preempt false
                virtual-address 10.240.4.30/21
            }
        }
    }
    loopback lo {
    }
}
protocols {
    static {
        route 0.0.0.0/0 {
            next-hop 10.240.0.1 {
                distance 1
            }
        }
        route 10.0.0.0/8 {
            blackhole {
            }
        }
        route 169.254.0.0/16 {
            blackhole {
            }
        }
        route 172.16.0.0/12 {
            blackhole {
            }
        }
        route 192.168.0.0/16 {
            blackhole {
            }
        }
    }
}
...
vyos@vyos:~$ show vrrp
                          RFC        Addr   Last        Sync
Interface  Group  State   Compliant  Owner  Transition  Group
---------  -----  -----   ---------  -----  ----------  -----
eth1       35     BACKUP  no         no     6m16s       <none>
vyos@vyos:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
I - ISIS, B - BGP, > - selected route, * - FIB route
S> 0.0.0.0/0 [1/0] via 10.240.0.1 (recursive
S>* 10.0.0.0/8 [1/0] is directly connected, Null0, bh
C>* 10.16.0.0/21 is directly connected, eth0
C>* 127.0.0.0/8 is directly connected, lo
S>* 169.254.0.0/16 [1/0] is directly connected, Null0, bh
S>* 172.16.0.0/12 [1/0] is directly connected, Null0, bh
S>* 192.168.0.0/16 [1/0] is directly connected, Null0, bh
vyos@vyos:~$ ping 8.8.8.8
connect: Network is unreachable
Reset links to force VRRP transition
vyos@vyos:~$ show vrrp
                          RFC        Addr   Last        Sync
Interface  Group  State   Compliant  Owner  Transition  Group
---------  -----  -----   ---------  -----  ----------  -----
eth1       35     MASTER  no         no     10s         <none>
vyos@vyos:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP, O - OSPF,
I - ISIS, B - BGP, > - selected route, * - FIB route
S>* 0.0.0.0/0 [1/0] via 10.240.0.1, eth1
S>* 10.0.0.0/8 [1/0] is directly connected, Null0, bh
C>* 10.16.0.0/21 is directly connected, eth0
C>* 10.240.0.0/21 is directly connected, eth1
C>* 127.0.0.0/8 is directly connected, lo
S>* 169.254.0.0/16 [1/0] is directly connected, Null0, bh
S>* 172.16.0.0/12 [1/0] is directly connected, Null0, bh
S>* 192.168.0.0/16 [1/0] is directly connected, Null0, bh
vyos@vyos:~$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_req=1 ttl=121 time=6.92 ms
64 bytes from 8.8.8.8: icmp_req=2 ttl=121 time=6.75 ms
64 bytes from 8.8.8.8: icmp_req=3 ttl=121 time=6.96 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 6.759/6.882/6.965/0.088 ms
---
VyOS 1.2.0-rc11 (fails)
---
vyos@vyos:~$ show ver
Version: VyOS 1.2.0-rc11
...
vyos@vyos:~$ show config
high-availability {
    vrrp {
        group eth1-35 {
            hello-source-address 10.16.4.31
            interface eth1
            no-preempt
            virtual-address 10.240.4.30/21
            vrid 35
        }
        sync-group core {
            member eth1-35
        }
    }
}
interfaces {
    ethernet eth0 {
        address 10.16.4.31/21
        duplex auto
        hw-id 00:0c:29:80:6e:50
        smp-affinity auto
        speed auto
    }
    ethernet eth1 {
        duplex auto
        hw-id 00:0c:29:80:6e:5a
        smp-affinity auto
        speed auto
    }
    loopback lo {
    }
}
protocols {
    static {
        route 0.0.0.0/0 {
            next-hop 10.240.0.1 {
                distance 1
            }
        }
        route 10.0.0.0/8 {
            blackhole {
            }
        }
        route 169.254.0.0/16 {
            blackhole {
            }
        }
        route 172.16.0.0/12 {
            blackhole {
            }
        }
        route 192.168.0.0/16 {
            blackhole {
            }
        }
    }
}
...
vyos@vyos:~$ show vrrp
Name     Interface    VRID    State    Last Transition
-------  -----------  ------  -------  -----------------
eth1-35  eth1         35      BACKUP   1m4s
vyos@vyos:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route
S> 0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:12:09
  *                  unreachable, 00:12:09
S>* 10.0.0.0/8 [1/0] unreachable (blackhole), 00:12:09
C>* 10.16.0.0/21 is directly connected, eth0, 00:13:19
S>* 169.254.0.0/16 [1/0] unreachable (blackhole), 00:12:09
S>* 172.16.0.0/12 [1/0] unreachable (blackhole), 00:12:09
S>* 192.168.0.0/16 [1/0] unreachable (blackhole), 00:12:09
vyos@vyos:~$ ping 8.8.8.8
connect: Invalid argument
vyos@vyos:~$ # Reset links to force VRRP transition
vyos@vyos:~$ show vrrp
Name     Interface    VRID    State    Last Transition
-------  -----------  ------  -------  -----------------
eth1-35  eth1         35      MASTER   11s
vyos@vyos:~$ show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR, f - OpenFabric,
> - selected route, * - FIB route
S> 0.0.0.0/0 [1/0] via 10.240.0.1 (recursive), 00:13:04
  *                  unreachable, 00:13:04
S>* 10.0.0.0/8 [1/0] unreachable (blackhole), 00:13:04
C>* 10.16.0.0/21 is directly connected, eth0, 00:14:14
C>* 10.240.0.0/21 is directly connected, eth1, 00:00:16
S>* 169.254.0.0/16 [1/0] unreachable (blackhole), 00:13:04
S>* 172.16.0.0/12 [1/0] unreachable (blackhole), 00:13:04
S>* 192.168.0.0/16 [1/0] unreachable (blackhole), 00:13:04
vyos@vyos:~$ ping 8.8.8.8
connect: Invalid argument
Failed! This should have succeeded, as it does on 1.1.8.
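
To make the behavior I expected more concrete, here is a rough Python sketch of next-hop resolution by longest-prefix match. This is only my own illustration and an assumption about the intended logic; it is not how Quagga or FRR actually implement it, and the `resolve` helper and the `backup`/`master` tables are made-up names that simply mirror the routes from the logs above. The default route itself is left out of the tables, since the question is only which route its next hop 10.240.0.1 resolves through.

```python
import ipaddress


def resolve(next_hop, routes):
    """Return the (prefix, kind) of the most specific route covering next_hop.

    Hypothetical helper: picks the covering route with the longest prefix,
    or None if no route in the table covers the address.
    """
    addr = ipaddress.ip_address(next_hop)
    covering = [
        (ipaddress.ip_network(prefix), kind)
        for prefix, kind in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    if not covering:
        return None
    network, kind = max(covering, key=lambda entry: entry[0].prefixlen)
    return str(network), kind


# Routing table while VRRP is in BACKUP state (no connected route on eth1).
backup = [
    ("10.0.0.0/8", "blackhole"),
    ("10.16.0.0/21", "connected eth0"),
    ("169.254.0.0/16", "blackhole"),
    ("172.16.0.0/12", "blackhole"),
    ("192.168.0.0/16", "blackhole"),
]

# After the transition to MASTER, the virtual address adds a connected route.
master = backup + [("10.240.0.0/21", "connected eth1")]

print(resolve("10.240.0.1", backup))  # covered only by the 10.0.0.0/8 blackhole
print(resolve("10.240.0.1", master))  # the more specific connected /21 should win
```

In BACKUP state the next hop is covered only by the 10.0.0.0/8 blackhole, which matches the "unreachable" entry in the 1.2.0-rc11 output. Once 10.240.0.0/21 appears as a connected route in MASTER state, I would expect the default route to be re-resolved through eth1, which is what the 1.1.8 output shows.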