
VRRP transition scripts for sync-groups are not supported in VyOS (anymore)
Open, High, Public, BUG

Description

Hello all,

with keepalived 2.x the behavior of transition scripts and sync-group seems to have changed.
It was

VyOS 1.2.x only supports transition scripts at the VRRP group level, like:

high-availability {
    vrrp {
        group eth1 {
            advertise-interval 1
            authentication {
                password test
                type plaintext-password
            }
            hello-source-address 10.10.0.11
            interface eth1
            no-preempt
            transition-script {
                backup /config/scripts/vrrp-fail.sh
                fault /config/scripts/vrrp-fail.sh
                master /config/scripts/vrrp-master.sh
            }
            virtual-address 192.168.240.1/24
            vrid 101
        }
    }
}

If you use sync-groups to synchronize multiple VRRP groups, you have to define the transition scripts at the sync-group level, according to the man page of keepalived 2.x:
(https://www.keepalived.org/manpage.html)

vrrp_sync_group <STRING> {
    ....
    notify_master /path/to_master.sh [username [groupname]]
    ....
}

For example, with the following configuration no transition scripts are executed:

high-availability {
    vrrp {
        group eth0 {
            advertise-interval 1
            authentication {
                password test
                type plaintext-password
            }
            hello-source-address 10.128.2.222
            interface eth0
            no-preempt
            virtual-address 192.168.241.1/24
            vrid 100
        }
        group eth1 {
            advertise-interval 1
            authentication {
                password test
                type plaintext-password
            }
            hello-source-address 10.128.2.222
            interface eth1
            no-preempt
            transition-script {
                backup /config/scripts/vrrp-fail.sh
                fault /config/scripts/vrrp-fail.sh
                master /config/scripts/vrrp-master.sh
            }
            virtual-address 192.168.240.1/24
            virtual-address 192.168.240.5/24
            virtual-address 192.168.240.10/24
            vrid 101
        }
        sync-group TEST {
            member eth0
            member eth1
        }
    }
}

Unfortunately, the sync-group config node only has "member" children.
In order to support sync-group together with transition scripts this needs to be extended!
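
As a rough illustration of what such an extension would have to produce, here is a minimal Python sketch that renders a keepalived vrrp_sync_group block with notify_master/notify_backup/notify_fault lines, as described in the man page excerpt above. The function name render_sync_group and its parameters are hypothetical and this is not the actual VyOS config generator; it only shows the shape of the keepalived configuration a sync-group level transition-script node might map to.

```python
from typing import Dict, List


def render_sync_group(name: str, members: List[str], scripts: Dict[str, str]) -> str:
    """Render a keepalived vrrp_sync_group block (hypothetical helper).

    `scripts` maps a state ("master", "backup", "fault") to a script path.
    States without a script produce no notify line.
    """
    lines = [f"vrrp_sync_group {name} {{", "    group {"]
    lines += [f"        {member}" for member in members]
    lines.append("    }")
    # Emit one notify_<state> line per configured script, in a fixed order.
    for state in ("master", "backup", "fault"):
        if state in scripts:
            lines.append(f"    notify_{state} {scripts[state]}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Names and paths taken from the example configuration in this ticket.
    print(render_sync_group(
        "TEST",
        ["eth0", "eth1"],
        {
            "master": "/config/scripts/vrrp-master.sh",
            "backup": "/config/scripts/vrrp-fail.sh",
            "fault": "/config/scripts/vrrp-fail.sh",
        },
    ))
```

With the scripts attached to the sync group rather than to a single member, keepalived would be told to run them once per sync-group transition instead of per VRRP instance.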

And since version 1.2.0 there is probably no workaround if you have more than one VRRP group together with a sync-group.

Regards
Markus

Details

Difficulty level
Unknown (require assessment)
Version
VyOS 1.2.x
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Perfectly compatible

Event Timeline

adestis created this task. Fri, Nov 22, 4:30 PM
Dmitry added a subscriber: Dmitry. Fri, Nov 22, 5:46 PM
pasik added a subscriber: pasik. Fri, Nov 22, 7:21 PM
syncer assigned this task to jestabro. Sat, Nov 23, 12:06 PM
syncer triaged this task as High priority.
syncer added a project: VyOS 1.3 Equuleus.
syncer reassigned this task from jestabro to Dmitry. Sat, Nov 23, 12:17 PM
syncer added a subscriber: jestabro.

Hello, I created a lab with a sync-group configuration. In this lab I use VyOS 1.2.2 on all routers, to match the setup in your ticket.


Using the following configuration:

set high-availability vrrp group LAN2 interface 'eth1'
set high-availability vrrp group LAN2 transition-script backup '/config/scripts/vrrp-backup.sh'
set high-availability vrrp group LAN2 transition-script fault '/config/scripts/vrrp-fault.sh'
set high-availability vrrp group LAN2 transition-script master '/config/scripts/vrrp-master.sh'
set high-availability vrrp group LAN2 virtual-address '192.168.0.254/24'
set high-availability vrrp group LAN2 vrid '10'
set high-availability vrrp group MIDLE interface 'eth0'
set high-availability vrrp group MIDLE virtual-address '172.16.0.254/24'
set high-availability vrrp group MIDLE vrid '20'
set high-availability vrrp sync-group SG member 'LAN2'
set high-availability vrrp sync-group SG member 'MIDLE'

Router-R4 is in the VRRP MASTER state:

vyos@R4# run show vrrp 
Name    Interface      VRID  State    Last Transition
------  -----------  ------  -------  -----------------
LAN2    eth1             10  MASTER   16m52s
MIDLE   eth0             20  MASTER   16m52s

When we disable the link on eth0, the scripts configured for group LAN2 are executed on all routers: vrrp-fault.sh on Router-R4 and vrrp-master.sh on Router-R3.
Log on Router-R3

Nov 23 13:24:34 R3 Keepalived_vrrp[2105]: (MIDLE) ip address associated with VRID 20 not present in MASTER advert : 172.16.0.254
Nov 23 13:24:51 R3 Keepalived_vrrp[2105]: message repeated 16 times: [ (MIDLE) ip address associated with VRID 20 not present in MASTER advert : 172.16.0.254]
Nov 23 13:24:54 R3 Keepalived_vrrp[2105]: (LAN2) Backup received priority 0 advertisement
Nov 23 13:24:54 R3 Keepalived_vrrp[2105]: (MIDLE) ip address associated with VRID 20 not present in MASTER advert : 172.16.0.254
Nov 23 13:24:56 R3 Keepalived_vrrp[2105]: message repeated 2 times: [ (MIDLE) ip address associated with VRID 20 not present in MASTER advert : 172.16.0.254]
Nov 23 13:24:57 R3 Keepalived_vrrp[2105]: (MIDLE) Entering MASTER STATE
Nov 23 13:24:57 R3 Keepalived_vrrp[2105]: VRRP_Group(SG) Syncing instances to MASTER state
Nov 23 13:24:57 R3 Keepalived_vrrp[2105]: (LAN2) Entering MASTER STATE
Nov 23 13:24:57 R3 vyos-vrrp-wrapper: Running transition script /config/scripts/vrrp-master.sh for VRRP group LAN2

Log on Router-R4

Nov 23 13:24:55 R4 Keepalived_vrrp[1623]: Netlink reports eth0 down
Nov 23 13:24:55 R4 Keepalived_vrrp[1623]: (MIDLE) Entering FAULT STATE
Nov 23 13:24:55 R4 Keepalived_vrrp[1623]: (MIDLE) sent 0 priority
Nov 23 13:24:55 R4 Keepalived_vrrp[1623]: VRRP_Group(SG) Syncing instances to FAULT state
Nov 23 13:24:55 R4 Keepalived_vrrp[1623]: (LAN2) Entering FAULT STATE
Nov 23 13:24:55 R4 sudo: pam_unix(sudo:session): session closed for user root
Nov 23 13:24:56 R4 vyos-vrrp-wrapper: Running transition script /config/scripts/vrrp-fault.sh for VRRP group LAN2

VyOS 1.2.2 uses keepalived version 2.0.10:

vyos@R4# sudo /usr/sbin/keepalived --version
Keepalived v2.0.10 (11/12,2018)

Everything seems to work as expected. Can you provide your logs and the output of the show vrrp command on both sides?
Which VyOS version are you using in this case?

Note: When we run restart vrrp on the MASTER, the files holding the VRRP interface states in /run/vyos/vrrp/ are not deleted. The next time, the scripts do not run because a state is already recorded.
We need to review this task together with T1350.
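
To make the stale-state effect described in the note easier to follow, here is a minimal Python sketch of a "run the script only when the state changed" check. It is an assumption about the general mechanism, not the actual vyos-vrrp-wrapper code; the function name should_run_script and the per-group file name "<group>.state" are hypothetical.

```python
import tempfile
from pathlib import Path


def should_run_script(state_dir: Path, group: str, new_state: str) -> bool:
    """Return True if new_state differs from the recorded state, and record it."""
    state_file = state_dir / f"{group}.state"  # hypothetical file name
    old_state = state_file.read_text().strip() if state_file.exists() else None
    if old_state == new_state:
        # The same state is already recorded, so the script is skipped.
        return False
    state_file.write_text(new_state)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        state_dir = Path(tmp)
        # First transition to MASTER: nothing recorded yet.
        print(should_run_script(state_dir, "LAN2", "master"))
        # A restart that leaves the state file behind: same state again.
        print(should_run_script(state_dir, "LAN2", "master"))
        # Removing the stale file first lets the script run again.
        (state_dir / "LAN2.state").unlink()
        print(should_run_script(state_dir, "LAN2", "master"))
```

Under this assumption, a state file left behind by a restart makes the next transition to the same state look like a repeat, which would explain why the scripts are not run.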