VRRP rfc3768-compatibility not working correctly when resulting interface name is over 15 characters
Closed, Resolved | Public | BUG

Description

When creating a VRRP group with rfc3768-compatibility enabled, a VRRP interface is created with the name format <parent interface>v<VRID>v<IP Version>. For example, creating an IPv4 group using VRID 1 on the interface eth0 would create an interface named eth0v1v4.
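
As a minimal sketch of the naming format described above (the helper name vmac_ifname is hypothetical and is not taken from the VyOS code base):

```python
def vmac_ifname(parent: str, vrid: int, ip_version: int) -> str:
    """Build a name in the format <parent interface>v<VRID>v<IP Version>."""
    return f"{parent}v{vrid}v{ip_version}"


# IPv4 group with VRID 1 on eth0, as in the example above.
print(vmac_ifname("eth0", 1, 4))  # eth0v1v4
```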

I've noticed that the generated interface name appears to be limited to 15 characters. Once the generated name exceeds this length, the interface appears to be created using the format vrrp.<VRID> instead. In this scenario the virtual IP appears to respond correctly; however, the interface and its IP don't appear in show interfaces, which reduces visibility of the current running config.

There are many scenarios that can result in an interface name exceeding the 15 character limit, such as QinQ interfaces, VLANs on bonds, etc.
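
To make the arithmetic concrete, here is a rough sketch of how many characters are left for the parent interface once the v<VRID>v<IP Version> suffix is added. The 15 character limit is the one observed in this report; the names MAX_IFNAME_LEN and max_parent_len are made up for illustration.

```python
MAX_IFNAME_LEN = 15  # limit observed in this report


def max_parent_len(vrid: int, ip_version: int) -> int:
    """Characters left for the parent name after the v<VRID>v<IP Version> suffix."""
    suffix = f"v{vrid}v{ip_version}"
    return MAX_IFNAME_LEN - len(suffix)


budget = max_parent_len(1, 4)  # suffix "v1v4" uses 4 characters
for parent in ("eth1.100.20", "eth2.100.200"):
    verdict = "fits" if len(parent) <= budget else "too long"
    print(f"{parent}: {len(parent)} chars, budget {budget}: {verdict}")
```

With VRID 1 and IPv4 the parent may be at most 11 characters long, so eth1.100.20 (11) fits and eth2.100.200 (12) does not. A three-digit VRID shrinks the budget by a further two characters.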

Below is output showing both scenarios, where eth1 has a working VRRP group and eth2 has a group that only partially works:

vyos@vyos:~$ sh vrrp
Name    Interface          VRID  State      Priority  Last Transition
------  ---------------  ------  -------  ----------  -----------------
eth1    eth1.100.20v1v4       1  MASTER          100  32s
eth2    vrrp.1                2  MASTER          100  36s

vyos@vyos:~$ sh int
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address       MAC                VRF        MTU  S/L    Description
---------------  ---------------  -----------------  -------  -----  -----  -------------
eth0             172.31.150.3/20  00:15:5d:00:04:32  default   1500  u/u
eth1             -                00:15:5d:00:04:33  default   1500  u/u
eth1.100         -                00:15:5d:00:04:33  default   1500  u/u
eth1.100.20      10.1.0.1/24      00:15:5d:00:04:33  default   1500  u/u
eth1.100.20v1v4  10.1.0.254/24    00:00:5e:00:01:01  default   1500  u/u
eth2             -                00:15:5d:00:04:34  default   1500  u/u
eth2.100         -                00:15:5d:00:04:34  default   1500  u/u
eth2.100.200     10.2.0.1/24      00:15:5d:00:04:34  default   1500  u/u

Relevant interfaces from ifconfig:

eth1.100.20v1v4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.1.0.254  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:00:5e:00:01:01  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 87  bytes 4578 (4.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

vrrp.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.2.0.254  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:00:5e:00:01:01  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 1425  bytes 67462 (65.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Relevant interfaces from ip address:

37: eth1.100.20v1v4@eth1.100.20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:00:5e:00:01:01 brd ff:ff:ff:ff:ff:ff
    inet 10.1.0.254/24 scope global eth1.100.20v1v4
       valid_lft forever preferred_lft forever


33: vrrp.1@eth2.100.200: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:00:5e:00:01:01 brd ff:ff:ff:ff:ff:ff
    inet 10.2.0.254/24 scope global vrrp.1
       valid_lft forever preferred_lft forever

Relevant config:

high-availability {
    vrrp {
        group eth1 {
            address 10.1.0.254/24 {
            }
            interface eth1.100.20
            rfc3768-compatibility
            vrid 1
        }
        group eth2 {
            address 10.2.0.254/24 {
            }
            interface eth2.100.200
            rfc3768-compatibility
            vrid 1
        }
    }
}
interfaces {

....

    ethernet eth1 {
        hw-id 00:15:5d:00:04:33
        vif-s 100 {
            vif-c 20 {
                address 10.1.0.1/24
            }
        }
    }
    ethernet eth2 {
        hw-id 00:15:5d:00:04:34
        vif-s 100 {
            protocol 802.1q
            vif-c 200 {
                address 10.2.0.1/24
            }
        }
    }

....

}

Details

Difficulty level
Unknown (require assessment)
Version
VyOS 1.4.0-epa2
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

Log output when starting keepalived:

Apr 10 12:27:36 systemd[1]: Started keepalived.service - Keepalive Daemon (LVS and VRRP).
Apr 10 12:27:36 Keepalived[13972]: Starting Keepalived v2.2.7 (01/16,2022)
Apr 10 12:27:36 Keepalived[13972]: Running on Linux 6.6.21-amd64-vyos #1 SMP PREEMPT_DYNAMIC Thu Mar  7 21:32:13 UTC 2024 (built for Linux 5.19.11)
Apr 10 12:27:36 Keepalived[13972]: Command line: '/usr/sbin/keepalived' '--use-file' '/run/keepalived/keepalived.conf' '--pid'
Apr 10 12:27:36 Keepalived[13972]:               '/run/keepalived/keepalived.pid' '--dont-fork'
Apr 10 12:27:36 Keepalived[13972]: Configuration file /run/keepalived/keepalived.conf
Apr 10 12:27:36 Keepalived[13972]: NOTICE: setting config option max_auto_priority should result in better keepalived performance
Apr 10 12:27:36 Keepalived[13972]: Starting VRRP child process, pid=13973
Apr 10 12:27:36 Keepalived_vrrp[13973]: (/run/keepalived/keepalived.conf: Line 33) VMAC interface name 'eth2.100.200v1v4' too long or invalid characters - ignoring
Apr 10 12:27:36 Keepalived_vrrp[13973]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
Apr 10 12:27:36 Keepalived_vrrp[13973]: use_vmac or no_accept/strict specified, but no firewall configured - using nftables
Apr 10 12:27:36 kernel: hv_netvsc 34e814fd-2c98-48fd-ba20-34634c9c0f11 eth1: entered promiscuous mode
Apr 10 12:27:36 kernel: hv_netvsc 5c3d6d23-3071-4cc9-b69e-8b2f3921d615 eth2: entered promiscuous mode
Apr 10 12:27:36 (udev-worker)[13969]: Network interface NamePolicy= disabled on kernel command line.
Apr 10 12:27:36 (udev-worker)[13970]: Network interface NamePolicy= disabled on kernel command line.
Apr 10 12:27:36 Keepalived[13972]: Startup complete
Apr 10 12:27:36 Keepalived_vrrp[13973]: (eth1) Entering BACKUP STATE (init)
Apr 10 12:27:36 Keepalived_vrrp[13973]: (eth2) Entering BACKUP STATE (init)
Apr 10 12:27:36 keepalived-fifo.py[13976]: Starting FIFO pipe for Keepalived
Apr 10 12:27:36 keepalived-fifo.py[13976]: Loaded configuration: {'group': {'eth1': {'address': {'10.1.0.254/24': {}}, 'interface': 'eth1.100.20', 'rfc3768_compatibility': {}, 'vrid': '1'}, 'eth2': {'address': {'10.2.0.254/24': {}}, 'interface': 'eth2.100.200', 'rfc3768_compatibility': {}, 'vrid': '1'}}}
Apr 10 12:27:36 keepalived-fifo.py[13976]: PIPE already exist: /run/keepalived/keepalived_notify_fifo
Apr 10 12:27:36 keepalived-fifo.py[13976]: Message reading start
Apr 10 12:27:36 keepalived-fifo.py[13976]: Message processing start
Apr 10 12:27:36 keepalived-fifo.py[13976]: Received message: INSTANCE "eth1" BACKUP 100
Apr 10 12:27:36 keepalived-fifo.py[13976]: INSTANCE eth1 changed state to BACKUP
Apr 10 12:27:36 keepalived-fifo.py[13976]: Received message: INSTANCE "eth2" BACKUP 100
Apr 10 12:27:36 keepalived-fifo.py[13976]: INSTANCE eth2 changed state to BACKUP
Apr 10 12:27:39 Keepalived_vrrp[13973]: (eth1) Entering MASTER STATE
Apr 10 12:27:39 Keepalived_vrrp[13973]: (eth2) Entering MASTER STATE
Apr 10 12:27:39 keepalived-fifo.py[13976]: Received message: INSTANCE "eth1" MASTER 100
Apr 10 12:27:39 keepalived-fifo.py[13976]: INSTANCE eth1 changed state to MASTER
Apr 10 12:27:39 keepalived-fifo.py[13976]: Received message: INSTANCE "eth2" MASTER 100
Apr 10 12:27:39 keepalived-fifo.py[13976]: INSTANCE eth2 changed state to MASTER
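
The only hint of the problem in the log above is the "VMAC interface name ... too long or invalid characters" line. A rough sketch for pulling the rejected names out of such a log follows; the regular expression is derived solely from the message text shown here, and the function name rejected_vmac_names is hypothetical.

```python
import re

# Pattern based on the keepalived warning quoted in the log above.
VMAC_WARNING = re.compile(
    r"VMAC interface name '(?P<name>[^']+)' too long or invalid characters"
)


def rejected_vmac_names(log_lines):
    """Return the VMAC interface names that keepalived reported as ignored."""
    names = []
    for line in log_lines:
        match = VMAC_WARNING.search(line)
        if match:
            names.append(match.group("name"))
    return names


sample = [
    "Apr 10 12:27:36 Keepalived_vrrp[13973]: (/run/keepalived/keepalived.conf: Line 33) "
    "VMAC interface name 'eth2.100.200v1v4' too long or invalid characters - ignoring",
    "Apr 10 12:27:36 Keepalived[13972]: Startup complete",
]
print(rejected_vmac_names(sample))  # ['eth2.100.200v1v4']
```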

After a bit more digging, it looks like the 15 character limit is a kernel limitation. The issue is therefore more about how interfaces are named when creating VRRP groups, and about the lack of any real handling for the case where the resulting name is longer than 15 characters.
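
For reference, the kernel stores interface names in a buffer of IFNAMSIZ (16) bytes including the terminating NUL, which leaves 15 usable characters. A name could be checked before it is handed to keepalived or ip link with something like the sketch below. The rules beyond the length check (no "/", ":" or whitespace, and not "." or "..") are my understanding of the kernel's name validation and should be treated as an assumption; is_valid_ifname is a hypothetical helper, not an existing VyOS function.

```python
IFNAMSIZ = 16  # kernel buffer size, including the terminating NUL byte


def is_valid_ifname(name: str) -> bool:
    """Approximate the kernel's interface name rules (assumed, see lead-in)."""
    if not name or name in (".", ".."):
        return False
    # At most IFNAMSIZ - 1 bytes are usable for the name itself.
    if len(name.encode()) >= IFNAMSIZ:
        return False
    return not any(ch in "/:" or ch.isspace() for ch in name)


for candidate in ("eth1.100.20v1v4", "eth2.100.200v1v4"):
    print(candidate, is_valid_ifname(candidate))
```

A check of this kind would let a commit fail early with a clear message instead of silently falling back to a different interface name.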

This appears to extend similarly to all interface creation. Attempting to create the interface bond20.1000.1000 for example, errors out when committing, with the following:

Traceback (most recent call last):
  File "/usr/libexec/vyos/conf_mode/interfaces_bonding.py", line 291, in <module>
    apply(c)
  File "/usr/libexec/vyos/conf_mode/interfaces_bonding.py", line 264, in apply
    b.update(bond)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/bond.py", line 477, in update
    super().update(config)
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/interface.py", line 1715, in update
    c_vlan = VLANIf(vif_c_ifname, **tmp)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/interface.py", line 338, in __init__
    self._create()
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/interface.py", line 1800, in _create
    self._cmd(cmd.format(**self.config))
  File "/usr/lib/python3/dist-packages/vyos/ifconfig/control.py", line 52, in _cmd
    return cmd(command, self.debug)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/vyos/utils/process.py", line 155, in cmd
    raise OSError(code, feedback)
OSError: [Errno 255] failed to run command: ip link add link bond20.1000 name bond20.1000.1000 type vlan id 1000
returned:
exit code: 255

noteworthy:
cmd 'nft -c delete element inet vrf_zones ct_iface_map { "bond20" }'
returned (out):

returned (err):
Error: Could not process rule: No such file or directory
delete element inet vrf_zones ct_iface_map { bond20 }
                                             ^^^^^^
cmd 'nft -c delete element inet vrf_zones ct_iface_map { "bond20.1000" }'
returned (out):

returned (err):
Error: Could not process rule: No such file or directory
delete element inet vrf_zones ct_iface_map { bond20.1000 }
                                             ^^^^^^^^^^^
cmd 'ip link add link bond20.1000 name bond20.1000.1000 type vlan id 1000'
returned (out):

returned (err):
Error: argument "bond20.1000.1000" is wrong: "name" not a valid ifname

[[interfaces bonding bond20]] failed
Commit failed

I appreciate that bond20.1000.1000 is a fairly extreme case; however, we use QinQ fairly extensively and would definitely have scenarios with two stacked four-digit VLAN IDs.

That is a different bug and should be tracked in a separate bug report: https://vyos.dev/T6223

Viacheslav triaged this task as Normal priority. Wed, Apr 10, 2:58 PM
Viacheslav changed the task status from Open to Confirmed. Thu, Apr 11, 8:14 AM
Viacheslav changed the task status from Confirmed to In progress. Thu, Apr 11, 8:36 AM
Viacheslav claimed this task.
Viacheslav changed the task status from In progress to Needs testing. Thu, Apr 11, 4:04 PM

Wouldn’t your suggested fix to https://vyos.dev/T6223 also apply here? If the plan is to validate interface name lengths and allow custom names this would be a non-issue.

This is just my proposal for a solution to the problem, which may be approved or rejected.
Setting interface names could be part of 1.5 development but not for 1.4.

Viacheslav moved this task from Need Triage to Finished on the VyOS 1.5 Circinus board.
Viacheslav moved this task from Need Triage to Finished on the VyOS 1.4 Sagitta (1.4.0-epa3) board.