
DHCP-FO with multiple subnets results in invalid/non-functioning dhcpd.conf configuration file output
Closed, Resolved | Public | BUG

Description

Greetings everyone. I'm so excited to [hopefully] be contributing to the project!

I am trying to use the DHCP failover (DHCP-FO) features in VyOS 1.3, and I am doing it for multiple subnets.

The current mechanism for establishing DHCP-FO is at the subnet level. You specify your local IP, the remote peer, whether you are "primary" or "secondary", and a "name" that is supposed to be unique across your installation.

So the following example configuration:

set service dhcp-server shared-network-name sfo1-server authoritative
set service dhcp-server shared-network-name sfo1-server description 'SFO1 - Server Subnet'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 default-router '10.3.30.1'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 failover local-address '10.3.30.15'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 failover name 'sfo1-server'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 failover peer-address '10.3.30.16'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 failover status 'primary'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 range 10.3.30.0 start '10.3.30.100'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 range 10.3.30.0 stop '10.3.30.254'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 subnet-parameters 'ping-check true;'
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 subnet-parameters 'ping-timeout 3;'

set service dhcp-server shared-network-name sfo1-desktop authoritative
set service dhcp-server shared-network-name sfo1-desktop description 'SFO1 - Desktop Subnet'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 default-router '10.3.50.1'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 failover local-address '10.3.30.15'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 failover name 'sfo1-desktop'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 failover peer-address '10.3.30.16'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 failover status 'primary'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 range 10.3.50.0 start '10.3.50.100'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 range 10.3.50.0 stop '10.3.50.254'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 subnet-parameters 'ping-check true;'
set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 subnet-parameters 'ping-timeout 3;'

will yield the following lines in the resulting /run/dhcp-server/dhcpd.conf:

# Failover configuration for 10.3.50.0/24
failover peer "sfo1-desktop" {
    primary;
    mclt 1800;
    split 128;
    address 10.3.30.15;
    port 520;
    peer address 10.3.30.16;
    peer port 520;
    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;
}
# Failover configuration for 10.3.30.0/24
failover peer "sfo1-server" {
    primary;
    mclt 1800;
    split 128;
    address 10.3.30.15;
    port 520;
    peer address 10.3.30.16;
    peer port 520;
    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;
}

The problem is that this is an erroneous configuration: only the first failover peer definition takes effect and binds to the ports. It would appear no one actually uses the DHCP-FO functionality on multiple subnets simultaneously, as it does not actually work!

I would submit to you that this is not in fact the way the ISC DHCPd failover mechanism is designed to work, as demonstrated in the following document: https://kb.isc.org/docs/aa-00502

The failover partner definition should be globally defined, and that name should then be referenced inside each "pool" statement where it is intended to be used. Not only is this consistent with how the dhcpd.conf file is structured normally, but it requires a lot less duplication of data which keeps the VyOS configuration cleaner.
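
To make that layout concrete, here is a rough sketch in Python that renders a dhcpd.conf fragment with a single global failover peer and one pool per subnet referencing it by name. This is only an illustration, not the actual VyOS Jinja2 template: the function `render_dhcpd` and the dictionary keys are hypothetical, the peer name `sfo1-failover` is just an example, and the timer values are copied from the generated output shown above.

```python
import ipaddress


def render_dhcpd(peer, subnets):
    """Render one global failover peer plus pools that reference it by name.

    Hypothetical helper for illustration only; not part of VyOS.
    """
    lines = [
        f'failover peer "{peer["name"]}" {{',
        f'    {peer["status"]};',
        f'    address {peer["local"]};',
        '    port 520;',
        f'    peer address {peer["remote"]};',
        '    peer port 520;',
        '    max-response-delay 30;',
        '    max-unacked-updates 10;',
        '    load balance max seconds 3;',
    ]
    # mclt and split are only valid on the primary server.
    if peer["status"] == "primary":
        lines += ['    mclt 1800;', '    split 128;']
    lines.append('}')

    for s in subnets:
        net = ipaddress.ip_network(s["cidr"])
        lines += [
            f'subnet {net.network_address} netmask {net.netmask} {{',
            f'    option routers {s["router"]};',
            '    pool {',
            # Each pool points at the single global peer definition.
            f'        failover peer "{peer["name"]}";',
            f'        range {s["start"]} {s["stop"]};',
            '    }',
            '}',
        ]
    return "\n".join(lines)


peer = {"name": "sfo1-failover", "status": "primary",
        "local": "10.3.30.15", "remote": "10.3.30.16"}
subnets = [
    {"cidr": "10.3.30.0/24", "router": "10.3.30.1",
     "start": "10.3.30.100", "stop": "10.3.30.254"},
    {"cidr": "10.3.50.0/24", "router": "10.3.50.1",
     "start": "10.3.50.100", "stop": "10.3.50.254"},
]
conf = render_dhcpd(peer, subnets)
print(conf)
```

The point of the sketch is that the peer block is emitted exactly once, however many subnets use it, so nothing tries to bind the same address and port twice.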

I am proposing a change to the configuration commands for this service/feature.

Perhaps something like this makes sense:

set service dhcp-server failover sfo1-failover local-address '10.3.30.15'
set service dhcp-server failover sfo1-failover peer-address '10.3.30.16'
set service dhcp-server failover sfo1-failover status 'primary'

set service dhcp-server shared-network-name sfo1-desktop subnet 10.3.50.0/24 failover sfo1-failover
set service dhcp-server shared-network-name sfo1-server subnet 10.3.30.0/24 failover sfo1-failover

This is something I believe that I am capable of contributing, but this will be my first time contributing to the VyOS project - so I want to make sure I follow all of the proper procedures.

Thank you in advance!

  • Joel C

Details

Difficulty level
Normal (likely a few hours)
Version
1.3.0-rc4
Why the issue appeared?
Design mistake
Is it a breaking change?
Config syntax change (migratable)
Issue type
Bug (incorrect behavior)

Event Timeline

oh good grief this is an old problem.. Just found a reference here while researching: https://community.ui.com/questions/DHCP-Failover-Configuration-Multiple-VLAN-interfaces/da7a0f03-2c4e-4d9f-9924-c2297db177db

This person identified this as a workaround:
I ran into this too, you don't need to manually edit the config file though. Define the failover config as usual in one of your subnets, then in the other subnets, inject the reference to the failover peer:

set service dhcp-server shared-network-name <name> shared-network-parameters "failover peer &quot;dhcp-failover&quot;;"

Make sure you use the same name for all of them.

This makes me sad. It appears no one cares about this issue, and it's been allowed to exist for so long!

I just completed some additional experimentation, and it does not appear that this workaround functions any longer, based on how the current DHCPD Jinja2 template is configured.

Can I get some buy-in on changing the way the CLI handles this? What is the ideal way to go about doing this? I believe the "old way" could be allowed as a form of backwards compatibility, while still allowing the more "correct" configuration.

Alternatively, it would not be difficult to "migrate" these configuration lines over to the new style. I'm guessing there is already an established pattern for handling that as well?
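
For illustration, a migration along those lines could be sketched like this. It works on plain `set` command strings rather than on the real VyOS config tree, and it targets the syntax proposed above, so treat it as a sketch under those assumptions: the `migrate` function and the `OLD` pattern are hypothetical names, and it assumes every old-style failover block carries `local-address`, `peer-address` and `status`. Subnets that share the same address pair and status are folded into one global peer, named after the first `name` encountered.

```python
import re

# Matches the old per-subnet failover leaves (hypothetical pattern).
OLD = re.compile(
    r"^set service dhcp-server shared-network-name (?P<net>\S+) "
    r"subnet (?P<subnet>\S+) "
    r"failover (?P<key>local-address|peer-address|status|name) "
    r"'(?P<value>[^']*)'$"
)


def migrate(lines):
    """Rewrite old per-subnet failover commands into the proposed style."""
    kept, blocks = [], {}
    for line in lines:
        m = OLD.match(line.strip())
        if m:
            # Collect the failover leaves for each (network, subnet) pair.
            blocks.setdefault((m["net"], m["subnet"]), {})[m["key"]] = m["value"]
        else:
            kept.append(line)

    peers, refs = {}, []
    for (net, subnet), opts in blocks.items():
        # Raises KeyError if a block is incomplete (assumption: it never is).
        ident = (opts["local-address"], opts["peer-address"], opts["status"])
        # Identical peers collapse into one definition; first name wins.
        name = peers.setdefault(ident, opts.get("name", f"{net}-failover"))
        refs.append(
            f"set service dhcp-server shared-network-name {net} "
            f"subnet {subnet} failover {name}"
        )

    peer_lines = []
    for (local, remote, status), name in peers.items():
        base = f"set service dhcp-server failover {name}"
        peer_lines += [
            f"{base} local-address '{local}'",
            f"{base} peer-address '{remote}'",
            f"{base} status '{status}'",
        ]
    return kept + peer_lines + refs


prefix = "set service dhcp-server shared-network-name"
old = [
    f"{prefix} sfo1-server subnet 10.3.30.0/24 default-router '10.3.30.1'",
    f"{prefix} sfo1-server subnet 10.3.30.0/24 failover local-address '10.3.30.15'",
    f"{prefix} sfo1-server subnet 10.3.30.0/24 failover name 'sfo1-server'",
    f"{prefix} sfo1-server subnet 10.3.30.0/24 failover peer-address '10.3.30.16'",
    f"{prefix} sfo1-server subnet 10.3.30.0/24 failover status 'primary'",
    f"{prefix} sfo1-desktop subnet 10.3.50.0/24 failover local-address '10.3.30.15'",
    f"{prefix} sfo1-desktop subnet 10.3.50.0/24 failover name 'sfo1-desktop'",
    f"{prefix} sfo1-desktop subnet 10.3.50.0/24 failover peer-address '10.3.30.16'",
    f"{prefix} sfo1-desktop subnet 10.3.50.0/24 failover status 'primary'",
]
new = migrate(old)
print("\n".join(new))
```

Lines that are not failover leaves pass through unchanged, so the same approach could run over a whole configuration.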

Thanks in advance!

// Joel

c-po set Issue type to Unspecified (please specify).
c-po changed the task status from Open to In progress. Sep 19 2021, 7:30 AM
c-po triaged this task as Normal priority.
c-po added a project: VyOS 1.4 Sagitta.
c-po changed Difficulty level from Unknown (require assessment) to Normal (likely a few hours).
c-po changed Why the issue appeared? from Will be filled on close to Design mistake.
c-po changed Issue type from Unspecified (please specify) to Bug (incorrect behavior).