
VyOS does not support running multiple instances of DHCPv6 clients
Needs testing, Normal, Public, BUG

Description

VyOS currently launches one WIDE dhcp6c daemon per interface that requires it (either to obtain an address or a prefix delegation). However, the WIDE DHCP client does not support running more than one instance per system, because each daemon binds UDP port 546 on the wildcard address ([::]) rather than on a single interface. The output of ss -ulnp with two WIDE DHCP clients running appears as follows:

UNCONN              0                   0                                                          [::]:546                                    [::]:*                  users:(("dhcp6c",pid=2350,fd=4))              
UNCONN              0                   0                                                          [::]:546                                    [::]:*                  users:(("dhcp6c",pid=2389,fd=4))
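The condition visible in this output can be checked mechanically. The following is a minimal sketch, not part of VyOS: the function name `dhcp6c_binders` is hypothetical, and it assumes the column layout of `ss -ulnp` shown above (state, two queue counters, local address, peer address, process list).

```python
import re
from collections import defaultdict

# Sample taken from the output above, with whitespace collapsed.
SS_OUTPUT = """\
UNCONN 0 0 [::]:546 [::]:* users:(("dhcp6c",pid=2350,fd=4))
UNCONN 0 0 [::]:546 [::]:* users:(("dhcp6c",pid=2389,fd=4))
"""

def dhcp6c_binders(ss_text, port=546):
    """Map each local address on `port` to the dhcp6c PIDs bound to it."""
    pids_by_addr = defaultdict(list)
    for line in ss_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # header or blank line
        local = fields[3]  # fourth column is the local address:port
        if not local.endswith(f":{port}"):
            continue
        for match in re.finditer(r'\("dhcp6c",pid=(\d+),', line):
            pids_by_addr[local].append(int(match.group(1)))
    return dict(pids_by_addr)

binders = dhcp6c_binders(SS_OUTPUT)
for addr, pids in binders.items():
    if len(pids) > 1:
        print(f"{len(pids)} dhcp6c instances share {addr}: {pids}")
```

Any address that maps to more than one PID indicates the conflict described in this report.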

The result is that DHCPv6 packets reaching the system end up being processed by the wrong dhcp6c instance, so DHCPv6 exchanges fail on all but one interface (all packets seem to be delivered consistently to the same dhcp6c instance). This can be further demonstrated by the output of journalctl:

dhcp6c[2350]: dhcp6_reset_timer: reset a timer on eth1, state=SOLICIT, timeo=40, retrans=123876
dhcp6c[2389]: client6_recv: unexpected interface (24)

This shows that packets meant for the dhcp6c with PID 2350 are instead being consumed by the dhcp6c with PID 2389.
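The underlying socket behaviour can be reproduced without dhcp6c. The sketch below is an illustration under stated assumptions, not the daemon's code: it assumes both sockets set SO_REUSEPORT before binding (which is what lets two processes share a wildcard UDP port on Linux), it uses an ephemeral port instead of 546 so it runs without root, and the function name `count_receivers` is hypothetical. It sends one unicast datagram and counts how many of the bound sockets actually receive it.

```python
import select
import socket

def count_receivers(family=socket.AF_INET6, n_sockets=2):
    """Bind n_sockets UDP sockets to the same wildcard port, send one
    unicast datagram to that port, and return how many sockets got it."""
    is_v6 = family == socket.AF_INET6
    wildcard = "::" if is_v6 else "0.0.0.0"
    loopback = "::1" if is_v6 else "127.0.0.1"
    socks = []
    port = 0  # first bind picks an ephemeral port, the rest reuse it
    try:
        for _ in range(n_sockets):
            s = socket.socket(family, socket.SOCK_DGRAM)
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
            s.bind((wildcard, port))
            port = s.getsockname()[1]
            s.setblocking(False)
            socks.append(s)

        with socket.socket(family, socket.SOCK_DGRAM) as sender:
            sender.sendto(b"reply", (loopback, port))

        # Wait until at least one socket is readable (or time out).
        select.select(socks, [], [], 1.0)

        received = 0
        for s in socks:
            try:
                s.recvfrom(1024)
                received += 1
            except BlockingIOError:
                pass  # this socket never saw the datagram
        return received
    finally:
        for s in socks:
            s.close()

if __name__ == "__main__":
    print("sockets that received the datagram:", count_receivers())
```

Each datagram is handed to only one of the sockets sharing the port, and the kernel, not the application, chooses which one. A daemon that only cares about one interface therefore has no way to claim the packets addressed to it.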

The net effect is that only a single DHCPv6 client can ever succeed per system, breaking DHCPv6 on all other interfaces. The interface on which DHCPv6 succeeds may not be stable across reboots either, as it depends on which dhcp6c daemon wins the race to start first.

A simple reproducible configuration would be as follows:

interfaces {
    ethernet eth0 {
        address dhcpv6
    }

    ethernet eth1 {
        address dhcpv6
    }
}

As mentioned, however, any combination of address and prefix delegation requests will trigger the issue, e.g.:

interfaces {
    ethernet eth0 {
        address dhcpv6
    }

    ethernet eth1 {
        dhcpv6-options {
            pd 0 {
                ...
            }
        }
    }
}

This issue has already been encountered and documented for other software router distributions, including OPNsense and pfSense.

I've tried my hand at resolving the issue, but it appears to require major architectural changes to how DHCPv6 clients are handled during interface commit: we would need to construct a single configuration file from all interfaces and then launch a single instance of dhcp6c. One particular quirk of the WIDE client is that it also requires every interface it is meant to process to be passed as an argument at launch, which makes handling dynamic interfaces like PPPoE very difficult (this issue was serious enough that OPNsense maintains their own fork of WIDE to remove this requirement).
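The single-instance approach described above could look roughly like the following. This is a sketch only, not the VyOS implementation: the function names, the `(interface, kind)` request tuples, and the file path are hypothetical. It assumes the usual dhcp6c.conf statements (`interface`, `send ia-na` / `send ia-pd`, `id-assoc`) and the usual invocation of dhcp6c with a config file followed by the interface names. The contents of the pd block are elided in the report, so the generated `id-assoc pd` body is left empty here.

```python
def render_dhcp6c_conf(requests):
    """Render one dhcp6c.conf covering all interfaces.

    requests: list of (ifname, kind) where kind is "na" (address)
    or "pd" (prefix delegation). Each request gets its own IA ID so
    that the merged file has no colliding id-assoc statements.
    """
    lines = []
    for iaid, (ifname, kind) in enumerate(requests):
        if kind not in ("na", "pd"):
            raise ValueError(f"unknown request kind: {kind!r}")
        lines += [
            f"interface {ifname} {{",
            f"    send ia-{kind} {iaid};",
            "};",
            # prefix-interface statements for pd are omitted (elided in the report)
            f"id-assoc {kind} {iaid} {{",
            "};",
            "",
        ]
    return "\n".join(lines)

def dhcp6c_argv(conf_path, requests):
    """Build the command line for a single dhcp6c covering every interface."""
    interfaces = []
    for ifname, _ in requests:
        if ifname not in interfaces:
            interfaces.append(ifname)  # keep order, drop duplicates
    return ["dhcp6c", "-c", conf_path] + interfaces

# The second reproducer from this report: address on eth0, PD on eth1.
requests = [("eth0", "na"), ("eth1", "pd")]
conf = render_dhcp6c_conf(requests)
argv = dhcp6c_argv("/run/dhcp6c/dhcp6c.conf", requests)  # hypothetical path
print(conf)
print(" ".join(argv))
```

The difficulty noted above is visible in `dhcp6c_argv`: the interface list is fixed at launch, so an interface that appears later (such as a PPPoE session) would require regenerating the file and restarting the one shared daemon.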

An alternative solution can be to abandon WIDE in favor of a DHCPv6 client that can be launched per interface.

Details

Difficulty level
Hard (possibly days)
Version
1.4
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

A patch to the WIDE DHCPv6 client seems to be sufficient to resolve this issue with respect to the way VyOS currently uses the daemon (one daemon per configured interface). PRs below:

https://github.com/vyos/vyos-build/pull/273
https://github.com/vyos/vyos-build/pull/274

Read the note in PR 273 for additional explanation of the caveats of this patch.

c-po changed the task status from Open to Needs testing. (Sat, Nov 19, 5:59 AM)
c-po assigned this task to initramfs.
c-po triaged this task as Normal priority.
c-po moved this task from Need Triage to Finished on the VyOS 1.4 Sagitta board.