
DHCP client sometimes ignores `no-default-route` option of an interface
Closed, Resolved | Public | BUG

Description

Hi,

I have a multihomed network with two internet connections. Until recently these were cable and DSL, but now I’m trying to replace DSL with Starlink.

Starlink has two modes of operation:

  • All-in-one mode with the Wi-Fi router enabled, which allocates IPv4 addresses via DHCP from the 192.168.1.0/24 range
  • Router bypass mode, which allocates IPv4 addresses from the 100.64.0.0/10 range (CGNAT)

For obvious reasons I’m more interested in the latter. The configuration should work identically to my cable connection, but for some reason it doesn’t.

And here are some more details:

r24:~$ show version

Version:          VyOS 1.3.0-epa3
Release train:    equuleus

Built by:         Sentrium S.L.
Built on:         Sun 31 Oct 2021 17:38 UTC
Build UUID:       383e45ad-b32a-4359-8183-9baacc8e69d9
Build commit ID:  bb511522cc3bb2-dirty

Architecture:     x86_64
Boot via:         installed image
System type:      KVM guest

Hardware vendor:  QEMU
Hardware model:   Standard PC (Q35 + ICH9, 2009)
Hardware S/N:     
Hardware UUID:    18a48ed6-0124-41b0-a4f8-bbbca875d989

Copyright:        VyOS maintainers and contributors

Configuration of the interface in question:

r24# show interfaces ethernet eth4
 address dhcp
 description WAN-STARLINK
 dhcp-options {
     no-default-route
 }
 hw-id 00:1b:21:8c:bd:a3
 vrf STARLINK
[edit]

The routing table:

r24:~$ show ip route vrf STARLINK 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup

VRF STARLINK:
S>* 0.0.0.0/0 [1/0] via 100.64.0.1, eth4, weight 1, 10:30:21
K * 0.0.0.0/0 [255/8192] unreachable (ICMP unreachable), 10:32:18
S>* 34.120.255.244/32 [1/0] is directly connected, eth4, weight 1, 10:30:21
C>* 100.64.0.0/10 is directly connected, eth4, 10:30:21
S>* 192.168.100.1/32 [1/0] is directly connected, eth4, weight 1, 10:30:21

(Note the administrative distance of the default route, and that its very existence contradicts the interface configuration.)

Log from the DHCP client:

r24:~$ show log dhcp client
...
Apr 30 22:35:44 systemd[1]: Starting DHCP client on eth4...
Apr 30 22:35:44 systemd[1]: Started DHCP client on eth4.
Apr 30 22:35:44 dhclient-script-vyos[26939]: Current dhclient PID: 26938, Parent PID: 26937, IP version: 4, All dhclients for interface eth4: 26937 26938
Apr 30 22:35:44 dhclient-script-vyos[26939]: Passing command to /usr/sbin/ip: "link set dev eth4 up"
Apr 30 22:35:44 dhclient-script-vyos[26939]: No changes to apply via vyos-hostsd-client
Apr 30 22:35:44 dhclient[26938]: DHCPDISCOVER on eth4 to 255.255.255.255 port 67 interval 7
Apr 30 22:35:44 dhclient[26938]: DHCPOFFER of 100.123.57.53 from 100.64.0.1
Apr 30 22:35:44 dhclient[26938]: DHCPREQUEST for 100.123.57.53 on eth4 to 255.255.255.255 port 67
Apr 30 22:35:44 dhclient[26938]: DHCPACK of 100.123.57.53 from 100.64.0.1
Apr 30 22:35:44 dhclient-script-vyos[26961]: Current dhclient PID: 26938, Parent PID: 1, IP version: 4, All dhclients for interface eth4: 26938
Apr 30 22:35:44 dhclient-script-vyos[26961]: Passing command to /usr/sbin/ip: "-4 addr add 100.123.57.53/255.192.0.0 broadcast 100.127.255.255 valid_lft 300 preferred_lft 300 dev eth4 label eth4"
Apr 30 22:35:44 dhclient-script-vyos[26961]: Passing command to /usr/sbin/ip: "link set dev eth4 mtu 1500"
Apr 30 22:35:44 dhclient-script-vyos[26961]: Deleting nameservers with tag "dhcp-eth4" via vyos-hostsd-client
Apr 30 22:35:44 dhclient-script-vyos[26961]: Adding nameservers "1.1.1.1 8.8.8.8" with tag "dhcp-eth4" via vyos-hostsd-client
Apr 30 22:35:44 dhclient-script-vyos[26961]: Applying changes via vyos-hostsd-client
Apr 30 22:35:44 dhclient-script-vyos[26961]: No changes to apply via vyos-hostsd-client
Apr 30 22:35:44 dhclient-script-vyos[26961]: FRR status: running
Apr 30 22:35:44 dhclient-script-vyos[26961]: Checking if the route presented in kernel: 192.168.100.1/32 dev eth4
Apr 30 22:35:44 dhclient-script-vyos[26961]: Converted vtysh command: "ip route 192.168.100.1/32  eth4 tag 210  vrf STARLINK"
Apr 30 22:35:44 dhclient-script-vyos[26961]: Sending command to vtysh
Apr 30 22:35:44 dhclient-script-vyos[26961]: FRR status: running
Apr 30 22:35:44 dhclient-script-vyos[26961]: Checking if the route presented in kernel: 34.120.255.244/32 dev eth4
Apr 30 22:35:44 dhclient-script-vyos[26961]: Converted vtysh command: "ip route 34.120.255.244/32  eth4 tag 210  vrf STARLINK"
Apr 30 22:35:44 dhclient-script-vyos[26961]: Sending command to vtysh
Apr 30 22:35:44 dhclient-script-vyos[26961]: FRR status: running
Apr 30 22:35:44 dhclient-script-vyos[26961]: Checking if the route presented in kernel: 0.0.0.0/0 via 100.64.0.1 dev eth4
Apr 30 22:35:44 dhclient-script-vyos[26961]: Converted vtysh command: "ip route 0.0.0.0/0 100.64.0.1 eth4 tag 210  vrf STARLINK"
Apr 30 22:35:44 dhclient-script-vyos[26961]: Sending command to vtysh
Apr 30 22:35:45 dhclient[26938]: bound to 100.123.57.53 -- renewal in 150 seconds.

Expected behavior:

  • no default route created

Actual behavior:

  • the default route is created with an administrative distance of 1, which cannot be overridden

Additional details:

  • The interface runs in a separate VRF, but the behaviour is identical in the default VRF. I’m exploring the option of isolating the Starlink routes.
  • The DHCP client log shows that the route is injected with ... tag 210, which I assume is the distance, but it ends up with a distance of 1 in the routing table.
  • Also, if I switch Starlink into the default mode with the Wi-Fi router enabled, it allocates an address and routes with the correct distance. In theory I could use that as a workaround (even though double NAT does not inspire confidence), but the bigger problem is that it allocates addresses from the 192.168.1.0/24 range, which conflicts with one of my existing subnets, so it is a no-go.
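
The gap between "tag 210" and the distance of 1 can be seen by picking apart the vtysh command from the log. The sketch below is a deliberately simplified parser, not FRR's own: it assumes FRR's static-route syntax, in which `tag N` sets a route tag and a bare number sets the administrative distance, and it handles only the keywords that appear in the log. The function name `parse_static_route` is hypothetical, and the second command is a hypothetical variant that is not taken from the log.

```python
def parse_static_route(cmd: str) -> dict:
    """Split an FRR-style 'ip route ...' command into its parts.

    Simplified sketch: only understands 'tag N', 'vrf NAME', a bare
    numeric distance, and next-hop address/interface tokens.
    """
    tokens = cmd.split()
    if len(tokens) < 4 or tokens[:2] != ["ip", "route"]:
        raise ValueError(f"not a static route command: {cmd!r}")

    route = {"prefix": tokens[2], "nexthops": [],
             "tag": None, "distance": None, "vrf": None}
    i = 3
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("tag", "vrf"):
            if i + 1 >= len(tokens):
                raise ValueError(f"'{tok}' needs a value")
            value = tokens[i + 1]
            route[tok] = int(value) if tok == "tag" else value
            i += 2
        elif tok.isdigit():
            # A bare number is the administrative distance (assumption
            # about FRR syntax stated in the lead-in).
            route["distance"] = int(tok)
            i += 1
        else:
            # Anything else is treated as a next-hop address or interface.
            route["nexthops"].append(tok)
            i += 1
    return route


# Command exactly as it appears in the DHCP client log above.
logged = parse_static_route(
    "ip route 0.0.0.0/0 100.64.0.1 eth4 tag 210  vrf STARLINK")
print(logged["tag"], logged["distance"])   # tag is set, distance is not

# Hypothetical variant with an explicit distance after the tag.
variant = parse_static_route(
    "ip route 0.0.0.0/0 100.64.0.1 eth4 tag 210 210 vrf STARLINK")
print(variant["tag"], variant["distance"])
```

In the logged command the number 210 belongs to the `tag` keyword and no distance is given, which is consistent with the `[1/0]` shown for the static default route in the routing table.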

Details

Difficulty level
Unknown (require assessment)
Version
1.3.0-epa3, 1.3.1
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Unspecified (please specify)

Event Timeline

Below is a packet capture from DHCP exchange:


It seems that option 121 has more than one route. Could this be causing the abnormal behavior?

Could you also provide the output of `cat /var/lib/dhcp/dhclient_eth4.leases`?
The `no-default-route` option only ignores the `routers` option; it does not touch other options such as `classless-static-routes`:
https://github.com/vyos/vyos-1x/blob/2c29a3b3b46c7570f4a509f413b208348c0ce647/data/templates/dhcp-client/ipv4.tmpl#L18-L19

r24:/home/dtoubelis# cat /var/lib/dhcp/dhclient_eth4.leases
lease {
  interface "eth4";
  fixed-address 100.123.57.53;
  option subnet-mask 255.192.0.0;
  option relay-agent-information 1:4:0:0:4:cf:5:4:64:40:0:1:97:8:1:0:14:ed:0:0:14:ed:98:0;
  option dhcp-lease-time 300;
  option routers 100.64.0.1;
  option dhcp-message-type 5;
  option domain-name-servers 1.1.1.1,8.8.8.8;
  option dhcp-server-identifier 100.64.0.1;
  option interface-mtu 1500;
  option rfc3442-classless-static-routes 32,192,168,100,1,0,0,0,0,32,34,120,255,244,0,0,0,0,0,100,64,0,1;
  renew 2 2022/05/03 12:42:00;
  rebind 2 2022/05/03 12:44:26;
  expire 2 2022/05/03 12:45:04;
}
lease {
  interface "eth4";
  fixed-address 100.123.57.53;
  option subnet-mask 255.192.0.0;
  option relay-agent-information 1:4:0:0:4:cf:5:4:64:40:0:1:97:8:1:0:14:ed:0:0:14:ed:98:0;
  option dhcp-lease-time 300;
  option routers 100.64.0.1;
  option dhcp-message-type 5;
  option domain-name-servers 1.1.1.1,8.8.8.8;
  option dhcp-server-identifier 100.64.0.1;
  option interface-mtu 1500;
  option rfc3442-classless-static-routes 32,192,168,100,1,0,0,0,0,32,34,120,255,244,0,0,0,0,0,100,64,0,1;
  renew 2 2022/05/03 12:46:34;
  rebind 2 2022/05/03 12:48:50;
  expire 2 2022/05/03 12:49:28;
}
lease {
  interface "eth4";
  fixed-address 100.123.57.53;
  option subnet-mask 255.192.0.0;
  option relay-agent-information 1:4:0:0:4:cf:5:4:64:40:0:1:97:8:1:0:14:ed:0:0:14:ed:98:0;
  option dhcp-lease-time 300;
  option routers 100.64.0.1;
  option dhcp-message-type 5;
  option domain-name-servers 1.1.1.1,8.8.8.8;
  option dhcp-server-identifier 100.64.0.1;
  option interface-mtu 1500;
  option rfc3442-classless-static-routes 32,192,168,100,1,0,0,0,0,32,34,120,255,244,0,0,0,0,0,100,64,0,1;
  renew 2 2022/05/03 12:51:33;
  rebind 2 2022/05/03 12:53:25;
  expire 2 2022/05/03 12:54:03;
}
...
}
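
To make the content of `rfc3442-classless-static-routes` in the leases above easier to read, here is a minimal decoding sketch. It follows the RFC 3442 encoding (prefix length, then only the significant octets of the destination, then a 4-octet gateway) and takes the comma-separated form that dhclient writes to the lease file. The function name `decode_classless_routes` is made up for this illustration and is not part of VyOS or dhclient.

```python
from ipaddress import IPv4Address, IPv4Network


def decode_classless_routes(option: str):
    """Decode a comma-separated RFC 3442 option into (network, gateway) pairs."""
    octets = [int(x) for x in option.split(",")]
    routes = []
    i = 0
    while i < len(octets):
        prefix_len = octets[i]
        if not 0 <= prefix_len <= 32:
            raise ValueError(f"invalid prefix length {prefix_len}")
        # Only ceil(prefix_len / 8) destination octets are encoded.
        n = (prefix_len + 7) // 8
        end = i + 1 + n + 4
        if end > len(octets):
            raise ValueError("truncated option data")
        dest = octets[i + 1:i + 1 + n] + [0] * (4 - n)
        gateway = octets[i + 1 + n:end]
        routes.append((
            IPv4Network((bytes(dest), prefix_len), strict=False),
            IPv4Address(bytes(gateway)),
        ))
        i = end
    return routes


# Option value copied from the lease file above.
lease_value = ("32,192,168,100,1,0,0,0,0,"
               "32,34,120,255,244,0,0,0,0,"
               "0,100,64,0,1")

for network, gateway in decode_classless_routes(lease_value):
    print(f"{network} via {gateway}")
# 192.168.100.1/32 via 0.0.0.0
# 34.120.255.244/32 via 0.0.0.0
# 0.0.0.0/0 via 100.64.0.1
```

The third entry is a default route carried inside option 121 rather than in `option routers`, which matches the three static routes seen in `show ip route vrf STARLINK`.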

Also, these routes are getting an administrative distance of 1, which is impossible to override. I believe the default route from DHCP normally has a distance of 210, which is manageable. So a quick workaround could be to increase the distance of these routes.

It may be a good idea to cherry-pick this for 1.4.x branch.

Viacheslav changed the task status from Open to Backport candidate. Tue, May 10, 10:26 AM
Viacheslav added a project: VyOS 1.4 Sagitta.
Viacheslav moved this task from Need Triage to Finished on the VyOS 1.3 Equuleus (1.3.2) board.
Viacheslav moved this task from Need Triage to Backport Candidates on the VyOS 1.4 Sagitta board.
Viacheslav claimed this task.
Viacheslav moved this task from Backport Candidates to Finished on the VyOS 1.4 Sagitta board.