Page MenuHomeVyOS Platform

Loss of connectivity on dhcp enabled ethernet interfaces after abrupt link restarts
Open, NormalPublicBUG

Description

Summary

I have a LTE/5G modem plugged on eth1. The address on this interface is assigned with DHCP and the cable has some issue so some times it disconnects and reconnects. I expect it to start working(modem's interface should be reachable and I should be able to access the internet using that device) after the link is back online but that does not happen.

DHCP lease reports every thing is okay, the interface has address assigned and the route exists but nothing is reachable.

vyos@router# run show dhcp client leases
Interface    eth1
IP address   100.90.76.10                  [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.72 59.144.144.99
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Wed Oct 25 22:41:35 IST 2023
Expiry       Thu Oct 26 22:41:35 IST 2023

Steps to reproduce:

  • Create a ethernet interface which fetches IP with DHCP
  • Disable link(by removing cable or disabling the interface of virtualized instance)
  • Enable link by plugging the cable back in

The exact config for this ethernet interface in my setup

address dhcp
dhcp-options {
    default-route-distance 2
}
hw-id a8:b8:e0:00:4a:d8
redirect ifb0

Expected:

Connectivity to resources on this link should be restored after the link is online

What actually happens:

It stops working. The interface has address from dhcp, the route is there but I can not reach any thing on the other side.

Logs

[188443.255622] igc 0000:02:00.0 eth1: NIC Link is Down
[188475.901593] igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[188481.197040] igc 0000:02:00.0 eth1: NIC Link is Down
[188484.449365] igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

Additional details

I suspect this is happening because when link goes down, the bridged 5g modem discards whatever state it has for the address assigned to the vyos router. When the router link goes back online, vyos tries to use the same address but the 5g modem now has no information about this lease and it doesn't accept any traffic from it.

The way I fix it is by releasing the lease and renewing it which almost always gives me a completely different IP address.

vyos should handle this a bit better. It should release dhcp addresses when the associated link goes down and renew them when the link comes back up.
Back in mikrotik land, The dhcp client works exactly like this

Details

Difficulty level
Normal (likely a few hours)
Version
1.5-rolling-202310190118
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)

Event Timeline

To verify that it isnt something in your 5G modem that triggers this behaviour try to put a L2-switch in between and then simulate a link failure between VyOS and this L2-switch and see how things behaves?

That is:

5G-modem <-> L2-switch <-> VyOS

For example when link goes down and the 5G-modem uses something like "ip verify" along with "dhcp-snooping" then a new DHCP request must arrive before that interface will continue to forward packets.

Could also be if legacy spanning-tree is being used that it will take approx 25 seconds before packets are being forwarded after a linkdown - workaround here is to enable "spanning-tree portfast" or disable spanning-tree.

Some equipment have a "link debounce-timer" which can be set so a short linkdown (like 1000ms or whatever you select as link debounce timer) wont trigger routing and forwarding withdraws.

Could also be if legacy spanning-tree is being used that it will take approx 25 seconds before packets are being forwarded after a linkdown - workaround here is to enable "spanning-tree portfast" or disable spanning-tree.

There is no STP on this link.

Some equipment have a "link debounce-timer" which can be set so a short linkdown (like 1000ms or whatever you select as link debounce timer) wont trigger routing and forwarding withdraws.

Today I learned! Although, I don't believe this particular solution will fix this problem.

I tested it once again and a bit differently this time.

  1. 5G modem is connected and the lease is active for another 14 hours.
  2. I reboot the 5G modem and wait
  3. Modem continues to be unavailable after reboot. The ethernet interface on the router still has the address assigned to it and the ethernet interface went down and up twice.
[38988.682044] igc 0000:02:00.0 eth1: NIC Link is Down
[39021.955890] igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[39027.157608] igc 0000:02:00.0 eth1: NIC Link is Down
[39030.354879] igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

I waited for ~20 minutes and it did not fix it on it's own.

  1. I ran release dhcp interface eth1 and it had no effect. The 5G modem continues to be unreachable and I waited 3-4 minutes at this step.
BEFORE 
[email protected]:~$ show dhcp client leases
Interface    eth1
IP address   100.90.78.196                 [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.110 202.56.215.41
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Thu Oct 26 15:50:58 IST 2023
Expiry       Fri Oct 27 15:50:57 IST 2023


AFTER 
[email protected]:~$ show dhcp client leases
Interface    eth1
IP address   100.74.241.155                 [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.112 117.96.122.110
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Fri Oct 27 02:27:45 IST 2023
Expiry       Sat Oct 28 02:27:44 IST 2023
  1. Run renew dhcp interface eth1 and I can reach the 5G modem(and internet via the modem) immediately.

Does your 5G-modem do any NAT on its own or does it just forward the DHCP to the ISP?

What kind of vendor/model is that 5G-modem?

Possible for you to use static IP between the VyOS and the 5G-modem?

Possible for you to try different vendor and/or model for 5G-modem?

Does your 5G-modem do any NAT on its own or does it just forward the DHCP to the ISP?

It's in bridge mode, so no nat, firewall or any thing else. Forwards dhcp traffic to ISP.

What kind of vendor/model is that 5G-modem?

It's from Huawei, The model number is H112-372

Possible for you to use static IP between the VyOS and the 5G-modem?

Nope. ISP uses DHCP for address assignments and I don't have any option to assign static addresses.

Possible for you to try different vendor and/or model for 5G-modem?

No, I don't have any other 5G modem I can test here.

I would still recommend you to try to test to put a L2-switch between your 5G-router and the VyOS box and see if that resolves the situation.

That is if the interface of your VyOS box is shaky then the L2-switch would keep the connection alive towards the 5G-router so it doesnt rigger whatever it seems to trigger once it detects a linkdown on the LAN interface.

If its currently not an option to try with some other vendor/model then I would try to reconfigure that 5G-router so it will do NAT and then use static IP (aka disable DHCP server on the 5G router) towards your VyOS as a workaround.

According to manual it does support "DMZ mode" so it will forward all incoming traffic towards that static IP of your VyOS:

https://usermanual.wiki/m/554249f42da59d4d55a38e446763dbf74c1d655a768a6622573d77a016cfa1d3

Also make sure that you have loaded the 5G-router with latest stable firmware available.

When the linkdown/up occurs, does the "last update" of run show dhcp client leases also update or doesnt the DHCP client in VyOS detect that the link went up/down?

Back in the days it was not uncommon to have various issues with ISPs when it comes to DHCP. Mainly windows boxes had issues since windows by default wants its current IP back after a reboot but some DHCP servers just ignored that with the result that the internetconnection suddently wouldnt work after a reboot. Fix in those cases was to add a script which would first release the aquired DHCP IP-address and THEN renew it (the renew would then ask for 0.0.0.0 or something like that which these DHCP-servers would be happy to bring a proper DHCP reply back to the client so it could assign an IP-address and continue whatever they were doing on the Internet).

I would still recommend you to try to test to put a L2-switch between your 5G-router and the VyOS box and see if that resolves the situation.

I tested this today and I can't get replicate it by just disconnecting the cable between switch and router. I now believe I missed a critical detail earlier. In my previous test, The router must have rebooted(It's supposed to reboot once every 7 days and I just tested it at the worst possible time). When I found out it was not working, I did not check the uptime on the modem's dashboard.

The new steps to reproduce this are,

  • Create a ethernet interface which fetches IP with DHCP
  • Reboot the "DHCP server" or the LTE/5G Modem
  • Wait until the modem has restarted
  • Test if internet is accessible using that modem

In my testing, Internet is inaccessible and the modem's interface is also inaccessible.

If its currently not an option to try with some other vendor/model then I would try to reconfigure that 5G-router so it will do NAT and then use static IP (aka disable DHCP server on the 5G router) towards your VyOS as a workaround.
According to manual it does support "DMZ mode" so it will forward all incoming traffic towards that static IP of your VyOS:

Sorry but this whole thing feels really weird to me and I do not want to do it. It supports bridge mode, it should be used in bridge mode. I'll also have to update my firewall rules to accommodate this because eth1 is a member of WAN interface group and there are firewall rules to drop any rfc1918 traffic in this group. This is not an option I'll consider in any situation.

Also make sure that you have loaded the 5G-router with latest stable firmware available.

It is updated and on latest available firmware

Annoyingly, This has started working properly some how.

When the linkdown/up occurs, does the "last update" of run show dhcp client leases also update or doesnt the DHCP client in VyOS detect that the link went up/down?

BEFORE
[email protected]:~$ show dhcp client leases
Interface    eth1
IP address   100.90.77.6                   [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.110 202.56.215.41
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Sat Oct 28 23:50:00 IST 2023
Expiry       Sun Oct 29 23:49:59 IST 2023

AFTER
[email protected]:~$ show dhcp client leases
[email protected]:~$ show dhcp client leases
[email protected]:~$ show dhcp client leases
Interface    eth1
IP address   100.78.202.231                 [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.112 117.96.122.110
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Sun Oct 29 00:47:43 IST 2023
Expiry       Mon Oct 30 00:47:43 IST 2023

For first few seconds(after modem reboot), there was no active lease and then this new lease showed up

Logs

ct 29 00:46:48 netplugd[2473375]: /etc/netplug/netplug eth1 out -> pid 2473375
Oct 29 00:46:48 zebra[1434]: [HSYZM-HV7HF] Extended Error: Carrier for nexthop device is down
Oct 29 00:46:48 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=911, pid=3161387868
Oct 29 00:46:48 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:46:49 systemd[1]: Stopping [email protected] - DHCP client on eth1...
Oct 29 00:46:49 dhclient[2473393]: Killed old client process
Oct 29 00:46:49 dhclient[2473393]: Killed old client process
Oct 29 00:46:50 dhclient[2473393]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:46:50 dhclient[2473393]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:46:50 dhclient[2473393]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:46:50 dhclient[2473393]: All rights reserved.
Oct 29 00:46:50 dhclient[2473393]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:46:50 dhclient[2473393]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:46:50 dhclient[2473393]: All rights reserved.
Oct 29 00:46:50 dhclient[2473393]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:46:50 dhclient[2473393]:
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:46:50 dhclient[2473393]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:46:50 dhclient[2473393]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:46:50 dhclient[2473393]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:46:50 dhclient[2473393]: Sending on   Socket/fallback
Oct 29 00:46:50 dhclient[2473393]: DHCPRELEASE of 100.90.77.6 on eth1 to 192.168.8.1 port 67
Oct 29 00:46:50 dhclient[2473393]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:46:50 dhclient[2473393]: Sending on   Socket/fallback
Oct 29 00:46:50 dhclient[2473393]: DHCPRELEASE of 100.90.77.6 on eth1 to 192.168.8.1 port 67
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Current dhclient PID: 2473393, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2473393
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth1 label eth1"
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Deleting search domains with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:46:50 zebra[1434]: [HSYZM-HV7HF] Extended Error: Carrier for nexthop device is down
Oct 29 00:46:50 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=920, pid=3161387868
Oct 29 00:46:50 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:46:50 vyos-hostsd[776]: Request data: {"type": "search_domains", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:46:50 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:46:50 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Deleting nameservers with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:46:50 vyos-hostsd[776]: Request data: {"type": "name_servers", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:46:50 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:46:50 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Deleting default route: via 100.0.0.1 dev eth1 metric 2
Oct 29 00:46:50 dhclient-script-vyos[2473559]: FRR status: running
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Converted vtysh command: "no ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Sending command to vtysh
Oct 29 00:46:50 dhclient-script-vyos[2473559]: Applying changes via vyos-hostsd-client
Oct 29 00:46:50 vyos-hostsd[776]: Request data: {"op": "apply"}
Oct 29 00:46:50 vyos-hostsd[776]: Applying 2 changes
Oct 29 00:46:50 vyos-hostsd[776]: Writing /etc/resolv.conf
Oct 29 00:46:50 vyos-hostsd[776]: Writing /etc/hosts
Oct 29 00:46:50 vyos-hostsd[776]: Writing /run/powerdns/recursor.vyos-hostsd.conf.lua
Oct 29 00:46:50 vyos-hostsd[776]: Writing /run/powerdns/recursor.forward-zones.conf
Oct 29 00:46:50 vyos-hostsd[776]: Running "rec_control reload-lua-config"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520610.409" command="reload-lua-config"
Oct 29 00:46:50 pdns-recursor[8737]: Reloaded Lua configuration file 'recursor.conf.lua', requested via control channel
Oct 29 00:46:50 vyos-hostsd[776]: Running "rec_control reload-zones"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520610.420" command="reload-zones"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Reloading zones, purging data from cache" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.420"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Reading zone forwarding information" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.420" file="recursor.forward-zones.conf"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Done parsing forwarding instructions from file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.420" count="1" file="recursor.forward-zones.conf"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.420" zone="localhost"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.420" zone="1.0.0.127.in-addr.arpa"
Oct 29 00:46:50 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.420" zone="router.home.arpa.home.arpa"
...
Oct 29 00:46:50 pdns-recursor[8737]: msg="Inserting rfc 1918 private space zones" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520610.422"
Oct 29 00:46:50 vyos-hostsd[776]: Success
Oct 29 00:46:50 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:46:50 vyos-hostsd[776]: Sent response: {'data': {'message': 'Applied 2 changes'}}
Oct 29 00:46:50 sudo[2473628]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:46:50 sudo[2473628]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:46:50 wan_lb[7864]: User signal: 12
Oct 29 00:46:50 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:46:50 sudo[2473628]: pam_unix(sudo:session): session closed for user root
Oct 29 00:46:50 systemd[1]: [email protected]: Deactivated successfully.
Oct 29 00:46:50 systemd[1]: Stopped [email protected] - DHCP client on eth1.
Oct 29 00:46:50 systemd[1]: [email protected]: Consumed 1.684s CPU time.
Oct 29 00:46:50 netplugd[1086]: eth1: state OUTING pid 2473375 exited status 0
Oct 29 00:47:07 wan_lb[2473809]: Error: Nexthop has invalid gateway.
Oct 29 00:47:07 wan_lb[7864]: failure to insert default route on active path with this command: ip route replace table 201 default dev eth1 via 100.0.0.1, resp:
Oct 29 00:47:13 wan_lb[2473813]: Error: Nexthop has invalid gateway.
Oct 29 00:47:13 wan_lb[7864]: failure to insert default route on active path with this command: ip route replace table 201 default dev eth1 via 100.0.0.1, resp:
Oct 29 00:47:21 wan_lb[2473993]: Error: Nexthop has invalid gateway.
Oct 29 00:47:21 wan_lb[7864]: failure to insert default route on active path with this command: ip route replace table 201 default dev eth1 via 100.0.0.1, resp:
Oct 29 00:47:22 kernel: igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 29 00:47:22 netplugd[1086]: eth1: state INACTIVE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Oct 29 00:47:22 netplugd[2473995]: /etc/netplug/netplug eth1 in -> pid 2473995
Oct 29 00:47:22 zebra[1434]: [HSYZM-HV7HF] Extended Error: Nexthop has invalid gateway
Oct 29 00:47:22 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is unreachable, type=RTM_NEWNEXTHOP(104), seq=926, pid=3161387868
Oct 29 00:47:22 zebra[1434]: [HSYZM-HV7HF] Extended Error: Nexthop has invalid gateway
Oct 29 00:47:22 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is unreachable, type=RTM_NEWNEXTHOP(104), seq=927, pid=3161387868
Oct 29 00:47:22 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (176[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:22 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:22 systemd[1]: Starting [email protected] - DHCP client on eth1...
Oct 29 00:47:22 systemd[1]: Started [email protected] - DHCP client on eth1.
Oct 29 00:47:22 dhclient[2474003]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:22 dhclient[2474003]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:22 dhclient[2474003]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:22 dhclient[2474003]: All rights reserved.
Oct 29 00:47:22 dhclient[2474003]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:22 dhclient[2474003]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:22 dhclient[2474003]: All rights reserved.
Oct 29 00:47:22 dhclient[2474003]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:22 dhclient[2474003]:
Oct 29 00:47:22 dhclient-script-vyos[2474004]: Current dhclient PID: 2474003, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2474003
Oct 29 00:47:22 dhclient-script-vyos[2474004]: Passing command to /usr/sbin/ip: "link set dev eth1 up"
Oct 29 00:47:22 netplugd[1086]: eth1: state INNING pid 2473995 exited status 0
Oct 29 00:47:22 dhclient-script-vyos[2474004]: No changes to apply via vyos-hostsd-client
Oct 29 00:47:22 sudo[2474032]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:47:22 sudo[2474032]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:47:22 wan_lb[7864]: User signal: 12
Oct 29 00:47:22 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:47:22 sudo[2474032]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:22 dhclient[2474003]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:22 dhclient[2474003]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:22 dhclient[2474003]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:22 dhclient[2474003]: Sending on   Socket/fallback
Oct 29 00:47:22 dhclient[2474003]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 4
Oct 29 00:47:22 dhclient[2474003]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:22 dhclient[2474003]: Sending on   Socket/fallback
Oct 29 00:47:22 dhclient[2474003]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 4
Oct 29 00:47:23 wan_lb[2474036]: Error: Nexthop has invalid gateway.
Oct 29 00:47:23 wan_lb[7864]: failure to insert default route on active path with this command: ip route replace table 201 default dev eth1 via 100.0.0.1, resp:
Oct 29 00:47:26 dhclient[2474003]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 8
Oct 29 00:47:26 dhclient[2474003]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 8
Oct 29 00:47:28 kernel: igc 0000:02:00.0 eth1: NIC Link is Down
Oct 29 00:47:28 netplugd[1086]: eth1: state ACTIVE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Oct 29 00:47:28 netplugd[2474038]: /etc/netplug/netplug eth1 out -> pid 2474038
Oct 29 00:47:28 zebra[1434]: [HSYZM-HV7HF] Extended Error: Carrier for nexthop device is down
Oct 29 00:47:28 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=934, pid=3161387868
Oct 29 00:47:28 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:28 systemd[1]: Stopping [email protected] - DHCP client on eth1...
Oct 29 00:47:28 dhclient[2474046]: Killed old client process
Oct 29 00:47:28 dhclient[2474046]: Killed old client process
Oct 29 00:47:29 sshd[2474047]: Accepted publickey for vyos from 10.0.10.34 port 48074 ssh2: ED25519 SHA256:jkaawS19mnw4U4Ub9Al6/lqBI2QCRzWpQgMJiMsMTnc
Oct 29 00:47:29 sshd[2474047]: pam_unix(sshd:session): session opened for user vyos(uid=1002) by (uid=0)
Oct 29 00:47:29 systemd-logind[989]: New session 292 of user vyos.
Oct 29 00:47:29 systemd[1]: Started session-292.scope - Session 292 of User vyos.
Oct 29 00:47:29 wan_lb[7864]: Interface eth1 has changed state to FAILED
...
Oct 29 00:47:29 wan_lb[2474096]: conntrack v1.4.6 (conntrack-tools): expectation table has been emptied.
Oct 29 00:47:29 systemd[1]: opt-vyatta-config-tmp-new_config_2474051.mount: Deactivated successfully.
Oct 29 00:47:29 sshd[2474047]: pam_env(sshd:session): deprecated reading of user environment enabled
Oct 29 00:47:29 dhclient[2474046]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:29 dhclient[2474046]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:29 dhclient[2474046]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:29 dhclient[2474046]: All rights reserved.
Oct 29 00:47:29 dhclient[2474046]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:29 dhclient[2474046]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:29 dhclient[2474046]: All rights reserved.
Oct 29 00:47:29 dhclient[2474046]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:29 dhclient[2474046]:
Oct 29 00:47:29 dhclient[2474046]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:29 dhclient[2474046]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:29 dhclient[2474046]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:29 dhclient[2474046]: Sending on   Socket/fallback
Oct 29 00:47:29 dhclient[2474046]: DHCPRELEASE of 100.90.77.6 on eth1 to 192.168.8.1 port 67
Oct 29 00:47:29 dhclient[2474046]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:29 dhclient[2474046]: Sending on   Socket/fallback
Oct 29 00:47:29 dhclient[2474046]: DHCPRELEASE of 100.90.77.6 on eth1 to 192.168.8.1 port 67
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Current dhclient PID: 2474046, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2474046
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth1 label eth1"
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Deleting search domains with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:47:29 vyos-hostsd[776]: Request data: {"type": "search_domains", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:47:29 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:29 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Deleting nameservers with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:47:29 vyos-hostsd[776]: Request data: {"type": "name_servers", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:47:29 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:29 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Deleting default route: via 100.0.0.1 dev eth1 metric 2
Oct 29 00:47:29 dhclient-script-vyos[2474153]: FRR status: running
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Converted vtysh command: "no ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Sending command to vtysh
Oct 29 00:47:29 dhclient[2474296]: % Refusing to remove a non-existent route
Oct 29 00:47:29 dhclient-script-vyos[2474153]: Applying changes via vyos-hostsd-client
Oct 29 00:47:30 vyos-hostsd[776]: Request data: {"op": "apply"}
Oct 29 00:47:30 vyos-hostsd[776]: Applying 2 changes
Oct 29 00:47:30 vyos-hostsd[776]: Writing /etc/resolv.conf
Oct 29 00:47:30 vyos-hostsd[776]: Writing /etc/hosts
Oct 29 00:47:30 vyos-hostsd[776]: Writing /run/powerdns/recursor.vyos-hostsd.conf.lua
Oct 29 00:47:30 vyos-hostsd[776]: Writing /run/powerdns/recursor.forward-zones.conf
Oct 29 00:47:30 vyos-hostsd[776]: Running "rec_control reload-lua-config"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520650.055" command="reload-lua-config"
Oct 29 00:47:30 pdns-recursor[8737]: Reloaded Lua configuration file 'recursor.conf.lua', requested via control channel
Oct 29 00:47:30 vyos-hostsd[776]: Running "rec_control reload-zones"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520650.067" command="reload-zones"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Reloading zones, purging data from cache" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.067"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Reading zone forwarding information" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.067" file="recursor.forward-zones.conf"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Done parsing forwarding instructions from file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.067" count="1" file="recursor.forward-zones.conf"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.068" zone="localhost"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.068" zone="1.0.0.127.in-addr.arpa"
Oct 29 00:47:30 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.068" zone="router.home.arpa.home.arpa"
...
Oct 29 00:47:30 pdns-recursor[8737]: msg="Inserting rfc 1918 private space zones" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520650.069"
Oct 29 00:47:30 vyos-hostsd[776]: Success
Oct 29 00:47:30 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:30 vyos-hostsd[776]: Sent response: {'data': {'message': 'Applied 2 changes'}}
Oct 29 00:47:30 sudo[2474306]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:47:30 sudo[2474306]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:47:30 wan_lb[7864]: User signal: 12
Oct 29 00:47:30 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:47:30 sudo[2474306]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:30 systemd[1]: [email protected]: Deactivated successfully.
Oct 29 00:47:30 systemd[1]: Stopped [email protected] - DHCP client on eth1.
Oct 29 00:47:30 netplugd[1086]: eth1: state OUTING pid 2474038 exited status 0
Oct 29 00:47:31 kernel: igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 29 00:47:31 netplugd[1086]: eth1: state INACTIVE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Oct 29 00:47:31 netplugd[2474312]: /etc/netplug/netplug eth1 in -> pid 2474312
Oct 29 00:47:31 zebra[1434]: [HSYZM-HV7HF] Extended Error: Nexthop has invalid gateway
Oct 29 00:47:31 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is unreachable, type=RTM_NEWNEXTHOP(104), seq=944, pid=3161387868
Oct 29 00:47:31 zebra[1434]: [HSYZM-HV7HF] Extended Error: Nexthop has invalid gateway
Oct 29 00:47:31 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is unreachable, type=RTM_NEWNEXTHOP(104), seq=945, pid=3161387868
Oct 29 00:47:31 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (176[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:31 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:31 systemd[1]: Starting [email protected] - DHCP client on eth1...
Oct 29 00:47:31 systemd[1]: Started [email protected] - DHCP client on eth1.
Oct 29 00:47:31 dhclient[2474320]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:31 dhclient[2474320]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:31 dhclient[2474320]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:31 dhclient[2474320]: All rights reserved.
Oct 29 00:47:31 dhclient[2474320]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:31 dhclient[2474320]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:31 dhclient[2474320]: All rights reserved.
Oct 29 00:47:31 dhclient[2474320]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:31 dhclient[2474320]:
Oct 29 00:47:31 dhclient-script-vyos[2474321]: Current dhclient PID: 2474320, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2474320
Oct 29 00:47:31 dhclient-script-vyos[2474321]: Passing command to /usr/sbin/ip: "link set dev eth1 up"
Oct 29 00:47:31 netplugd[1086]: eth1: state INNING pid 2474312 exited status 0
Oct 29 00:47:31 dhclient-script-vyos[2474321]: No changes to apply via vyos-hostsd-client
Oct 29 00:47:31 sudo[2474349]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:47:31 sudo[2474349]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:47:31 wan_lb[7864]: User signal: 12
Oct 29 00:47:31 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:47:31 sudo[2474349]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:31 dhclient[2474320]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:31 dhclient[2474320]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:31 dhclient[2474320]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:31 dhclient[2474320]: Sending on   Socket/fallback
Oct 29 00:47:31 dhclient[2474320]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 3
Oct 29 00:47:31 dhclient[2474320]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:31 dhclient[2474320]: Sending on   Socket/fallback
Oct 29 00:47:31 dhclient[2474320]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 3
Oct 29 00:47:32 kernel: igc 0000:02:00.0 eth1: NIC Link is Down
Oct 29 00:47:32 netplugd[1086]: eth1: state ACTIVE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Oct 29 00:47:32 netplugd[2474521]: /etc/netplug/netplug eth1 out -> pid 2474521
Oct 29 00:47:32 zebra[1434]: [HSYZM-HV7HF] Extended Error: Carrier for nexthop device is down
Oct 29 00:47:32 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=952, pid=3161387868
Oct 29 00:47:32 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:32 sudo[2474549]:     vyos : TTY=pts/3 ; PWD=/home/vyos ; USER=root ; COMMAND=/usr/sbin/ip vrf exec default /bin/ping 192.168.8.1
Oct 29 00:47:32 sudo[2474549]: pam_unix(sudo:session): session opened for user root(uid=0) by vyos(uid=1002)
Oct 29 00:47:32 systemd[1]: Stopping [email protected] - DHCP client on eth1...
Oct 29 00:47:32 dhclient[2474555]: Killed old client process
Oct 29 00:47:32 dhclient[2474555]: Killed old client process
Oct 29 00:47:33 sudo[2474549]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:33 dhclient[2474555]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:33 dhclient[2474555]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:33 dhclient[2474555]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:33 dhclient[2474555]: All rights reserved.
Oct 29 00:47:33 dhclient[2474555]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:33 dhclient[2474555]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:33 dhclient[2474555]: All rights reserved.
Oct 29 00:47:33 dhclient[2474555]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:33 dhclient[2474555]:
Oct 29 00:47:33 dhclient[2474555]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:33 dhclient[2474555]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:33 dhclient[2474555]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:33 dhclient[2474555]: Sending on   Socket/fallback
Oct 29 00:47:33 dhclient[2474555]: DHCPRELEASE of 100.90.77.6 on eth1 to 192.168.8.1 port 67
Oct 29 00:47:33 dhclient[2474555]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:33 dhclient[2474555]: Sending on   Socket/fallback
Oct 29 00:47:33 dhclient[2474555]: DHCPRELEASE of 100.90.77.6 on eth1 to 192.168.8.1 port 67
Oct 29 00:47:33 dhclient-script-vyos[2474556]: Current dhclient PID: 2474555, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2474555
Oct 29 00:47:33 dhclient-script-vyos[2474556]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth1 label eth1"
Oct 29 00:47:33 dhclient-script-vyos[2474556]: Deleting search domains with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:47:34 vyos-hostsd[776]: Request data: {"type": "search_domains", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:47:34 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:34 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:47:34 dhclient-script-vyos[2474556]: Deleting nameservers with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:47:34 vyos-hostsd[776]: Request data: {"type": "name_servers", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:47:34 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:34 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:47:34 dhclient-script-vyos[2474556]: Deleting default route: via 100.0.0.1 dev eth1 metric 2
Oct 29 00:47:34 dhclient-script-vyos[2474556]: FRR status: running
Oct 29 00:47:34 dhclient-script-vyos[2474556]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:47:34 dhclient-script-vyos[2474556]: Converted vtysh command: "no ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:47:34 dhclient-script-vyos[2474556]: Sending command to vtysh
Oct 29 00:47:34 dhclient[2474615]: % Refusing to remove a non-existent route
Oct 29 00:47:34 dhclient-script-vyos[2474556]: Applying changes via vyos-hostsd-client
Oct 29 00:47:34 vyos-hostsd[776]: Request data: {"op": "apply"}
Oct 29 00:47:34 vyos-hostsd[776]: Applying 2 changes
Oct 29 00:47:34 vyos-hostsd[776]: Writing /etc/resolv.conf
Oct 29 00:47:34 vyos-hostsd[776]: Writing /etc/hosts
Oct 29 00:47:34 vyos-hostsd[776]: Writing /run/powerdns/recursor.vyos-hostsd.conf.lua
Oct 29 00:47:34 vyos-hostsd[776]: Writing /run/powerdns/recursor.forward-zones.conf
Oct 29 00:47:34 vyos-hostsd[776]: Running "rec_control reload-lua-config"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520654.202" command="reload-lua-config"
Oct 29 00:47:34 pdns-recursor[8737]: Reloaded Lua configuration file 'recursor.conf.lua', requested via control channel
Oct 29 00:47:34 vyos-hostsd[776]: Running "rec_control reload-zones"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520654.213" command="reload-zones"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Reloading zones, purging data from cache" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.213"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Reading zone forwarding information" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.213" file="recursor.forward-zones.conf"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Done parsing forwarding instructions from file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.213" count="1" file="recursor.forward-zones.conf"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.213" zone="localhost"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.214" zone="1.0.0.127.in-addr.arpa"
Oct 29 00:47:34 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.214" zone="router.home.arpa.home.arpa"
...
Oct 29 00:47:34 pdns-recursor[8737]: msg="Inserting rfc 1918 private space zones" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520654.215"
Oct 29 00:47:34 vyos-hostsd[776]: Success
Oct 29 00:47:34 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:34 vyos-hostsd[776]: Sent response: {'data': {'message': 'Applied 2 changes'}}
Oct 29 00:47:34 sudo[2474625]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:47:34 sudo[2474625]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:47:34 wan_lb[7864]: User signal: 12
Oct 29 00:47:34 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:47:34 sudo[2474625]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:34 systemd[1]: [email protected]: Deactivated successfully.
Oct 29 00:47:34 systemd[1]: Stopped [email protected] - DHCP client on eth1.
Oct 29 00:47:34 netplugd[1086]: eth1: state OUTING pid 2474521 exited status 0
Oct 29 00:47:35 kernel: igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Oct 29 00:47:35 netplugd[1086]: eth1: state INACTIVE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Oct 29 00:47:35 netplugd[2474647]: /etc/netplug/netplug eth1 in -> pid 2474647
Oct 29 00:47:35 zebra[1434]: [HSYZM-HV7HF] Extended Error: Nexthop has invalid gateway
Oct 29 00:47:35 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is unreachable, type=RTM_NEWNEXTHOP(104), seq=962, pid=3161387868
Oct 29 00:47:35 zebra[1434]: [HSYZM-HV7HF] Extended Error: Nexthop has invalid gateway
Oct 29 00:47:35 zebra[1434]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is unreachable, type=RTM_NEWNEXTHOP(104), seq=963, pid=3161387868
Oct 29 00:47:35 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (176[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:35 zebra[1434]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (343[100.0.0.1 if 3]) into the kernel
Oct 29 00:47:35 systemd[1]: Starting [email protected] - DHCP client on eth1...
Oct 29 00:47:35 systemd[1]: Started [email protected] - DHCP client on eth1.
Oct 29 00:47:35 dhclient[2474669]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:35 dhclient[2474669]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:47:35 dhclient[2474669]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:35 dhclient[2474669]: All rights reserved.
Oct 29 00:47:35 dhclient[2474669]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:35 dhclient[2474669]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:47:35 dhclient[2474669]: All rights reserved.
Oct 29 00:47:35 dhclient[2474669]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:47:35 dhclient[2474669]:
Oct 29 00:47:35 dhclient-script-vyos[2474670]: Current dhclient PID: 2474669, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2474669
Oct 29 00:47:35 dhclient-script-vyos[2474670]: Passing command to /usr/sbin/ip: "link set dev eth1 up"
Oct 29 00:47:35 netplugd[1086]: eth1: state INNING pid 2474647 exited status 0
Oct 29 00:47:35 dhclient-script-vyos[2474670]: No changes to apply via vyos-hostsd-client
Oct 29 00:47:35 sudo[2474698]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:47:35 sudo[2474698]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:47:35 wan_lb[7864]: User signal: 12
Oct 29 00:47:35 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:47:35 sudo[2474698]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:35 dhclient[2474669]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:35 dhclient[2474669]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:35 dhclient[2474669]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:35 dhclient[2474669]: Sending on   Socket/fallback
Oct 29 00:47:35 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 2
Oct 29 00:47:35 dhclient[2474669]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:47:35 dhclient[2474669]: Sending on   Socket/fallback
Oct 29 00:47:35 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 2
Oct 29 00:47:37 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 2
Oct 29 00:47:37 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 2
Oct 29 00:47:39 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 4
Oct 29 00:47:39 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 4
Oct 29 00:47:43 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 9
Oct 29 00:47:43 dhclient[2474669]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 9
Oct 29 00:47:43 dhclient[2474669]: DHCPOFFER of 100.78.202.231 from 192.168.8.1
Oct 29 00:47:43 dhclient[2474669]: DHCPOFFER of 100.78.202.231 from 192.168.8.1
Oct 29 00:47:43 dhclient[2474669]: DHCPREQUEST for 100.78.202.231 on eth1 to 255.255.255.255 port 67
Oct 29 00:47:43 dhclient[2474669]: DHCPREQUEST for 100.78.202.231 on eth1 to 255.255.255.255 port 67
Oct 29 00:47:43 dhclient[2474669]: DHCPACK of 100.78.202.231 from 192.168.8.1
Oct 29 00:47:43 dhclient[2474669]: DHCPACK of 100.78.202.231 from 192.168.8.1
Oct 29 00:47:43 dhclient[2474669]: Invalid domain name.
Oct 29 00:47:43 dhclient[2474669]: Invalid domain name.
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Current dhclient PID: 2474669, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2474669
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth1 label eth1"
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Passing command to /usr/sbin/ip: "-4 addr add 100.78.202.231/255.0.0.0 broadcast 100.255.255.255 valid_lft 86400 preferred_lft 86400 dev eth1 label eth1"
Oct 29 00:47:43 dhclient-script-vyos[2474810]: FRR status: running
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Converted vtysh command: "ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Sending command to vtysh
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Deleting nameservers with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:47:43 vyos-hostsd[776]: Request data: {"type": "name_servers", "op": "delete", "data": ["dhcp-eth1"]}
Oct 29 00:47:43 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:43 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Adding nameservers "117.96.122.112 117.96.122.110" with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:47:43 vyos-hostsd[776]: Request data: {"type": "name_servers", "op": "add", "data": {"dhcp-eth1": ["117.96.122.112", "117.96.122.110"]}}
Oct 29 00:47:43 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:43 vyos-hostsd[776]: Sent response: {'data': None}
Oct 29 00:47:43 dhclient-script-vyos[2474810]: Applying changes via vyos-hostsd-client
Oct 29 00:47:43 vyos-hostsd[776]: Request data: {"op": "apply"}
Oct 29 00:47:43 vyos-hostsd[776]: Applying 2 changes
Oct 29 00:47:43 vyos-hostsd[776]: Writing /etc/resolv.conf
Oct 29 00:47:43 vyos-hostsd[776]: Writing /etc/hosts
Oct 29 00:47:43 vyos-hostsd[776]: Writing /run/powerdns/recursor.vyos-hostsd.conf.lua
Oct 29 00:47:43 vyos-hostsd[776]: Writing /run/powerdns/recursor.forward-zones.conf
Oct 29 00:47:43 vyos-hostsd[776]: Running "rec_control reload-lua-config"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520663.908" command="reload-lua-config"
Oct 29 00:47:43 pdns-recursor[8737]: Reloaded Lua configuration file 'recursor.conf.lua', requested via control channel
Oct 29 00:47:43 vyos-hostsd[776]: Running "rec_control reload-zones"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Received rec_control command via control socket" subsystem="control" level="0" prio="Info" tid="0" ts="1698520663.920" command="reload-zones"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Reloading zones, purging data from cache" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.920"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Reading zone forwarding information" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.920" file="recursor.forward-zones.conf"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Done parsing forwarding instructions from file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.920" count="1" file="recursor.forward-zones.conf"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.920" zone="localhost"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Inserting reverse zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.920" zone="1.0.0.127.in-addr.arpa"
Oct 29 00:47:43 pdns-recursor[8737]: msg="Inserting forward zone based on hosts file" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.920" zone="router.home.arpa.home.arpa"
...
Oct 29 00:47:43 pdns-recursor[8737]: msg="Inserting rfc 1918 private space zones" subsystem="config" level="0" prio="Notice" tid="0" ts="1698520663.922"
Oct 29 00:47:43 vyos-hostsd[776]: Success
Oct 29 00:47:43 vyos-hostsd[776]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
Oct 29 00:47:43 vyos-hostsd[776]: Sent response: {'data': {'message': 'Applied 2 changes'}}
Oct 29 00:47:43 dhclient-script-vyos[2474810]: No changes to apply via vyos-hostsd-client
Oct 29 00:47:43 sudo[2474878]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:47:43 sudo[2474878]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:47:43 wan_lb[7864]: User signal: 12
Oct 29 00:47:43 wan_lb[7864]: wan_lb, rechecking interfaces..
Oct 29 00:47:43 sudo[2474878]: pam_unix(sudo:session): session closed for user root
Oct 29 00:47:44 dhclient[2474669]: Invalid domain name.
Oct 29 00:47:44 dhclient[2474669]: Invalid domain name.
Oct 29 00:47:44 dhclient[2474669]: bound to 100.78.202.231 -- renewal in 37428 seconds.
Oct 29 00:47:44 dhclient[2474669]: bound to 100.78.202.231 -- renewal in 37428 seconds.
Oct 29 00:47:55 sudo[2475104]:     vyos : TTY=pts/3 ; PWD=/home/vyos ; USER=root ; COMMAND=/usr/sbin/ip vrf exec default /bin/ping 192.168.8.1
Oct 29 00:47:55 sudo[2475104]: pam_unix(sudo:session): session opened for user root(uid=0) by vyos(uid=1002)
Oct 29 00:47:55 sudo[2475104]: pam_unix(sudo:session): session closed for user root
Oct 29 00:48:00 wan_lb[7864]: Interface eth1 has changed state to ACTIVE

I didn't update it since the initial report and I don't really understand how/why it's working again.

I saw quite a few threads in other places reported by people who had a similar issue with dhclient where it was not letting go of the lease on interface state changes, Most solutions were basically run a script which runs on ifplugd triggers and runs dhclient -r. I thought I would have to do the same thing and now it's working again. :/

I want to keep the thread open for a few more days and I'll monitor it a few more times see if it's really working properly. (I had a 2 day outage because of this bug so I want to make sure)

I am also curious about these messages.

Oct 29 00:46:50 dhclient[2473393]: parse_option_param: Bad format d
Oct 29 00:47:44 dhclient[2474669]: Invalid domain name.

Logs from just dhclient.

vyos@router:~$ monitor log dhcp client
Oct 29 00:56:22 dhclient-script-vyos[2482076]: FRR status: running
Oct 29 00:56:22 dhclient-script-vyos[2482076]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:56:22 dhclient-script-vyos[2482076]: Converted vtysh command: "no ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:56:22 dhclient-script-vyos[2482076]: Sending command to vtysh
Oct 29 00:56:22 dhclient-script-vyos[2482076]: Applying changes via vyos-hostsd-client
Oct 29 00:56:22 sudo[2482145]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:56:22 sudo[2482145]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:56:22 sudo[2482145]: pam_unix(sudo:session): session closed for user root
Oct 29 00:56:22 systemd[1]: [email protected]: Deactivated successfully.
Oct 29 00:56:22 systemd[1]: Stopped [email protected] - DHCP client on eth1.
Oct 29 00:56:54 systemd[1]: Starting [email protected] - DHCP client on eth1...
Oct 29 00:56:54 systemd[1]: Started [email protected] - DHCP client on eth1.
Oct 29 00:56:54 dhclient[2482873]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:56:54 dhclient[2482873]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:56:54 dhclient[2482873]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:56:54 dhclient[2482873]: All rights reserved.
Oct 29 00:56:54 dhclient[2482873]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:56:54 dhclient[2482873]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:56:54 dhclient[2482873]: All rights reserved.
Oct 29 00:56:54 dhclient[2482873]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:56:54 dhclient[2482873]:
Oct 29 00:56:54 dhclient-script-vyos[2482874]: Current dhclient PID: 2482873, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2482873
Oct 29 00:56:54 dhclient-script-vyos[2482874]: Passing command to /usr/sbin/ip: "link set dev eth1 up"
Oct 29 00:56:54 dhclient-script-vyos[2482874]: No changes to apply via vyos-hostsd-client
Oct 29 00:56:54 sudo[2482902]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:56:54 sudo[2482902]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:56:54 sudo[2482902]: pam_unix(sudo:session): session closed for user root
Oct 29 00:56:54 dhclient[2482873]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:56:54 dhclient[2482873]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:56:54 dhclient[2482873]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:56:54 dhclient[2482873]: Sending on   Socket/fallback
Oct 29 00:56:54 dhclient[2482873]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 5
Oct 29 00:56:54 dhclient[2482873]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:56:54 dhclient[2482873]: Sending on   Socket/fallback
Oct 29 00:56:54 dhclient[2482873]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 5
Oct 29 00:56:59 dhclient[2482873]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 11
Oct 29 00:56:59 dhclient[2482873]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 11
Oct 29 00:56:59 systemd[1]: Stopping [email protected] - DHCP client on eth1...
Oct 29 00:56:59 dhclient[2482914]: Killed old client process
Oct 29 00:56:59 dhclient[2482914]: Killed old client process
Oct 29 00:57:00 dhclient[2482914]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:57:00 dhclient[2482914]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:57:00 dhclient[2482914]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:57:00 dhclient[2482914]: All rights reserved.
Oct 29 00:57:00 dhclient[2482914]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:57:00 dhclient[2482914]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:57:00 dhclient[2482914]: All rights reserved.
Oct 29 00:57:00 dhclient[2482914]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:57:00 dhclient[2482914]:
Oct 29 00:57:00 dhclient[2482914]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:00 dhclient[2482914]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:00 dhclient[2482914]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:00 dhclient[2482914]: Sending on   Socket/fallback
Oct 29 00:57:00 dhclient[2482914]: DHCPRELEASE of 100.78.202.231 on eth1 to 192.168.8.1 port 67
Oct 29 00:57:00 dhclient[2482914]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:00 dhclient[2482914]: Sending on   Socket/fallback
Oct 29 00:57:00 dhclient[2482914]: DHCPRELEASE of 100.78.202.231 on eth1 to 192.168.8.1 port 67
Oct 29 00:57:00 dhclient-script-vyos[2482915]: Current dhclient PID: 2482914, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2482914
Oct 29 00:57:00 dhclient-script-vyos[2482915]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth1 label eth1"
Oct 29 00:57:00 dhclient-script-vyos[2482915]: Deleting search domains with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:57:01 dhclient-script-vyos[2482915]: Deleting nameservers with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:57:01 dhclient-script-vyos[2482915]: Deleting default route: via 100.0.0.1 dev eth1 metric 2
Oct 29 00:57:01 dhclient-script-vyos[2482915]: FRR status: running
Oct 29 00:57:01 dhclient-script-vyos[2482915]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:57:01 dhclient-script-vyos[2482915]: Converted vtysh command: "no ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:57:01 dhclient-script-vyos[2482915]: Sending command to vtysh
Oct 29 00:57:01 dhclient[2482980]: % Refusing to remove a non-existent route
Oct 29 00:57:01 dhclient-script-vyos[2482915]: Applying changes via vyos-hostsd-client
Oct 29 00:57:01 sudo[2482990]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:57:01 sudo[2482990]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:57:01 sudo[2482990]: pam_unix(sudo:session): session closed for user root
Oct 29 00:57:01 systemd[1]: [email protected]: Deactivated successfully.
Oct 29 00:57:01 systemd[1]: Stopped [email protected] - DHCP client on eth1.
Oct 29 00:57:03 systemd[1]: Starting [email protected] - DHCP client on eth1...
Oct 29 00:57:03 systemd[1]: Started [email protected] - DHCP client on eth1.
Oct 29 00:57:03 dhclient[2483177]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:57:03 dhclient[2483177]: Internet Systems Consortium DHCP Client 4.4.3-P1
Oct 29 00:57:03 dhclient[2483177]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:57:03 dhclient[2483177]: All rights reserved.
Oct 29 00:57:03 dhclient[2483177]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:57:03 dhclient[2483177]: Copyright 2004-2022 Internet Systems Consortium.
Oct 29 00:57:03 dhclient[2483177]: All rights reserved.
Oct 29 00:57:03 dhclient[2483177]: For info, please visit https://www.isc.org/software/dhcp/
Oct 29 00:57:03 dhclient[2483177]:
Oct 29 00:57:03 dhclient-script-vyos[2483178]: Current dhclient PID: 2483177, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2483177
Oct 29 00:57:03 dhclient-script-vyos[2483178]: Passing command to /usr/sbin/ip: "link set dev eth1 up"
Oct 29 00:57:03 dhclient-script-vyos[2483178]: No changes to apply via vyos-hostsd-client
Oct 29 00:57:03 sudo[2483206]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:57:03 sudo[2483206]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:57:03 sudo[2483206]: pam_unix(sudo:session): session closed for user root
Oct 29 00:57:03 dhclient[2483177]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:03 dhclient[2483177]: Listening on LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:03 dhclient[2483177]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:03 dhclient[2483177]: Sending on   Socket/fallback
Oct 29 00:57:03 dhclient[2483177]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 3
Oct 29 00:57:03 dhclient[2483177]: Sending on   LPF/eth1/a8:b8:e0:00:4a:d8
Oct 29 00:57:03 dhclient[2483177]: Sending on   Socket/fallback
Oct 29 00:57:03 dhclient[2483177]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 3
Oct 29 00:57:06 dhclient[2483177]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 7
Oct 29 00:57:06 dhclient[2483177]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 7
Oct 29 00:57:13 dhclient[2483177]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 15
Oct 29 00:57:13 dhclient[2483177]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 15
Oct 29 00:57:13 dhclient[2483177]: DHCPOFFER of 100.90.124.204 from 192.168.8.1
Oct 29 00:57:13 dhclient[2483177]: DHCPOFFER of 100.90.124.204 from 192.168.8.1
Oct 29 00:57:13 dhclient[2483177]: DHCPREQUEST for 100.90.124.204 on eth1 to 255.255.255.255 port 67
Oct 29 00:57:13 dhclient[2483177]: DHCPREQUEST for 100.90.124.204 on eth1 to 255.255.255.255 port 67
Oct 29 00:57:13 dhclient[2483177]: DHCPACK of 100.90.124.204 from 192.168.8.1
Oct 29 00:57:13 dhclient[2483177]: DHCPACK of 100.90.124.204 from 192.168.8.1
Oct 29 00:57:13 dhclient[2483177]: Invalid domain name.
Oct 29 00:57:13 dhclient[2483177]: Invalid domain name.
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Current dhclient PID: 2483177, Parent PID: 1, IP version: 4, All dhclients for interface eth1: 2483177
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth1 label eth1"
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Passing command to /usr/sbin/ip: "-4 addr add 100.90.124.204/255.0.0.0 broadcast 100.255.255.255 valid_lft 86400 preferred_lft 86400 dev eth1 label eth1"
Oct 29 00:57:13 dhclient-script-vyos[2483211]: FRR status: running
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Checking if the route presented in kernel: default via 100.0.0.1 dev eth1 metric 2
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Converted vtysh command: "ip route 0.0.0.0/0 100.0.0.1 eth1 tag 210 2 "
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Sending command to vtysh
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Deleting nameservers with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Adding nameservers "117.96.122.110 202.56.215.41" with tag "dhcp-eth1" via vyos-hostsd-client
Oct 29 00:57:13 dhclient-script-vyos[2483211]: Applying changes via vyos-hostsd-client
Oct 29 00:57:13 dhclient-script-vyos[2483211]: No changes to apply via vyos-hostsd-client
Oct 29 00:57:13 sudo[2483279]:     root : PWD=/run/dhclient ; USER=root ; COMMAND=/usr/bin/kill -s SIGUSR2 7864
Oct 29 00:57:13 sudo[2483279]: pam_unix(sudo:session): session opened for user root(uid=0) by (uid=0)
Oct 29 00:57:13 sudo[2483279]: pam_unix(sudo:session): session closed for user root
Oct 29 00:57:13 dhclient[2483177]: Invalid domain name.
Oct 29 00:57:13 dhclient[2483177]: Invalid domain name.
Oct 29 00:57:13 dhclient[2483177]: bound to 100.90.124.204 -- renewal in 41623 seconds.
Oct 29 00:57:13 dhclient[2483177]: bound to 100.90.124.204 -- renewal in 41623 seconds.

I believe it's probably generating this error because modem sends an empty hostname.

Oct 29 00:47:44 dhclient[2474669]: Invalid domain name.

image.png (769×1 px, 375 KB)

I don't know the reason for the other error

parse_option_param: Bad format d

This has happened 2-3 times since my last comment. I'll keep this open until kea-dhcp is merged and then check if this bug remains.

kea-dhcp will only be the server, not the DHCP client that you use here, I am unaware of a Kea based DHCP client.

oh okay. I don't know what else to do. this is difficult to replicate and it's scary if this happens when i am not at home and primary wan is also unavailable.

This happened again.

Logs

Dec 03 17:06:39 kernel: igc 0000:02:00.0 eth1: NIC Link is Down
Dec 03 17:06:39 netplugd[1394]: eth1: state INSANE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Dec 03 17:06:39 zebra[1753]: [HSYZM-HV7HF] Extended Error: Carrier for nexthop device is down
Dec 03 17:06:39 zebra[1753]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=830, pid=4111354676
Dec 03 17:06:39 zebra[1753]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (106[100.0.0.1 if 3]) into the kernel
Dec 03 17:06:45 kernel: igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec 03 17:06:45 netplugd[1394]: eth1: state INSANE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Dec 03 17:06:47 kernel: igc 0000:02:00.0 eth1: NIC Link is Down
Dec 03 17:06:47 netplugd[1394]: eth1: state INSANE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Dec 03 17:06:47 zebra[1753]: [HSYZM-HV7HF] Extended Error: Carrier for nexthop device is down
Dec 03 17:06:47 zebra[1753]: [WVJCK-PPMGD][EC 4043309093] netlink-dp (NS 0) error: Network is down, type=RTM_NEWNEXTHOP(104), seq=851, pid=4111354676
Dec 03 17:06:47 zebra[1753]: [X5XE1-RS0SW][EC 4043309074] Failed to install Nexthop (106[100.0.0.1 if 3]) into the kernel
Dec 03 17:06:50 kernel: igc 0000:02:00.0 eth1: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Dec 03 17:06:50 netplugd[1394]: eth1: state INSANE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Dec 03 17:06:57 wan_lb[2445853]: Interface eth1 has changed state to FAILED

Interface

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc cake state UP group default qlen 1000
    link/ether a8:b8:e0:00:4a:d8 brd ff:ff:ff:ff:ff:ff
    altname enp2s0
    inet 100.90.76.120/8 brd 100.255.255.255 scope global dynamic eth1
       valid_lft 60815sec preferred_lft 60815sec
    inet6 fe80::aab8:e0ff:fe00:4ad8/64 scope link
       valid_lft forever preferred_lft forever
vyos@router:~$ show dhcp client leases
Interface    eth1
IP address   100.90.76.120                 [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.72 59.144.144.99
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Sun Dec 03 10:13:39 IST 2023
Expiry       Mon Dec 04 10:13:39 IST 2023

Routes

vyos@router:~$ ip route
default nhid 433 dev pppoe0 proto static metric 20
10.0.10.0/24 dev br0.10 proto kernel scope link src 10.0.10.1
10.0.20.0/24 dev br0.20 proto kernel scope link src 10.0.20.1
10.0.30.0/24 dev br0.30 proto kernel scope link src 10.0.30.1
10.0.40.0/24 dev br0.40 proto kernel scope link src 10.0.40.1
10.0.50.0/24 dev br0.50 proto kernel scope link src 10.0.50.1
10.0.60.0/24 dev br0.60 proto kernel scope link src 10.0.60.1
10.0.70.0/24 dev br0.70 proto kernel scope link src 10.0.70.1
10.0.99.0/24 dev br0.99 proto kernel scope link src 10.0.99.1
10.0.150.0/24 dev br0.150 proto kernel scope link src 10.0.150.1
10.0.160.0/24 dev br0.160 proto kernel scope link src 10.0.160.1
10.1.1.0/24 dev wg0 proto kernel scope link src 10.1.1.2
10.1.10.0/24 dev wg0 proto kernel scope link src 10.1.10.1
10.1.20.0/24 dev wg0 proto kernel scope link src 10.1.20.1
10.1.30.0/24 dev wg0 proto kernel scope link src 10.1.30.1
10.1.40.0/24 dev wg0 proto kernel scope link src 10.1.40.1
10.20.128.1 nhid 88 dev wg1 proto static metric 20
100.0.0.0/8 dev eth1 proto kernel scope link src 100.90.76.120
100.64.0.1 dev pppoe0 proto kernel scope link src 103.83.146.116
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.2
192.168.8.0/24 nhid 85 dev eth1 proto static metric 20

Ping

vyos@router:~$ ping 192.168.8.1 count 4
PING 192.168.8.1 (192.168.8.1) 56(84) bytes of data.

--- 192.168.8.1 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3060ms

This happened 8 minutes ago and it has not fixed itself. I would have no connectivity if my primary WAN also goes out right at this moment. Now, I'll try to renew the lease.

vyos@router:~$ renew dhcp interface eth1
vyos@router:~$ ping 192.168.8.1
PING 192.168.8.1 (192.168.8.1) 56(84) bytes of data.
64 bytes from 192.168.8.1: icmp_seq=1 ttl=64 time=0.343 ms
64 bytes from 192.168.8.1: icmp_seq=2 ttl=64 time=0.401 ms
64 bytes from 192.168.8.1: icmp_seq=3 ttl=64 time=0.402 ms
^C
--- 192.168.8.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2041ms
rtt min/avg/max/mdev = 0.343/0.382/0.402/0.027 ms
vyos@router:~$ show dhcp client leases
Interface    eth1
IP address   100.90.126.248                [Active]
Subnet Mask  255.0.0.0
Domain Name
Router       100.0.0.1
Name Server  117.96.122.110 202.56.215.41
DHCP Server  192.168.8.1
DHCP Server  86400
VRF          default
Last Update  Sun Dec 03 17:23:03 IST 2023
Expiry       Mon Dec 04 17:23:03 IST 2023

And now it works

This has happened a few more times since last report.

I am currently on 1.5-rolling-202312130023 (upgraded 2 days ago)

This time, Router and the 5G CPE were online/booted up the entire time. The core issue was a loose cable which caused it to disconnect and reconnect for a second and it did not come online like it was supposed to once cable was reconnected. With help from some other people, I was able to dig into this a little bit further.

  1. netplug is responsible for restarting dhclient when it detects an interface was disconnected and reconnected. Some times, For some reason(perhaps a race condition some where?) it fails to do this and interface state gets stuck at INSANE instead of UP or INNING.

For reference,

This issue happened at 21:10 and then I manually restarted netplug at 21:47

Dec 15 21:10:32 router netplugd[1387]: eth1: state INSANE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001003 UP,BROADCAST,MULTICAST
Dec 15 21:10:37 router netplugd[1387]: eth1: state INSANE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
Dec 15 21:47:44 router systemd[1]: Stopping netplug.service - LSB: Brings up/down network automatically...
Dec 15 21:47:44 router netplugd[1387]: caught signal 15 - exiting
Dec 15 21:47:44 router systemd[1]: netplug.service: Deactivated successfully.
Dec 15 21:47:44 router systemd[1]: Stopped netplug.service - LSB: Brings up/down network automatically.
Dec 15 21:47:44 router systemd[1]: netplug.service: Consumed 4.636s CPU time.
Dec 15 21:47:44 router systemd[1]: Starting netplug.service - LSB: Brings up/down network automatically...
Dec 15 21:47:44 router netplug[896542]: Warning: Could not probe for any interfaces
Dec 15 21:47:44 router netplugd[896608]: /etc/netplug/netplug br0.50 in -> pid 896608
Dec 15 21:47:44 router netplugd[896609]: /etc/netplug/netplug br0.60 in -> pid 896609
Dec 15 21:47:44 router systemd[1]: Started netplug.service - LSB: Brings up/down network automatically.
Dec 15 21:47:44 router netplugd[896610]: /etc/netplug/netplug eth0 in -> pid 896610
Dec 15 21:47:44 router netplugd[896611]: /etc/netplug/netplug br0.70 in -> pid 896611
Dec 15 21:47:44 router netplugd[896612]: /etc/netplug/netplug eth1 in -> pid 896612
Dec 15 21:47:44 router netplugd[896613]: /etc/netplug/netplug br0.99 in -> pid 896613
Dec 15 21:47:44 router netplugd[896617]: /etc/netplug/netplug br0.150 in -> pid 896617
Dec 15 21:47:44 router netplugd[896624]: /etc/netplug/netplug br0.40 in -> pid 896624
Dec 15 21:47:44 router netplugd[896620]: /etc/netplug/netplug br0 in -> pid 896620
Dec 15 21:47:44 router netplugd[896618]: /etc/netplug/netplug eth4 in -> pid 896618
Dec 15 21:47:44 router netplugd[896614]: /etc/netplug/netplug eth2 in -> pid 896614
Dec 15 21:47:44 router netplugd[896621]: /etc/netplug/netplug br0.10 in -> pid 896621
Dec 15 21:47:44 router netplugd[896619]: /etc/netplug/netplug br0.160 in -> pid 896619
Dec 15 21:47:44 router netplugd[896622]: /etc/netplug/netplug br0.20 in -> pid 896622
Dec 15 21:47:44 router netplugd[896615]: /etc/netplug/netplug br0.140 in -> pid 896615
Dec 15 21:47:44 router netplugd[896616]: /etc/netplug/netplug eth3 in -> pid 896616
Dec 15 21:47:44 router netplugd[896623]: /etc/netplug/netplug br0.30 in -> pid 896623
Dec 15 21:47:45 router netplugd[896607]: br0.50: state INNING pid 896608 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.160: state INNING pid 896619 exited status 0
Dec 15 21:47:45 router netplugd[896607]: eth4: state INNING pid 896618 exited status 0
Dec 15 21:47:45 router netplugd[896607]: eth2: state INNING pid 896614 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.20: state INNING pid 896622 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.30: state INNING pid 896623 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.150: state INNING pid 896617 exited status 0
Dec 15 21:47:45 router netplugd[896607]: eth3: state INNING pid 896616 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.70: state INNING pid 896611 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.140: state INNING pid 896615 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.99: state INNING pid 896613 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0: state INNING pid 896620 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.60: state INNING pid 896609 exited status 0
Dec 15 21:47:45 router netplugd[896607]: eth1: state INNING pid 896612 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.40: state INNING pid 896624 exited status 0
Dec 15 21:47:45 router netplugd[896607]: eth0: state INNING pid 896610 exited status 0
Dec 15 21:47:45 router netplugd[896607]: br0.10: state INNING pid 896621 exited status 0
  1. The 5G CPE is in bridge mode and operates a DHCP proxy. The rest of this is partially theories since I don't have a lot of access to it's internals. On the CPE side, it forgets about the issued lease when ethernet interface disconnects and it expects the router to send request for a new address when it reconnects. The dhclient on the router doesn't send any thing for ~12 minutes(I am not sure why) and then starts trying to renew lease for the same address. Looking at the packet capture, I have come to the conclusion the 5G cpe will not respond to these dhcp requests for renewing the old lease it probably wiped out from it's memory.

Because of all of this, Every thing is at a stalemate. netplug didn't switch to UP(or INNING?) state properly, didn't restart dhclient and dhclient is stuck trying to renew lease for an address that can not be used anymore.

I restarted netplug at 21:47 which changed eth1 state from INSANE to INNING but that alone did not fix the problem because dhclient is still stuck trying to renew lease for the old address. So, Restarting dhclient finally fixes the problem.

Complete logs from all services when this happened

Few solutions that were suggested,

  1. Limit max lease time for the dhcp lease. Currently the 5G CPE sends leases with 24h expiry. This can help in reducing the outage but I feel it's a poor duct tape and as of today, Vyos does not have configuration options to adjust this parameter for dhcp clients.
  1. Run a script in the background maybe at some intervals(?) to release/renew leases. This also feels like a poor solution to me.

Please let me know if there is any thing I can do to help here.

Viacheslav triaged this task as Normal priority.Jan 20 2024, 1:36 PM