Page MenuHomeVyOS Platform

[EDIT] Service Restored: Outage: Interface stops forwarding, IPv4 martian seen in the logs
Resolved (N/A)PublicBUG

Description

I've been chasing this "random outage" where the external interface eth0 stops forwarding traffic. I can reach all inter-vlan networks on other interfaces but not anything egressing eth0. I have to bounce the interface to start forwarding again. I even called my upstream ISP to make sure everything was good. Everything is good on their end.

This has happened multiple times today and I finally was able to catch it right when it happened. Each time it has happened, I see this in the logs: (UTC)

May 26 03:46:12 rtr kernel: [11107.071394] IPv4: martian source 173.69.150.1 from 173.69.150.200, on dev eth0
May 26 03:46:12 rtr kernel: [11107.071412] ll header: 00000000: ff ff ff ff ff ff 3c ec ef 6d 7d d0 08 06

I had a continuous ping going from my workstation simultaneously, in which would send 1 ping to 8.8.8.8 and print the current date: (Below is local time: EDT)

while :; do ping -c 1 8.8.8.8 | grep packet|grep -v ' 0%'; echo $(date); sleep 2; done

Wed May 25 23:45:58 EDT 2022
Wed May 25 23:46:00 EDT 2022
Wed May 25 23:46:02 EDT 2022
Wed May 25 23:46:04 EDT 2022
Wed May 25 23:46:06 EDT 2022
Wed May 25 23:46:08 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:46:20 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:46:32 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:46:44 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:46:56 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:47:08 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:47:20 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Wed May 25 23:47:32 EDT 2022

Some aditional show commands during the outage period:

admin@rtr:~$ show int
Codes: S - State, L - Link, u - Up, D - Down, A - Admin Down
Interface        IP Address                        S/L  Description
---------        ----------                        ---  -----------
eth0             173.69.150.200/24                 u/u
eth1             -                                 u/D
eth2             -                                 u/D
eth3             -                                 u/D
eth4             -                                 u/D
eth5             -                                 u/D
eth6             -                                 u/D
eth7             -                                 u/u
eth7.5           10.200.5.1/24                     u/u  v5: MGMT Interfaces
eth7.10          10.200.10.1/24                    u/u  v10: DNS
                 2600:4040:b034:250a::1/64
                 fd10::1/64
eth7.15          10.200.15.1/24                    u/u  v15: Printer
                 2600:4040:b034:250f::1/64
eth7.20          192.168.1.1/24                    u/u
                 10.0.0.1/24
lo               127.0.0.1/8                       u/u
                 ::1/128

admin@rtr:~$ sho ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

S>* 0.0.0.0/0 [210/0] via 173.69.150.1, eth0, weight 1, 01:54:24
C>* 10.0.0.0/24 is directly connected, eth7.20, 03:08:07
C>* 10.200.5.0/24 is directly connected, eth7.5, 03:08:08
C>* 10.200.10.0/24 is directly connected, eth7.10, 03:08:08
C>* 10.200.15.0/24 is directly connected, eth7.15, 03:08:08
C>* 173.69.150.0/24 is directly connected, eth0, 01:54:24
C>* 192.168.1.0/24 is directly connected, eth7.20, 03:08:07

admin@rtr:~$ sho ipv6 route
Codes: K - kernel route, C - connected, S - static, R - RIPng,
       O - OSPFv3, I - IS-IS, B - BGP, N - NHRP, T - Table,
       v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

K>* ::/0 [0/1024] via fe80::42b4:f0ff:fe0d:10f2, eth0, 01:54:32
C>* 2600:4040:b034:250a::/64 is directly connected, eth7.10, 01:54:26
C>* 2600:4040:b034:250f::/64 is directly connected, eth7.15, 01:54:26
C>* fd10::/64 is directly connected, eth7.10, 03:08:12
C * fe80::/64 is directly connected, eth0, 01:54:31
C * fe80::/64 is directly connected, eth7.15, 03:08:12
C * fe80::/64 is directly connected, eth7.5, 03:08:12
C * fe80::/64 is directly connected, eth7.20, 03:08:12
C * fe80::/64 is directly connected, eth7.10, 03:08:12
C * fe80::/64 is directly connected, eth7, 03:08:13
C>* fe80::/64 is directly connected, lo, 03:08:16

admin@rtr:~$ ping6 fe80::42b4:f0ff:fe0d:10f2
PING fe80::42b4:f0ff:fe0d:10f2(fe80::42b4:f0ff:fe0d:10f2) 56 data bytes
From fe80::200:ff:fe00:0%lo icmp_seq=1 Destination unreachable: No route
From fe80::200:ff:fe00:0%lo icmp_seq=2 Destination unreachable: No route
^C
--- fe80::42b4:f0ff:fe0d:10f2 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1047ms

admin@rtr:~$ ping 173.69.150.1
PING 173.69.150.1 (173.69.150.1) 56(84) bytes of data.
^C
--- 173.69.150.1 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 9245ms


admin@rtr:~$ ip link show eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:6d:7c:6a brd ff:ff:ff:ff:ff:ff

admin@rtr:~$ ip a show eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 3c:ec:ef:6d:7c:6a brd ff:ff:ff:ff:ff:ff
    inet 173.69.150.200/24 brd 173.69.150.255 scope global dynamic eth0
       valid_lft 6688sec preferred_lft 6688sec
    inet6 fe80::3eec:efff:fe6d:7c6a/64 scope link
       valid_lft forever preferred_lft forever

This happens on both images. I upgraded from one to the other today hoping a new release might fix it but it didn't:

admin@rtr:~$ sho system image
The system currently has the following image(s) installed:

   1: 1.4-rolling-202205250217 (default boot) (running image)
   2: 1.4-rolling-202201150317

I normally have to bounce the interface or reboot in order to forward traffic again. After I bounce the interface, here are the logs:

May 26 03:49:45 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 03:49:46 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:49:46 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 03:49:46 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:51:46 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 03:51:47 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:51:47 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 03:51:47 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:53:46 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 03:53:47 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:53:47 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 03:53:47 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:55:47 rtr vyos-configd[663]: Received message: {"type": "init"}
May 26 03:55:47 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 03:55:47 rtr vyos-configd[663]: config session pid is 5331
May 26 03:55:47 rtr vyos-configd[663]: Received message: {"type": "node", "data": "VYOS_TAGNODE_VALUE=eth0/usr/libexec/vyos/conf_mode/interfaces-ethernet.py"}
May 26 03:55:47 rtr systemd[1]: Stopping WIDE DHCPv6 client on eth0...
May 26 03:55:47 rtr dhcp6c[4114]: client6_recvreply: status code: success
May 26 03:55:47 rtr dhcp6c[4114]: check_exit: exiting
May 26 03:55:47 rtr systemd[1]: [email protected]: Succeeded.
May 26 03:55:47 rtr systemd[1]: Stopped WIDE DHCPv6 client on eth0.
May 26 03:55:47 rtr systemd[1]: Stopping DHCP client on eth0...
May 26 03:55:47 rtr dhclient[5477]: Killed old client process
May 26 03:55:47 rtr dhclient[5477]: Killed old client process
May 26 03:55:48 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:55:48 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 03:55:48 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 03:55:48 rtr dhclient[5477]: DHCPRELEASE of 173.69.150.200 on eth0 to 173.69.150.1 port 67
May 26 03:55:48 rtr dhclient-script-vyos[5481]: Current dhclient PID: 5477, Parent PID: 1, IP version: 4, All dhclients for interface eth0: 5477
May 26 03:55:48 rtr dhclient-script-vyos[5481]: Passing command to /usr/sbin/ip: "-4 addr flush dev eth0 label eth0"
May 26 03:55:48 rtr dhclient-script-vyos[5481]: Deleting search domains with tag "dhcp-eth0" via vyos-hostsd-client
May 26 03:55:48 rtr staticd[1154]: [S4MGP-4WQTA] route_notify_owner: Route 0.0.0.0/0 failed to install for table: 254
May 26 03:55:48 rtr vyos-hostsd[664]: Request data: {"type": "search_domains", "op": "delete", "data": ["dhcp-eth0"]}
May 26 03:55:48 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:55:48 rtr vyos-hostsd[664]: Sent response: {'data': None}
May 26 03:55:48 rtr dhclient-script-vyos[5481]: Deleting nameservers with tag "dhcp-eth0" via vyos-hostsd-client
May 26 03:55:48 rtr vyos-hostsd[664]: Request data: {"type": "name_servers", "op": "delete", "data": ["dhcp-eth0"]}
May 26 03:55:48 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:55:48 rtr vyos-hostsd[664]: Sent response: {'data': None}
May 26 03:55:49 rtr dhclient-script-vyos[5481]: Deleting default route: via 173.69.150.1 dev eth0 metric 210
May 26 03:55:49 rtr dhclient-script-vyos[5481]: FRR status: running
May 26 03:55:49 rtr dhclient-script-vyos[5481]: Checking if the route presented in kernel: default via 173.69.150.1 dev eth0 metric 210
May 26 03:55:49 rtr dhclient-script-vyos[5481]: Converted vtysh command: "no ip route 0.0.0.0/0 173.69.150.1 eth0 tag 210 210 "
May 26 03:55:49 rtr dhclient-script-vyos[5481]: Sending command to vtysh
May 26 03:55:49 rtr dhclient-script-vyos[5481]: Applying changes via vyos-hostsd-client
May 26 03:55:49 rtr vyos-hostsd[664]: Request data: {"op": "apply"}
May 26 03:55:49 rtr vyos-hostsd[664]: Applying 2 changes
May 26 03:55:49 rtr vyos-hostsd[664]: Writing /etc/resolv.conf
May 26 03:55:49 rtr vyos-hostsd[664]: Writing /etc/hosts
May 26 03:55:49 rtr vyos-hostsd[664]: Writing /run/powerdns/recursor.vyos-hostsd.conf.lua
May 26 03:55:49 rtr vyos-hostsd[664]: Writing /run/powerdns/recursor.forward-zones.conf
May 26 03:55:49 rtr vyos-hostsd[664]: pdns_recursor not running, not sending "reload-lua-config"
May 26 03:55:49 rtr vyos-hostsd[664]: pdns_recursor not running, not sending "reload-zones"
May 26 03:55:49 rtr vyos-hostsd[664]: Success
May 26 03:55:49 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:55:49 rtr vyos-hostsd[664]: Sent response: {'data': {'message': 'Applied 2 changes'}}
May 26 03:55:49 rtr systemd[1]: [email protected]: Succeeded.
May 26 03:55:49 rtr systemd[1]: Stopped DHCP client on eth0.
May 26 03:55:49 rtr systemd[1]: [email protected]: Consumed 2.312s CPU time.
May 26 03:55:49 rtr netplugd[1034]: eth0: state INSANE flags 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000 -> 0x00001002 BROADCAST,MULTICAST
May 26 03:55:49 rtr netplugd[5558]: /etc/netplug/netplug eth0 probe -> pid 5558
May 26 03:55:49 rtr netplugd[1034]: eth0: state PROBING pid 5558 exited status 256
May 26 03:55:49 rtr netplugd[1034]: Could not bring eth0 back up
May 26 03:55:49 rtr vyos-configd[663]: Sending response 1
May 26 03:55:49 rtr systemd[1]: opt-vyatta-config-tmp-new_config_5331.mount: Succeeded.
May 26 03:55:49 rtr systemd[2667]: opt-vyatta-config-tmp-new_config_5331.mount: Succeeded.
May 26 03:55:50 rtr commit: Successful change to active configuration by user admin on /dev/pts/0
May 26 03:55:50 rtr ntpd[2411]: Deleting interface #27 eth0, 173.69.150.200#123, interface stats: received=89, sent=92, dropped=0, active_time=7225 secs
May 26 03:55:50 rtr ntpd[2411]: 34.206.168.146 local addr 173.69.150.200 -> <null>
May 26 03:55:50 rtr ntpd[2411]: 18.193.41.138 local addr 173.69.150.200 -> <null>
May 26 03:55:50 rtr ntpd[2411]: 122.248.201.177 local addr 173.69.150.200 -> <null>
May 26 03:55:50 rtr ntpd[2411]: Deleting interface #28 eth0, fe80::3eec:efff:fe6d:7c6a%4#123, interface stats: received=0, sent=0, dropped=0, active_time=7225 secs
May 26 03:55:50 rtr ntpd[2411]: Deleting interface #31 eth7.10, 2600:4040:b034:250a::1#123, interface stats: received=0, sent=0, dropped=0, active_time=7221 secs
May 26 03:55:50 rtr ntpd[2411]: Deleting interface #32 eth7.15, 2600:4040:b034:250f::1#123, interface stats: received=0, sent=0, dropped=0, active_time=7221 secs
May 26 03:56:35 rtr vyos-configd[663]: Received message: {"type": "init"}
May 26 03:56:35 rtr vyos-configd[663]: config session pid is 5331
May 26 03:56:35 rtr vyos-configd[663]: Received message: {"type": "node", "data": "VYOS_TAGNODE_VALUE=eth0/usr/libexec/vyos/conf_mode/interfaces-ethernet.py"}
May 26 03:56:35 rtr systemd[1]: Starting WIDE DHCPv6 client on eth0...
May 26 03:56:35 rtr dhcp6c[5632]: dhcp6_ctl_authinit: failed to open /run/dhcp6c/dhcp6c.eth0.sock: No such file or directory
May 26 03:56:35 rtr dhcp6c[5632]: client6_init: failed initialize control message authentication
May 26 03:56:35 rtr dhcp6c[5632]: client6_init: skip opening control port
May 26 03:56:35 rtr systemd[1]: Started WIDE DHCPv6 client on eth0.
May 26 03:56:35 rtr systemd[1]: Starting DHCP client on eth0...
May 26 03:56:35 rtr systemd[1]: Started DHCP client on eth0.
May 26 03:56:35 rtr dhcp6c[5633]: check_exit: exiting
May 26 03:56:35 rtr systemd[1]: Stopping WIDE DHCPv6 client on eth0...
May 26 03:56:35 rtr systemd[1]: [email protected]: Succeeded.
May 26 03:56:35 rtr systemd[1]: Stopped WIDE DHCPv6 client on eth0.
May 26 03:56:35 rtr systemd[1]: Starting WIDE DHCPv6 client on eth0...
May 26 03:56:35 rtr dhcp6c[5650]: dhcp6_ctl_authinit: failed to open /run/dhcp6c/dhcp6c.eth0.sock: No such file or directory
May 26 03:56:35 rtr dhcp6c[5650]: client6_init: failed initialize control message authentication
May 26 03:56:35 rtr dhcp6c[5650]: client6_init: skip opening control port
May 26 03:56:35 rtr systemd[1]: Started WIDE DHCPv6 client on eth0.
May 26 03:56:35 rtr dhclient-script-vyos[5645]: Current dhclient PID: 5640, Parent PID: 5639, IP version: 4, All dhclients for interface eth0: 5639 5640
May 26 03:56:35 rtr dhclient-script-vyos[5645]: Passing command to /usr/sbin/ip: "link set dev eth0 up"
May 26 03:56:35 rtr kernel: [11730.311653] 8021q: adding VLAN 0 to HW filter on device eth0
May 26 03:56:35 rtr netplugd[1034]: eth0: state DOWN flags 0x00001002 BROADCAST,MULTICAST -> 0x00001003 UP,BROADCAST,MULTICAST
May 26 03:56:35 rtr dhclient-script-vyos[5645]: No changes to apply via vyos-hostsd-client
May 26 03:56:35 rtr vyos-configd[663]: Sending response 1
May 26 03:56:35 rtr dhclient[5640]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 5
May 26 03:56:35 rtr systemd[1]: opt-vyatta-config-tmp-new_config_5331.mount: Succeeded.
May 26 03:56:35 rtr systemd[2667]: opt-vyatta-config-tmp-new_config_5331.mount: Succeeded.
May 26 03:56:35 rtr dhcp6c[5651]: client6_send: transmit failed: Network is unreachable
May 26 03:56:36 rtr commit: Successful change to active configuration by user admin on /dev/pts/0
May 26 03:56:37 rtr dhcp6c[5651]: client6_send: transmit failed: Network is unreachable
May 26 03:56:39 rtr dhcp6c[5651]: client6_send: transmit failed: Network is unreachable
May 26 03:56:39 rtr kernel: [11734.117246] igb 0000:65:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
May 26 03:56:39 rtr kernel: [11734.117392] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
May 26 03:56:39 rtr netplugd[1034]: eth0: state INACTIVE flags 0x00001003 UP,BROADCAST,MULTICAST -> 0x00011043 UP,BROADCAST,RUNNING,MULTICAST,10000
May 26 03:56:39 rtr netplugd[5720]: /etc/netplug/netplug eth0 in -> pid 5720
May 26 03:56:39 rtr dhclient[5726]: Killed old client process
May 26 03:56:39 rtr systemd[1]: [email protected]: Succeeded.
May 26 03:56:40 rtr systemd[1]: [email protected]: Scheduled restart job, restart counter is at 1.
May 26 03:56:40 rtr systemd[1]: Stopped DHCP client on eth0.
May 26 03:56:40 rtr systemd[1]: Starting DHCP client on eth0...
May 26 03:56:40 rtr systemd[1]: Started DHCP client on eth0.
May 26 03:56:40 rtr dhclient-script-vyos[5739]: Current dhclient PID: 5735, Parent PID: 5734, IP version: 4, All dhclients for interface eth0: 5722 5725 5726 5734 5735
May 26 03:56:40 rtr dhclient-script-vyos[5739]: Stopping dhclient with PID: 5722, PID file: , Leases file:
May 26 03:56:40 rtr dhclient-script-vyos[5739]: PID file  does not exists, killing dhclient with SIGTERM signal
May 26 03:56:40 rtr netplugd[1034]: eth0: state INNING pid 5720 exited status 256
May 26 03:56:40 rtr dhclient-script-vyos[5739]: Stopping dhclient with PID: 5725, PID file: /var/lib/dhcp/dhclient_eth0.pid, Leases file:
May 26 03:56:40 rtr dhclient-script-vyos[5739]: PID file /var/lib/dhcp/dhclient_eth0.pid does not exists, killing dhclient with SIGTERM signal
May 26 03:56:40 rtr dhclient-script-vyos[5739]: Stopping dhclient with PID: 5726, PID file: /var/lib/dhcp/dhclient_eth0.pid, Leases file:
May 26 03:56:40 rtr dhclient-script-vyos[5739]: PID file /var/lib/dhcp/dhclient_eth0.pid does not exists, killing dhclient with SIGTERM signal
May 26 03:56:40 rtr dhclient-script-vyos[5739]: Passing command to /usr/sbin/ip: "link set dev eth0 up"
May 26 03:56:40 rtr dhclient-script-vyos[5739]: No changes to apply via vyos-hostsd-client
May 26 03:56:40 rtr dhclient-script-vyos[5791]: Current dhclient PID: 5784, Parent PID: 5782, IP version: , All dhclients for interface eth0: 5734 5735 5782 5784
May 26 03:56:40 rtr dhclient-script-vyos[5791]: Stopping dhclient with PID: 5734, PID file: /var/lib/dhcp/dhclient_eth0.pid, Leases file: /var/lib/dhcp/dhclient_eth0.leases
May 26 03:56:40 rtr dhclient-script-vyos[5791]: PID file /var/lib/dhcp/dhclient_eth0.pid does not exists, killing dhclient with SIGTERM signal
May 26 03:56:40 rtr dhclient-script-vyos[5791]: Stopping dhclient with PID: 5735, PID file: /var/lib/dhcp/dhclient_eth0.pid, Leases file: /var/lib/dhcp/dhclient_eth0.leases
May 26 03:56:40 rtr dhclient[5735]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 5
May 26 03:56:40 rtr dhclient[5735]: write to parent: Broken pipe
May 26 03:56:40 rtr dhclient[5735]:
May 26 03:56:40 rtr dhclient[5735]: If you think you have received this message due to a bug rather
May 26 03:56:40 rtr dhclient[5735]: than a configuration issue please read the section on submitting
May 26 03:56:40 rtr dhclient[5735]: bugs on either our web page at www.isc.org or in the README file
May 26 03:56:40 rtr dhclient[5735]: before submitting a bug.  These pages explain the proper
May 26 03:56:40 rtr dhclient[5735]: process and the information we find helpful for debugging.
May 26 03:56:40 rtr dhclient[5735]:
May 26 03:56:40 rtr dhclient[5735]: exiting.
May 26 03:56:40 rtr dhclient-script-vyos[5791]: PID file /var/lib/dhcp/dhclient_eth0.pid does not exists, killing dhclient with SIGTERM signal
May 26 03:56:40 rtr systemd[1]: [email protected]: Succeeded.
May 26 03:56:40 rtr dhclient-script-vyos[5791]: Passing command to /usr/sbin/ip: "link set dev eth0 up"
May 26 03:56:40 rtr dhclient-script-vyos[5791]: No changes to apply via vyos-hostsd-client
May 26 03:56:40 rtr dhclient[5784]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3
May 26 03:56:40 rtr systemd[1]: [email protected]: Scheduled restart job, restart counter is at 2.
May 26 03:56:40 rtr systemd[1]: Stopped DHCP client on eth0.
May 26 03:56:40 rtr systemd[1]: Starting DHCP client on eth0...
May 26 03:56:40 rtr systemd[1]: Started DHCP client on eth0.
May 26 03:56:40 rtr dhclient-script-vyos[5858]: Current dhclient PID: 5854, Parent PID: 5853, IP version: 4, All dhclients for interface eth0: 5784 5853 5854
May 26 03:56:40 rtr dhclient-script-vyos[5858]: Stopping dhclient with PID: 5784, PID file: /var/lib/dhcp/dhclient_eth0.pid, Leases file: /var/lib/dhcp/dhclient_eth0.leases
May 26 03:56:40 rtr dhclient[5882]: Killed old client process
May 26 03:56:40 rtr dhclient[5882]: Killed old client process
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth7.20 to 255.255.255.255 port 67 interval 8
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth7.15 to 255.255.255.255 port 67 interval 6
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth7.10 to 255.255.255.255 port 67 interval 6
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth7.5 to 255.255.255.255 port 67 interval 8
May 26 03:56:41 rtr dhcpd[2475]: DHCPDISCOVER from 3c:ec:ef:6d:7a:dd (rtr) via eth7.20
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth3 to 255.255.255.255 port 67 interval 5
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth2 to 255.255.255.255 port 67 interval 3
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 4
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth7 to 255.255.255.255 port 67 interval 5
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth6 to 255.255.255.255 port 67 interval 3
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth5 to 255.255.255.255 port 67 interval 3
May 26 03:56:41 rtr dhclient[5882]: DHCPDISCOVER on eth4 to 255.255.255.255 port 67 interval 6
May 26 03:56:41 rtr dhclient-script-vyos[5858]: Passing command to /usr/sbin/ip: "link set dev eth0 up"
May 26 03:56:41 rtr dhclient-script-vyos[5858]: No changes to apply via vyos-hostsd-client
May 26 03:56:41 rtr dhclient[5854]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 2
May 26 03:56:42 rtr dhclient[5854]: DHCPOFFER of 173.69.150.200 from 173.69.150.1
May 26 03:56:42 rtr dhclient[5854]: DHCPREQUEST for 173.69.150.200 on eth0 to 255.255.255.255 port 67
May 26 03:56:42 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.165 to 3c:ec:ef:6d:7a:dd (rtr) via eth7.20
May 26 03:56:42 rtr dhclient[5854]: DHCPACK of 173.69.150.200 from 173.69.150.1
May 26 03:56:42 rtr dhclient-script-vyos[5899]: Current dhclient PID: 5854, Parent PID: 1, IP version: 4, All dhclients for interface eth0: 5854
May 26 03:56:42 rtr dhclient-script-vyos[5899]: Passing command to /usr/sbin/ip: "-4 addr add 173.69.150.200/255.255.255.0 broadcast 173.69.150.255 valid_lft 7200 preferred_lft 7200 dev eth0 label eth0"
May 26 03:56:42 rtr dhclient-script-vyos[5899]: FRR status: running
May 26 03:56:42 rtr dhclient-script-vyos[5899]: Checking if the route presented in kernel: default via 173.69.150.1 dev eth0 metric 210
May 26 03:56:42 rtr dhclient-script-vyos[5899]: Converted vtysh command: "ip route 0.0.0.0/0 173.69.150.1 eth0 tag 210 210 "
May 26 03:56:42 rtr dhclient-script-vyos[5899]: Sending command to vtysh
May 26 03:56:42 rtr dhclient-script-vyos[5899]: Deleting search domains with tag "dhcp-eth0" via vyos-hostsd-client
May 26 03:56:43 rtr vyos-hostsd[664]: Request data: {"type": "search_domains", "op": "delete", "data": ["dhcp-eth0"]}
May 26 03:56:43 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:56:43 rtr vyos-hostsd[664]: Sent response: {'data': None}
May 26 03:56:43 rtr dhclient-script-vyos[5899]: Adding domain name "verizon.net" as search domain with tag "dhcp-eth0" via vyos-hostsd-client
May 26 03:56:43 rtr vyos-hostsd[664]: Request data: {"type": "search_domains", "op": "add", "data": {"dhcp-eth0": ["verizon.net"]}}
May 26 03:56:43 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:56:43 rtr vyos-hostsd[664]: Sent response: {'data': None}
May 26 03:56:43 rtr dhclient-script-vyos[5899]: Deleting nameservers with tag "dhcp-eth0" via vyos-hostsd-client
May 26 03:56:43 rtr vyos-hostsd[664]: Request data: {"type": "name_servers", "op": "delete", "data": ["dhcp-eth0"]}
May 26 03:56:43 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:56:43 rtr vyos-hostsd[664]: Sent response: {'data': None}
May 26 03:56:43 rtr dhclient-script-vyos[5899]: Adding nameservers "71.242.0.12 71.252.0.12" with tag "dhcp-eth0" via vyos-hostsd-client
May 26 03:56:43 rtr vyos-hostsd[664]: Request data: {"type": "name_servers", "op": "add", "data": {"dhcp-eth0": ["71.242.0.12", "71.252.0.12"]}}
May 26 03:56:43 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:56:43 rtr vyos-hostsd[664]: Sent response: {'data': None}
May 26 03:56:43 rtr dhclient-script-vyos[5899]: Applying changes via vyos-hostsd-client
May 26 03:56:43 rtr vyos-hostsd[664]: Request data: {"op": "apply"}
May 26 03:56:43 rtr vyos-hostsd[664]: Applying 4 changes
May 26 03:56:43 rtr vyos-hostsd[664]: Writing /etc/resolv.conf
May 26 03:56:43 rtr vyos-hostsd[664]: Writing /etc/hosts
May 26 03:56:43 rtr vyos-hostsd[664]: Writing /run/powerdns/recursor.vyos-hostsd.conf.lua
May 26 03:56:43 rtr vyos-hostsd[664]: Writing /run/powerdns/recursor.forward-zones.conf
May 26 03:56:43 rtr vyos-hostsd[664]: pdns_recursor not running, not sending "reload-lua-config"
May 26 03:56:43 rtr vyos-hostsd[664]: pdns_recursor not running, not sending "reload-zones"
May 26 03:56:43 rtr vyos-hostsd[664]: Success
May 26 03:56:43 rtr vyos-hostsd[664]: Saving state to /run/vyos-hostsd/vyos-hostsd.state
May 26 03:56:43 rtr vyos-hostsd[664]: Sent response: {'data': {'message': 'Applied 4 changes'}}
May 26 03:56:43 rtr dhclient-script-vyos[5899]: No changes to apply via vyos-hostsd-client
May 26 03:56:43 rtr dhclient[5854]: bound to 173.69.150.200 -- renewal in 3147 seconds.
May 26 03:56:47 rtr ntpd[2411]: Listen normally on 33 eth0 173.69.150.200:123
May 26 03:56:47 rtr ntpd[2411]: Listen normally on 34 eth0 [fe80::3eec:efff:fe6d:7c6a%4]:123
May 26 03:56:47 rtr ntpd[2411]: Listen normally on 35 eth7.10 [2600:4040:b037:950a::1]:123
May 26 03:56:47 rtr ntpd[2411]: Listen normally on 36 eth7.15 [2600:4040:b037:950f::1]:123

Details

Difficulty level
Unknown (require assessment)
Version
1.4-rolling-202205250217
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

I just caught it again. Same logs line up with my continuous ping.

Also, I noticed this seems to happen about every 1 hour and 50 minutes (not exactly 110 minutes but close). I'm using the install timer on the default route as my gauge.

Here's some more logs: (UTC)

May 26 05:06:58 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.150 from 88:de:a9:15:23:4c (Basement-RokuUltra) via eth7.20
May 26 05:06:58 rtr dhcpd[2475]: DHCPACK on 192.168.1.150 to 88:de:a9:15:23:4c (Basement-RokuUltra) via eth7.20
May 26 05:07:01 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.160 (192.168.1.1) from 00:18:dd:05:46:04 (HDHR-1054604B) via eth7.20
May 26 05:07:01 rtr dhcpd[2475]: DHCPACK on 192.168.1.160 to 00:18:dd:05:46:04 (HDHR-1054604B) via eth7.20
May 26 05:22:11 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.148 from bc:5f:f4:20:be:36 via eth7.20
May 26 05:22:11 rtr dhcpd[2475]: DHCPACK on 192.168.1.148 to bc:5f:f4:20:be:36 via eth7.20
May 26 05:22:56 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.166 from a8:bb:50:63:78:f4 (wiz_6378f4) via eth7.20
May 26 05:22:56 rtr dhcpd[2475]: DHCPACK on 192.168.1.166 to a8:bb:50:63:78:f4 (wiz_6378f4) via eth7.20
May 26 05:22:58 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.151 from a8:bb:50:63:53:14 (wiz_635314) via eth7.20
May 26 05:22:58 rtr dhcpd[2475]: DHCPACK on 192.168.1.151 to a8:bb:50:63:53:14 (wiz_635314) via eth7.20
May 26 05:22:58 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.167 from b0:a7:b9:fa:7e:b8 (KP405) via eth7.20
May 26 05:22:58 rtr dhcpd[2475]: DHCPACK on 192.168.1.167 to b0:a7:b9:fa:7e:b8 (KP405) via eth7.20
May 26 05:22:59 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.128 from a8:bb:50:68:de:58 (wiz_68de58) via eth7.20
May 26 05:22:59 rtr dhcpd[2475]: DHCPACK on 192.168.1.128 to a8:bb:50:68:de:58 (wiz_68de58) via eth7.20
May 26 05:46:16 rtr kernel: [18311.049924] IPv4: martian source 173.69.150.1 from 173.69.150.200, on dev eth0
May 26 05:46:16 rtr kernel: [18311.049943] ll header: 00000000: ff ff ff ff ff ff 3c ec ef 6d 7d d0 08 06
May 26 05:47:53 rtr dhclient[5854]: DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67
May 26 05:49:40 rtr dhclient[5854]: message repeated 9 times: [ DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67]
May 26 05:49:41 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 05:49:42 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 05:49:42 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 05:49:42 rtr dhcpd[2475]: Wrote 0 class decls to leases file.
May 26 05:49:42 rtr dhcpd[2475]: Wrote 42 leases to leases file.
May 26 05:49:42 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 05:49:57 rtr dhclient[5854]: DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67
May 26 05:51:39 rtr dhclient[5854]: message repeated 8 times: [ DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67]
May 26 05:51:42 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 05:51:43 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 05:51:43 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 05:51:43 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 05:51:54 rtr dhclient[5854]: DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67
May 26 05:53:27 rtr dhclient[5854]: message repeated 6 times: [ DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67]
May 26 05:53:42 rtr dhcpd[2475]: DHCPDISCOVER from 00:25:09:97:c5:73 via eth7.20
May 26 05:53:43 rtr dhcpd[2475]: DHCPOFFER on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 05:53:43 rtr dhcpd[2475]: DHCPREQUEST for 192.168.1.164 (192.168.1.1) from 00:25:09:97:c5:73 via eth7.20
May 26 05:53:43 rtr dhcpd[2475]: DHCPACK on 192.168.1.164 to 00:25:09:97:c5:73 via eth7.20
May 26 05:53:45 rtr dhclient[5854]: DHCPREQUEST for 173.69.150.200 on eth0 to 173.69.150.1 port 67

My continuous ping logs: (EDT)

Thu May 26 01:45:58 EDT 2022
Thu May 26 01:46:00 EDT 2022
Thu May 26 01:46:02 EDT 2022
Thu May 26 01:46:04 EDT 2022
Thu May 26 01:46:06 EDT 2022
Thu May 26 01:46:08 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Thu May 26 01:46:20 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Thu May 26 01:46:32 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Thu May 26 01:46:44 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Thu May 26 01:46:56 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Thu May 26 01:47:08 EDT 2022
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Thu May 26 01:47:20 EDT 2022

and the routing table as a timer:

admin@rtr:~$ sho ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, F - PBR,
       f - OpenFabric,
       > - selected route, * - FIB route, q - queued, r - rejected, b - backup
       t - trapped, o - offload failure

S>* 0.0.0.0/0 [210/0] via 173.69.150.1, eth0, weight 1, 01:58:42
C>* 10.0.0.0/24 is directly connected, eth7.20, 05:13:43
C>* 10.200.5.0/24 is directly connected, eth7.5, 05:13:44
C>* 10.200.10.0/24 is directly connected, eth7.10, 05:13:44
C>* 10.200.15.0/24 is directly connected, eth7.15, 05:13:44
C>* 173.69.150.0/24 is directly connected, eth0, 01:58:42
C>* 192.168.1.0/24 is directly connected, eth7.20, 05:13:43

Again, disabling the interface via the VyOS cli, for about 30 seconds and re-enabling it brings it back up. Normally it acquires a new IP address from the ISP.

I'm trying to think what could have a 110 minute timer and the only think I can think of is the DHCP lease time:

May 26 05:58:49 rtr dhclient-script-vyos[7261]: No changes to apply via vyos-hostsd-client
May 26 05:58:49 rtr dhclient[7216]: bound to 72.81.238.169 -- renewal in 3075 seconds.

The DHCP client normally renews a little bit past 3000 seconds. If it renews at half the lease time then maybe something is going on with the dhcpclient handling the lease?

Ultimately I moved my physical connection from port eth0 to eth4, and configured eth4 with the same config as was on eth0. Once I did that everything was stable and has been stable for days (nearly a week now). I'm not sure what could be causing an interface to work for an amount of time, then as if on-cue stop forwarding packets.

The only thing I can think of is eth0 is a 1Gbps rj45 port using the igb driver and eth4 is a 1/10Gbps rj45 using the i40e driver. I'm not sure how to prove or disprove this theory that it might be strangely driver related.

I have verify eth4 is working stable on both 1.3 and 1.4 latest.

I'm not sure if this needs a fix or not as it might be a physical hardware issue but if anyone has ideas for data or logs to gather I am curious how to troubleshoot a problem like this.

showipintbri renamed this task from Outage: Interface stops forwarding, IPv4 martian seen in the logs to [EDIT] Service Restored: Outage: Interface stops forwarding, IPv4 martian seen in the logs.Jun 4 2022, 3:39 AM
dmbaturin added a subscriber: dmbaturin.

If the issue reappaers and you find the reproducing steps, feel free to reopen.