Any interface bonding changes cause interface flapping
Closed, Resolved | Public | BUG

Description

Any change to a bonding interface causes the interface to flap.
Even changing only the description triggers ip link set dev bondX down, because shutdown_required is set even when the mode/lacp-rate are unchanged.

The bug appears to have been introduced in 1.4 by this commit: https://github.com/vyos/vyos-1x/commit/95e7676aa8ae5b3476b14a334b3815c2ae59f8d6
1.3.3 could also be affected, by this commit: https://github.com/vyos/vyos-1x/commit/0f1d29ac0480dc202595b96357789e6d15d49f2c

To reproduce, start from this initial configuration:

set interfaces bonding bond21 member interface 'eth1'
set interfaces bonding bond21 mode '802.3ad'
set vrf name dmz table 1010
set interfaces bonding bond21 vif 10 address '10.0.10.1/31'
set interfaces bonding bond21 vif 10 vrf 'dmz'

Change anything; in this example, the description on a subinterface of the bonding interface:

set interfaces bonding bond21 vif 10 description "DMZ-10"

commit/debug:

vyos@r1# set interfaces bonding bond21 vif 10 description "DMZ-10"
[edit]
vyos@r1# commit
[ interfaces bonding bond21 ]
DEBUG/IFCONFIG returned (err):
Error: Could not process rule: No such file or directory
delete element inet vrf_zones ct_iface_map { bond21 }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEBUG/IFCONFIG cmd 'ip link set dev bond21 down'
DEBUG/IFCONFIG write '100' > '/sys/class/net/bond21/bonding/miimon'
DEBUG/IFCONFIG write 'layer2' > '/sys/class/net/bond21/bonding/xmit_hash_policy'
DEBUG/IFCONFIG write '0' > '/sys/class/net/bond21/bonding/min_links'
DEBUG/IFCONFIG cmd 'ip -json link show dev bond21'
DEBUG/IFCONFIG returned (out):
[{"ifindex":23,"ifname":"bond21","flags":["BROADCAST","MULTICAST","MASTER"],"mtu":1500,"qdisc":"noqueue","operstate":"DOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff"}]
DEBUG/IFCONFIG write '-eth1' > '/sys/class/net/bond21/bonding/slaves'
DEBUG/IFCONFIG cmd 'ip link set dev eth1 up'
DEBUG/IFCONFIG write '802.3ad' > '/sys/class/net/bond21/bonding/mode'
DEBUG/IFCONFIG write 'slow' > '/sys/class/net/bond21/bonding/lacp_rate'
DEBUG/IFCONFIG cmd 'ip addr flush dev "eth1"'
DEBUG/IFCONFIG cmd 'ip -json link show dev eth1'
DEBUG/IFCONFIG returned (out):
[{"ifindex":3,"ifname":"eth1","flags":["BROADCAST","MULTICAST","UP"],"mtu":1500,"qdisc":"pfifo_fast","operstate":"UP","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff"}]
DEBUG/IFCONFIG cmd 'ip link set dev eth1 down'
DEBUG/IFCONFIG write '+eth1' > '/sys/class/net/bond21/bonding/slaves'
DEBUG/IFCONFIG cmd 'ip link set dev eth1 up'
{'hash_policy': 'layer2',
 'ifname': 'bond21',
 'ip': {'arp_cache_timeout': '30'},
 'lacp_rate': 'slow',
 'member': {'interface': {'eth1': {}}},
 'mii_mon_interval': '100',
 'min_links': '0',
 'mode': '802.3ad',
 'mtu': '1500',
 'shutdown_required': {},
 'vif': {'10': {'address': ['10.0.10.1/31'],
                'description': 'DMZ-10',
                'ifname': 'bond21.10',
                'ip': {'arp_cache_timeout': '30'},
                'mtu': '1500',
                'vrf': 'dmz'}}}
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21'
DEBUG/IFCONFIG returned (out):
[{"ifindex":23,"ifname":"bond21","flags":["BROADCAST","MULTICAST","MASTER"],"mtu":1500,"qdisc":"noqueue","operstate":"DOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":68,"max_mtu":65535,"linkinfo":{"info_kind":"bond","info_data":{"mode":"802.3ad","miimon":100,"updelay":0,"downdelay":0,"peer_notify_delay":0,"use_carrier":1,"arp_interval":0,"arp_validate":null,"arp_all_targets":"any","primary_reselect":"always","fail_over_mac":"none","xmit_hash_policy":"layer2","resend_igmp":1,"num_peer_notif":1,"all_slaves_active":0,"min_links":0,"lp_interval":1,"packets_per_slave":1,"ad_lacp_active":"on","ad_lacp_rate":"slow","ad_select":"stable","ad_actor_sys_prio":65535,"ad_user_port_key":0,"ad_actor_system":"00:00:00:00:00:00","tlb_dynamic_lb":1}},"inet6_addr_gen_mode":"none","num_tx_queues":16,"num_rx_queues":16,"gso_max_size":65536,"gso_max_segs":65535}]
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv4/conf/bond21/link_filter'
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21'
DEBUG/IFCONFIG returned (out):
[{"ifindex":23,"ifname":"bond21","flags":["BROADCAST","MULTICAST","MASTER"],"mtu":1500,"qdisc":"noqueue","operstate":"DOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":68,"max_mtu":65535,"linkinfo":{"info_kind":"bond","info_data":{"mode":"802.3ad","miimon":100,"updelay":0,"downdelay":0,"peer_notify_delay":0,"use_carrier":1,"arp_interval":0,"arp_validate":null,"arp_all_targets":"any","primary_reselect":"always","fail_over_mac":"none","xmit_hash_policy":"layer2","resend_igmp":1,"num_peer_notif":1,"all_slaves_active":0,"min_links":0,"lp_interval":1,"packets_per_slave":1,"ad_lacp_active":"on","ad_lacp_rate":"slow","ad_select":"stable","ad_actor_sys_prio":65535,"ad_user_port_key":0,"ad_actor_system":"00:00:00:00:00:00","tlb_dynamic_lb":1}},"inet6_addr_gen_mode":"none","num_tx_queues":16,"num_rx_queues":16,"gso_max_size":65536,"gso_max_segs":65535}]
DEBUG/IFCONFIG cmd 'ip link set dev bond21 nomaster'
DEBUG/IFCONFIG cmd 'nft -c delete element inet vrf_zones ct_iface_map { "bond21" }'
DEBUG/IFCONFIG cmd 'nft -a list chain raw VYOS_TCP_MSS'
DEBUG/IFCONFIG returned (out):
table ip raw {
	chain VYOS_TCP_MSS { # handle 1
		type filter hook forward priority raw; policy accept;
	}
}
DEBUG/IFCONFIG read '30000' < '/proc/sys/net/ipv4/neigh/bond21/base_reachable_time_ms'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv4/conf/bond21/arp_filter'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/arp_accept'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/arp_announce'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/arp_ignore'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/proxy_arp'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/proxy_arp_pvlan'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv4/conf/bond21/forwarding'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/bc_forwarding'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21/rp_filter'
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21'
DEBUG/IFCONFIG returned (out):
[{"ifindex":23,"ifname":"bond21","flags":["BROADCAST","MULTICAST","MASTER"],"mtu":1500,"qdisc":"noqueue","operstate":"DOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":68,"max_mtu":65535,"linkinfo":{"info_kind":"bond","info_data":{"mode":"802.3ad","miimon":100,"updelay":0,"downdelay":0,"peer_notify_delay":0,"use_carrier":1,"arp_interval":0,"arp_validate":null,"arp_all_targets":"any","primary_reselect":"always","fail_over_mac":"none","xmit_hash_policy":"layer2","resend_igmp":1,"num_peer_notif":1,"all_slaves_active":0,"min_links":0,"lp_interval":1,"packets_per_slave":1,"ad_lacp_active":"on","ad_lacp_rate":"slow","ad_select":"stable","ad_actor_sys_prio":65535,"ad_user_port_key":0,"ad_actor_system":"00:00:00:00:00:00","tlb_dynamic_lb":1}},"inet6_addr_gen_mode":"none","num_tx_queues":16,"num_rx_queues":16,"gso_max_size":65536,"gso_max_segs":65535}]
DEBUG/IFCONFIG cmd 'nft -a list chain ip6 raw VYOS_TCP_MSS'
DEBUG/IFCONFIG returned (out):
table ip6 raw {
	chain VYOS_TCP_MSS { # handle 1
		type filter hook forward priority raw; policy accept;
	}
}
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv6/conf/bond21/forwarding'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv6/conf/bond21/accept_ra'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv6/conf/bond21/autoconf'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv6/conf/bond21/dad_transmits'
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21'
DEBUG/IFCONFIG returned (out):
[{"ifindex":23,"ifname":"bond21","flags":["BROADCAST","MULTICAST","MASTER"],"mtu":1500,"qdisc":"noqueue","operstate":"DOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":68,"max_mtu":65535,"linkinfo":{"info_kind":"bond","info_data":{"mode":"802.3ad","miimon":100,"updelay":0,"downdelay":0,"peer_notify_delay":0,"use_carrier":1,"arp_interval":0,"arp_validate":null,"arp_all_targets":"any","primary_reselect":"always","fail_over_mac":"none","xmit_hash_policy":"layer2","resend_igmp":1,"num_peer_notif":1,"all_slaves_active":0,"min_links":0,"lp_interval":1,"packets_per_slave":1,"ad_lacp_active":"on","ad_lacp_rate":"slow","ad_select":"stable","ad_actor_sys_prio":65535,"ad_user_port_key":0,"ad_actor_system":"00:00:00:00:00:00","tlb_dynamic_lb":1}},"inet6_addr_gen_mode":"none","num_tx_queues":16,"num_rx_queues":16,"gso_max_size":65536,"gso_max_segs":65535}]
DEBUG/IFCONFIG cmd 'ip addr add fe80::5208:ff:fe03:1/64 dev bond21'
DEBUG/IFCONFIG cmd 'xdp_loader -d bond21 -U --auto-mode'
DEBUG/IFCONFIG returned (out):
INFO: xdp_link_detach() no curr XDP prog on ifindex:23
DEBUG/IFCONFIG cmd 'tc qdisc del dev bond21 parent ffff: 2>/dev/null'
DEBUG/IFCONFIG cmd 'tc qdisc del dev bond21 parent 1: 2>/dev/null'
DEBUG/IFCONFIG cmd 'ip link set dev bond21 up'
{'address': ['10.0.10.1/31'],
 'description': 'DMZ-10',
 'ifname': 'bond21.10',
 'ip': {'arp_cache_timeout': '30'},
 'mtu': '1500',
 'vrf': 'dmz'}
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21.10'
DEBUG/IFCONFIG returned (out):
[{"ifindex":24,"link":"bond21","ifname":"bond21.10","flags":["NO-CARRIER","BROADCAST","MULTICAST","UP"],"mtu":1500,"qdisc":"noqueue","master":"dmz","operstate":"LOWERLAYERDOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":0,"max_mtu":65535,"linkinfo":{"info_kind":"vlan","info_data":{"protocol":"802.1Q","id":10,"flags":["REORDER_HDR"]},"info_slave_kind":"vrf","info_slave_data":{"table":1010}},"inet6_addr_gen_mode":"none","num_tx_queues":1,"num_rx_queues":1,"gso_max_size":65536,"gso_max_segs":65535}]
DEBUG/IFCONFIG cmd 'ip link set dev bond21.10 alias "DMZ-10"'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv4/conf/bond21.10/link_filter'
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21.10'
DEBUG/IFCONFIG returned (out):
[{"ifindex":24,"link":"bond21","ifname":"bond21.10","flags":["NO-CARRIER","BROADCAST","MULTICAST","UP"],"mtu":1500,"qdisc":"noqueue","master":"dmz","operstate":"LOWERLAYERDOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":0,"max_mtu":65535,"linkinfo":{"info_kind":"vlan","info_data":{"protocol":"802.1Q","id":10,"flags":["REORDER_HDR"]},"info_slave_kind":"vrf","info_slave_data":{"table":1010}},"inet6_addr_gen_mode":"none","num_tx_queues":1,"num_rx_queues":1,"gso_max_size":65536,"gso_max_segs":65535,"ifalias":"DMZ-10"}]
DEBUG/IFCONFIG cmd 'nft -a list chain raw VYOS_TCP_MSS'
DEBUG/IFCONFIG returned (out):
table ip raw {
	chain VYOS_TCP_MSS { # handle 1
		type filter hook forward priority raw; policy accept;
	}
}
DEBUG/IFCONFIG read '30000' < '/proc/sys/net/ipv4/neigh/bond21.10/base_reachable_time_ms'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv4/conf/bond21.10/arp_filter'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/arp_accept'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/arp_announce'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/arp_ignore'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/proxy_arp'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/proxy_arp_pvlan'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv4/conf/bond21.10/forwarding'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/bc_forwarding'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv4/conf/bond21.10/rp_filter'
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21.10'
DEBUG/IFCONFIG returned (out):
[{"ifindex":24,"link":"bond21","ifname":"bond21.10","flags":["NO-CARRIER","BROADCAST","MULTICAST","UP"],"mtu":1500,"qdisc":"noqueue","master":"dmz","operstate":"LOWERLAYERDOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":0,"max_mtu":65535,"linkinfo":{"info_kind":"vlan","info_data":{"protocol":"802.1Q","id":10,"flags":["REORDER_HDR"]},"info_slave_kind":"vrf","info_slave_data":{"table":1010}},"inet6_addr_gen_mode":"none","num_tx_queues":1,"num_rx_queues":1,"gso_max_size":65536,"gso_max_segs":65535,"ifalias":"DMZ-10"}]
DEBUG/IFCONFIG cmd 'nft -a list chain ip6 raw VYOS_TCP_MSS'
DEBUG/IFCONFIG returned (out):
table ip6 raw {
	chain VYOS_TCP_MSS { # handle 1
		type filter hook forward priority raw; policy accept;
	}
}
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv6/conf/bond21.10/forwarding'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv6/conf/bond21.10/accept_ra'
DEBUG/IFCONFIG read '0' < '/proc/sys/net/ipv6/conf/bond21.10/autoconf'
DEBUG/IFCONFIG read '1' < '/proc/sys/net/ipv6/conf/bond21.10/dad_transmits'
DEBUG/IFCONFIG cmd 'ip -json -detail link list dev bond21.10'
DEBUG/IFCONFIG returned (out):
[{"ifindex":24,"link":"bond21","ifname":"bond21.10","flags":["NO-CARRIER","BROADCAST","MULTICAST","UP"],"mtu":1500,"qdisc":"noqueue","master":"dmz","operstate":"LOWERLAYERDOWN","linkmode":"DEFAULT","group":"default","txqlen":1000,"link_type":"ether","address":"50:08:00:03:00:01","broadcast":"ff:ff:ff:ff:ff:ff","promiscuity":0,"min_mtu":0,"max_mtu":65535,"linkinfo":{"info_kind":"vlan","info_data":{"protocol":"802.1Q","id":10,"flags":["REORDER_HDR"]},"info_slave_kind":"vrf","info_slave_data":{"table":1010}},"inet6_addr_gen_mode":"none","num_tx_queues":1,"num_rx_queues":1,"gso_max_size":65536,"gso_max_segs":65535,"ifalias":"DMZ-10"}]
DEBUG/IFCONFIG cmd 'ip addr add fe80::5208:ff:fe03:1/64 dev bond21.10'
DEBUG/IFCONFIG cmd 'xdp_loader -d bond21.10 -U --auto-mode'
DEBUG/IFCONFIG returned (out):
INFO: xdp_link_detach() no curr XDP prog on ifindex:24
DEBUG/IFCONFIG cmd 'tc qdisc del dev bond21.10 parent ffff: 2>/dev/null'
DEBUG/IFCONFIG cmd 'tc qdisc del dev bond21.10 parent 1: 2>/dev/null'
DEBUG/IFCONFIG cmd 'ip link set dev bond21.10 up'

[edit]
vyos@r1#

Details

Difficulty level
Easy (less than an hour)
Version
VyOS 1.4-rolling-202212140319
Why the issue appeared?
Implementation mistake
Is it a breaking change?
Perfectly compatible
Issue type
Bug (incorrect behavior)

Event Timeline

I couldn't find an effective way to get all the new members added to the bond via config at commit time without comparing the members to the running/effective config (the function leaf_node_changed() only returns the removed interfaces). Not doing so causes either runtime commit failures (where the bond fails to add/remove members) or boot failures (where the bond fails to add all its members on boot).

The real code that needs to be addressed is here, where:

  1. It is not possible to manipulate bond members without taking the entire interface down
  2. The code is designed to 'flap' members as all bond members are removed and then re-added to the bond
  3. The bond code is not structured in a way to do any special handling for deleted members

The commit I wrote in response to T4668 was written while working within the constraints of the existing bond code (where we need to pass shutdown_required to get anything done). Ideally, the bond code in python/vyos/ifconfig/bond.py should be refactored to:

  1. Self-determine when the bond needs to be brought down (i.e. get rid of the shutdown_required flag from the config-side)
  2. Self-determine what member interfaces to manipulate (so it doesn't remove/add all interfaces every time)
  3. Have a section dedicated to handling removed interfaces (so we can set them to the correct admin up or admin down states based on their disable attribute in config)

The whole shutdown_required flag should really be self-determined by the bond code rather than the commit/config code as we don't have a full picture of the bonding state (given the limitations of how the specific commit code is called). Satisfying the requirements above will mean the config side just needs to pass the desired configuration to the bond code and the bond code manipulates the bonding state to match said configuration.
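
The refactoring goals above can be sketched as a small planning function. This is only an illustration, not the code in python/vyos/ifconfig/bond.py: the names BondPlan, plan_bond_changes and SHUTDOWN_OPTIONS are hypothetical, and treating only mode and lacp_rate as options that need the bond taken down is an assumption based on the description of this task.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption for this sketch: only these options force the bond down.
SHUTDOWN_OPTIONS = ("mode", "lacp_rate")


@dataclass
class BondPlan:
    """Hypothetical result type: what must change to reach the desired state."""
    shutdown_required: bool
    add_members: List[str] = field(default_factory=list)
    remove_members: List[str] = field(default_factory=list)


def plan_bond_changes(running: dict, desired: dict) -> BondPlan:
    """Compare running and desired bond state and return only the differences."""
    running_members = set(running.get("members", []))
    desired_members = set(desired.get("members", []))

    # The bond is only flagged for shutdown if a relevant option differs.
    option_changed = any(
        running.get(opt) != desired.get(opt) for opt in SHUTDOWN_OPTIONS
    )

    return BondPlan(
        shutdown_required=option_changed,
        # Only touch members that actually differ, instead of re-adding all.
        add_members=sorted(desired_members - running_members),
        remove_members=sorted(running_members - desired_members),
    )


running = {"mode": "802.3ad", "lacp_rate": "slow", "members": ["eth1"]}

# A description-only change, as in the reproduction steps of this task.
description_only = dict(running, description="DMZ-10")
plan = plan_bond_changes(running, description_only)
```

With this approach the config side would only pass the desired configuration, and a description-only change would produce a plan with no shutdown and no member changes.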

zsdc changed the task status from Open to Confirmed. Dec 15 2022, 8:57 AM
zsdc claimed this task.
zsdc triaged this task as Normal priority.
zsdc changed Difficulty level from Unknown (require assessment) to Easy (less than an hour).
zsdc changed Why the issue appeared? from Will be filled on close to Implementation mistake.
zsdc changed Is it a breaking change? from Unspecified (possibly destroys the router) to Perfectly compatible.
zsdc changed the task status from Confirmed to In progress. Dec 15 2022, 9:50 AM

I agree that internal logic can be better, but I think that in this specific case the problem is much simpler: https://github.com/vyos/vyos-1x/pull/1708

@zsdc Yeah I see the bug now, I made the assumption that the config level by default was set to the bond (i.e. interfaces bonding bondX), good catch. Tested in a VM and I can confirm no regression in existing bonding behavior.

Viacheslav changed the task status from In progress to Needs testing. Dec 15 2022, 3:15 PM
zsdc changed the task status from Needs testing to Backport pending. Dec 16 2022, 10:42 AM

VyOS 1.3 uses config.set_level() in get_interface_dict(), so this bug is not present in VyOS 1.3.
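
To illustrate why the config level matters, here is a toy model of a level-relative lookup. It is not the vyos.config API and not the code changed in the pull request: ToyConfig and its children() method are hypothetical, and the member comparison only shows one way a wrong level assumption could make every member look new.

```python
class ToyConfig:
    """Toy config tree with a movable lookup level (hypothetical, not vyos.config)."""

    def __init__(self, tree: dict):
        self._tree = tree
        self._level: list = []

    def set_level(self, path: list) -> None:
        """Make later lookups relative to the given path."""
        self._level = list(path)

    def children(self, path: list) -> list:
        """Return child names under level + path, or [] if the path is missing."""
        node = self._tree
        for key in self._level + path:
            if not isinstance(node, dict) or key not in node:
                return []
            node = node[key]
        return sorted(node) if isinstance(node, dict) else []


effective = ToyConfig(
    {"interfaces": {"bonding": {"bond21": {"member": {"interface": {"eth1": {}}}}}}}
)
configured_members = {"eth1"}

# Level is still at the root: the relative path matches nothing, so the
# existing member eth1 is wrongly treated as newly added.
new_at_root = configured_members - set(effective.children(["member", "interface"]))

# Level set to the bond: the same relative path now finds eth1.
effective.set_level(["interfaces", "bonding", "bond21"])
new_at_bond = configured_members - set(effective.children(["member", "interface"]))
```

In this toy model the first lookup reports eth1 as a new member and the second reports no new members, which is the kind of difference that decides whether a shutdown is requested.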