
Move nic to mac mapping out of the configuration file
Open, Requires assessment, Public


Today the NIC-to-MAC address mapping lives inside the configuration file, via the hw-id tag, which is parsed before the configuration is loaded into the device.
This is a device-specific mapping, and I'm trying to change this so that the mapping lives in its own config file in /config/persistant_interface_mapping (or some other name).

But what format should the file use?

A few examples:

>>> import json, yaml
>>> i = {"eth2": "00:00:00:00:00:02", "eth1": "00:00:00:00:00:01", "eth0": "00:00:00:00:00:00"}
>>> for k in i:
...     print("{} {}".format(k,i[k]))
...
eth2 00:00:00:00:00:02
eth1 00:00:00:00:00:01
eth0 00:00:00:00:00:00
>>> for k in i:
...     print("{}={}".format(k,i[k]))
...
eth2=00:00:00:00:00:02
eth1=00:00:00:00:00:01
eth0=00:00:00:00:00:00
>>> json.dumps(i)
'{"eth2": "00:00:00:00:00:02", "eth1": "00:00:00:00:00:01", "eth0": "00:00:00:00:00:00"}'
>>> print(yaml.dump(i,  default_flow_style=False))
eth0: 00:00:00:00:00:00
eth1: 00:00:00:00:00:01
eth2: 00:00:00:00:00:02

Personally, I prefer a format that can be parsed line by line. Then, if the user "screws up" the file in some way, only the lines the user screwed up are "lost"; all other interfaces will still parse without issues. We would also have more control over what the parser does at every step.

With JSON, for example, if the user misses a ' , " or : then every entry in the file becomes invalid. The same is partly true of a simple YAML syntax, because the whole file fails to parse on a syntax error.
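To make the line-by-line argument concrete, here is a minimal sketch of a tolerant parser for the space-separated variant shown above. The function name `parse_mapping`, the `#` comment syntax and the strict MAC check are all assumptions for illustration, not an existing VyOS format or API.

```python
import re

# Six colon-separated hex octets; anything else counts as a bad line.
MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def parse_mapping(text):
    """Parse 'name mac' lines and return (mapping, errors).

    A malformed line is recorded in errors and skipped, so one typo
    does not invalidate the remaining entries.
    """
    mapping = {}
    errors = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        # Hypothetical comment syntax: drop everything after '#'.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2 or not MAC_RE.match(fields[1]):
            errors.append((lineno, raw))
            continue
        name, mac = fields
        mapping[name] = mac.lower()
    return mapping, errors


sample = """\
eth0 00:00:00:00:00:00
eth1 00:00:00:00:00:0
eth2 00:00:00:00:00:02
"""

mapping, errors = parse_mapping(sample)
print(mapping)  # eth0 and eth2 survive
print(errors)   # only the truncated eth1 line is reported
```

With JSON or YAML the same typo on the eth1 line would make the loader raise before any entry is returned, so the caller gets nothing unless it adds its own recovery logic.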

I need some input on this before continuing this work. What other opinions are there? Or is there a format I've missed completely?


Difficulty level
Unknown (require assessment)
Why the issue appeared?
Will be filled on close

Event Timeline

runar created this object in space S1 VyOS Public.

What is the purpose of getting the nic/mac mapping before configuration is applied? Is the result passed to a second script which pins the ethernet interface to its mac address? Why not write a script which dynamically reads the info from the config, parses it and applies further actions without the need of an intermediate file?

Or did I miss something?

The issue here is not the scripts themselves; it is our (Vyatta's) mixing of hardware and system configuration. The ethernet mapping table is device-specific and only works on one particular device. All other configuration inside VyOS is portable between devices, but this information is not.
This also makes it impossible to move a config file to other hardware without modifications (removing the hw-id mappings).

The second thing here is that this remapping is done prior to system load. udev is responsible for the remapping, and it happens on a read-only filesystem, so modifications cannot be stored directly back to the config file. (We need some kind of intermediate storage, e.g. /run.)

This modification will not create an intermediate file before passing things to the configuration file. It will replace the current implementation and move everything out of the config file and into its own device-specific file, and thus make the VyOS config files movable between systems without modifications.

That is the idea, though :)

Right now the script first allocates interfaces with udev and stores temp files inside /run/udev; then a second script, executed inside the rl-init script, saves this info into the hw-id tags right before VyOS is loaded.

Did that answer your questions?

I had a few (one or two) cases where I did a rolling-to-rolling offline migration (installed the new rolling to a clean drive and copied the old config dir into the new image's /boot/<image>/rw/) and the interface naming got completely screwed up: new eth devices were created in the config with higher numbers, and the ones defined in the config were left inactive. I then had to rescue the situation manually on the console (I don't remember exactly how). I probably should have created an issue then, but I was just glad it was over with. Hopefully this does something to improve the situation, as I'm now hesitant to do remote upgrades for fear of losing connectivity...

For now, the mapping scheme is MAC address => name, so as long as the MAC address doesn't change and you preserve the persistent interface mapping file, you should be all good. I'm wondering if it's possible to migrate from a MAC-address mapping scheme to, e.g., the PCI index instead, but I haven't found any good solution for this. The best I've found is to start using systemd and rewrite all occurrences that have hardcoded eth name mappings, but that's a bigger case, I think. So for now it's going to be MAC address => name.

As I commented: the best I've found is to start using systemd and rewrite all occurrences that have hardcoded eth name mappings, but that's a bigger case, I think.

There is some further discussion here, which I found useful in considering the changes for Stretch:

I've updated all the instances I found of hardcoded eth names to also accept systemd interface names. This will work once we upgrade to Buster. I've built an ISO with these changes, and I'm able to set an IP on my ens3/ens4 NICs and it works. For Buster this could be a solution to this issue. The question then is whether Buster comes soon enough that we don't need to rewrite these scripts before then.

The largest issue with using systemd interface naming is that it requires an interface rename on upgrade (migration script?), and the user won't get a name mapping starting from zero as they do today.
But it will be a more consistent naming scheme on devices with the same hardware.
I don't have much experience with the naming scheme yet, but at least my QEMU VMs get ens3, ens4, ... NICs in ascending order, as the NICs are on PCI bus 0, slots 3 and 4.
My Qotom device will get interfaces enp1s0, enp2s0, enp3s0 and enp4s0, as they're all slot 0 on their own dedicated PCIe buses, buses 1-4.
Multiport NICs will also be grouped together by the PCI bus, slot and port number syntax, e.g. enp3s0f0 for the first port on a multi-NIC card in PCIe bus 3, slot 0.
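As a rough illustration of how the names mentioned above break down, here is a small sketch that splits them into their numeric parts. It only covers the forms that appear in this comment (ens3, enp1s0, enp3s0f0); the regex and the function name `parse_name` are assumptions made for this example, and real systemd names have more variants than this.

```python
import re

# Only the name shapes mentioned in this thread:
#   ens<slot>                 e.g. ens3
#   enp<bus>s<slot>[f<n>]     e.g. enp1s0, enp3s0f0
NAME_RE = re.compile(
    r"^en"
    r"(?:s(?P<slot_index>\d+)"
    r"|p(?P<bus>\d+)s(?P<slot>\d+))"
    r"(?:f(?P<function>\d+))?$"
)


def parse_name(name):
    """Return the numeric components of a name, or None if it does not match."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()
            if value is not None}


for nic in ("ens3", "enp1s0", "enp3s0f0", "eth0"):
    print(nic, parse_name(nic))
```

A classic name such as eth0 yields None here, which is the case a migration script would have to handle explicitly.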

Just to add to this.

My vote would be a /config/config.platform file which is general to the specific system so that config.boot is more portable between systems (e.g. not needing the step of deleting hw-id when migrating).

Ideally the config.platform file uses the same format as config.boot and is merged into the config system upon boot so it's transparent to the operator and just appears in configuration mode as normal. Operations to save configuration would automatically take care of the split between what lands in config.boot and config.platform.

The easiest way to do this might be to create a new top-level configuration branch called platform, e.g. set platform network-interface <name> hw-id, with the option of supporting mapping methods other than hw-id, such as pci-id or uuid, at a future date.
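A minimal sketch of the split described above, assuming the running configuration is available as a nested dict. The dict layout, the `split_config` name and the `platform` / `network-interface` keys are hypothetical and only mirror the proposal in this comment; they are not an existing VyOS API.

```python
import copy


def split_config(config):
    """Split a nested-dict config into (portable, platform) parts.

    Every hw-id under interfaces/ethernet moves to the platform part,
    so the portable part no longer carries device-specific MACs.
    """
    portable = copy.deepcopy(config)
    platform = {"platform": {"network-interface": {}}}
    ethernet = portable.get("interfaces", {}).get("ethernet", {})
    for name, settings in ethernet.items():
        hw_id = settings.pop("hw-id", None)
        if hw_id is not None:
            platform["platform"]["network-interface"][name] = {"hw-id": hw_id}
    return portable, platform


running = {
    "interfaces": {
        "ethernet": {
            "eth0": {"address": "192.0.2.1/24", "hw-id": "00:00:00:00:00:00"},
            "eth1": {"address": "dhcp", "hw-id": "00:00:00:00:00:01"},
        }
    }
}

portable, platform = split_config(running)
print(portable)   # would be saved to config.boot
print(platform)   # would be saved to config.platform
```

Merging the two dicts again at boot would give the operator the single view described above, while only the portable half needs to travel between systems.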

Another example of potential platform-specific configuration in the future would be any configuration to support mounting of additional file systems or configuration of RAID.

Support for boot configuration (additional kernel modules to load or boot parameters used when generating grub.cfg) could also be added here (again as an example of future use cases).

It would also be useful to have the ability to have optional data about the hardware for support purposes configured (e.g. model, serial number, asset tag, build date).

@rps Great idea. A summary of some brainstorming that we had on Slack with @runar, @rps and @Dmitry:

  1. A portable config.boot that can be moved between otherwise identical systems that differ only in their NIC MAC addresses
  2. It should be possible to "preseed" a config.platform from the live installation ISO, taken from an already installed system that is identical to the target one (possibly with different MACs but the same unique identifier, depending on the mode). This is to allow pre-installation of drives in a virtual machine before installing them in the target system
  3. There are different modes of naming, for example: mac, pci-path, uuid, vmware, udev-filter, ... (all the possible systemd ones, virtualisation platforms, plus a custom udev filter expression)
  4. The filter-expression mode would read a variable that contains a udev expression that matches the particular interface, for example filter-expression = 'KERNELS == "virtio0"' (this may not be possible because values in the config management system cannot contain quotes; if this can't be worked around, the expression could be split into two variables, one for each side of the equals sign, e.g. udev-var-name = "KERNELS"; udev-var-value = "virtio0")
  5. The installer could auto-detect the platform from a set of predefined ones (vmware, ...) and ask the user whether they want to use that mode
  6. A script to make initial configuration easier by blinking each detected NIC and asking for the wanted name and description of it to generate an initial platform config
  7. Extending an already defined platform with pci-path mode would be done in this way: change the mode to mac, shut down, install new network cards, boot, add new interfaces to platform config (can be auto-added by a script), change mode back to pci-path. This is because the new cards may have internal PCI buses that may shift other buses around, depending on the order in which the BIOS enumerates them.
  8. If the MAC of an interface is spoofed in config, the config.platform needs to save the original MAC for reverting later (possibly all MACs would be saved)
  9. The order in the boot process when the interface renaming takes place would need to be evaluated to fix possible race conditions that result in boot failures (T577)
    1. Possibly a script would generate udev rules from config.platform before the interfaces start getting added by udev, so no renaming is necessary
    2. If that's not possible, ensure that nothing starts using the interfaces before the renaming is complete (there was a mention that the reason for the T577 bugs is FRR starting to bring interfaces up before renaming is complete)
    3. Evaluate if we should use systemd unit ordering and depend on/provide, etc. or use our own scripts to ensure proper ordering, that all interface renaming is complete before any daemons are started
  10. A factory-default command to delete the unionfs rw folder
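Points 4 and 9.1 above could be sketched roughly as follows: turn platform entries into udev rule lines before the interfaces appear. The input dict layout and the function names are assumptions for illustration; the keys hw-id, udev-var-name and udev-var-value follow the names floated in this summary. The rule text uses ordinary udev match/assign syntax, but whether these exact matches suit VyOS would need testing.

```python
def rule_for(name, key, value):
    """Build one udev rule line that names a matching net device."""
    return f'SUBSYSTEM=="net", ACTION=="add", {key}=="{value}", NAME="{name}"'


def rules_from_platform(interfaces):
    """Render udev rules from a {name: settings} dict (hypothetical layout)."""
    lines = []
    for name in sorted(interfaces):
        entry = interfaces[name]
        if "hw-id" in entry:
            # mac mode: match on the interface's hardware address
            lines.append(rule_for(name, "ATTR{address}", entry["hw-id"].lower()))
        elif "udev-var-name" in entry and "udev-var-value" in entry:
            # filter-expression mode, split into two quote-free values
            lines.append(rule_for(name, entry["udev-var-name"],
                                  entry["udev-var-value"]))
    return "".join(line + "\n" for line in lines)


platform_interfaces = {
    "eth0": {"hw-id": "00:00:00:00:00:00"},
    "eth1": {"udev-var-name": "KERNELS", "udev-var-value": "virtio0"},
}

rules = rules_from_platform(platform_interfaces)
print(rules, end="")
```

Splitting the expression into a name and a value keeps quotes out of the stored values, and the generator adds them back when it writes the rule.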

Any news on this one? Have posted some of the pain I've been having in T291 where VyOS is neither behaving as per documentation (match on hw-id) nor consistently across reboots.