Page MenuHomeVyOS Platform

repository restructuration suggestions
Open, Requires assessmentPublicFEATURE REQUEST

Description

As the code changes and we integrate the conf mode and operation mode from independent programs which are forked to code which is imported a number of repository change could be considered.

Move the code from /usr/libexec/vyos to the python vyos folder.

It is possible to import the configuration code using env PYTHONPATH=/usr/libexec/vyos python3 and then import conf_mode. It is however "hacky" as it requires to change the environment.

It would surely be preferable to have the code installed in the python folder, and symlinked from the old location (file by file or as a folder).

I will propose python/vyos/modules/configuration for /usr/libexec/vyos/conf_mode (for discussion sake) and then symlink that folder in libexec to preserve compatibility with any code which may have expected the file there. It should then be possible at a next release to remove the liecex folder. Giving enough warning to users (or even keeping it as a legacy symlink).

This should/could also be done for the op_mode, completion and validation. In the case of completion and validation, not all the code is however in python and some could be rewritten.

Rename some of the files / programs

The file in theses folders also have names with "-" which may it painful to import using the standard python keyword. It is still possible using importlib but clumsy. Therefore I would also suggest that any filename in the vyos-1x repository should use "_". Again, if symlink are used, it is possible to keep the name in libexec as they were.

If individual symlinks are done, it would be possible to write a single program which would introspect sys.argv[0] and then import and run the configuration mode previously under that name, allowing to change the name of the file in the repository while keeping backward compatibility (if this is required).

Define an API/Interface for Configuration

The configuration code is made of a number of classes , Config and ConfigTree being the main one. I would suggest that we should look at this code in term of "API", for otherw part of the code, in particular what function should be used / available from the conf_mode modules and should be considered as part of the "conf_mode" "API" and which one are not.

Also many of the CLI command do map exactly to some function in the ConfigTree. If you consider the concept of Interface (like Java or Go) then I believe we should define what should be implemented and how, so that multiple implementation can co-exist ( the current Vyatta, VyConf and an in-memory one) as they would all have different use case.

Some of the code should/could be refactored when shared. I will attempt to document this "visually" and update this task with something later on.

get_config_dict

This new API is very useful and should be used more. I would suggest that an improved version of ConfigurationState (which would not be a subclass of configuration but instead be returned by a factory function in Configuration) could become a better object to pass between the verify / generate and apply.

https://github.com/vyos/vyos-1x/blob/current/src/conf_mode/interfaces-tunnel.py#L35

Details

Difficulty level
Unknown (require assessment)
Version
-
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)

Event Timeline

runar added a subscriber: runar.Jun 5 2020, 8:55 AM

Here comes some suggestions from my part :)

As for the move of conf_mode scripts to the vyos python package i have tried to find all references to conf-mode scripts, as i've found i can only find 6 instances when ignoring the vyatta-cfg templates. this means we could move the scripts without keeping a symlink to it.
Here is the list of instances found thats not following the vyatta-xml syntax from the xml->node generator script:

vyos@vm4:~$ sudo egrep -e "(arp\.py|interfaces-dummy\.py|system-login-banner\.py|bcast_relay\.py|interfaces-ethernet\.py|mdns_repeater\.py|system-login\.py|dhcp_relay\.py|interfaces-geneve\.py|nat\.py|system-options\.py|dhcp_server\.py|interfaces-l2tpv3\.py|ntp\.py|system-proxy\.py|dhcpv6_relay\.py|interfaces-loopback\.py|protocols_bfd\.py|system-syslog\.py|dhcpv6_server\.py|interfaces-macsec\.py|protocols_igmp\.py|system-timezone\.py|dns_forwarding\.py|interfaces-openvpn\.py|protocols_mpls\.py|system-wifi-regdom\.py|dynamic_dns\.py|interfaces-pppoe\.py|protocols_pim\.py|task_scheduler\.py|firewall_options\.py|interfaces-pseudo-ethernet\.py|protocols_static_multicast\.py|tftp_server\.py|flow_accounting_conf\.py|interfaces-tunnel\.py|salt-minion\.py|vpn_l2tp\.py|host_name\.py|interfaces-vxlan\.py|service_ipoe-server\.py|vpn_pptp\.py|http-api\.py|interfaces-wireguard\.py|service_pppoe-server\.py|vpn_sstp\.py|https\.py|interfaces-wirelessmodem\.py|service_router-advert\.py|vrf\.py|igmp_proxy\.py|interfaces-wireless\.py|snmp\.py|vrrp\.py|intel_qat\.py|ipsec-settings\.py|ssh\.py|vyos_cert\.py|interfaces-bonding\.py|le_cert\.py|system-ip\.py|interfaces-bridge\.py|lldp\.py|system-ipv6\.py)" / -R --exclude-dir={dev,home,proc,boot,root,sys,'*squashfs*',run} 2>/dev/null

/etc/dhcp/dhclient-enter-hooks.d/04-vyos-resolvconf:        /usr/libexec/vyos/conf_mode/dns_forwarding.py --dhclient
/opt/vyatta/sbin/vyatta-conntrack-sync.pl:my $VRRP_UPDATE = '/usr/libexec/vyos/conf_mode/vrrp.py';
/opt/vyatta/share/vyatta-cfg/templates/vpn/node.def:    sudo ${vyos_conf_scripts_dir}/ipsec-settings.py || exit 1
/usr/libexec/vyos/conf_mode/http-api.py:    'https.py',
/usr/libexec/vyos/conf_mode/le_cert.py:    'https.py',
/usr/libexec/vyos/conf_mode/vyos_cert.py:    'https.py',
vyos@vm4:~$

for execution the new configurators i'm purpose the use of pythons "run module" functionality in the format python3 -m vyos.modules.configuration.{script_name} this will execute the if __name__ == "__main__": block inside the module if it exists, so everything will run as before.

as @thomas-mangin noted, files including a hyphen is not good in python modules, i'm purpose-ing a a change of this to divide the configuration module into sub-modules (sub-directories) and put all interfaces into a interfaces sub-module.
Loading of the ethernet configurator will then be VYOS_TAGNODE_VALUE='$VAR(@)' python3 -m vyos.modules.configuration.interfaces.ethernet. this could be done for interfaces, service and system and is the majority of hyphens.

The remaining scripts change the hyphen to a "underscore" ( _ ) . Here is the list with purposed new name:
(only for files with a hyphen bot all other follows the same syntax)

http-api.py                   : vyos.modules.configuration.http_api
interfaces-bonding.py         : vyos.modules.configuration.interfaces.bonding
interfaces-bridge.py          : vyos.modules.configuration.interfaces.bridge
interfaces-dummy.py           : vyos.modules.configuration.interfaces.dummy
interfaces-ethernet.py        : vyos.modules.configuration.interfaces.ethernet
interfaces-geneve.py          : vyos.modules.configuration.interfaces.geneve
interfaces-l2tpv3.py          : vyos.modules.configuration.interfaces.l2tpv3
interfaces-loopback.py        : vyos.modules.configuration.interfaces.loopback
interfaces-macsec.py          : vyos.modules.configuration.interfaces.macsec
interfaces-openvpn.py         : vyos.modules.configuration.interfaces.openvpn
interfaces-pppoe.py           : vyos.modules.configuration.interfaces.pppoe
interfaces-pseudo-ethernet.py : vyos.modules.configuration.interfaces.pseudo_ethernet
interfaces-tunnel.py          : vyos.modules.configuration.interfaces.tunnel
interfaces-vxlan.py           : vyos.modules.configuration.interfaces.vxlan
interfaces-wireguard.py       : vyos.modules.configuration.interfaces.wireguard
interfaces-wirelessmodem.py   : vyos.modules.configuration.interfaces.wirelessmodem
interfaces-wireless.py        : vyos.modules.configuration.interfaces.wireless
ipsec-settings.py             : vyos.modules.configuration.ipsec_settings
salt-minion.py                : vyos.modules.configuration.salt_minion
service_ipoe-server.py        : vyos.modules.configuration.service.ipoe_server
service_pppoe-server.py       : vyos.modules.configuration.service.pppoe_server
service_router-advert.py      : vyos.modules.configuration.service.router_advert
system-ip.py                  : vyos.modules.configuration.system.ip
system-ipv6.py                : vyos.modules.configuration.system.ipv6
system-login-banner.py        : vyos.modules.configuration.system.login_banner
system-login.py               : vyos.modules.configuration.system.login
system-options.py             : vyos.modules.configuration.system_options
system-proxy.py               : vyos.modules.configuration.system.proxy
system-syslog.py              : vyos.modules.configuration.system.syslog
system-timezone.py            : vyos.modules.configuration.system.timezone
system-wifi-regdom.py         : vyos.modules.configuration.system.wifi_regdom

NB! This needs to be discussed, as it could be harder to generate the full list of configurators available to be executed (if wee need it??) when doing this.
an alternative is to make a function in __init__.py in vyos.modules.configuration that returns all available configurators.

The next purposed change to the configuration script is to move the tag-collection from inside get_config() to the __name__ == .... block, and passing the tag as a tag=None variable to get_config this needs to be done to allow for the script to be executed as a python module from another script without the need of modifying os.environ.

purposed new main block in the config scripts:

def get_config(tag=None, **kwargs):
    if not tag:
        raise ConfigError('Interface TAG not specified')
......
if __name__ == "__main__":
    try:
        _tag = os.environ.get('VYOS_TAGNODE_VALUE', None)
        c = get_config(tag=_tag)
        verify(c)
        generate(c)
        apply(c)
    except ConfigError as e:
        print(e)
        exit(1)

NB! : this needs to be discussed, and i think there should be written a "Guide" on what belongs where etc., se comments below.

All scripts should use the same main block, and needs to follow the correct syntax as this will be important if multiple configurators will be executed in the same python interpretor.
do we need a "prepare/initiate" step prior to get_config?

configuration scripts not following the predefined syntax today is:

arp.py                      - missing verify step
dns_forwarding.py           - parse_args additions and an args to get_config
intel_qat.py                - missing generate() and executes     vpn_control('restore') on a fault
interfaces_l2tpv3.py        - has a check_kmod() step prior of get_config that loads modules on execution. move to the apply stage?
interfaces-wireguard.py     - _check_kmod() and  _migrate_default_keys() prior to executiln.. move to the apply stage or a prepare block?
interfaces-wirelessmodem.py - check_kmod() needs to be moved
nat.py                      - _check_kmod() needs to be moved
system-proxy.py             - generate returns a object thats passed into apply

Another question is who is responsible for saving the configuration file to disk. this is for now done in generate, but if the apply stage fails there are no configuration to rollback to i would suggest we move all template saving to the apply block, then on a fault we could rollback to the previous config-files before crashing the commit. The new syntax would then be that the generate step generates the templates needed, and apply applies them before reloading (as already done in eg. system-proxy)

I think this is enough for now.. :P

jjakob added a subscriber: jjakob.Jun 5 2020, 2:01 PM

To add my 2 cents:

I agree with the above, some comments

get_config_dict

This new API is very useful and should be used more. I would suggest that an improved version of ConfigurationState (which would not be a subclass of configuration but instead be returned by a factory function in Configuration) could become a better object to pass between the verify / generate and apply.

I agree. It would also be much easier to address the rollback feature as described by @runar

Another question is who is responsible for saving the configuration file to disk. this is for now done in generate, but if the apply stage fails there are no configuration to rollback to i would suggest we move all template saving to the apply block, then on a fault we could rollback to the previous config-files before crashing the commit. The new syntax would then be that the generate step generates the templates needed, and apply applies them before reloading (as already done in eg. system-proxy)

Of course all of the configuration scripts would need to be heavily rewritten to use this new ConfigurationState class. The initial requirement for it was already addressed in T2409.

An issue regarding supporting rollback is how to check if the new generated config is valid for the daemon it's generated for. Of course, the config shouldn't be generated invalid in the first place, verify() should catch it, but a lot of cases aren't verified currently. Some daemons support a 'test config' parameter with which it could be verified beforehand (e.g. dhcp-server), but most are ran through systemd which is only one-shot: it doesn't have a 'verify config' command. If the restart fails, the service fails, and the 'rollback' in that case is to move the old config files back and restart it again. Since the 'restart' is ran in apply(), if it fails, apply could raise an error which would be caught by the calling function that would then trigger the same script again with a 'rollback' parameter. That could regenerate the config file from the effective config instead of working config, or just move backed-up config files (backed up by the previous generate run) back into place and restart the daemon. So there are multiple ways by which to achieve rollback, depending on the service. These are all ideas out the top of my head.

There was also talk in a recent issue around slowness, that some scripts make multiple calls to Config which re-initializes it each time. Passing a pre-initialized Config (or the new ConfigurationState) to all child scripts so none would need to re-initialize it would be great.

dns_forwarding.py - parse_args additions and an args to get_config

That will be eliminated by a set of patches I'm currently working on. There's a lot so it took a long time to do, but it's on the way. Others will need to be separately rectified.

generate should make a backup of the previous file before generating the new one. It will then make it possible to create the rollback as a file move and service reload.

runar added a comment.Jun 5 2020, 2:42 PM

About rollback, i'm wondering about a try:expect loop around apply() that will catch faults and trigger a rollback() to restore old files etc.
The rollback won't be a 100% abort, because vyatta-cfg would not rollback subsystems that have allready been configured.. but we will get a pr. Subsystem rollback and thats a start :) to get a full rollback wee need to change the backend or the executor in the backend.

I've allready created a proof of concept executor https://github.com/runborg/vyos-1x/blob/main-cfg/src/conf_mode/main.py that has been extended by @thomas-mangin and used in T2522 https://github.com/thomas-mangin/vyos-1x/blob/T2522/python/vyos/conf/local.py

But this is not ready for primetime at the moment, and there are questions about how good this integrates into the current backend. These scripts also allows for only one instance of config() so they will get some of the speed benefits shown in T2522

Yes, we need to try/except the apply section (the other should never fail but we could still catch errors to not leave the system in an unknown state) but when applying the reverse configuration (ie: invert effective and new and re-apply) one must then be careful if that fails too (we do not want a forever loop :p). The code already runs all the get_dict and all the verify first, so we will only apply if all is ok, but still issues could occur.

The code also has an option for using threads or multiprocessing to accelerate the commit. I have not added any code for the detection of the number of core on the system but I believe whatever is used should depend on the HW.

runar added a comment.Jun 5 2020, 8:48 PM

Yes, we need to try/except the apply section (the other should never fail but we could still catch errors to not leave the system in an unknown state) but when applying the reverse configuration (ie: invert effective and new and re-apply) one must then be careful if that fails too (we do not want a forever loop :p). The code already runs all the get_dict and all the verify first, so we will only apply if all is ok, but still issues could occur.

the other should never fail but <-- as for now there are two sections that could fail with a config out of sync, the first is a failing generate(), and the other is a failing apply() for process fails to start because of a config file issue or other faults.
This is the reason i want to move all operations that save to effective configuration files to the apply stage. the generate stage could make the template etc and possibly save it to a temporary file or just hold it in memory.. but the file's should not be saved/overwritten before entering the apply() stage

Doing this will make the apply() stage the only stage that is critical for rollback, an other possibility is to create a inspect() function that could print the newly generated config without applying it for by user debugging etc.

generate should make a backup of the previous file before generating the new one. It will then make it possible to create the rollback as a file move and service reload.

I don't think this is a task for generate(), what is needed to keep the code clean is a new function that is responsible for doing this on the subsystems that supports this. eg. make backup of files etc. this might be a backup()?? (not the best name, suggestions are needed :) ) stage that has this responsibility.