Improving Boot Time for Large Firewall Configurations
Confirmed, Normal | Public | Feature Request


Our largest instance has 23,080 lines for config.boot (mostly firewall rule configuration for 15 or so VLANs).

VyOS 1.1.8 to 1.2 configuration migration and boot: 25 min.

Save configuration time: 10 sec.

VyOS 1.2.0 saved config boot time: 19 min.

Commit configuration change time: 3 min 10 sec. (and yes the wait is terrifying)

I've traced back the majority of the boot slowdown to executing iptables for each rule insertion.

My proposal is to update the firewall code so that it writes the changes to a temporary file and applies them with iptables-restore -n < $FILENAME. The -n (--noflush) flag is important when not replacing the entire ruleset.

Because iptables-restore performs an atomic change there are a few advantages:

  1. An error in any rule causes the whole atomic change to fail before anything is applied, which makes error recovery much easier because VyOS cannot get out of sync with the running ruleset.
  2. The atomic commit is orders of magnitude faster for large rulesets (from 15 min to 30 sec in one example).
  3. Because iptables-restore with the -n flag allows insert and remove operations rather than a full flush, add and remove operations can be combined in one batch, so there is no gap in policy between deleting and inserting rules. This makes changes safer from a traffic perspective.

For simple operations the syntax is identical to iptables: each line in the file is the same as what would be given after the iptables command, and the file ends with the word COMMIT (testing needed).
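As a rough illustration of the batch-file idea, here is a minimal Python sketch. It is not VyOS code: the function names build_restore_payload and apply_payload and the chain name VLAN10-IN are hypothetical. It assumes the standard iptables-restore input layout, where each table section opens with a *<table> line before the rule lines and closes with COMMIT.

```python
import subprocess


def build_restore_payload(table, rule_lines):
    """Return iptables-restore input for a single table.

    Each entry in rule_lines is what would follow `iptables` on the
    command line (without a `-t <table>` option), for example
    "-A INPUT -s 192.0.2.1 -j DROP". Blank entries are skipped.
    """
    lines = [f"*{table}"]  # table header expected by iptables-restore
    lines.extend(line.strip() for line in rule_lines if line.strip())
    lines.append("COMMIT")  # ends the batch for this table
    return "\n".join(lines) + "\n"


def apply_payload(payload):
    """Feed the whole batch to one iptables-restore call (needs root).

    --noflush (-n) keeps rules that the batch does not mention.
    """
    subprocess.run(
        ["iptables-restore", "--noflush"],
        input=payload,
        text=True,
        check=True,  # raise if iptables-restore rejects the batch
    )


if __name__ == "__main__":
    # A delete and an add in the same batch (hypothetical chain and addresses).
    payload = build_restore_payload("filter", [
        "-D VLAN10-IN -s 192.0.2.10 -j DROP",
        "-A VLAN10-IN -s 192.0.2.20 -j DROP",
    ])
    print(payload, end="")
```

Running the script only prints the generated batch; apply_payload is kept separate so the file content can be inspected or tested without touching the live ruleset.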

In terms of the commit time I don't know if there is an easy way to address this until the filesystem is no longer used for storing the config tree as a directory structure.

I think adopting the atomic netfilter configuration would be easy enough to implement in 1.2.1+ though.


Difficulty level
Unknown (require assessment)
Why the issue appeared?
Will be filled on close

Event Timeline

syncer changed the task status from Open to Confirmed. Feb 5 2019, 2:17 PM
syncer triaged this task as Normal priority.
syncer edited projects, added VyOS 1.2 Crux (VyOS 1.2.2); removed VyOS 1.2 Crux.

I am affected too by this issue.

My configuration file has around 7500 lines, of which about 5000 are firewall-related (rules and address-groups).

With VyOS 1.2 the boot time is 12 minutes, while in VyOS 1.1.8 the same config takes less than 3 minutes. Commit times went from 10 sec to nearly one minute.

I'm also affected by this, even with a relatively "small" configuration (currently 2662 lines, more than half of them firewall rules, and 5 interfaces).

My system takes approximately 5 minutes to boot, and commits are painfully long.

Also affected by this. Reboots take almost 10 minutes before the device is usable. Commits take a long time as well.

I see similar topics on the VyOS forum.

Will this be picked up?

Another one affected by this! We have several devices with more than 20K config lines; our automation scripts take a really long time to complete, and the devices also take a long time to boot.
It would be great having this fixed in 1.3 :-)

I definitely am not using large port-ranges. A pretty standard setup using a zone-based firewall.

See attached for reference. Running on an apu2c4, reboot to usable time is 10 minutes. The device itself can run pfSense, ESXi server etc. with no real issues. Yes I know it isn't a powerhouse but the current setup doesn't explain such lengthy reboot and commit times.

See console output:

[ 31.095034] vyos-router[834]: Waiting for NICs to settle down: settled in 2s.
[ 38.543873] vyos-router[834]: Started watchfrr.
[ 38.582174] vyos-router[834]: Mounting VyOS Config...done.
[ 706.098852] vyos-router[834]: Starting VyOS router: migrate rl-system firewa.
[ 706.119589] vyos-config[6524]: Configuration success

We don't do any firewalling; we have lots of prefix-lists for filtering eBGP sessions. Right now we're looking at a router that has taken more than 1 h 20 min to boot, and it still hasn't finished, on modern Xeon CPUs. The boot time has doubled since we added a prefix-list of around 5000 entries (roughly doubling the total number of prefix-list entries).

There's definitely something wrong with how the configuration is applied. Is there an O(n^2) problem?
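One rough way to check this is to compare two boots with different config sizes and estimate the exponent k in time ~ size^k. The sketch below is only an illustration: scaling_exponent is a hypothetical helper, and the numbers are made up, not measurements from this task. A result near 1 suggests linear cost per entry, and a result near 2 suggests quadratic behaviour.

```python
import math


def scaling_exponent(n1, t1, n2, t2):
    """Estimate k in t ~ n**k from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Illustrative numbers only (entries, minutes), not real measurements.
# Doubling the entries doubles the time: exponent close to 1.
print(scaling_exponent(5000, 40, 10000, 80))
# Doubling the entries quadruples the time: exponent close to 2.
print(scaling_exponent(5000, 40, 10000, 160))
```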

@csalcedo maybe you use large port-ranges

Thanks @Viacheslav, but no. We already hit that bug some years ago and worked around it by defining the ports in the rule itself instead of using named port-ranges.

We mainly have lots of firewall rules with lots of firewall groups (address groups and network groups, but not port-groups).

Also, having lots of NAT rules makes VyOS config handling and boot very slow.

I'm also affected by this. My configuration has about 5k IP prefixes in a network group for policy-based routing.