vyos info:
- Hypervisor: VMware and Xen HVM, same problem with both / doesn't seem to matter.
- VM specs: 1 vcpu, 1 GB RAM, 2x network interfaces.
- Hardware: older lab hardware running Intel Xeon 5600 series CPU, but same problem seems to also happen on modern hardware.
- version: vyos-1.3-beta-202101260443-amd64.iso
I have a scenario where thousands of DNAT rules are needed, but as few as 215 DNAT rules completely break the vyos http API. Even a showConfig request fails once the configuration contains 215 DNAT rules.
vyos http API enabled with:
```
set service https api debug
set service https api keys id MY-HTTP-API-ID key MY-HTTP-API-PLAINTEXT-KEY
```
The vyos configuration is otherwise very simple: the ssh service is enabled and IPs are configured for the two network interfaces, and that's all.
steps to reproduce the problem:
first check and verify the http api showConfig request works with curl:
```
linux# time curl -k -X POST -F data='{"op": "showConfig", "path": []}' -F key=MY-HTTP-API-PLAINTEXT-KEY https://vyos-ip/retrieve 1> /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1199 100 911 100 288 2436 770 --:--:-- --:--:-- --:--:-- 2435
real 0m0.406s
user 0m0.087s
sys 0m0.121s
```
all good, let's add 50 DNAT rules:
```
vyos@vyos:~$ configure
[edit]
vyos@vyos# for i in `seq 1 50`; do set nat destination rule $i description "test $i"; set nat destination rule $i protocol tcp; set nat destination rule $i destination port $i; set nat destination rule $i translation port $i; set nat destination rule $i inbound-interface eth0; set nat destination rule $i translation address 192.168.1.5 ; done
[edit]
vyos@vyos# commit
```
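The later steps add rules in batches (51-100, then 101-200, and so on). As a rough sketch, a small helper can print the same `set` commands for any range of rule numbers, so they can be pasted into the configure session batch by batch. The function name `dnat_rule_cmds` is made up for illustration; the rule contents are the same as in the loop above.

```shell
# Hypothetical helper: print the "set" commands for DNAT rules FIRST..LAST.
# Each rule uses the same six commands as the for loop in this report.
dnat_rule_cmds() {
    i=$1
    last=$2
    while [ "$i" -le "$last" ]; do
        printf 'set nat destination rule %d description "test %d"\n' "$i" "$i"
        printf 'set nat destination rule %d protocol tcp\n' "$i"
        printf 'set nat destination rule %d destination port %d\n' "$i" "$i"
        printf 'set nat destination rule %d translation port %d\n' "$i" "$i"
        printf 'set nat destination rule %d inbound-interface eth0\n' "$i"
        printf 'set nat destination rule %d translation address 192.168.1.5\n' "$i"
        i=$((i + 1))
    done
}

# Example: the second batch of 50 rules (51..100).
dnat_rule_cmds 51 100
```

The output is plain text; run `commit` after pasting it, as with the loop.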
and then check the vyos http api showConfig op:
```
linux# time curl -k -X POST -F data='{"op": "showConfig", "path": []}' -F key=MY-HTTP-API-PLAINTEXT-KEY https://vyos-ip/retrieve 1> /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 9699 100 9411 100 288 4933 150 0:00:01 0:00:01 --:--:-- 4932
real 0m1.939s
user 0m0.097s
sys 0m0.114s
```
1.9 seconds, not too bad. Let's add 50 more DNAT rules and check again with 100 DNAT rules total:
```
linux# time curl -k -X POST -F data='{"op": "showConfig", "path": []}' -F key=MY-HTTP-API-PLAINTEXT-KEY https://vyos-ip/retrieve 1> /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 18203 100 17915 100 288 5317 85 0:00:03 0:00:03 --:--:-- 5319
real 0m3.401s
user 0m0.092s
sys 0m0.118s
```
3.4 seconds. Let's add 100 more, for a total of 200 DNAT rules:
```
linux# time curl -k -X POST -F data='{"op": "showConfig", "path": []}' -F key=MY-HTTP-API-PLAINTEXT-KEY https://vyos-ip/retrieve 1> /dev/null
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 35603 100 35315 100 288 5386 43 0:00:06 0:00:06 --:--:-- 8017
real 0m6.587s
user 0m0.081s
sys 0m0.129s
```
6.6 seconds. OK, now let's try with 300 DNAT rules total:
```
linux# time curl -k -X POST -F data='{"op": "showConfig", "path": []}' -F key=MY-HTTP-API-PLAINTEXT-KEY https://vyos-ip/retrieve
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 471 0 183 0 288 0 0 --:--:-- 0:10:00 --:--:-- 0
<html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.14.2</center>
</body>
</html>
real 10m0.321s
user 0m0.111s
sys 0m0.123s
```
So for some reason, with 300 DNAT rules added the request now hangs for 10 minutes (!) and then nginx times it out. I wonder what's going on here? While the curl showConfig request is waiting (with nothing happening), the vyos VM seems to be mostly idle: "top" shows the cpu 99% idle most of the time and no suspicious processes running.
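When testing different rule counts, waiting 10 minutes for each hang is painful. A possible sketch, using curl's `--max-time` and `-w` options, wraps the same showConfig request in a client-side timeout and prints only the HTTP status and total time. The function name `show_config_timed`, the `CURL` override variable and the `API_KEY` variable are assumptions made for this example, not part of vyos.

```shell
# Hypothetical helper: send one showConfig request and give up after MAX_SECS.
# CURL may be overridden (e.g. CURL=echo for a dry run that prints the arguments).
CURL="${CURL:-curl}"

show_config_timed() {
    host=$1
    max_secs=${2:-30}   # client-side timeout in seconds, default 30
    "$CURL" -k -s -o /dev/null \
        --max-time "$max_secs" \
        -w '%{http_code} %{time_total}\n' \
        -X POST \
        -F 'data={"op": "showConfig", "path": []}' \
        -F "key=${API_KEY:-MY-HTTP-API-PLAINTEXT-KEY}" \
        "https://$host/retrieve"
}
```

For example, `show_config_timed vyos-ip 30` should print something like `200 1.9...` for a working request; if the request hangs, curl gives up after 30 seconds and exits with status 28 (operation timed out).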
to reproduce the problem you can of course directly add 300 DNAT rules using the following:
```
vyos@vyos# for i in `seq 1 300`; do set nat destination rule $i description "test $i"; set nat destination rule $i protocol tcp; set nat destination rule $i destination port $i; set nat destination rule $i translation port $i; set nat destination rule $i inbound-interface eth0; set nat destination rule $i translation address 192.168.1.5 ; done
[edit]
vyos@vyos# commit
```
and then try fetching the configuration using the vyos http api showConfig op.
I actually tried different numbers of DNAT rules, and it seems 215 DNAT rules is already too many: that also results in the nginx gateway timeout at 10 minutes.
214 DNAT rules work OK and I get the showConfig response in 7.5 seconds, but adding one more DNAT rule, for a total of 215, makes the http api get stuck and hit the timeout at 10 minutes.
Any idea where the problem might be?
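To put the timings side by side, here is a small calculation of seconds per rule from the measurements in this report (50, 100, 200 and 214 rules). It is only arithmetic on the numbers above, not a diagnosis; the function name `per_rule` is made up for the example.

```shell
# Measurements copied from this report: "<rule count> <response time in seconds>"
measurements='50 1.939
100 3.401
200 6.587
214 7.5'

# Print the response time divided by the number of rules for each measurement.
per_rule() {
    printf '%s\n' "$measurements" |
        awk '{ printf "%d rules: %.3f s/rule\n", $1, $2 / $1 }'
}

per_rule
```

The cost stays at roughly 0.03 to 0.04 seconds per rule all the way up to 214 rules, so 215 rules would be expected to take around 7.5 seconds. The jump to a 10 minute timeout therefore looks like a hard threshold rather than a gradual slowdown.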
For reference here's the full vyos configuration *before* adding the DNAT rules using the script above:
```
vyos@vyos:~$ show configuration | strip-private
interfaces {
    ethernet eth0 {
        address dhcp
        hw-id XX:XX:XX:XX:XX:b7
    }
    ethernet eth1 {
        address 192.168.1.1/24
        hw-id XX:XX:XX:XX:XX:0f
    }
    loopback lo {
    }
}
service {
    https {
        api {
            debug
            keys {
                id MY-HTTP-API-ID {
                    key xxxxxx
                }
            }
        }
    }
    ssh {
    }
}
system {
    config-management {
        commit-revisions 100
    }
    console {
        device ttyS0 {
            speed 115200
        }
    }
    host-name xxxxxx
    login {
        user xxxxxx {
            authentication {
                encrypted-password xxxxxx
                plaintext-password xxxxxx
            }
        }
    }
    name-servers-dhcp eth0
    ntp {
        server xxxxx.tld {
        }
        server xxxxx.tld {
        }
        server xxxxx.tld {
        }
    }
    syslog {
        global {
            facility all {
                level info
            }
            facility protocols {
                level debug
            }
        }
    }
}
```