
Keepalived memory utilisation issue when constantly getting its state in JSON format
Closed, Resolved | Public | BUG

Description

Hi,

We monitor keepalived/VRRP by fetching its state in JSON format every 10 seconds. This is done by sending a particular signal to the keepalived process, as described in the keepalived man pages.
Since we started doing this, we have noticed keepalived's memory utilisation growing constantly until it consumes all available memory and causes the firewall to crash. The keepalived version that ships with VyOS is 2.0.10. We upgraded it to 2.1.5, which is backported to Debian buster, and the issue disappeared.
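
For anyone trying to reproduce this, the polling described above can be sketched roughly as follows. This is a minimal sketch and not the monitoring script we actually run. It assumes keepalived was built with JSON support, that `keepalived --signum=JSON` prints the signal number to use, that the PID file lives at /run/keepalived.pid, and that the dump lands in /tmp/keepalived.json; the paths and the helper names are illustrative and may differ on your system. It also logs VmRSS of the signalled process so that growth becomes visible. The child processes would need the same check.

```python
import json
import os
import re
import subprocess
import time

PID_FILE = "/run/keepalived.pid"    # assumption: adjust to your system
JSON_FILE = "/tmp/keepalived.json"  # assumption: default dump location
INTERVAL = 10                       # seconds, as in the report


def json_signal_number():
    """Ask the keepalived binary which signal triggers the JSON dump."""
    out = subprocess.run(["keepalived", "--signum=JSON"],
                         capture_output=True, text=True, check=True)
    return int(out.stdout.strip())


def parse_vmrss_kib(status_text):
    """Extract VmRSS (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def poll_once(pid, signum):
    """Trigger one JSON dump, then read the dump and the process RSS."""
    os.kill(pid, signum)
    time.sleep(1)  # crude wait for keepalived to finish writing the file
    with open(JSON_FILE) as fh:
        state = json.load(fh)
    with open(f"/proc/{pid}/status") as fh:
        rss = parse_vmrss_kib(fh.read())
    return state, rss


if __name__ == "__main__":
    with open(PID_FILE) as fh:
        pid = int(fh.read().strip())
    signum = json_signal_number()
    while True:
        state, rss = poll_once(pid, signum)
        print(f"{len(state)} top-level JSON entries, RSS {rss} KiB")
        time.sleep(INTERVAL - 1)
```

Running this against 2.0.10 and 2.1.5 side by side should show whether the RSS figure keeps climbing with every dump.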

● keepalived.service - Keepalive Daemon (LVS and VRRP)
   Loaded: loaded (/lib/systemd/system/keepalived.service; disabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/keepalived.service.d
           └─override.conf
   Active: active (running) since Wed 2021-05-12 05:57:56 UTC; 5h 17min ago
 Main PID: 15486 (keepalived)
    Tasks: 5 (limit: 546)
   Memory: 134.5M
   CGroup: /system.slice/keepalived.service
           ├─15486 /usr/sbin/keepalived --dont-fork --snmp
           ├─15499 /usr/sbin/keepalived --dont-fork --snmp
           └─15566 python3 /usr/libexec/vyos/system/keepalived-fifo.py /run/keepalived_notify_fifo

              total        used        free      shared  buff/cache   available
Mem:          484Mi       367Mi        15Mi       8.0Mi       101Mi        92Mi
Swap:            0B          0B          0B
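
As a rough sanity check on the numbers above, the growth per JSON dump can be estimated with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes polling ran for the whole 5h 17min of uptime and that the entire 134.5M reported for the service is leaked memory, which overstates things because that figure also covers the keepalived-fifo.py helper and normal working memory. Treat the result as an upper bound.

```python
# Figures taken from the systemctl output above.
uptime_s = 5 * 3600 + 17 * 60   # "5h 17min ago"
poll_interval_s = 10            # one JSON dump every 10 seconds
service_mem_mib = 134.5         # "Memory: 134.5M"

# Assumption: one dump per interval for the whole uptime.
polls = uptime_s // poll_interval_s

# Assumption: all of the service memory is attributed to the dumps.
per_poll_kib = service_mem_mib * 1024 / polls

print(f"{polls} dumps, at most about {per_poll_kib:.1f} KiB per dump")
```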

I built 2.0.10 with --mem-check to get the memory logs, but they did not report any issue.

---[ Keepalived memory dump for (VRRP Child process) ]---



---[ Keepalived memory dump summary for (VRRP Child process) ]---
Total number of bytes not freed...: 0
Number of entries not freed.......: 0
Maximum allocated entries.........: 271
Maximum memory allocated..........: 29217
Number of mallocs.................: 2348
Number of reallocs................: 380
Number of bad entries.............: 0
Number of buffer overrun..........: 0
Number of 0 size allocations......: 0

=> Program seems to be memory allocation safe...

Details

Difficulty level: Unknown (require assessment)
Version: 1.4
Why the issue appeared? Will be filled on close
Is it a breaking change? Unspecified (possibly destroys the router)
