For the past few months, conntrack-sync's internal cache on my routers has been growing to an absurd size (even though the conntrack table itself is relatively small), at which point I begin to see intermittent connectivity issues until I restart the service with restart conntrack-sync. That helps for a few days, and then I have to do the same thing again. Here's what the A-side router looks like when comparing conntrack-sync's internal cache against the actual conntrack table. There's a huge discrepancy (284 active connections in the conntrack table, but just over 22,000 in the cache), and this is only 20 minutes after restarting the service:
trae@cr01a-vyos:~$ show conntrack table ipv4 | wc -l
286
trae@cr01a-vyos:~$ show conntrack statistics
CPU    Found    Invalid    Insert    Insert fail    Drop    Early drop    Errors    Search restart
----- ------- ---------- ------------ --------------- -------- ------------- --------- ---------------- ----------- ----------------
cpu=0 found=0 invalid=74 insert=0 insert_failed=0 drop=0 early_drop=0 error=9 search_restart=0 (null)=13 (null)=0
cpu=1 found=0 invalid=64 insert=7121 insert_failed=0 drop=0 early_drop=0 error=2 search_restart=0 (null)=235 (null)=0
cpu=2 found=0 invalid=83 insert=73041 insert_failed=0 drop=0 early_drop=0 error=495 search_restart=6 (null)=93 (null)=0
cpu=3 found=0 invalid=73 insert=184 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 (null)=75 (null)=0
cpu=4 found=0 invalid=74 insert=0 insert_failed=0 drop=0 early_drop=0 error=885 search_restart=0 (null)=1 (null)=0
cpu=5 found=0 invalid=64 insert=1255 insert_failed=0 drop=0 early_drop=0 error=0 search_restart=0 (null)=0 (null)=0
cpu=6 found=2 invalid=68 insert=0 insert_failed=1 drop=1 early_drop=0 error=0 search_restart=0 (null)=1051 (null)=0
cpu=7 found=0 invalid=71 insert=3046 insert_failed=0 drop=0 early_drop=0 error=32 search_restart=0 (null)=88 (null)=0
trae@cr01a-vyos:~$ show conntrack-sync statist
cache internal:
current active connections:        22199
connections created:               73875    failed:            0
connections updated:               38217    failed:            0
connections destroyed:             51676    failed:            0

external inject:
connections created:               72473    failed:            0
connections updated:               28742    failed:            0
connections destroyed:               270    failed:            0

traffic processed:
          1031447727 Bytes                    411616 Pckts

multicast traffic (active device=bond0.110):
             9065520 Bytes sent              8208184 Bytes recv
              120422 Pckts sent               111498 Pckts recv
                   0 Error send                    0 Error recv

message tracking:
                   0 Malformed msgs                    0 Lost msgs

Main Table Statistics:
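For anyone who wants to track this over time, here is a rough sketch of how the two numbers above can be compared from captured output. It only parses text that has already been saved; the function names (`parse_internal_cache`, `table_entries`) are hypothetical helpers of mine, not part of VyOS or conntrack-tools, and the two-line header allowance on the table output is an assumption based on the 286 vs. 284 figures above.

```python
import re


def parse_internal_cache(stats_text):
    """Pull the 'cache internal' counters out of saved conntrack-sync statistics text."""
    # Restrict the search to the section between the two headings so the
    # 'external inject' counters (which share the same labels) are not matched.
    section = stats_text.split("cache internal:", 1)[1].split("external inject:", 1)[0]

    def counter(label):
        match = re.search(label + r":\s+(\d+)", section)
        if match is None:
            raise ValueError("counter not found: " + label)
        return int(match.group(1))

    return {
        "active": counter("current active connections"),
        "created": counter("connections created"),
        "destroyed": counter("connections destroyed"),
    }


def table_entries(wc_lines, header_lines=2):
    # Assumption: the table listing carries two non-entry lines, which is how
    # 286 lines from wc -l correspond to the 284 connections quoted above.
    return wc_lines - header_lines


# Values copied from the A-side output above.
sample = """
cache internal:
current active connections:        22199
connections created:               73875    failed:            0
connections updated:               38217    failed:            0
connections destroyed:             51676    failed:            0

external inject:
connections created:               72473    failed:            0
"""

cache = parse_internal_cache(sample)
entries = table_entries(286)
ratio = cache["active"] / entries
print("cache:", cache["active"], "table:", entries, "ratio: %.1f" % ratio)
```

In this sample the cache's active count equals created minus destroyed (73875 - 51676 = 22199), so the counters are at least internally consistent; the cache simply holds far more entries than the kernel table does.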
Here are the relevant portions of the configuration:
trae@cr01a-vyos:~$ show conf com | grep -P 'set (system conntrack|service conntrack-sync)'
set service conntrack-sync accept-protocol 'icmp'
set service conntrack-sync accept-protocol 'icmp6'
set service conntrack-sync accept-protocol 'tcp'
set service conntrack-sync accept-protocol 'udp'
set service conntrack-sync disable-external-cache
set service conntrack-sync event-listen-queue-size '100'
set service conntrack-sync failover-mechanism vrrp sync-group 'CR01.INT'
set service conntrack-sync ignore-address 'fe8::/10'
set service conntrack-sync ignore-address 'ff00::/8'
set service conntrack-sync ignore-address '169.254.0.0/16'
set service conntrack-sync ignore-address '224.0.0.0/4'
set service conntrack-sync ignore-address '127.0.0.0/8'
set service conntrack-sync interface bond0.110
set service conntrack-sync sync-queue-size '100'
set system conntrack flow-accounting
set system conntrack modules
set system conntrack table-size '1000000'
set system conntrack timeout icmp '10'
set system conntrack timeout other '60'
set system conntrack timeout tcp close-wait '20'
set system conntrack timeout tcp established '1800'
set system conntrack timeout tcp fin-wait '30'
set system conntrack timeout tcp syn-recv '30'
set system conntrack timeout tcp syn-sent '60'
set system conntrack timeout udp stream '60'
The B-side router shows similar symptoms and statistics. Please let me know if you need anything else; I can also get you access to the routers.