Config Sync between two VyOS routers
Closed, Resolved | Public | FEATURE REQUEST

Description

With two routers, it would be excellent to be able to create a stack so that changes made on the primary are replicated to the secondary, similar to the JUNOS SRX functionality.

Details

Difficulty level: Hard (possibly days)
Version: -
Why the issue appeared?: Will be filled on close
Is it a breaking change?: Unspecified (possibly destroys the router)

Event Timeline

Could you please provide links to, e.g., SRX documentation or whitepapers?

Hello c-po, here is a good description of the JUNOS SRX HA configuration and how it works in general: https://kb.juniper.net/InfoCenter/index?page=content&id=KB21312&actp=METADATA

Additional references: https://kb.juniper.net/InfoCenter/index?page=content&id=KB15694&actp=METADATA

Thank you very much!

syncer triaged this task as Wishlist priority.Sep 1 2018, 2:54 PM
dongjunbo removed a subscriber: dongjunbo.
dongjunbo added a subscriber: dongjunbo.

I've made some hacks to get vyatta-config-sync working with ssh-keys (and working at all).

I'm not a programmer, so the result looks ugly, but it works.

I've fixed https://github.com/Harliff/vyattta-config-sync/blob/master/scripts/vyatta-config-sync-update.sh
and added some instructions to https://github.com/Harliff/vyattta-config-sync/blob/master/README.md

Hello Harliff,

I really appreciate the information; I will definitely have a look at the links you sent me.

Sincerely,

Viacheslav changed Difficulty level from Unknown (require assessment) to Hard (possibly days).
Viacheslav set Is it a breaking change? to Unspecified (possibly destroys the router).

PR https://github.com/vyos/vyos-1x/pull/2042

set service config-sync mode 'load'
set service config-sync secondary address '192.168.122.11'
set service config-sync secondary key 'foo'
set service config-sync section 'nat'

Example for nat

set nat source rule 100 description 'rule 100'
set nat source rule 100 outbound-interface 'eth1'
set nat source rule 100 translation address 'masquerade'

vyos@r14# commit
[ nat ]

INFO:vyos_config_sync:Config synchronization: Mode=load, Primary=192.168.122.14, Secondary=192.168.122.11
[edit]
vyos@r14#

Wow! Glad to see it's moving on!

Viacheslav changed the task status from Open to Needs testing.Jul 7 2023, 7:17 AM

@Viacheslav Thanks for all the work on this, I'm glad to see it moving along! This doesn't appear to work for me on 1.4-rolling-202307070317; I've configured it for both firewall and NAT and it appears to not be getting triggered (though I've only tried firewall changes so far). Here's the primary side (cr01a-vyos.int) config:

trae@cr01a-vyos# show service config-sync
 mode load
 secondary {
     address cr01b-vyos.int.rtr.trae32566.org
     key <MyKey>
 }
 section nat
 section firewall

As you can see, it is able to reach the secondary side (cr01b-vyos.int) without issue using both IPv4 and IPv6:

trae@cr01a-vyos# nmap -Pn -p 443 cr01b-vyos.int.rtr.trae32566.org
Starting Nmap 7.93 ( https://nmap.org ) at 2023-07-08 22:21 CDT
Nmap scan report for cr01b-vyos.int.rtr.trae32566.org (192.168.253.3)
Host is up (0.00027s latency).
Other addresses for cr01b-vyos.int.rtr.trae32566.org (not scanned): fd52:d62e:8011:fffe::3

PORT    STATE SERVICE
443/tcp open  https

Nmap done: 1 IP address (1 host up) scanned in 0.07 seconds
trae@cr01a-vyos# nmap -6Pn -p 443 cr01b-vyos.int.rtr.trae32566.org
Starting Nmap 7.93 ( https://nmap.org ) at 2023-07-08 22:23 CDT
Nmap scan report for cr01b-vyos.int.rtr.trae32566.org (fd52:d62e:8011:fffe::3)
Host is up (0.00027s latency).
Other addresses for cr01b-vyos.int.rtr.trae32566.org (not scanned): 192.168.253.3

PORT    STATE SERVICE
443/tcp open  https

Nmap done: 1 IP address (1 host up) scanned in 0.04 seconds

I think I've configured the secondary correctly as well:

trae@cr01b-vyos# show service https
 api {
     graphql {
         authentication {
             type token
         }
         introspection
     }
     keys {
         id CR01A-VYOS.INT {
             key <MyKey>
         }
     }
     socket
 }
 virtual-host CONFIG-SYNC {
     allow-client {
         address 192.168.253.2
     }
     listen-address 192.168.253.3
     server-name cr01b-vyos.int.rtr.trae32566.org
 }
 virtual-host CONFIG-SYNC-V6 {
     allow-client {
         address fd52:d62e:8011:fffe::2
     }
     listen-address fd52:d62e:8011:fffe::3
     server-name cr01b-vyos.int.rtr.trae32566.org
 }

I tried adding a piece of firewall configuration and it doesn't seem to sync, and tcpdump on the secondary (cr01b-vyos.int) doesn't show any traffic coming from the primary (cr01a-vyos.int):
Primary:

trae@cr01a-vyos# set firewall name INT_TO_LOCAL rule 80 destination address 192.168.253.2-192.168.253.3
[edit]
trae@cr01a-vyos# comp
[firewall name INT_TO_LOCAL rule 80]
+ destination {
+     address "192.168.253.2-192.168.253.3"
+ }

[edit]
trae@cr01a-vyos# commit
[edit]

Secondary (note the missing destination address):

trae@cr01b-vyos# show firewall name INT_TO_LOCAL rule 80
 action accept
 description "API access"
 destination {
     port https
 }
 protocol tcp
 source {
     address 192.168.253.2-192.168.253.3
 }
[edit]

Is the firewall portion not working?

@trae32566 Try the same with an IP address instead of the hostname; I tested with IPv4 addresses.

I tried using the IPv4 address; no luck :/

Same version on both, 1.4-rolling-202307070317. Also, if you can disable 2 factor on my Slack account ([email protected]) we can talk in Slack about this (lost my 2 factor app / backup codes).
@Viacheslav

@trae32566 Thanks, could you change one file and comment out one check?

sudo nano -c +140 /run/scripts/commit/post-hooks.d/vyos_config_sync

Comment out these lines:

# Config sync only if sections changed
#if not any(map(is_section_revised, sections)):
#    return

After this, run sudo systemctl restart vyos-configd and check again.
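
The check being commented out above can be pictured roughly as follows. This is a minimal sketch and not the actual vyos_config_sync code: `is_section_revised` here is a hypothetical stand-in that compares two plain dictionaries, and `should_sync` and `skip_unchanged` are names invented for illustration. Commenting out the check corresponds to `skip_unchanged=False`, so the sync would run on every commit.

```python
def is_section_revised(section, running, proposed):
    """Hypothetical stand-in: a section counts as revised when its subtree differs."""
    return running.get(section) != proposed.get(section)


def should_sync(sections, running, proposed, skip_unchanged=True):
    """Return True when the post-commit hook should push config to the secondary."""
    # Mirrors "Config sync only if sections changed": bail out early
    # when none of the configured sections differ.
    if skip_unchanged and not any(
        is_section_revised(s, running, proposed) for s in sections
    ):
        return False
    return True


running = {"firewall": {"rule 80": {"action": "accept"}}, "nat": {}}
proposed = {
    "firewall": {
        "rule 80": {
            "action": "accept",
            "destination": "192.168.253.2-192.168.253.3",
        }
    },
    "nat": {},
}
print(should_sync(["nat", "firewall"], running, proposed))  # True
```

If the real check returns early even though a section did change, as seemed to happen here, the hook never contacts the secondary, which would match the empty tcpdump.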

@Viacheslav I think that fixed it... sorta. It looks like it now syncs successfully, though it appears to time out after a while for some reason:

trae@cr01a-vyos:~$ sudo nano -c +140 /run/scripts/commit/post-hooks.d/vyos_config_sync
trae@cr01a-vyos:~$ sudo systemctl restart vyos-configd
trae@cr01a-vyos:~$ configure
[edit]
trae@cr01a-vyos# set firewall name INT_TO_LOCAL rule 80 destination address 192.168.253.2-192.168.253.3
[edit]
trae@cr01a-vyos# commit
INFO:vyos_config_sync:Config synchronization: Mode=load, Secondary=cr01b-vyos.int.rtr.trae32566.org

An error occurred: HTTPSConnectionPool(host='cr01b-vyos.int.rtr.trae32566.org', port=443): Read timed out. (read timeout=60)
ERROR:vyos_config_sync:An error occurred: HTTPSConnectionPool(host='cr01b-vyos.int.rtr.trae32566.org', port=443): Read timed out. (read timeout=60)

That being said, it does appear to have set the config on the other side:

trae@cr01b-vyos# show firewall name INT_TO_LOCAL rule 80
 action accept
 description "API access"
 destination {
     address 192.168.253.2-192.168.253.3
 }
 protocol tcp
 source {
     address 192.168.253.2-192.168.253.3
 }
trae@cr01b-vyos# cat /var/log/nginx/access.log 
fd52:d62e:8011:fffe::2 - - [09/Jul/2023:09:54:23 -0500] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
fd52:d62e:8011:fffe::2 - - [09/Jul/2023:09:55:25 -0500] "POST /configure-section HTTP/1.1" 499 0 "-" "python-requests/2.28.1"
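
For reading the two log entries above: status 499 is nginx's non-standard code for "client closed request", meaning the client went away before nginx could send a response. The sketch below, which assumes the default combined log format shown here and uses hypothetical helper names, parses the two lines and shows that the 499 was logged 62 seconds after the 200. That is in line with the 60 second read timeout reported on the primary, though the log alone does not prove the connection between the two.

```python
import re
from datetime import datetime

# Matches the leading fields of an nginx "combined" access log line.
LOG_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_line(line):
    """Return client, timestamp, path and status for one log line, or None."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    return {
        "client": m["client"],
        "time": datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "path": m["path"],
        "status": int(m["status"]),
    }


lines = [
    'fd52:d62e:8011:fffe::2 - - [09/Jul/2023:09:54:23 -0500] '
    '"POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"',
    'fd52:d62e:8011:fffe::2 - - [09/Jul/2023:09:55:25 -0500] '
    '"POST /configure-section HTTP/1.1" 499 0 "-" "python-requests/2.28.1"',
]
first, second = (parse_line(line) for line in lines)
gap = (second["time"] - first["time"]).total_seconds()
print(first["status"], second["status"], gap)  # 200 499 62.0
```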

Also, if it's any easier or you'd prefer, I can set up a Webex or something.

My internal test works without timeouts.

vyos@r14# sudo ping -6 r2.vyos.local
PING r2.vyos.local(r2.vyos.local (2001:db8::2)) 56 data bytes
64 bytes from r2.vyos.local (2001:db8::2): icmp_seq=1 ttl=64 time=0.215 ms

^C
--- r2.vyos.local ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2073ms
rtt min/avg/max/mdev = 0.215/0.260/0.300/0.034 ms
[edit]
vyos@r14# 
[edit]
vyos@r14# set nat source rule 100 description 123
[edit]
vyos@r14# commit
[ nat ]

INFO:vyos_config_sync:Config synchronization: Mode=load, Secondary=r2.vyos.local
[edit]
vyos@r14#

But I have a simple firewall.

2001:db8::1 - - [09/Jul/2023:19:04:09 +0300] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
2001:db8::1 - - [09/Jul/2023:19:04:10 +0300] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
2001:db8::1 - - [09/Jul/2023:19:04:31 +0300] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
2001:db8::1 - - [09/Jul/2023:19:04:31 +0300] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
2001:db8::1 - - [09/Jul/2023:19:04:49 +0300] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
2001:db8::1 - - [09/Jul/2023:19:04:50 +0300] "POST /configure-section HTTP/1.1" 200 80 "-" "python-requests/2.28.1"
[edit]
vyos@r11#

@Viacheslav I'm not sure why, but it appears that after doing this, there is high CPU usage on the secondary side, and eventually it stops responding entirely (bgp sessions go down, no response to anything via icmp) and has to be hard reset; it won't even respond to a console login attempt:

[Screenshot attached: image.png]

This makes me think something in my firewall configuration is making it unhappy. I can paste my full firewall config somewhere if you'd like, but I'd prefer if it's not public for security reasons (is email fine?).

Note that config-sync mode 'load' fully replaces the firewall section on the secondary, so a big firewall may require an increased timeout.
This requires more tests; try set service config-sync secondary timeout xxx

Yeah I tried increasing the timeout to the maximum (300) and it still timed out, but I'll try config-sync mode 'set' I guess. The config is fairly large; 549 lines of just set firewall.
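
To picture the difference between the two modes mentioned here, the sketch below models a config as nested dictionaries. It is only an illustration: the function names are hypothetical, the 'load' behaviour follows the description in this thread (the section is fully replaced), and the 'set' behaviour (values merged on top of what the secondary already has) is an assumption about that mode, not something confirmed in this task.

```python
import copy


def apply_load(secondary, section, primary_section):
    """'load': the secondary's section is replaced wholesale by the primary's."""
    result = copy.deepcopy(secondary)
    result[section] = copy.deepcopy(primary_section)
    return result


def apply_set(secondary, section, primary_section):
    """'set' (assumed semantics): primary values are merged into the existing section."""
    result = copy.deepcopy(secondary)

    def merge(dst, src):
        for key, value in src.items():
            if isinstance(value, dict) and isinstance(dst.get(key), dict):
                merge(dst[key], value)
            else:
                dst[key] = copy.deepcopy(value)

    merge(result.setdefault(section, {}), primary_section)
    return result


secondary = {
    "firewall": {
        "rule 80": {"action": "accept"},
        "rule 90": {"action": "drop"},  # exists only on the secondary
    }
}
primary_fw = {"rule 80": {"action": "accept", "destination": "192.168.253.2-192.168.253.3"}}

print(sorted(apply_load(secondary, "firewall", primary_fw)["firewall"]))  # ['rule 80']
print(sorted(apply_set(secondary, "firewall", primary_fw)["firewall"]))   # ['rule 80', 'rule 90']
```

Under these assumptions, 'load' drops anything in the section that exists only on the secondary, while 'set' leaves it in place.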

@Viacheslav So I figured out what's causing it. It looks like for some reason my commit-archive configuration on the secondary side (which works fine normally) is causing the hang. As soon as I remove set system config-management commit-archive on the secondary side, everything starts working fine, even with my full firewall configuration. Has this been tested at all with commit-archive? Could there be some sort of bug happening with it? Here's the section of the secondary side config, for reference:

[system config-management]
- commit-archive {
-     location "sftp://<someUser>:<somePass>@stor01a-rh9.int.trae32566.org/int/cr01b-vyos"
-     source-address "fd52:d62e:8011:fffe::3"
- }

@trae32566 Thanks, I can confirm it is a bug when using commit-archive location; there is a separate task: https://vyos.dev/T5348

dmbaturin claimed this task.