BGP configuration (is lost|not applied) when updating 1.1.8 -> 1.2.1
Confirmed, High | Public | BUG

Description

Due to the joys of SACK Panic, we have finally begun migrating our older 1.1.8 routers to 1.2.1.

We started with one of our EDGE routers, which runs BGP to an upstream third party for IP transit and then speaks to two of our core routers, sending them a default route. Core <-> EDGE uses OSPF for the routers to find each other, and BGP to distribute routes to the rest of the network. The core routers route-reflect to a redundant EDGE router and to another router. I presume most of this is not relevant to this issue.

On upgrade, the router had assumed all the correct interface IPs, and all OSPF routing was working fine.

BGP simply was not started. It showed no current config, no neighbours, no AS. This is despite the 'show config' showing our previous fully functional BGP config.

By deleting and re-adding the same lines in the protocols -> bgp portion of the config, the FRR BGP daemon slowly came online. It took some trial and error, trying each line one by one to see which parts would make FRR respond. It appeared to take the neighbour config in part, but would not set the nexthop-self or remote-as parameters until they were deleted and then re-added in their seemingly identical form.

Details

Difficulty level
Normal (likely a few hours)
Version
1.2.1
Why the issue appeared?
Will be filled on close

Event Timeline

SquirePug updated the task description. Jun 26 2019, 2:34 AM

protocols {

bgp 132394 {
    address-family {
        ipv4-unicast {
            network 0.0.0.0/0 {
            }
            network 103.20.20.0/24 {
            }
            network 103.232.159.0/24 {
            }
            network 103.232.216.0/23 {
            }
            network 202.0.150.0/24 {
            }
        }
        ipv6-unicast {
            network 2402:7b80::/32 {
            }
        }
    }
    maximum-paths {
        ibgp 6
    }
    neighbor 103.232.216.229 {
        address-family {
            ipv4-unicast {
                nexthop-self
                route-map {
                    export DENY-EBGP-IBGP
                }
                soft-reconfiguration {
                    inbound
                }
            }
        }
        advertisement-interval 1
        remote-as 132394
        update-source 103.232.216.226
    }
    neighbor 103.232.216.230 {
        address-family {
            ipv4-unicast {
                nexthop-self
                route-map {
                    export DENY-EBGP-IBGP
                }
                soft-reconfiguration {
                    inbound
                }
            }
        }
        advertisement-interval 1
        remote-as 132394
        update-source lo
    }
    neighbor 203.20.64.242 {
        address-family {
            ipv4-unicast {
                prefix-list {
                    export BGP-OUT
                }
                route-map {
                    export AS132394-OUT
                    import AS132394-IN
                }
                soft-reconfiguration {
                    inbound
                }
            }
        }
        description "TRANSIT: VIRTUALNODE (AS137273)"
        ebgp-multihop 2
        remote-as 137273
        timers {
            holdtime 30
            keepalive 10
        }
        update-source 103.232.216.226
    }
    parameters {
        default {
            local-pref 125
        }
        log-neighbor-changes
        router-id 103.232.216.226
        scan-time 5
    }
    timers {
        holdtime 10
        keepalive 3
    }
}
pasik added a subscriber: pasik. Jun 26 2019, 8:46 PM

c-po claimed this task. Jul 4 2019, 7:43 PM
c-po added a comment. Jul 4 2019, 7:45 PM

Unfortunately the configuration is incomplete. Can you either add the missing parts, PM me your real configuration if that works for you, or use the show configuration commands | strip-private command to remove anything sensitive?

c-po renamed this task from "Upgrade from 1.1.8 -> 1.2.1-S2 ignored sections of config, until they were deleted, and re-setup" to "BGP configuration (is lost|not applied) when updating 1.1.8 -> 1.2.1". Jul 7 2019, 7:02 PM
c-po added a comment. Jul 7 2019, 7:16 PM

The issue can be reproduced using the sanitized configuration below:

VyOS 1.1.8

vyos@vyos# show
 interfaces {
     ethernet eth0 {
         duplex auto
         hw-id 00:50:56:9d:6d:f4
         smp_affinity auto
         speed auto
         vif 10 {
             address 172.16.33.104/24
         }
     }
     loopback lo {
         address 2001:db8::1/128
         address 10.10.10.10/32
     }
 }
 protocols {
     bgp 100 {
         address-family {
             ipv6-unicast {
                 network 2001:db8::/40 {
                 }
             }
         }
         maximum-paths {
             ibgp 6
         }
         neighbor 192.168.1.1 {
             remote-as 200000
             soft-reconfiguration {
                 inbound
             }
             timers {
                 holdtime 30
                 keepalive 10
             }
         }
         neighbor 2001:db8:ffff::4 {
             address-family {
                 ipv6-unicast {
                     nexthop-self
                     soft-reconfiguration {
                         inbound
                     }
                 }
             }
             remote-as 300000
             update-source lo
         }
         network 1.1.1.0/24 {
         }
         network 1.1.2.0/24 {
         }
         network 1.1.10.0/23 {
         }
         network 200.0.0.0/24 {
         }
         parameters {
             default {
                 local-pref 125
             }
             log-neighbor-changes
             scan-time 5
         }
         timers {
             holdtime 10
             keepalive 3
         }
     }
     static {
         route 0.0.0.0/0 {
             next-hop 172.16.33.254 {
             }
         }
     }
 }
 service {
     ssh {
         port 22
     }
 }
 system {
     config-management {
         commit-revisions 20
     }
     host-name vyos
     login {
         user vyos {
             authentication {
                 encrypted-password $1$n8RZidnr$V5B3zgZEjPMpI6iW5CiHx0
                 plaintext-password ""
             }
             level admin
         }
     }
     name-server 172.16.254.30
     ntp {
         server 0.pool.ntp.org {
         }
         server 1.pool.ntp.org {
         }
         server 2.pool.ntp.org {
         }
     }
     package {
         auto-sync 1
         repository community {
             components main
             distribution helium
             password ""
             url http://packages.vyos.net/vyos
             username ""
         }
     }
     syslog {
         global {
             facility all {
                 level notice
             }
             facility protocols {
                 level debug
             }
         }
     }
     time-zone UTC
 }
[edit]
vyos@vyos# sudo vtysh -c "show run"
Building configuration...

Current configuration:
!
log syslog
log facility local7
!
debug ospf6 lsa unknown
!
interface eth0
 ipv6 nd suppress-ra
 link-detect
!
interface eth0.10
 ipv6 nd suppress-ra
 link-detect
!
interface lo
!
router bgp 100
 bgp router-id 10.10.10.10
 bgp log-neighbor-changes
 bgp default local-preference 125
 bgp network import-check
 bgp scan-time 5
 network 1.1.1.0/24
 network 1.1.2.0/24
 network 1.1.10.0/23
 network 200.0.0.0/24
 timers bgp 3 10
 neighbor 192.168.1.1 remote-as 200000
 neighbor 192.168.1.1 timers 10 30
 neighbor 192.168.1.1 soft-reconfiguration inbound
 neighbor 2001:db8:ffff::4 remote-as 300000
 neighbor 2001:db8:ffff::4 update-source lo
 maximum-paths ibgp 6
!
 address-family ipv6
 network 2001:db8::/40
 neighbor 2001:db8:ffff::4 activate
 neighbor 2001:db8:ffff::4 next-hop-self
 neighbor 2001:db8:ffff::4 soft-reconfiguration inbound
 exit-address-family
!
ip route 0.0.0.0/0 172.16.33.254
!
ip forwarding
ipv6 forwarding
!
line vty
!
end
[edit]

VyOS 1.2.1

vyos@vyos# show
 interfaces {
     ethernet eth0 {
         duplex auto
         hw-id 00:50:56:9d:6d:f4
         smp-affinity auto
         speed auto
         vif 10 {
             address 172.16.33.104/24
         }
     }
     loopback lo {
         address 2001:db8::1/128
         address 10.10.10.10/32
     }
 }
 protocols {
     bgp 100 {
         address-family {
             ipv4-unicast {
                 network 1.1.1.0/24 {
                 }
                 network 1.1.2.0/24 {
                 }
                 network 1.1.10.0/23 {
                 }
                 network 200.0.0.0/24 {
                 }
             }
             ipv6-unicast {
                 network 2001:db8::/40 {
                 }
             }
         }
         maximum-paths {
             ibgp 6
         }
         neighbor 192.168.1.1 {
             address-family {
                 ipv4-unicast {
                     soft-reconfiguration {
                         inbound
                     }
                 }
             }
             remote-as 200000
             timers {
                 holdtime 30
                 keepalive 10
             }
         }
         neighbor 2001:db8:ffff::4 {
             address-family {
                 ipv6-unicast {
                     nexthop-self
                     soft-reconfiguration {
                         inbound
                     }
                 }
             }
             remote-as 300000
             update-source lo
         }
         parameters {
             default {
                 local-pref 125
             }
             log-neighbor-changes
             scan-time 5
         }
         timers {
             holdtime 10
             keepalive 3
         }
     }
     static {
         route 0.0.0.0/0 {
             next-hop 172.16.33.254 {
             }
         }
     }
 }
 service {
     ssh {
         port 22
     }
 }
 system {
     config-management {
         commit-revisions 20
     }
     host-name vyos
     login {
         user vyos {
             authentication {
                 encrypted-password $1$n8RZidnr$V5B3zgZEjPMpI6iW5CiHx0
                 plaintext-password ""
             }
             level admin
         }
     }
     name-server 172.16.254.30
     ntp {
         server 0.pool.ntp.org {
         }
         server 1.pool.ntp.org {
         }
         server 2.pool.ntp.org {
         }
     }
     syslog {
         global {
             facility all {
                 level notice
             }
             facility protocols {
                 level debug
             }
         }
     }
     time-zone UTC
 }
[edit]
vyos@vyos# sudo vtysh -c "show run"
Building configuration...

Current configuration:
!
frr version 7.0-20190411-01-g799dae6
frr defaults traditional
hostname debian
log syslog informational
hostname vyos
service integrated-vtysh-config
!
ip route 0.0.0.0/0 172.16.33.254
!
router bgp 100
 bgp log-neighbor-changes
 bgp default local-preference 125
!
line vty
!
end
[edit]
c-po changed the task status from Open to Confirmed. Jul 7 2019, 7:16 PM
c-po removed c-po as the assignee of this task.
c-po triaged this task as High priority.
c-po changed Difficulty level from Unknown (require assessment) to Normal (likely a few hours).
c-po changed Version from 1.2.1-S2 to 1.2.1.
c-po added a subscriber: c-po.

This is frr show run, immediately after an upgrade to 1.2.1
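The mismatch between the VyOS configuration and the two `show run` outputs above can also be checked mechanically. The following is a minimal Python sketch, not part of VyOS or FRR: the helper names `frr_bgp_neighbors` and `missing_neighbors` are hypothetical, the expected peers are copied by hand from the sanitized configuration, and the sample texts are excerpts of the outputs posted in this task. It assumes the daemon prints peers as `neighbor <address> remote-as <asn>` lines, as both dumps above do.

```python
import re

# Matches daemon lines such as " neighbor 192.168.1.1 remote-as 200000".
NEIGHBOR_RE = re.compile(r"^\s*neighbor\s+(\S+)\s+remote-as\s+(\S+)\s*$")


def frr_bgp_neighbors(show_run: str) -> dict[str, str]:
    """Return {peer address: remote AS} found in a `show run` text dump."""
    peers = {}
    for line in show_run.splitlines():
        match = NEIGHBOR_RE.match(line)
        if match:
            peers[match.group(1)] = match.group(2)
    return peers


def missing_neighbors(expected: dict[str, str], show_run: str) -> list[str]:
    """List expected peers that the daemon lacks or runs with another AS."""
    running = frr_bgp_neighbors(show_run)
    return sorted(a for a, asn in expected.items() if running.get(a) != asn)


# Peers copied by hand from the sanitized VyOS configuration in this task.
EXPECTED = {"192.168.1.1": "200000", "2001:db8:ffff::4": "300000"}

# Excerpt of the 1.1.8 output: both peers are present.
SHOW_RUN_118 = """\
router bgp 100
 neighbor 192.168.1.1 remote-as 200000
 neighbor 192.168.1.1 timers 10 30
 neighbor 2001:db8:ffff::4 remote-as 300000
 neighbor 2001:db8:ffff::4 update-source lo
!
"""

# Excerpt of the 1.2.1 output: the router block exists but has no peers.
SHOW_RUN_121 = """\
router bgp 100
 bgp log-neighbor-changes
 bgp default local-preference 125
!
"""

if __name__ == "__main__":
    print("1.1.8 missing:", missing_neighbors(EXPECTED, SHOW_RUN_118))
    print("1.2.1 missing:", missing_neighbors(EXPECTED, SHOW_RUN_121))
```

On a live router the text would come from `sudo vtysh -c "show run"` instead of the embedded excerpts. The check only covers `remote-as`; other per-neighbor options such as nexthop-self would need their own patterns.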

c-po added a comment.Jul 9 2019, 6:18 PM

Today I experienced the same mismatch between the VyOS configuration and the running FRR configuration in the OSPFv3 subsystem.

syncer assigned this task to zsdc. Fri, Aug 30, 11:44 PM
syncer edited projects, added VyOS 1.3 Equuleus; removed VyOS 1.2 Crux.

We did an update from 1.2.1-S2 to 1.2.3-epa1.

The upgrade went through without error, but the BGP config is AGAIN not being applied, and requires us to delete and re-set the BGP configuration directives.

Specifically, we had to set remote-as and nexthop-self again, e.g.:
set protocols bgp NNNNNNN neighbor MM.MM.MM.MM address-family ipv4-unicast nexthop-self
set protocols bgp NNNNNNN neighbor MM.MM.MM.MM remote-as '132394'

To be safe, though, my engineer deleted the entire protocols bgp stanza and re-added it. The router works fine after we delete and re-add the BGP config.
OSPF works fine on the same router without issue. No requirement to modify the config. All other configuration directives seem to survive the reboot/upgrade process.

Is there progress on getting this fixed?

from /var/log/vyatta/vyatta-commit.log

[ protocols bgp 132394 ]
%BGP: No IPv4 Unicast peer configured
%BGP: No IPv6 Unicast peer configured
% Unknown command: bgp scan-time 5
Error configuring routing subsystem. See log for more detailed information
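To spot which directive the routing daemon refused, the commit log can be scanned for the rejected command. Below is a minimal Python sketch under stated assumptions: the function name `rejected_commands` is hypothetical, and the line formats (`[ <config path> ]` section headers and `% Unknown command: <command>`) are inferred only from the excerpt above, so other log layouts may need different patterns.

```python
import re

# Rejected lines appear in the excerpt as "% Unknown command: <command>".
UNKNOWN_RE = re.compile(r"^%\s*Unknown command:\s*(.+?)\s*$")
# Each section in the excerpt starts with the config path in brackets.
SECTION_RE = re.compile(r"^\[\s*(.+?)\s*\]$")


def rejected_commands(log_text: str) -> list[tuple[str, str]]:
    """Return (config path, rejected command) pairs from commit-log text."""
    section = ""
    found = []
    for raw in log_text.splitlines():
        line = raw.strip()
        header = SECTION_RE.match(line)
        if header:
            # Remember which config path the following messages belong to.
            section = header.group(1)
            continue
        unknown = UNKNOWN_RE.match(line)
        if unknown:
            found.append((section, unknown.group(1)))
    return found


# The excerpt from /var/log/vyatta/vyatta-commit.log quoted in this task.
LOG = """\
[ protocols bgp 132394 ]
%BGP: No IPv4 Unicast peer configured
%BGP: No IPv6 Unicast peer configured
% Unknown command: bgp scan-time 5
Error configuring routing subsystem. See log for more detailed information
"""

if __name__ == "__main__":
    for path, command in rejected_commands(LOG):
        print(f"{path}: daemon rejected '{command}'")
```

For the excerpt, the only pair reported is the `bgp scan-time 5` line under `protocols bgp 132394`; the two `%BGP:` messages are not treated as rejected commands.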

kroy added a subscriber: kroy. Mon, Sep 16, 1:12 AM

There are a number of strange things going on here, and I suspect there are multiple bugs:

First, this option is invalid somehow:

admin@route-extern# set protocols bgp 64749 parameters scan-time 5
[edit]
admin@route-extern# commit
[ protocols bgp 64749 ]
% Unknown command: bgp scan-time 5
Error configuring routing subsystem.  See log for more detailed information

That's the core problem.

Here's the weird part though.

Before:

parameters {
    router-id 172.18.72.2
}

After:

admin@route-extern# set protocols bgp 64749 parameters scan-time 5
[edit]
admin@route-extern# commit
[ protocols bgp 64749 ]
% Unknown command: bgp scan-time 5
Error configuring routing subsystem.  See log for more detailed information

admin@route-extern#  exit discard
Warning: configuration changes have not been saved.
exit

show configuration: 

    parameters {
        router-id 172.18.72.2
        scan-time 5
    }