- User Since
- Aug 22 2020, 1:26 PM (34 w, 1 d)
Oct 14 2020
Just my thoughts - there are situations where rp_filter is not sufficient, and it was not clear to me how to do this cleanly with the zone firewall, so I ended up hacking a few iptables commands in rc.local instead.
Oct 2 2020
Sep 9 2020
Aug 31 2020
Even with customers routes redistributed by OSPF instead of iBGP, it has just crashed again:
I tried unit-cache earlier but it seems to have issues too - I've seen duplicate routes if the same client (all have static IP assigned by RADIUS based on username) connects to a different PPPoE server and the old route is not removed, as if the cached (not removed) PPPoE interfaces were not seen as removed in FRR. But I haven't investigated this in more detail as it's a production setup, can't experiment too much on live customers.
I'm considering if I could go back to redistributing PPPoE customers /32 routes in OSPF instead of iBGP - it has been that way for a few years (using MikroTik, before moving to VyOS), but I've recently changed it following "BGP Best Current Practices" http://www.bgp4all.com.au/pfs/_media/workshops/05-bgp-bcp.pdf which recommends using OSPF only for infrastructure, not customers - seems logical to me as BGP was designed for much larger routing tables (all of the Internet), but perhaps OSPF is still good enough for just a few hundreds of customers.
Aug 30 2020
I've just had two different routers (one bare metal and one VM) crash roughly at the same time, triggered by many PPPoE sessions disconnecting at the same time due to a short power failure (routers itself had power all the time, but power was interrupted for about a minute to a switch on the network between the routers and PPPoE clients). Stack traces are very similar (absolute addresses differ, but the same functions and offsets in them). And again, each time watchfrr restarted bgpd but it was not working until reboot. No problems so far with two other BGP routers running a similar configu but without any dynamic interfaces (only OSPF and BGP, no PPPoE servers).
Aug 28 2020
Aug 27 2020
It crashed again after 5 days in 1.2.6-epa1, in the same function, also when a dynamic PPPoE interface was deleted.
It happens less frequently after the former customers who repeatedly failed authentication have been physically disconnected.
Again, BGP no longer works after watchfrr has restarted the bgpd process. All works again after reboot.
Aug 25 2020
Aug 22 2020
Maybe related - https://github.com/FRRouting/frr/issues/6439