Page MenuHomeVyOS Platform

Closing IPV6CP by client closes PPPoE link completely, even if IPv6 is optional
Open, HighPublicBUG

Description

Some background - https://forum.vyos.io/t/pppoe-server-dual-stack-ipv4-ipv6-buggy-soho-routers-that-refuse-ipv6/9010/4
traffic capture - http://91.224.224.43/phicomm/phicomm6.cap
RFC1661 - 3.7 Link Termination Phase - Implementation Note:

The closing of the link by LCP is sufficient.  There is no need
for each NCP to send a flurry of Terminate packets.  Conversely,
the fact that one NCP has Closed is not sufficient reason to cause
the termination of the PPP link, even if that NCP was the only NCP
currently in the Opened state.

Some cheap buggy unsupported SOHO routers (specifically, Phicomm KE2M) as PPPoE clients don't support IPv6, but try to negotiate it anyway and then fail, client sends IPV6CP terminate request, and VyOS PPPoE server responds by terminating LCP, even though it could still work fine as IPv4 only. This makes IPv6 deployment difficult, as changing ppp-options ipv6 "deny" to "allow" breaks existing IPv4 customers who happen to have such buggy routers.

Details

Difficulty level
Unknown (require assessment)
Version
1.3.1-S1
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Behavior change
Issue type
Bug (incorrect behavior)

Event Timeline

Log messages - http://91.224.224.43/phicomm/phicomm6.log
PPPoE server config:

set service pppoe-server authentication mode 'radius'
set service pppoe-server authentication protocols 'pap'
set service pppoe-server authentication protocols 'chap'
set service pppoe-server authentication protocols 'mschap'
set service pppoe-server authentication protocols 'mschap-v2'
set service pppoe-server authentication radius acct-timeout '0'
set service pppoe-server authentication radius nas-ip-address '91.224.224.11'
set service pppoe-server authentication radius server 91.224.224.XXX key '********'
set service pppoe-server authentication radius source-address '91.224.224.11'
set service pppoe-server authentication radius timeout '5'
set service pppoe-server client-ip-pool subnet '185.38.221.112/28'
set service pppoe-server client-ipv6-pool delegate 2a04:7840:dd70::/48 delegation-prefix '56'
set service pppoe-server client-ipv6-pool prefix 2a04:7840:dd71::/48 mask '64'
set service pppoe-server gateway-address '91.224.224.11'
set service pppoe-server interface eth2.49
set service pppoe-server interface eth2.50
set service pppoe-server interface eth2.55
set service pppoe-server interface eth2.59
set service pppoe-server mtu '1500'
set service pppoe-server name-server '91.224.224.55'
set service pppoe-server name-server '91.224.224.44'
set service pppoe-server ppp-options ipv4 'require'
set service pppoe-server ppp-options ipv6 'allow'
set service pppoe-server ppp-options ipv6-accept-peer-intf-id
set service pppoe-server ppp-options ipv6-intf-id 'random'
set service pppoe-server ppp-options ipv6-peer-intf-id 'calling-sid'
set service pppoe-server ppp-options lcp-echo-failure '3'
set service pppoe-server ppp-options lcp-echo-interval '10'
set service pppoe-server ppp-options min-mtu '1280'
set service pppoe-server ppp-options mppe 'deny'

client pools only defined to suppress a warning during "commit", all IPv4/v6 addresses/prefixes are actually static assigned by RADIUS.
Simply change ppp-options ipv6 "allow" to "deny" - and all customers with Phicomm routers stop disconnecting, work fine as IPv4 only.
The setup is proven working otherwise - 10/10 on test-ipv6.com and 19/20 on ipv6-test.com (-1 for lack of IPv6 revDNS) via properly configured MikroTik router as PPPoE client.

See also https://github.com/accel-ppp/accel-ppp/issues/57
Testing this patch, PPPoE session with the Phicomm router now stays up, the missing part after "else" is to remove IPv6 configuration from ppp interface (not sure how to do it properly).

diff
diff --git a/accel-pppd/ppp/ppp_ipv6cp.c b/accel-pppd/ppp/ppp_ipv6cp.c
index 1194b31..2bac31b 100644
--- a/accel-pppd/ppp/ppp_ipv6cp.c
+++ b/accel-pppd/ppp/ppp_ipv6cp.c
@@ -738,7 +738,10 @@ static void ipv6cp_recv(struct ppp_handler_t*h)
                        if (conf_ppp_verbose)
                                log_ppp_info2("recv [IPV6CP TermReq id=%x]\n", hdr->id);
                        ppp_fsm_recv_term_req(&ipv6cp->fsm);
-                       ap_session_terminate(&ipv6cp->ppp->ses, TERM_USER_REQUEST, 0);
+                       if (conf_ipv6 == IPV6_REQUIRE)
+                               ap_session_terminate(&ipv6cp->ppp->ses, TERM_USER_REQUEST, 0);
+                       else
+                               ppp_layer_passive(ipv6cp->ppp, &ipv6cp->ld);
                        break;
                case TERMACK:
                        if (conf_ppp_verbose)

As a temporary workaround, I use the script below. For some reason /etc/rc.local no longer runs automatically on VyOS 1.3.2, so I run it manually after each reboot for now. Until it is run, Phicomm routers keep disconnecting due to failed IPV6CP negotiation incorrectly triggering complete PPPoE session termination. I have two PPPoE servers at different locations for redundancy, both rebooting at the same time is very unlikely, so I can live with it for now.

#! /bin/sh
set -e

# block ipv6cp from broken phicomm routers (MAC prefix: d8c8e9, fc7c02)
PPPOE_IFACES="eth2 eth4"

for i in $PPPOE_IFACES ; do
  tc qdisc del dev $i ingress || true
  tc qdisc add dev $i ingress
  tc filter add dev $i parent ffff: u32 match u32 0xd8c8e900 0xffffff00 at -8 match u16 0x8864 0xffff at -2 match u16 0x8057 0xffff at 6 action drop
  tc filter add dev $i parent ffff: u32 match u32 0xfc7c0200 0xffffff00 at -8 match u16 0x8864 0xffff at -2 match u16 0x8057 0xffff at 6 action drop
done

To verify the script worked:

$ tc filter show dev eth2 ingress
filter parent ffff: protocol all pref 49151 u32 chain 0
filter parent ffff: protocol all pref 49151 u32 chain 0 fh 801: ht divisor 1
filter parent ffff: protocol all pref 49151 u32 chain 0 fh 801::800 order 2048 key ht 801 bkt 0 terminal flowid ??? not_in_hw
  match fc7c0200/ffffff00 at -8
  match 00008864/0000ffff at -4
  match 00008057/0000ffff at 4
        action order 1: gact action drop
         random type none pass val 0
         index 2 ref 1 bind 1

filter parent ffff: protocol all pref 49152 u32 chain 0
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800: ht divisor 1
filter parent ffff: protocol all pref 49152 u32 chain 0 fh 800::800 order 2048 key ht 800 bkt 0 terminal flowid ??? not_in_hw
  match d8c8e900/ffffff00 at -8
  match 00008864/0000ffff at -4
  match 00008057/0000ffff at 4
        action order 1: gact action drop
         random type none pass val 0
         index 1 ref 1 bind 1