Page MenuHomePhabricator

PPPoE "Terminating on signal 15"
Closed, InvalidPublicBUG

Description

Firstly thanks for VyOS, I remember using Vyatta back in University and am glad I can use VyOS at home now

I have eth0 with pppoe setup to connect to my DSL service

On VyOS 1.18 this has been working great, but I'm in the process of upgrading the hardware (to 64 bit AMD) and with the latest 1.2.0 rolling release 1.2.0-rolling+201907210337 the PPPoE connection drops after a few minutes

Here's typical output from show interfaces pppoe pppoe5 log

Script /etc/ppp/ip-pre-up finished (pid 4226), status = 0x0
local  IP address <IPV4 REDACTED>
remote IP address <IPV4 REDACTED>
primary   DNS address <IPV4 REDACTED>
secondary DNS address <IPV4 REDACTED>
Script /etc/ppp/ip-up started (pid 4266)
Script /etc/ppp/ip-up finished (pid 4266), status = 0x0
nect command
Terminating on signal 15
Connect time 6.5 minutes.
Sent 6185000 bytes, received 20974007 bytes.
Script /etc/ppp/ip-down started (pid 4860)
sent [LCP TermReq id=0x2 "User request"]
rcvd [LCP TermAck id=0x2]
Connection terminated.
Waiting for 1 child processes...
  script /etc/ppp/ip-down, pid 4860
Script /etc/ppp/ip-down finished (pid 4860), status = 0x0
t command
Send PPPOE Discovery V1T1 PADI session 0x0 length 12
 dst ff:ff:ff:ff:ff:ff  src 00:0d:b9:44:5a:7c

Once I get that Terminating on signal 15 other commands fail

$ show pppoe-server interfaces
Connection to "localhost:2001" failed

What is the service that should be listening on localhost:2001 ?

I can take the interface down and bring it back up again but will only get around 5 minutes of connectivity before another Terminating on signal 15

$ disconnect interface pppoe5
Bringing interface pppoe5 down...
$ connect interface pppoe5
Bringing interface pppoe5 up...

Happy to provide any additional details to help debug this issue

Details

Difficulty level
Unknown (require assessment)
Version
1.2.0
Why the issue appeared?
Will be filled on close

Event Timeline

mustard created this task.Jul 21 2019, 2:17 PM
hagbard added a subscriber: hagbard.EditedJul 21 2019, 3:37 PM

Hi @mustard ,

your are confused with client and server commands.

show pppoe-server interfaces``` shows you connected clients when you provide pppoe as a service to other users.

localhost:2001 is a control connection to communicate with the accel-ppp daemon.
For more information see: pppoe-server

According to your description above, it looks more like that you are the client and your ISP the provider. The client log looks like that you as the client terminating the connection, that could be the case if there is no traffic flow and a idle timer is set etc.
It take a little more investigation. Can you share your setup (remove the credentials before posting it anywhre). A tcpdump, would be helpful as well (make sure to remove the credentials there as well), I'm more interested in the ppp control messages to see which side initiates the connection termination.

hagbard claimed this task.Jul 21 2019, 3:37 PM
Dmitry added a subscriber: Dmitry.Jul 21 2019, 8:34 PM
mustard added a comment.EditedJul 22 2019, 10:26 AM

Hi @hagbard

Thanks for the reply,

Sorry about my confusion with pppoe-server, yes I am the client in this case connecting to my ISP

My config is...

vyos@vyos# show interfaces ethernet eth0 pppoe
 pppoe 5 {
     default-route auto
     mtu 1492
     name-server auto
     password redacted
     policy {
         route pppoe-out
     }
     user-id redacted
 }
vyos@vyos# show policy route pppoe-out
 description "PPPoE TCPMSS Clamping"
 rule 100 {
     protocol tcp
     set {
         tcp-mss 1452
     }
     tcp {
         flags SYN
     }
 }

The following packet captures were taken with monitor traffic interface eth0 save <filename.pcap>

I then applied the filter pppoed or lcp or ipcp or pap or chap and then "Export Specified Packets" in wireshark

pppoe-filtered-after-disconnect.pcap was taken after I had lost connectivity what's confusing to me is that the Echo Request / Replies still seem to be going through

pppoe-filtered-during-disconnect.pcap was taken during the "disconnect" it was working till the 5 or so packet there after I couldn't access the intenet

Your filter was a bit to tight, all I see are the LCP echos. I' more interested in the PAD packets, in particular what sides sends a PADT and if PADS is clean.

hagbard added a comment.EditedJul 22 2019, 5:44 PM

I can't reproduce your issue, so I suspect no issues with the pppoe client we ship in vyos.

Jul 22 16:34:34 connect: ppp0 <--> pppoe(08:00:27:03:71:c9)
Jul 22 16:34:34 send [PAP AuthAck id=1 "Authentication succeeded"]
Jul 22 16:34:34 send [CCP ConfReq id=95 <mppe +H -M +S -L -D -C>]
Jul 22 16:34:34 test: authentication succeeded
Jul 22 16:34:34 recv [IPCP ConfReq id=1 <addr 0.0.0.0> <dns1 0.0.0.0> <dns2 0.0.0.0>]
Jul 22 16:34:34 send [IPCP ConfReq id=2e <addr 192.168.100.1>]
Jul 22 16:34:34 send [IPCP ConfRej id=1 <dns1 0.0.0.0> <dns2 0.0.0.0>]
Jul 22 16:34:34 recv [CCP ConfReq id=1]
Jul 22 16:34:34 send [CCP ConfAck id=1]
Jul 22 16:34:34 recv [CCP ConfRej id=95]
Jul 22 16:34:34 send [CCP ConfReq id=96]
Jul 22 16:34:34 recv [IPCP ConfAck id=2e <addr 192.168.100.1>]
Jul 22 16:34:34 recv [IPCP ConfReq id=2 <addr 0.0.0.0>]
Jul 22 16:34:34 send [IPCP ConfNak id=2 <addr 192.168.0.0>]
Jul 22 16:34:34 recv [CCP ConfAck id=96]
Jul 22 16:34:34 recv [IPCP ConfReq id=3 <addr 192.168.0.0>]
Jul 22 16:34:34 send [IPCP ConfAck id=3]

still connected at:
Mon Jul 22 17:43:30 UTC 2019

ifname | username | ip | ip6 | ip6-dp | calling-sid | rate-limit | state | uptime | rx-bytes | tx-bytes
--------+----------+-------------+-----+--------+-------------------+------------+--------+----------+----------+----------
ppp0 | test | 192.168.0.0 | | | 08:00:27:03:71:c9 | | active | 01:23:55 | 70 B | 64 B

pasik added a subscriber: pasik.Jul 22 2019, 9:00 PM
hagbard changed the task status from Open to On hold.Jul 23 2019, 4:05 PM
mustard added a comment.EditedJul 24 2019, 8:46 AM

Hi @hagbard

Thanks for trying to reproduce the issue,

I have attached another filtered capture of an entire session, the PPPoE and a ping 8.8.8.8 that was running the whole time, you can see around time 440 to 500 (the end of the capture) that there are no ICMP PING requests they were failing to reach 8.8.8.8 but the PPP LCP echo still happens,

Can you recommend any system debug to enable that might provide more details?

mustard added a comment.EditedJul 24 2019, 11:55 AM

I enabled debug in /etc/ppp/options and all seems to be ok with the underlying PPPoE connection but I still loose the connection. I think the default route is being dropped some how?

When I establish a connection it has the default route

root@vyos:/home/vyos# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         *               0.0.0.0         U     0      0        0 pppoe5
10.10.0.0       *               255.255.0.0     U     0      0        0 eth1
nme-sot-dry-bra *               255.255.255.255 UH    0      0        0 pppoe5

but after a few minutes it's removed, the PPPoE seems ok but there's no route so no connectivity

root@vyos:/home/vyos# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.10.0.0       *               255.255.0.0     U     0      0        0 eth1
10.20.23.115    *               255.255.255.255 UH    0      0        0 pppoe5

I've tried different options for interfaces ethernet eth0 pppoe 5 default-route auto, none, force and they must ony get applied when the route PPPoE connection is established because they don't seem to make a difference

Hello @mustard . Can you provide logs when route deleted?
show log tail 50

mustard added a comment.EditedJul 24 2019, 12:23 PM

Hi @Dmitry

The following log lines are produced when the disconnect occurs

Jul 24 12:17:38 vyos dbus[926]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Jul 24 12:17:38 vyos dbus[926]: [system] Successfully activated service 'org.freedesktop.hostname1'

log from PPPoE connect to route lost was

Jul 24 12:16:14 vyos pppd[3583]: pppd 2.4.7 started by vyos, uid 0
Jul 24 12:16:14 vyos pppd[3583]: Connected to c0:8c:60:cd:76:d0 via interface eth0
Jul 24 12:16:14 vyos pppd[3583]: Connect: ppp0 <--> eth0
Jul 24 12:16:14 vyos pppd[3583]: PAP authentication succeeded
Jul 24 12:16:14 vyos pppd[3583]: peer from calling number C0:8C:60:CD:76:D0 authorized
Jul 24 12:16:15 vyos pppd[3583]: local  IP address 14.201.9.105
Jul 24 12:16:15 vyos pppd[3583]: remote IP address 10.20.22.97
Jul 24 12:16:15 vyos pppd[3583]: primary   DNS address 203.12.160.35
Jul 24 12:16:15 vyos pppd[3583]: secondary DNS address 203.12.160.36
Jul 24 12:17:38 vyos dbus[926]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Jul 24 12:17:38 vyos dbus[926]: [system] Successfully activated service 'org.freedesktop.hostname1'
hagbard added a comment.EditedJul 24 2019, 4:25 PM

@mustard Your ppp session looks just fine. The removal of your default route doesn't terminate the pppoe connection, at least there was no termination request in the dump. Your icmp echo request time out since you don't have a default gateway, so we need to find out why the default gateway gets removed. When you see the icmp timeouts for 8.8.8.8 your pppoe session is still alive and is answering the LCP requests successful, so I think we need to focus on frr and why it is removing the efault route.

Can you please provide the version of VyOS you running? (show version)
Also please check for the occurrence of:`/etc/ppp/peers/pppoe0: unrecognized option 'replacedefaultroute' in /var/log/messages.

hagbard added a comment.EditedJul 24 2019, 5:14 PM

Tested rolling from yesterday and 1.2.2, no issues.

set interfaces ethernet eth0 pppoe 0 default-route 'auto'
set interfaces ethernet eth0 pppoe 0 password 'test'
set interfaces ethernet eth0 pppoe 0 user-id 'test'
vyos@vyos# run show ip route 
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       F - PBR, f - OpenFabric,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 [0/0] is directly connected, pppoe0, 00:26:01
C>* 10.1.1.0/24 is directly connected, eth1, 00:27:28
C>* 192.168.100.1/32 is directly connected, pppoe0, 00:26:01

So, I need first a way to replicate your issue somehow.
I also have an icmp echo successfully running.

Hi @hagbard

Sorry about the delays, been a couple hectic days at work.

Yea I see the unrecognized option

Jul 24 12:02:07 vyos commit: Successful change to active configuration by user vyos on /dev/pts/3
Jul 24 12:03:37 vyos pppd[6697]: In file /etc/ppp/peers/pppoe5: unrecognized option 'replacedefaultroute'
Jul 24 12:05:25 vyos dbus[924]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Jul 24 12:05:25 vyos dbus[924]: [system] Successfully activated service 'org.freedesktop.hostname1'

Version

vyos@vyos:~$ show version
Version:          VyOS 1.2.0-rolling+201907210337
Built by:         autobuild@vyos.net
Built on:         Sun 21 Jul 2019 03:37 UTC
Build UUID:       d91aec58-9179-420c-b8fc-6323f4e07e8a
Build Commit ID:  8c22ceead487b7

Architecture:     x86_64
Boot via:         installed image
System type:      bare metal

Hardware vendor:  PC Engines
Hardware model:   apu2
Hardware S/N:     123456789
Hardware UUID:    Unknown

Copyright:        VyOS maintainers and contributors

I also had the same issue on 1.2.0-rolling+201905190337 back in May last time I tried

When I get a chance, hopefully this weekend I'll try and come up with a minimum configuration / setup that recreates the issue, I really don't want to waste your time and appreciate the help

Cheers
Dan

That error will disappear once https://github.com/vyos/ppp-upstream/pull/1 has been merged in. You can meanwhile try to remove the route option and see if it makes any difference.

hagbard removed hagbard as the assignee of this task.Jul 30 2019, 7:31 PM
hagbard moved this task from Need Triage to Backlog on the VyOS 1.2 Crux board.

Added the following to /etc/ppp/ip-down
echo "IP Down $1 $2 $3 $4 $5 $6" >> /tmp/foo.log

And I could see the following when the default route was being removed but the PPPoE link was still up / ok. I don't know why or what was invoking this ip-down

$ cat /tmp/foo.log
IP Down pppoe5 eth0 0 14.201.9.105 10.20.23.115 pppoe5

It always produced the following in /var/log/messages at the same time

Jul 24 12:17:38 vyos dbus[926]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service'
Jul 24 12:17:38 vyos dbus[926]: [system] Successfully activated service 'org.freedesktop.hostname1'

I have tried very hard to reproduce this in virtual machines outside of the router PCEngines APU2 that had been giving me issues, and was unable to.

This made me think there was something wrong with my configuration so I got a slightly newer rolling release add system image https://downloads.vyos.io/rolling/current/amd64/vyos-1.2.0-rolling%2B201907300337-amd64.iso (perhaps this version had a fix?)

And while adding this image I chose "No" to saving my configuration, then a rebuilt my configuration by running configuration commands.

Since then it's working fine! And for the life of me I can't find anything different in my configuration :-(

That previous configuration had been copied over from a 1.1.8 version so perhaps it had something in it that caused an issue?

I am admitting defeat, so can close this as unable to reproduce?

hagbard closed this task as Invalid.Aug 5 2019, 1:35 AM

Thanks for the info.

syncer edited projects, added Invalid; removed VyOS 1.2 Crux.Aug 31 2019, 2:42 AM