zephyr: PPP interface LCP negotiation failure

I tried raising this issue in the networking sub-group for zephyr on Discord but did not get a response, hence raising it here instead.

This is with an nRF5340DK board using nRFConnect SDK (2.5.0)/Zephyr 3.4.99ncs: we are tied to the nRFConnect Zephyr versions but I think this should be OK, PPP-wise, as the only change in the PPP code since Zephyr 3.4.99ncs is to remove the “experimental” flag.

To be clear, this issue is with PPP, not with the modem stuff; we are not using that, we are JUST using PPP to a UART, on the other side of which a peer PPP entity is running.

THE PROBLEM

PPP will not pass LCP negotiation with a PPP-peer that I know to be well behaved (a production device that passes LCP negotiation v LWIP’s PPP interface).

I’ve attached a full log with the debug output from Zephyr PPP but, in summary, the Zephyr-side PPP sends an LCP Configure-Request (c0 21 01 01 00 04) and the debug prints shows that the Zephyr-side PPP receives back an LCP Configure-Acknowledge for this from the device (ff 03 c0 21 02 01 00 04 1c 90), which looks well formed to me (given the Configure-Request), but this response is ignored and Zephyr-side PPP repeats the LCP Configure-Request. I know that Zephyr-side PPP really is receiving/processing packets (i.e. it is not some weird data-routing or FCS issue) because the device sends its LCP Configure-Request a little while later and Zephyr-side PPP responds to that with a correctly-formatted Configure-Reject.

I am building with the following KConfig flags:

CONFIG_NETWORKING=y
CONFIG_NET_DRIVERS=y
CONFIG_PPP_NET_IF_NO_AUTO_START=y
CONFIG_NET_PPP=y
CONFIG_NET_L2_PPP=y
CONFIG_NET_PPP_ASYNC_UART=y
CONFIG_UART_ASYNC_API=y
CONFIG_NET_PPP_UART_BUF_LEN=512
CONFIG_NET_PPP_ASYNC_UART_TX_BUF_LEN=512

…and my code is currently not much more than:

pNetIf = net_if_get_first_by_type(&NET_L2_GET_NAME(PPP));
if (pNetIf != NULL) {
    net_if_carrier_on(pNetIf);
    if (net_if_up(pNetIf) == 0) {
        ...

…after which I just wait for a long time (no events hooked in or the like),

About this issue

  • Original URL
  • State: closed
  • Created 7 months ago
  • Comments: 41 (19 by maintainers)

Most upvoted comments

I had assumed that net_if_up() returning [zero] meant that the network interface was up but it seems that’s not the case, I should be waiting for an event of some form?

Yes, you could wait NET_EVENT_IF_UP event if you want to know when the interface is up. The reason why this is not tied to the function return value is that the operation might take time and we do not want the function to block.

I managed to test the pap with Linux pppd and everything was working fine with upstream zephyr. Changes compared to my earlier instructions somewhere above:

diff --git a/loop-pppd.sh b/loop-pppd.sh
index 3d39169..6fae209 100755
--- a/loop-pppd.sh
+++ b/loop-pppd.sh
@@ -33,7 +33,7 @@ ctrl_c() {
 
 while [ $STOPPED -eq 0  ]; do
     pppd $PTS_DEV +ipv6 local \
-        noauth novj noccp noaccomp nodefaultroute nomp nodetach \
+        +pap novj noccp noaccomp nodefaultroute nodetach \
         debug silent 192.0.2.2:192.0.2.1
     sleep 2
 done

and then adding the secret to /etc/ppp/pap-secrets

*   *  blank  *

The Linux pppd then tells this:

Using interface ppp0
Connect: ppp0 <--> /dev/pts/3
rcvd [LCP ConfReq id=0x1]
sent [LCP ConfReq id=0x1 <asyncmap 0x0> <auth pap> <magic 0xb0dcb4ea> <pcomp>]
sent [LCP ConfAck id=0x1]
rcvd [LCP ConfRej id=0x1 <asyncmap 0x0> <magic 0xb0dcb4ea> <pcomp>]
sent [LCP ConfReq id=0x2 <auth pap>]
rcvd [LCP ConfAck id=0x2 <auth pap>]
sent [LCP EchoReq id=0x0 magic=0x0]
rcvd [PAP AuthReq id=0x1 user="blank" password=<hidden>]
sent [PAP AuthAck id=0x1 "Login ok"]
PAP peer authentication succeeded for blank
sent [IPCP ConfReq id=0x1 <addr 192.0.2.2>]
sent [IPV6CP ConfReq id=0x1 <addr fe80::c9af:1e91:ed32:8ff3>]
rcvd [LCP EchoRep id=0x0 magic=0x0]
rcvd [IPCP ConfReq id=0x1 <addr 0.0.0.0> <ms-dns1 0.0.0.0> <ms-dns2 0.0.0.0>]
sent [IPCP ConfRej id=0x1 <ms-dns1 0.0.0.0> <ms-dns2 0.0.0.0>]
rcvd [IPV6CP ConfReq id=0x1 <addr fe80::0000:5eff:fe00:53ca>]
sent [IPV6CP ConfAck id=0x1 <addr fe80::0000:5eff:fe00:53ca>]
rcvd [IPCP ConfAck id=0x1 <addr 192.0.2.2>]
rcvd [IPV6CP ConfAck id=0x1 <addr fe80::c9af:1e91:ed32:8ff3>]
local  LL address fe80::c9af:1e91:ed32:8ff3
remote LL address fe80::0000:5eff:fe00:53ca
Script /etc/ppp/ipv6-up started (pid 41424)
rcvd [IPCP ConfReq id=0x2 <addr 0.0.0.0>]
sent [IPCP ConfNak id=0x2 <addr 192.0.2.1>]
rcvd [IPCP ConfReq id=0x3 <addr 192.0.2.1>]
sent [IPCP ConfAck id=0x3 <addr 192.0.2.1>]
Script /etc/ppp/ip-pre-up started (pid 41427)
Script /etc/ppp/ip-pre-up finished (pid 41427), status = 0x0
local  IP address 192.0.2.2
remote IP address 192.0.2.1
Script /etc/ppp/ip-up started (pid 41433)
Script /etc/ppp/ipv6-up finished (pid 41424), status = 0x0
Script /etc/ppp/ip-up finished (pid 41433), status = 0x0
Modem hangup

If you are able to fix this PAP handshaking issue and submit it to upstream, then the fix will propagate itself to NCS too, I would not be too worried about it.

So it does seem likely that this is a timing thing: somehow the code is expecting a configure request to arrive from the modem in the same packet as the configure ack, whereas in this case it arrives in a separate packet.

No, there is no expectation like that. By default there should be 3sec timeout but for some reason the timeout happens almost immediately.

Looking at fsm.c, when it receives a Configure Acknowledge it does seem to call the FSM to send a Configure Request:

It might be that we are not in the proper state PPP_ACK_RECEIVED when we receive the ACK.

Also, if I’ve read your post correctly, the first thing that happens for you is rcvd [LCP ConfReq id=0x1], which is different to my case where it is Zephyr doing the first TX

In my test Zephyr sends the first packet too, the log is from the Linux (modem) side in this case and there pppd just waits data and the first packet is from zephyr LCP ConfReq

@RobMeades sorry for the delay, I have been busy with other stuff. The PPP worked in my setup, but next I need to figure out what I need to change to test your use case.

FYI, noticing IPV6CP above and no sign of anything saying IPV4 I tried adding CONFIG_NET_IPV4=y to the build but that did not change the pattern or the outcome.

For IPCP to work you indeed need to enable IPv4. The LCP happens before IPCP so missing IPCP is not affecting the issue you are seeing.