esp-idf: ESP32: Wifi/Network stack broken after beacon timeout (IDFGH-10357)
Answers checklist.
- I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
- I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
- I have searched the issue tracker for a similar issue and not found a similar issue.
IDF version.
v4.3.1
Operating System used.
macOS
How did you build your project?
Command line with idf.py
If you are using Windows, please specify command line type.
None
Development Kit.
Custom Board with ESP32DOWDQ6
Power Supply used.
External 3.3V
What is the expected behavior?
The ESP32 connects to wifi and stays connected normally until it is powered off or a call to esp_wifi_disconnect() is made.
What is the actual behavior?
Our ESP32s randomly lose their Wifi connection; it happens regularly in the field. The ESP32 operates correctly, then suddenly two error messages appear:
wifi:m f null
wifi:bcn_timout,ap_probe_send_start
The first message is hard to interpret, but the second suggests that a timeout happened and that the AP did not send a beacon within the expected interval. Normally, the ESP32 would actively probe the AP five times to “re-enable” the connection. However, in this case that does not happen, and the ESP stays in this state until a reboot is performed.
All network services stop working: no more web server, mDNS, UDP sockets, or AWS connection. The network is completely down. At first I thought it could be caused by a memory leak, but when it happens there is still plenty of free memory (60K left).
The problem is that the issue is rather random and not reproducible. It seems that using the “WIFI_PS_MIN_MODEM” power saving mode and connecting to Ubiquiti appliances trigger the problem more often.
Steps to reproduce.
That’s the problem, we cannot reproduce it. We see it happening a lot, and it happened yesterday on an ESP32 we were monitoring, where we just got the two messages above. It seems to be related to power saving mode and to Ubiquiti appliances.
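Because the problem seems tied to power saving, one way to narrow it down is to run some devices with power save disabled and compare. The sketch below assumes Wi-Fi has already been initialised and uses the ESP-IDF call `esp_wifi_set_ps()` with `WIFI_PS_NONE`; the helper name `diag_select_power_save` is hypothetical, and the `#else` branch contains host-side stand-ins (not the real ESP-IDF headers) so the logic can be compiled off-target.

```c
#include <stdbool.h>

#ifdef ESP_PLATFORM
#include "esp_err.h"
#include "esp_wifi.h"
#else
#include <assert.h>
/* Host-side stand-ins so the sketch compiles off-target.
 * These are NOT the real ESP-IDF definitions. */
typedef int esp_err_t;
#define ESP_OK 0
typedef enum { WIFI_PS_NONE, WIFI_PS_MIN_MODEM, WIFI_PS_MAX_MODEM } wifi_ps_type_t;
static wifi_ps_type_t host_ps_mode = WIFI_PS_MIN_MODEM;
static esp_err_t esp_wifi_set_ps(wifi_ps_type_t type)
{
    assert(type <= WIFI_PS_MAX_MODEM);
    host_ps_mode = type; /* remember the requested mode for inspection */
    return ESP_OK;
}
#endif

/* Hypothetical helper: choose the power-save mode for a comparison run.
 * disable_power_save = true  -> WIFI_PS_NONE (radio never sleeps)
 * disable_power_save = false -> WIFI_PS_MIN_MODEM (the mode used in this report) */
esp_err_t diag_select_power_save(bool disable_power_save)
{
    wifi_ps_type_t mode = disable_power_save ? WIFI_PS_NONE : WIFI_PS_MIN_MODEM;
    return esp_wifi_set_ps(mode);
}
```

Disabling power save raises current consumption, so this is meant as a diagnostic comparison rather than a fix.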
Debug Logs.
No response
More Information.
No response
About this issue
- State: closed
- Created a year ago
- Reactions: 1
- Comments: 56 (11 by maintainers)
Hi @FnxQT, there was a holiday a few days ago, so sorry for the late reply.
I looked at the captured packets and they seem to be an Ethernet capture; a wireless capture would be helpful 😃
And I think I have found the reason for this issue, which is a mistake in the Wi-Fi power-save state machine when handling a failure to allocate the null data frame. I have attached another lib below, and this one should work around the issue.
The `m f null` may still happen, which was because of the shortage of memory at that moment, while the blocked packets shall be relieved later since the memory would be recovered later. If you still find that there is no traffic after `m f null` for a long time, please feel free to provide us with the packet and log. Thanks! esp32_76263271.zip
Hi @Espressif-liuuuu,
Thank you for the follow-up. However, as suggested by @Alvin1Zhang, it would be great to see the fix backported to v4.4 and v4.3 since they are still under maintenance. Our products currently use v4.4.4 and we can’t easily make the jump to v5.1…
What do you think about that?
Hey @Espressif-liuuuu, I just returned from days off. Before going, I added the esp_get_free_internal_heap_size print where you told me to.
However, during the previous week, the problem did not trigger again. I will keep you updated.
Yes, that's so weird. Could you please add a check at the line below for when `heap_caps_malloc` returns null, and log the internal memory with `esp_get_free_internal_heap_size`? Let's make sure the memory is sufficient at that moment.
https://github.com/espressif/esp-idf/blob/8b94183c9cb47ede8f02df5598d8b9d68c754860/components/esp_wifi/esp32/esp_adapter.c#L544C4-L544C4
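The requested check could look roughly like the sketch below. It is not the actual code in `esp_adapter.c`: the wrapper name `diag_malloc_logged` and the log tag are hypothetical, and the capability flags are left as a parameter because the flags used at that line are not quoted in this thread. `heap_caps_malloc()`, `esp_get_free_internal_heap_size()` and `ESP_LOGE` are the ESP-IDF calls; the `#else` branch only provides host-side stand-ins with placeholder values so the logic can be exercised off-target.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_heap_caps.h"
#include "esp_log.h"
#include "esp_system.h"
#else
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
/* Host-side stand-ins, NOT the real ESP-IDF definitions (placeholder values). */
#define MALLOC_CAP_8BIT     (1u << 2)
#define MALLOC_CAP_INTERNAL (1u << 11)
static int host_fail_next_alloc = 0; /* set to 1 to simulate one failed allocation */
static unsigned host_log_count = 0;  /* number of error lines emitted */
static void *heap_caps_malloc(size_t size, uint32_t caps)
{
    (void)caps;
    assert(size > 0);
    if (host_fail_next_alloc) {
        host_fail_next_alloc = 0;
        return NULL;
    }
    return malloc(size);
}
static uint32_t esp_get_free_internal_heap_size(void)
{
    return 60u * 1024u; /* pretend 60K is free, as in the report */
}
#define ESP_LOGE(tag, fmt, ...) \
    do { host_log_count++; fprintf(stderr, "E (%s) " fmt "\n", tag, __VA_ARGS__); } while (0)
#endif

static const char *TAG = "alloc_diag"; /* hypothetical log tag */

/* Allocate with the given capabilities; on failure, log how much
 * internal heap was free at that moment, then return NULL as usual. */
void *diag_malloc_logged(size_t size, uint32_t caps)
{
    void *ptr = heap_caps_malloc(size, caps);
    if (ptr == NULL) {
        ESP_LOGE(TAG, "alloc of %u bytes failed, free internal heap: %u bytes",
                 (unsigned)size, (unsigned)esp_get_free_internal_heap_size());
    }
    return ptr;
}
```

If the logged free size is large while the allocation still fails, fragmentation or the requested capabilities would be more likely causes than an overall shortage of memory.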
Hi @FnxQT, sorry for the late reply! This one is a little bit difficult to debug, so let's make sure of two things first on v4.4.4:
- `m f null` happened
- `beacon timeout` or `wifi disconnect`, but it failed to see STA in AP's list

And from the log you provided, it showed that the hardware was still working, meanwhile, only a little traffic seen. We may need a bit more information for it:
- `esp_wifi_set_ps(0)`

Hi @FnxQT, we are analyzing, and if there is any progress, we will inform you as soon as possible.