zephyr: ESP32 wifi sample with MQTT over TLS crashes

Describe the bug I’ve created a custom sample which is combination of samples/net/wifi and samples/net/mqtt_publisher. I’m developing on ESP32-WROOM-32E (ESP32 devkitc v4). I’ve created a helper shell command for using MQTT commands (connect/disconnect/publish/subscribe…). MQTT connection is using TLS.

Randomly after sending shell commands I get fatal error.

Sometimes I also get error below, but not always so I don’t think it’s relevant but listing it here as additional info.

[00:04:42.993,000] <err> net_pkt: Data buffer (1203) allocation failed.
[00:04:42.993,000] <err> net_tcp: conn: 0x3fffcb24 packet allocation failed, len=1155
[00:04:42.993,000] <err> net_tcp: TCP failed to allocate buffer in retransmission

I tried increasing various thread stack sizes (shell, workqueue) but to no avail.

To Reproduce Steps to reproduce the behavior:

  1. west build -b esp32
  2. west flash --esp-device /dev/ttyUSB0 && west espressif monitor -p /dev/ttyUSB0
  3. Connect to AP and MQTT broker, publish subscribe, disconnect, wifi scan
  4. Error happens randomly

Expected behavior I don’t expect to see any errors.

Impact Showstopper

Logs and console output

shell commands and logs

uart:~$ wifi connect "*****" *****
Connection requested
[00:04:24.983,000] <dbg> net_wifi_mgmt: wifi_connect: (shell_uart): ch 255 sec 1
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_start_internal: (esp_event): iface 0x3ffb1c68 state=init
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_start_internal: (esp_event): wait timeout=5s
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (esp_event): sched timeout dhcvp4=0x3ffb1d08 timeout=5s
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=init timeleft=5
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 5s
[00:04:31.972,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=init timeleft=0
[00:04:31.972,000] <dbg> net_dhcpv4: dhcpv4_enter_selecting: (sysworkq): enter state=selecting
[00:04:31.973,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 291
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (sysworkq): sched timeout dhcvp4=0x3ffb1d08 timeout=4s
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_send_discover: (sysworkq): send discover xid=0xdfe0571a timeout=4s
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 4s
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=selecting timeleft=4
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 4s
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): Received dhcp msg [op=0x2 htype=0x1 hlen=6 xid=0xdfe0571a secs=0 flags=0x80 chaddr=94:3C:C6:CD:D7:60
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   ciaddr=0.0.0.0
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   yiaddr=192.168.43.97
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   siaddr=192.168.43.1
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   giaddr=0.0.0.0]
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_server_id: 192.168.43.1
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_lease_time: 3600
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_renewal: 1800
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_rebinding: 3150
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_subnet_mask 255.255.255.0
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 28
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_router: 192.168.43.1
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 43
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_end
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_handle_reply: (rx_q[0]): state=selecting msg=offer
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_enter_requesting: (rx_q[0]): enter state=requesting
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (rx_q[0]): sched timeout dhcvp4=0x3ffb1d08 timeout=4s
[00:04:32.155,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 298
[00:04:32.155,000] <dbg> net_dhcpv4: dhcpv4_send_request: (rx_q[0]): send request dst=255.255.255.255 xid=0xdfe0571b ciaddr=<unknown> +server-id +requested-ip timeout=4s
[00:04:32.155,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=requesting timeleft=4
[00:04:32.155,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 4s
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): Received dhcp msg [op=0x2 htype=0x1 hlen=6 xid=0xdfe0571b secs=0 flags=0x80 chaddr=94:3C:C6:CD:D7:60
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   ciaddr=0.0.0.0
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   yiaddr=192.168.43.97
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   siaddr=192.168.43.1
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]):   giaddr=0.0.0.0]
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_server_id: 192.168.43.1
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_lease_time: 3600
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_renewal: 1800
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_rebinding: 3150
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_subnet_mask 255.255.255.0
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 28
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_router: 192.168.43.1
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 43
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_end
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_handle_reply: (rx_q[0]): state=requesting msg=ack
[00:04:32.358,000] <inf> net_dhcpv4: Received: 192.168.43.97
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_enter_bound: (rx_q[0]): enter state=bound renewal=1800s rebinding=3150s
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (rx_q[0]): sched timeout dhcvp4=0x3ffb1d08 timeout=1800s
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=bound timeleft=1800
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 1800s
[00:04:32.359,000] <inf> wifi_sample: Connected
Connected
uart:~$ mqtt connect
[00:04:41.923,000] <inf> mqtt_service: Resolving: ****
[00:04:41.927,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 42
[00:04:41.964,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 77
[00:04:42.183,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 42
[00:04:42.390,000] <inf> mqtt_service: IPv4 Address found ****
[00:04:42.395,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 58
[00:04:42.459,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.467,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 173
[00:04:42.539,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.583,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.623,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.630,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 942
[00:04:42.993,000] <err> net_pkt: Data buffer (1203) allocation failed.
[00:04:42.993,000] <err> net_tcp: conn: 0x3fffcb24 packet allocation failed, len=1155
[00:04:42.993,000] <err> net_tcp: TCP failed to allocate buffer in retransmission
[00:04:43.264,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 321
[00:04:43.606,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 323
[00:04:43.824,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 145
[00:04:44.030,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:44.040,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 155
[00:04:44.041,000] <inf> net_mqtt: Connect completed
[00:04:44.078,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:44.114,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:44.119,000] <inf> wifi_sample: MQTT Client connected!
uart:~$ kernel stacks
0x3ffb9120 mqtt_poll_thread                 (real size 4096):   unused 1600     usage 2496 / 4096 (60 %)
0x3ffb95c0 rx_q[0]                          (real size 2048):   unused  480     usage 1568 / 2048 (76 %)
0x3ffb9408 wifi                             (real size 3584):   unused 1692     usage 1892 / 3584 (52 %)
0x3ffb9508 net_mgmt                         (real size 2048):   unused  908     usage 1140 / 2048 (55 %)
0x3ffb9698 tcp_work                         (real size 1024):   unused  364     usage  660 / 1024 (64 %)
0x3ffb9848 esp_event                        (real size 4096):   unused 3168     usage  928 / 4096 (22 %)
0x3ffb9358 esp_timer                        (real size 4096):   unused 3516     usage  580 / 4096 (14 %)
0x3ffb9a48 sysworkq                         (real size 4096):   unused 3088     usage 1008 / 4096 (24 %)
0x3ffb92b8 shell_uart                       (real size 4096):   unused 1600     usage 2496 / 4096 (60 %)
0x3ffb91c0 logging                          (real size 1024):   unused  588     usage  436 / 1024 (42 %)
0x3ffb98e8 idle                             (real size 1024):   unused  828     usage  196 / 1024 (19 %)
0x3ffeaa30 IRQ 00                           (real size 2048):   unused 1760     usage  288 / 2048 (14 %)
uart:~$ net conn
     Context    Iface  Flags            Local             Remote
[ 1] 0x3ffbab10 1      4DU      0.0.0.0:36255          0.0.0.0:0
[ 2] 0x3ffbab9c 1      4ST      0.0.0.0:63266   x.x.x.x:8883

TCP        Context   Src port Dst port   Send-Seq   Send-Ack  MSS    State
0x3fffcb24 0x3ffbab9c   63266     8883 1313259406 1152228128  1460   ESTABLISHED
Set CONFIG_NET_TCP_LOG_LEVEL_DBG to enable TCP debugging support.
[00:05:44.046,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:05:44.240,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:06:44.053,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:06:44.247,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:07:44.059,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:07:44.253,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:08:44.066,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:08:44.272,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:09:44.072,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:09:44.324,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 123
[00:09:44.328,000] <err> os:  ** FATAL EXCEPTION
[00:09:44.328,000] <err> os:  ** CPU 0 EXCCAUSE 28 (load prohibited)
[00:09:44.328,000] <err> os:  **  PC 0x40085614 VADDR 0x8c
0x40085614: z_swap at /home/marin/zephyrproject/zephyr/kernel/include/kswap.h:170

[00:09:44.328,000] <err> os:  **  PS 0x60720
[00:09:44.328,000] <err> os:  **    (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:7 CALLINC:2)
[00:09:44.328,000] <wrn> cbprintf_package: (unsigned) char * used for %p argument. It's recommended to cast it to void * because it may cause misbehavior in certain configurations. String:" **  A0 %p  SP %p  A2 %p  A3 %p" argument:1
[00:09:44.328,000] <err> os:  **  A0 0x80085900  SP 0x3ffe95a0  A2 (nil)  A3 0x60720
[00:09:44.328,000] <err> os:  **  A4 0x3ffb1a70  A5 (nil)  A6 0x60b20  A7 0x3fffa6d2
[00:09:44.328,000] <err> os:  **  A8 0x8008560c  A9 0x3ffe9590 A10 0x3ffe81cc A11 0x3ffb9650
[00:09:44.328,000] <err> os:  ** A12 (nil) A13 (nil) A14 0x3ffbb0bc A15 0x400f65b4
0x400f65b4: tcp_recv at /home/marin/zephyrproject/zephyr/subsys/net/ip/tcp.c:1568

[00:09:44.328,000] <err> os:  ** LBEG 0x400ef10c LEND 0x400ef14c LCOUNT 0x8
0x400ef10c: net_context_is_used at /home/marin/zephyrproject/zephyr/include/zephyr/net/net_context.h:353
 (inlined by) find_context at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:830
 (inlined by) net_context_packet_received at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:1901

0x400ef14c: find_context at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:829
 (inlined by) net_context_packet_received at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:1901

[00:09:44.328,000] <err> os:  ** SAR 0x4


Ba0x40085611:0x30x400858fd:0x30x40084f84:0x30x400f07a5:0x30x400d90ef:0x3
0x40085611: arch_irq_unlock at /home/marin/zephyrproject/zephyr/include/zephyr/arch/xtensa/irq.h:125
 (inlined by) do_swap at /home/marin/zephyrproject/zephyr/kernel/include/kswap.h:154
 (inlined by) z_swap at /home/marin/zephyrproject/zephyr/kernel/include/kswap.h:169

0x400858fd: z_pend_curr at /home/marin/zephyrproject/zephyr/kernel/sched.c:872

0x40084f84: z_impl_k_queue_get at /home/marin/zephyrproject/zephyr/kernel/queue.c:343

0x400f07a5: k_queue_get at /home/marin/zephyrproject/zephyr/samples/net/wifi/build/zephyr/include/generated/syscalls/kernel.h:659
 (inlined by) tc_rx_handler at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_tc.c:245

0x400d90ef: z_thread_entry at /home/marin/zephyrproject/zephyr/lib/os/thread_entry.c:36


[00:09:44.329,000] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:09:44.329,000] <err> os: Current thread: 0x3ffb95c0 (rx_q[0])
[00:09:44.513,000] <err> os: Halting system

Environment (please complete the following information):

  • OS: Linux
  • Toolchain: 0.15.1
  • Commit SHA: 4939463dc2

Configurations

boards/esp32.conf

CONFIG_WIFI=y
CONFIG_HEAP_MEM_POOL_SIZE=8192

CONFIG_NETWORKING=y
CONFIG_NET_L2_ETHERNET=y

CONFIG_NET_IPV6=n
CONFIG_NET_IPV4=y
CONFIG_NET_DHCPV4=y

CONFIG_NET_LOG=y

CONFIG_ESP32_WIFI_STA_AUTO_DHCPV4=y

CONFIG_MQTT_SERVICE=y

prj.conf

CONFIG_EARLY_CONSOLE=y

CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=100
CONFIG_LOG_BUFFER_SIZE=8192

CONFIG_NETWORKING=y
CONFIG_TEST_RANDOM_GENERATOR=y

CONFIG_NET_TX_STACK_SIZE=2048
CONFIG_NET_RX_STACK_SIZE=2048

CONFIG_NET_PKT_RX_COUNT=10
CONFIG_NET_PKT_TX_COUNT=10
CONFIG_NET_BUF_RX_COUNT=20
CONFIG_NET_BUF_TX_COUNT=20
CONFIG_NET_MAX_CONTEXTS=10
CONFIG_NET_DHCPV4=y

CONFIG_DNS_RESOLVER=y

CONFIG_NET_IPV4=y
CONFIG_NET_IPV6=n

CONFIG_NET_TCP=y

CONFIG_NET_LOG=y
CONFIG_INIT_STACKS=y

CONFIG_NET_SHELL=y

CONFIG_NET_STATISTICS=y
CONFIG_NET_STATISTICS_PERIODIC_OUTPUT=n

CONFIG_WIFI=y
CONFIG_NET_L2_WIFI_SHELL=y

CONFIG_WIFI_LOG_LEVEL_DBG=y
#CONFIG_NET_L2_ETHERNET_LOG_LEVEL_DBG=y
#CONFIG_NET_PKT_LOG_LEVEL_DBG=y
#CONFIG_NET_IPV4_LOG_LEVEL_DBG=y
CONFIG_NET_DHCPV4_LOG_LEVEL_DBG=y
CONFIG_NET_L2_WIFI_MGMT_LOG_LEVEL_DBG=y

CONFIG_HWINFO=y

CONFIG_MINIMAL_LIBC_MALLOC_ARENA_SIZE=8192

CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096
CONFIG_SHELL_STACK_SIZE=4096

# MQTT TLS
CONFIG_MAIN_STACK_SIZE=4096
CONFIG_MQTT_LIB_TLS=y
CONFIG_NET_SOCKETS_SOCKOPT_TLS=y

# TLS configuration
CONFIG_MBEDTLS=y
CONFIG_MBEDTLS_BUILTIN=y
CONFIG_MBEDTLS_ENABLE_HEAP=y
CONFIG_MBEDTLS_HEAP_SIZE=70000
CONFIG_MBEDTLS_SSL_MAX_CONTENT_LEN=5120

CONFIG_MQTT_SERVICE_SERVER_PORT=8883
CONFIG_MQTT_SERVICE_LOG_LEVEL_DBG=y

CONFIG_NET_MGMT_EVENT_STACK_SIZE=2048
CONFIG_NET_MGMT_EVENT_QUEUE_SIZE=20

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 18 (7 by maintainers)

Most upvoted comments

@marinjurjevic, great, thanks. Yes, you have to increase that tcp_work stack (CONFIG_NET_TCP_WORKQ_STACK_SIZE=4096).
I was not able to get anymore crashes. Will submit this.