zephyr: ESP32 wifi sample with MQTT over TLS crashes
Describe the bug
I’ve created a custom sample which is combination of samples/net/wifi and samples/net/mqtt_publisher. I’m developing on ESP32-WROOM-32E (ESP32 devkitc v4). I’ve created a helper shell command for using MQTT commands (connect/disconnect/publish/subscribe…). MQTT connection is using TLS.
Randomly after sending shell commands I get fatal error.
Sometimes I also get error below, but not always so I don’t think it’s relevant but listing it here as additional info.
[00:04:42.993,000] <err> net_pkt: Data buffer (1203) allocation failed.
[00:04:42.993,000] <err> net_tcp: conn: 0x3fffcb24 packet allocation failed, len=1155
[00:04:42.993,000] <err> net_tcp: TCP failed to allocate buffer in retransmission
I tried increasing various thread stack sizes (shell, workqueue) but to no avail.
To Reproduce Steps to reproduce the behavior:
- west build -b esp32
- west flash --esp-device /dev/ttyUSB0 && west espressif monitor -p /dev/ttyUSB0
- Connect to AP and MQTT broker, publish subscribe, disconnect, wifi scan
- Error happens randomly
Expected behavior I don’t expect to see any errors.
Impact Showstopper
Logs and console output
shell commands and logs
uart:~$ wifi connect "*****" *****
Connection requested
[00:04:24.983,000] <dbg> net_wifi_mgmt: wifi_connect: (shell_uart): ch 255 sec 1
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_start_internal: (esp_event): iface 0x3ffb1c68 state=init
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_start_internal: (esp_event): wait timeout=5s
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (esp_event): sched timeout dhcvp4=0x3ffb1d08 timeout=5s
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=init timeleft=5
[00:04:26.971,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 5s
[00:04:31.972,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=init timeleft=0
[00:04:31.972,000] <dbg> net_dhcpv4: dhcpv4_enter_selecting: (sysworkq): enter state=selecting
[00:04:31.973,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 291
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (sysworkq): sched timeout dhcvp4=0x3ffb1d08 timeout=4s
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_send_discover: (sysworkq): send discover xid=0xdfe0571a timeout=4s
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 4s
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=selecting timeleft=4
[00:04:31.973,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 4s
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): Received dhcp msg [op=0x2 htype=0x1 hlen=6 xid=0xdfe0571a secs=0 flags=0x80 chaddr=94:3C:C6:CD:D7:60
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): ciaddr=0.0.0.0
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): yiaddr=192.168.43.97
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): siaddr=192.168.43.1
[00:04:32.153,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): giaddr=0.0.0.0]
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_server_id: 192.168.43.1
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_lease_time: 3600
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_renewal: 1800
[00:04:32.153,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_rebinding: 3150
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_subnet_mask 255.255.255.0
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 28
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_router: 192.168.43.1
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 43
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_end
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_handle_reply: (rx_q[0]): state=selecting msg=offer
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_enter_requesting: (rx_q[0]): enter state=requesting
[00:04:32.154,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (rx_q[0]): sched timeout dhcvp4=0x3ffb1d08 timeout=4s
[00:04:32.155,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 298
[00:04:32.155,000] <dbg> net_dhcpv4: dhcpv4_send_request: (rx_q[0]): send request dst=255.255.255.255 xid=0xdfe0571b ciaddr=<unknown> +server-id +requested-ip timeout=4s
[00:04:32.155,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=requesting timeleft=4
[00:04:32.155,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 4s
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): Received dhcp msg [op=0x2 htype=0x1 hlen=6 xid=0xdfe0571b secs=0 flags=0x80 chaddr=94:3C:C6:CD:D7:60
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): ciaddr=0.0.0.0
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): yiaddr=192.168.43.97
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): siaddr=192.168.43.1
[00:04:32.358,000] <dbg> net_dhcpv4: net_dhcpv4_input: (rx_q[0]): giaddr=0.0.0.0]
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_server_id: 192.168.43.1
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_lease_time: 3600
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_renewal: 1800
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_rebinding: 3150
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_subnet_mask 255.255.255.0
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 28
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_router: 192.168.43.1
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): option unknown: 43
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_parse_options: (rx_q[0]): options_end
[00:04:32.358,000] <dbg> net_dhcpv4: dhcpv4_handle_reply: (rx_q[0]): state=requesting msg=ack
[00:04:32.358,000] <inf> net_dhcpv4: Received: 192.168.43.97
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_enter_bound: (rx_q[0]): enter state=bound renewal=1800s rebinding=3150s
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_set_timeout: (rx_q[0]): sched timeout dhcvp4=0x3ffb1d08 timeout=1800s
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_manage_timers: (sysworkq): iface 0x3ffb1c68 dhcpv4=0x3ffb1d08 state=bound timeleft=1800
[00:04:32.359,000] <dbg> net_dhcpv4: dhcpv4_timeout: (sysworkq): Waiting for 1800s
[00:04:32.359,000] <inf> wifi_sample: Connected
Connected
uart:~$ mqtt connect
[00:04:41.923,000] <inf> mqtt_service: Resolving: ****
[00:04:41.927,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 42
[00:04:41.964,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 77
[00:04:42.183,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb424 len 42
[00:04:42.390,000] <inf> mqtt_service: IPv4 Address found ****
[00:04:42.395,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 58
[00:04:42.459,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.467,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 173
[00:04:42.539,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.583,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.623,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:42.630,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 942
[00:04:42.993,000] <err> net_pkt: Data buffer (1203) allocation failed.
[00:04:42.993,000] <err> net_tcp: conn: 0x3fffcb24 packet allocation failed, len=1155
[00:04:42.993,000] <err> net_tcp: TCP failed to allocate buffer in retransmission
[00:04:43.264,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 321
[00:04:43.606,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 323
[00:04:43.824,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 145
[00:04:44.030,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:44.040,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 155
[00:04:44.041,000] <inf> net_mqtt: Connect completed
[00:04:44.078,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:44.114,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:04:44.119,000] <inf> wifi_sample: MQTT Client connected!
uart:~$ kernel stacks
0x3ffb9120 mqtt_poll_thread (real size 4096): unused 1600 usage 2496 / 4096 (60 %)
0x3ffb95c0 rx_q[0] (real size 2048): unused 480 usage 1568 / 2048 (76 %)
0x3ffb9408 wifi (real size 3584): unused 1692 usage 1892 / 3584 (52 %)
0x3ffb9508 net_mgmt (real size 2048): unused 908 usage 1140 / 2048 (55 %)
0x3ffb9698 tcp_work (real size 1024): unused 364 usage 660 / 1024 (64 %)
0x3ffb9848 esp_event (real size 4096): unused 3168 usage 928 / 4096 (22 %)
0x3ffb9358 esp_timer (real size 4096): unused 3516 usage 580 / 4096 (14 %)
0x3ffb9a48 sysworkq (real size 4096): unused 3088 usage 1008 / 4096 (24 %)
0x3ffb92b8 shell_uart (real size 4096): unused 1600 usage 2496 / 4096 (60 %)
0x3ffb91c0 logging (real size 1024): unused 588 usage 436 / 1024 (42 %)
0x3ffb98e8 idle (real size 1024): unused 828 usage 196 / 1024 (19 %)
0x3ffeaa30 IRQ 00 (real size 2048): unused 1760 usage 288 / 2048 (14 %)
uart:~$ net conn
Context Iface Flags Local Remote
[ 1] 0x3ffbab10 1 4DU 0.0.0.0:36255 0.0.0.0:0
[ 2] 0x3ffbab9c 1 4ST 0.0.0.0:63266 x.x.x.x:8883
TCP Context Src port Dst port Send-Seq Send-Ack MSS State
0x3fffcb24 0x3ffbab9c 63266 8883 1313259406 1152228128 1460 ESTABLISHED
Set CONFIG_NET_TCP_LOG_LEVEL_DBG to enable TCP debugging support.
[00:05:44.046,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:05:44.240,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:06:44.053,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:06:44.247,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:07:44.059,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:07:44.253,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:08:44.066,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:08:44.272,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3e4 len 54
[00:09:44.072,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb364 len 123
[00:09:44.324,000] <dbg> esp32_wifi: esp32_wifi_send: pkt sent 0x3fffb3a4 len 123
[00:09:44.328,000] <err> os: ** FATAL EXCEPTION
[00:09:44.328,000] <err> os: ** CPU 0 EXCCAUSE 28 (load prohibited)
[00:09:44.328,000] <err> os: ** PC 0x40085614 VADDR 0x8c
0x40085614: z_swap at /home/marin/zephyrproject/zephyr/kernel/include/kswap.h:170
[00:09:44.328,000] <err> os: ** PS 0x60720
[00:09:44.328,000] <err> os: ** (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:7 CALLINC:2)
[00:09:44.328,000] <wrn> cbprintf_package: (unsigned) char * used for %p argument. It's recommended to cast it to void * because it may cause misbehavior in certain configurations. String:" ** A0 %p SP %p A2 %p A3 %p" argument:1
[00:09:44.328,000] <err> os: ** A0 0x80085900 SP 0x3ffe95a0 A2 (nil) A3 0x60720
[00:09:44.328,000] <err> os: ** A4 0x3ffb1a70 A5 (nil) A6 0x60b20 A7 0x3fffa6d2
[00:09:44.328,000] <err> os: ** A8 0x8008560c A9 0x3ffe9590 A10 0x3ffe81cc A11 0x3ffb9650
[00:09:44.328,000] <err> os: ** A12 (nil) A13 (nil) A14 0x3ffbb0bc A15 0x400f65b4
0x400f65b4: tcp_recv at /home/marin/zephyrproject/zephyr/subsys/net/ip/tcp.c:1568
[00:09:44.328,000] <err> os: ** LBEG 0x400ef10c LEND 0x400ef14c LCOUNT 0x8
0x400ef10c: net_context_is_used at /home/marin/zephyrproject/zephyr/include/zephyr/net/net_context.h:353
(inlined by) find_context at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:830
(inlined by) net_context_packet_received at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:1901
0x400ef14c: find_context at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:829
(inlined by) net_context_packet_received at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_context.c:1901
[00:09:44.328,000] <err> os: ** SAR 0x4
Ba0x40085611:0x30x400858fd:0x30x40084f84:0x30x400f07a5:0x30x400d90ef:0x3
0x40085611: arch_irq_unlock at /home/marin/zephyrproject/zephyr/include/zephyr/arch/xtensa/irq.h:125
(inlined by) do_swap at /home/marin/zephyrproject/zephyr/kernel/include/kswap.h:154
(inlined by) z_swap at /home/marin/zephyrproject/zephyr/kernel/include/kswap.h:169
0x400858fd: z_pend_curr at /home/marin/zephyrproject/zephyr/kernel/sched.c:872
0x40084f84: z_impl_k_queue_get at /home/marin/zephyrproject/zephyr/kernel/queue.c:343
0x400f07a5: k_queue_get at /home/marin/zephyrproject/zephyr/samples/net/wifi/build/zephyr/include/generated/syscalls/kernel.h:659
(inlined by) tc_rx_handler at /home/marin/zephyrproject/zephyr/subsys/net/ip/net_tc.c:245
0x400d90ef: z_thread_entry at /home/marin/zephyrproject/zephyr/lib/os/thread_entry.c:36
[00:09:44.329,000] <err> os: >>> ZEPHYR FATAL ERROR 0: CPU exception on CPU 0
[00:09:44.329,000] <err> os: Current thread: 0x3ffb95c0 (rx_q[0])
[00:09:44.513,000] <err> os: Halting system
Environment (please complete the following information):
- OS: Linux
- Toolchain: 0.15.1
- Commit SHA: 4939463dc2
Configurations
boards/esp32.conf
CONFIG_WIFI=y
CONFIG_HEAP_MEM_POOL_SIZE=8192
CONFIG_NETWORKING=y
CONFIG_NET_L2_ETHERNET=y
CONFIG_NET_IPV6=n
CONFIG_NET_IPV4=y
CONFIG_NET_DHCPV4=y
CONFIG_NET_LOG=y
CONFIG_ESP32_WIFI_STA_AUTO_DHCPV4=y
CONFIG_MQTT_SERVICE=y
prj.conf
CONFIG_EARLY_CONSOLE=y
CONFIG_LOG_PROCESS_THREAD_SLEEP_MS=100
CONFIG_LOG_BUFFER_SIZE=8192
CONFIG_NETWORKING=y
CONFIG_TEST_RANDOM_GENERATOR=y
CONFIG_NET_TX_STACK_SIZE=2048
CONFIG_NET_RX_STACK_SIZE=2048
CONFIG_NET_PKT_RX_COUNT=10
CONFIG_NET_PKT_TX_COUNT=10
CONFIG_NET_BUF_RX_COUNT=20
CONFIG_NET_BUF_TX_COUNT=20
CONFIG_NET_MAX_CONTEXTS=10
CONFIG_NET_DHCPV4=y
CONFIG_DNS_RESOLVER=y
CONFIG_NET_IPV4=y
CONFIG_NET_IPV6=n
CONFIG_NET_TCP=y
CONFIG_NET_LOG=y
CONFIG_INIT_STACKS=y
CONFIG_NET_SHELL=y
CONFIG_NET_STATISTICS=y
CONFIG_NET_STATISTICS_PERIODIC_OUTPUT=n
CONFIG_WIFI=y
CONFIG_NET_L2_WIFI_SHELL=y
CONFIG_WIFI_LOG_LEVEL_DBG=y
#CONFIG_NET_L2_ETHERNET_LOG_LEVEL_DBG=y
#CONFIG_NET_PKT_LOG_LEVEL_DBG=y
#CONFIG_NET_IPV4_LOG_LEVEL_DBG=y
CONFIG_NET_DHCPV4_LOG_LEVEL_DBG=y
CONFIG_NET_L2_WIFI_MGMT_LOG_LEVEL_DBG=y
CONFIG_HWINFO=y
CONFIG_MINIMAL_LIBC_MALLOC_ARENA_SIZE=8192
CONFIG_SYSTEM_WORKQUEUE_STACK_SIZE=4096
CONFIG_SHELL_STACK_SIZE=4096
# MQTT TLS
CONFIG_MAIN_STACK_SIZE=4096
CONFIG_MQTT_LIB_TLS=y
CONFIG_NET_SOCKETS_SOCKOPT_TLS=y
# TLS configuration
CONFIG_MBEDTLS=y
CONFIG_MBEDTLS_BUILTIN=y
CONFIG_MBEDTLS_ENABLE_HEAP=y
CONFIG_MBEDTLS_HEAP_SIZE=70000
CONFIG_MBEDTLS_SSL_MAX_CONTENT_LEN=5120
CONFIG_MQTT_SERVICE_SERVER_PORT=8883
CONFIG_MQTT_SERVICE_LOG_LEVEL_DBG=y
CONFIG_NET_MGMT_EVENT_STACK_SIZE=2048
CONFIG_NET_MGMT_EVENT_QUEUE_SIZE=20
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 18 (7 by maintainers)
@marinjurjevic, great, thanks. Yes, you have to increase that
tcp_workstack (CONFIG_NET_TCP_WORKQ_STACK_SIZE=4096).I was not able to get anymore crashes. Will submit this.