esp-idf: Wifi roaming does not work (802.11k 802.11v 802.11r or client-side roaming) in STA mode (IDFGH-10713)

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

I’m trying to get the ESP32-S2 to keep as stable a wifi connection as possible. That means periodically monitoring the network and roaming to a neighboring AP when necessary (e.g. when the current AP's RSSI drops below a certain threshold).

I thought I would use the roaming example, but ESP32 just refuses to roam:

00:03:29.421|  +WIFI-ROAM: RSSI threshold hit (-81dBm), will try to look for better AP
00:03:29.530|  +WIFI-RRM: Sending neighbor report request
00:03:29.534|  I (13122) wpa: action frame sent
00:03:29.543|  +WIFI-RRM: Received neighbor report [LEN=31]
00:03:29.552|  0000 | 01 34 0D D0 21 F9 5D 39  36 8F 08 00 00 51 06 07 |.4..!.]9 6....Q..|
00:03:29.560|  0010 | 34 0D 78 45 58 9F 27 D1  8F 58 00 00 51 0B 0E    |4.xEX.'. .X..Q.. |
00:03:29.561|  
00:03:29.569|  +WLAN-RMM: Neigbor report bssid=d0:21:f9:5d:39:36 info=0x88f op_class=81 chan=6 phy_type=7
00:03:29.580|  +WLAN-RMM: Neigbor report bssid=78:45:58:9f:27:d1 info=0x588f op_class=81 chan=11 phy_type=14
00:03:29.582|  I (13172) wpa: action frame sent
00:03:29.590|  +WIFI-RRM: Sending transition mgmt query [RES=0]
00:03:29.595|  I (13182) wpa: scan issued at time=169747532768
00:03:29.714|  I (13302) wpa: scan done received
00:03:29.718|  I (13302) wpa: action frame sent
~~ after this nothing relevant happens, no roaming or WLAN-related events... ~~
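For anyone trying to read the hex dump above by hand, here is a minimal sketch of a decoder for it. It assumes the first byte of the buffer is a one-byte prefix (probably the dialog token) and that each entry is a Neighbor Report element (ID 0x34) laid out as BSSID[6], BSSID info[4, little-endian], operating class, channel, PHY type. That layout is consistent with the values the log prints. The names `neighbor_t` and `parse_neighbor_report` are made up for this illustration and are not part of ESP-IDF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_ELEMENT_ID   0x34  /* Neighbor Report element ID (52) */
#define NR_MIN_BODY_LEN 13    /* BSSID[6] + info[4] + op_class + chan + phy */

/* Hypothetical container for one decoded neighbor entry. */
typedef struct {
    uint8_t  bssid[6];
    uint32_t bssid_info;  /* sent little-endian on the wire */
    uint8_t  op_class;
    uint8_t  channel;
    uint8_t  phy_type;
} neighbor_t;

/*
 * Decode up to max_out neighbor entries from a report buffer.
 * ASSUMPTION: buf[0] is a one-byte prefix (likely the dialog token) and is skipped.
 * Returns the number of entries written to out.
 */
size_t parse_neighbor_report(const uint8_t *buf, size_t len,
                             neighbor_t *out, size_t max_out)
{
    assert(buf != NULL && out != NULL);
    size_t count = 0;
    size_t pos = 1;  /* skip the assumed prefix byte */

    while (count < max_out && pos + 2 <= len) {
        uint8_t id = buf[pos];
        uint8_t body_len = buf[pos + 1];
        const uint8_t *body = buf + pos + 2;

        if (pos + 2 + (size_t)body_len > len) {
            break;  /* truncated element, stop parsing */
        }
        if (id == NR_ELEMENT_ID && body_len >= NR_MIN_BODY_LEN) {
            neighbor_t *n = &out[count++];
            memcpy(n->bssid, body, 6);
            n->bssid_info = (uint32_t)body[6]
                          | ((uint32_t)body[7] << 8)
                          | ((uint32_t)body[8] << 16)
                          | ((uint32_t)body[9] << 24);
            n->op_class = body[10];
            n->channel  = body[11];
            n->phy_type = body[12];
        }
        /* Unknown or short elements are skipped using their length field. */
        pos += 2 + (size_t)body_len;
    }
    return count;
}
```

Fed the 31 bytes from the dump, this yields two entries whose fields match the two "Neigbor report" lines in the log, so the AP does hand out usable roaming candidates. The problem seems to be in what happens after the report is received.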

I have Ubiquiti APs, and other devices on the same network roam without any problem. I also tried changing REASON_FRAME_LOSS to REASON_RSSI in esp_wnm_send_bss_transition_mgmt_query(), but nothing changed. The example states explicitly that the station will be forced to roam, but this seems to be false, as the station remains on the same AP.

I also enabled 802.11r in the network, but it made no difference to the ESP32's behavior. After that I tried to figure out how to force the ESP32 to initiate client-side roaming without dropping the connection altogether (that is, without calling esp_wifi_disconnect() and then esp_wifi_connect()).

Right now it seems that the ESP32 will roam only when the current connection drops completely, which is unfortunate.

So to the question:

Is there any way to either get 802.11k/v/r working or debug why it is not working? Or is there any way to force client-side roaming so that I can have a reliable connection without having to wait for the current connection to die (or kill it by calling disconnect)?

Has anyone managed to get the ESP32 to keep a reliable and stable connection without having to manually scan every few seconds and then manually call disconnect()/connect() when a better AP becomes available?
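To make the manual approach concrete, here is a rough sketch of the decision step only: given a list of scanned APs, decide whether any of them is worth roaming to. It is plain C with no ESP-IDF calls. The type `ap_candidate_t`, the function `pick_roam_target`, and the threshold and margin parameters are all hypothetical names and values chosen for illustration. The margin is there so the station does not flip back and forth between two APs of similar strength.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down view of one scan result. */
typedef struct {
    uint8_t bssid[6];
    uint8_t channel;
    int8_t  rssi;  /* dBm */
} ap_candidate_t;

/*
 * Return the index of the strongest AP that beats the current link by at
 * least margin_db, or -1 when staying on the current AP is the better choice.
 * Roaming is only considered once current_rssi is below low_threshold.
 */
int pick_roam_target(const ap_candidate_t *aps, size_t count,
                     const uint8_t current_bssid[6], int current_rssi,
                     int low_threshold, int margin_db)
{
    assert(count == 0 || aps != NULL);
    assert(current_bssid != NULL);

    if (current_rssi >= low_threshold) {
        return -1;  /* link is still good enough, do nothing */
    }

    int needed = current_rssi + margin_db;  /* minimum RSSI worth roaming for */
    int best = -1;

    for (size_t i = 0; i < count; i++) {
        if (memcmp(aps[i].bssid, current_bssid, 6) == 0) {
            continue;  /* skip the AP we are already connected to */
        }
        if (aps[i].rssi < needed) {
            continue;  /* not enough of an improvement */
        }
        if (best < 0 || aps[i].rssi > aps[best].rssi) {
            best = (int)i;
        }
    }
    return best;
}
```

With a current RSSI of -81 dBm, a threshold of -75 dBm and a margin of 8 dB, a neighbor at -60 dBm would be selected while one at -78 dBm would be ignored (these RSSI numbers are invented for the example). The BSSID and channel of the selected entry would then be what you pass to whatever reconnect mechanism you use.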

About this issue

Most upvoted comments

Hi @kapilkedawat : Ok, could you explain why the feature is not on the roadmap? Is it due to project priorities, or due to a technical restriction in the ESP hardware?

@kapilkedawat as @xoores mentioned, the main problem is the disconnection of sockets while roaming.

Consider the following use case: an ESP32 device is installed on a forklift. The forklift is constantly moving between access points. Now a client tries to load recorded data from the device via HTTP. This operation will fail if the device roams while the HTTP request is active. Of course, I can think of workarounds, but in my opinion, this should just work.

So could you consider changing the wifi stack so that it does NOT close the underlying LWIP connections while roaming?

@Saicharan67 : You can connect to a specific AP like this:

s_preferred_bssid contains the MAC address (BSSID) of the target AP, and s_preferred_channel contains its channel.

    esp_err_t err;
    wifi_config_t wifi_config;

    // Start from the current STA config so SSID and password are kept.
    if ((err = esp_wifi_get_config(WIFI_IF_STA, &wifi_config)) != ESP_OK) {
        ESP_LOGE(TAG, "esp_wifi_get_config failed: %s", esp_err_to_name(err));
        return err;
    }

    // Pin the connection to the preferred AP by BSSID and channel.
    memcpy(wifi_config.sta.bssid, s_preferred_bssid, sizeof(wifi_config.sta.bssid));
    wifi_config.sta.bssid_set = true;
    wifi_config.sta.channel = s_preferred_channel;

    if ((err = esp_wifi_set_config(WIFI_IF_STA, &wifi_config)) != ESP_OK) {
        ESP_LOGE(TAG, "esp_wifi_set_config failed: %s", esp_err_to_name(err));
        // try connect anyway
    }

    if ((err = esp_wifi_connect()) != ESP_OK) {
        ESP_LOGE(TAG, "esp_wifi_connect failed: %s", esp_err_to_name(err));
    }

Hi @KlausPopp , This methodology is called make-before-break. However, we don’t have plans to implement it in the near future. Only break-before-make is available with ESP devices.

Hello @kapilkedawat : Thanks for your reply. Why is it necessary to disconnect from the old AP before connecting to the new AP? I monitored an iPhone while it was roaming. It roams absolutely seamlessly, with no IP interruption. I noticed that the iPhone reassociates with the new AP while it is still associated with the old AP. After some seconds, the old AP sends a disassociation request to the iPhone due to missing ACKs. Would such an approach be possible with the ESP wifi stack?

Hello @xoores : I experienced the same problem. I am using ESP-IDF v5.1.1 and an AVM Fritzbox mesh. I used a Raspberry Pi with Wireshark as a network sniffer. What I saw is that the ESP sends a “BTM transition query” to the AP, but the AP ignores it. That's why the STA does not roam. In other situations, the AP sends a BTM transition request (without a prior query from the STA), which does cause the STA to roam.

For me, the big problem is that SEAMLESS roaming (i.e. roaming without a connection loss) does not seem possible with the current wifi stack. The wifi stack always disconnects from the old AP and then connects to the new AP, which takes at least 2 seconds.

Espressif Team: Do you have plans to change this in the future?