zephyr: usb: stm32: testusb fails and often hangs forever
Describe the bug
On the Nucleo F767zi and Nucleo F070rb boards, many test cases from the testusb Linux kernel module fail. Failure rates and consequences vary: most failures only result in a timeout for a specific transfer, but some render the device unusable, forcing the user to reset the MCU in order to use the USB device again.
The following table summarizes the status of each test case for the two boards.
| Test case | nucleo_f767zi | Failure Rate | nucleo_f070rb | Failure Rate |
|---|---|---|---|---|
| 0 | ✔️ | N/A | ✔️ | N/A |
| 1 | ❌ | 10-20% | ✔️ | N/A |
| 2 | ✔️ | N/A | ❌ | 100% |
| 3 | ❌ | 10-20% | ✔️ | N/A |
| 4 | ✔️ | N/A | ❌ | 100% |
| 5 | ✔️ | N/A | ✔️ | N/A |
| 6 | ✔️ | N/A | ❌ | 100% |
| 7 | ✔️ | N/A | ✔️ | N/A |
| 8 | ✔️ | N/A | ❌ | 100% |
| 9 | ✔️ | N/A | ✔️ | N/A |
| 10 | ✔️ | N/A | ❌ | 100% |
| 11 | ✔️ | N/A | ✔️ | N/A |
| 12 | ✔️ | N/A | ✔️ | N/A |
| 13 | ❌ | 100% | ❌ | 100% |
| 14 | ❌ | < 10% | ❌ | 100% |
| 17 | ❌ | 10-20% | ✔️ | N/A |
| 18 | ✔️ | N/A | ❌ | 100% |
| 19 | ❌ | 10-20% | ✔️ | N/A |
| 20 | ✔️ | N/A | ❌ | 100% |
| 21 | ❌ | < 10% | ❌ | 100% |
| 24 | ❌ | 100% | ✔️ | N/A |
| 27 | ✔️ | N/A | ✔️ | N/A |
| 28 | ✔️ | N/A | ❌ | 100% |
| 29 | ✔️ | N/A | ✔️ | N/A |
I used the following command line for all the tests:
sudo ./testusb -D /dev/usb/xxx/yyy -v 512 -t <test_case>
On the nucleo_f767zi board:
- test cases 1, 3, 17 and 19 fail between 10 and 20% of the time. When one fails, the test ends with “Connection timed out”. The device is still usable after that failure.
- test case 13 always fails with “Invalid argument”. It doesn’t have any other impact.
- test cases 14 and 21 seem to trigger issue #27992. After enough time, these tests fail and the control endpoints become unresponsive. Unplugging the cable and plugging it in again doesn’t solve the issue: the device is unable to enumerate and the MCU has to be reset in order to have a functional USB device.
- test case 24 always fails and is unable to complete. The testusb program stays stuck in the kernel in an uninterruptible sleep state. The device is still operational after plugging it in again.
On the nucleo_f070rb board:
- test cases 2, 4, 6, 8, 14, 18, 20 and 21 always fail and return “Connection timed out”. The device can operate normally after that.
- test case 13 always fails with “Invalid argument”.
- test cases 10 and 28 always fail and are stuck in the kernel indefinitely. The device needs to be plugged in again in order to
On both boards, whenever a test fails, the log of the device shows:
<err> usb_dc_stm32: Unable to get write lock (-16)
<err> usb_dc_stm32: Unable to get write lock (-16)
Changing the value of USB_NUMOF_EP_WRITE_RETRIES from 3 to 30 doesn’t seem to have any impact on the test failures.
To Reproduce
For the nucleo_f767zi board:
- Compile and flash the testusb sample
- On the Linux host, follow the instructions from sample/subsys/usb/testusb/README.rst
- For each test case, run the following command:
sudo ./testusb -D /dev/usb/xxx/yyy -v 512 -t <test_case>
Since some test cases only fail around 10% of the time, this command may need to be executed multiple times before the failure occurs.
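To put a number on an intermittent failure, the repeated runs can be scripted. The following is a minimal sketch, not part of the original report: the helper names `run_failed` and `failure_rate` are hypothetical, and a run is assumed to have failed when its output contains one of the error strings quoted above (“Connection timed out” or “Invalid argument”). It does not rely on the exit status of testusb.

```shell
#!/bin/sh
# Sketch: estimate how often a command fails over repeated runs.
# Assumption: a failed run prints one of the error strings quoted in this
# report; the exit status of the command is deliberately ignored.

# run_failed OUTPUT: return 0 if OUTPUT contains a known error string.
run_failed() {
    case "$1" in
        *"Connection timed out"*|*"Invalid argument"*) return 0 ;;
        *) return 1 ;;
    esac
}

# failure_rate RUNS CMD [ARGS...]: run CMD RUNS times, print "failures/runs".
failure_rate() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Capture stdout and stderr so error messages are seen either way.
        output=$("$@" 2>&1)
        if run_failed "$output"; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs"
}
```

For example, `failure_rate 50 sudo ./testusb -D /dev/usb/xxx/yyy -v 512 -t 1` would print the number of failed runs out of 50 for test case 1. A test case that leaves testusb in an uninterruptible sleep, such as test case 24, will block the loop, so such cases are better run by hand.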
For the nucleo_f070rb board:
- Since that board doesn’t have a USB connector, you’ll need an external micro USB connector (the PA11 and PA12 pins can be used as USB D+/D-)
- Once you are able to connect a Linux host to the board, you need to set the status of the “usb” node of the device tree to “okay”:
diff --git a/boards/arm/nucleo_f070rb/nucleo_f070rb.dts b/boards/arm/nucleo_f070rb/nucleo_f070rb.dts
index 9f5923d36d57..dcdf20912124 100644
--- a/boards/arm/nucleo_f070rb/nucleo_f070rb.dts
+++ b/boards/arm/nucleo_f070rb/nucleo_f070rb.dts
@@ -41,6 +41,10 @@
 	};
 };
 
+&usb {
+	status = "okay";
+};
+
 &usart1 {
 	current-speed = <115200>;
 	status = "okay";
- Compile and flash the testusb sample
- On the Linux host, follow the instructions from sample/subsys/usb/testusb/README.rst
- Use the command:
sudo ./testusb -D /dev/usb/xxx/yyy -v 512 -t <test_case>
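Since the failures on this board are deterministic, a single pass over the test cases is enough to rebuild the table. The sketch below is an illustration under assumptions, not part of the original report: the `sweep` helper is hypothetical, it appends `-t <case>` to the given command, and it treats a run as failed when the output contains one of the error strings quoted above.

```shell
#!/bin/sh
# Sketch: run a command once per test case and print PASS or FAIL.
# Assumption: a failed run prints "Connection timed out" or
# "Invalid argument", the two messages quoted in this report.

# sweep "CASES" CMD [ARGS...]: for each case in the space-separated list
# CASES, run CMD ARGS -t <case> and classify the captured output.
sweep() {
    cases=$1
    shift
    for t in $cases; do
        output=$("$@" -t "$t" 2>&1)
        case "$output" in
            *"Connection timed out"*|*"Invalid argument"*)
                echo "test $t: FAIL" ;;
            *)
                echo "test $t: PASS" ;;
        esac
    done
}
```

A possible invocation is `sweep "0 1 2 3 4 5 6 7 8 9 11 12 13 14 17 18 19 20 21 24 27 29" sudo ./testusb -D /dev/usb/xxx/yyy -v 512`. Test cases 10 and 28 are left out of that list because they get stuck in the kernel on this board and would block the loop.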
Expected behavior
All tests should pass.
Impact
There are multiple consequences to these failures:
- The USB device might not pass USB certification
- Some transfers randomly time out, which compromises communication with the host.
- In some cases, USB is completely unusable without an MCU hardware reset, which is not acceptable or even possible in production.
Environment (please complete the following information):
- Host: Ubuntu 19.10 with Linux kernel 5.3.0-64-generic
- Toolchain: Zephyr SDK
- Commit SHA: c8cca16c204543b8f6dd7261c643b48ba8c95a94
- Boards: nucleo_f767zi and nucleo_f070rb (with external USB connector)
About this issue
- State: closed
- Created 4 years ago
- Comments: 27 (10 by maintainers)
@carlescufi This is on my radar and action has been taken. It is taking more time than expected, but it should eventually be solved.
It seems you’re not using the right path for the USB device. According to your logs, the correct path is probably /dev/bus/usb/001/014 (device number 14 on USB bus 1).
You should unplug the device and plug it in again after that command: