zephyr: Filesystem Write Fails with Some SD-Cards
Writing Data to a file on the SD-card fails with the version *** Booting Zephyr OS build v3.1.99-ncs1-9-g85edda1e989c *** of zephyr. The Failure depends on the card and on the data being written. It fails with the cards Lexar 300x 32GB microSD HC and Transcend 4GB microSD HC. It succeeds with Delkin Devices. Utility + 4GB microSD HC. If the data is all zeros instead of initialized via the function provided below, the test always succeeds.
See more details and failing unit-test on target below.
This is a regression and it worked fine with version 2.6.
The target board is a variant of nrf9160ns.
To Reproduce Steps to reproduce the behavior:
- add the following test and invoke it with twister:
#include <fs/fs.h>
#include <ztest.h>
#define BUFFER_SIZE (1<<15)
#define BUFFER_CNT (15)
static char buffer[BUFFER_SIZE] = {0};
static void init_buffer(void){
for(int i=0;i<BUFFER_SIZE;i++){
buffer[i]+=3*(1+i);
}
}
static bool write_file(void){
struct fs_file_t file;
fs_file_t_init(&file);
int rc = fs_open(&file, "/SD:/write_test", FS_O_CREATE | FS_O_RDWR);
if (rc < 0)
{
printk("Failed to open file\n");
return false;
}
ssize_t ret = 0;
for(int i=0;i<BUFFER_CNT;i++){
printk("chunk: %d size: %d data: %02x ... %02x\n",
i,
BUFFER_SIZE,
buffer[0],
buffer[BUFFER_SIZE -1]);
ret = fs_write(&file, buffer, BUFFER_SIZE);
if (ret < 0)
{
printk("Failed to write %d\n", ret);
printk("WRITTEN %d PACKETS A %d = %d\n", i, BUFFER_SIZE,i*BUFFER_SIZE);
return false;
}
}
rc = fs_truncate(&file, BUFFER_CNT*BUFFER_SIZE);
if(rc < 0)
{
printk("error truncating file\n");
return false;
}
rc = fs_close(&file);
if (rc < 0)
{
printk("error closing file\n");
return false;
}
return true;
}
static void test_write(void)
{
init_buffer();
zassert_true(write_file(), "failed to write file");
}
void test_main(void)
{
ztest_test_suite(test_write_file,
ztest_unit_test(test_write)
);
ztest_run_test_suite(test_write_file);
}
Expected behavior The test should pass independently of the data written to the file and of the SD card being used.
Impact Show stopper. We are likely not able to update zephyr version without a workaround.
Logs and console output
Failing test result using v3.1.99 and Transcend 4GB microSD HC:
*** Booting Zephyr OS build v3.1.99-ncs1-9-g85edda1e989c ***
I: Start initializing SDHC.
I: Detected SD card
I: Maximum SD clock is under 25MHz, using clock of 8000000Hz
I: Block count 7680000
I: Sector size 512
I: Memory Size(MB) 3750
I: Disk mounted under "/SD:"
CENSOR
I: SDHC initialization succeeded.
Running TESTSUITE test_write_file
===================================================================
START - test_write
chunk: 0 size: 32768 data: 03 ... 00
chunk: 1 size: 32768 data: 03 ... 00
chunk: 2 size: 32768 data: 03 ... 00
chunk: 3 size: 32768 data: 03 ... 00
chunk: 4 size: 32768 data: 03 ... 00
chunk: 5 size: 32768 data: 03 ... 00
E: Write failed
E: file write error (-5)
Failed to write -5
WRITTEN 5 PACKETS A 32768 = 163840
Assertion failed at CMAKE_SOURCE_DIR/src/main.c:62: test_write: (write_file() is false)
failed to write file
FAIL - test_write in 17.438 seconds
===================================================================
TESTSUITE test_write_file failed.
===================================================================
RunID: fba8dcb6ff2a643162217b8944b19ed0
PROJECT EXECUTION FAILED
Succeeding test result using v3.1.99 and Delkin Devices. Utility + 4GB microSD HC:
DEBUG - DEVICE: *** Booting Zephyr OS build v3.1.99-ncs1-9-g85edda1e989c ***
DEBUG - DEVICE: I: Start initializing SDHC.
DEBUG - DEVICE: I: Detected SD card
DEBUG - DEVICE: I: Maximum SD clock is under 25MHz, using clock of 8000000Hz
DEBUG - DEVICE: I: Block count 7576576
DEBUG - DEVICE: I: Sector size 512
DEBUG - DEVICE: I: Memory Size(MB) 3699
DEBUG - DEVICE: I: Disk mounted under "/SD:"
CENSOR
DEBUG - DEVICE: I: SDHC initialization succeeded.
DEBUG - DEVICE: Running TESTSUITE test_write_file
DEBUG - DEVICE: ===================================================================
DEBUG - DEVICE: START - test_write
DEBUG - DEVICE: chunk: 0 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 1 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 2 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 3 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 4 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 5 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 6 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 7 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 8 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 9 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 10 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 11 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 12 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 13 size: 32768 data: 03 ... 00
DEBUG - DEVICE: chunk: 14 size: 32768 data: 03 ... 00
DEBUG - DEVICE: PASS - test_write in 1.313 seconds
DEBUG - DEVICE: ===================================================================
DEBUG - DEVICE: TESTSUITE test_write_file succeeded
DEBUG - DEVICE: ===================================================================
DEBUG - DEVICE: RunID: ff4fd55abfb0591aa484e9d6b58d34af
DEBUG - DEVICE: PROJECT EXECUTION SUCCESSFUL
Environment:
- Ubuntu 22.04.1 LTS running as VM
- Nordic SDK
Furthermore
- Changes to device tree:
- compatible = "zephyr,mmc-spi-slot";
- reg = <0>;
- status = "okay";
- label = "SDHC0";
- spi-max-frequency = <8000000>; // Maximum of NRF91
+ compatible = "zephyr,sdhc-spi-slot";
+ reg = <0>;
+ status = "okay";
+ mmc {
+ compatible = "zephyr,sdmmc-disk";
+ status = "okay";
+ };
+ spi-max-frequency = <8000000>; // Maximum of NRF91
};
- Changes to kconfig:
+CONFIG_DISK_ACCESS=y
+CONFIG_DISK_DRIVERS=y
+CONFIG_SDMMC_SUBSYS=y
+CONFIG_SPM=y
+CONFIG_ZTEST_STACK_SIZE=4096
- This is how we initialize the SD-card using
SYS_INIT. We do it early in the boot process (high priority).:
static const char *disk_pdrv = "SD";
uint64_t memory_size;
uint32_t block_count;
uint32_t block_size;
LOG_INF("Start initializing SDHC.");
const struct device *gpio = DEVICE_DT_GET(DT_NODELABEL(gpio0));
if (!gpio)
{
LOG_ERR("Failed to get GPIO device.");
return false;
}
gpio_pin_configure(gpio, DT_GPIO_PIN(DT_ALIAS(sdhcsw), gpios), GPIO_OUTPUT);
gpio_pin_set(gpio, DT_GPIO_PIN(DT_ALIAS(sdhcsw), gpios), 0);
int err = disk_access_init(disk_pdrv);
if (err)
{
LOG_ERR("Storage init ERROR: %d!", -err);
return false;
}
if(disk_access_ioctl(disk_pdrv, DISK_IOCTL_GET_SECTOR_COUNT, &block_count)) {
LOG_ERR("Unable to get sector count!");
return false;
}
LOG_INF("Block count %u", block_count);
if(disk_access_ioctl(disk_pdrv, DISK_IOCTL_GET_SECTOR_SIZE, &block_size)) {
LOG_ERR("Unable to get sector size!");
return false;
}
LOG_INF("Sector size %u", block_size);
memory_size = (uint64_t)block_count * block_size;
LOG_INF("Memory Size(MB) %u", (uint32_t)(memory_size >> 20));
mp.mnt_point = disk_mount_pt;
int res = fs_mount(&mp);
if (res == FR_OK) {
LOG_INF("Disk mounted under \"%s\"", mp.mnt_point);
}
else {
LOG_ERR("Error mounting disk!");
return false;
}
...
Let me know if more information is needed to reproduce.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 22
Commits related to this issue
- sd: utilize sdmmc_wait_ready with SPI mode. Update sdmmc framework to use sdmmc_wait_ready when accessing card in SPI mode. this will allow cards that do not return to ready to be polled for busy sta... — committed to nxp-zephyr/zephyr by danieldegrasse 2 years ago
- sd: utilize sdmmc_wait_ready with SPI mode. Update sdmmc framework to use sdmmc_wait_ready when accessing card in SPI mode. this will allow cards that do not return to ready to be polled for busy sta... — committed to zephyrproject-rtos/zephyr by danieldegrasse 2 years ago
I went back to our fork of version
3.1.99. I applied the two suggested commits (cherry-pick).With those I can not reproduce the bug under conditions it consistently happens without them.
Great job and congratulations on finding this 👍 Thanks for the help.
Feel free to close this issue, when appropriate.
[ Out of curiosity, do you know why only some of the SD-cards are affected? Might it be related to the speed-class, such that some of them are always ready again, before zephyr can contact them, while the slower might not be ready? ]
Thanks for the help. Yes I did not express myself well. At first I thought, that maybe the timeout was ignored. But in fact, as you point out, and as I later found out by figuring out the exact source of the problem, it is not the timeout causing the issue.
The error code is -5 returned here https://github.com/nxp-zephyr/zephyr/blob/065b7ac02661d580b9300d4c1c1048874671f118/subsys/sd/sdmmc.c#L1358 (I will double check just to be sure)
I will try the other suggestions today.
Thank you.