circuitpython: OSError: [Errno 5] Input/output: macOS Sonoma is delaying writes on small filesystems

CircuitPython version

Adafruit CircuitPython 8.2.6 on 2023-09-12; Adafruit CircuitPlayground Express with samd21g18

Code/REPL

# Write your code here :-)

print("Hello World!")

Behavior

Code stopped by auto-reload. Reloading soon.
soft reboot

Auto-reload is on. Simply save files over USB to run them or enter REPL to disable.
code.py output:
OSError: [Errno 5] Input/output error

Code done running.

Description

  • Error on each save

Additional information

On a fresh installation of MacOS sonoma every time that code.py is saved a OSError: [Errno 5] Input/output error pops up, eventually the code runs but it takes about 15 seconds, I have tried disabling USB Stores and spotlight with:

defaults write DSDontWriteUSBStores -bool false
sudo mdutil -i off -d /Volumes/CIRCUITPY/ 

No luck

About this issue

  • Original URL
  • State: closed
  • Created 9 months ago
  • Reactions: 1
  • Comments: 155 (6 by maintainers)

Most upvoted comments

Since the original bug appears to have been fixed, though still with problems, closing in favor of #8918, which describes the new problems. We can reopen this if the original problem reappears. This issue is getting unwieldy due to the number of comments.

I wrote this shell script that remounts the USB drive on the MatrixPort S3. The “sleep 2” is a cluge that I don’t know why is needed, but if you unmount and then immediately try to mount the disk it fails.

#!/bin/csh
#
# This works around bug where, by default, macOS 14.0 writes part of a file 
# immediately, and then doesn't update the directory for 20-60 seconds, causing
# the file system to be corrupted.
#

set disky=`df | grep CIRCUITPY | cut -d" " -f1`
sudo umount /Volumes/CIRCUITPY
sudo mkdir /Volumes/CIRCUITPY
sleep 2
sudo mount -v -o noasync -t msdos $disky /Volumes/CIRCUITPY

My theory panned out, the reason that behavior differs after a manual mount of the filesystem is that it brings in kernel space kext com.apple.filesystems.msdosfs which then gets used instead of the new user space driver.

The older kext does not delay meta data writes while the newer user space driver does. MacOS Sonoma has two entirely different fat filesystem drivers, which one gets used depends on the mechanism used to mount the filesystem.

The delay is fairly long, on the order of 10 to 20 seconds. I can see one shot of writes when I save the file, the delay, then the remaining writes:

flash_write_blocks: block_num=42, num_blocks=1
flash_write_blocks: block_num=43, num_blocks=1
***delay 10 to 20 seconds***
flash_write_blocks: block_num=2, num_blocks=1
flash_write_blocks: block_num=3, num_blocks=1
flash_write_blocks: block_num=4, num_blocks=1
flash_write_blocks: block_num=5, num_blocks=1
flash_write_blocks: block_num=6, num_blocks=1
flash_write_blocks: block_num=7, num_blocks=1
flash_write_blocks: block_num=8, num_blocks=1
flash_write_blocks: block_num=9, num_blocks=1

Unmounting then remounting the CIRCUITPY filesystem like this:

sudo umount /Volumes/CIRCUITPY
sudo mkdir /Volumes/CIRCUITPY
sudo mount -o noasync -t msdos /dev/disk4s1 /Volumes/CIRCUITPY

temporarily eliminates the delay. The special device name will vary. There should be a way to make this change permanent in Sonoma’s /etc/fstab, but so far I’m not finding a LABEL= solution works.

On behalf of all of us with this issue, our thanks for staying on top of this. On Jan 3, 2024 at 8:03 PM -0500, John Romkey @.***>, wrote:

macOS Sonoma 12.3 beta2 out today, no improvement. Filed updated feedback including the new information about the filesystem implementation and a suggestion that Apple simply disable the new implementation. Include links to this whole issue and to the detailed trace. Requested they simply disable the user space filesystem until they have the bandwidth to fix the real problem. My latest feedback ID is FB13513401, referencing FB13469123 FB13468526 FB13269327 FB13428357 FB13416096 FB13226668 FB13306896 — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>

If it helps anyone, this is the script I’ve been using lately. Taken from comments above. Works with bootloader disks by giving it the diskname as argument (remount-circuitpy.sh FEATHERBOOT)

#!/bin/bash
# remount-circuitpy.sh -- remount a FAT disk, usually CIRCUITPY, so it works
#
# Works around bug where macOS 14.x causes file corruption on small FAT
# filesystems. The remount apparently causes older kernel FAT driver
# to be loaded. see https://github.com/adafruit/circuitpython/issues/8449

diskname=CIRCUITPY
# allow other disk names (like FEATHERBOOT)
if [ -n "$1" ]; then
    diskname=$1
fi
echo "remounting $diskname... prepare to enter your sudo password"
diskdev=`df | grep ${diskname} | cut -d" " -f1`
echo "diskdev=${diskdev}"
sudo umount /Volumes/${diskname}
sudo mkdir /Volumes/${diskname}
sleep 2
sudo mount -v -t msdos $diskdev /Volumes/${diskname}

@romkey and @dahlbert I can confirm that 14.4 beta 2 fixes this issue. Tracing shows meta-data updates now follow a data write within a few hundred milliseconds. Like this:

flash_write_blocks: time(ms)=406324, dev_block_num=   42, num_blocks=    1, data
block 41 (data) : cluster 7.0
0000: 94 25 f6 76 cb 55 6d ea be ee 93 af 52 69 71 bf .%.v.Um.....Riq.
0010: f9 64 fe 8f d9 8f e2 9e 0f 0d 00 05 95 fa d1 89 .d..............
0020: ce 52 85 be 35 78 0f c2 a3 18 e5 af ef 8c d3 b4 .R..5x..........
0030: af 89 77 b7 e3 c3 02 55 26 00 78 a2 1c b8 72 39 ..w....U&.x...r9
0040: 31 f3 ed a3 fd 17 b2 7c 8e eb c5 21 2c 45 84 94 1......|...!,E..
0050: 23 b1 d4 1d 43 c9 c5 ed 11 f6 cc 7d d8 cb 44 7f #...C......}..D.
0060: d0 f2 18 c0 1e 66 2d 5b f9 af 36 d2 a4 92 e8 99 .....f-[..6.....
0070: 00 96 53 23 03 55 b0 27 cf 71 36 1a 71 a5 b7 8f ..S#.U.'.q6.q...
0080: 12 cd 92 1d 72 66 32 f1 72 fd 51 81 ba 98 df 7e ....rf2.r.Q....~
0090: 82 f0 19 33 12 e4 dc 31 c3 83 46 95 0a 60 d2 fb ...3...1..F..`..
00a0: 0f 06 e0 18 b1 40 4c e7 ba f0 9c 8c c6 86 ac 3e .....@L........>
00b0: 2b 44 28 ab 43 04 d4 aa 41 d8 21 27 62 84 0f f3 +D(.C...A.!'b...
00c0: 53 48 1a 2f e9 45 85 21 59 7f c0 91 20 a0 bf 40 SH./.E.!Y... ..@
00d0: 08 ef ae ee 5f d0 9b ad c3 2b e7 30 ba f2 45 00 ...._....+.0..E.
00e0: 3e f8 6b fc 34 77 78 a9 e2 d0 48 48 0c a1 05 44 >.k.4wx...HH...D
00f0: 89 34 24 73 76 0c c4 2f 7f bb b7 85 f2 45 b6 da .4$sv../.....E..
0100: c3 48 7d 9f 8d b7 0b ab c0 15 17 1b c3 a8 ec 81 .H}.............
0110: 19 a1 ac 6e d0 25 7b 91 d5 03 38 fa cb b1 ae 92 ...n.%{...8.....
0120: ad ee c6 aa 00 a3 94 0a d8 68 5d 10 0e 96 80 3d .........h]....=
0130: b1 9c 38 0a a3 af fb d8 84 de 0e 34 84 b1 3e 03 ..8........4..>.
0140: a8 f1 aa 94 67 be 57 85 1a 0e 76 84 b4 a8 f1 73 ....g.W...v....s
0150: 4d 68 71 96 33 e2 d4 69 8c e2 a3 db cb b3 48 ff Mhq.3..i......H.
0160: 82 3f 84 e3 b5 29 fd 54 ef f3 28 c5 15 54 7c c0 .?...).T..(..T|.
0170: 77 23 fe 1b ec 47 a2 bd a3 d1 e2 fa 14 c4 18 b0 w#...G..........
0180: 95 b1 4e e9 3d 90 15 24 46 11 cb cc c0 87 de a9 ..N.=..$F.......
0190: 07 4b 07 a8 40 0f 05 3d 9f de 59 4f cf 6f 6c bd .K..@..=..YO.ol.
01a0: 43 32 fb e8 14 90 9e 60 53 92 15 3c cd 2a 7d 3c C2.....`S..<.*}<
01b0: 8b 7d d1 76 66 50 2a 51 db 9e dc b8 99 9c e5 f1 .}.vfP*Q........
01c0: fb e7 f9 de 02 0e de 7d a9 28 4c b6 f6 ba 20 99 .......}.(L... .
01d0: 34 5c 12 ad 3a 6f ab 5d 4b 1c 71 59 47 00 3f f7 4\..:o.]K.qYG.?.
01e0: 6c 66 b1 48 48 3b 2b b7 23 81 f1 36 00 4c 61 05 lf.HH;+.#..6.La.
01f0: 47 22 ab ad 6f 26 ac f7 dc bc 10 e1 9c 08 73 77 G"..o&........sw

flash_write_blocks: time(ms)=406586, dev_block_num=    2, num_blocks=    2, FAT
block 1 (FAT) :
0000: f8 ff ff ff ff ff ff ff ff ff ff ff 00 f0 ff 00 ................
0010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

block 2 (FAT) :
0000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

chain 2->4095.
chain 3->4095.
chain 4->4095.
chain 5->4095.
chain 6->4095.
chain 7->4095.
flash_write_blocks: time(ms)=406723, dev_block_num=    4, num_blocks=    1, root dir
block 3 (FAT) :
0000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

flash_write_blocks: time(ms)=406739, dev_block_num=    5, num_blocks=    2, root dir
block 4 (root dir) :
0000: 43 49 52 43 55 49 54 50 59 20 20 08 00 00 00 00 CIRCUITPY  .....
0010: 00 00 00 00 00 00 93 92 4b 58 00 00 00 00 00 00 ........KX......
0020: 41 2e 00 66 00 73 00 65 00 76 00 0f 00 80 65 00 A..f.s.e.v....e.
0030: 6e 00 74 00 73 00 64 00 00 00 00 00 ff ff ff ff n.t.s.d.........
0040: 7e 31 20 20 20 20 20 20 46 53 45 10 00 00 00 00 ~1      FSE.....
0050: 00 00 00 00 00 00 02 00 21 50 02 00 00 00 00 00 ........!P......
0060: 42 65 00 72 00 5f 00 69 00 6e 00 0f 00 49 64 00 Be.r._.i.n...Id.
0070: 65 00 78 00 00 00 ff ff ff ff 00 00 ff ff ff ff e.x.............
0080: 01 2e 00 6d 00 65 00 74 00 61 00 0f 00 49 64 00 ...m.e.t.a...Id.
0090: 61 00 74 00 61 00 5f 00 6e 00 00 00 65 00 76 00 a.t.a._.n...e.v.
00a0: 7e 31 20 20 20 20 20 20 4d 45 54 20 00 00 02 00 ~1      MET ....
00b0: 21 50 00 00 00 00 02 00 21 50 00 00 00 00 00 00 !P......!P......
00c0: 41 2e 00 54 00 72 00 61 00 73 00 0f 00 7f 68 00 A..T.r.a.s....h.
00d0: 65 00 73 00 00 00 ff ff ff ff 00 00 ff ff ff ff e.s.............
00e0: 7e 31 20 20 20 20 20 20 54 52 41 20 00 00 02 00 ~1      TRA ....
00f0: 21 50 00 00 00 00 02 00 21 50 00 00 00 00 00 00 !P......!P......
0100: 41 73 00 65 00 74 00 74 00 69 00 0f 00 19 6e 00 As.e.t.t.i....n.
0110: 67 00 73 00 2e 00 74 00 6f 00 00 00 6d 00 6c 00 g.s...t.o...m.l.
0120: 53 45 54 54 49 4e 7e 31 54 4f 4d 20 00 00 03 00 SETTIN~1TOM ....
0130: 21 50 00 00 00 00 03 00 21 50 00 00 00 00 00 00 !P......!P......
0140: 43 4f 44 45 20 20 20 20 50 59 20 20 18 00 03 00 CODE    PY  ....
0150: 21 50 00 00 00 00 03 00 21 50 03 00 16 00 00 00 !P......!P......
0160: 4c 49 42 20 20 20 20 20 20 20 20 10 08 00 00 00 LIB        .....
0170: 00 00 00 00 00 00 03 00 21 50 04 00 00 00 00 00 ........!P......
0180: 42 4f 4f 54 5f 4f 55 54 54 58 54 20 18 00 02 00 BOOT_OUTTXT ....
0190: 21 50 00 00 00 00 02 00 21 50 05 00 a6 00 00 00 !P......!P......
01a0: 41 2e 00 76 00 73 00 63 00 6f 00 0f 00 92 64 00 A..v.s.c.o....d.
01b0: 65 00 00 00 ff ff ff ff ff ff 00 00 ff ff ff ff e...............
01c0: 56 53 43 4f 44 45 7e 31 20 20 20 10 00 0e 7c 07 VSCODE~1   ...|.
01d0: 93 57 93 57 00 00 7c 07 93 57 06 00 00 00 00 00 .W.W..|..W......
01e0: 52 41 4e 44 31 20 20 20 44 41 54 20 18 00 93 92 RAND1   DAT ....
01f0: 4b 58 4b 58 00 00 94 92 4b 58 07 00 00 02 00 00 KXKX....KX......

If anyone on this issue hasn’t already done so, file an issue using Apple’s Feedback Assistant app. When you do, please mention FB13226668 (the issue I opened) so that Apple ties the reports together and raises the priority of the issue. If you’ve already opened an issue, you can append a note mentioning my issue number. Thank you!

The awkward moment when you experienced this in July and asked about it in Discord, but forget to file an actual issue, and then Sonoma ships…

Can I suggest that in the meantime, relevant CP/Adafruit guides get updated with a little note about Sonoma issues with a link to a page noting the workarounds? I don’t know what the % of CP users on macOS are, but I’d assume it’s more than a handful of us, and Sonoma has been out for a couple of weeks now.

As suggested by @romkey I’ve opened a new issue (#8918) to track the performance regression in Sonoma 14.4 beta 2 and above.

By the way, the report I submitted to Apple way back on 10/3/2023 (FB13226668) now shows, “Resolution: Potential fix identified - for a future OS update”.

There was some question somewhere (not here, it seems) about whether the MSC device is reported as “removable”. The answer is that TinyUSB in its MSC device code always sets the RMB (“removable”) bit to 1 when it answers a SCSI INQUIRY command.

@romkey I’m perplexed, too. But, I have a theory. Earlier @jeremyamoore noted that Sonoma had moved fat filesystems from kernel to user space. I’d missed that entirely, so all my reading of the kext code never turned up a plausible bug. It’s possible that the automatic mount is using the new user space filesystem code while a command line mount is pulling in the older kernel extension (kext). Now to devise an experiment to see if that’s the case.

Another member reported that Arduino Nano RP2040 Connect is affected, which it shouldn’t be based on previous results here, because it has a filesystem >8MB. I’m trying to follow up with him about that.

It was giving exactly the same symptoms, but flash nuke and fresh CircuitPython fixed the problem. Also tried another flash nuke and downgrading to 7.0.0 in case it was an issue left over from an older filesystem, but still fine. Must have been random filesystem corruption or something. Sorry for the false alarm!

[@dmcomm:] Should we close #8657 ?

I would like to consolidate these, and perhaps extract the most pertinent posts from each issue into a new shorter issue, since this issue already has hidden posts because it is so long.

Verified that a board with >8M of filesystem flash will not suffer delayed writes by testing with an Adafruit RP2040 Metro with 16M flash (about 15M FAT16 filesystem).

If you have a chance would you mind trying that, copying a bunch of files quickly like with a wildcard or in a shell script with no delays between the copies?

Besides the simple copying of tiny foo.txt, I also copied a substantial batch of .py files, and could not get the error. But they are all pretty small files, I think. So I’m not worried we are getting different results. Since it’s still reproducible on real boards, unfortunately, we’ll just have to keep trying the betas.

Easier way to verify, doesn’t even need a USB drive.

Create an 8MB FAT12 disk image. Not sure how to do this on macOS, on Linux it’s:

truncate -s 8M disk8.img
mkfs.vfat -F 12 disk8.img

Transfer this to your Mac.

Mount it - either open Disk Utility in the Finder and use File | Open Disk Image…

or

diskutil image attach disk8.img

This should mount as /Volumes/NO NAME and appear in the sidebar in the Finder.

Copy small files to it from the command line or using Finder. You should see errors.

You can unmount it by ejecting the volume in Finder.

I haven’t confirmed if the filesystem is corrupted when errors happen during the copy, so it’s best to always start with a fresh disk image to be sure you’re not accidentally seeing actual filesystem errors.

@romkey That’s super helpful. Thank you!

@eightycc I tried again with a few other sizes. 8MB fails while 8.5MB, 10MB, 12MB and 16MB succeed (worked backwards on this, I’m not completely erratic).

Just to be sure I rebooted and 8.5MB still succeeded.

I tried 8MB three times with fresh filesystems and it failed each time.

I’m going to take a wild guess that 8MB is the breakpoint.

@romkey It’s fascinating and unexpected that filesystem size makes a difference, thank you for running that down. So that I can get trace data, I’m going build a modified CP that reports a 64M device. Also, thanks for the confirmation on the work around.

@romkey You should be able to work around any of the FAT filesystem write problems by unmounting, making a mount point, and then re-mounting the device. No need to specify mount option flags. Give it a try and let me know if you run into a case where that doesn’t work.

@romkey note that @eightycc’s testing seems to show that any manual remount with any flags seems to clear the problem. Your last two tests probably included a manual remount, so maybe that explains what you are seeing.

Some workarounds:

  • simply waiting until the write completes: code.py will restart again after the final write. [not tested by me yet]
  • Turning off autoreload: supervisor.runtime.autoreload = False.
  • Doing sync in a Terminal window
  • Adding an os.sync() in the right place in whatever editor is being used.
  • Downgrade back to Ventura. I don’t know how easy or possible this is.

I will not have an opportunity to try this on a Sonoma Mac for a few days, but will try to characterize the issue more exactly then by doing some USB monitoring.

@dhalbert Earlier you’d asked whether sync forces proper behavior on MacOS Sonoma. Yes, it does. I’ve verified that both the command line sync and python os.sync() force updates to the Express filesystem immediately.

I did a small survey and found that the Mu IDE is an entry point for beginners. So, I added an os.sync() to write_and_flush() in logic.py of Mu 1.2.0. With the patch, saving code.py works as expected, i.e., the file and all directory and FAT updates are written out immediately.

I also traced file update on Windows 10 22H2 and Debian Bookworm 13 kernel 6.1.52 running Mu 1.2.0 with and without the patch. In all cases saving the file writes the file and all directory and FAT updates immediately on save.

@dahlbert Under the covers what’s going on is that MacOS copy-on-reads blocks of a file into virtual memory and backs that memory with an unallocated area of the underlying filesystem. Writes become stores into virtual memory, triggering immediate page-outs to the filesystem. Because the filesystem mount flags include MNT_ASYNC, updates to the FAT and the metadata file are deferred until a timeout mechanism forces a flush. This is why pulling the stick results in a zero length file, the file data is there but the FAT update to point to it hasn’t happened yet.

This got me thinking that if autoreload were to trigger selectively on just writes to the FAT (probably just two on such a small file system) this would eliminate the I/O error at the expense of a delayed autoreload.

If you have a machine that’s still running Ventura, it would be interesting to compare.

I’ve got an older Monterey (10.12) machine I can test and trace.

My guess is delayed write. I see this occasionally with Linux too. I run sync from the command line and then the next reload works as expected.