vscode: File writes can hang when write locks are not cleared up

Issue Type: Bug

If there is file-write latency when saving a file (for example, remotely over SMB) and you are the type of person who spams Ctrl+S/Cmd+S, the file can become stuck in a dirty state forever and cannot be saved again until you force quit VS Code.

While a file is stuck in a dirty state, other files in the same workspace/folder can still be opened, edited, and saved as normal.

Closing a file that is stuck in a dirty state and reopening it within the same VS Code instance does not fix the issue: it still opens in a dirty state and still cannot be saved. Attempting to save results in no activity and no errors. “Revert File” puts the file back into a clean state, but once you change it and it becomes dirty again, you still cannot save it. While any file is stuck in a dirty state, the only way to close the workspace or VS Code is to force quit.

To recreate:

  • File System: remote SMB
  • File Size: confusingly, it’s easier to recreate while editing small files than larger files
  • File Type: any type, including “plain text”

Repeatedly make a change to and save an open file (like repeatedly tapping Enter to insert a new line while also repeatedly tapping Ctrl+S/Cmd+S - quickly).

It’s easiest for me to recreate by holding down Ctrl or Cmd the entire time and then quickly tapping S > Enter > S > Enter > S > Enter (etc.).

Eventually, the file will stay dirty in the editor and can never be saved again.

P.S. It can be very difficult to recreate sometimes, but I’m able to recreate it while editing a very small (new) file using the method above on two unrelated SMB servers. The number of times you have to spam a change+save to observe the issue varies greatly. It seems random enough that it could be a network packet loss issue, though I’m not observing any packet loss to either of the SMB servers that I’ve tested this on.

I cannot recreate the issue with a local file, only with a remote (SMB) file.

This only seems to be the case since the last update. Maybe the changes from https://github.com/microsoft/vscode/commit/19034bc492734fa2c1d34a2051d456e59d3a3951 that aren’t yet in my version of 1.64.0 (5554b12acf27056905806867f251c859323ff7e9, 2022-02-03T04:20:17.224Z) will fix the issue?

VS Code version: Code 1.64.0 (5554b12acf27056905806867f251c859323ff7e9, 2022-02-03T04:20:17.224Z)
OS version: Darwin x64 21.2.0
Restricted Mode: Yes

System Info
Item             Value
-----------------------------------------------------------------------------
CPUs             Intel® Core™ i7-9750H CPU @ 2.60GHz (12 x 2600)
GPU Status       2d_canvas: enabled
                 gpu_compositing: enabled
                 metal: disabled_off
                 multiple_raster_threads: enabled_on
                 oop_rasterization: enabled
                 opengl: enabled_on
                 rasterization: enabled
                 skia_renderer: disabled_off_ok
                 video_decode: enabled
                 webgl: enabled
                 webgl2: enabled
Load (avg)       1, 2, 2
Memory (System)  16.00GB (0.36GB free)
Process Argv     --crash-reporter-id 6048ceb8-ffc5-4877-8f7c-cee23ec9bc4a
Screen Reader    no
VM               0%

Extensions: none
A/B Experiments
vsliv368cf:30146710
vsreu685:30147344
python383:30185418
vspor879:30202332
vspor708:30202333
vspor363:30204092
pythonvspyl392cf:30425750
pythontb:30283811
pythonptprofiler:30281270
vshan820:30294714
vstes263:30335439
pythondataviewer:30285071
vscod805:30301674
pythonvspyt200:30340761
binariesv615:30325510
bridge0708:30335490
bridge0723:30353136
vsaa593cf:30376535
vsc1dst:30433059
pythonvs932:30410667
wslgetstarted:30433507
vscop341:30404997
vsrem710:30416614

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 24 (14 by maintainers)

Most upvoted comments

@alexrs84 I found it! I totally missed an issue when implementing this locking and here is what happens:

  • resource locks are stored per file handle in a map, and a file handle is just a number
  • we always have a sequence of open, write(n), close calls for each file write
  • in open we store the lock for the handle and in close we dispose it
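
To see why a lock that is never disposed blocks every later save of the same file, here is a minimal sketch of a per-resource lock. The names `ResourceLocks`, `acquire`, and `secondSaveProceeds` are hypothetical and do not come from the VS Code source; the sketch only assumes that a second writer waits until the first holder releases.

```typescript
// Hypothetical per-resource lock; not VS Code's actual implementation.
class ResourceLocks {
  // One pending promise per locked resource; it resolves on release.
  private readonly held = new Map<string, Promise<void>>();

  // Waits until the resource is free, then returns a release function.
  async acquire(resource: string): Promise<() => void> {
    let current = this.held.get(resource);
    while (current) {
      await current;
      current = this.held.get(resource);
    }
    let unlock: () => void = () => {};
    const pending = new Promise<void>((resolve) => {
      unlock = () => resolve();
    });
    this.held.set(resource, pending);
    return () => {
      if (this.held.get(resource) === pending) {
        this.held.delete(resource);
      }
      unlock();
    };
  }
}

// Reports whether a second save of the same file gets its turn.
async function secondSaveProceeds(firstReleases: boolean): Promise<boolean> {
  const locks = new ResourceLocks();
  const release = await locks.acquire("test/Test.txt"); // first save: open
  if (firstReleases) {
    release(); // first save: close disposes the lock
  }
  let acquired = false;
  void locks.acquire("test/Test.txt").then(() => {
    acquired = true;
  });
  // Give pending promise callbacks a few turns to run.
  for (let i = 0; i < 10; i++) {
    await Promise.resolve();
  }
  return acquired;
}
```

Under these assumptions, `secondSaveProceeds(true)` resolves to `true` and `secondSaveProceeds(false)` resolves to `false`. The second case matches the reported symptom: the save waits silently, with no activity and no error.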

And here is the race condition:

  • we call fs.close(fd) and then try to remove the lock from the map with that fd
  • however, after fs.close(fd) the operating system is free to assign that exact same fd to the next write call for any other file
  • as such, there is a slight chance that our lock for fd is being replaced with a new lock after fs.close(fd) but before we get a chance to clean up the lock

In code (a bit simplified):

async close(fd: number): Promise<void> {
    try {
        return await Promises.close(fd);

        // !!!!! The `fd` is free now and can be assigned by the OS for another `write` operation !!!!!!

    } catch (error) {
        throw this.toFileSystemProviderError(error);
    } finally {

        // !!!!! Here we wrongly assume that the entry in the map of file locks
        // still belongs to our file, but in fact it may be the next
        // write operation's lock that we dispose here.
        // As such, the original file lock is never disposed
        // and hangs.

        const lockForHandle = this.mapHandleToLock.get(fd);
        if (lockForHandle) {
            this.mapHandleToLock.delete(fd);
            lockForHandle.dispose();
        }
    }
}
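
One way to close this window, sketched below under simplifying assumptions, is to look up the lock before the descriptor is released and to delete the map entry only if it still points at that same lock. `SimulatedProvider`, `Lock`, `raceOnce`, and the lowest-free-number allocator are hypothetical stand-ins so that the race can be reproduced deterministically without a real file system. This is an illustration of the idea, not necessarily the change that shipped.

```typescript
// Hypothetical stand-ins for illustration; not the VS Code source.
class Lock {
  disposed = false;
  readonly owner: string;
  constructor(owner: string) {
    this.owner = owner;
  }
  dispose(): void {
    this.disposed = true;
  }
}

class SimulatedProvider {
  private readonly openFds = new Set<number>();
  readonly mapHandleToLock = new Map<number, Lock>();

  // Hands out the lowest free number, so a closed fd is reused at once.
  open(owner: string): { fd: number; lock: Lock } {
    let fd = 3;
    while (this.openFds.has(fd)) {
      fd++;
    }
    this.openFds.add(fd);
    const lock = new Lock(owner);
    this.mapHandleToLock.set(fd, lock);
    return { fd, lock };
  }

  // The fd becomes reusable immediately; completion is reported later.
  private async osClose(fd: number): Promise<void> {
    this.openFds.delete(fd);
    await Promise.resolve();
  }

  // Same ordering as the snippet above: lock looked up after the close.
  async closeBuggy(fd: number): Promise<void> {
    try {
      await this.osClose(fd);
    } finally {
      const lock = this.mapHandleToLock.get(fd);
      if (lock) {
        this.mapHandleToLock.delete(fd);
        lock.dispose();
      }
    }
  }

  // Assumed fix: capture the lock while the fd still belongs to us.
  async closeFixed(fd: number): Promise<void> {
    const lock = this.mapHandleToLock.get(fd);
    try {
      await this.osClose(fd);
    } finally {
      if (lock) {
        // Leave the entry alone if a newer open already replaced it.
        if (this.mapHandleToLock.get(fd) === lock) {
          this.mapHandleToLock.delete(fd);
        }
        lock.dispose();
      }
    }
  }
}

// Closes file A while file B is opened in the gap and receives A's fd.
async function raceOnce(useFixed: boolean) {
  const provider = new SimulatedProvider();
  const a = provider.open("A");
  const closing = useFixed ? provider.closeFixed(a.fd) : provider.closeBuggy(a.fd);
  const b = provider.open("B");
  await closing;
  return { sameFd: a.fd === b.fd, aDisposed: a.lock.disposed, bDisposed: b.lock.disposed };
}
```

In this simulation, `raceOnce(false)` leaves A's lock undisposed and disposes B's lock instead, while `raceOnce(true)` disposes only A's lock and leaves B's entry in the map for B's own close call.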

Thanks a lot for the patience and the help 🙏

dev console: slower-vscode-app-1645090832302.log

full stdout: slower-vscode-insiders-stdout.log


Regarding the SMB locks earlier: from what I understand, smbstatus -l shows all open resource handles (including resources opened with the flags r, r+, w, w+) that haven’t yet been released, regardless of any oplock/flock.

I notice in src/vs/platform/files/node/diskFileSystemProvider.ts that when writing:

  • if Windows: manual truncation, then fs.open(...) with the r+ flag
  • else: fs.open(...) with the w flag
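
A rough sketch of that branching, as described in the two bullets above, might look like the following. `planWriteOpen` and `WriteOpenPlan` are hypothetical names for illustration, not functions from diskFileSystemProvider.ts; only the r+ and w flag strings come from the description.

```typescript
// Hypothetical helper mirroring the two bullets above.
type WriteOpenPlan = { truncateFirst: boolean; flag: "r+" | "w" };

function planWriteOpen(isWindows: boolean): WriteOpenPlan {
  if (isWindows) {
    // "r+" opens an existing file for reading and writing without
    // truncating it, so the content is truncated manually beforehand.
    return { truncateFirst: true, flag: "r+" };
  }
  // "w" creates the file or truncates it to zero length on open.
  return { truncateFirst: false, flag: "w" };
}
```

Either way the handle is opened for writing, which is why it would be expected to appear on the SMB server while the write is in flight.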

So the w file handle should show up for me in the smbstatus -l list as an RDWR lock, like:

Locked files:
Pid      DenyMode   Access      R/W        Oplock      SharePath        Name            Time
--------------------------------------------------------------------------------------------------
25136    DENY_NONE  0x20087     RDWR       NONE        /mnt/vo.../www   test/Test.txt   We... 14:19:02 2022

and it should remain while vscode is still running in the busy state, unless the previous doSave() ran up to and including the fs.close() call that releases the previous w (RDWR) handle. So maybe this could help pinpoint what’s hanging: perhaps execution stops after fs.close() but before the internal lock dispose(). It probably isn’t hanging during fs.open() or fs.write(), because the former RDWR lock in SMB no longer exists; it was released.

Anyway, let me know if you need any more --verbose logs.