runtime: File writes to Windows NFS share silently fail...
AB#1285380 We are using .NET Core 3.1 on Linux.
When we write to a file on a Windows NFS server, writes silently fail.
The following program illustrates the behavior:
using System;
using System.IO;
namespace test
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length != 1)
            {
                Console.WriteLine("Usage: test <folder>\n");
                return;
            }
            var filename = Path.Combine(args[0], "file.txt");
            File.WriteAllText(filename, $"Hello World as of {DateTime.Now}");
            Console.WriteLine($"File Content: '{File.ReadAllText(filename)}'");
        }
    }
}
The folder should be a path to a directory on a Windows NFS server.
What happens is that .NET requests a shared advisory lock on the file handle and then issues a write on that handle.
While that may work against a Linux NFS server, it does not work against a Windows NFS Server, which chooses to enforce advisory locking.
The write into the local buffer cache succeeds, but the write to the back-end server fails.
.NET also does not check the return value of close(2), so it never notices anything wrong: the program above exits successfully and shows the file as empty, even though the write appeared to succeed.
The following native C program exhibits the same issue, for comparison:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/file.h>
int
main(int argc, char **argv)
{
    int fd;
    int rv;
    char buffer[22];
    char *filename;
    if (argc != 2) {
        printf("usage doit <file>\n");
        exit(1);
    }
    filename = argv[1];
    fd = openat(AT_FDCWD, filename, O_WRONLY|O_CREAT|O_CLOEXEC, 0666);
    if (fd == -1) {
        printf("openat(%s) failed %d\n", filename, errno);
        exit(1);
    }
    rv = flock(fd, LOCK_SH|LOCK_NB);
    if (rv == -1) {
        printf("flock(%s) failed %d\n", filename, errno);
        exit(1);
    }
    memset(buffer, 'a', sizeof(buffer));
    rv = write(fd, buffer, sizeof(buffer));
    if (rv != sizeof(buffer)) {
        printf("write(%s, %d) failed %d, %d\n", filename, (int)sizeof(buffer), rv, errno);
        exit(1);
    }
    printf("write(%s, %d) succeeded %d\n", filename, (int)sizeof(buffer), rv);
    rv = close(fd);
    if (rv == -1) {
        printf("close(%s) failed %d\n", filename, errno);
        exit(1);
    }
    return 0;
}
Workarounds for the user include:
- mount the NFS share with “-o nolock” so that lock requests stay local to the Linux client and are not propagated to the Windows NFS Server (Windows NLM).
This is what we are seeing inside the Linux VM, using strace:
openat(AT_FDCWD, "/home/hcsuser/mnt/14/second.txt", O_WRONLY|O_CREAT|O_CLOEXEC, 0666) = 20
fstat(20, {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
flock(20, LOCK_SH|LOCK_NB) = 0
fadvise64(20, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
ftruncate(20, 0) = 0
lseek(20, 0, SEEK_CUR) = 0
write(20, "example text to write\n", 22) = 22
flock(20, LOCK_UN) = 0
close(20)
Notice that .NET (on Linux) itself appears to issue the flock(SH), and according to strace the write succeeds with 22 bytes written.
(The write will not succeed inside Windows NFS Server, because it is incompatible with the lock mode!)
This is what we are seeing on the server (for a different run, but the same outcome):
11:32:59.3285886 AM System 4 4456 IRP_MJ_CREATE N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt NAME NOT FOUND Desired Access: Maximum Allowed, Disposition: Open, Options: Write Through, Open Reparse Point, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a
11:32:59.3287878 AM System 4 4456 IRP_MJ_CREATE N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt NAME NOT FOUND Desired Access: Maximum Allowed, Disposition: Open, Options: Write Through, Open Reparse Point, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a
11:32:59.3297571 AM System 4 4456 IRP_MJ_CREATE N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Desired Access: Maximum Allowed, Disposition: Create, Options: Write Through, Non-Directory File, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: 0, OpenResult: Created
11:32:59.3303417 AM System 4 4456 IRP_MN_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryIdInformation
11:32:59.3303721 AM System 4 4456 IRP_MJ_QUERY_SECURITY N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Information: Owner, Group, DACL, SACL
11:32:59.3303847 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryStandardInformationFile, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: False
11:32:59.3304132 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryBasicInformationFile, CreationTime: 11/10/2020 11:32:59 AM, LastAccessTime: 11/10/2020 11:32:59 AM, LastWriteTime: 11/10/2020 11:32:59 AM, ChangeTime: 11/10/2020 11:32:59 AM, FileAttributes: A 0x400000
11:32:59.3304254 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryStandardInformationFile, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: False
11:32:59.3304437 AM System 4 4456 IRP_MJ_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryFileInternalInformationFile, IndexNumber: 0x3000000000002791
11:32:59.3309122 AM System 4 4456 IRP_MJ_CREATE N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Desired Access: Maximum Allowed, Disposition: Open, Options: Write Through, Sequential Access, Non-Directory File, Open By ID, Open Reparse Point, Attributes: N, ShareMode: Read, Write, Delete, AllocationSize: n/a, OpenResult: Opened
11:32:59.3310229 AM System 4 4456 IRP_MN_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryIdInformation
11:32:59.3310509 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryBasicInformationFile, CreationTime: 11/10/2020 11:32:59 AM, LastAccessTime: 11/10/2020 11:32:59 AM, LastWriteTime: 11/10/2020 11:32:59 AM, ChangeTime: 11/10/2020 11:32:59 AM, FileAttributes: A 0x400000
11:32:59.3310618 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryStandardInformationFile, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: False
11:32:59.3310810 AM System 4 4456 IRP_MJ_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryFileInternalInformationFile, IndexNumber: 0x3000000000002791
11:32:59.3310970 AM System 4 4456 IRP_MJ_QUERY_SECURITY N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Information: Owner, Group, DACL, SACL
11:32:59.3311156 AM System 4 4456 IRP_MN_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryLinks
11:32:59.3311448 AM System 4 4456 FASTIO_LOCK N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Exclusive: False, Offset: 0, Length: 4,294,967,295, Fail Immediately: True
11:32:59.3326908 AM System 4 4456 IRP_MJ_CLEANUP N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS
11:32:59.3329113 AM System 4 4456 IRP_MJ_SET_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: SetBasicInformationFile, CreationTime: 0, LastAccessTime: 0, LastWriteTime: 11/10/2020 11:32:59 AM, ChangeTime: 0, FileAttributes: n/a
11:32:59.3331090 AM System 4 4456 IRP_MJ_SET_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: SetBasicInformationFile, CreationTime: 0, LastAccessTime: 0, LastWriteTime: 0, ChangeTime: 0, FileAttributes: n/a
11:32:59.3331472 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryBasicInformationFile, CreationTime: 11/10/2020 11:32:59 AM, LastAccessTime: 11/10/2020 11:32:59 AM, LastWriteTime: 11/10/2020 11:32:59 AM, ChangeTime: 11/10/2020 11:32:59 AM, FileAttributes: A 0x400000
11:32:59.3331622 AM System 4 4456 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryStandardInformationFile, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: False
11:32:59.3331922 AM System 4 4456 IRP_MJ_QUERY_SECURITY N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Information: Owner, Group, DACL, SACL
11:32:59.3335004 AM System 4 4456 IRP_MJ_WRITE N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt FILE LOCK CONFLICT Offset: 0, Length: 22, I/O Flags: Write Through
11:32:59.3335631 AM System 4 25472 IRP_MJ_SET_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: SetBasicInformationFile, CreationTime: 0, LastAccessTime: 0, LastWriteTime: 0, ChangeTime: 0, FileAttributes: n/a
11:32:59.3336067 AM System 4 25472 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryBasicInformationFile, CreationTime: 11/10/2020 11:32:59 AM, LastAccessTime: 11/10/2020 11:32:59 AM, LastWriteTime: 11/10/2020 11:32:59 AM, ChangeTime: 11/10/2020 11:32:59 AM, FileAttributes: A 0x400000
11:32:59.3336425 AM System 4 25472 FASTIO_QUERY_INFORMATION N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Type: QueryStandardInformationFile, AllocationSize: 0, EndOfFile: 0, NumberOfLinks: 1, DeletePending: False, Directory: False
11:32:59.3336615 AM System 4 25472 IRP_MJ_QUERY_SECURITY N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Information: Owner, Group, DACL, SACL
11:32:59.3339400 AM System 4 4456 FASTIO_UNLOCK_SINGLE N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS Offset: 0, Length: 4,294,967,295
11:32:59.3339829 AM System 4 4456 IRP_MJ_CLEANUP N:\hcsshares\03c059d4-eb0f-4b77-bcb8-14a200969455\datadata\threex\second.txt SUCCESS
Note two potential problems:
- The NFS flock(SH) is mapped to FASTIO_LOCK(Exclusive: False). This is unfortunate, since Windows will now fail writes, whereas NFS would not. flock(SH) is never appropriate for a handle where writes may occur (.NET issues this incorrectly).
- The write that failed on the server (while flushing from the Linux buffer cache) is reported as successful on the client, because write(2) only went into the buffer cache and .NET did not check the return value of close(2). There is no data in the file.
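The second problem comes from write(2) reporting success as soon as the data is in the client's cache, so a server-side rejection can only show up later. The following is a minimal C sketch of one way to surface such a deferred error: flush explicitly with fsync(2) and check every return value, including that of close(2). The helper name write_durably is hypothetical and is not part of .NET or libc. Whether a given NFS client reports the failure at fsync(2) or at close(2) has not been verified here against a Windows NFS Server.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * write_durably is a hypothetical helper used for illustration only.
 * It returns 0 on success, or the errno value of the first failing call.
 */
int write_durably(const char *path, const char *data, size_t len)
{
    assert(path != NULL && data != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0666);
    if (fd == -1)
        return errno;

    /* write(2) may report success once the data is in the local cache. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, data + done, len - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;
            int err = errno;
            close(fd);
            return err;
        }
        done += (size_t)n;
    }

    /* fsync(2) forces the cached data out and reports a failed flush. */
    if (fsync(fd) == -1) {
        int err = errno;
        close(fd);
        return err;
    }

    /* close(2) can also report a deferred write error, so check it too. */
    if (close(fd) == -1)
        return errno;

    return 0;
}
```

On a local file system a call such as write_durably("file.txt", "hello", 5) is expected to return 0. The point of the sketch is only that the caller gets a non-zero result if any step, including the flush, fails.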
For the Windows Server lock semantics, see:
Locking a portion of a file for shared access denies all processes write access to the specified region of the file, including the process that first locks the region. All processes can read the locked region.
https://docs.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-lockfileex
Recommended fix:
- Don’t ask for flock(SH) on a handle that can be used for write, since if you do, writes will always fail. The decision to use SH needs more information in .NET than it has today (it needs share mode and access rights).
- Check the return value of close(2) and throw an exception when it fails. This will typically be in a caller’s using block, if they are trying to control handle lifetimes, which is good.
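The first recommendation can be illustrated at the system-call level. The following is a minimal C sketch, assuming the lock mode is chosen from the access mode alone: a handle that can write takes LOCK_EX, and only read-only handles take LOCK_SH. The helpers lock_op_for_open_flags and open_locked are hypothetical names, and this is not the .NET implementation, which would also need to consider the requested share mode.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: pick a non-blocking flock(2) operation from the
 * access mode in the open(2) flags.
 */
int lock_op_for_open_flags(int open_flags)
{
    int access = open_flags & O_ACCMODE;
    assert(access == O_RDONLY || access == O_WRONLY || access == O_RDWR);

    /* Never take a shared lock on a handle that may be written to. */
    int op = (access == O_RDONLY) ? LOCK_SH : LOCK_EX;
    return op | LOCK_NB;
}

/*
 * Hypothetical helper: open a file and lock it with the mode chosen above.
 * Returns the descriptor, or -1 with errno set by the failing call.
 */
int open_locked(const char *path, int open_flags, mode_t mode)
{
    int fd = open(path, open_flags, mode);
    if (fd == -1)
        return -1;

    if (flock(fd, lock_op_for_open_flags(open_flags)) == -1) {
        int err = errno;   /* preserve errno across close(2) */
        close(fd);
        errno = err;
        return -1;
    }
    return fd;
}
```

The trade-off in this sketch is that other cooperating processes can no longer hold a shared lock while a writer has the file open, which may or may not match the sharing behavior a caller asked for.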
About this issue
- State: closed
- Created 4 years ago
- Comments: 35 (20 by maintainers)
I am not sure if I understand. What does close() return in the C program? A success or a failure?
@danmoseley I think that it would be just enough to change all the File.Write* APIs to use FileShare.None, instead of using the default StreamWriter ctor that uses FileShare.Read, which is translated to the shared lock:
https://github.com/dotnet/runtime/blob/9cfe6966b4b6cae834a7fb58b71cdd9097381330/src/libraries/System.Private.CoreLib/src/System/IO/StreamWriter.cs#L168
https://github.com/dotnet/runtime/blob/9cfe6966b4b6cae834a7fb58b71cdd9097381330/src/libraries/System.Private.CoreLib/src/System/IO/LegacyFileStreamStrategy.Unix.cs#L58-L59
BTW this is what File.OpenWrite already does, so I assume this would be acceptable for File.Write* methods:
https://github.com/dotnet/runtime/blob/9cfe6966b4b6cae834a7fb58b71cdd9097381330/src/libraries/System.IO.FileSystem/src/System/IO/File.cs#L263-L267
@rtestardi Even with good intentions in mind, this would be a breaking change and we don’t backport breaking changes to LTS releases. And even if we would hypothetically convince the shiproom to do it, your customers would still need to update their .NET Core installation. Not to mention how much time it would take.
Instead of this, we would like to ask you to ask them to use a simple workaround:
I hope that copying eight lines of code and getting unblocked immediately is a good trade-off. In the meantime, we are going to revisit this approach and hopefully change it for .NET 6 (our next major release).