runtime: Define and add `FileSystemError` enum and matching property to IOException
(similar to dotnet/corefx#34220)
Scenario: I want to create and write some default data into a file unless it already exists.
A naive implementation would be:
if (!File.Exists(path))
{
using (var fileStream = File.Open(path, FileMode.CreateNew))
{
// ...
}
}
The problem with this is that I’m accessing the file system at 2 points in time. As @jaredpar has taught me, File.Exists
is evil. There’s no guarantee that because File.Exists
returned false, the file still won’t exist when calling File.Open
.
To be robust, we should just call File.Open
and catch the exception it throws in case the file already exists. The problem is that it throws System.IO.IOException
with a message of “The file already exists”. There is no specific exception type for this scenario. At first it would seem that the only thing we can do is catch the exception depending on its message string (which is a terrible idea), but luckily, there is a specific HResult for this failure, leaving us with:
Stream fileStream = null;
try
{
fileStream = File.Open(path, FileMode.CreateNew);
}
catch (IOException e) when (e.HResult == -2147024816) // FILE_EXISTS
{
}
if (fileStream != null)
{
using (fileStream)
{
// ...
}
}
This works OK but makes the code seem unreadable and maybe even brittle because we’re depending on a magic constant that comes somewhere from the Windows API. I’m not sure if this code works outside of Windows at all, but even if it does, it’s definitely not obvious.
Please add a dedicated exception type for this kind of failure.
About this issue
- Original URL
- State: open
- Created 6 years ago
- Reactions: 15
- Comments: 63 (40 by maintainers)
I find it hard to believe that C# programmers would expect the API to follow the patterns used in C and Go rather than C++ and Java.
I don’t think we want to change what the value of HResult is; even if we pick the existing Win32 values (which is rather limiting and do not necessarily match to sufficient distinct causes) that would break anyone expecting the Unix code today. Plus, it’s a poor name for a cross platform code. On top of that, it’s weakly typed as
int
rather than a nice enum.This would require an API proposal:
to add a property to
IOException
named perhapsError
orIOError
(I thinkFileSystemError
is a poor choice as, at least on Windows, it did not necessarily originate with a file system operation.). It would have the type of the enum in 2.to add a new enumeration named eg
IOError
and populate it with several useful entries perhaps based on my analysis above. We do not need to think of them all now, it’s not a breaking change to add more later, but it would need a (brief) review so better to do it now.@Neme12 any interest in doing this? The process is here and you would append to your top post because that makes it easier on the review committee.
That sounds fine to me too. I do think something should be done, as we have made it worse when we made it cross platform.
Are they complementary?
This is an adjustment to help all the existing IO code written using exceptions be a bit more robust, with the addition of little or no code in CoreFX beyond the types - just changing some places we throw IOException to throw something more specific. It is just tuning an existing exception hierarchy.
The other proposal is a new approach to IO, which will need substantial design, establishing a new set of abstract IO codes, etc, … It will take time, and does not help existing code that is not worth rewriting to the try model.
https://github.com/dotnet/corefx/issues/31917 is related to this. The current file I/O APIs are broken for “file already exist” and other similar situations that the programs often want to recover from gracefully.
We can fix it by either: (1) Introducing more fine grained exceptions, like what is proposed here. This option encourages using exception handling for control flow that is not desirable. And/or (2) Introducing new set of non-throwing APIs, like what is proposed in dotnet/corefx#31917. The advantage of this option is that it is more robust better performing design that matches library capabilities of other programming languages (C, Go, …).
My vote would be to spend energy on (2). It is the more forward looking option.
We tend to form a strong opinion about the preferred style during the API design and add the new APIs around it. We do not generally add duplicate APIs to support other styles. Having one preferred style has a lot of value for platform usability. Also, the preferred styles tend to change over time as the new requirements or new fundamental features show up.
I think these are the questions to answer from the API design point of view:
Error codes do not imply HResults. .NET has number of domain-specific error code enums actually, e.g.
SerialError
,SocketError
orWebSocketError
. Maybe we should consider addingFileSystemError
enum and matching property to IOException. The error code enum is much more lightweight than fine grained exception hierarchy. And it would work nicely for both exception-throwing and error-code returning APIs and minimize duplication at the same time.I would expect that migrating the code to use error codes for file I/O would be much easier on average than migrating the code to use Spans. Opening a file and similar operations are usually scoped to a single method that one can change easily. On the other hand, Span is about passing data around. It is not unusual for Spans migration to involve non-trivial refactoring (changing many method arguments from arrays to Spans, etc.).
My point here was: users would also expect do to be able to the same thing with the throwing APIs as opposed to having to use the non-throwing one. Not that I don’t want the non-throwing APIs to exist. But I think the throwing ones will always be seen as more idiomatic in .NET. It’s definitely not something that people will consider to be obsolete.
@jkotas I don’t think this is quite the same as unmanaged pointers vs span. With span, you’ll be able to use the same code as you would have if the API took an unmanaged pointer. It was worthwhile to wait for the new API because it was a generalization of the proposed one.
New non-throwing IO APIs on the other hand introduce a completely new paradigm. It’s not a generalization. It’s an alternative way of doing the same thing and it’s somewhat a matter of style. I also believe that throwing and non-throwing alternatives should have the same feature set. Having the throwing versions lack certain exceptions for some very specific error codes, while having exceptions for the majority of other ones seems like it would be surprising to any user who isn’t familiar with the fact that the non-throwing APIs are newer or that we’re trying to encourage them to use the non-throwing API (which won’t be obvious at all unless you actually make the exception hierarchy
[Obsolete]
).Furthermore, it’s not as simple as saying “do not use exceptions for control flow”. In this case, you might consider “file already exists” to be part of control flow since it’s an expected situation, but other kinds of IO failures would not be as expected and might be handled in a try-catch higher up the call stack. Exceptions would be more convenient for that use case.
I do not think they are complementary. The value of (1) is going to be lost once we have (2). Once we have (2), the fine-grained exception hierarchy will be obsolete - the guidance is going to use the new APIs and to not use exception handling for control flow.
I agree that (2) is more work than (1).
We have been in similar situations before. For example, we used to have number of suggestions for new API variants that take arrays or unmanaged pointers. We could have spent energy on adding those and it would certainly be an incremental improvement on top of the existing designs. We have not done that. Instead, we have spent our energy on adding Span that was a superior design. API version of the Innovator’s Dilemma.
This makes sense to me, right now we hand the user either 0x80070050 (ERROR_FILE_EXISTS) or 0x10014 (EEXIST). Same for the linked issue, the user has to look for ERROR_DIR_NOT_EMPTY or ENOTEMPTY or whatever it may be on any future OS.
That’s doubly unfortunate because .NET is designed around exceptions not error codes. It is clumsy to handle both in the same method. Also, it forces platform specific code - and even on one platform several error codes may map to a single condition for example removing a non empty directory can give ENOTEMPTY or EEXISTS just on Linux. If the caller is not diligent, they may use the message field instead of the code, which will break when localized (or, more often if the OS is localized or changes the text).
If we do anything here, and it would be nice, it should not be bit by bit. We should do an audit to come up with a complete proposal, examining all the places where we overload IOException. In a quick glance it is used for include
Perhaps one reason IOException is so broad may have been a concern that it is easy to break code inadvertently by changing the exception type, if the differences are fine grained. (I know this is the reason that XmlException is used for so many distinct cases.) This does not seem like a good reason, with good unit tests and a stable platform. PathTooLongException, FileNotFoundException, DirectoryNotFoundException, DriveNotFoundException are all good stable fine grained exception types, and all derive from IOException. Also, if code depends on the error code, they already depend on the “type” of error.
As an aside we should also have a first class field for the path, if any. Right now that also has to be parsed out of the message if it’s needed and not already known.
@stephentoub @JeremyKuhne @jkotas
Question: Why not just catch
IOException
. Why would you need to differentiate between this failure and other IO failures?Answer: If the file already exists, that’s fine and we don’t need to do anything in that scenario. But in case of other IO failures, we might want to show an error message to the user, a “retry” dialog or have some other fallback logic.
@danmosemsft I think I should be able to do item (1) and (2) from that in given timeline for sure 😃 Already working on it.
@MarcoRossignoli yep, but this is a bunch of work up front to define and reach consensus on the cases/values for API review. I made a good start above.
Maybe the property should be called
IOErrorCode
to matchSocketException.SocketErrorCode
,WebSocketException.WebSocketErrorCode
and to a lesser extentHttpResponseMessage.StatusCode
.For what it’s worth, this is something that shows up relatively often in .NET and it can be a little frustrating. Many APIs throw a generic exception type with no way to determine the failure other than checking the exception message or if you’re lucky, an HResult.
Please consider this in future APIs especially in an area like IO where this is not a contract exception - these exceptions are meant to be caught and dealt with. There’s no way of knowing whether a file system action would fail in advance.