roslyn: CLR bug with static variable in struct could be warned by the compiler

The following code compiles, but throws a TypeLoadException when executed.

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            var x = MyStruct.Empty;
        }

        public struct MyStruct
        {
            public static readonly MyStruct? Empty = null;
        }
    }
}

Since Empty is static, this is not a cyclic struct layout issue. It would be nice if the compiler produced warnings that tells you that you are likely to run into this CLR bug at runtime

[Note from @gafter this is due to CLR bug https://github.com/dotnet/coreclr/issues/4049. See also https://github.com/dotnet/coreclr/issues/7957 and https://github.com/dotnet/coreclr/issues/22553 for other situations where the warning would be helpful]

About this issue

Original URL
State: open
Created 8 years ago
Reactions: 2
Comments: 27 (8 by maintainers)

Most upvoted comments

@coderealm, your analysis continues to be specious.

First, Jared’s article does not ever say that the behaviour of the CLR’s type loader is as specified or by design; rather, he notes that he did an experiment – as I did – to determine its behaviour. The question of whether it is a bug in the CLR must be answered by examination of the specification of the CLR in order to determine whether this behaviour is within the by-design parameters of the runtime, or an error. You cannot determine what was intended by examining behaviour! The definition of a bug is that it is unintended behaviour.

Second, the quote you make from Jared’s blog is a footnote on the subject of why definite assignment rules for structs are as they are. It has nothing to do with the issue at hand.

Third, let’s suppose for the sake of argument that the behaviour of the CLR is correct. It is deeply unfortunate when the C# language permits an unverifiable, crashing program to be compiled cleanly. Ideally we would like the C# language to match the by-design semantics of the underlying type system implementation when possible at a reasonable cost. If the CLR behaviour is correct – and I have no evidence from any specification either way – then the question immediately raised is whether this program compiling cleanly is a flaw in the C# specification.

I would argue that if the CLR is correct here – and again, I have no evidence one way or the other – then that motivates adding the same rule to C#, and making this program formally illegal, so that the compiler catches the error rather than the runtime.

You are reasoning from lack of evidence, and that is bad reasoning. The evidence that we have is that a particular program compiles but crashes at runtime. Plainly this is a bad situation to be in. Now, most of the time, when a program compiles but crashes at runtime, it is the fault of a programmer making an unwarranted assumption that is not policed by the type system – like a bad cast, or a null dereference. In this case, the program is simple, it seems entirely plausible, and there is no clear rule in the C# language that would imply that this program is not verifiable; there’s no “unsafe” anywhere, or any such thing.

C# was not designed to be a “gotcha” language; let us consider the possibilities from the language design perspective. The possibilities are:

This plausible program is correct; the CLR has a bug in its type loader.
The CLR type loader is correct; the C# specification and its compiler could be making this an error at compile time with a clear diagnostic error message that informs the developer of the subtle error they’ve stumbled upon.
The CLR type loader is correct and the C# language contains a “gotcha”. A perfectly plausible program is unreliable, and moreover unpredictably unreliable. Remember, this happens the first time a type is used; a mission-critical program might have been running for a day before that type happens to be loaded and the program fails.

The C# team tries very hard to stay out of that third bucket, as they should. You can’t simply make the argument that the programmer was wrong to write code like this. Either the code is right, in which case the CLR should be fixed, or the code is wrong, in which case the compiler shouldn’t allow it. Allowing plausible code to crash horribly is a terrible situation to be in.

ericlippert on Apr 1, 2016

There are numerous errors in much of the analysis so far; I will not address those errors in detail. Going forward, it would be better to analyse a more minimal repro:

struct N<T> {}
struct M { public N<M> E; }
class P { static void Main() { var x = default(M); } }

This compiles without error, but crashes with a type load exception at both runtime and when running PEVerify on it.

From this more minimal repro we can see:

(1) The exception is not triggered by use of E. It is triggered by any attempt to access the type M. (As one would expect in the case of a type load exception.)

(2) The exception reproduces whether the field is static or instance, readonly or not; this has nothing to do with the nature of the field.

(3) The exception has nothing whatsoever to do with “invocation”; nothing is being “invoked” in the minimal repro.

(4) The exception has nothing whatsoever to do with the member access operator “.”. It does not appear in the minimal repro.

(5) The exception has nothing whatsoever to do with nullables; nothing is nullable in the minimal repro.

So, where then can we lay the blame for the bug? One or more of:

The CLR type loader, for crashing.
The C# specification, for allowing a program that crashes
The C# implementation, for emitting a program that crashes, or
The actual program itself

must be wrong, but which one? Were I asked to guess, I would guess that the CLR type loader is at fault here; it seems to me that it ought to be able to handle this type topology. However it may be the case that the loader is correct to reject this program at runtime, and it is the C# specification and its implementation which ought to detect this error. This needs more investigation and I don’t have the time today to go spec diving in the CLR.

ericlippert on Mar 31, 2016

Had a chat with the CLR team this morning about this issue and gained clarity on the following items:

This program is indeed legal according to the CLI spec.
The Desktop CLR and CoreCLR implementations have a limitation that prevent them from allowing this correct program. Other runtimes, such as ProjectN, correctly handle this program.

This is a tricky type of issue to handle in the compiler. This program is correct to spec and has known runtimes where it can execute successfully. Yet Desktop / CoreClr programs are far more prolific than ProjectN ones at this time. Hence the compiler is allowing code that will fail to execute in the majority of cases. That’s not a position we like to put ourselves in.

There are a couple option available to us:

Make it an error (go back to 5.0 behavior). This is a back compat break which gives it an incredibly high bar and hence almost certainly won’t be done.
Make it a warning. Still a chance of back compat break here but can be mitigated by disabling the warning. Even so new warning still have a moderate bar to meet.
Do nothing.

At this point I think we’re going to wait and see how many more instances of this behavior show up. If it’s limited to a single case, namely this bug, then I’d be inclined to go with 3. If more instances pop up though we’ll look into doing 2.

jaredpar on Apr 1, 2016

The type load error is an implementation limitation of some CLR implementations. It is not required by the specification, and some CLR implementations do not have this bug. We are considering whether to add a warning for this case so that you can suppress the warning if you happen to be targeting a runtime that does not have this bug.

gafter on Apr 1, 2016

Here is the bug tracking the underlying CLR issue:

https://github.com/dotnet/coreclr/issues/4049

jaredpar on Apr 1, 2016

@peter-perot Well done. Excellent find. I have thrown in the towel. I personally emailed @jaredpar and I appreciate very much for his swift response and clearing the air. Well done you Peter. Thank you @jaredpar

coderealm on Apr 1, 2016