roslyn: C# Compiler is not consistent with the CLR for the underlying representation of bool
Issue
Today the C# compiler does not emit code handling for bool that is consistent with the underlying representation of bool in the CLR.
The C# specification states:
The bool type represents boolean logical quantities. The possible values of type bool are true and false.
The spec also indicates that bool is 1 byte in size; specifically, the result of the sizeof(bool) expression is 1.
While the ECMA 335 specification states:
A CLI Boolean type occupies 1 byte in memory. A bit pattern of all zeroes denotes a value of false. A bit pattern with any one or more bits set (analogous to a non-zero integer) denotes a value of true. For the purpose of stack operations boolean values are treated as unsigned 1-byte integers (§III.1.1.1).
This can lead to confusion about how bool values are handled and cause inconsistencies in edge cases (see https://github.com/dotnet/coreclr/pull/16138#discussion_r165256495 for one such thread).
Proposal
It would be good to determine:
- Should the spec be updated?
  - The spec would explicitly list the expected values of true/false so that two implementations don’t behave differently
- Can the compiler be updated?
  - Given that we are emitting for the CLR, we would make our handling match the expected representations for the two boolean values (0/not 0) of the underlying platform
Current Behavior
The majority of the behaviors below actually match the CLR expectation that a bool can be more than just 0 or 1. However, some of the behaviors (such as &&) do not match this expectation and can cause issues when interoperating with any code that assumes otherwise (generally some kind of interop or unsafe code).
!value
ldarg.1
ldc.i4.0
ceq
left == right
ldarg.1
ldarg.2
ceq
left != right
ldarg.1
ldarg.2
ceq
ldc.i4.0
ceq
left & right
ldarg.1
ldarg.2
and
left | right
ldarg.1
ldarg.2
or
left ^ right
ldarg.1
ldarg.2
xor
left && right
ldarg.1
ldarg.2
and
left || right
ldarg.1
ldarg.2
or
value ? "true" : "false"
ldarg.1
brtrue.s
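To make the edge case concrete, here is a minimal C# sketch (not part of the original issue) that produces a bool whose underlying byte is 2 by overlaying a byte field on a bool field in an explicit-layout struct. The names BoolByte, BoolEdgeCases, FromByte, AreEqual, And and Describe are hypothetical and exist only for illustration. The sketch assumes the compiler emits the IL patterns listed above for ==, & and the conditional operator.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper type: overlays a bool and a byte at the same offset,
// so writing the byte changes the bit pattern read back through the bool.
[StructLayout(LayoutKind.Explicit)]
struct BoolByte
{
    [FieldOffset(0)] public bool Bool;
    [FieldOffset(0)] public byte Byte;
}

static class BoolEdgeCases
{
    // Returns a bool whose single byte holds exactly 'bits'.
    public static bool FromByte(byte bits)
    {
        var overlay = new BoolByte();
        overlay.Byte = bits;
        return overlay.Bool;
    }

    // Thin wrappers so each operator is compiled against bool arguments,
    // mirroring the left/right/value shapes in the listings above.
    public static bool AreEqual(bool left, bool right) => left == right;
    public static bool And(bool left, bool right) => left & right;
    public static string Describe(bool value) => value ? "true" : "false";

    static void Main()
    {
        bool one = FromByte(1); // the representation C# itself produces for true
        bool two = FromByte(2); // non-zero, so also true under ECMA 335

        Console.WriteLine(Describe(one));
        Console.WriteLine(Describe(two));
        Console.WriteLine(AreEqual(one, two));
        Console.WriteLine(And(one, two));
    }
}
```

Given the IL shown above, one would expect both values to take the "true" branch of the conditional (brtrue only checks for non-zero), while == (ceq) and & (and) would likely both report False, because 1 and 2 are different integers and share no set bits. The exact results are implementation-specific, which is the gap this issue asks the spec or the compiler to close.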
About this issue
- State: closed
- Created 6 years ago
- Comments: 23 (21 by maintainers)
Commits related to this issue
- Document representation of boolean values Fixes #24652 — committed to gafter/roslyn by gafter 6 years ago
- Document representation of boolean values (#26609) Fixes #24652 — committed to dotnet/roslyn by gafter 6 years ago
@Joe4evr @gafter I don’t think it would make sense to require the unsafe keyword anyway. Explicit layouts are completely verifiable IL as long as you’re not overlaying references (type safety violation) or private fields of structs (visibility violation). Blitting simple primitives on top of each other doesn’t violate either of these conditions.

Only if you limited such code to C#.
It is entirely possible for C# to be consuming code produced by another language where the bool value conforms to the CLR definition but doesn’t line up with the C# implementation. In which case updating the spec would cause the behavior in that scenario to be well-defined instead of undefined or implementation-specific.