PowerShell: `0b11111111` returns `-1` instead of `255`
Prerequisites
- Write a descriptive title.
- Make sure you are able to repro it on the latest released version
- Search the existing issues.
- Refer to the FAQ.
- Refer to Differences between Windows PowerShell 5.1 and PowerShell.
Steps to reproduce
Commands:
[convert]::ToString(255,2)
0b11111111
[convert]::ToInt32(11111111,2)
Output:
11111111
-1
255
Expected behavior
11111111
255
255
Actual behavior
0b11111111 returns -1
Error details
No Errors
Environment data
Name Value
---- -----
PSVersion 7.4.0-preview.1
PSEdition Core
GitCommitId 7.4.0-preview.1
OS Linux 6.1.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 14 Feb 2023 22:08:08 +0000
Platform Unix
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1
WSManStackVersion 3.0
Also reproduces on 7.3.2.
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 21 (10 by maintainers)
WG-Engine discussed this today. We agree the current behaviour is unintuitive and it may be worth the breaking change in this instance.
We suggest the parsing should be modified to line up functionally with the hexadecimal parsing: all literals up to 32 bits wide should be considered effectively "unsigned", so that things like `0b11111111` do not resolve to `-1`, and the only accepted sign bits should be 32, 64, and the wider ones that are currently accepted by the parsing.
Implementation-wise this should be a relatively simple change, just removing these two `switch` cases from the binary parsing utility method: https://github.com/PowerShell/PowerShell/blob/04b93af4ae333013e4fe00013a196268d314cb00/src/System.Management.Automation/engine/Utils.cs#L188-L189
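For reference, the hexadecimal behaviour that the proposal wants binary literals to match can be sketched as follows. This is a minimal illustration that assumes PowerShell 7 or later; the variable names are made up for the example and do not come from the PowerShell source.

```powershell
# Unsuffixed hex literals: only a full 8-digit or 16-digit literal
# with the high bit set is treated as negative.
$byteWide = 0xff                # 255, stays positive as [int32]
$wordWide = 0xffff              # 65535, stays positive as [int32]
$full32   = 0xffffffff          # -1 as [int32]
$full64   = 0xffffffffffffffff  # -1 as [int64]

# Print each value together with the type the parser chose.
foreach ($value in $byteWide, $wordWide, $full32, $full64) {
    '{0} [{1}]' -f $value, $value.GetType().Name
}
```

Under the proposal, `0b11111111` and a 16-digit binary literal would behave like `$byteWide` and `$wordWide` here rather than resolving to `-1`.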
@mklement0 thanks for filling all that in.
I thought this was the crucial part.
In other words, we don't know if 1111 1111 was a signed byte (-1), an unsigned byte (+255), or a positive number written with 16 or 32 bits with the leading zeros dropped. Similarly, we don't know if ffff was -1 in an int16 or 65535 in an int32. Even if we had leading zeros, we wouldn't know if the original was the signed or unsigned version of a type.
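The ambiguity can be made concrete with the .NET `[Convert]` methods, which take the target width from the method name rather than inferring it from the digit count. This is only a sketch; the variable names are illustrative.

```powershell
# The same digits yield different values depending on the width we assume.
$bits = '11111111'
$asSByte = [Convert]::ToSByte($bits, 2)   # 8 bits fill an sbyte, so the high bit is the sign bit
$asByte  = [Convert]::ToByte($bits, 2)    # unsigned 8-bit reading
$asInt32 = [Convert]::ToInt32($bits, 2)   # treated as the low 8 bits of a 32-bit number

$hex = 'ffff'
$asInt16FromHex = [Convert]::ToInt16($hex, 16)   # 16 bits fill an int16
$asInt32FromHex = [Convert]::ToInt32($hex, 16)   # low 16 bits of a 32-bit number

"sbyte: $asSByte, byte: $asByte, int32: $asInt32"
"int16: $asInt16FromHex, int32: $asInt32FromHex"
```

Only the caller knows which width was intended, which is why a literal parser has to pick a convention.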
However, `0xffff` returns 65535: the lack of leading zeros doesn't matter for hex but does for binary. 8-digit hex numbers still follow two's complement, so I can see that this could be misunderstood.
TBH:
(1) I could argue for the behaviour of the binary conversion if 1-8 digits returned an `sbyte` and 9-16 digits returned an `int16`, but they don't: they return `int32`s; the hex conversion also returns `int32`, but it assumes 2 and 4 digits are the non-zero part of a 32-bit number, unless they are suffixed to say otherwise.
(2) Even if shorter numbers became smaller types, binary and hex would still be inconsistent, and I wouldn't argue for having both ways of working. So net, 0b handling just seems wrong.
(3) 0b notation is not in Windows PowerShell, so although changing it would be a breaking change, there aren't scripts out there which have depended on current behaviour for ages. (That's another argument for leaving 0x alone and changing 0b to match it.)
I should have clarified that I was merely citing the documentation and showing a workaround, but let me weigh in on what I think of this behavior:
It is certainly surprising, and the true rules are obscure (and currently incorrectly documented, as @rhubarb-geek-nz points out) and inconsistent with the rules for hex literals.
Let me try to document the current de facto behavior (inferred from experiments - haven’t looked at the source code):
- Binary literals without a type suffix or with signed integer-type suffixes (`y`, `s`, `l`, `d`, `n`) that have `8`, `16`, `32`, or `64` binary digits (bits) and whose first digit is `1` (the high bit) become negative numbers: they are treated as bit patterns of a signed integer type of that width (see two's complement).
  - `0b10000000` is `-128`
  - You would think that the resulting number types map onto the signed types `[sbyte]`, `[int16]` (`[short]`), `[int32]` (`[int]`) and `[int64]` (`[long]`), but binary literals with `8` or `16` digits are converted to `[int32]` too.
- `[int32]` and `[int64]` are also the automatically chosen types used for non-suffixed hex literals up to 64 bits. `32`-bit values and `64`-bit values with the high bit set are negative by default (e.g., `0xffffffff` and `0xffffffffffffffff`), and smaller numbers can only be negative with an explicit type suffix for a smaller signed type (e.g., `0xffY` (`[sbyte]`) and `0xffffS` (`[short]` == `[int16]`)).
- Type suffixes for unsigned integer types prevent interpretation as a negative number, both for binary and hex literals: `uy` (`[byte]`), `us` (`[ushort]` == `[uint16]`), `u` (`[uint32]`, widened to `[uint64]` if needed), and `ul` (`[ulong]` == `[uint64]`).
- Beyond 64 bits, number literals fundamentally only work with suffixes `d` (`[decimal]`) and `n` (`[bigint]`), both of which are signed types, so the non-suffixed negativity rules apply to them as well, plus to every additional `32` bits of width, up to `96` for `[decimal]`, and open-ended for `[bigint]`, but for binary literals only. The only way to prevent negative numbers for these types is to prepend `0`; e.g., `0b0111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111D` (`[decimal]`, `79228162514264337593543950335`, which is `[decimal]::MaxValue`).
  - In hex literals, `d` (`[decimal]`) cannot be used, because `d` is also a valid hex digit.
  - `n` (`[bigint]`): `0xffffffffffffffffffffffffN` is `79228162514264337593543950335`, whereas the equivalent binary literal, `0b111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111N`, is `-1` (and would require a leading `0` to be positive).
- Finally, there are outright bugs:
  - Non-suffixed binary literals with `96`, `128`, … digits should break (the widest auto-selected type is `[int64]`, which cannot hold such numbers), but the parsing quietly overflows and returns `-1` of type `[int32]`; e.g.: `0b111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111`
  - […] `-1` of the requested type; e.g.: `0b1111111111111111111111111111111111111111111111111111111111111111y` and `0b1111111111111111111111111111111111111111111111111111111111111111s`
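The suffix and leading-zero rules above suggest a few ways to write an 8-bit pattern so that its sign does not depend on the digit count. The following is a sketch that assumes PowerShell 7 or later (binary literals and the `y`/`uy` suffixes do not exist in Windows PowerShell); the variable names are illustrative.

```powershell
# Unsigned suffixes: the high bit is never read as a sign bit.
$asUInt32 = 0b11111111u     # [uint32]
$asByte   = 0b11111111uy    # [byte]

# A leading zero makes the literal 9 digits wide, so no sign bit is assumed.
$padded   = 0b011111111     # [int32]

# Hex literals go negative below 32 bits only with an explicit signed suffix.
$hexSByte = 0xffy           # [sbyte]
$hexInt16 = 0xffffs         # [int16]

foreach ($value in $asUInt32, $asByte, $padded, $hexSByte, $hexInt16) {
    '{0} [{1}]' -f $value, $value.GetType().Name
}
```

The first three forms all evaluate to 255, while the two suffixed hex literals evaluate to -1 in their respective signed types.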