PowerShell: `0b11111111` returns `-1` instead of `255`

Steps to reproduce

[convert]::ToString(255,2)
0b11111111
[convert]::ToInt32(11111111,2)

11111111
-1 
255

Expected behavior

11111111
255
255

Actual behavior

0b11111111 returns -1

Error details

No Errors

Environment data

Name                           Value
----                           -----
PSVersion                      7.4.0-preview.1
PSEdition                      Core
GitCommitId                    7.4.0-preview.1
OS                             Linux 6.1.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 14 Feb 2023 22:08:08 +0000
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0


The same behavior also reproduces on 7.3.2.

About this issue

State: closed
Created a year ago
Comments: 21 (10 by maintainers)

Most upvoted comments

WG-Engine discussed this today. We agree the current behaviour is unintuitive and it may be worth the breaking change in this instance.

We suggest the parsing should be modified to line up functionally with the hexadecimal parsing: all literals narrower than 32 bits should be considered effectively “unsigned”, so that things like 0b11111111 do not resolve to -1, and the only positions treated as a sign bit should be bit 32, bit 64, and the wider positions that the parsing currently accepts.

Implementation-wise, this should be a relatively simple change: just remove these two switch cases from the binary parsing utility method:

https://github.com/PowerShell/PowerShell/blob/04b93af4ae333013e4fe00013a196268d314cb00/src/System.Management.Automation/engine/Utils.cs#L188-L189

vexx32 on Apr 3, 2023
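
The rule proposed above can be approximated outside the parser. What follows is a minimal sketch, not the change to Utils.cs: `ConvertFrom-BinaryDigits` is a hypothetical helper name, and it relies on the fact that .NET's `[Convert]::ToInt32(string, 2)` and `[Convert]::ToInt64(string, 2)` only produce a negative result when all 32 or all 64 bit positions are supplied and the top one is set. Widths beyond 64 bits (the suffixed `d` and `n` cases) are not modelled.

```powershell
# Hypothetical helper, not part of PowerShell: models "binary literals behave
# like hex literals" for inputs of 1 to 64 binary digits.
function ConvertFrom-BinaryDigits {
    param(
        [Parameter(Mandatory)]
        [ValidatePattern('^[01]{1,64}$')]
        [string] $Digits
    )

    if ($Digits.Length -le 32) {
        # Only a full 32-digit pattern can reach the Int32 sign bit.
        return [Convert]::ToInt32($Digits, 2)
    }

    # 33 to 64 digits: only a full 64-digit pattern can reach the Int64 sign bit.
    return [Convert]::ToInt64($Digits, 2)
}

ConvertFrom-BinaryDigits '11111111'   # 255
ConvertFrom-BinaryDigits ('1' * 32)   # -1
ConvertFrom-BinaryDigits ('1' * 33)   # 8589934591
```

Under this model an 8-digit or 16-digit pattern is always positive, which mirrors how 0xff and 0xffff already behave.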

@mklement0 thanks for filling all that in.

I thought this was the crucial part.

These calls create representations strictly based on the bit pattern of the input number, with leading 0s omitted. That is, the input type is lost in the process, including whether or not the input number was negative.

In other words, we don’t know if 1111 1111 was a signed byte (-1), an unsigned byte (+255), or a positive number written with 16 or 32 bits with the leading zeros dropped. Similarly, we don’t know if ffff was -1 in an int16 or 65535 in an int32. Even if we had leading zeros, we wouldn’t know if the original was the signed or unsigned version of a type.

However, 0xffff returns 65535: the lack of leading zeros doesn’t matter for hex but does for binary. 8-digit hex numbers still follow two’s complement, so I can see that this could be misunderstood.

PS>  -0xff      
-255

PS>  -0xffff
-65535

PS>  -0xffffffff
1

TBH:
(1) I could argue for the behaviour of the binary conversion if 1-8 digits returned an sbyte and 9-16 digits returned an int16, but they don’t: they return int32s. The hex conversion also returns int32, but it assumes 2 and 4 digits are the non-zero part of a 32-bit number, unless they are suffixed to say otherwise.
(2) Even if shorter numbers became smaller types, binary and hex would still be inconsistent, and I wouldn’t argue for having both ways of working. So net, 0b handling just seems wrong.
(3) 0b notation is not in Windows PowerShell, so although changing it would be a breaking change, there aren’t scripts out there which have depended on the current behaviour for ages. (That’s another argument for leaving 0x alone and changing 0b to match it.)

jhoneill on Mar 8, 2023
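
To check the type claims in the comment above, each literal can be printed together with the type it was parsed as. This is only an illustrative sketch, and the variable names are arbitrary. The hex results follow from the rules described in this thread. The binary line carries no expected value because its result depends on the PowerShell version: on the versions reported in this issue it is -1.

```powershell
# Unsuffixed hex literals: value and the type the parser chose.
$hexLiterals = 0xff, 0xffff, 0xffffffff
foreach ($value in $hexLiterals) {
    '{0} [{1}]' -f $value, $value.GetType().Name
}
# 255 [Int32]
# 65535 [Int32]
# -1 [Int32]

# Unsuffixed 8-digit binary literal (needs PowerShell 7 or later).
# The value printed here is version-dependent, so none is asserted.
$binaryLiteral = 0b11111111
'{0} [{1}]' -f $binaryLiteral, $binaryLiteral.GetType().Name
```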

I should have clarified that I was merely citing the documentation and showing a workaround, but let me weigh in on what I think of this behavior:

It is certainly surprising, and the true rules are obscure (and currently incorrectly documented, as @rhubarb-geek-nz points out) and inconsistent with the rules for hex literals.

Let me try to document the current de facto behavior (inferred from experiments - haven’t looked at the source code):

Binary literals without a type suffix or with signed integer-type suffixes (y, s, l, d, n) that have 8, 16, 32, or 64 binary digits (bits) and whose first digit is 1 (the high bit) become negative numbers: they are treated as bit patterns of a signed integer type of that width (see two’s complement).
- Note that this means that the high bit alone decides whether the resulting number is negative; e.g., 0b10000000 is -128
You would think that the resulting number types map onto the signed types [sbyte], [int16] ([short]), [int32] ([int]) and [int64] ([long]), but binary literals with 8 or 16 digits are converted to [int32] too.
- Hex literals:
  - [int32] and [int64] are also the automatically chosen types used for non-suffixed hex literals up to 64 bits.
  - However, the negativity rules differ: only 32-bit values and 64-bit values with the high bit set are negative by default (e.g., 0xffffffff and 0xffffffffffffffff), and smaller numbers can only be negative with an explicit type suffix for a smaller signed type (e.g., 0xffY ([sbyte]) and 0xffffS ([short]==[int16])).
Type suffixes for unsigned integer types prevent interpretation as a negative number, both for binary and hex literals: uy ([byte]), us ([ushort]==[uint16]), u ([uint32], widened to [uint64] if needed), and ul ([ulong]==[uint64])
- Beyond 64 bits, number literals fundamentally only work with suffixes d ([decimal]) and n ([bigint]), both of which are signed types, so the non-suffixed negativity rules apply to them as well, and additionally at every further 32 bits of width, up to 96 for [decimal] and open-ended for [bigint], but for binary literals only. The only way to prevent negative numbers for these types is to prepend 0; e.g., 0b0111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111D ([decimal], 79228162514264337593543950335, which is [decimal]::MaxValue)
  - Hex literals:
    - Type suffix d ([decimal]) cannot be used, because d is also a valid hex digit.
    - With type suffix n ([bigint]):
      - Hex literals in the beyond-64-bit range are always positive, unlike binary literals; e.g., 0xffffffffffffffffffffffffN is 79228162514264337593543950335, whereas the equivalent binary literal, 0b111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111N, is -1 (and would require a leading 0 to be positive).
Finally, there are outright bugs:
- Non-suffixed binary literals with 96, 128, … digits should break (the widest auto-selected type is [int64], which cannot hold such numbers), but the parsing quietly overflows and returns -1 of type [int32]; e.g.: 0b111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
- Similarly, binary literals that result in negative numbers but are suffixed with an explicit type that is too small quietly overflow (-1 of the requested type); e.g.:
  - 0b1111111111111111111111111111111111111111111111111111111111111111y
  - 0b1111111111111111111111111111111111111111111111111111111111111111s

mklement0 on Mar 8, 2023
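
Based on the rules described in the comment above, three ways of getting 255 from eight 1 bits can be sketched as follows. The variable names are illustrative only, and the snippet assumes PowerShell 7 or later, where binary literals and the `u` suffix are available. The first two lines lean on the de facto rules listed above: a 9-digit literal is not sign-interpreted, and an unsigned suffix prevents a negative result. The third line bypasses the literal parser altogether by calling .NET directly.

```powershell
# 1. Leading 0: nine digits, so the literal is not treated as an 8-bit signed pattern.
$viaLeadingZero = 0b011111111

# 2. Unsigned type suffix: the literal is parsed as an unsigned integer.
$viaUnsignedSuffix = 0b11111111u

# 3. .NET conversion from a string of binary digits.
$viaConvert = [Convert]::ToInt32('11111111', 2)

# Each line should report the value 255 together with the resulting type.
foreach ($value in $viaLeadingZero, $viaUnsignedSuffix, $viaConvert) {
    '{0} [{1}]' -f $value, $value.GetType().Name
}
```

Passing the digits as a quoted string in the third form makes the intent explicit, rather than relying on PowerShell converting the number 11111111 to a string.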