PowerShell: PSCustomObject does not work with Select-Object -Unique and Sort-Object -Unique
Steps to reproduce
Given any collection of PSCustomObjects whose fields differ, the objects do not compare equal, yet they collapse into a single item under Select-Object -Unique or Sort-Object -Unique. Notable cmdlets that output PSCustomObject collections are:
- Select-Object itself (-Unique works when used immediately, but not later on)
- ConvertFrom-Csv
This behavior has been mentioned in https://github.com/PowerShell/PowerShell/issues/15806 and https://github.com/PowerShell/PowerShell/issues/12059, but neither covers this exact problem, and the fact that it affects Select-Object transformations and ConvertFrom-Csv makes it very prominent.
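A minimal sketch for trying this without depending on directory contents, assuming that literal [pscustomobject] values behave like the Select-Object output shown in this report. The $items variable is introduced here for illustration only.

```powershell
# Two distinct custom objects that differ only in their Name note property.
$items = @(
    [pscustomobject]@{ Name = 'dotnet-sdk-5.0.408-linux-arm64' }
    [pscustomobject]@{ Name = 'dotnet-sdk-6.0.400-linux-arm64' }
)

# The objects do not compare equal.
$areEqual = $items[0] -eq $items[1]
$areEqual

# On the version in this report (7.3.0-preview.3) these two pipelines are
# reported to collapse the collection to one item; other versions may differ.
$items | Select-Object -Unique
$items | Sort-Object -Unique
```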
Expected behavior
> $files = Get-ChildItem | Select-Object Name
> $files
Name
----
dotnet-sdk-5.0.408-linux-arm64
dotnet-sdk-6.0.400-linux-arm64
> $files[0] -eq $files[1]
False
> $files | Select-Object -Unique
Name
----
dotnet-sdk-5.0.408-linux-arm64
dotnet-sdk-6.0.400-linux-arm64
Actual behavior
> $files | Select-Object -Unique
Name
----
dotnet-sdk-5.0.408-linux-arm64
Environment data
Name Value
---- -----
PSVersion 7.3.0-preview.3
PSEdition Core
GitCommitId 7.3.0-preview.3-304-gd02c59addc24e13da3b8ee5e1a8e7aa27e00c745
OS Linux 5.15.0-1013-raspi #15-Ubuntu SMP PREEMPT Mon Aug 8 06:33:06 UTC 2022
Platform Unix
PSCompatibleVersions {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion 2.3
SerializationVersion 1.1.0.1
WSManStackVersion 3.0
About this issue
- Original URL
- State: open
- Created 2 years ago
- Comments: 41 (17 by maintainers)
@PowerShell/wg-powershell-cmdlets reviewed this and agree that the current behavior for PSCustomObject is incorrect. We believe this is likely a bucket 3 breaking change, in that users are relying on the current behavior. We considered an option to inform the user to use -Property * if we detect that the objects are PSCustomObjects; however, it seems more useful to fix this behavior so that future users can rely upon it. So the change is: only if the first object is a PSCustomObject and -Property isn’t specified, -Property gets set to *.
Looking at the results of the proposed fix, I am not satisfied. While the main scenario is fixed by this fix (usually all objects in the pipeline are of the same type), I’m afraid that we’ll immediately get feedback that it doesn’t work if the first object isn’t a PSCustomObject. Since we’re doing a slow accumulation of objects from the pipeline anyway, there’s nothing stopping us from checking their type until we encounter a PSCustomObject.
This in turn makes me think that this fix is more of a workaround, and that the main problem is somewhere deeper and should be fixed there, although that is no longer trivial.
@dkaszews No warning. Check only that the first object is a PSCustomObject and that the -Property parameter is not present.
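The rule discussed here (first object is a PSCustomObject and no -Property given, so behave as if -Property * were passed) can be approximated in script. This is only a sketch of the idea, not the actual change to Select-Object; the function name Select-UniqueObject is hypothetical, and it buffers all input before emitting anything.

```powershell
# Hypothetical wrapper that emulates the proposed rule in script.
function Select-UniqueObject {
    # Collect everything from the pipeline first.
    $items = @($input)
    if ($items.Count -eq 0) { return }

    # Only the first object is inspected, as in the proposal.
    if ($items[0] -is [System.Management.Automation.PSCustomObject]) {
        # Compare all properties. Note that this emits copies of the objects.
        $items | Select-Object -Property * -Unique
    }
    else {
        # Any other type keeps the existing behavior.
        $items | Select-Object -Unique
    }
}

$sample = @(
    [pscustomobject]@{ Name = 'a' }
    [pscustomobject]@{ Name = 'b' }
    [pscustomobject]@{ Name = 'a' }
)
$uniqueCustom = @($sample | Select-UniqueObject)
$uniqueInts   = @(1, 1, 2 | Select-UniqueObject)
```

As the earlier comment points out, a mixed pipeline whose first element is not a PSCustomObject would still take the old path in this sketch.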
This was discussed in the cmdlet working group yesterday. It was suggested that if the objects are PSCustomObjects and no -Property parameter is passed, Select-Object could behave as if -Property * had been specified. This will be investigated, which is not (yet) a commitment to make a change.
Can we circle back to @jhoneill's original solution to simply default -Property to *? It fixes the issue and I cannot see any flows it could break. If nobody can think of a reason that would be a breaking change, I suggest we go with it, as it is trivial to implement.
Alternatively, we could implement it to look at all NoteProperty members if and only if the object contains no Property members. That is not much more complicated and even less likely to break existing flows, in case there are some objects which store their real values as Property and use NoteProperty only for metadata.
I think pscustomobject equality and comparison semantics are key here. We need to describe them exactly, and we also need to investigate whether it is possible to change this without breaking other PSObjects.
The second line only returns one item. The third returns all of them. The same happens if you use Sort-Object -Unique. But if the first line is $files = Get-ChildItem | % name or just $files = Get-ChildItem, both lines return all items.
I don’t think it is hashing.
I think the code (https://github.com/PowerShell/PowerShell/blob/master/src/Microsoft.PowerShell.Commands.Utility/commands/utility/Select-Object.cs, line 628 onwards) compares base objects without note properties, and guess what… PSCustomObjects are ALL note properties. It ONLY looks at note properties if -Property is specified. So:
Correctly returns ONE item
Should make a difference between a and b and return two items, but Select-Object ignores it because it is a note property
Correctly only returns one object because it is not told to look at the note property that is different
Now returns two objects because it IS looking at the note property.
This looks to be by design: two objects are not considered different if they only differ in note properties, UNLESS those note properties are specified (which is why adding * fixes the problem in the initial example). But that design doesn’t allow for PSCustomObjects.
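A small sketch of the last two cases, showing how the properties named in -Property decide what counts as a duplicate. This is not the commenter's original code; the $items collection and its Name and Size properties are made up for illustration.

```powershell
# Two custom objects that share Name but differ in Size.
$items = @(
    [pscustomobject]@{ Name = 'a'; Size = 1 }
    [pscustomobject]@{ Name = 'a'; Size = 2 }
)

# Only Name is compared, so the differing Size is not considered.
$byName = @($items | Select-Object -Property Name -Unique)

# Size is now among the compared note properties.
$byNameAndSize = @($items | Select-Object -Property Name, Size -Unique)

"Name only: $($byName.Count)"
"Name and Size: $($byNameAndSize.Count)"
```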
Works, but if the second doesn’t have a -Property parameter it fails. The two linked items suggest that Select-Object uses the hash code, but this casts doubt on that.
I’ve never understood why -ExcludeProperty doesn’t work unless -Property is specified. I wonder if both could be solved if -Property defaulted to “*”.
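Until the default changes, naming the properties explicitly appears to be the practical workaround. The sketch below uses -Property *, as suggested in this thread, for Select-Object. Passing -Property Name to Sort-Object is an addition here, not something proposed above, and the $rows variable is illustrative.

```powershell
# Custom objects with one repeated Name.
$rows = @(
    [pscustomobject]@{ Name = 'b' }
    [pscustomobject]@{ Name = 'a' }
    [pscustomobject]@{ Name = 'b' }
)

# Compare every property when deduplicating.
$viaSelect = @($rows | Select-Object -Property * -Unique)

# Sort and deduplicate on an explicitly named property.
$viaSort = @($rows | Sort-Object -Property Name -Unique)

$viaSelect.Name
$viaSort.Name
```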