PowerShell: Wrong format of CommandLine property of Get-Process cmdlet

Prerequisites

Steps to reproduce

The CommandLine property returns the arguments joined together with no separator between them. For example:

(Get-Process -Id $PID).CommandLine

returns pwsh-NoLogo

Expected behavior

pwsh -NoLogo

Actual behavior

pwsh-NoLogo

Error details

no errors

Environment data

Name                           Value
----                           -----
PSVersion                      7.3.0
PSEdition                      Core
GitCommitId                    7.3.0
OS                             Linux 6.0.10-arch2-1 #1 SMP PREEMPT_DYNAMIC Sat, 26 Nov 2022 16:51:18 +0000
Platform                       Unix
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0

Visuals

ArcoLinux_2022-12-02_00-20-03

I think this depends on dotnet:

ArcoLinux_2022-12-02_00-30-18

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 23 (7 by maintainers)

Most upvoted comments

I apologize for the nervous behavior.

Note that the simple space concatenation of the verbatim arguments that you were looking for (as also reported by ps) now HAS been implemented, as at least some improvement over the current behavior (possibly to be improved later):

I presume this will be available in v7.3.1 (as well as its preview versions).

I think we just need to take a step back. There’s obviously a problem as you’ve pointed out. The issue is how do we fix it. There are many options available to us but we need input from the pwsh team on which route to take. Anything more will just result in frustration between us all and we certainly don’t want that.

Just as an FYI, ps uses option one that I mentioned and doesn’t deal with quotes. Running pwsh -NoExit -Command '$pid; "testing abc"' in bash and then running ps -aux | grep pwsh in another terminal will give you:

jborean:~$ ps -aux | grep pwsh
jborean   259054  8.0  0.1 274836356 176756 pts/0 Sl+ 14:20   0:00 pwsh -NoExit -Command $pid; "testing abc"
jborean   259506  0.0  0.0 222020  2308 pts/1    S+   14:21   0:00 grep --color=auto pwsh

Notice how the single quotes of the original command are gone. ps just takes the simple route of joining the arguments with spaces and does not try to add the quotes back in.

I’m not protecting anyone, and it’s not a rhetorical question. Linux, unlike Windows, does not store the command line of a process as a string, it stores it as an array of values. This means the string people would have typed into their shell is not something you can necessarily get back. The shell used will convert the string to an array of arguments and that’s what is stored in the file. As I mentioned, if you were to do /bin/test argument 'quoted value' in bash the cmdline file will be /bin/test\0argument\0quoted value\0. Notice how the single quotes that were used in the shell are no longer present.
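As a rough illustration of that serialized form, the sketch below splits a NUL-delimited string back into its arguments. ConvertFrom-ProcCmdline is a hypothetical helper name used only for this example, not an existing cmdlet, and the input string is typed by hand rather than read from a real /proc file.

```powershell
# Hypothetical helper (illustrative name): split the raw content of a
# /proc/<pid>/cmdline file into its individual arguments.
function ConvertFrom-ProcCmdline {
    param([string] $Raw)

    # Some processes (for example kernel threads) expose an empty cmdline.
    if ([string]::IsNullOrEmpty($Raw)) { return ,@() }

    # Every argument is NUL-terminated, so drop exactly one trailing NUL.
    # Trimming all trailing NULs would lose a final empty-string argument.
    if ($Raw[$Raw.Length - 1] -eq [char]0) {
        $Raw = $Raw.Substring(0, $Raw.Length - 1)
    }

    # Split on NUL only; spaces inside an argument are preserved.
    return ,$Raw.Split([char]0)
}

# The example from the text: /bin/test argument 'quoted value'
$raw  = "/bin/test`0argument`0quoted value`0"
$argv = ConvertFrom-ProcCmdline $raw

$argv.Count   # 3
$argv[2]      # quoted value
```

The quotes typed in the shell never appear in the result; only the argument boundaries survive.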

This brings up my question: how should PowerShell turn the raw value of null-terminated strings into a command line that people expect? How does it know whether the original command line used single quotes or double quotes? Should it care about that at all? Should it just return an array of strings instead? These are all things that would need to be answered before this is fixed, and the best people to answer them are the people who are going to use this value, like yourself.

As I said I’m happy to submit a PR to try and fix this but I don’t want to waste my time if the route I take is not what people want. I prefer to get a consensus of what the output format should be before I do the work.

I’m curious what you would expect it to show for arguments that contain a space. Say you ran /bin/test argument 'quoted value' in Bash; what would you expect back? Should it be

  • /bin/test argument quoted value
  • /bin/test argument 'quoted value'
  • /bin/test argument "quoted value"
  • Something else?

What if the argument had embedded single or double quotes in its value (/bin/test argument "'quoted value'")? The escaping method for these quotes is highly dependent on the shell used.

It’s hard to convert the raw argv value in the /proc/PID/cmdline file back to what you actually ran as that information isn’t preserved anywhere. So while PowerShell could do the naive approach, or quote it in a way that works for it, it’s not necessarily going to be what people may expect if they ran it from a different shell.
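To make the ambiguity of the naive approach concrete, here is a small sketch. The two arrays are made-up examples, not values read from a real process.

```powershell
# Two different argument arrays...
$a = @('/bin/test', 'argument', 'quoted value')      # 3 arguments
$b = @('/bin/test', 'argument', 'quoted', 'value')   # 4 arguments

# ...produce the same string once they are joined with spaces,
# so the original argument boundaries cannot be recovered from it.
$joinedA = $a -join ' '
$joinedB = $b -join ' '

$joinedA               # /bin/test argument quoted value
$joinedA -eq $joinedB  # True
```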

That being said, I’m happy to submit a PR to fix this but I need to know from the pwsh team what format they expect the array of args to be displayed as for this property or whether they want to have it return an array for Linux and String for Windows.
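For anyone who wants the array form in the meantime, one possible workaround is to add a separate ETS property in your own session. This is only a sketch under assumptions: the member name ArgumentList is made up for this example, it is not something PowerShell ships, and it only returns a value on Linux.

```powershell
# Sketch: add a hypothetical 'ArgumentList' script property to Process objects.
# The member name is an assumption chosen for illustration.
Update-TypeData -TypeName System.Diagnostics.Process `
    -MemberType ScriptProperty -MemberName ArgumentList -Force -Value {
    if (-not $IsLinux) { return $null }

    # -Raw keeps the content as one string even if an argument contains a newline.
    $raw = Get-Content -LiteralPath "/proc/$($this.Id)/cmdline" -Raw -ErrorAction Ignore
    if ([string]::IsNullOrEmpty($raw)) { return ,[string[]]@() }

    # Drop the NUL that terminates the last argument, then split on NUL.
    if ($raw[$raw.Length - 1] -eq [char]0) {
        $raw = $raw.Substring(0, $raw.Length - 1)
    }
    ,$raw.Split([char]0)
}

# On Linux this lists the arguments of the current pwsh process, one per element.
(Get-Process -Id $PID).ArgumentList
```

Using a new member name leaves the existing CommandLine property untouched, so this avoids the breaking change discussed above.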

Why then is the identical result in [system.diagnostics.process]?

This returns the same object as Get-Process, the ETS properties are automatically added on top of the System.Diagnostics.Process type regardless of where it is from.

Strange that developers did not replace u{0} with u{32}

Because the value in that file is essentially the serialized form of the argv array when the process was created. This array is essentially an array of null terminated strings. If the question was why PowerShell didn’t convert it, it’s probably because they didn’t realise that was the format. Still just blindly converting to a space may give a value people won’t expect as it’s not always going to be the string people put in their shell.

The CommandLine property is added by PowerShell as an ETS property and is not something in dotnet. You can inspect it with:

$member = Get-Process -Id $pid | Get-Member -Name CommandLine
$member
$member.Definition

The member definition of this ScriptProperty is:

System.Object CommandLine {get=
                        if ($IsWindows) {
                            (Get-CimInstance Win32_Process -Filter "ProcessId = $($this.Id)").CommandLine
                        } elseif ($IsLinux) {
                            Get-Content -LiteralPath "/proc/$($this.Id)/cmdline"
                        }
                    ;}

So on Linux it is simply getting the contents of /proc/$processId/cmdline. It looks like we need to update the logic to read the file properly, as it uses a null byte rather than a space to delimit each argument. For example, pid 75688 was started as pwsh -NoLogo -NoProfile, and the raw bytes of this file are:

PS /home/jborean> Format-Hex -Path /proc/75688/cmdline

   Label: /proc/75688/cmdline

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 70 77 73 68 00 2D 4E 6F 4C 6F 67 6F 00 2D 4E 6F pwsh -…
0000000000000010 50 72 6F 66 69 6C 65 00                         Profil…

Each argument is separated by 00 and not a space 20. This makes sense, as on Linux a process is started with an array of arguments rather than a string like on Windows. So there’s no canonical command line string of all the arguments together; it’s just the argv array of null-terminated strings.
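As a quick check, the bytes from the dump above can be decoded by hand. This is a minimal sketch that copies the hex values from the Format-Hex output into a string; the variable names are arbitrary.

```powershell
# Bytes copied from the Format-Hex output for /proc/75688/cmdline.
$hex = '70 77 73 68 00 2D 4E 6F 4C 6F 67 6F 00 2D 4E 6F 50 72 6F 66 69 6C 65 00'

# Convert each hex pair to a byte, then decode the buffer as UTF-8 text.
$bytes = [byte[]] ($hex -split ' ' | ForEach-Object { [Convert]::ToByte($_, 16) })
$text  = [System.Text.Encoding]::UTF8.GetString($bytes)

# Drop the final NUL terminator and split the rest on NUL.
$argv = $text.Substring(0, $text.Length - 1).Split([char]0)

$argv
# pwsh
# -NoLogo
# -NoProfile
```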

A naive fix is to just replace the null byte with a space (or split by null and join by space):

$cmdPid = $this.Id
# -Raw reads the file as a single string, so an argument containing a newline is not split into lines
$rawCmd = Get-Content -LiteralPath "/proc/$cmdPid/cmdline" -Raw

# Trim the last char, which is the null terminator of the last arg
$rawCmd.Substring(0, $rawCmd.Length - 1) -replace "`0", " "

The problem with this approach is that if an argument itself contains a space, you now need to decide how to “quote” it. You could:

  • Just not care and say this is a best guesstimate
  • Try and convert it to a valid command line that can be used in pwsh (single quotes or double quotes)
  • Use bash argument quoting rules, or figure out what started it and try and use those rules
  • Have the property return an array to align with how Linux works - would be a breaking change but technically more correct
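To illustrate the second option, here is a minimal sketch that quotes arguments using PowerShell single-quote rules. ConvertTo-QuotedCommandLine is a hypothetical name, the set of characters treated as safe is an assumption chosen to be conservative, and the result has not been checked against every corner of the PowerShell parser.

```powershell
# Hypothetical helper (illustrative name): join arguments into one string,
# single-quoting any argument that contains characters outside a safe set.
function ConvertTo-QuotedCommandLine {
    param([string[]] $Argv)

    $quoted = foreach ($a in $Argv) {
        # \A and \z anchor the whole string, so a trailing newline is not ignored.
        if ($a -match '\A[A-Za-z0-9_./:=+-]+\z') {
            $a
        } else {
            # Inside single quotes, the only escape needed is doubling a single quote.
            "'" + $a.Replace("'", "''") + "'"
        }
    }
    $quoted -join ' '
}

$line = ConvertTo-QuotedCommandLine @('pwsh', '-NoExit', '-Command', '$pid; "testing abc"')
$line
# pwsh -NoExit -Command '$pid; "testing abc"'
```

The output is one valid way to write the command in pwsh, not necessarily what the user typed. If the executable path itself ends up quoted, it would also need the call operator (&) in front of it to run in PowerShell.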

To give an example of what I mean here, say I run pwsh -NoExit -Command '$pid; "testing abc"' in bash. Bash will process this string and convert it to the following array of arguments:

  • pwsh
  • -NoExit
  • -Command
  • $pid; "testing abc"

You’ll notice that bash has “eaten” the single quotes around the '$pid; "testing abc"' part specified in the terminal, as it treats that part as a single argument. Using the logic above with some tweaks, you can see that this is how the argument array was passed from bash:

$cmdPid = 85157
# -Raw reads the file as a single string, so an argument containing a newline is not split into lines
$rawCmd = Get-Content -LiteralPath "/proc/$cmdPid/cmdline" -Raw

# Trim the last char, which is the null terminator of the last arg
$rawArgs = $rawCmd.Substring(0, $rawCmd.Length - 1)

# What it actually is: one indexed line per argument
$rawArgs -split "`0" | ForEach-Object { $i = 0 } { "[$i] $_"; $i++ }

# Our naive approach: replace every null with a space
$rawArgs -replace "`0", " "

We can see

image

So unless PowerShell tries to reverse this array into a string acceptable by a shell, you can’t really roundtrip it back to how it was started.