PowerShell: Investigate possible BUG Invoke-WebRequest content type charset

Prerequisites

Steps to reproduce

Call Invoke-WebRequest with -Method POST, a string -Body, and -Headers containing a Content-Type with a charset (e.g. text/plain; charset=windows-1251), but without -ContentType:

Invoke-WebRequest -Uri "some uri" -Method POST -Headers @{ 'Content-Type' = "text/plain; charset=windows-1251" } -Body "string encoded with windows-1251 charset"

The charset in the headers is ignored and the body encoding defaults to UTF-8. This doesn't happen if we use -ContentType:

Invoke-WebRequest -Uri "some uri" -Method POST -ContentType "text/plain; charset=windows-1251" -Body "string encoded with windows-1251 charset"

Both should have the same behaviour

I found this possible bug while reading the code; I haven’t encountered it in the real world so I don’t really know how we could test it (and if testing it is even possible)

Expected behavior

SetRequestContent should try to get the charset from the Content-Type header saved in the WebSession:
WebSession.ContentHeaders[HttpKnownHeaderNames.ContentType]

Actual behavior

SetRequestContent only checks for -ContentType charset
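
The difference between the expected and the actual lookup can be sketched in PowerShell. This is only an illustration of the lookup order, not the actual SetRequestContent code: the function name Resolve-BodyEncoding and its parameters are hypothetical, and the charset is extracted with a deliberately simple regex.

```powershell
# Hypothetical helper, NOT the real SetRequestContent implementation:
# it only illustrates the lookup order proposed in this issue.
function Resolve-BodyEncoding {
    param(
        [string]    $ContentType,  # stands in for the -ContentType argument
        [hashtable] $Headers       # stands in for the -Headers argument
    )

    # Prefer -ContentType; otherwise fall back to a Content-Type header entry.
    $value = $ContentType
    if (-not $value -and $Headers) {
        foreach ($key in $Headers.Keys) {
            if ($key -ieq 'Content-Type') { $value = [string] $Headers[$key] }
        }
    }

    # Default to UTF-8 when no charset attribute is present.
    $encoding = [System.Text.Encoding]::UTF8
    if ($value -match 'charset\s*=\s*"?([^";\s]+)') {
        # Throws for a charset name that .NET does not recognize.
        $encoding = [System.Text.Encoding]::GetEncoding($Matches[1])
    }
    $encoding
}

# Both ways of specifying the charset resolve to the same encoding.
(Resolve-BodyEncoding -ContentType 'application/json; charset=iso-8859-1').WebName
(Resolve-BodyEncoding -Headers @{ 'Content-Type' = 'application/json; charset=iso-8859-1' }).WebName
# No charset anywhere: falls back to UTF-8.
(Resolve-BodyEncoding).WebName
```

According to this report, the current behaviour corresponds to skipping the header fallback, so a charset given only via -Headers ends up with the UTF-8 default.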

Error details

While writing some code in `WebRequestPSCmdlet.Common.cs`, I think I found a bug (line 1491): it should check `WebSession.ContentHeaders[HttpKnownHeaderNames.ContentType]` instead of `ContentType`. This should be OK because, if both are present, `ContentType` overwrites `WebSession.ContentHeaders[HttpKnownHeaderNames.ContentType]`. Unfortunately, I don't know how to test it (I think it would require sending a weirdly encoded string as the body). If confirmed, I'll open another PR to fix it.

Environment data

Latest master
https://github.com/PowerShell/PowerShell/blob/60b4e9704fca546f76e1128e30de4077c51a3ca2/src/Microsoft.PowerShell.Commands.Utility/commands/utility/WebCmdlet/Common/WebRequestPSCmdlet.Common.cs

Visuals

No response

About this issue

  • State: closed
  • Created a year ago
  • Comments: 22 (12 by maintainers)

Most upvoted comments

@mklement0 Yes exactly, I’ll edit the initial post to make it clearer

@CarloToso,

Based on my summary of what I think the issue is, and your confirmation of it as correct, I’m confused by the following sentence in the initial post:

The charset in the headers should be ignored and default to UTF8

To reconfirm: It should not be ignored, and should instead be honored, just like it already is for -ContentType, correct?


Proceeding on the assumption that my summary is correct:

A simple, setup-free way to verify that -ContentType is honored, but an equivalent Content-Type header field is not, is to use https://httpbin.org/post, which simply echoes the body back to you:

  • Given that UTF-8 is JSON’s default, and that PowerShell now uses UTF-8 encoding by default for a [string]-based -Body argument with content type application/json, the following works as expected:
# Output: [pscustomobject] @{ foo = 'sß' }
(Invoke-RestMethod https://httpbin.org/post -Body '{ "foo": "sß" }' -Method POST -ContentType application/json).json
# NO output, because the effectively ISO-8859-1-encoded body cannot be decoded *as UTF8* by https://httpbin.org/post
(Invoke-RestMethod https://httpbin.org/post -Body '{ "foo": "sß" }' -Method POST -ContentType 'application/json; charset=iso-8859-1').json
  • Trying to make the same intentional mistake with an equivalent Content-Type header field does NOT provoke the symptom, which implies that the charset attribute was ignored by PowerShell, and that UTF-8 was sent:
# !! Output: [pscustomobject] @{ foo = 'sß' } - despite the attempt to send broken encoding
# !! PowerShell *ignored* the 'charset' attribute.
(Invoke-RestMethod https://httpbin.org/post -Body '{ "foo": "sß" }' -Method POST -Headers @{ 
  'Content-Type' = 'application/json; charset=iso-8859-1' 
}).json
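
Why the charset makes an observable difference can be checked locally, without a network call. The following is a minimal sketch that only compares the bytes each encoding would produce for the body above. It says nothing about what the cmdlets actually send, and the variable names are made up for illustration.

```powershell
# Build the body with an explicit code point for 'ß' (U+00DF) so the
# example does not depend on how the script file itself is encoded.
$body   = '{ "foo": "s' + [char] 0xDF + '" }'
$utf8   = [System.Text.Encoding]::UTF8
$latin1 = [System.Text.Encoding]::GetEncoding('iso-8859-1')

# 'ß' takes two bytes in UTF-8 but a single byte in ISO-8859-1.
$utf8Bytes   = $utf8.GetBytes($body)
$latin1Bytes = $latin1.GetBytes($body)
'UTF-8:      {0} bytes' -f $utf8Bytes.Length
'ISO-8859-1: {0} bytes' -f $latin1Bytes.Length

# Decoding the ISO-8859-1 bytes as UTF-8 does not round-trip: the lone
# 0xDF byte is not valid UTF-8 and becomes U+FFFD (replacement character).
$misdecoded = $utf8.GetString($latin1Bytes)
'Round-trips as UTF-8: {0}' -f ($misdecoded -eq $body)
'Contains U+FFFD:      {0}' -f $misdecoded.Contains([string] [char] 0xFFFD)
```

If the charset attribute in -Headers were honored, the request body would carry the single-byte form and the echo service would fail to decode it as UTF-8, as in the -ContentType case; getting the intact value back suggests that UTF-8 bytes were sent.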

@CarloToso, you’re right - I misread the source code; I’ve edited my previous comment to remove the mistaken interpretation.