vswhere: vswhere.exe uses local code page to emit invalid JSON/XML

When I run vswhere -products * -legacy -format json on the Japanese edition of Windows 10 Pro, one of the output lines is:

"description": "学生、オープン ソース、および個々の開発者のための無料で完全な機能を備えた IDE",

The text above is correct, but it is encoded in code page 932 (the default code page on Japanese Windows).

As RFC 8259 states in section 8.1, "Character Encoding", JSON text exchanged between systems MUST be encoded as UTF-8, and implementations MUST NOT add a byte order mark. Please emit UTF-8 so that the JSON stays valid even when it contains non-ASCII strings like the one above. Otherwise, conforming JSON decoders reject the output as invalid. In particular, they can read part of it as a bad JavaScript-style \ escape, because in cp932 the second byte of some characters (ソ, for example) is 0x5C, the same value as the ASCII backslash.
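To make the failure concrete, here is a minimal Python sketch of what a JSON consumer sees. It is not vswhere code: the one-key JSON object is a stand-in for the real output, and decoding the bytes as Latin-1 is only an assumption about how a byte-oriented decoder might view them.

```python
import json

# Stand-in for one object of vswhere's JSON output (only the reported key).
doc = {"description": "学生、オープン ソース、および個々の開発者のための無料で完全な機能を備えた IDE"}
text = json.dumps(doc, ensure_ascii=False)

utf8_bytes = text.encode("utf-8")
cp932_bytes = text.encode("cp932")

# UTF-8 bytes round-trip cleanly.
utf8_result = json.loads(utf8_bytes)

# cp932 bytes are not valid UTF-8, so json.loads rejects them.
try:
    json.loads(cp932_bytes)
    cp932_accepted = True
except ValueError:  # UnicodeDecodeError is a subclass of ValueError
    cp932_accepted = False

# In cp932 the trail byte of "ソ" is 0x5C, the ASCII backslash.
so_bytes = "ソ".encode("cp932")

# A decoder that looks at the raw bytes one by one (simulated here with
# Latin-1) hits that backslash followed by a non-escape character.
try:
    json.loads(cp932_bytes.decode("latin-1"))
    bytewise_accepted = True
except json.JSONDecodeError:
    bytewise_accepted = False
```

Both cp932 attempts fail, while the UTF-8 bytes parse back to the original object.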

Almost the same applies to vswhere -products * -legacy -format xml: vswhere.exe writes the output in the local code page (cp932 in my environment) with no encoding declaration at the beginning of the XML. The simplest fix is to hard-code UTF-8.
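The XML case can be sketched the same way. The element names below are hypothetical placeholders rather than a copy of the real vswhere schema. The point is only that an XML parser given no encoding declaration (and no BOM) assumes UTF-8, so cp932 bytes are not well-formed input.

```python
import xml.etree.ElementTree as ET

description = "学生、オープン ソース、および個々の開発者のための無料で完全な機能を備えた IDE"
# Hypothetical element names; no XML declaration, as in the report.
xml_text = f"<instance><description>{description}</description></instance>"

# cp932 bytes without an encoding declaration: the parser assumes UTF-8.
try:
    ET.fromstring(xml_text.encode("cp932"))
    cp932_parsed = True
except ET.ParseError:
    cp932_parsed = False

# The same document encoded as UTF-8 parses without any declaration.
root = ET.fromstring(xml_text.encode("utf-8"))
utf8_description = root.findtext("description")
```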

On the other hand, I think the default format (-format text, or no -format at all) should keep using the local code page. Otherwise the output shows up as unreadable strings (mojibake) in a cmd.exe window.
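The proposal amounts to choosing the output encoding from the format. A rough Python illustration of that rule follows. The function name output_encoding is made up for this sketch, and locale.getpreferredencoding is only an approximation of "the local code page" (it is not necessarily the code page the console itself uses).

```python
import locale

def output_encoding(fmt: str = "text") -> str:
    """Hypothetical helper: pick an encoding name for a given -format value."""
    if fmt in ("json", "xml"):
        # Machine-readable formats: always UTF-8, as requested above.
        return "utf-8"
    # Human-readable text: follow the user's locale so a console can show it.
    return locale.getpreferredencoding(False)
```

For example, output_encoding("json") returns "utf-8" on every machine, while output_encoding("text") depends on the system locale.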

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (10 by maintainers)

Most upvoted comments

Option #3 is the safest bet, but I also think that writing UTF-8 unconditionally in JSON mode is a safe bet. I highly doubt that many of the tools consuming the JSON data were (or are) able to handle non-UTF-8 JSON.