click: UnicodeEncodeError on Windows when there are Unicode chars in the help message

I get an error when I try to print the help message for my app (--help) on Windows (using bash, cmd, and PowerShell). My help message contains Unicode characters (in the project name), which seems to be what causes the problem.

This PR tests running the app with --help and it fails on Windows and Python 3.6 and 3.10: https://github.com/leouieda/nene/pull/12

Here is a minimum example that fails:

# example.py
import click

@click.command(context_settings={"help_option_names": ["-h", "--help"]})
def main():
    """
    App description with Unicode ‣
    """
    pass

if __name__ == '__main__':
    main()

$ python example.py -h
Traceback (most recent call last):
  File "example.py", line 11, in <module>
    main()
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 1052, in main
    with self.make_context(prog_name, args, **extra) as ctx:
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 914, in make_context
    self.parse_args(ctx, args)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 1370, in parse_args
    value, args = param.handle_parse_result(ctx, opts, args)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 2347, in handle_parse_result
    value = self.process_value(ctx, value)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 2309, in process_value
    value = self.callback(ctx, self, value)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\core.py", line 1270, in show_help
    echo(ctx.get_help(), color=ctx.color)
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\site-packages\click\utils.py", line 298, in echo
    file.write(out)  # type: ignore
  File "C:\hostedtoolcache\windows\Python\3.6.8\x64\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2023' in position 62: character maps to <undefined>

I can confirm that the Unicode characters in the docstring of the function wrapped with the main @click.command are what cause the issue. Removing them fixes the problem (see the second CI run on https://github.com/leouieda/nene/pull/12). The issue does not happen on Linux or macOS.

For now, I’ll remove the Unicode characters so I’m not pushing a broken package, but it would be great to be able to include the proper spelling of the package name in the future.
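
As a possible stopgap that keeps the Unicode text, here is a minimal sketch. It assumes Python 3.7 or later (where text streams have `reconfigure`), so it would not help on 3.6. The helper name `tolerate_unencodable` is hypothetical and not part of click, and it is also an assumption that click ends up writing through the same `sys.stdout` object.

```python
import io
import sys


def tolerate_unencodable(stream):
    """Hypothetical helper (not part of click).

    Ask `stream` to substitute characters its encoding cannot represent
    instead of raising UnicodeEncodeError. Returns True if the stream was
    reconfigured, False if it has no `reconfigure` method (Python < 3.7 or
    a non-standard stream object).
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    reconfigure(errors="replace")
    return True


# Simulate a cp1252 stdout in memory so the effect can be seen on any OS.
_raw = io.BytesIO()
_fake_stdout = io.TextIOWrapper(_raw, encoding="cp1252")
tolerate_unencodable(_fake_stdout)
_fake_stdout.write("Unicode \u2023")
_fake_stdout.flush()
SIMULATED_OUTPUT = _raw.getvalue()  # the bullet is replaced by b"?"
```

In an app, calling `tolerate_unencodable(sys.stdout)` before `main()` would trade the crash for a `?` in place of each unencodable character, which is lossy but keeps `--help` usable.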

Environment:

  • Python version: 3.6 and 3.10
  • Click version: 8.0.3

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 20 (6 by maintainers)

Most upvoted comments

Would it make sense to simply catch this error in click.echo and re-raise it with a more helpful message?

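try:
    file.write(out)
except UnicodeEncodeError as exc:
    # Already in UTF-8 mode, so suggesting it would not help.
    if sys.flags.utf8_mode:
        raise
    msg = "Failed to echo some Unicode character. Try enabling [UTF-8 mode](https://docs.python.org/3/library/os.html#utf8-mode)."
    # UnicodeEncodeError requires five arguments, so reuse the original's fields.
    raise UnicodeEncodeError(exc.encoding, exc.object, exc.start, exc.end, msg) from exc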

I believe this can be closed, as it is not an issue caused by click. I wrote up an explanation on another issue and in a gist, but the TL;DR is that this is caused by the Windows agent redirecting command output to a file while the default locale code page is not Unicode-compatible. While click may be able to work around this, it is definitely not caused by click.
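
To illustrate the point about the code page, here is a small sketch that needs neither Windows nor click. It writes the help text from the report to an in-memory text stream, once with cp1252 (the codec named in the traceback) and once with UTF-8. The names `HELP_TEXT` and `try_write` are made up for this example.

```python
import io

HELP_TEXT = "App description with Unicode \u2023"


def try_write(encoding):
    """Write HELP_TEXT to an in-memory text stream that uses `encoding`.

    Returns the encoded bytes on success, or the UnicodeEncodeError
    instance if the codec cannot represent some character.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    try:
        stream.write(HELP_TEXT)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return raw.getvalue()


if __name__ == "__main__":
    for name in ("cp1252", "utf-8"):
        # ascii() keeps the output printable even on a cp1252 console.
        print(name, "->", ascii(try_write(name)))
```

The cp1252 stream fails on U+2023 the same way the traceback does, while the UTF-8 stream accepts it, which is consistent with the encoding of the output stream being the deciding factor.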

FWIW, I encountered this error for click.echo('├─') in the CI of ddelange/pipgrip#128.

It’s on GitHub Actions windows-latest runners, where sys.getfilesystemencoding() == 'utf-8', meaning it’s running Python in UTF-8 mode.

Somehow, click still goes into a cp1252 routine in that GHA environment…

Feel free to review my gist covering this, but just a quick heads up that checking sys.getfilesystemencoding() won’t necessarily be accurate. You’re better off checking sys.stdout.encoding. If you haven’t set PYTHONUTF8 or PYTHONIOENCODING in your pipeline yet, I would try that before doing anything else.
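
A rough diagnostic along those lines could look like the sketch below. The function name `describe_output_encoding` and the choice of U+2023 as the sample character are assumptions for illustration. The encodability check uses strict error handling and ignores whatever error handler the stream itself is configured with.

```python
import os
import sys


def describe_output_encoding(stream=None, sample="\u2023"):
    """Collect the settings that influence how text written to `stream`
    (sys.stdout by default) gets encoded."""
    stream = sys.stdout if stream is None else stream
    encoding = getattr(stream, "encoding", None)
    try:
        # Strict check: can the stream's codec represent the sample at all?
        sample.encode(encoding or "ascii")
        can_encode = True
    except (UnicodeEncodeError, LookupError):
        can_encode = False
    return {
        "stream_encoding": encoding,
        "filesystem_encoding": sys.getfilesystemencoding(),
        # sys.flags.utf8_mode was added in Python 3.7.
        "utf8_mode": bool(getattr(sys.flags, "utf8_mode", 0)),
        "PYTHONUTF8": os.environ.get("PYTHONUTF8"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "can_encode_sample": can_encode,
    }


if __name__ == "__main__":
    for key, value in describe_output_encoding().items():
        print(f"{key}: {value!a}")
```

Running this as a CI step, with output redirected the same way the failing command is, should show whether the stream encoding differs from the filesystem encoding on that runner.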