poetry: On Git Bash for Windows, poetry fails to print anything because of UnicodeEncodeError

  • I am on the latest Poetry version.
  • I have searched the issues of this repo and believe that this is not a duplicate.
  • If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
  • OS version and name:
$ uname --all
MINGW64_NT-10.0-18363 Jaepil-PC 3.0.7-338.x86_64 2019-11-21 23:07 UTC x86_64 Msys

I’m on Windows 10, using Git Bash for Windows.

$ git --version
git version 2.25.0.windows.1

Issue

This is what I get when I try to print poetry’s commands with $ poetry

On this issue, someone pointed out that this is a locale problem, but the error never went away no matter how I changed the locale.

Also, this error refers to '\xa0' in position 30, just like the previous issue, so I believe it might be a Poetry-specific problem. I looked up what '\xa0' is: it is a no-break space, and it has apparently caused a lot of problems, especially in Python.
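To make the '\xa0' part concrete, here is a minimal sketch that reproduces the encoding failure outside of Poetry. It only uses the standard library; the helper name try_encode is made up for this illustration and is not part of Poetry, clikit, or crashtest.

```python
import unicodedata

NBSP = "\xa0"  # the character named in the error message


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent the text.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


name = unicodedata.name(NBSP)         # "NO-BREAK SPACE"
as_cp949 = try_encode(NBSP, "cp949")  # None: same failure as in the traceback
as_utf8 = try_encode(NBSP, "utf-8")   # b"\xc2\xa0"

# With an explicit error handler the codec substitutes "?" instead of raising.
as_replaced = NBSP.encode("cp949", errors="replace")
```

So the character itself is valid text; the failure comes from the cp949 codec that the console stream is using.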

I think there’s room for improvement if this error could be handled by Poetry, even though Poetry might not be directly responsible for the issue.
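One possible way to handle this on the application side, sketched under assumptions: catch the UnicodeEncodeError and retry with characters the stream cannot represent replaced by "?". The function safe_write is hypothetical and is not how Poetry or clikit actually writes output; the in-memory cp949 stream only stands in for the Git Bash console.

```python
import io


def safe_write(stream, text):
    """Write text, replacing characters the stream's encoding cannot represent.

    Hypothetical helper for illustration only.
    """
    encoding = getattr(stream, "encoding", None) or "utf-8"
    try:
        stream.write(text)
    except UnicodeEncodeError:
        # Round-trip through the target encoding so unencodable characters
        # become "?" and the rest of the text is preserved.
        fallback = text.encode(encoding, errors="replace").decode(encoding)
        stream.write(fallback)


# Stand-in for a console whose encoding is cp949.
raw = io.BytesIO()
console = io.TextIOWrapper(raw, encoding="cp949", newline="")

safe_write(console, "USAGE\xa0poetry")
console.flush()
output = raw.getvalue()
```

The trade-off is that the no-break space is lost in the output, but the help text is printed instead of crashing.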

Thanks for making a great package manager.

Jaepil@Jaepil-PC MINGW64 /e/VSCodeProjects/poetry-demo
$ poetry -vvv
Poetry version 1.0.2

USAGE

[UnicodeEncodeError]
'cp949' codec can't encode character '\xa0' in position 30: illegal multibyte sequence

Traceback (most recent call last):
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\console_application.py", line 131, in run
    status_code = command.handle(parsed_args, io)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\api\command\command.py", line 120, in handle
    status_code = self._do_handle(args, io)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\api\command\command.py", line 171, in _do_handle
    return getattr(handler, handler_method)(args, io, self)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\handler\help\help_text_handler.py", line 29, in handle
    usage.render(io)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\ui\help\abstract_help.py", line 31, in render
    layout.render(io, indentation)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\ui\layout\block_layout.py", line 42, in render
    element.render(io, self._indentations[i] + indentation)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\ui\components\labeled_paragraph.py", line 70, in render
    + "\n"
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\cleo\io\io_mixin.py", line 55, in write
    super(IOMixin, self).write(string, flags)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\api\io\io.py", line 58, in write
    self._output.write(string, flags=flags)
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\api\io\output.py", line 61, in write
    self._stream.write(to_str(formatted))
  File "C:\Users\Jaepil\.poetry\lib\poetry\_vendor\py3.7\clikit\io\output_stream\stream_output_stream.py", line 24, in write
    self._stream.write(string)

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 1
  • Comments: 15 (2 by maintainers)

Most upvoted comments

The fix is in crashtest 0.4.0, which is pulled in by Cleo, so we would need a Cleo release, I think – paging @Secrus

It turned out this cp949 encoding issue actually came from one of Poetry’s dependencies, crashtest. I made a PR (sdispater/crashtest#5) on crashtest, which will also be reflected in Poetry once the PR is merged.

Wow… I’m surprised it turned out to be such a simple issue at the end of the day. This is a prime example of how seemingly subtle bugs can creep into production code without anyone noticing 😆

Anyway, I look forward to seeing the PR being merged sometime soon.

@Jarmos-san

A little update: it turned out this cp949 encoding issue actually came from one of Poetry’s dependencies, crashtest. I made a PR (https://github.com/sdispater/crashtest/issues/5) on crashtest, which will also be reflected in Poetry once the PR is merged. Cheers.

Could you give me your opinion on whether this is a quick & easy & good fix?

I’m not the right person to review this as I’m not well acquainted with Poetry internals either. So, I suggest you open the PR & hope it gets reviewed & merged. If it does get merged, well that’s good news for Git Bash users.

And to other Git Bash users, if this issue is bothering you, just use a PowerShell environment. I’ve ditched Git Bash completely (and don’t even use WSL) & I couldn’t be more satisfied with Poetry’s support for Windows machines.

@Jarmos-san I happened to revisit this issue after running into the cp949 problem again, this time as UnicodeDecodeError: 'cp949' codec can't decode byte 0xe2 in position 9921: illegal multibyte sequence.

This time I tried to debug it and, as I expected, simply adding the encoding='utf-8' option when opening the file seems to resolve the issue.

I’m thinking of applying the fix and making a PR. However, I’d like to hear another person’s opinion on whether this change is a good idea. I don’t foresee any technical problem with changing the encoding, but I only work on Windows on my local machine and I’m not very familiar with Poetry internals, so there might be an unwanted side effect.

Could you give me your opinion on whether this is a quick & easy & good fix?

  File "C:\Users\chlje\.poetry\lib\poetry\_vendor\py3.8\crashtest\frame.py", line 51, in file_content
    file_content = f.read()
UnicodeDecodeError: 'cp949' codec can't decode byte 0xe2 in position 9921: illegal multibyte sequence

    @property
    def file_content(self) -> str:
        if self._file_content is None:
            if not self._filename:
                file_content = ""
            else:
                if self._filename not in self.__class__._content_cache:
                    try:
                        with open(self._filename) as f: # change to open(self._filename, encoding='utf-8')
                            file_content = f.read()
                    except OSError:
                        file_content = ""
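
The proposed change can be tried in isolation with a small sketch. The function read_source is a hypothetical stand-in that mirrors the open/read/except OSError shape of the snippet above; it is not crashtest's code. Passing encoding="cp949" explicitly simulates a Korean Windows locale, and the temporary file holds a UTF-8 character (U+2019) whose first byte is 0xe2, the byte named in the error.

```python
import os
import tempfile


def read_source(path, encoding=None):
    """Read a file like the snippet above does (hypothetical stand-in).

    encoding=None means "use the locale default", which is what
    open(self._filename) does when no encoding is given.
    """
    try:
        with open(path, encoding=encoding) as f:
            return f.read()
    except OSError:
        return ""


# A UTF-8 source file containing U+2019 (bytes e2 80 99).
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as tmp:
    tmp.write("# it\u2019s UTF-8\n".encode("utf-8"))
    path = tmp.name

try:
    try:
        read_source(path, encoding="cp949")
        failed = False
    except UnicodeDecodeError:
        # UnicodeDecodeError is a ValueError, not an OSError, so the
        # except clause inside read_source does not catch it.
        failed = True

    content = read_source(path, encoding="utf-8")
finally:
    os.remove(path)
```

This also suggests why the existing except OSError does not help: the decode error is raised from f.read() and belongs to a different exception hierarchy.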

@Jarmos-san, I thought it was better to unclog the unresolved issues for the maintainers, but from your perspective it makes sense to keep it open.

I Googled this issue, and I think it can be handled pretty easily if the encoding is set to 'utf-8' correctly. I’m not enough of a guru to contribute directly to the Poetry repository, so I’ll share what I found.

example 1: https://newtoynt.tistory.com/entry/error-UnicodeEncodeError-cp949-codec-cant-encode-character-u2764-in-position-19-illegal-multibyte-sequence?category=541890

example 2: http://blog.naver.com/PostView.nhn?blogId=haya14&logNo=220781798607&parentCategoryNo=&categoryNo=20&viewDate=&isShowPopularPosts=false&from=postView

I think encoding='utf-8' should be added somewhere in the source code.

My initial guess was that the encoding should be set in a call like io.open(foo, 'r', encoding='utf-8') somewhere in the code, but it’s only a guess and the root cause might be somewhere else.
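
When narrowing down where the encoding comes from, it can help to print what Python itself reports. On the Python versions in this thread (3.7 and 3.8), open() without an encoding argument falls back to locale.getpreferredencoding(False), and console output uses sys.stdout.encoding. The function describe_encodings below is a hypothetical diagnostic helper, not something Poetry provides.

```python
import locale
import sys


def describe_encodings():
    """Collect the encodings Python uses by default (hypothetical helper)."""
    return {
        # Default for open() when no encoding argument is passed.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used when printing to the console; may be None if
        # stdout has been replaced by an object without this attribute.
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding used for file names, shown for completeness.
        "filesystem": sys.getfilesystemencoding(),
    }


info = describe_encodings()
for key, value in info.items():
    print(f"{key}: {value}")
```

If "preferred" or "stdout" reports cp949 in the failing shell, that would be consistent with both errors quoted in this thread.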