poetry: poetry develop failing on non-ASCII characters

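With a non-ASCII character in the author name in pyproject.toml:

authors = [
    "Sébastien Eustace <sebastien@eustace.io>"
]

poetry develop fails:

$ poetry develop -vvv

[AttributeError]
'NoneType' object has no attribute 'group'

Replacing the é with a plain e makes the command succeed:

authors = [
    "Sebastien Eustace <sebastien@eustace.io>"
]
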
Installing dependencies from lock file

Nothing to install or update

Installing poetry (0.11.0-alpha.3)

As far as I know, the re library does not support Unicode character classes, whereas the third-party regex library handles them properly.
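To make that claim concrete, here is a minimal sketch using only the standard library. It checks whether `re` accepts a Unicode property class such as `\p{L}`, and separately shows that `\w` does match a precomposed accented letter. The helper name `supports_property_classes` is hypothetical and not part of Poetry.

```python
import re


def supports_property_classes():
    """Return True if the stdlib re module accepts a Unicode property class."""
    try:
        re.compile(r"\p{L}+")
    except re.error:
        # Unknown escapes made of a backslash and an ASCII letter are errors.
        return False
    return True


# "\u00e9" is the precomposed letter é (a single code point).
name = "S\u00e9bastien Eustace"

print(supports_property_classes())
print(re.fullmatch(r"[\w ]+", name) is not None)
```

If the second check succeeds on your interpreter, the failure reported here is probably not caused by `\w` rejecting é as such; how the file was read or decoded may be worth checking, although that is only a guess.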

I don’t know whether this has been brought up before or whether it is Windows-only, given that it happened while running poetry develop on Poetry itself. As far as I could check, nobody has opened an issue about this before.

Windows 10, Python 3.6.4, Poetry 0.11.0a3.

edit: #66 is similar.

In the meantime, a workaround is to handle a failed match instead of crashing:

    def _get_author(self):  # type: () -> dict
+       if self._authors:
+           m = AUTHOR_REGEX.match(self._authors[0])
+       else:
+           m = None

-       if not self._authors:
+       if not m:
+           # log.info('Could not find an author') or whatever
            return {"name": None, "email": None}

-       m = AUTHOR_REGEX.match(self._authors[0])

        name = m.group("name")
        email = m.group("email")

        return {"name": name, "email": email}
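For reference, the patched logic can be sketched as a standalone function. This is only an illustration: `get_author` is a hypothetical free function rather than Poetry's actual method, and the pattern is the `AUTHOR_REGEX` quoted in this thread, which may differ from the one in the Poetry source.

```python
import re

# Pattern as quoted in this thread; the real Poetry source may differ.
AUTHOR_REGEX = re.compile(r"(?u)^(?P<name>[- .,\w\d'’\"()]+) <(?P<email>.+?)>$")


def get_author(authors):
    """Hypothetical standalone version of the patched _get_author."""
    # Only attempt a match when at least one author is declared.
    match = AUTHOR_REGEX.match(authors[0]) if authors else None

    if match is None:
        # Either no author was given or the string did not match the pattern.
        return {"name": None, "email": None}

    return {"name": match.group("name"), "email": match.group("email")}


print(get_author(["Sebastien Eustace <sebastien@eustace.io>"]))
print(get_author([]))
```

Note that this silently drops an author string that fails to match, so logging the failure (as the comment in the diff suggests) would make the problem easier to spot.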

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 4
  • Comments: 26 (12 by maintainers)

Most upvoted comments

@jacebrowning Thanks for the pointer a few months back regarding AUTHOR_REGEX. After a bit of experimentation, I think that this has to do not with Poetry per se but rather with a bug in the re module (see https://github.com/lark-parser/lark/issues/590).

Replacing re with regex solves everything:

import re
AUTHOR_REGEX = re.compile(r"(?u)^(?P<name>[- .,\w\d'’\"()]+) <(?P<email>.+?)>$")
print(AUTHOR_REGEX.match("ம. ஆ. ஜூலீஎன் <julien.malard@mail.mcgill.ca>"))
# None

# But...
import regex as re
AUTHOR_REGEX = re.compile(r"(?u)^(?P<name>[- .,\w\d'’\"()]+) <(?P<email>.+?)>$")
print(AUTHOR_REGEX.match("ம. ஆ. ஜூலீஎன் <julien.malard@mail.mcgill.ca>"))
# <regex.Match object; span=(0, 44), match='ம. ஆ. ஜூலீஎன் <julien.malard@mail.mcgill.ca>'>

So my question is now: should I submit a pull request to Poetry that uses import regex as re? Or would adding a dependency risk breaking things? Thanks!
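To see which characters the standard-library pattern rejects, a small diagnostic like the following may help. It is a sketch using only the stdlib: `NAME_CHAR` is the character class from the name group of the pattern above, and `rejected_chars` is a hypothetical helper. It relies on the fact that the stdlib `\w` covers letters, digits and the underscore but not combining marks.

```python
import re
import unicodedata

# Character class taken from the name group of the AUTHOR_REGEX quoted above.
NAME_CHAR = re.compile(r"[- .,\w\d'’\"()]")


def rejected_chars(name):
    """List (code point, Unicode category) for each character the class rejects."""
    return [
        ("U+%04X" % ord(ch), unicodedata.category(ch))
        for ch in name
        if NAME_CHAR.fullmatch(ch) is None
    ]


# Prints the code points that stop the stdlib pattern from matching.
print(rejected_chars("ம. ஆ. ஜூலீஎன்"))
```

If every rejected character has a mark category (one starting with `M`, such as `Mn` or `Mc`), that would be consistent with the difference between `re` and `regex` described above.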

@vlcinsky sure I understand. I just needed to get something done really quickly.

root@95eea793181d:/app# poetry --version
Poetry version 1.1.4

The full output:

root@95eea793181d:/app# poetry
Poetry version 1.1.4

USAGE

  UnicodeEncodeError

  'ascii' codec can't encode character '\xa0' in position 30: ordinal not in range(128)

  at ~/.poetry/lib/poetry/_vendor/py3.6/clikit/io/output_stream/stream_output_stream.py:24 in write
Traceback (most recent call last):
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/console_application.py", line 131, in run
    status_code = command.handle(parsed_args, io)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/command/command.py", line 120, in handle
    status_code = self._do_handle(args, io)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/command/command.py", line 171, in _do_handle
    return getattr(handler, handler_method)(args, io, self)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/handler/help/help_text_handler.py", line 29, in handle
    usage.render(io)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/help/abstract_help.py", line 31, in render
    layout.render(io, indentation)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/layout/block_layout.py", line 42, in render
    element.render(io, self._indentations[i] + indentation)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/components/labeled_paragraph.py", line 70, in render
    + "\n"
  File "/root/.poetry/lib/poetry/_vendor/py3.6/cleo/io/io_mixin.py", line 55, in write
    super(IOMixin, self).write(string, flags)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/io/io.py", line 58, in write
    self._output.write(string, flags=flags)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/io/output.py", line 61, in write
    self._stream.write(to_str(formatted))
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/io/output_stream/stream_output_stream.py", line 24, in write
    self._stream.write(string)
UnicodeEncodeError: 'ascii' codec can't encode character '\xa0' in position 30: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/.poetry/bin/poetry", line 19, in <module>
    main()
  File "/root/.poetry/lib/poetry/console/__init__.py", line 5, in main
    return Application().run()
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/console_application.py", line 142, in run
    trace.render(io, simple=isinstance(e, CliKitException))
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/components/exception_trace.py", line 232, in render
    return self._render_exception(io, self._exception)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/components/exception_trace.py", line 269, in _render_exception
    self._render_snippet(io, current_frame)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/components/exception_trace.py", line 289, in _render_snippet
    self._render_line(io, code_line)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/ui/components/exception_trace.py", line 402, in _render_line
    io.write_line("{}{}".format(indent * " ", line))
  File "/root/.poetry/lib/poetry/_vendor/py3.6/cleo/io/io_mixin.py", line 65, in write_line
    super(IOMixin, self).write_line(string, flags)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/io/io.py", line 66, in write_line
    self._output.write_line(string, flags=flags)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/io/output.py", line 69, in write_line
    self.write(string, flags=flags, new_line=True)
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/api/io/output.py", line 61, in write
    self._stream.write(to_str(formatted))
  File "/root/.poetry/lib/poetry/_vendor/py3.6/clikit/io/output_stream/stream_output_stream.py", line 24, in write
    self._stream.write(string)
UnicodeEncodeError: 'ascii' codec can't encode character '\u2502' in position 27: ordinal not in range(128)
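The traceback shows text being written to a stream that encodes with the `ascii` codec. The sketch below reproduces that class of failure with the standard library alone and shows that the same write succeeds with UTF-8. `try_write` is a hypothetical helper, and the idea that the container's locale makes Python pick ASCII for stdout is an assumption rather than something confirmed in this thread.

```python
import io
import sys


def try_write(text, encoding):
    """Write text to an in-memory stream with the given encoding.

    Returns "ok" on success or the error message on failure.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return str(exc)
    return "ok"


# "\xa0" is a no-break space, the character named in the traceback.
print(try_write("Poetry\xa0version", "ascii"))
print(try_write("Poetry\xa0version", "utf-8"))

# The encoding Python actually chose for stdout in this environment.
print(sys.stdout.encoding)
```

If `sys.stdout.encoding` reports an ASCII codec inside the container, setting the `PYTHONIOENCODING=utf-8` environment variable or using a UTF-8 locale such as `C.UTF-8` may avoid the error; this has not been verified against the setup above.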

I installed it like this:

root@95eea793181d:/app#  curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python
Retrieving Poetry metadata

# Welcome to Poetry!

This will download and install the latest version of Poetry,
a dependency and package manager for Python.

It will add the `poetry` command to Poetry's bin directory, located at:

$HOME/.poetry/bin

This path will then be added to your `PATH` environment variable by
modifying the profile file located at:

$HOME/.profile

You can uninstall at any time by executing this script with the --uninstall option,
and these changes will be reverted.

Installing version: 1.1.4
  - Downloading poetry-1.1.4-linux.tar.gz (57.03MB)

Poetry (1.1.4) is installed now. Great!

To get started you need Poetry's bin directory ($HOME/.poetry/bin) in your `PATH`
environment variable. Next time you log in this will be done
automatically.

To configure your current shell run `source $HOME/.poetry/env`

I have the same problem as @laxan: I can’t init a new project because of a non-ASCII character in my git user name.

Same here! My last name contains an ó.

➜  rgh git:(develop) poetry  init -vvv

This command will guide you through creating your pyproject.toml config.

Package name [rgh]:
Version [0.1.0]:
Description []:
'ascii' codec can't encode character u'\xf3' in position 29: ordinal not in range(128)
'ascii' codec can't encode character u'\xf3' in position 29: ordinal not in range(128)
'ascii' codec can't encode character u'\xf3' in position 29: ordinal not in range(128)
...

I get the same error from poetry build with non-ASCII package names, for example:

[tool.poetry]
name = "lassi"
version = "0.1.0"
description = ""
authors = ["ਜ਼ੂਲੀਏਂ ਮਲਾਰ (Julien Malard) <julien.malard@mail.mcgill.ca>"]
packages = [
    { include = "ਲੱਸੀ" }
]

This is on macOS. Edit: a Unicode author name crashes as well.