cython: [BUG] unicode.split does not accept None for sep

Describe the bug

I'm seeing a difference in behaviour between CPython and Cython for unicode.split: under Cython, passing sep=None explicitly raises TypeError. Details are below.

To Reproduce

Code to reproduce the behaviour:

---- 8< ---- usplit.pyx

# cython: language_level=3

def mysplit(q):
    return unicode.split(q, None)

print(mysplit("hello world"))

Expected behavior

I expect it to behave the same as in Python, i.e. print ['hello', 'world']:

---- 8< ---- usplit_py.py

def mysplit(q):
    return str.split(q, None)

print(mysplit("hello world"))

$ python usplit_py.py
['hello', 'world']

However, what I get instead is an exception indicating that None cannot be used for sep:

$ cythonize -i usplit.pyx 
Compiling /home/kirr/usplit.pyx because it changed.
[1/1] Cythonizing /home/kirr/usplit.pyx
running build_ext
building 'usplit' extension
creating /home/kirr/tmp3kckc5wa/home
creating /home/kirr/tmp3kckc5wa/home/kirr
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -ffile-prefix-map=/build/python3.9-RNBry6/python3.9-3.9.2=. -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -ffile-prefix-map=/build/python3.9-RNBry6/python3.9-3.9.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/home/kirr/src/wendelin/venv/py3.venv/include -I/usr/include/python3.9 -c /home/kirr/usplit.c -o /home/kirr/tmp3kckc5wa/home/kirr/usplit.o
x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fwrapv -O2 -Wl,-z,relro -g -fwrapv -O2 -g -ffile-prefix-map=/build/python3.9-RNBry6/python3.9-3.9.2=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 /home/kirr/tmp3kckc5wa/home/kirr/usplit.o -o /home/kirr/usplit.cpython-39-x86_64-linux-gnu.so
$ python -c 'import usplit'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "usplit.pyx", line 6, in init usplit
    print(mysplit("hello world"))
  File "usplit.pyx", line 4, in usplit.mysplit
    return unicode.split(q, None)
TypeError: must be str, not NoneType
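
Until the behaviour is aligned, one possible workaround is to avoid forwarding None explicitly. The sketch below is only an illustration: split_compat is a hypothetical helper name, not part of Cython or CPython, and it rests on the assumption (taken from the report above) that only an explicitly passed sep=None triggers the error.

```python
def split_compat(text, sep=None, maxsplit=-1):
    """Hypothetical helper: behaves like str.split but never forwards sep=None."""
    if sep is None:
        # Whitespace splitting: omit sep entirely instead of passing None.
        return text.split(maxsplit=maxsplit)
    # An explicit separator is passed through unchanged.
    return text.split(sep, maxsplit)


print(split_compat("hello world"))
print(split_compat("a,b,c", ","))
```

In plain CPython this returns the same results as str.split for the same arguments; whether it sidesteps the TypeError in a given Cython release has not been verified here.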

Environment:

  • OS: Debian GNU/Linux 11
  • Python version: 3.9.2
  • Cython version: 0.29.27

Thanks in advance, Kirill

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

Sure, here is my current list:

  • count
  • endswith
  • find
  • index
  • rfind
  • rindex
  • split (this one)
  • startswith
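
The comment does not state what the listed methods have in common. Assuming they are the str methods whose optional arguments may be given as None in CPython (sep for split, start and end for the others), the following sketch records what CPython itself returns when None is passed explicitly. It makes no claim about what any particular Cython version does.

```python
s = "hello world"

# CPython reference behaviour with None passed explicitly.
results = {
    # sep=None means "split on runs of whitespace".
    "split": str.split(s, None),
    # start=None / end=None mean "no bound", as in slice notation.
    "count": str.count(s, "o", None, None),
    "find": str.find(s, "o", None, None),
    "index": str.index(s, "o", None, None),
    "rfind": str.rfind(s, "o", None, None),
    "rindex": str.rindex(s, "o", None, None),
    "startswith": str.startswith(s, "hello", None, None),
    "endswith": str.endswith(s, "world", None, None),
}

for name, value in results.items():
    print(name, value)
```

Running the same calls from a compiled .pyx module and comparing the output against this script would be one way to check each method in the list.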