tokenizers: The Conda package doesn't work on CentOS 7 and Ubuntu 18.04
`import tokenizers` doesn’t work on CentOS 7 (and RHEL 7), because it has glibc 2.17, while the Conda package (at least for Python 3.8) was compiled against a newer version (the error says `version 'GLIBC_2.18' not found`).
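To check which glibc a given machine actually has before installing, something like the following can help. It is only a minimal sketch: the helper names `glibc_version` and `meets_requirement` are made up for illustration, and the `(2, 18)` requirement is taken from the error message above.

```python
import os
import platform
import re


def glibc_version():
    """Return the running glibc version as (major, minor), or None if unknown."""
    try:
        # On glibc-based Linux this returns a string such as "glibc 2.17".
        raw = os.confstr("CS_GNU_LIBC_VERSION")
    except (AttributeError, ValueError, OSError):
        # Not available on this platform or libc; fall back to platform.libc_ver().
        raw = None
    if not raw:
        lib, version = platform.libc_ver()
        raw = f"{lib} {version}" if lib == "glibc" and version else ""
    match = re.match(r"glibc (\d+)\.(\d+)", raw)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_requirement(required, found):
    """True only if a glibc version was detected and is at least `required`."""
    return found is not None and found >= required


if __name__ == "__main__":
    found = glibc_version()
    print("detected glibc:", found)
    print("satisfies GLIBC_2.18:", meets_requirement((2, 18), found))
```

On CentOS 7 this should report `(2, 17)`, which is below the 2.18 that the Conda build asks for.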
I believe CentOS and RHEL 7 are both still popular, though I can’t back up this claim (I can’t find a link). My university cluster uses it at least. And note version 7 is the latest CentOS version with long-term support.
It does work with the PyPI version, so I just use that one. However, it’d be good if it worked from Conda. So it seems it’s related to the glibc version specified by Conda, while the one provided by `ubuntu-latest` (Ubuntu 18.04 as of today, IIUC) seems to work fine. Conda docs say they do provide their own glibc version. Because there isn’t such a libc package in Conda, they provide a solution via virtual packages. So it seems that all that’s needed is to set an env var like:
CONDA_OVERRIDE_GLIBC=2.17
(note Rust dynamically links the compiled binary to the available glibc version at build time).
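For anyone who wants to try the variable from a script, here is a rough sketch of passing it to a `conda` subprocess. The function names are hypothetical, and it assumes `conda info` lists the `__glibc` virtual package. It only shows how the override reaches Conda; whether it changes what the built binary links against is untested here.

```python
import os
import shutil
import subprocess


def env_with_glibc_override(glibc_version, base_env=None):
    """Return a copy of the environment with CONDA_OVERRIDE_GLIBC set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CONDA_OVERRIDE_GLIBC"] = glibc_version
    return env


def conda_info_with_override(glibc_version="2.17"):
    """Run `conda info` with the override applied; return its output or None."""
    conda = shutil.which("conda")
    if conda is None:
        # conda is not on PATH, so there is nothing to run.
        return None
    result = subprocess.run(
        [conda, "info"],
        env=env_with_glibc_override(glibc_version),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    output = conda_info_with_override("2.17")
    print(output if output is not None else "conda not found on PATH")
```

If the override is picked up, the virtual packages section of the output should mention `__glibc` with the version that was passed in.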
Could you take a look at it? Not sure how it can be tested w/o pushing a release.
Related issues (some people also mention CentOS 7):
- rust-lang/libc#1412
- How to compile rust with a specific GLIBC version for gnueabihf architecture?
- rust-lang/rust#57497
Update: it’s got a bit worse, as now it requires a newer GLIBC version:
/lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found
It’s not working for me on Ubuntu 18.04 (GLIBC 2.27).
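To see which glibc version a compiled extension asks for without importing it, one option is to scan the shared object for its `GLIBC_x.y` version tags. This is a crude sketch (the function name `required_glibc` is made up, and a raw byte scan can over-report compared with reading the ELF version tables properly), but it is enough to tell 2.17 from 2.29.

```python
import re
import sys
from pathlib import Path

# Matches symbol-version tags such as GLIBC_2.17 or GLIBC_2.2.5 (not GLIBC_PRIVATE).
_GLIBC_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)(?:\.(\d+))?")


def required_glibc(path):
    """Return the highest GLIBC_x.y[.z] tag found in the file, or None."""
    data = Path(path).read_bytes()
    versions = {
        tuple(int(group) for group in match.groups() if group is not None)
        for match in _GLIBC_TAG.finditer(data)
    }
    # Tuples compare element by element, so (2, 29) sorts above (2, 2, 5).
    return max(versions) if versions else None


if __name__ == "__main__":
    # Usage: python check_glibc.py path/to/extension.so
    for so_path in sys.argv[1:]:
        print(so_path, required_glibc(so_path))
```

Run it against the `.so` file inside the installed `tokenizers` package and compare the result with the system glibc.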
What this article details may be the solution for it: Building Rust binaries in CI that work with older GLIBC
About this issue
- State: closed
- Created 3 years ago
- Reactions: 26
- Comments: 15
Commits related to this issue
- Attempt at fixing Conda builds Ref #585 — committed to huggingface/tokenizers by n1t0 3 years ago
As a workaround, I’ve installed the previous tokenizers version, and everything works fine now:
use pip instead of conda:
Same issue on Ubuntu 18.04.5 LTS. Ubuntu 18.04’s latest GLIBC version is 2.27. Conda-installed `tokenizers` (through `transformers` installation) version 0.10.2 requires GLIBC 2.29.
Same issue for me on Ubuntu 18.04. I just changed to pip installation, which seems to be working.
This article explains how to fix it, if anybody wants to go ahead: https://kobzol.github.io/rust/ci/2021/05/07/building-rust-binaries-in-ci-that-work-with-older-glibc.html