umap: UMAP Segmentation Faults

Hi,

I thought it best to open a new issue for this, as it seems a little different from https://github.com/lmcinnes/umap/issues/421, where I originally commented on this error.

I’ve started getting a segfault when experimenting with the numba threading layers (setting it to tbb) in order to use UMAP with ProcessPoolExecutor. It appeared very suddenly, and now happens consistently whenever I run UMAP inside a script, regardless of the threading layer or whether it is running inside a process pool.
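
For reference, the kind of setup described above can be sketched roughly as follows. This is a minimal illustration, not the actual script: embed_one and embed_all are hypothetical names, and it assumes the threading layer is selected through the NUMBA_THREADING_LAYER environment variable before numba is first imported.

```python
import os

# Assumption: select the threading layer via the environment, before numba
# (and therefore umap) is imported anywhere in this process.
os.environ["NUMBA_THREADING_LAYER"] = "tbb"

from concurrent.futures import ProcessPoolExecutor


def embed_one(data):
    """Hypothetical worker: fit UMAP on one array and return the embedding."""
    import umap  # imported lazily, inside the worker, after the env var is set

    return umap.UMAP(n_components=2).fit_transform(data)


def embed_all(datasets, max_workers=2):
    """Hypothetical driver: embed several datasets in separate processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embed_one, datasets))
```

A script would call embed_all on a list of arrays from under an if __name__ == "__main__": guard, which is the code path where the crash shows up.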

The weird thing is that the segfault does not occur if I run UMAP in an interactive Python terminal; it only occurs when I run it from the command line through a script.

The error looks like this on one set of data:

Fatal Python error: Segmentation fault

Current thread 0x00007f5c2431c700 (most recent call first):
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/umap/umap_.py", line 580 in fuzzy_simplicial_set
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/umap/umap_.py", line 2373 in fit
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/flight/rosella/embedding.py", line 405 in fit_transform
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/flight/rosella/rosella.py", line 248 in perform_binning
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/flight/flight.py", line 442 in bin
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/flight/flight.py", line 365 in main
  File "/home/n10853499/.conda/envs/rosella-dev/bin/flight", line 8 in <module>
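
The trace above is in the format that Python's built-in faulthandler module prints on a fatal signal. As a sketch for anyone trying to capture the same kind of trace from a script or a worker process (enable_crash_traces is a hypothetical helper name, not part of UMAP or flight):

```python
import faulthandler
import sys


def enable_crash_traces(log_path=None):
    """Turn on faulthandler and return the stream it writes to.

    If log_path is given, traces go to that file; otherwise to stderr.
    The returned stream must stay open for as long as faulthandler is enabled.
    """
    stream = open(log_path, "w") if log_path else sys.stderr
    # all_threads=True dumps every thread, not only the one that crashed.
    faulthandler.enable(file=stream, all_threads=True)
    return stream
```

Calling this at the top of the script, or at the start of each pool worker with its own log file, keeps the output of concurrent crashes from being written into each other.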

And there is a secondary error on another set of data that looks like this:

Fatal Python error: Segmentation fault

Thread 0x00007f22211a4700 (most recent call first):
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/pynndescent/pynndescent_.py", line 874 in __inFatal Python error: iSegmentation faultt

__
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/umap/umap_.py", line 328 in nearest_neighbors
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/umap/umap_.py", line 2415 in fit
  File "/home/n10853499/.conda/envs/rosella-dev/lib/python3.8/site-packages/flight/rosella/embedding.py", line 405 in fit_transform
  File "/home/n10853499/Segmentation fault (core dumped)

Downgrading numba doesn’t help with this issue, nor does downgrading pynndescent or using the master branch from the pynndescent GitHub repository. Additionally, this is happening in a fresh conda environment, so something pretty odd seems to be going on.

My conda environment looks like this:

# packages in environment at /home/n10853499/.conda/envs/rosella-dev:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
attrs                     21.2.0             pyhd8ed1ab_0    conda-forge
backcall                  0.2.0              pyh9f0ad1d_0    conda-forge
backports                 1.0                        py_2    conda-forge
backports.functools_lru_cache 1.6.4              pyhd8ed1ab_0    conda-forge
biopython                 1.79             py38h497a2fe_0    conda-forge
blis                      0.8.1                h7f98852_1    conda-forge
brotlipy                  0.7.0           py38h497a2fe_1001    conda-forge
bwa                       0.7.17               h5bf99c6_8    bioconda
bzip2                     1.0.8                h7f98852_4    conda-forge
ca-certificates           2021.5.30            ha878542_0    conda-forge
cachecontrol              0.12.6                     py_0    conda-forge
certifi                   2021.5.30        py38h578d9bd_0    conda-forge
cffi                      1.14.4           py38ha312104_0    conda-forge
chardet                   4.0.0            py38h578d9bd_1    conda-forge
charset-normalizer        2.0.0              pyhd8ed1ab_0    conda-forge
cryptography              3.4.7            py38ha5dfef3_0    conda-forge
curl                      7.71.1               he644dc0_3    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cython                    0.29.24          py38h709712a_0    conda-forge
decorator                 5.0.9              pyhd8ed1ab_0    conda-forge
flight-genome             1.2.1              pyh5e36f6f_0    bioconda
freetype                  2.10.4               h0708190_1    conda-forge
gsl                       2.6                  he838d99_2    conda-forge
hdbscan                   0.8.27           py38h5c078b8_0    conda-forge
hdmedians                 0.14.2           py38hb5d20a5_0    conda-forge
htslib                    1.9                  h4da6232_3    bioconda
idna                      3.1                pyhd3deb0d_0    conda-forge
imageio                   2.9.0                      py_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
ipython                   7.26.0           py38he5a9106_0    conda-forge
ipython_genutils          0.2.0                      py_1    conda-forge
jedi                      0.18.0           py38h578d9bd_2    conda-forge
joblib                    0.17.0                     py_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
k8                        0.2.5                h9a82719_1    bioconda
kiwisolver                1.3.1            py38h1fd1430_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
libcblas                  3.9.0               10_openblas    conda-forge
libcurl                   7.71.1               hcdd3856_3    conda-forge
libdeflate                1.6                  h516909a_0    conda-forge
libedit                   3.1.20191231         h46ee950_2    conda-forge
libffi                    3.2.1             he1b5a44_1007    conda-forge
libgcc-ng                 11.1.0               hc902ee8_8    conda-forge
libgfortran-ng            11.1.0               h69a702a_8    conda-forge
libgfortran5              11.1.0               h6c583b3_8    conda-forge
libgomp                   11.1.0               hc902ee8_8    conda-forge
liblapack                 3.9.0               10_openblas    conda-forge
libllvm10                 10.0.1               he513fc3_3    conda-forge
libopenblas               0.3.17          pthreads_h8fe5266_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libssh2                   1.9.0                ha56f1ee_6    conda-forge
libstdcxx-ng              11.1.0               h56837e0_8    conda-forge
libtiff                   4.3.0                hf544144_0    conda-forge
libwebp-base              1.2.0                h7f98852_2    conda-forge
llvmlite                  0.36.0           py38h4630a5e_0    conda-forge
lockfile                  0.12.2                     py_1    conda-forge
lz4-c                     1.9.3                h9c3ff4c_1    conda-forge
matplotlib-base           3.4.2            py38hcc49a3a_0    conda-forge
matplotlib-inline         0.1.2              pyhd8ed1ab_2    conda-forge
minimap2                  2.21                 h5bf99c6_0    bioconda
more-itertools            8.8.0              pyhd8ed1ab_0    conda-forge
msgpack-python            1.0.2            py38h1fd1430_1    conda-forge
natsort                   7.1.1              pyhd8ed1ab_0    conda-forge
ncurses                   6.1               hf484d3e_1002    conda-forge
numba                     0.53.1           py38h8b71fd7_1    conda-forge
numpy                     1.21.1           py38h9894fe3_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openblas                  0.3.17          pthreads_h4748800_1    conda-forge
openjpeg                  2.4.0                hb52868f_1    conda-forge
openssl                   1.1.1k               h7f98852_0    conda-forge
packaging                 21.0               pyhd8ed1ab_0    conda-forge
pandas                    1.3.1            py38h1abd341_0    conda-forge
parallel                  20160622                      1    bioconda
parso                     0.8.2              pyhd8ed1ab_0    conda-forge
patsy                     0.5.1                      py_0    conda-forge
perl                      5.32.1          0_h7f98852_perl5    conda-forge
perl-threaded             5.26.0                        0    bioconda
pexpect                   4.8.0              pyh9f0ad1d_2    conda-forge
pickleshare               0.7.5                   py_1003    conda-forge
pillow                    8.3.1            py38h8e6f84c_0    conda-forge
pip                       21.2.2             pyhd8ed1ab_0    conda-forge
pkg-config                0.29.2            h36c2ea0_1008    conda-forge
pluggy                    0.13.1           py38h578d9bd_4    conda-forge
prompt-toolkit            3.0.19             pyha770c72_0    conda-forge
ptyprocess                0.7.0              pyhd3deb0d_0    conda-forge
py                        1.10.0             pyhd3deb0d_0    conda-forge
pycparser                 2.20               pyh9f0ad1d_2    conda-forge
pygments                  2.9.0              pyhd8ed1ab_0    conda-forge
pynndescent               0.5.4              pyh6c4a22f_0    conda-forge
pyopenssl                 20.0.1             pyhd8ed1ab_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pysam                     0.16.0.1         py38hbdc2ae9_1    bioconda
pysocks                   1.7.1            py38h578d9bd_3    conda-forge
pytest                    6.2.4            py38h578d9bd_0    conda-forge
python                    3.8.5           h4d41432_2_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.8                      2_cp38    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
readline                  8.0                  h46ee950_1    conda-forge
requests                  2.26.0             pyhd8ed1ab_0    conda-forge
rosella                   0.3.3                h443a992_0    bioconda
samtools                  1.9                 h10a08f8_12    bioconda
scikit-bio                0.5.6            py38h0b5ebd8_4    conda-forge
scikit-learn              0.24.2           py38hdc147b9_0    conda-forge
scipy                     1.7.1            py38h56a6a73_0    conda-forge
seaborn                   0.11.1               hd8ed1ab_1    conda-forge
seaborn-base              0.11.1             pyhd8ed1ab_1    conda-forge
setuptools                49.6.0           py38h578d9bd_3    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
sqlite                    3.32.3               hcee41ef_1    conda-forge
starcode                  1.4                  h779adbc_1    bioconda
statsmodels               0.12.2           py38h5c078b8_0    conda-forge
tbb                       2020.2               h4bd325d_4    conda-forge
threadpoolctl             2.2.0              pyh8a188c0_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
tornado                   6.1              py38h497a2fe_1    conda-forge
traitlets                 5.0.5                      py_0    conda-forge
umap-learn                0.5.1            py38h578d9bd_1    conda-forge
urllib3                   1.26.6             pyhd8ed1ab_0    conda-forge
vt                        2015.11.10           he941832_3    bioconda
wcwidth                   0.2.5              pyh9f0ad1d_2    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.5.0                ha95c52a_0    conda-forge

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 28 (3 by maintainers)

Most upvoted comments

The specific issue to do with changing the number of threads in use between first compilation and cache replay has been fixed in https://github.com/numba/numba/pull/7625 and is in the 0.56.x release series of Numba. Updating the Numba version in your environment to 0.56.x should mitigate this specific issue.

I guess this issue could be marked as fixed/closed if UMAP were to require Numba 0.56.x onward as a dependency, or if UMAP only enabled @jit(cache=True) when Numba 0.56.x onward is present.
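
The second option could be sketched along these lines. This is only an illustration of the idea, not UMAP's actual code: numba_cache_is_safe and jit_kwargs are hypothetical helpers, and the 0.56 threshold is taken from the comment above.

```python
import re


def numba_cache_is_safe(version, minimum=(0, 56)):
    """Return True if a Numba version string is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return False  # unrecognised version format: stay conservative
    return (int(match.group(1)), int(match.group(2))) >= minimum


def jit_kwargs(version):
    """Hypothetical helper: keyword arguments to pass to numba.jit / njit."""
    return {"cache": numba_cache_is_safe(version)}
```

A library could then decorate its functions with @numba.njit(**jit_kwargs(numba.__version__)), so that on-disk caching is switched off for older Numba releases instead of being required unconditionally.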