rdkit: RDKit segfaults on macOS

Description:

  • RDKit Version:
$ python -c "import rdkit; rdkit.__version__"
Segmentation fault: 11
  • Platform: macOS, anaconda Python, fresh environment.

Conda environment spec:

# packages in environment at /Users/ericmjl/anaconda/envs/rdkit_test:
#
# Name                    Version                   Build  Channel
blas                      1.0                         mkl  
boost                     1.68.0           py37h3e44d54_1    conda-forge
boost-cpp                 1.68.0               h3a22d5f_0    conda-forge
bzip2                     1.0.6                h1de35cc_5  
ca-certificates           2018.03.07                    0  
cairo                     1.14.12              hc4e6be7_4  
certifi                   2018.11.29               py37_0  
fontconfig                2.13.0               h5d5b041_1  
freetype                  2.9.1                hb4e5f40_0  
gettext                   0.19.8.1             h15daf44_3  
glib                      2.56.2               hd9629dc_0  
icu                       58.2                 h4b95b61_1  
intel-openmp              2019.1                      144  
jpeg                      9b                   he5867d9_2  
libcxx                    4.0.1                hcfea43d_1  
libcxxabi                 4.0.1                hcfea43d_1  
libedit                   3.1.20170329         hb402a30_2  
libffi                    3.2.1                h475c297_4  
libgfortran               3.0.1                h93005f0_2  
libiconv                  1.15                 hdd342a3_7  
libpng                    1.6.35               ha441bb4_0  
libtiff                   4.0.9                hcb84e12_2  
libxml2                   2.9.8                hab757c2_1  
mkl                       2019.1                      144  
mkl_fft                   1.0.6            py37h27c97d8_0  
mkl_random                1.0.2            py37h27c97d8_0  
ncurses                   6.1                  h0a44026_1  
numpy                     1.15.4           py37hacdab7b_0  
numpy-base                1.15.4           py37h6575580_0  
olefile                   0.46                     py37_0  
openssl                   1.1.1a               h1de35cc_0  
pandas                    0.23.4           py37h6440ff4_0  
pcre                      8.42                 h378b8a2_0  
pillow                    5.3.0            py37hb68e598_0  
pip                       18.1                     py37_0  
pixman                    0.34.0               hca0a616_3  
pycairo                   1.18.0           py37ha54c0a8_0  
python                    3.7.2                haf84260_0  
python-dateutil           2.7.5                    py37_0  
pytz                      2018.7                   py37_0  
rdkit                     2018.09.1       py37hdb4e85a_1000    conda-forge
readline                  7.0                  h1de35cc_5  
setuptools                40.6.3                   py37_0  
six                       1.12.0                   py37_0  
sqlite                    3.26.0               ha441bb4_0  
tk                        8.6.8                ha441bb4_0  
wheel                     0.32.3                   py37_0  
xz                        5.2.4                h1de35cc_4  
zlib                      1.2.11               h1de35cc_3  

Code sample:

$ python -c "import rdkit; rdkit.__version__"
Segmentation fault: 11
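
For anyone trying to reproduce this without taking down their own shell session or notebook kernel, here is a minimal sketch that runs the import in a child interpreter and reports how the child ended. The helper name `probe_import` is made up for illustration and is not part of RDKit or conda. It relies on the documented behaviour of `subprocess` on POSIX systems, where a negative return code means the child was killed by that signal.

```python
import signal
import subprocess
import sys


def probe_import(module_name, python=sys.executable):
    """Import `module_name` in a child interpreter and describe the outcome.

    Returns a (ok, detail) tuple. `probe_import` is a hypothetical helper
    name used only for this sketch.
    """
    result = subprocess.run(
        [python, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True, "import succeeded"
    if result.returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        sig_name = signal.Signals(-result.returncode).name
        return False, f"killed by {sig_name}"
    # Positive codes are ordinary failures, e.g. an uncaught ImportError.
    return False, f"exited with code {result.returncode}"


if __name__ == "__main__":
    print(probe_import("rdkit"))
```

In an environment affected by this issue, I would expect the detail string to name SIGSEGV, but I have not run it against the exact environment listed above.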

It appears that segfault issues on macOS have been happening for quite a while now:

About this issue

  • State: open
  • Created 5 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

@ericmjl that doesn’t help when we want to use multiple tools together. That said it is helpful for troubleshooting.

Not sure where I first saw the double-colon notation, maybe from @pstjohn. I just looked for some documentation now, and couldn’t find much. It seems to be part of the “MatchSpec package query language”. These conda 4.4 release notes have some info: https://www.anaconda.com/blog/developer-blog/how-to-get-ready-for-the-release-of-conda-4-4/

Conda has a built-in query language for searching for and matching packages, what we often refer to as MatchSpec. The MatchSpec is what users input on the command line when they specify packages for create, install, update, and remove operations. … We have also substantially enhanced our MatchSpec query language. For example,

conda install conda-forge::python

is now a valid command, which specifies that regardless of the active list of channel priorities, the python package itself should come from the conda-forge channel. … The canonical string form for a MatchSpec is thus

(channel::)name(version(build_string))

Environments track user-requested state: Building on our enhanced MatchSpec query language, conda environments now also track and differentiate (a) packages added to an environment because of an explicit user request from (b) packages brought into an environment to satisfy dependencies. For example, executing

conda install conda-forge::scikit-learn

will confine all future changes to the scikit-learn package in the environment to the conda-forge channel, until the spec is changed again. A subsequent command conda install scikit-learn=0.18 would drop the conda-forge channel restriction from the package. And in this case, scikit-learn is the only user-defined spec, so the solver chooses dependencies from all configured channels and all available versions.

So essentially I think you are pinning only rdkit specifically to the conda-forge channel, but all other dependencies continue to be resolved using the regular channel priority order in .condarc, which may just be the defaults channel.
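
To make the canonical form quoted above a bit more concrete, here is a rough sketch that splits a spec string into its channel, name, version and build parts. This is not conda's own parser. `SimpleSpec` and `parse_simple_spec` are names invented for this example, and it only handles the plain `channel::name=version=build` shape, rejecting comparison operators and anything else the real MatchSpec language supports.

```python
import re
from typing import NamedTuple, Optional


class SimpleSpec(NamedTuple):
    """Hypothetical container for the parts of a simplified spec string."""
    channel: Optional[str]
    name: str
    version: Optional[str]
    build: Optional[str]


def parse_simple_spec(spec: str) -> SimpleSpec:
    """Parse '(channel::)name(=version(=build))' and nothing fancier."""
    channel = None
    rest = spec.strip()
    if "::" in rest:
        # Everything before the double colon pins the channel for this package.
        channel, rest = rest.split("::", 1)
    parts = rest.split("=")
    # Reject '==', trailing '=', and more than name=version=build.
    if len(parts) > 3 or "" in parts:
        raise ValueError(f"unsupported spec: {spec!r}")
    name = parts[0]
    if not re.fullmatch(r"[A-Za-z0-9_.\-]+", name):
        raise ValueError(f"unsupported package name in spec: {spec!r}")
    version = parts[1] if len(parts) > 1 else None
    build = parts[2] if len(parts) > 2 else None
    return SimpleSpec(channel, name, version, build)


if __name__ == "__main__":
    # Only rdkit carries a channel pin here; its dependencies are not mentioned.
    print(parse_simple_spec("conda-forge::rdkit"))
    # A plain spec has no channel component at all.
    print(parse_simple_spec("scikit-learn=0.18"))
```

The point of the sketch is that the channel is a property of one package's spec, which fits the reading that `conda-forge::rdkit` constrains only rdkit itself.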

Yeah, conda-forge is awesome and an amazing service to the community. It can, however, be pretty heavyweight if you don’t know to run conda install conda-forge::rdkit instead of conda install -c conda-forge rdkit, and it can sometimes lead to a bit of dependency hell with projects that aren’t in conda-forge.

We discussed what to do here briefly on the mailing list a while ago and decided that we need to keep the rdkit channel alive for testing purposes and to make sure that there’s an option for people who don’t want to (or can’t) switch over to conda-forge.

I should update the RDKit documentation to suggest installation via the conda-forge channel before the rdkit channel.

As a side note, I also can confirm that conda-forge RDKit works! 😄 😄 😄

The boost version is wrong (we need to fix the recipe).

conda install boost=1.65.0

Should fix it.
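
If you want to check which boost build an environment actually has before and after the downgrade, a small sketch like the one below can parse the text that `conda list` prints. `parse_conda_list` is a made-up helper name, the sample rows are copied from the environment spec in the issue description, and the 1.65.0 target comes only from the comment above, so treat it as an assumption.

```python
def parse_conda_list(text):
    """Parse `conda list` style text into {name: {version, build, channel}}.

    Hypothetical helper for illustration; it assumes whitespace-separated
    columns and treats a missing fourth column as "no channel shown".
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not a package row
        name, version, build = fields[:3]
        channel = fields[3] if len(fields) > 3 else None
        packages[name] = {"version": version, "build": build, "channel": channel}
    return packages


# Rows copied from the environment listing in the issue description.
SAMPLE = "\n".join([
    "# Name                    Version                   Build  Channel",
    "boost                     1.68.0           py37h3e44d54_1    conda-forge",
    "boost-cpp                 1.68.0               h3a22d5f_0    conda-forge",
    "python                    3.7.2                haf84260_0",
    "rdkit                     2018.09.1       py37hdb4e85a_1000    conda-forge",
])

pkgs = parse_conda_list(SAMPLE)
# Assumption: 1.65.0 is the version suggested in the comment above.
needs_downgrade = pkgs["boost"]["version"] != "1.65.0"
print(pkgs["boost"], needs_downgrade)
```

To run it against a live environment, you could pass in the captured output of `conda list` instead of `SAMPLE`.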