scikit-learn: Tests failing on master due to tree recythonization issues
Description
The last commit to master, https://github.com/scikit-learn/scikit-learn/commit/376aa50e70d7b45e115e01654bc0a91b5cb9b60d (from #6954), is failing tests on Travis CI, although the PR itself passed all tests. The cause is a phenomenon that people who work with the tree module are quite familiar with: for some reason, not all of the dependencies are recythonized, so the import fails (in this case with the ValueError shown under Actual Results).
This behavior has been reported several times, e.g. at: https://github.com/scikit-learn/scikit-learn/pull/4899#issuecomment-209023134
Steps/Code to Reproduce
see: https://travis-ci.org/scikit-learn/scikit-learn/jobs/147795508 line 2226
Expected Results
Tests pass
Actual Results
Tests fail
======================================================================
ERROR: Failure: ValueError (sklearn.tree._tree.TreeBuilder has the wrong size, try recompiling. Expected 72, got 64)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/scikit-learn/scikit-learn/testvenv/local/lib/python2.7/site-packages/nose/loader.py", line 418, in loadTestsFromName
    addr.filename, addr.module)
  File "/home/travis/build/scikit-learn/scikit-learn/testvenv/local/lib/python2.7/site-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/home/travis/build/scikit-learn/scikit-learn/testvenv/local/lib/python2.7/site-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/home/travis/sklearn_build_ubuntu/scikit-learn/sklearn/ensemble/__init__.py", line 17, in <module>
    from .gradient_boosting import GradientBoostingClassifier
  File "/home/travis/sklearn_build_ubuntu/scikit-learn/sklearn/ensemble/gradient_boosting.py", line 36, in <module>
    from ._gradient_boosting import predict_stages
  File "sklearn/tree/_tree.pxd", line 84, in init sklearn.ensemble._gradient_boosting (sklearn/ensemble/_gradient_boosting.c:19325)
ValueError: sklearn.tree._tree.TreeBuilder has the wrong size, try recompiling. Expected 72, got 64
Misc Comments
I’ve been using hacky methods to get around this for a while (e.g. manually adding a change to the file that isn’t being recythonized), and it also seems to work if someone clears the Travis cache. However, these are annoying “solutions”. Is there any way we can fix the behavior? Namely, it seems that _gradient_boosting.pyx isn’t being recythonized when things change in the tree module.
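To make the suspected failure mode concrete, here is a rough sketch of a checker that lists .pyx files whose generated .c file is older than a .pxd they cimport. It is not part of scikit-learn's build: the function names (cimported_stems, find_stale_pyx) are hypothetical, it matches .pxd files by bare module name only, and it assumes staleness can be judged from modification times. If the build tracks content hashes rather than timestamps, the comparison would need to change accordingly.

```python
import re
from pathlib import Path

# Matches "from a.b.c cimport x" and "cimport a.b.c" at the start of a line.
CIMPORT_RE = re.compile(
    r"^\s*(?:from\s+([\w.]+)\s+cimport|cimport\s+([\w.]+))", re.MULTILINE
)


def cimported_stems(source):
    """Return the last dotted component of every cimported module name."""
    names = (a or b for a, b in CIMPORT_RE.findall(source))
    return {name.rsplit(".", 1)[-1] for name in names}


def find_stale_pyx(root):
    """Return .pyx files whose generated .c is older than a .pxd they depend on.

    Hypothetical helper: dependencies are resolved by file stem only, which is
    a simplification of how Cython actually locates .pxd files.
    """
    root = Path(root)

    # Newest modification time seen for each .pxd stem under root.
    pxd_mtime = {}
    for pxd in root.rglob("*.pxd"):
        mtime = pxd.stat().st_mtime
        pxd_mtime[pxd.stem] = max(mtime, pxd_mtime.get(pxd.stem, 0.0))

    stale = []
    for pyx in sorted(root.rglob("*.pyx")):
        c_file = pyx.with_suffix(".c")
        if not c_file.exists():
            continue  # never cythonized, so there is no cached output to go stale
        c_mtime = c_file.stat().st_mtime
        stems = cimported_stems(pyx.read_text())
        stems.add(pyx.stem)  # a module's own .pxd is a dependency too
        if any(pxd_mtime.get(stem, 0.0) > c_mtime for stem in stems):
            stale.append(pyx)
    return stale


if __name__ == "__main__":
    for path in find_stale_pyx("sklearn"):
        print("stale:", path)
```

Run from a checkout, this would flag _gradient_boosting.pyx after an edit to the tree module's _tree.pxd, provided the cached _gradient_boosting.c predates that edit.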
About this issue
- State: closed
- Created 8 years ago
- Comments: 18 (18 by maintainers)
A quick stop-gap solution would be to always run make clean on Travis. That way you always recythonize from scratch and you don’t have this dependency issue. Cythonization of all files takes 40s on my machine, so I agree the caching is not that crucial. Looking at timings on Travis, most builds take 8-12 minutes, so caching saves at most about 8% of build time for each individual build.
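The 8% figure follows from dividing the cythonization time by the total build time. A small sketch of that arithmetic, assuming the 40s measured on a local machine carries over to Travis (the function name cythonize_share is only for illustration):

```python
def cythonize_share(cythonize_seconds, build_minutes):
    """Fraction of a build spent on cythonizing everything from scratch."""
    return cythonize_seconds / (build_minutes * 60)


if __name__ == "__main__":
    # 40s is the local timing quoted above; 8 and 12 minutes bound typical builds.
    for build_minutes in (8, 12):
        share = cythonize_share(40, build_minutes)
        print(f"{build_minutes} min build: {share:.1%}")
    # prints "8 min build: 8.3%" then "12 min build: 5.6%"
```

So the cache saves somewhere between roughly 5.6% and 8.3% of a build under these assumptions.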
I think raghav is on the right track. Several times I’ve modified tree Cython files and gotten this error in gradient boosting.