scikit-learn: MNIST classification example seems to be broken
Description
I cannot reproduce the "MNIST classification using multinomial logistic + L1" example.
Steps/Code to Reproduce
I copied the whole code, as is, into a Jupyter Notebook.
Expected Results
No errors and similar results to the example’s web page.
Actual Results
Automatically created module for IPython interactive environment
---------------------------------------------------------------------------
JSONDecodeError Traceback (most recent call last)
<ipython-input-5-e53e77eeb3b2> in <module>
19
20 # Load data from https://www.openml.org/d/554
---> 21 X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
22
23 random_state = check_random_state(0)
/opt/tljh/user/lib/python3.6/site-packages/sklearn/datasets/openml.py in fetch_openml(name, version, data_id, data_home, target_column, cache, return_X_y)
478 "data_id.")
479
--> 480 data_description = _get_data_description_by_id(data_id, data_home)
481 if data_description['status'] != "active":
482 warn("Version {} of dataset {} is inactive, meaning that issues have "
/opt/tljh/user/lib/python3.6/site-packages/sklearn/datasets/openml.py in _get_data_description_by_id(data_id, data_home)
293 error_message = "Dataset with data_id {} not found.".format(data_id)
294 json_data = _get_json_content_from_openml_api(url, error_message, True,
--> 295 data_home)
296 return json_data['data_set_description']
297
/opt/tljh/user/lib/python3.6/site-packages/sklearn/datasets/openml.py in _get_json_content_from_openml_api(url, error_message, raise_if_error, data_home)
131 else:
132 return None
--> 133 json_data = json.loads(response.read().decode("utf-8"))
134 response.close()
135 return json_data
/opt/tljh/user/lib/python3.6/json/__init__.py in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
352 parse_int is None and parse_float is None and
353 parse_constant is None and object_pairs_hook is None and not kw):
--> 354 return _default_decoder.decode(s)
355 if cls is None:
356 cls = JSONDecoder
/opt/tljh/user/lib/python3.6/json/decoder.py in decode(self, s, _w)
337
338 """
--> 339 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
340 end = _w(s, end).end()
341 if end != len(s):
/opt/tljh/user/lib/python3.6/json/decoder.py in raw_decode(self, s, idx)
355 obj, end = self.scan_once(s, idx)
356 except StopIteration as err:
--> 357 raise JSONDecodeError("Expecting value", s, err.value) from None
358 return obj, end
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
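The last frame suggests that the body handed to json.loads did not begin with a JSON value. As a minimal sketch, the same message can be reproduced with the standard library alone. The HTML string below is a made-up stand-in, since the actual server response is not shown in this report, and try_decode is a hypothetical helper name.

```python
import json

# Hypothetical stand-in for the server response: an HTML page instead of JSON.
# The real response body is not shown in this report.
html_body = "<html><body><b>Warning</b>: something failed</body></html>"


def try_decode(body):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as err:
        return None, str(err)


parsed, message = try_decode(html_body)
print(message)  # Expecting value: line 1 column 1 (char 0)
```

Any body whose first non-whitespace character cannot start a JSON value fails at position 0 in this way, which matches the "line 1 column 1 (char 0)" in the traceback.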
Versions
System
------
python: 3.6.5 |Anaconda, Inc.| (default, Apr 29 2018, 16:14:56) [GCC 7.2.0]
executable: /opt/tljh/user/bin/python
machine: Linux-4.15.0-38-generic-x86_64-with-debian-buster-sid
BLAS
----
macros:
lib_dirs:
cblas_libs: cblas
Python deps
-----------
pip: 10.0.1
setuptools: 39.1.0
sklearn: 0.20.0
numpy: 1.15.3
scipy: 1.1.0
Cython: None
pandas: None
as well as these warnings:
/opt/tljh/user/lib/python3.6/site-packages/numpy/distutils/system_info.py:625: UserWarning:
Atlas (http://math-atlas.sourceforge.net/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [atlas]) or by setting
the ATLAS environment variable.
self.calc_info()
/opt/tljh/user/lib/python3.6/site-packages/numpy/distutils/system_info.py:625: UserWarning:
Blas (http://www.netlib.org/blas/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [blas]) or by setting
the BLAS environment variable.
self.calc_info()
/opt/tljh/user/lib/python3.6/site-packages/numpy/distutils/system_info.py:625: UserWarning:
Blas (http://www.netlib.org/blas/) sources not found.
Directories to search for the sources can be specified in the
numpy/distutils/site.cfg file (section [blas_src]) or by setting
the BLAS_SRC environment variable.
self.calc_info()
About this issue
- State: closed
- Created 6 years ago
- Comments: 17 (13 by maintainers)
Thanks for the quick response, can confirm it works on scikit-learn master now.
Sorry, this was caused by a bad configuration setting on the server. Fixed now.
Oh god. PHP rendering HTML-wrapped error messages to stdout! 😦 @janvanrijn
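For anyone hitting a similar failure, here is a rough sketch of how a client could report an HTML-wrapped response more clearly than a bare JSONDecodeError. This is not what scikit-learn does. The helper name decode_json_or_explain, the example URL, and the sample body are all assumptions made for illustration.

```python
import json


def decode_json_or_explain(body, url="<unknown url>"):
    """Parse body as JSON; on failure, raise ValueError with a short preview.

    Hypothetical helper for illustration only, not part of scikit-learn.
    """
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        preview = body.strip()[:60]
        # A leading "<" is a crude hint that the server sent markup, not JSON.
        kind = "HTML-like" if preview.startswith("<") else "non-JSON"
        raise ValueError(
            "Expected JSON from {} but got {} content starting with {!r}".format(
                url, kind, preview
            )
        ) from err


# Usage with a made-up body resembling an HTML-wrapped error message.
try:
    decode_json_or_explain("<br />\n<b>Notice</b>: example", url="https://example.invalid/api")
except ValueError as exc:
    print(exc)
```

Including the first few characters of the body in the message makes it easier to tell a server-side misconfiguration apart from a client-side bug.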
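The important code is the fetch_openml call from the traceback: X, y = fetch_openml('mnist_784', version=1, return_X_y=True)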
I was able to reproduce the error on the current release of scikit-learn but not on the most recent development version in this repo. I know some work was done recently on fetch_openml, so I believe the issue has been resolved. However, if someone else could confirm, that would be great.