hvplot: sample_data try/except import wrapper fails

ALL software version info

hvplot: 0.7.3

Description of expected behavior and the observed behavior

The following import fails with a ValueError, despite the catch-all except in the code. (Honestly stumped.)

from hvplot.sample_data import us_crime, airline_flights
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_3185062/1788543639.py in <module>
----> 1 from hvplot.sample_data import us_crime, airline_flights

~/miniconda3/envs/py39/lib/python3.9/site-packages/hvplot/sample_data.py in <module>
     23 # Add catalogue entries to namespace
     24 for _c in catalogue:
---> 25     globals()[_c] = catalogue[_c]

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/base.py in __getitem__(self, key)
    398             if e.container == 'catalog':
    399                 return e(name=key)
--> 400             return e()
    401         if isinstance(key, str) and '.' in key:
    402             key = key.split('.')

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/entry.py in __call__(self, persist, **kwargs)
     75             raise ValueError('Persist value (%s) not understood' % persist)
     76         persist = persist or self._pmode
---> 77         s = self.get(**kwargs)
     78         if persist != 'never' and isinstance(s, PersistMixin) and s.has_been_persisted:
     79             from ..container.persist import store

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/local.py in get(self, **user_parameters)
    287             return self._default_source
    288 
--> 289         plugin, open_args = self._create_open_args(user_parameters)
    290         data_source = plugin(**open_args)
    291         data_source.catalog_object = self._catalog

~/miniconda3/envs/py39/lib/python3.9/site-packages/intake/catalog/local.py in _create_open_args(self, user_parameters)
    261 
    262         if len(self._plugin) == 0:
--> 263             raise ValueError('No plugins loaded for this entry: %s\n'
    264                              'A listing of installable plugins can be found '
    265                              'at https://intake.readthedocs.io/en/latest/plugin'

ValueError: No plugins loaded for this entry: parquet
A listing of installable plugins can be found at https://intake.readthedocs.io/en/latest/plugin-directory.html .

For reference, this is the code in 0.7.3:

import os

try:
    from intake import open_catalog
except:
    raise ImportError('Loading hvPlot sample data requires intake '
                      'and intake-parquet. Install it using conda or '
                      'pip before loading data.')

How can intake throw a ValueError??
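
One reading of the traceback: the try/except only guards the `from intake import open_catalog` statement, while the ValueError is raised later, at line 25 of sample_data.py, when each catalogue entry is accessed. A minimal sketch of that pattern follows; `FakeCatalogue` and `load_sample_data` are hypothetical stand-ins for illustration, not hvplot or intake code.

```python
class FakeCatalogue:
    """Hypothetical stand-in for a catalogue whose entries need a driver plugin."""

    def __init__(self, entries, plugins):
        self._entries = entries   # entry name -> required driver name
        self._plugins = plugins   # set of driver names that are available

    def __iter__(self):
        return iter(self._entries)

    def __getitem__(self, key):
        driver = self._entries[key]
        if driver not in self._plugins:
            # Same kind of error as in the traceback, raised on item access.
            raise ValueError(f"No plugins loaded for this entry: {driver}")
        return f"<source {key}>"


def load_sample_data(catalogue):
    try:
        # Guarded step: stands in for `from intake import open_catalog`.
        import json  # noqa: F401  (always importable, like intake when installed)
    except Exception:
        raise ImportError("Loading sample data requires extra packages.")

    namespace = {}
    # Unguarded step: mirrors the loop that populates globals() in the traceback.
    for name in catalogue:
        namespace[name] = catalogue[name]
    return namespace
```

Calling `load_sample_data(FakeCatalogue({"us_crime": "parquet"}, plugins=set()))` raises ValueError rather than ImportError, because the failing lookup sits outside the try block.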

Complete, minimal, self-contained example code that reproduces the issue

  • Have only the intake package installed, with no intake plugin packages (such as intake-parquet).
  • Execute: from hvplot.sample_data import us_crime, airline_flights
from hvplot.sample_data import us_crime, airline_flights

Stack traceback and/or browser JavaScript console output

(Identical to the traceback shown in the description above.)

Additional info

The list of required packages is now:

  • intake-parquet
  • intake-xarray
  • s3fs
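
A quick way to see which of these packages are missing before importing the sample data is to ask the import system directly. This is a rough sketch: the import names (`intake_parquet`, `intake_xarray`) are assumed from the distribution names, and `missing_packages` is a hypothetical helper, not part of hvplot.

```python
from importlib.util import find_spec

# Distribution name -> import name (import names are an assumption).
REQUIRED = {
    "intake": "intake",
    "intake-parquet": "intake_parquet",
    "intake-xarray": "intake_xarray",
    "s3fs": "s3fs",
}


def missing_packages(required=REQUIRED):
    """Return the distribution names whose top-level module cannot be found."""
    return [dist for dist, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
```

`find_spec` only locates the module without importing it, so the check itself should not trigger the ValueError reported above.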

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 26 (25 by maintainers)

Most upvoted comments

I uninstalled iprogress because it was explicitly installed when creating the environment. iprogress is no longer supported, and from what I can see it has been replaced by something similar in ipywidgets. I can only get the download to work with ipywidgets and not with iprogress. I don’t understand why yours works with iprogress and mine doesn’t.

I will file some bug reports later, so hopefully, new users (and me…) will be able to run the example without all these problems.

Thank you for helping me with finding the root of the problem, I really appreciate it!

I just think it is a lot to ask new users to install 4 packages just to get access to an 8 KB file (us_crime) and a 15 MB file (airline_flights). I just tried to see if I could run the Plotting page from a clean environment:

Created the environment with mamba create -n hvplot_example python=3.8 hvplot jupyterlab

First cell needed to install dask.

Second cell needed to install intake intake-parquet intake-xarray s3fs.

Third cell needed to install IProgress; afterwards it raised a FileNotFoundError. Then I tried changing the cell to:

import dask.dataframe as dd
flights = dd.read_parquet("s3://assets.holoviews.org/data/airline_flights.parq").persist()
print(type(flights))
flights.head()

But this gives a NoCredentialsError: Unable to locate credentials. Got this to work by changing s3 to http.
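
The scheme swap described here can be sketched as below. `s3_to_http` and `load_flights` are hypothetical helpers; the sketch assumes the bucket is also served over plain HTTP under the same host and path (as the comment reports), and that dask plus an HTTP-capable fsspec backend are installed.

```python
from urllib.parse import urlsplit, urlunsplit


def s3_to_http(url: str) -> str:
    """Swap an s3:// scheme for http://, leaving host and path untouched."""
    parts = urlsplit(url)
    if parts.scheme != "s3":
        return url  # nothing to rewrite
    return urlunsplit(("http",) + tuple(parts[1:]))


def load_flights(url="s3://assets.holoviews.org/data/airline_flights.parq"):
    """Read the flights data over HTTP instead of S3 (performs network access)."""
    import dask.dataframe as dd  # imported lazily so the helper above has no dependencies

    return dd.read_parquet(s3_to_http(url)).persist()
```

Going through HTTP sidesteps the S3 credential lookup entirely, which is why no credentials are needed in this variant.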

To run the bivariate plot I needed to install scipy.

For the section Large Data to run I needed datashader.

Other things I noticed when trying to get the notebook to work:

  1. The links in the hvplot namespace do not work because they should be lowercase.
  2. I don’t think you explicitly need to run .compute on a dask dataframe anymore to use hvplot (but I could be wrong).
  3. The import error for datashader references the name datashading instead of datashader.

I will probably make a PR for 1 and 3 today.

Saw the same thing in #562.

I still feel like it is a lot to install just to run the second page of a user guide. Why not just download the data with requests or urllib, as e.g. bokeh does?
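
To make the suggestion concrete, a download-and-cache helper built only on the standard library could look roughly like this. `fetch` is a hypothetical name, and this is a sketch of the idea rather than how bokeh or hvplot actually handle sample data.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch(url: str, dest) -> Path:
    """Download `url` to `dest` once; later calls reuse the local copy."""
    dest = Path(dest)
    if dest.exists():
        return dest  # already cached, skip the download
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk instead of holding it all in memory.
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

With something like this, the only runtime requirement for the small datasets would be a reader for the file format itself.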

Ah, I didn’t even see that deliberate IProgress install in the mamba command. I

  • redid the test without it, then
    • tqdm again fails (which is a bug, because if it needs something like that to run, it should add it to the conda-forge package dependency),
  • then installed ipywidgets only this time, and that seems to cover whatever tqdm needs to properly run.
    • Yes, had to remove ~/.intake again, due to 2nd bug with intake’s file management.

Yes, I do. But I think that by renaming only the cache folder you corrupt the intake data management system, because the persisted folder is still there, with partial records that need to access the cache folder whenever required. So when you rename only the cache folder, you create an impossible state for intake. If you had renamed the whole .intake folder, things would have worked; I just confirmed that.
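
Moving the whole folder aside, rather than a single subfolder, can be sketched as follows. `move_aside` is a hypothetical helper, and the `~/.intake` location is taken from this thread rather than from intake's documentation.

```python
import time
from pathlib import Path


def move_aside(folder):
    """Rename `folder` (with everything in it) to a timestamped backup name.

    Returns the backup path, or None if the folder does not exist.
    """
    folder = Path(folder).expanduser()
    if not folder.exists():
        return None
    backup = folder.with_name(f"{folder.name}.bak-{int(time.time())}")
    folder.rename(backup)  # cache and persisted data move together
    return backup


# Example (commented out because it touches the home directory):
# move_aside("~/.intake")
```

Because the cache and persisted subfolders travel together, the two can no longer get out of sync the way they do when only the cache folder is renamed.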

They seem to be independent packages:

❯ mamba info intake-xarray=0.5

intake-xarray 0.5.0 pyhd8ed1ab_0
--------------------------------
file name   : intake-xarray-0.5.0-pyhd8ed1ab_0.tar.bz2
name        : intake-xarray
version     : 0.5.0
build string: pyhd8ed1ab_0
build number: 0
channel     : https://conda.anaconda.org/conda-forge/noarch
size        : 1.4 MB
arch        : None
constrains  : ()
license     : BSD-2-Clause
license_family: BSD
md5         : 43d9d1c90da0b2b28cc16e58a52a0f2b
noarch      : python
package_type: noarch_python
platform    : None
sha256      : 91a388e5eb015b192bc17de04c55b102576d1c1b08571a80a1a9a1bc6c878f91
subdir      : noarch
timestamp   : 1616085245631
url         : https://conda.anaconda.org/conda-forge/noarch/intake-xarray-0.5.0-pyhd8ed1ab_0.tar.bz2
dependencies:
    dask >=2.2
    intake >=0.5.2
    netcdf4
    python >=3.5
    xarray >=0.12.0
    zarr
WARNING: 'conda info package_name' is deprecated.
          Use 'conda search package_name --info'.

site-packages/hvplot/examples via 🐍 v3.9.9 via 🅒 py39 took 5s 
❯ mamba search intake-parquet
Loading channels: done
# Name                       Version           Build  Channel             
intake-parquet                 0.2.1            py_0  conda-forge         
intake-parquet                 0.2.2            py_0  conda-forge         
intake-parquet                 0.2.3            py_0  conda-forge         

site-packages/hvplot/examples via 🐍 v3.9.9 via 🅒 py39 took 5s 
❯ mamba info intake-parquet=0.2.3

intake-parquet 0.2.3 py_0
-------------------------
file name   : intake-parquet-0.2.3-py_0.tar.bz2
name        : intake-parquet
version     : 0.2.3
build string: py_0
build number: 0
channel     : https://conda.anaconda.org/conda-forge/noarch
size        : 10 KB
arch        : None
constrains  : ()
license     : BSD-2-Clause
license_family: BSD
md5         : b7d04be2fb7b43946cf06dc5f7f04ad1
noarch      : python
package_type: noarch_python
platform    : None
sha256      : 2981d0998aa3e30713c6b2012a4557e77b70ed6e04778f9365c4fdeb593576ca
subdir      : noarch
timestamp   : 1573509119874
url         : https://conda.anaconda.org/conda-forge/noarch/intake-parquet-0.2.3-py_0.tar.bz2
dependencies:
    dask
    fastparquet
    intake >=0.3
    jinja2
    pandas
    pyarrow
    python >=3.5
WARNING: 'conda info package_name' is deprecated.
          Use 'conda search package_name --info'.

and s3fs is obviously unrelated. Will play with it and then submit a PR.