superset: jdbc+hive in sqlalchemy URI is not working
Make sure these boxes are checked before submitting your issue - thank you!
- I have checked the superset logs for python stacktraces and included it here as text if any
- I have reproduced the issue with at least the latest released version of superset
- I have checked the issue tracker for the same issue and I haven’t found one similar
Superset version
0.18.4
Expected results
Using jdbc+hive:// in the SQLAlchemy URI works.
Actual results
The Superset web server raises an exception:
sqlalchemy.exc.NoSuchModuleError: Can't load plugin: sqlalchemy.dialects:jdbc.hive
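The plugin name in this error comes from the URI scheme. The following is a minimal sketch of that mapping, assuming SQLAlchemy's usual lookup rule (the part before :// is taken as the driver name, and any + is replaced with . before the sqlalchemy.dialects entry point is searched). The helper name dialect_plugin_name is hypothetical and is not part of SQLAlchemy; it only mirrors the naming step, it does not load anything.

```python
def dialect_plugin_name(uri: str) -> str:
    """Return the entry-point name SQLAlchemy would look up for this URI.

    Hypothetical helper for illustration only: it mimics the naming rule
    (scheme with '+' replaced by '.') and does not import SQLAlchemy.
    """
    scheme = uri.split("://", 1)[0]
    return scheme.replace("+", ".")


if __name__ == "__main__":
    for uri in ("jdbc+hive://localhost:10000", "hive://localhost:10000"):
        # The name printed here is what appears after "sqlalchemy.dialects:"
        # in a NoSuchModuleError when no installed package registers it.
        print(uri, "->", "sqlalchemy.dialects:" + dialect_plugin_name(uri))
```

Under this rule, jdbc+hive:// only works if some installed package registers a dialect named jdbc.hive. As far as I know, PyHive registers plain hive (and presto), which would explain why hive:// loads while jdbc+hive:// does not.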
Steps to reproduce
- pip install -U pyhive
- Create a new database in Superset using the jdbc+hive:// prefix, and then press the test button.
More
I’ve read https://github.com/airbnb/superset/issues/241 and learned that this is a known issue. @shkr had posted a Databricks tutorial that guides newcomers through setting up this jdbc+hive connector, but the link in https://github.com/airbnb/superset/issues/241#issuecomment-234010902 is already dead, and I haven’t been able to find any related information at https://docs.databricks.com/user-guide/getting-started.html
That’s why I’m re-raising this issue, focusing on how to get jdbc+hive:// to work, and hopefully helping to make the docs more complete and friendly.
About this issue
- Original URL
- State: closed
- Created 7 years ago
- Comments: 17 (9 by maintainers)
SQLAlchemy URI: hive://localhost:10000
I solved this problem by following the reference below. Hope it helps 😃
https://pypi.python.org/pypi/PyHive
Requirements
Install using
pip install pyhive[hive] for the Hive interface and pip install pyhive[presto] for the Presto interface.
pyhive==0.5.0 may also raise some errors
pip install pythrifthiveapi
comment out getProgressUpdate in site-packages/pyhive/hive.py
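To check a hive:// URI outside Superset before pressing the test button, something like the sketch below may help. It assumes SQLAlchemy and pyhive[hive] are installed and that a HiveServer2 (or compatible Thrift server) is listening on the given host and port. The helpers hive_uri and check_connection are hypothetical names used only for this illustration, and SELECT 1 is assumed to be accepted by the server.

```python
from typing import Optional


def hive_uri(host: str, port: int = 10000, database: Optional[str] = None) -> str:
    """Build a SQLAlchemy URI for the PyHive 'hive' dialect (hypothetical helper)."""
    uri = f"hive://{host}:{port}"
    if database:
        uri += f"/{database}"
    return uri


def check_connection(uri: str):
    """Open a connection and run a trivial query.

    Imports are done lazily so that building the URI does not require
    SQLAlchemy; this function needs sqlalchemy, pyhive[hive] and a
    reachable server.
    """
    from sqlalchemy import create_engine, text

    engine = create_engine(uri)
    with engine.connect() as conn:
        return conn.execute(text("SELECT 1")).scalar()


if __name__ == "__main__":
    # Same URI as in the comment above; call check_connection(uri)
    # manually once a server is actually running.
    uri = hive_uri("localhost")
    print(uri)
```

If check_connection raises NoSuchModuleError, the dialect is not installed in the Python environment being used; a Thrift or transport error instead would point at the server side or at PyHive's dependencies.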
@JazzChen, I followed the solution you provided, but I still have the same issue: Can’t load plugin: sqlalchemy.dialects:hive.jdbc
I am using Superset version 0.20.5.
visualization works.
Here is my table column setup.
The type of a time column should be TIMESTAMP, with Is temporal selected.
impyla supports the TIMESTAMP type: https://github.com/cloudera/impyla/blob/master/impala/sqlalchemy.py#L99
I used impala://127.0.0.1:10000/default to connect to a Spark Thrift Server, and it works.
Yes, I have installed pyhive v0.3.0, and hive:// seems to be working correctly, but there are some other issues with pyhive and its dependencies that I can’t get around, so I’m trying Spark SQL on Superset. @xrmx