impyla: HS2Error when running as_pandas
I’m running a smallish query (the result is 8MB of data), and getting an HS2Error when I try to read the data. as_pandas is working on smaller queries. Any idea what could be going on here?
Here’s what I’m running:
import impala.dbapi
from impala.util import as_pandas
c = impala.dbapi.connect(port=21050).cursor() # works fine
c.execute("[my query]") # works fine
df = as_pandas(c) # oh no!
and the error:
---------------------------------------------------------------------------
HS2Error Traceback (most recent call last)
<ipython-input-5-bee92ca13acd> in <module>()
----> 1 df = as_pandas(c)
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/util.pyc in as_pandas(cursor)
21 def as_pandas(cursor):
22 names = [metadata[0] for metadata in cursor.description]
---> 23 return pd.DataFrame([dict(zip(names, row)) for row in cursor], columns=names)
24 except ImportError:
25 print "Failed to import pandas"
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/dbapi.pyc in next(self)
246 rows = impala.rpc.fetch_results(self.service,
247 self._last_operation_handle, self.description,
--> 248 self.buffersize)
249 self._buffer.extend(rows)
250 if len(self._buffer) == 0:
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/rpc.pyc in wrapper(*args, **kwargs)
116 if not transport.isOpen():
117 transport.open()
--> 118 return func(*args, **kwargs)
119 except socket.error as e:
120 pass
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/rpc.pyc in fetch_results(service, operation_handle, schema, max_rows, orientation)
235 maxRows=max_rows)
236 resp = service.FetchResults(req)
--> 237 err_if_rpc_not_ok(resp)
238
239 rows = []
/Users/jocelyn/anaconda/lib/python2.7/site-packages/impyla-0.9.0_dev-py2.7.egg/impala/error.pyc in err_if_rpc_not_ok(resp)
55 if (resp.status.statusCode != TStatusCode._NAMES_TO_VALUES['SUCCESS_STATUS'] and
56 resp.status.statusCode != TStatusCode._NAMES_TO_VALUES['SUCCESS_WITH_INFO_STATUS']):
---> 57 raise HS2Error(resp.status.errorMessage)
HS2Error: Invalid session id
About this issue
- State: closed
- Created 10 years ago
- Comments: 15 (10 by maintainers)
Sounds like a timeout. You may want to increase your Connection’s timeout.
After invalidating a table, it can take quite a lot of Hive metastore calls before the table becomes operational again.
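To check whether the failure comes from the fetch itself rather than from pandas, one option is to drain the cursor in explicit batches using the standard DB-API `fetchmany()` call and build the DataFrame afterwards. The sketch below is only an illustration under assumptions: `fetch_in_batches` is a hypothetical helper (not part of impyla), and an in-memory SQLite cursor stands in for the Impala cursor so the example runs without a cluster.

```python
import sqlite3


def fetch_in_batches(cursor, batch_size=1000):
    """Drain a DB-API cursor with fetchmany(); return (column names, rows).

    Hypothetical helper for illustration; it relies only on the generic
    DB-API attributes `description` and `fetchmany`.
    """
    names = [col[0] for col in cursor.description]
    rows = []
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            # An empty batch means the result set is exhausted.
            break
        rows.extend(batch)
    return names, rows


# Stand-in for the Impala cursor: an in-memory SQLite table.
# With impyla, the cursor would instead come from something like
# impala.dbapi.connect(port=21050, timeout=...).cursor(); the timeout
# value to use there is an assumption that depends on the setup.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER, name TEXT)")
cur.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(i, "row%d" % i) for i in range(2500)],
)
cur.execute("SELECT id, name FROM t ORDER BY id")

names, rows = fetch_in_batches(cur, batch_size=1000)
print(names, len(rows))
```

If the same HS2Error shows up partway through the batches, that would point at the session expiring on the server side rather than at as_pandas. Once the rows are in hand, `pd.DataFrame(rows, columns=names)` gives the same kind of result as as_pandas.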