aws-sdk-pandas: TypeError: an integer is required (got type str) in Pyarrow
I received the following error:
File "etl_fp_ledger_landing_tables.py", line 165, in extract_data
df = wr.db.read_sql_query(query, con=sql_engine)
File "/Users/johnchan/.local/share/virtualenvs/fp_ledger-kbdDDZPp/lib/python3.8/site-packages/awswrangler/db.py", line 438, in read_sql_query
return _records2df(
File "/Users/johnchan/.local/share/virtualenvs/fp_ledger-kbdDDZPp/lib/python3.8/site-packages/awswrangler/db.py", line 197, in _records2df
array: pa.Array = pa.array(obj=col_values, safe=safe) # Creating Arrow array
File "pyarrow/array.pxi", line 296, in pyarrow.lib.array
File "pyarrow/array.pxi", line 39, in pyarrow.lib._sequence_to_array
File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
TypeError: an integer is required (got type str)
The error says an integer is required. I narrowed the query down to find which column causes the problem, and it turns out to be a datetime column. In MySQL its data type is datetime, and in Glue its data type is timestamp.
Other tables load without any problem. I'm not sure why this happens.
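One way to narrow this kind of error down further is to check whether a column contains more than one Python type before the values reach pa.array. The sketch below is only an illustration: the helper name find_mixed_type_columns, the column names, and the sample rows (including the zero-date string) are hypothetical, and the thread does not confirm that this was the exact shape of the bad data.

```python
import datetime
from collections import defaultdict


def find_mixed_type_columns(column_names, rows):
    """Return {column: {type_name: example_value}} for every column that
    holds more than one Python type, ignoring None (SQL NULL)."""
    seen = defaultdict(dict)
    for row in rows:
        for name, value in zip(column_names, row):
            if value is None:
                continue
            # Keep the first example value seen for each type.
            seen[name].setdefault(type(value).__name__, value)
    return {name: types for name, types in seen.items() if len(types) > 1}


# Hypothetical rows, shaped like the tuples a DB-API cursor returns.
columns = ["id", "created_at"]
rows = [
    (1, datetime.datetime(2020, 5, 1, 12, 0, 0)),
    (2, "0000-00-00 00:00:00"),  # assumed bad value, returned as a string
    (3, None),
]

mixed = find_mixed_type_columns(columns, rows)
print(mixed)
```

Any column reported here mixes types (for example datetime and str), which is a likely candidate for a type error during Arrow conversion.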
About this issue
- State: closed
- Created 4 years ago
- Comments: 17 (8 by maintainers)
Yes, I realised it, @maxispeicher. I will fix the data at the source to avoid this error in the framework. Thanks a lot for your help, and I hope the above discussion helps others facing the same issue. Cheers ✌️
I've just run a test with the most recent versions of pyarrow (4.0.1) and pandas (1.2.3) for awswrangler==2.9.0, and there it works without any problems. So it seems there was a bug in one of these (most probably pyarrow) which has been fixed in the meantime.
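To compare a local environment against the versions mentioned above, a small check like the following can help. It is a sketch using only the standard library (Python 3.8+); the helper name installed_versions is hypothetical, and a package that is not installed is reported as None.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(installed_versions(["awswrangler", "pyarrow", "pandas"]))
```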