aws-sdk-pandas: TypeError: an integer is required (got type str) in PyArrow

I received the following error:

  File "etl_fp_ledger_landing_tables.py", line 165, in extract_data
    df = wr.db.read_sql_query(query, con=sql_engine)
  File "/Users/johnchan/.local/share/virtualenvs/fp_ledger-kbdDDZPp/lib/python3.8/site-packages/awswrangler/db.py", line 438, in read_sql_query
    return _records2df(
  File "/Users/johnchan/.local/share/virtualenvs/fp_ledger-kbdDDZPp/lib/python3.8/site-packages/awswrangler/db.py", line 197, in _records2df
    array: pa.Array = pa.array(obj=col_values, safe=safe)  # Creating Arrow array
  File "pyarrow/array.pxi", line 296, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 39, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 122, in pyarrow.lib.pyarrow_internal_check_status
TypeError: an integer is required (got type str)

The error says an integer is required. I narrowed the query down to find which column causes the problem, and it turns out to be the datetime column. In MySQL the data type is datetime, and in Glue the data type is timestamp.

Other tables load without any problem. I'm not sure why this happens.
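
For anyone narrowing down a similar failure, here is a minimal sketch of one way to spot a suspicious column before the rows reach pa.array. It assumes the rows are available as plain Python tuples and that the trouble is a column mixing value types (for example datetime objects alongside strings), which is only one possible cause and is not confirmed as the cause here. The helper name find_mixed_type_columns and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def find_mixed_type_columns(column_names, rows):
    """Return {column: sorted type names} for columns whose non-null values mix Python types."""
    seen = defaultdict(set)
    for row in rows:
        for name, value in zip(column_names, row):
            if value is not None:  # NULLs are fine in any column, so skip them
                seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items() if len(types) > 1}


# Hypothetical sample rows: the string stands in for a value the driver
# handed back as text instead of a datetime object.
sample_columns = ["id", "created_at"]
sample_rows = [
    (1, datetime(2020, 5, 1, 12, 0, 0)),
    (2, "0000-00-00 00:00:00"),
    (3, None),
]

mixed = find_mixed_type_columns(sample_columns, sample_rows)
print(mixed)  # {'created_at': ['datetime', 'str']}
```

Any column reported here is worth inspecting at the source, since a single inferred Arrow type cannot hold both kinds of value.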

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

Yes, I realised that, @maxispeicher. I will fix the data at the source to avoid this error in the framework. Thanks a lot for your help, and I hope the discussion above helps others facing the same issue. Cheers ✌️

I’ve just run a test with the most recent versions of pyarrow (4.0.1) and pandas (1.2.3) on awswrangler==2.9.0, and there it works without any problems. So it seems there was a bug in one of these (most probably pyarrow) that has been fixed in the meantime.
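
To compare an environment against the versions mentioned above, a small sketch like the following prints what is installed. It uses only the standard library (importlib.metadata, available from Python 3.8); the helper name installed_versions is hypothetical, and a package that is not installed is reported as None rather than raising.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version string, or None if absent."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


versions = installed_versions(["awswrangler", "pyarrow", "pandas"])
for name, ver in versions.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```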