amazon-redshift-python-driver: Redshift Driver Truncates Important Information in Error Messages
Driver version
2.0.907
Redshift version
Redshift 1.0.38698
Client Operating System
Docker python:3.10.2 image
Python version
3.10.2
Table schema
test_column: INTEGER
Problem description
When copying a file from S3 into Redshift via awswrangler, a data type mismatch correctly raises an error. However, the error message is truncated, which makes the problem hard to debug in non-trivial applications.
Python Driver trace logs
redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "
error: Spectrum Scan Error
code: 15007
context: File 'https://s3.region.amazonaws.com/bucket/bucket_directory/subdirectory/afile.snappy.parquet' has an incompatible Parquet schema for column 's3://bucket/bucket_director
query: 1234567
location: dory_util.cpp:1226
process: worker_thread [pid=12345]
Reproduction code
import pandas as pd
import awswrangler as wr

# Target column is an INTEGER, but the value below is a float, so the COPY fails.
df = pd.DataFrame([[1.23]], columns=["test_column"])

con = wr.redshift.connect("my_redshift_connection")  # connection details were omitted in the original snippet
wr.redshift.copy(
    df=df,
    path="s3://bucket/bucket_directory/subdirectory/",
    con=con,
    table="test",
    schema="public",  # schema is required by wr.redshift.copy; assumed here
)
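For illustration, here is a hedged sketch of how the truncated detail surfaces in application code. The try/except wrapper and connection handling are assumptions and not part of the original report; the exception type and the error-field keys ('M', 'D') come from the trace log above.

import awswrangler as wr
import redshift_connector

try:
    wr.redshift.copy(df=df, path="s3://bucket/bucket_directory/subdirectory/", con=con, table="test", schema="public")
except redshift_connector.error.ProgrammingError as exc:
    # The first exception argument appears to be the dict shown in the trace logs above.
    detail = exc.args[0]
    print(detail.get("M"))  # short message, e.g. "Spectrum Scan Error"
    print(detail.get("D"))  # detail string, which arrives cut off mid-path
    raise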
About this issue
- State: closed
- Created 2 years ago
- Comments: 18 (11 by maintainers)
Haven’t forgotten about this; I’m working to escalate the ticket I opened with the team for this issue. Ref: 6124
Hi @justbldwn – unfortunately I do not have any update from my end. As many engineers are on holiday this week, I will ping the team again next week for an update on this.
For any readers who may come across the same issue: I’ve noticed that while the Python traceback truncates this error message, you can still get the full error message by querying SVL_S3LOG. It might help as a workaround.
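As a rough sketch of that workaround (the connection parameters are placeholders and the query id 1234567 is taken from the trace log above; the message column follows the documented SVL_S3LOG layout):

import redshift_connector

con = redshift_connector.connect(
    host="example-cluster.region.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="...",
)
cursor = con.cursor()
# Look up the Spectrum scan rows for the failed query; each row's message
# column holds an untruncated piece of the error context.
cursor.execute("SELECT message FROM svl_s3log WHERE query = %s", (1234567,))
for (message,) in cursor.fetchall():
    print(message)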
The issue has been opened; I’ll keep this one open in the meantime and provide updates here once available 😃