amazon-redshift-python-driver: Redshift Driver Truncates Important Information in Error Messages

Driver version

2.0.907

Redshift version

Redshift 1.0.38698

Client Operating System

Docker python:3.10.2 image

Python version

3.10.2

Table schema

test_column: INTEGER

Problem description

When attempting to copy a file from S3 into Redshift via awswrangler, a data type mismatch correctly raises an error. However, the error message is truncated, which makes the underlying problem hard to debug in non-trivial applications.

Python Driver trace logs

redshift_connector.error.ProgrammingError: {'S': 'ERROR', 'C': 'XX000', 'M': 'Spectrum Scan Error', 'D': "
error: Spectrum Scan Error 
code: 15007 
context: File 'https://s3.region.amazonaws.com/bucket/bucket_directory/subdirectory/afile.snappy.parquet' has an incompatible Parquet schema for column 's3://bucket/bucket_director 
query: 1234567 
location: dory_util.cpp:1226 
process: worker_thread [pid=12345]

Reproduction code

import pandas as pd
import awswrangler as wr

df = pd.DataFrame([[1.23]], columns=["test_column"])  # target schema is an integer, this is a float
wr.redshift.copy(
  df=df,
  path="s3://bucket/bucket_directory/subdirectory/",
  table="test",
  # con= and schema= arguments omitted for brevity
)

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 18 (11 by maintainers)

Most upvoted comments

Haven’t forgotten about this; I’m working to escalate the ticket I opened with the team for this issue. Ref: 6124

Hi @justbldwn – unfortunately I do not have any update from my end. As many engineers are on holiday this week, I will ping the team again next week for an update on this.

For any readers who may come across the same issue: I’ve noticed that while the Python traceback truncates this error message, you can still get the full error message by querying SVL_S3LOG. Might help as a workaround.
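A minimal sketch of that lookup, assuming a working redshift_connector connection; the cluster endpoint, credentials, and query id below are placeholders, and the query/message/eventtime columns reflect my understanding of the SVL_S3LOG view:

import redshift_connector

# Hypothetical cluster endpoint and credentials; substitute your own.
con = redshift_connector.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="...",
)

# 1234567 is the query id reported in the truncated error message above.
cursor = con.cursor()
cursor.execute(
    "SELECT message FROM svl_s3log WHERE query = %s ORDER BY eventtime",
    (1234567,),
)
for (message,) in cursor.fetchall():
    print(message)  # full, untruncated Spectrum scan error text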

The issue has been opened, I’ll keep this open in the meantime and provide updates here once available 😃