delta-rs: Python write_deltalake() to Non-AWS S3 failing
Environment
Delta-rs version: 0.6.2
Binding: Python
Environment: Docker container
Python: 3.10.7
OS: Debian GNU/Linux 11 (bullseye)
S3: Non-AWS (Ceph-based)
Bug
What happened:
The Delta Lake write fails when writing a table to a Ceph-based S3 (non-AWS) endpoint. I am writing to a path that does not previously contain a Delta table or any other files.
I have also tried different write modes, but the write still fails with the same error.
My code:
import pandas as pd
import deltalake as dl

storage_options = {
    "AWS_ACCESS_KEY_ID": credentials.access_key,
    "AWS_SECRET_ACCESS_KEY": credentials.secret_key,
    "AWS_ENDPOINT_URL": "https://xxx.yyy.zzz.net",
}
df = pd.DataFrame({"x": [1, 2, 3]})
table_uri = "s3a://<bucket-name>/delta_test"
dl.writer.write_deltalake(table_uri, df, storage_options=storage_options)
Fails with the following error:

Any idea what might be the problem? I am able to read the delta tables with the same storage_options.
About this issue
- State: closed
- Created 2 years ago
- Comments: 15
The EntityTooSmall error is a bug in the S3 implementation, and it's triggered if any files in the table are over 5 MB. (It doesn't seem to happen in the local emulators we use for testing, but does happen in AWS S3.) I have a fix ready in https://github.com/apache/arrow-rs/pull/3234, which will hopefully be included in the next release. Thanks for reporting this!

Hello! I will give it a go and will let you know as soon as possible!
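The 5 MB figure above matters because it is the minimum part size for S3 multipart uploads. As a rough sketch (assuming only what the comment above states: files over 5 MB trigger the bug), a frame like this is large enough in raw bytes to cross that threshold, though the Parquet file actually written will be smaller after compression:

```python
import numpy as np
import pandas as pd

# ~8 MB of raw float64 data: 1,000,000 rows x 8 bytes. Per the
# maintainer's comment, files over 5 MB are what trigger the
# EntityTooSmall error on real S3 backends.
df = pd.DataFrame({"x": np.arange(1_000_000, dtype="float64")})

# In-memory payload size in MiB (the compressed Parquet output of
# write_deltalake will come out smaller than this).
size_mb = df.memory_usage(index=False).sum() / (1024 * 1024)
print(f"{size_mb:.1f} MiB")  # prints "7.6 MiB"
```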
I currently get SignatureDoesNotMatch when providing credentials.