duckdb: Error when copying data to Cloudflare R2
What happens?
Using the S3 API, when I try to copy a file to Cloudflare R2, I get this error:
duckdb.IOException: IO Error: Unexpected response during S3 multipart upload finalization
Uploading the same file works fine with the AWS Boto package.
To Reproduce
import streamlit as st
import duckdb

con = duckdb.connect()
con.execute(f'''
INSTALL httpfs;
LOAD httpfs;
PRAGMA enable_object_cache;
SET s3_region = 'auto';
SET s3_access_key_id = '{st.secrets["aws_access_key_id_secret"]}';
SET s3_secret_access_key = '{st.secrets["aws_secret_access_key_secret"]}';
SET s3_endpoint = '{st.secrets["endpoint_url_secret"]}';
SET s3_url_style = 'path';
COPY (SELECT * FROM 'Parquet/0.1/lineitem.parquet') TO 's3://delta/1/lineitem.parquet';
''')
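A side note on the repro: interpolating secrets directly into SQL with mixed quote styles is fragile (in SQL, double quotes denote identifiers, not string literals). A minimal sketch of building the httpfs configuration with consistent single quoting; the helper name and placeholder values here are illustrative, not part of the original report:

```python
# Sketch: assemble the DuckDB httpfs configuration SQL with consistent
# single-quoted string literals. Values below are placeholders.
def build_s3_config(access_key: str, secret_key: str, endpoint: str) -> str:
    def q(v: str) -> str:
        # Single-quote the value and escape embedded single quotes,
        # so a secret containing a quote does not break the statement.
        return "'" + v.replace("'", "''") + "'"
    return (
        "INSTALL httpfs; LOAD httpfs; "
        "SET s3_region = 'auto'; "
        f"SET s3_access_key_id = {q(access_key)}; "
        f"SET s3_secret_access_key = {q(secret_key)}; "
        f"SET s3_endpoint = {q(endpoint)}; "
        "SET s3_url_style = 'path';"
    )

# Placeholder credentials/endpoint, not real values.
sql = build_s3_config("ACCESS-KEY", "SECRET-KEY", "accountid.r2.cloudflarestorage.com")
```

The resulting string can then be passed to `con.execute(sql)` in one call, as in the repro above.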
OS:
Windows 10
DuckDB Version:
0.6.2.dev480
DuckDB Client:
python
Full Name:
mimoune djouallah
Affiliation:
Personal project
Have you tried this on the latest master branch?
- I agree
Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?
- I agree
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 3
- Comments: 29 (10 by maintainers)
@Arttii A 0.7.1 bugfix release is coming up with a fix for this.
@elithrar thanks for the input. It was a URL-encoding issue in DuckDB after all: we were incorrectly not encoding the multipart upload id in the query params.
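The encoding problem described above can be illustrated with a small stdlib sketch. Multipart upload ids can contain reserved characters such as `+`, `/`, or `=`, which must be percent-encoded before being placed in a query string; the id value below is made up for illustration:

```python
# Sketch: percent-encoding a multipart upload id in a query string.
# The upload id below is invented, not taken from the actual bug.
from urllib.parse import urlencode

upload_id = "2~abc+def/ghi=="  # hypothetical multipart upload id

# Naive interpolation leaves reserved characters unencoded, so the server
# may parse the query string differently than intended:
naive = f"?uploadId={upload_id}"

# Correct: percent-encode the value when building the query string.
encoded = urlencode({"uploadId": upload_id})
```

Here `urlencode` turns `+`, `/`, and `=` into `%2B`, `%2F`, and `%3D`, so the finalize request reaches the server with the id intact.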
@samansmink thanks, it works
@samansmink it still does not work with Cloudflare.
It works now on latest master with S3, also the column order is now correct. Thanks a lot @samansmink / @Mytherin!
Working on it, PR is almost ready!
We think we’ve identified the issue here, relating to assumptions around periods not appearing in URLs (aside from in the file extension). This assumption fails when periods appear in URLs for other reasons (e.g., presigned URLs).
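The failing assumption can be sketched with the stdlib: guessing a file's format from the text after the last period breaks on presigned URLs, whose query parameters can themselves contain periods. The URL below is invented for illustration:

```python
# Sketch: extracting a file extension from a URL. Taking everything after
# the last '.' in the raw URL fails for presigned URLs, because signature
# and credential query parameters can contain periods. The URL is made up.
from urllib.parse import urlparse

url = ("https://bucket.example.com/data/lineitem.parquet"
       "?X-Amz-Credential=AKIA.example&X-Amz-Signature=abc123")

# Naive: the last '.' in the whole URL lands inside the query string.
naive_ext = url.rsplit(".", 1)[-1]

# Robust: drop the query string first, then look at the path component.
path = urlparse(url).path        # "/data/lineitem.parquet"
ext = path.rsplit(".", 1)[-1]    # "parquet"
```

Parsing the path component before looking for the extension keeps the logic correct whether or not the URL carries a query string.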