duckdb: Error when copying data to Cloudflare R2

What happens?

When I try to copy a file to Cloudflare R2 using its S3-compatible API, I get this error:

duckdb.IOException: IO Error: Unexpected response during S3 multipart upload finalization

The same file uploads fine using the AWS boto3 package.

To Reproduce

import streamlit as st
import duckdb

con = duckdb.connect()

con.execute(f'''
    INSTALL httpfs;
    LOAD httpfs;
    PRAGMA enable_object_cache;
    SET s3_region = 'auto';
    SET s3_access_key_id = '{st.secrets["aws_access_key_id_secret"]}';
    SET s3_secret_access_key = '{st.secrets["aws_secret_access_key_secret"]}';
    SET s3_endpoint = '{st.secrets["endpoint_url_secret"]}';
    SET s3_url_style = 'path';
    COPY (SELECT * FROM 'Parquet/0.1/lineitem.parquet') TO 's3://delta/1/lineitem.parquet';
''')

OS:

Windows 10

DuckDB Version:

0.6.2.dev480

DuckDB Client:

Python

Full Name:

mimoune djouallah

Affiliation:

Personal project

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 29 (10 by maintainers)

Most upvoted comments

@Arttii A 0.7.1 bugfix release is coming up with a fix for this

@elithrar thanks for the input, it was a URL-encoding issue in DuckDB after all; we were incorrectly not encoding the multipart upload ID in the query params
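For context, the fix boils down to percent-encoding the multipart upload ID before placing it in the query string. A minimal sketch of the idea in Python (the real fix lives in DuckDB's C++ httpfs extension; the function and example values below are hypothetical):

```python
from urllib.parse import quote


def finalize_url(bucket_url: str, key: str, upload_id: str) -> str:
    # R2 can hand back upload IDs containing characters such as '+' or '=';
    # without percent-encoding, the CompleteMultipartUpload request URL is
    # malformed and the server responds unexpectedly.
    return f"{bucket_url}/{key}?uploadId={quote(upload_id, safe='')}"


print(finalize_url("https://delta.example.com", "1/lineitem.parquet", "abc+def=="))
# → https://delta.example.com/1/lineitem.parquet?uploadId=abc%2Bdef%3D%3D
```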

@samansmink thanks, it works

@samansmink it still does not work with Cloudflare.

It works now on latest master with S3, also the column order is now correct. Thanks a lot @samansmink / @Mytherin!

Working on it, PR is almost ready!

We think we've identified the issue here, relating to an assumption that periods do not appear in URLs other than before the file extension. That assumption fails when periods appear in URLs for other reasons (e.g., presigned URLs).