duckdb: Failed to download extension "httpfs" at …/linux_arm64_gcc4/…

What happens?

python-duckdb fails to download extensions

HTTP Error: Failed to download extension "httpfs" at URL "http://extensions.duckdb.org/v0.8.1/linux_arm64_gcc4/httpfs.duckdb_extension.gz"

To Reproduce

Inside a docker container (I’m using python:3.11-slim) for arm64.

  • docker run -it --rm python:3.11-slim /bin/bash
  • pip install duckdb
  • python
  • import duckdb
  • duckdb.query("install 'httpfs';")

It seems like it is looking for extensions in linux_arm64_gcc4 and those don’t exist. I tried grabbing the ones at linux_arm64 and putting them in the ~/.duckdb/extensions/v0.8.1/linux_arm64_gcc4/ directory, but they fail to load.
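For reference, the URL and the local directory mentioned above follow a version/platform/name pattern. Here is a minimal sketch that rebuilds both from their parts, assuming the layout seen in the error message and in the path above; the helper names `extension_url` and `local_extension_dir` are made up for illustration and are not part of DuckDB:

```python
from pathlib import Path


def extension_url(version: str, platform: str, name: str) -> str:
    """Build the download URL using the pattern seen in the error message."""
    return (
        f"http://extensions.duckdb.org/{version}/{platform}/"
        f"{name}.duckdb_extension.gz"
    )


def local_extension_dir(version: str, platform: str, home=None) -> Path:
    """Build the per-version, per-platform directory under ~/.duckdb/extensions."""
    base = Path.home() if home is None else Path(home)
    return base / ".duckdb" / "extensions" / version / platform


if __name__ == "__main__":
    # Values taken from the report: DuckDB 0.8.1 on linux_arm64_gcc4.
    print(extension_url("v0.8.1", "linux_arm64_gcc4", "httpfs"))
    print(local_extension_dir("v0.8.1", "linux_arm64_gcc4"))
```

Swapping the platform string for linux_arm64 gives the location of the binaries that do exist, which makes the mismatch easy to see.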

OS:

docker on macOS on aarch64

DuckDB Version:

0.8.1

DuckDB Client:

python

Full Name:

michael conrad tadpol tilstra

Affiliation:

Exosite

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree

About this issue

  • State: closed
  • Created a year ago
  • Reactions: 17
  • Comments: 71 (15 by maintainers)

Most upvoted comments

Following https://github.com/duckdb/duckdb/issues/8035#issuecomment-1833331284, the sqlite_scanner extension works by building the extension and installing duckdb from that build:

RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
    git clone https://github.com/duckdb/sqlite_scanner.git && \
    cd sqlite_scanner && \
    git clone --branch v0.9.2 https://github.com/duckdb/duckdb && \
    make && \
    pip uninstall -y duckdb && \
    cd duckdb/tools/pythonpkg && python -m pip install . ; \
fi

In order to use the extension with Python I had to create the connection with config={"allow_unsigned_extensions": "true"} and load the extension created under the above directory (sqlite_scanner/build/release/extension/sqlite_scanner/sqlite_scanner.duckdb_extension).
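In Python, that can look roughly like the sketch below. It assumes the extension was built at the path given above; `load_statement` and `connect_and_load` are hypothetical helper names, and duckdb is imported lazily so the string helper can be tried without it installed:

```python
def load_statement(extension_path: str) -> str:
    """Return a LOAD statement for a locally built extension file."""
    # Double any single quotes so the path stays a valid SQL string literal.
    escaped = extension_path.replace("'", "''")
    return f"LOAD '{escaped}';"


def connect_and_load(extension_path: str):
    """Open an in-memory connection that accepts unsigned extensions, then load one."""
    import duckdb  # imported here so load_statement() works without duckdb

    # Locally built extensions are unsigned, hence the config flag from the comment.
    con = duckdb.connect(":memory:", config={"allow_unsigned_extensions": "true"})
    con.execute(load_statement(extension_path))
    return con


if __name__ == "__main__":
    # Path from the comment above; adjust to wherever the build actually ran.
    ext = (
        "sqlite_scanner/build/release/extension/"
        "sqlite_scanner/sqlite_scanner.duckdb_extension"
    )
    print(load_statement(ext))
```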

@faridgt you might need something similar for postgres_scanner.

Looking into this further (now that I’m back from holidays), it looks like there was some user error on my part. The results I pasted in my earlier comment were from my local dev build, in which I had enabled the httpfs extension. It isn’t actually built into any of the Python versions that I can see.

What this actually means is that we need to update the DuckDB Python extensions job to build for linux_arm64 as well as amd64.

Hey! Any news on this issue? This should affect many people using containers on their Apple silicon Macs. I might try to help if given some pointers.

Thanks!

Given the solution leaves us (those who develop/deploy on arm under docker) without full support…


FROM python:3.11-slim AS compile-image
ARG TARGETPLATFORM
ARG BUILDPLATFORM
RUN apt-get update
RUN apt-get install -y --no-install-recommends \
    git \
    g++ \
    cmake \
    libssl-dev 

RUN python -m venv /opt/venv

# Make sure we use the virtualenv:
ENV PATH="/opt/venv/bin:$PATH"

# duckdb
# We need to compile duckdb ourselves because duckdb doesn't provide
# binary extensions for 'httpfs' on the platform linux_arm64_gcc4.
# This means duckdb cannot query remote files on
# either Mac M1 or Linux ARM when running under docker.
# Note: without docker, the duckdb extension autoload mechanism works.
# More info at:
# https://github.com/duckdb/duckdb/issues/8035

RUN echo "I am running on $BUILDPLATFORM, building for $TARGETPLATFORM"

RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
    git clone --depth 1 --branch v0.9.2 https://github.com/duckdb/duckdb && \
    cd duckdb/tools/pythonpkg && BUILD_HTTPFS=1 python -m pip install . ; \
    elif [ "$TARGETPLATFORM" = "linux/amd64" ]; then \
    pip install duckdb==0.9.2; \
fi


FROM python:3.11-slim AS build-image
COPY --from=compile-image /opt/venv /opt/venv


# Make sure we use the virtualenv:
ENV PATH="/opt/venv/bin:$PATH"

CMD ["python", "-c", "import duckdb;print(duckdb.sql('select count(*) from read_csv_auto(\"https://raw.githubusercontent.com/duckdb/duckdb/main/data/csv/aws_locations.csv\")'))"]

hope this helps.

Hello everyone,

This thread has been quite frequently commented on in the last few months.

To clarify the status of the linux_arm64_gcc4 extensions, I added the following line to the working with extensions page:

We currently do not distribute binaries for extensions on the linux_arm64_gcc4 platform.

The rationale behind this is that distributing these packages would require a complex CI setup that’s difficult to maintain and likely incurs a large cost to run. Extensions can still be built manually for this platform.

Gabor

For those that want a ready to use image, I just pushed phidata/duckdb:0.9.2 built from source.

Run it using:

 docker run -it --pull always phidata/duckdb:0.9.2

It is very sad that linux_arm64_gcc4 is not supported; it makes it impossible to use duckdb with Rust on AWS Lambda… If someone has an idea…

While I’m not using AWS Lambda, with version 0.10.0 I was able to get duckdb running on SageMaker with Graviton instances, so I think the arm64 builds are working now.

I’m subscribed to this ticket and got confused by all this hubbub from people facing problems, so I did the Docker-based reproduction steps at the top of the ticket (and expanded a bit on them) and I see it working just fine:

% docker pull python:3.11-slim                  
3.11-slim: Pulling from library/python
f546e941f15b: Pull complete 
24935aba99a7: Pull complete 
07b3e0dc751a: Pull complete 
7e0115596a7a: Pull complete 
a66610a3b2a1: Pull complete 
Digest: sha256:ce81dc539f0aedc9114cae640f8352fad83d37461c24a3615b01f081d0c0583a
Status: Downloaded newer image for python:3.11-slim
docker.io/library/python:3.11-slim

% docker run -it --rm python:3.11-slim /bin/bash
root@85e22fab99d7:/# pip install duckdb
Collecting duckdb
  Downloading duckdb-0.10.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl.metadata (763 bytes)
Downloading duckdb-0.10.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl (16.4 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 16.4/16.4 MB 42.8 MB/s eta 0:00:00
Installing collected packages: duckdb
Successfully installed duckdb-0.10.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

root@85e22fab99d7:/# python
Python 3.11.8 (main, Feb 13 2024, 09:14:01) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import duckdb

>>> duckdb.query("install 'httpfs';")

>>> duckdb.query("load 'httpfs';")

>>> duckdb.query("select count(*) from 'https://gist.githubusercontent.com/sacundim/73bd069669edaca11e21b9f25aaa5309/raw/ba13a5065c559709476ed05eed66fd19f3b3842e/titanic_casualties.csv';")
┌──────────────┐
│ count_star() │
│    int64     │
├──────────────┤
│           11 │
└──────────────┘

The python:3.11-slim image is currently based on Debian 12 (Bookworm); you may get different results on other operating systems, but that’s down to those OSes.

Also I don’t know anything about Rust, but this ticket was very clearly filed to cover the DuckDB Python extension, which is a completely different matter than the Rust API.

With duckdb v0.10.0, I’m facing the invalid ELF header issue on linux_amd64_gcc4

RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
    git clone https://github.com/duckdb/sqlite_scanner.git && \
    cd sqlite_scanner && \
    git clone --branch v0.9.2 https://github.com/duckdb/duckdb && \
    make && \
    pip uninstall -y duckdb && \
    cd duckdb/tools/pythonpkg && python -m pip install . ; \
fi

I am following those instructions but have one small issue with the postgres extension: /postgres_scanner/src/include/postgres_utils.hpp:12:10: fatal error: libpq-fe.h: No such file or directory. I installed libpq (apt-get install libpq-dev), but that did not resolve it.

Update: it works after I edit the reference to libpq-fe.h in two files:

  • postgres_scanner/src/include/postgres_utils.hpp: change the include to #include "/usr/include/postgresql/libpq-fe.h"
  • postgres_scanner/src/postgres_scanner.cpp: change the include to #include "/usr/include/postgresql/libpq-fe.h"

@acirtep one note: since we run python -m pip install ., there is no need to load the extension manually. A normal install and load will work, as the extension is already deployed as part of the Python package.

million thanks @acirtep

Thanks for the update, @Mause. I’ve subscribed to the following GitHub issue to make sure I get updates about linux_arm64 runners: https://github.com/actions/runner-images/issues/5631 (it’s locked, so I’m hoping it’s still relevant).

Would you consider trying one of these options in the meantime?

I can confirm that I’m running into these issues as well. Having a sense of the timelines or priorities here would help me plan things on my end, in particular whether it would be more prudent to wait or to continue iterating on the self-build process.

You can find the extensions in the build directory, e.g.:

$ find build -name "*.duckdb_extension"
build/release/extension/autocomplete/autocomplete.duckdb_extension
build/release/extension/sqlsmith/sqlsmith.duckdb_extension
build/release/extension/inet/inet.duckdb_extension
build/release/extension/json/json.duckdb_extension
build/release/extension/tpcds/tpcds.duckdb_extension
build/release/extension/tpch/tpch.duckdb_extension
build/release/extension/parquet/parquet.duckdb_extension
build/release/extension/icu/icu.duckdb_extension
build/release/test/extension/loadable_extension_demo.duckdb_extension
build/release/test/extension/loadable_extension_optimizer_demo.duckdb_extension

DuckDB can install from a local path, so you can run the SQL query to install an extension:

INSTALL 'build/release/extension/icu/icu.duckdb_extension'
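The find-then-install steps above can also be scripted. This is a minimal sketch using only the standard library, assuming the build output lives under a directory named build as in the listing; `install_statements` is a hypothetical helper that only generates the SQL text and does not talk to DuckDB:

```python
from pathlib import Path


def install_statements(build_dir) -> list:
    """Return one INSTALL statement per *.duckdb_extension file under build_dir."""
    root = Path(build_dir)
    if not root.is_dir():
        return []
    statements = []
    for path in sorted(root.rglob("*.duckdb_extension")):
        # Double single quotes so the path stays a valid SQL string literal.
        escaped = path.as_posix().replace("'", "''")
        statements.append(f"INSTALL '{escaped}';")
    return statements


if __name__ == "__main__":
    # Prints nothing if there is no build directory in the current folder.
    for statement in install_statements("build"):
        print(statement)
```

Each printed statement can then be run through the DuckDB client of your choice.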

@stanoswald I thought about your comment for a while longer and realized we could probably update the manylinux standard version we compile for, which would also update the version of gcc we use. While this will fix this issue, it will also mean dropping support for some very old Linux versions, so we’re going to test it out first before we commit.

In summary, thanks for the comment!

We’re running into the same issue at Hugging Face:

duckdb.HTTPException: HTTP Error: Failed to download extension "httpfs" at URL "http://extensions.duckdb.org/v0.8.1/linux_arm64_gcc4/httpfs.duckdb_extension.gz"
Extension "httpfs" is an existing extension.

Are you using a development build? In this case, extensions might not (yet) be uploaded.

Happening in a linux docker container on mac m2
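One way to cope with this error in application code is to fall back to a locally built extension when the download fails. The sketch below is only an illustration under assumptions: `install_with_fallback` is a hypothetical helper, `con` is any object with an `execute(sql)` method (such as a DuckDB connection), and it catches a broad `Exception` where real code would probably narrow that to the `duckdb.HTTPException` shown above:

```python
def install_with_fallback(con, name: str, local_path: str) -> str:
    """Try INSTALL from the default repository, then from a local file.

    Returns "repository" or "local" to say which source was used.
    """
    try:
        con.execute(f"INSTALL {name};")
        return "repository"
    except Exception:
        # Assumption: any failure here is treated like the HTTP download
        # error in the traceback above; narrow this in real code.
        escaped = local_path.replace("'", "''")
        con.execute(f"INSTALL '{escaped}';")
        return "local"
```

A locally built extension is unsigned, so the connection would presumably also need the allow_unsigned_extensions setting mentioned earlier in the thread before it can be loaded.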