dbt-core: [CT-106] [Bug] Docker image build process is broken

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

On building I get:

ERROR: Could not find a version that satisfies the requirement dbt-postgres (unavailable) (from versions: 0.13.0a1, 0.13.0a2, 0.13.0rc1, 0.13.0, 0.13.1a1, 0.13.1a2, 0.13.1, 0.14.0a1, 0.14.0a2, 0.14.0rc1, 0.14.0, 0.14.1a1, 0.14.1rc1, 0.14.1rc2, 0.14.1, 0.14.2, 0.14.3rc1, 0.14.3, 0.14.4, 0.15.0b1, 0.15.0b2, 0.15.0b3, 0.15.0rc1, 0.15.0rc2, 0.15.0, 0.15.1rc1, 0.15.1rc2, 0.15.1, 0.15.2, 0.15.3rc1, 0.15.3, 0.16.0b1, 0.16.0b2, 0.16.0b3, 0.16.0rc1, 0.16.0rc2, 0.16.0rc3, 0.16.0rc4, 0.16.0, 0.16.1rc1, 0.16.1, 0.17.0b1, 0.17.0b2, 0.17.0rc1, 0.17.0rc2, 0.17.0rc3, 0.17.0rc4, 0.17.0, 0.17.1rc1, 0.17.1rc2, 0.17.1rc3, 0.17.1rc4, 0.17.1, 0.17.2b1, 0.17.2rc1, 0.17.2, 0.18.0b1, 0.18.0b2, 0.18.0rc1, 0.18.0rc2, 0.18.0, 0.18.1b1, 0.18.1b2, 0.18.1b3, 0.18.1rc1, 0.18.1, 0.18.2rc1, 0.18.2, 0.19.0b1, 0.19.0rc1, 0.19.0rc2, 0.19.0rc3, 0.19.0, 0.19.1b2, 0.19.1rc1, 0.19.1rc2, 0.19.1, 0.19.2rc1, 0.19.2rc2, 0.19.2, 0.20.0b1, 0.20.0rc1, 0.20.0rc2, 0.20.0, 0.20.1rc1, 0.20.1, 0.20.2rc1, 0.20.2rc2, 0.20.2, 0.21.0b1, 0.21.0b2, 0.21.0rc1, 0.21.0rc2, 0.21.0, 0.21.1rc1, 0.21.1rc2, 0.21.1, 1.0.0b1, 1.0.0b2, 1.0.0rc1, 1.0.0rc2, 1.0.0rc3, 1.0.0, 1.0.1rc1, 1.0.1)
ERROR: No matching distribution found for dbt-postgres (unavailable)

Expected Behavior

I’d expect the build to finish and produce a Docker image.

Steps To Reproduce

  1. Clone fresh git repo
  2. docker build -t tmp -f docker/Dockerfile

Relevant log output

[3/9] STEP 2/2: RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"
Collecting dbt-postgres
  Cloning https://github.com/dbt-labs/ to /tmp/pip-install-epfngjko/dbt-postgres_8d7eb22fbdbe4b06bf3a20348f72b0c7
  Running command git clone --filter=blob:none -q https://github.com/dbt-labs/ /tmp/pip-install-epfngjko/dbt-postgres_8d7eb22fbdbe4b06bf3a20348f72b0c7
  remote: Not Found
  fatal: repository 'https://github.com/dbt-labs/' not found
WARNING: Discarding git+https://github.com/dbt-labs/#egg=dbt-postgres&subdirectory=plugins/postgres. Command errored out with exit status 128: git clone --filter=blob:none -q https://github.com/dbt-labs/ /tmp/pip-install-epfngjko/dbt-postgres_8d7eb22fbdbe4b06bf3a20348f72b0c7 Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement dbt-postgres (unavailable) (from versions: 0.13.0a1, 0.13.0a2, 0.13.0rc1, 0.13.0, 0.13.1a1, 0.13.1a2, 0.13.1, 0.14.0a1, 0.14.0a2, 0.14.0rc1, 0.14.0, 0.14.1a1, 0.14.1rc1, 0.14.1rc2, 0.14.1, 0.14.2, 0.14.3rc1, 0.14.3, 0.14.4, 0.15.0b1, 0.15.0b2, 0.15.0b3, 0.15.0rc1, 0.15.0rc2, 0.15.0, 0.15.1rc1, 0.15.1rc2, 0.15.1, 0.15.2, 0.15.3rc1, 0.15.3, 0.16.0b1, 0.16.0b2, 0.16.0b3, 0.16.0rc1, 0.16.0rc2, 0.16.0rc3, 0.16.0rc4, 0.16.0, 0.16.1rc1, 0.16.1, 0.17.0b1, 0.17.0b2, 0.17.0rc1, 0.17.0rc2, 0.17.0rc3, 0.17.0rc4, 0.17.0, 0.17.1rc1, 0.17.1rc2, 0.17.1rc3, 0.17.1rc4, 0.17.1, 0.17.2b1, 0.17.2rc1, 0.17.2, 0.18.0b1, 0.18.0b2, 0.18.0rc1, 0.18.0rc2, 0.18.0, 0.18.1b1, 0.18.1b2, 0.18.1b3, 0.18.1rc1, 0.18.1, 0.18.2rc1, 0.18.2, 0.19.0b1, 0.19.0rc1, 0.19.0rc2, 0.19.0rc3, 0.19.0, 0.19.1b2, 0.19.1rc1, 0.19.1rc2, 0.19.1, 0.19.2rc1, 0.19.2rc2, 0.19.2, 0.20.0b1, 0.20.0rc1, 0.20.0rc2, 0.20.0, 0.20.1rc1, 0.20.1, 0.20.2rc1, 0.20.2rc2, 0.20.2, 0.21.0b1, 0.21.0b2, 0.21.0rc1, 0.21.0rc2, 0.21.0, 0.21.1rc1, 0.21.1rc2, 0.21.1, 1.0.0b1, 1.0.0b2, 1.0.0rc1, 1.0.0rc2, 1.0.0rc3, 1.0.0, 1.0.1rc1, 1.0.1)
ERROR: No matching distribution found for dbt-postgres (unavailable)
error building at STEP "RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"": error while running runtime: exit status 1
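
The clone target in the log, https://github.com/dbt-labs/, is what the pip URL from the RUN step collapses to when ${dbt_postgres_ref} expands to an empty string. The following is a minimal shell sketch of that expansion. It assumes the build argument was unset inside the stage, which the log suggests but does not state outright; the non-empty value is the default from the Dockerfile posted in the comments.

```shell
#!/bin/sh
# Assumption: inside the build stage the ARG is unset, so it expands to "".
dbt_postgres_ref=""
url="git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"
# The repository part of the URL is now just the organisation, as in the log.
echo "$url"

# With a ref set, the URL points at a real repository and tag.
dbt_postgres_ref="dbt-core@v1.2.0a1"
url_with_ref="git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"
echo "$url_with_ref"
```

The first URL matches the one pip reports as discarded, which would explain why pip then falls back to the package index and finds no matching distribution.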


Environment

  • OS: linux
  • Python: -
  • dbt: main

What database are you using dbt with?

No response

Additional Context

No response

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

I ran into the same issue. I guess @iknox-fa has BuildKit enabled, which allows extra Dockerfile features such as reusing an ARG across multiple stages and skipping stages that are not required.

I have a fix for the Dockerfile, but I don’t want to sign the CLA, so I am posting it here.

  1. Move the “base” image FROM statement after ARG dbt_third_party.
  2. Insert ARG dbt_<component>_ref before each RUN statement referencing that argument.
  3. Insert all seven ARG dbt_<component>_ref in “dbt-all” target.
  4. Move dbt-third-party target to the end of Dockerfile, so it won’t be called unless specified in the command line.

Note: an ARG declared without a value after a second FROM gets the default value; see reference 1.
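
The ARG scoping described above can be tried in isolation. The script below is a minimal sketch that writes a two-stage Dockerfile following the same pattern; the file name Dockerfile.argdemo, the stage name demo and the argument demo_ref are hypothetical and not part of the dbt repository.

```shell
#!/bin/sh
# Write a minimal two-stage Dockerfile that follows the steps above.
# File name, stage names and the "demo_ref" argument are hypothetical.
cat > Dockerfile.argdemo <<'EOF'
# Declared before the first FROM: a global default, not yet visible in any stage.
ARG demo_ref=dbt-core@v1.2.0a1

FROM python:3.10.3-slim-bullseye as base

FROM base as demo
# Re-declare without a value to bring the global default into this stage.
ARG demo_ref
RUN echo "ref is: ${demo_ref}"
EOF

# To try it (assumes docker or podman is installed):
#   docker build -f Dockerfile.argdemo --target demo .
```

Removing the ARG demo_ref line inside the demo stage should reproduce the empty expansion on builders that follow the documented scoping rule.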

The test.sh also needs a fix to work with the latest Dockerfile:

  1. Either bump the version to the latest, 1.3.0b1,
  2. or remove those version-specific tests.

With the fixes above, this Dockerfile works with both docker and podman, and BuildKit (environment variable DOCKER_BUILDKIT=1) is no longer needed.

References:

  1. Understand how ARG and FROM interact, see “An ARG declared before a FROM is outside of a build stage, so it can’t be used in any instruction after a FROM. To use the default value of an ARG declared before the first FROM use an ARG instruction without a value inside of a build stage”
  2. Containerfile(5) man page, see “Note that a second FROM in a Containerfile sets the values associated with an Arg variable to nil and they must be reset if they are to be used later in the Containerfile”
  3. What is Podman? Podman is a daemonless, open source, Linux native tool designed to make it easy to find, run, build, share and deploy applications using Open Containers Initiative (OCI) Containers and Container Images.

I think I found the error: it is in the Dockerfile configuration. Because each FROM statement starts a new build stage, Docker does not keep ARG variables declared before it. So what I’ve done is declare every variable again before every RUN statement that uses it. The Dockerfile is below:

##
#  Generic dockerfile for dbt image building.
#  See README for operational details
##

# Top level build args
ARG build_for=linux/amd64

##
# base image (abstract)
##
FROM --platform=$build_for python:3.10.3-slim-bullseye as base

# N.B. The refs updated automagically every release via bumpversion
# N.B. dbt-postgres is currently found in the core codebase so a value of dbt-core@<some_version> is correct

ARG dbt_core_ref=dbt-core@v1.2.0a1
ARG dbt_postgres_ref=dbt-core@v1.2.0a1
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
ARG dbt_spark_ref=dbt-spark@v1.0.0
# special case args
ARG dbt_spark_version=all
ARG dbt_third_party

# System setup
RUN apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y --no-install-recommends \
    git \
    ssh-client \
    software-properties-common \
    make \
    build-essential \
    ca-certificates \
    libpq-dev \
  && apt-get clean \
  && rm -rf \
    /var/lib/apt/lists/* \
    /tmp/* \
    /var/tmp/*

# Env vars
ENV PYTHONIOENCODING=utf-8
ENV LANG=C.UTF-8

# Update python
RUN python -m pip install --upgrade pip setuptools wheel --no-cache-dir

# Set docker basics
WORKDIR /usr/app/dbt/
VOLUME /usr/app
ENTRYPOINT ["dbt"]

##
# dbt-core
##
FROM base as dbt-core
ARG dbt_core_ref=dbt-core@v1.2.0a1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_core_ref}#egg=dbt-core&subdirectory=core"

##
# dbt-postgres
##
FROM base as dbt-postgres
ARG dbt_postgres_ref=dbt-core@v1.2.0a1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"


##
# dbt-redshift
##
FROM base as dbt-redshift
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_redshift_ref}#egg=dbt-redshift"


##
# dbt-bigquery
##
FROM base as dbt-bigquery
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_bigquery_ref}#egg=dbt-bigquery"


##
# dbt-snowflake
##
FROM base as dbt-snowflake
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_snowflake_ref}#egg=dbt-snowflake"

##
# dbt-spark
##
FROM base as dbt-spark
RUN apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y --no-install-recommends \
    python-dev \
    libsasl2-dev \
    gcc \
    unixodbc-dev \
  && apt-get clean \
  && rm -rf \
    /var/lib/apt/lists/* \
    /tmp/* \
    /var/tmp/*

ARG dbt_spark_ref=dbt-spark@v1.0.0
ARG dbt_spark_version=all
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"


##
# dbt-third-party
##
FROM dbt-core as dbt-third-party
ARG dbt_third_party
RUN python -m pip install --no-cache-dir "${dbt_third_party}"

##
# dbt-all
##
FROM base as dbt-all
RUN apt-get update \
  && apt-get dist-upgrade -y \
  && apt-get install -y --no-install-recommends \
    python-dev \
    libsasl2-dev \
    gcc \
    unixodbc-dev \
  && apt-get clean \
  && rm -rf \
    /var/lib/apt/lists/* \
    /tmp/* \
    /var/tmp/*
ARG dbt_redshift_ref=dbt-redshift@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_redshift_ref}#egg=dbt-redshift"
ARG dbt_bigquery_ref=dbt-bigquery@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_bigquery_ref}#egg=dbt-bigquery"
ARG dbt_snowflake_ref=dbt-snowflake@v1.0.0
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_snowflake_ref}#egg=dbt-snowflake"
ARG dbt_spark_ref=dbt-spark@v1.0.0
ARG dbt_spark_version=all
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_spark_ref}#egg=dbt-spark[${dbt_spark_version}]"
ARG dbt_postgres_ref=dbt-core@v1.2.0a1
RUN python -m pip install --no-cache-dir "git+https://github.com/dbt-labs/${dbt_postgres_ref}#egg=dbt-postgres&subdirectory=plugins/postgres"