airflow: apache-airflow-providers-common-sql==1.3.0 breaks BigQuery operators

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

Airflow version: 2.3.4 (Cloud Composer 2.0.32)

Issue: apache-airflow-providers-common-sql==1.3.0 breaks all BigQuery operators provided by the apache-airflow-providers-google==8.4.0 package. The error is as follows:

Broken DAG: [/home/airflow/gcs/dags/test-dag.py] Traceback (most recent call last):
  File "/home/airflow/gcs/dags/test-dag.py", line 6, in <module>
    from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator
  File "/opt/python3.8/lib/python3.8/site-packages/airflow/providers/google/cloud/operators/bigquery.py", line 35, in <module>
    from airflow.providers.common.sql.operators.sql import (
ImportError: cannot import name '_get_failed_checks' from 'airflow.providers.common.sql.operators.sql' (/opt/python3.8/lib/python3.8/site-packages/airflow/providers/common/sql/operators/sql.py)

Why this issue is tricky: other providers, such as apache-airflow-providers-microsoft-mssql==3.3.0 and apache-airflow-providers-oracle==3.5.0, declare a dependency on apache-airflow-providers-common-sql>=1.3.0 and will therefore pull it in when they are added to the Composer environment.
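
Because the upgrade can arrive transitively like this, it can help to fail fast when the broken combination is present. A minimal sketch (editor's illustration, not part of the report) that could run as a pre-deployment check, assuming the versions named above:

# Detect the incompatible provider combination before DAG parsing breaks.
from importlib.metadata import version

common_sql = version("apache-airflow-providers-common-sql")
google = version("apache-airflow-providers-google")

# google 8.4.0 imports the private helper _get_failed_checks, which
# common-sql 1.3.0 no longer exposes.
if google == "8.4.0" and common_sql == "1.3.0":
    raise RuntimeError(
        f"apache-airflow-providers-google {google} is incompatible with "
        f"apache-airflow-providers-common-sql {common_sql}; "
        "pin apache-airflow-providers-common-sql==1.2.0"
    )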

Current mitigation: downgrade provider packages so that apache-airflow-providers-common-sql==1.2.0 is installed instead, as sketched below.
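
A minimal requirements-style sketch of the pin (which other providers must be downgraded alongside it depends on what else is installed):

# Restore the last compatible release; also downgrade any provider that
# requires common-sql>=1.3.0 (e.g. the microsoft-mssql and oracle providers above).
apache-airflow-providers-common-sql==1.2.0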

What you think should happen instead

A minor version upgrade of apache-airflow-providers-common-sql (1.2.0 to 1.3.0) should not break other providers (e.g. apache-airflow-providers-google==8.4.0).
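
For context, the breakage comes from the google provider importing a private helper from common-sql. One defensive pattern (an illustrative sketch only, not the fix that was actually shipped) is to guard the import and fail with an actionable message:

# Illustrative only: depending on a private helper couples the google
# provider to common-sql internals, so the import should be guarded.
try:
    from airflow.providers.common.sql.operators.sql import _get_failed_checks
except ImportError as e:  # helper removed in common-sql 1.3.0
    raise ImportError(
        "apache-airflow-providers-google 8.4.0 requires "
        "apache-airflow-providers-common-sql<1.3.0 (pin ==1.2.0)"
    ) from e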

How to reproduce

  • Deploy a fresh Composer environment (composer-2.0.32-airflow-2.3.4)
  • Install apache-airflow-providers-common-sql==1.3.0 via the PyPI package installation feature
  • Deploy a DAG that uses one of the BigQuery operators, such as:
from datetime import timedelta

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator
from airflow.utils.dates import days_ago


default_args = {
    'start_date': days_ago(0),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    'test-dag',
    default_args=default_args,
    schedule_interval=None,
    dagrun_timeout=timedelta(minutes=20),
)

# The import above fails at DAG parse time, so the operator arguments
# are irrelevant to reproducing the error and are elided here.
t1 = BigQueryExecuteQueryOperator(
    ...
)
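
The failure is not Composer-specific: with the same provider versions installed anywhere, importing the module is enough to trigger the traceback shown above (assumed local reproduction, no DAG run needed):

# With google==8.4.0 and common-sql==1.3.0 installed, the import alone fails.
import importlib

importlib.import_module("airflow.providers.google.cloud.operators.bigquery")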

Operating System

Ubuntu 18.04.6 LTS

Versions of Apache Airflow Providers

  • apache-airflow-providers-apache-beam @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_apache_beam-4.0.0-py3-none-any.whl
  • apache-airflow-providers-cncf-kubernetes @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_cncf_kubernetes-4.4.0-py3-none-any.whl
  • apache-airflow-providers-common-sql @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_common_sql-1.3.0-py3-none-any.whl
  • apache-airflow-providers-dbt-cloud @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_dbt_cloud-2.2.0-py3-none-any.whl
  • apache-airflow-providers-ftp @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_ftp-3.1.0-py3-none-any.whl
  • apache-airflow-providers-google @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_google-8.4.0-py3-none-any.whl
  • apache-airflow-providers-hashicorp @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_hashicorp-3.1.0-py3-none-any.whl
  • apache-airflow-providers-http @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_http-4.0.0-py3-none-any.whl
  • apache-airflow-providers-imap @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_imap-3.0.0-py3-none-any.whl
  • apache-airflow-providers-mysql @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_mysql-3.2.1-py3-none-any.whl
  • apache-airflow-providers-postgres @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_postgres-5.2.2-py3-none-any.whl
  • apache-airflow-providers-sendgrid @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_sendgrid-3.0.0-py3-none-any.whl
  • apache-airflow-providers-sqlite @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_sqlite-3.2.1-py3-none-any.whl
  • apache-airflow-providers-ssh @ file:///usr/local/lib/airflow-pypi-dependencies-2.3.4/python3.8/apache_airflow_providers_ssh-3.2.0-py3-none-any.whl

Deployment

Composer

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 20 (15 by maintainers)

Most upvoted comments

@potiuk I usually pin versions (==1.2.3) of providers that ship a lot of dependencies to what is shown in the official image (found by running pip show <package>), and only upgrade if the version was bumped in the next release. Also, we don’t upgrade to every release right away, so the snippets I posted were for the 2.4.1 version, where we did some dependency shuffling (no version bumps, just a simple poetry update) and then saw errors pop up on the test environment after the new image was deployed.

The reason we use poetry is to resolve potential incompatibilities between our own libraries and Airflow's dependencies. For some providers - like datadog in the example above - it is more or less safe to lock only the major and minor version with ~, but for google, after running into problems with the protobuf library upgrade, I learned it is safer to pin dependencies exactly.
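
For illustration, the strategy described in this comment might look like the following pyproject.toml fragment (package names and versions are placeholders, not recommendations):

[tool.poetry.dependencies]
# Exact pin for heavyweight providers, matched to the official image.
apache-airflow-providers-google = "8.4.0"
# Major.minor lock (~ allows patch upgrades only) for lower-risk providers.
apache-airflow-providers-datadog = "~3.1"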