airflow: BigQueryValueCheckOperator does not accept a non-default project_id, which causes jobs that use impersonation_chain to fail.

Apache Airflow version

2.6.3

What happened

airflow.providers.google.cloud.operators.bigquery.BigQueryValueCheckOperator does not let me specify a project_id and always uses the default one. So when I use an impersonated service account that does not have access to the tables in the default project, the task fails with a 403 Access Denied error.

What you think should happen instead

BigQueryValueCheckOperator should behave like other BigQuery operators, such as BigQueryUpsertTableOperator and BigQueryInsertJobOperator: it should accept a non-default project_id, so that requests are always directed to the right project instead of always using the current/default project.
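To make the request concrete, here is a minimal sketch of the fallback I have in mind. It is plain Python and not the provider's actual code: `resolve_project_id` is a hypothetical helper, and the project names are made up for illustration.

```python
from typing import Optional


def resolve_project_id(explicit_project_id: Optional[str], default_project_id: str) -> str:
    """Hypothetical helper: pick the project a check should run against.

    An explicitly supplied project wins; otherwise fall back to the
    default project of the connection/environment.
    """
    if explicit_project_id is not None:
        return explicit_project_id
    return default_project_id


# No explicit project given: the default project is used (today's only behaviour).
print(resolve_project_id(None, "project-a"))
# Explicit project given: it takes precedence (the behaviour requested here).
print(resolve_project_id("project-b", "project-a"))
```

With something along these lines, existing DAGs that do not pass project_id would keep working unchanged, while DAGs that use an impersonated account could point the check at the project that account can actually use.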

How to reproduce

Create an Airflow instance in project A, then use a service account that does not have access to the tables in project A (but can access the tables in project B) to run the code below:

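from airflow import models
from airflow.providers.google.cloud.operators.bigquery import BigQueryValueCheckOperator

# project_B, DATASET, TABLE_NAME, location and service_email are placeholders
# for values from my environment.
with models.DAG(
    dag_id="test",
    start_date=xxxx,
    schedule="x x x * *",
):
    BigQueryValueCheckOperator(
        task_id="value_check",
        # The table lives in project B, not in the default project A.
        sql=f"select count(1) from `{project_B}.{DATASET}.{TABLE_NAME}`",
        pass_value=102,
        tolerance=0.15,
        use_legacy_sql=False,
        location=location,
        impersonation_chain=[service_email],
    )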

Operating System

GCP Cloud

Versions of Apache Airflow Providers

apache-airflow-providers-google==10.4.0

Deployment

Google Cloud Composer

Deployment details

No response

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

About this issue

  • State: closed
  • Created a year ago
  • Comments: 19 (10 by maintainers)

Most upvoted comments

I am OK with this workaround for now. Thanks!

Happy to take a look at this. Feel free to assign me!