airflow: BigQueryValueCheckOperator does not accept a non-default project_id, which causes jobs that use impersonation_chain to fail.
Apache Airflow version
2.6.3
What happened
airflow.providers.google.cloud.operators.bigquery.BigQueryValueCheckOperator does not let you specify project_id and always uses the default project_id. So when I use an impersonated service account that does not have access to the tables in the default project, the task fails with a 403 Access Denied error.
What you think should happen instead
BigQueryValueCheckOperator should behave like other BigQuery operators, such as BigQueryUpsertTableOperator and BigQueryInsertJobOperator: it should accept a non-default project_id, so that requests are always directed to the right project instead of always going to the current/default project.
How to reproduce
Create an Airflow instance in project A, then use a service account that does not have access to the tables in project A (but can access the tables in project B) to run the code below:
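from airflow import models
from airflow.providers.google.cloud.operators.bigquery import BigQueryValueCheckOperator

# project_B, DATASET, TABLE_NAME, location and service_email are placeholders.
with models.DAG(
    dag_id="test",
    start_date=xxxx,
    schedule="x x x * *",
):
    BigQueryValueCheckOperator(
        task_id="value_check",
        sql=f"select count(1) from `{project_B}.{DATASET}.{TABLE_NAME}`",
        pass_value=102,
        tolerance=0.15,
        use_legacy_sql=False,
        location=location,
        impersonation_chain=[service_email],
    )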
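For readers unfamiliar with the pass_value and tolerance arguments used above, here is a minimal sketch of the check, assuming the operator follows the generic SQL value check behaviour of accepting a numeric result within pass_value * (1 ± tolerance). The function name within_tolerance is hypothetical and is not part of Airflow; this is not the operator's actual code.

```python
from typing import Optional


def within_tolerance(actual: float, pass_value: float,
                     tolerance: Optional[float] = None) -> bool:
    """Hypothetical helper illustrating the value check (not Airflow code).

    Assumption: with no tolerance the result must equal pass_value exactly;
    with a tolerance it must fall in
    [pass_value * (1 - tolerance), pass_value * (1 + tolerance)].
    """
    if tolerance is None:
        return actual == pass_value
    lower = pass_value * (1 - tolerance)
    upper = pass_value * (1 + tolerance)
    return lower <= actual <= upper


# With the values from the reproduction (pass_value=102, tolerance=0.15),
# the accepted range is roughly 86.7 to 117.3.
print(within_tolerance(100, 102, 0.15))
print(within_tolerance(80, 102, 0.15))
```

Note that this check only runs once the query succeeds. In the scenario reported here the task fails earlier with the 403 error, so the tolerance settings are not the cause.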
Operating System
GCP Cloud
Versions of Apache Airflow Providers
apache-airflow-providers-google==10.4.0
Deployment
Google Cloud Composer
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- State: closed
- Created a year ago
- Comments: 19 (10 by maintainers)
I am OK with this workaround for now. Thanks!
Happy to take a look at this. Feel free to assign me!