airflow: DAG's parameter access_control is not refreshing in the UI

Apache Airflow version: 1.10.9

Environment:

  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
  • Kernel (e.g. uname -a): Linux 4cc8ac3c2cfb 5.4.0-26-generic #30-Ubuntu SMP Mon Apr 20 16:58:30 UTC 2020 x86_64 GNU/Linux
  • Install tools: N/A

What happened:

When I update a DAG’s access_control parameter, the change is not reflected in the UI at roles/list/. I have to trigger a manual DAG refresh in the UI to pick up the change. (I tried it many times and waited 10+ minutes.)

What you expected to happen:

I assume the change should be picked up automatically, as it is for any other DAG parameter.

How to reproduce it:

Create a new DAG, for example with the parameter access_control={'Public': ['can_dag_read']}.
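For reference, a minimal DAG file exercising this parameter might look like the sketch below (a configuration fragment for Airflow 1.10.x; the dag_id, schedule, and task are illustrative, not from the report):

```python
# Hypothetical minimal DAG illustrating the access_control parameter
# (Airflow 1.10.x import paths; role and permission names are examples).
from datetime import datetime

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

dag = DAG(
    dag_id="access_control_demo",
    start_date=datetime(2020, 1, 1),
    schedule_interval=None,
    # Grants the 'Public' role read access to this DAG only.
    access_control={"Public": ["can_dag_read"]},
)

noop = DummyOperator(task_id="noop", dag=dag)
```

The issue is that editing the access_control mapping here does not propagate to the roles list in the UI without a manual refresh.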

About this issue

  • State: closed
  • Created 4 years ago
  • Reactions: 4
  • Comments: 23 (7 by maintainers)

Most upvoted comments

I’ll be fixing this as part of #15311.

Any update on this?

@jdavidheiser I haven’t been pinged on Airflow for a while 😃 If I recall (this was ~2 years ago), we first needed this commit (https://github.com/apache/airflow/pull/4642/files) to be able to use access_control in that Airflow version.

I thought at the time about allowing sync_perm_for_dag to be called to work around this issue (Pinterest did it this way). But the problem is that RBAC lives in the webserver/Flask-AppBuilder, while the scheduler does the DAG parsing. Calling a webserver function from within the scheduler kind of violates the contract: ideally the scheduler should call a DB function rather than a webserver internal (the scheduler shouldn’t be aware of webserver internals and shouldn’t instantiate a webserver object). The challenge is that, if I recall correctly, all the RBAC DB models live in Flask-AppBuilder rather than in Airflow.

I think the ultimate solution is an API gateway (or, if we don’t have one, the webserver API is fine) that the scheduler can call to update permissions during DAG parsing. Our current workaround is to use the CLI to update that part periodically (cc @astahlman). I haven’t looked at the latest Airflow code for a while; I’ll defer to @ashb or others to comment on whether there are other approaches.
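The periodic CLI sync described above essentially diffs the desired access_control mapping against what is currently stored and applies the difference. A toy, Airflow-independent sketch of that diff step (the function name and data shapes are hypothetical, not Airflow’s API):

```python
def diff_dag_perms(desired, current):
    """Compute the permission changes needed for one DAG.

    desired/current: dict mapping role name -> set of permission names,
    e.g. {"test_role": {"can_edit", "can_read"}}.
    Returns (to_grant, to_revoke) in the same role -> set shape.
    """
    to_grant, to_revoke = {}, {}
    for role in set(desired) | set(current):
        want = set(desired.get(role, ()))
        have = set(current.get(role, ()))
        if want - have:
            to_grant[role] = want - have   # permissions missing in the DB
        if have - want:
            to_revoke[role] = have - want  # stale permissions to remove
    return to_grant, to_revoke
```

In Airflow itself this logic would run against the RBAC DB models; the sketch only illustrates why the sync has to compare both directions (grants and revocations).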

Hi @kaxil and @ashb, @Cabeda and I just tested this under version 2.0.0beta3 and the issue still exists. Here is how we replicated it:

  1. Install version 2.0.0beta3 via Docker;
  2. (all of the following steps are done in the Airflow UI)
  3. Create another user called “test_user”, besides having the “Admin” user;
  4. Create role “test_role”;
  5. Remove all roles from user “test_user” and assign role “test_role” to it;
  6. Change the DAG “example_bash_operator” code to include access_control={'test_role': ['can_edit', 'can_read']} inside the DAG() constructor;
  7. See the DAG code change as an Admin logged in;
  8. Log out and log in as “test_user”;
  9. You don’t see the DAG, even though you should;
  10. Log out and log in as “Admin”;
  11. Click the refresh button of the “example_bash_operator” DAG;
  12. Log out and log in as “test_user”;
  13. Now you see the DAG.
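The code change in step 6 amounts to adding one keyword argument to the DAG constructor; a sketch of the edit (a configuration fragment; the other constructor arguments of example_bash_operator are elided and unchanged):

```python
from airflow.models import DAG

# Step 6 sketch: grant 'test_role' edit/read access to this DAG only.
# All other constructor arguments of example_bash_operator stay as-is.
dag = DAG(
    dag_id="example_bash_operator",
    access_control={"test_role": ["can_edit", "can_read"]},
)
```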

Hope this helps in tackling it. This one is really impacting our day-to-day operations with Airflow. If we can help in any other way, let us know!

Hey @ashb @kaxil @jhtimmins -> I think this is one we should fix for the RC?

@iam432 Let me answer your question on behalf of @vdusek

We noticed that if you click the DAG refresh button on the right side of the main screen (in the Links column), access rights are updated for that particular DAG; however, you have to be an admin. This is not a usable workaround, since every time a new DAG is added, someone with Admin rights has to go into the UI and press the refresh button.

For now we have had to disable access control at the DAG level, but it’s something we need to resolve mid-term.

@vdusek: I created a shell script for the sync (there are other things in the script too) and scheduled a cron job to run it every 5 minutes, as below.

[airflow@hostname ~]$ cat /data/airflow/sync.sh
#!/bin/bash
airflow sync_perm

[airflow@hostname ~]$ crontab -l
*/5 * * * * sh /data/airflow/sync.sh