airflow: DAG's parameter access_control is not refreshing in the UI
Apache Airflow version: 1.10.9
Environment:
- Cloud provider or hardware configuration: N/A
- OS (e.g. from /etc/os-release): Debian GNU/Linux 9 (stretch)
- Kernel (e.g. uname -a): Linux 4cc8ac3c2cfb 5.4.0-26-generic #30-Ubuntu SMP Mon Apr 20 16:58:30 UTC 2020 x86_64 GNU/Linux
- Install tools: N/A
What happened:
When I update a DAG's access_control parameter, the change is not reflected in the UI at roles/list/. I have to trigger a DAG refresh manually in the UI to get the change. (I tried it many times and waited 10+ minutes.)
What you expected to happen:
I assume the change should be applied automatically, as for any other DAG parameter.
How to reproduce it:
Create a new DAG, for example with the parameter access_control={'Public': ['can_dag_read']}.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 4
- Comments: 23 (7 by maintainers)
Commits related to this issue
- Sync DAG specific permissions when parsing (#15311) This POC allows the DAG specific permissions to be created/updated during DAG parsing, instead of during webserver start or cli `sync-perm`. Wit... — committed to apache/airflow by jedcunningham 3 years ago
- Sync DAG specific permissions when parsing (#15311) This POC allows the DAG specific permissions to be created/updated during DAG parsing, instead of during webserver start or cli `sync-perm`. With ... — committed to astronomer/airflow by jedcunningham 3 years ago
I’ll be fixing this as part of #15311.
Any update on this?
@jdavidheiser I haven't been pinged in Airflow for a while 😃 If I recall correctly (~2 years ago), we first need this commit (https://github.com/apache/airflow/pull/4642/files) to be able to use access_control in the Airflow version.
At that time I thought about allowing a call to sync_perm_for_dag to work around this issue (Pinterest did it this way). But the issue here is that RBAC lives in the webserver / Flask-AppBuilder, while the scheduler does the DAG parsing. Calling a webserver function from within the scheduler kind of violates the contract: ideally it should call a DB function instead of a webserver internal (the scheduler shouldn't be aware of webserver internals and shouldn't instantiate a webserver object). The challenge is that all the RBAC DB models live in RBAC instead of Airflow, if I recall correctly.
I think the ultimate solution is to have an API gateway (if we don't have that, the webserver API is fine) that the scheduler can call to update permissions during DAG parsing. Our current workaround is to use the CLI to update that part periodically (cc @astahlman). I haven't looked at the latest Airflow code for a while, so I will defer to @ashb or others to comment on whether there are alternative approaches.
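The idea of updating permissions while parsing can be pictured as a diff between what is stored and what the DAG file declares. The snippet below is only a sketch of that idea in plain Python, not Airflow's implementation: `flatten` and `diff_permissions` are hypothetical helpers, and the stored permissions are modelled as a simple dict rather than the real RBAC tables.

```python
from typing import Dict, List, Set, Tuple

# (role name, permission name); a stand-in for rows in the RBAC tables.
Perm = Tuple[str, str]


def flatten(access_control: Dict[str, List[str]]) -> Set[Perm]:
    """Turn an access_control mapping into a set of (role, permission) pairs."""
    return {
        (role, perm)
        for role, perms in access_control.items()
        for perm in perms
    }


def diff_permissions(
    stored: Dict[str, List[str]],
    declared: Dict[str, List[str]],
) -> Tuple[Set[Perm], Set[Perm]]:
    """Return (pairs to grant, pairs to revoke) to make `stored` match `declared`."""
    current = flatten(stored)
    wanted = flatten(declared)
    return wanted - current, current - wanted


# Hypothetical state: what was synced earlier vs. what the DAG file says now.
stored = {"Public": ["can_dag_read"]}
declared = {
    "Public": ["can_dag_read"],
    "test_role": ["can_edit", "can_read"],
}

to_grant, to_revoke = diff_permissions(stored, declared)
print(sorted(to_grant))   # [('test_role', 'can_edit'), ('test_role', 'can_read')]
print(sorted(to_revoke))  # []
```

Whichever component applies such a diff (a DB function, an API call, or the CLI) is the open design question discussed in this comment.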
Hi @kaxil and @ashb, @Cabeda and I just ran the test under version 2.0.0beta3 and the issue still exists. Here is how we replicated it:
- Set access_control={'test_role': ['can_edit', 'can_read']}, inside the DAG() constructor;
Hope this helps in tackling it. This one is really impacting our BAU when using Airflow. If we can help in any other way, let us know!
Hey @ashb @kaxil @jhtimmins -> I think this is something we should fix for the rc?
@iam432 Let me answer your question on behalf of @vdusek
We noticed that if you click the DAG refresh button on the right side of the main screen (in the Links column), access rights are updated for that particular DAG; however, you have to be an admin. This is not a usable workaround, since every time a new DAG is added someone with Admin rights would have to go into the UI and press the refresh button.
At the moment we have had to disable access control at the DAG level; however, it's something we need to resolve mid-term.
@vdusek: I created a shell script for the sync (there are other things in the script) and scheduled a cron job to run it every 5 minutes, as below.
[airflow@hostname ~]$ cat /data/airflow/sync.sh
#!/bin/bash
airflow sync_perm
[airflow@hostname ~]$
[airflow@hostname ~]$ crontab -l
*/5 * * * * sh /data/airflow/sync.sh
[airflow@hostname ~]$
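For setups where cron is not available, the same periodic workaround could be written as a small Python loop. This is a sketch under assumptions: the function names are made up for illustration, the `airflow` executable is assumed to be on the PATH, and the command is `airflow sync_perm` as in the script above (the Airflow 2 CLI spells it `sync-perm`).

```python
import subprocess
import time
from typing import Callable, Sequence

# CLI command from the cron script above; Airflow 2 uses `sync-perm` instead.
SYNC_COMMAND = ("airflow", "sync_perm")


def run_command(command: Sequence[str]) -> int:
    """Run a command and return its exit code without raising on failure."""
    return subprocess.run(list(command), check=False).returncode


def sync_once(runner: Callable[[Sequence[str]], int] = run_command) -> bool:
    """Run one permission sync; True means the command exited with code 0."""
    return runner(SYNC_COMMAND) == 0


def sync_forever(
    interval_seconds: int = 300,  # 5 minutes, matching the crontab entry
    runner: Callable[[Sequence[str]], int] = run_command,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Sync permissions repeatedly, waiting `interval_seconds` between runs."""
    while True:
        if not sync_once(runner):
            print("airflow sync_perm failed; will retry on the next cycle")
        sleep(interval_seconds)
```

Calling `sync_forever()` from a long-running process would mimic the cron entry. The `runner` and `sleep` parameters exist only so the loop can be exercised without a real Airflow installation.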