airflow: Cannot fetch log from Celery worker
Discussed in https://github.com/apache/airflow/discussions/26490
Originally posted by emredjan September 19, 2022
Apache Airflow version
2.4.0
What happened
When running tasks on a remote Celery worker, the webserver fails to fetch logs from the worker machine, giving a ‘403 - Forbidden’ error on version 2.4.0. This behavior does not happen on 2.3.3, where the remote logs are retrieved and displayed successfully.
The webserver / secret_key configuration is the same on all nodes (the config files are synced), and their time is synchronized using a central NTP server, making the solution in the warning message not applicable.
My limited analysis pointed to the serve_logs.py file and the Flask request object that’s passed to it, but I couldn’t find the root cause.
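One way to double-check that the secret key really is identical on every node, without printing the key itself, is to compare a short fingerprint of it. The following is only a minimal sketch: the function name `secret_key_fingerprint` and the placeholder value are made up for illustration, and on a real node the value would come from the [webserver] secret_key option of that node's Airflow configuration.

```python
import hashlib


def secret_key_fingerprint(secret_key: str) -> str:
    """Return a short SHA-256 fingerprint of the key, safe to compare across nodes."""
    digest = hashlib.sha256(secret_key.encode("utf-8")).hexdigest()
    return digest[:12]


if __name__ == "__main__":
    # Placeholder value (an assumption for this sketch); replace it with the
    # secret_key read from the local Airflow configuration on each node.
    print(secret_key_fingerprint("example-secret"))
```

Running this on the webserver and on each worker should print the same fingerprint if the keys match.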
What you think should happen instead
It should fetch and show remote celery worker logs on the webserver UI correctly, as it did in previous versions.
How to reproduce
- Use Airflow version 2.4.0
- Use CeleryExecutor with RabbitMQ
- Use a separate Celery worker machine
- Run a DAG/task on the remote worker
- Try to display the task log on the web UI
Operating System
Red Hat Enterprise Linux 8.6 (Ootpa)
Versions of Apache Airflow Providers
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-common-sql==1.1.0
apache-airflow-providers-ftp==3.0.0
apache-airflow-providers-hashicorp==3.0.0
apache-airflow-providers-http==3.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-microsoft-mssql==3.0.0
apache-airflow-providers-mysql==3.0.0
apache-airflow-providers-odbc==3.0.0
apache-airflow-providers-sftp==3.0.0
apache-airflow-providers-sqlite==3.0.0
apache-airflow-providers-ssh==3.0.0
Deployment
Virtualenv installation
Deployment details
Using CeleryExecutor / RabbitMQ with 2 servers
Anything else
All remote task executions have the same problem.
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 17 (12 by maintainers)
Commits related to this issue
- Fix proper joining of the path for logs retrieved from celery workers The change #26377 "fixed" the way how logs were retrieved from Celery, but it - unfortunately broke the retrieval eventually. Th... — committed to potiuk/airflow by potiuk 2 years ago
- Fix proper joining of the path for logs retrieved from celery workers (#26493) The change #26377 "fixed" the way how logs were retrieved from Celery, but it - unfortunately broke the retrieval event... — committed to apache/airflow by potiuk 2 years ago
- Fix proper joining of the path for logs retrieved from celery workers (#26493) The change #26377 "fixed" the way how logs were retrieved from Celery, but it - unfortunately broke the retrieval eventu... — committed to astronomer/airflow by potiuk 2 years ago
- Fix proper joining of the path for logs retrieved from celery workers (#26493) The change #26377 "fixed" the way how logs were retrieved from Celery, but it - unfortunately broke the retrieval eventu... — committed to apache/airflow by potiuk 2 years ago
Perfect, I confirm this indeed fixes the issue. Thanks for the swift help @potiuk !
Would be great if it wasn’t there in the first place 😦. Thanks for confirming!
https://github.com/apache/airflow/pull/26493 contains the fix - one / needs to be added.
I have the same problem. The container works on the same machine, but not on different hosts.
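As a rough illustration of why a single trailing / can matter when a log URL is joined from a base and a relative path, here is a small sketch using Python's standard urllib.parse.urljoin. It assumes the URL is built with urljoin; the thread itself only says that one / needs to be added. The host name, port, log path, and the helper name `build_log_url` are made up for the example and are not taken from the issue.

```python
from urllib.parse import urljoin


def build_log_url(base: str, relative_path: str) -> str:
    """Join a base URL and a relative log path (hypothetical helper)."""
    return urljoin(base, relative_path)


# Illustrative values only (assumptions, not from the issue).
relative = "dag_id=example/run_id=manual/task_id=t1/attempt=1.log"

# Without a trailing slash, urljoin replaces the last path segment ("log").
without_slash = build_log_url("http://worker.example:8793/log", relative)

# With a trailing slash, the relative path is appended under /log/.
with_slash = build_log_url("http://worker.example:8793/log/", relative)

print(without_slash)
print(with_slash)
```

In the first case the /log prefix is lost from the resulting URL, while in the second it is preserved, which is consistent with a fix that adds one /.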