airflow: Airflow scheduler with statsd enabled crashes when dag_id contains unexpected characters
Apache Airflow version
2.1.1
Operating System
Oracle Linux 7.9
Versions of Apache Airflow Providers
No response
Deployment
Virtualenv installation
Deployment details
The problem initially happened in a virtualenv installation of airflow 2.1.1 running as a systemd service.
I tried to reproduce it using docker-compose and airflow version 2.1.0; the problem occurs there as well.
What happened
A new DAG was added with the symbol “ö” in its dag_id. As soon as the DAG was triggered, the airflow scheduler died unexpectedly and could not be started again until the DAG was deleted from the UI and its dag_id changed in the DAG file.
What you expected to happen
The scheduler should keep running even if an exception occurs while emitting metrics for a DAG run. An exception tied to a single DAG run should not be able to kill the entire scheduler process.
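One way to get that behaviour, sketched here under assumptions rather than taken from Airflow's code, is to have the timer fall back to a no-op context manager whenever the stat name is rejected, so that a `with` block always receives something it can enter. The names `safe_timer`, `is_valid_stat_name` and `real_timer` are hypothetical; the allowed character set is the one printed in the error message in the scheduler output further down.

```python
import contextlib
import logging
import string
import time

log = logging.getLogger(__name__)

# Letters, digits, "_", "." and "-": the set printed in the scheduler's error message.
ALLOWED_CHARACTERS = set(string.ascii_letters + string.digits + "_.-")


def is_valid_stat_name(stat_name):
    """Return True if stat_name is non-empty and uses only allowed characters."""
    return bool(stat_name) and all(ch in ALLOWED_CHARACTERS for ch in stat_name)


@contextlib.contextmanager
def real_timer(stat_name, sink):
    """Stand-in for a statsd timer: stores elapsed seconds in `sink`."""
    start = time.monotonic()
    try:
        yield
    finally:
        sink[stat_name] = time.monotonic() - start


def safe_timer(stat_name, sink):
    """Hypothetical wrapper that never returns None, even for a bad stat name."""
    if not is_valid_stat_name(stat_name):
        log.error("Invalid stat name: %s; metric skipped.", stat_name)
        # No-op context manager (Python 3.7+); the caller's block still runs.
        return contextlib.nullcontext()
    return real_timer(stat_name, sink)


timings = {}
ran = []
for dag_id in ("other_dag", "Dag_With_ö"):
    with safe_timer(f"dagrun.dependency-check.{dag_id}", timings):
        ran.append(dag_id)  # scheduling work would happen here
```

With this shape the metric for the offending DAG is simply dropped, while the work inside the `with` block runs for both DAGs. Note that `contextlib.nullcontext` is not available on Python 3.6, the interpreter shown in the traceback, so a small hand-written no-op class would be needed there.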
Ideally, if a dag_id contains characters which are not allowed, the error should appear as soon as Airflow tries to parse it, and it should not be possible to schedule it at all.
It would also be nice if the error message listing the allowed characters displayed them in alphabetical order.
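A parse-time check with an ordered message could look roughly like the sketch below. `validate_dag_id` is a hypothetical helper, not Airflow's actual validation, and the allowed set is again the one shown in the scheduler's error message. The characters are sorted by code point, which puts punctuation and digits before letters rather than giving a strict dictionary order.

```python
import string

# Same character set as printed in the InvalidStatsNameException message.
ALLOWED_CHARACTERS = set(string.ascii_letters + string.digits + "_.-")


def validate_dag_id(dag_id):
    """Raise ValueError if dag_id has characters outside the allowed set."""
    bad = sorted({ch for ch in dag_id if ch not in ALLOWED_CHARACTERS})
    if bad:
        # Sorting makes the message stable and readable, unlike printing a raw set.
        allowed = "".join(sorted(ALLOWED_CHARACTERS))
        raise ValueError(
            f"dag_id {dag_id!r} contains disallowed characters {bad}; "
            f"allowed characters are: {allowed}"
        )
    return dag_id


try:
    validate_dag_id("Dag_With_ö")
except ValueError as err:
    message = str(err)
```

Running such a check when the DAG file is parsed would surface the problem as an import error for that one DAG instead of a failure inside the scheduler loop.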
How to reproduce
- Start an instance of airflow with statsd metrics enabled
- Create a DAG with a dag_id containing characters which are not allowed in stat names (for example “ö”)
- Trigger the DAG. The scheduler will shut down as soon as the DAG is triggered.
Anything else
The problem occurs only when AIRFLOW__METRICS__STATSD_ON is set to true. When statsd metrics are disabled, the DAG runs without problems.
Scheduler output with some values replaced with equivalent placeholders:
Sep 03 14:25:37 my.airflow.domain python[110341]: [2021-09-03 14:25:37,134] {{dagrun.py:444}} INFO - Marking run <DagRun other_dag @ 2021-09-03 11:24:34.760370+00:00: scheduled__2021-09-03T11:24:34.760370+00:00, externally triggered: False> successful
Sep 03 14:25:37 my.airflow.domain python[110341]: [2021-09-03 14:25:37,147] {{scheduler_job.py:1229}} INFO - Executor reports execution of other_dag.scheduler_delay_to_zabbix execution_date=2021-09-03 11:24:34.760370+00:00 exited with status success for try_number 1
Sep 03 14:26:21 my.airflow.domain python[110341]: [2021-09-03 14:26:21,324] {{stats.py:235}} ERROR - Invalid stat name: dagrun.schedule_delay.Dag_With_ö.
Sep 03 14:26:21 my.airflow.domain python[110341]: Traceback (most recent call last):
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/stats.py", line 232, in wrapper
Sep 03 14:26:21 my.airflow.domain python[110341]: stat = handler_stat_name_func(stat)
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/stats.py", line 207, in stat_name_default_handler
Sep 03 14:26:21 my.airflow.domain python[110341]: stat_name=stat_name, allowed_characters=ALLOWED_CHARACTERS
Sep 03 14:26:21 my.airflow.domain python[110341]: airflow.exceptions.InvalidStatsNameException: The stat name (dagrun.schedule_delay.Dag_With_ö) has to be composed with characters in
Sep 03 14:26:21 my.airflow.domain python[110341]: {'0', 'f', 'k', 'p', 'J', 'v', '7', 'T', 'V', 'n', 'r', 'W', 'X', 'L', 'Q', 't', '_', 'u', '-', 'c', 'h', '3', 'm', '6', 'l', 'S', 'C', 'a', 'F', 'G', 'x', 'b', 'K', '8', 'j', 'w', 'D', 'g', '1', '4', 'q', '5', 'e', 'i', 'M', 'P', 'R', 'U', 'I', 'Z', '.', 'A', 'O', '9', 'y', '2', 's', 'E', 'Y', 'z', 'B', 'H', 'd', 'N', 'o'}.
Sep 03 14:26:21 my.airflow.domain python[110341]: [2021-09-03 14:26:21,408] {{stats.py:235}} ERROR - Invalid stat name: dagrun.dependency-check.Dag_With_ö.
Sep 03 14:26:21 my.airflow.domain python[110341]: Traceback (most recent call last):
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/stats.py", line 232, in wrapper
Sep 03 14:26:21 my.airflow.domain python[110341]: stat = handler_stat_name_func(stat)
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/stats.py", line 207, in stat_name_default_handler
Sep 03 14:26:21 my.airflow.domain python[110341]: stat_name=stat_name, allowed_characters=ALLOWED_CHARACTERS
Sep 03 14:26:21 my.airflow.domain python[110341]: airflow.exceptions.InvalidStatsNameException: The stat name (dagrun.dependency-check.Dag_With_ö) has to be composed with characters in
Sep 03 14:26:21 my.airflow.domain python[110341]: {'0', 'f', 'k', 'p', 'J', 'v', '7', 'T', 'V', 'n', 'r', 'W', 'X', 'L', 'Q', 't', '_', 'u', '-', 'c', 'h', '3', 'm', '6', 'l', 'S', 'C', 'a', 'F', 'G', 'x', 'b', 'K', '8', 'j', 'w', 'D', 'g', '1', '4', 'q', '5', 'e', 'i', 'M', 'P', 'R', 'U', 'I', 'Z', '.', 'A', 'O', '9', 'y', '2', 's', 'E', 'Y', 'z', 'B', 'H', 'd', 'N', 'o'}.
Sep 03 14:26:21 my.airflow.domain python[110341]: [2021-09-03 14:26:21,409] {{scheduler_job.py:1319}} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
Sep 03 14:26:21 my.airflow.domain python[110341]: Traceback (most recent call last):
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1303, in _execute
Sep 03 14:26:21 my.airflow.domain python[110341]: self._run_scheduler_loop()
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1396, in _run_scheduler_loop
Sep 03 14:26:21 my.airflow.domain python[110341]: num_queued_tis = self._do_scheduling(session)
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1535, in _do_scheduling
Sep 03 14:26:21 my.airflow.domain python[110341]: self._schedule_dag_run(dag_run, active_runs_by_dag_id.get(dag_run.dag_id, set()), session)
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/jobs/scheduler_job.py", line 1765, in _schedule_dag_run
Sep 03 14:26:21 my.airflow.domain python[110341]: schedulable_tis, callback_to_run = dag_run.update_state(session=session, execute_callbacks=False)
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/utils/session.py", line 67, in wrapper
Sep 03 14:26:21 my.airflow.domain python[110341]: return func(*args, **kwargs)
Sep 03 14:26:21 my.airflow.domain python[110341]: File "/app01/venv/airflowvenv/lib64/python3.6/site-packages/airflow/models/dagrun.py", line 403, in update_state
Sep 03 14:26:21 my.airflow.domain python[110341]: with Stats.timer(f"dagrun.dependency-check.{self.dag_id}"):
Sep 03 14:26:21 my.airflow.domain python[110341]: AttributeError: __enter__
Sep 03 14:26:21 my.airflow.domain python[110341]: [2021-09-03 14:26:21,457] {{local_executor.py:387}} INFO - Shutting down LocalExecutor; waiting for running tasks to finish. Signal again if you don't want to wait.
Sep 03 14:26:22 my.airflow.domain python[110341]: [2021-09-03 14:26:22,561] {{process_utils.py:100}} INFO - Sending Signals.SIGTERM to GPID 8988
Sep 03 14:26:23 my.airflow.domain python[110341]: [2021-09-03 14:26:23,178] {{process_utils.py:66}} INFO - Process psutil.Process(pid=8988, status='terminated', exitcode=0, started='2021-08-17 15:46:22') (8988) terminated with exit code 0
Sep 03 14:26:23 my.airflow.domain python[110341]: [2021-09-03 14:26:23,179] {{scheduler_job.py:1330}} INFO - Exited execute loop
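The final `AttributeError: __enter__` is consistent with the stats wrapper logging the invalid name and then returning `None`, which the `with` statement in `dagrun.py` cannot enter. That reading is inferred from the traceback and has not been checked against the Airflow source. The sketch below reproduces the pattern with local stand-ins (`validate_stat`, `timer`, `update_state`, and a locally defined `InvalidStatsNameException`) and no Airflow imports. Newer Python versions raise `TypeError` rather than `AttributeError` for the same mistake, so the example accepts both.

```python
import contextlib
import functools
import string

ALLOWED_CHARACTERS = set(string.ascii_letters + string.digits + "_.-")


class InvalidStatsNameException(Exception):
    """Local stand-in for the exception named in the traceback."""


def validate_stat(fn):
    """Assumed behaviour: log and swallow the validation error, returning None."""
    @functools.wraps(fn)
    def wrapper(stat):
        try:
            if not all(ch in ALLOWED_CHARACTERS for ch in stat):
                raise InvalidStatsNameException(stat)
            return fn(stat)
        except InvalidStatsNameException:
            print(f"Invalid stat name: {stat}.")
            return None  # the caller gets None instead of a timer
    return wrapper


@validate_stat
def timer(stat):
    return contextlib.nullcontext()  # placeholder for a real statsd timer


def update_state(dag_id):
    # Mirrors the failing line in the traceback: `with` needs a context manager.
    with timer(f"dagrun.dependency-check.{dag_id}"):
        return "ok"


try:
    update_state("Dag_With_ö")
    outcome = "no error"
except (AttributeError, TypeError) as err:
    outcome = type(err).__name__
```

If that is what happens, the validation error itself is handled, but the `None` it leaves behind turns into a second, unhandled exception inside the scheduler loop.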
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- State: open
- Created 3 years ago
- Comments: 17 (14 by maintainers)
I don’t think this counts as a duplicate, as we’re talking about different components, different versions and different errors. In my case (version 2.1.1) the webserver does not seem to be affected by the non-ASCII characters at all and runs just fine. It is the scheduler that crashes.