airflow: Airflow 2.3 scheduler error: 'V1Container' object has no attribute '_startup_probe'
Apache Airflow version
2.3.0 (latest released)
What happened
After migrating from Airflow 2.2.4 to 2.3.0, the scheduler fell into a crash loop, throwing:
--- Logging error ---
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 736, in _execute
self._run_scheduler_loop()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 826, in _run_scheduler_loop
self.executor.heartbeat()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/executors/base_executor.py", line 171, in heartbeat
self.sync()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/executors/kubernetes_executor.py", line 613, in sync
self.kube_scheduler.run_next(task)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/executors/kubernetes_executor.py", line 300, in run_next
self.log.info('Kubernetes job is %s', str(next_job).replace("\n", " "))
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod.py", line 214, in __repr__
return self.to_str()
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod.py", line 210, in to_str
return pprint.pformat(self.to_dict())
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod.py", line 196, in to_dict
result[attr] = value.to_dict()
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod_spec.py", line 1070, in to_dict
result[attr] = list(map(
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod_spec.py", line 1071, in <lambda>
lambda x: x.to_dict() if hasattr(x, "to_dict") else x,
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_container.py", line 672, in to_dict
value = getattr(self, attr)
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_container.py", line 464, in startup_probe
return self._startup_probe
AttributeError: 'V1Container' object has no attribute '_startup_probe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/logging/__init__.py", line 1083, in emit
msg = self.format(record)
File "/usr/local/lib/python3.9/logging/__init__.py", line 927, in format
return fmt.format(record)
File "/usr/local/lib/python3.9/logging/__init__.py", line 663, in format
record.message = record.getMessage()
File "/usr/local/lib/python3.9/logging/__init__.py", line 367, in getMessage
msg = msg % self.args
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod.py", line 214, in __repr__
return self.to_str()
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod.py", line 210, in to_str
return pprint.pformat(self.to_dict())
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod.py", line 196, in to_dict
result[attr] = value.to_dict()
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod_spec.py", line 1070, in to_dict
result[attr] = list(map(
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_pod_spec.py", line 1071, in <lambda>
lambda x: x.to_dict() if hasattr(x, "to_dict") else x,
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_container.py", line 672, in to_dict
value = getattr(self, attr)
File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/models/v1_container.py", line 464, in startup_probe
return self._startup_probe
AttributeError: 'V1Container' object has no attribute '_startup_probe'
Call stack:
File "/home/airflow/.local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/__main__.py", line 38, in main
args.func(args)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/cli_parser.py", line 51, in command
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/cli.py", line 99, in wrapper
return f(*args, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/scheduler_command.py", line 75, in scheduler
_run_scheduler_job(args=args)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/cli/commands/scheduler_command.py", line 46, in _run_scheduler_job
job.run()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/base_job.py", line 244, in run
self._execute()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/jobs/scheduler_job.py", line 757, in _execute
self.executor.end()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/executors/kubernetes_executor.py", line 809, in end
self._flush_task_queue()
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/executors/kubernetes_executor.py", line 767, in _flush_task_queue
self.log.warning('Executor shutting down, will NOT run task=%s', task)
Unable to print the message and arguments - possible formatting error.
Use the traceback above to help find the error.
The kubernetes Python library version was exactly as specified in the constraints file: https://raw.githubusercontent.com/apache/airflow/constraints-2.3.0/constraints-3.9.txt
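For context, the failure can be reproduced without Airflow at all. The sketch below is an assumption about the mechanism (it is not taken from this issue's logs): it simulates a V1Container that was pickled by an older kubernetes client, which never set `_startup_probe`, being serialized by a newer client.

```python
# Standalone sketch of the suspected failure mode (assumption, not from the
# issue logs). Deleting the private attribute simulates an object that was
# pickled by an older kubernetes client (which had no startup_probe field)
# and then unpickled by a newer one.
from kubernetes.client import models as k8s

container = k8s.V1Container(name="base")
del container._startup_probe  # attribute absent after a cross-version unpickle

try:
    container.to_dict()  # walks every declared attribute via getattr()
except AttributeError as err:
    print(err)  # 'V1Container' object has no attribute '_startup_probe'
```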
What you think should happen instead
The scheduler should keep working after the migration.
How to reproduce
Not 100% sure, but:
- Run Airflow 2.2.4 using the official Helm Chart
- Run some DAGs to have some records in the DB (a sketch of the kind of task involved follows below)
- Migrate to 2.3.0 (replace the 2.2.4 image with the 2.3.0 one)
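A minimal sketch of the kind of task that leaves such records behind (a hypothetical DAG, not from the original report): a task whose `executor_config` carries a `V1Pod`, which Airflow pickles into the metadata DB under 2.2.4 and the 2.3.0 scheduler later unpickles with a newer kubernetes client.

```python
# Hypothetical minimal DAG (an assumption, not from the original report).
# The V1Pod in executor_config is pickled into the metadata DB under 2.2.4;
# unpickling it under 2.3.0's newer kubernetes client triggers the crash.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from kubernetes.client import models as k8s

with DAG(
    dag_id="k8s_exec_config_demo",  # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval=None,
) as dag:
    BashOperator(
        task_id="hello",
        bash_command="echo hello",
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        k8s.V1Container(name="base", image="apache/airflow:2.2.4")
                    ]
                )
            )
        },
    )
```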
Operating System
Debian GNU/Linux 11 (bullseye)
Versions of Apache Airflow Providers
irrelevant
Deployment
Official Apache Airflow Helm Chart
Deployment details
KubernetesExecutor
PostgreSQL (RDS) as Airflow DB
Python 3.9
Docker images built from apache/airflow:2.3.0-python3.9
(with some additional libraries installed)
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 41 (28 by maintainers)
Commits related to this issue
- Don't crash scheduler if exec config has old k8s objects (#24117) From time to time k8s library objects change their attrs. If executor config is stored with old version, and unpickled with new version, we c... — committed to apache/airflow by dstandish 2 years ago (also committed to astronomer/airflow and GoogleCloudPlatform/composer-airflow)
- Return empty dict if Pod JSON encoding fails (#24478) When UI unpickles executor_configs with outdated k8s objects it can run into the same issue as the scheduler does (see https://github.com/apache/airflow/i... — committed to apache/airflow by dstandish 2 years ago (also committed to astronomer/airflow, a0x8o/airflow, and GoogleCloudPlatform/composer-airflow)
After upgrading from 2.2.4 to 2.3.2, I get the same error in the webserver when trying to view any task result of runs that were produced before the upgrade. Runs happening after the upgrade can still be viewed.
Using the same kubernetes Python client as listed in the official constraints file (kubernetes==23.6.0). The cluster is a managed AKS, version 1.23.5.
Running airflow dags reserialize did not help. This fix works: https://github.com/apache/airflow/pull/24117
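(Reserializing DAGs likely has no effect here because the pickled executor_config lives with the task instance records, not in the serialized DAG.) The fix in #24117 guards the scheduler against such stale objects; a simplified sketch of the defensive pattern involved (not the literal patch):

```python
# Simplified sketch of the defensive pattern behind #24117 (not the literal
# patch): never let repr() of a stale unpickled k8s object crash the scheduler.
def safe_pod_repr(pod) -> str:
    try:
        return str(pod).replace("\n", " ")
    except Exception:  # e.g. AttributeError on attrs added in newer clients
        return "<pod not renderable; possibly pickled by an older kubernetes client>"
```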
Can you try 2.3.4 (it will be out in 2 days or so), and if the problem is still there, please open a new issue with a detailed description of the problem and how you got there. You can refer to this issue in the new one, but piggybacking on an existing, closed issue is not going to make it “active”, I am afraid.
@dstandish I patched with the commits from #24478 and it seems to work!
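For reference, #24478 applies the same idea on the webserver side, as its commit message says: return an empty dict if Pod JSON encoding fails. A sketch of that idea (not the literal patch):

```python
# Sketch of the idea named in #24478's commit message (not the literal patch):
# if converting the pod for display fails, show an empty dict instead of a 500.
def pod_to_displayable_dict(pod) -> dict:
    try:
        return pod.to_dict()
    except Exception:
        return {}
```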