airflow: `airflow db upgrade` Failed to write serialized DAG
Apache Airflow version
2.4.1
What happened
Running `airflow db upgrade` on an Airflow installation with 100 DAGs fails with this error:
ERROR [airflow.models.dagbag.DagBag] Failed to write serialized DAG: /usr/local/airflow/dags/REDACTED.py
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/airflow/models/dagbag.py", line 615, in _serialize_dag_capturing_errors
dag_was_updated = SerializedDagModel.write_dag(
File "/usr/local/lib/python3.9/site-packages/airflow/utils/session.py", line 72, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/airflow/models/serialized_dag.py", line 146, in write_dag
session.query(literal(True))
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2810, in first
return self.limit(1)._iter().first()
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2894, in _iter
result = self.session.execute(
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1688, in execute
conn = self._connection_for_bind(bind)
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1529, in _connection_for_bind
return self._transaction._connection_for_bind(
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 721, in _connection_for_bind
self._assert_active()
File "/usr/local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "serialized_dag_pkey"
DETAIL: Key (dag_id)=(REDACTED) already exists.
[SQL: INSERT INTO serialized_dag (dag_id, fileloc, fileloc_hash, data, data_compressed, last_updated, dag_hash, ...
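For context on what the traceback shows: the INSERT hits the `serialized_dag` table's primary key (one row per `dag_id`), the flush fails, and from that point the same SQLAlchemy session refuses any further query until it is rolled back, which is where the `PendingRollbackError` comes from. Below is a minimal, self-contained sketch in plain SQLAlchemy — not Airflow's actual models or `write_dag` logic — that reproduces the same two-step failure; the duplicate insert stands in for whichever other writer (presumably a scheduler or DAG processor running alongside the migration) got the row for that `dag_id` in first:

```python
# Illustrative sketch only (plain SQLAlchemy, simplified stand-in table):
# a second INSERT for a dag_id that already exists fails at flush time, and
# every later query on the same session then raises PendingRollbackError
# until rollback() is called -- the two errors seen in the traceback above.
from sqlalchemy import Column, String, create_engine, select
from sqlalchemy.exc import IntegrityError, PendingRollbackError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class SerializedDagDemo(Base):
    """Stand-in for the serialized_dag table; dag_id is the primary key."""

    __tablename__ = "serialized_dag_demo"
    dag_id = Column(String, primary_key=True)  # analogue of serialized_dag_pkey
    dag_hash = Column(String, nullable=False)


if __name__ == "__main__":
    engine = create_engine("sqlite://")  # on Postgres the DBAPI error is psycopg2 UniqueViolation
    Base.metadata.create_all(engine)

    session = Session(engine)
    session.add(SerializedDagDemo(dag_id="example_dag", dag_hash="hash-1"))
    session.commit()  # the row for this dag_id now exists

    # A writer that decided to INSERT anyway (e.g. because its existence check
    # ran before the row above was committed) fails when the flush happens.
    session.add(SerializedDagDemo(dag_id="example_dag", dag_hash="hash-2"))
    try:
        session.flush()
    except IntegrityError as exc:
        print("flush failed:", exc.orig)  # duplicate key / unique constraint violation

    # Any further query on the same session reproduces the second error in the
    # traceback, until the session is explicitly rolled back.
    try:
        session.execute(select(SerializedDagDemo.dag_id)).all()
    except PendingRollbackError as exc:
        print("session unusable until rollback():", type(exc).__name__)

    session.rollback()  # what the error message asks for; recovers the session
```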
What you think should happen instead
`airflow db upgrade` should successfully reserialize DAGs at the end of the upgrade, just like the `airflow dags reserialize` command does.
How to reproduce
- Upgrade to Airflow 2.4.1 on an existing codebase
- Run `airflow db upgrade`
Operating System
Debian GNU/Linux 10 (buster)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==5.1.0
apache-airflow-providers-celery==3.0.0
apache-airflow-providers-cncf-kubernetes==4.3.0
apache-airflow-providers-common-sql==1.2.0
apache-airflow-providers-datadog==3.0.0
apache-airflow-providers-ftp==3.1.0
apache-airflow-providers-http==4.0.0
apache-airflow-providers-imap==3.0.0
apache-airflow-providers-postgres==5.2.1
apache-airflow-providers-redis==3.0.0
apache-airflow-providers-sendgrid==3.0.0
apache-airflow-providers-sftp==4.0.0
apache-airflow-providers-slack==5.1.0
apache-airflow-providers-sqlite==3.2.1
apache-airflow-providers-ssh==3.1.0
Deployment
Other Docker-based deployment
Deployment details
k8s deployment
Anything else
Fails consistently in these two scenarios:
- Run db upgrade only: `airflow db upgrade`
- Run along with reserialize: `airflow dags reserialize --clear-only` followed by `airflow db upgrade`
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 19 (16 by maintainers)
Ah, so `airflow db check-migrations -t 0 || airflow db upgrade || true` would work then.

Today we upgraded to Airflow 2.4.2 and did not notice this issue during the migration this time around.
We are using the official helm chart, so the migration occurred on deploy via the migration job. We currently run 900+ DAGs.
When we upgraded to Airflow 2.4.1 the migration took >20 minutes. After upgrading to Airflow 2.4.2 the migration took <2 minutes.
If other data points are needed, I am happy to provide some.
(I think even `-t 0` is not needed.)
It's already there: `airflow db check-migrations`
Thanks for sharing, @troyharvey. I think we are going to do the same.
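For reference, a hypothetical Python equivalent of the guard quoted above (`airflow db check-migrations -t 0 || airflow db upgrade || true`), e.g. for a custom entrypoint script; the helper name is made up and the flags are copied verbatim from the comment:

```python
# Hypothetical helper, not part of Airflow or the helm chart: mirrors the
# shell one-liner `airflow db check-migrations -t 0 || airflow db upgrade || true`.
import subprocess


def migrate_if_needed() -> None:
    # `airflow db check-migrations` exits non-zero if the metadata database is
    # not at the expected schema version within the wait timeout (-t, seconds).
    check = subprocess.run(["airflow", "db", "check-migrations", "-t", "0"])
    if check.returncode != 0:
        # Equivalent of the trailing `|| true`: do not fail the caller even if
        # the upgrade itself errors out.
        subprocess.run(["airflow", "db", "upgrade"], check=False)


if __name__ == "__main__":
    migrate_if_needed()
```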