airflow: upgrading from 2.2.3 or 2.2.5 to 2.3.2 fails on migration-job
Apache Airflow version
2.3.2 (latest released)
What happened
Upgrade Airflow 2.2.3 or 2.2.5 -> 2.3.2 fails on migration-job.
first time upgrade execution:
Referencing column 'task_id' and referenced column 'task_id' in foreign key constraint 'task_map_task_instance_fkey' are incompatible.")
[SQL:
CREATE TABLE task_map (
dag_id VARCHAR(250) COLLATE utf8mb3_bin NOT NULL,
task_id VARCHAR(250) COLLATE utf8mb3_bin NOT NULL,
run_id VARCHAR(250) COLLATE utf8mb3_bin NOT NULL,
map_index INTEGER NOT NULL,
length INTEGER NOT NULL,
`keys` JSON,
PRIMARY KEY (dag_id, task_id, run_id, map_index),
CONSTRAINT task_map_length_not_negative CHECK (length >= 0),
CONSTRAINT task_map_task_instance_fkey FOREIGN KEY(dag_id, task_id, run_id, map_index) REFERENCES task_instance (dag_id, task_id, run_id, map_index) ON DELETE CASCADE
)
]
after the first failed execution (should be due to the first failed execution):
Can't DROP 'task_reschedule_ti_fkey'; check that column/key exists")
[SQL: ALTER TABLE task_reschedule DROP FOREIGN KEY task_reschedule_ti_fkey[]
What you think should happen instead
The migration-job shouldn’t fail 😉
How to reproduce
Everytime in my environment just need to create a snapshot from last working DB-Snapshot (Airflow Version 2.2.3) and then deploy Airflow 2.3.2. I can update in between to 2.2.5 but ran into the same issue by update to 2.3.2.
Operating System
Debian GNU/Linux 10 (buster) - apache/airflow:2.3.2-python3.8 (hub.docker.com)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.4.0 apache-airflow-providers-celery==2.1.0 apache-airflow-providers-cncf-kubernetes==2.2.0 apache-airflow-providers-docker==2.3.0 apache-airflow-providers-elasticsearch==2.1.0 apache-airflow-providers-ftp==2.0.1 apache-airflow-providers-google==6.2.0 apache-airflow-providers-grpc==2.0.1 apache-airflow-providers-hashicorp==2.1.1 apache-airflow-providers-http==2.0.1 apache-airflow-providers-imap==2.0.1 apache-airflow-providers-microsoft-azure==3.4.0 apache-airflow-providers-mysql==2.1.1 apache-airflow-providers-odbc==2.0.1 apache-airflow-providers-postgres==2.4.0 apache-airflow-providers-redis==2.0.1 apache-airflow-providers-sendgrid==2.0.1 apache-airflow-providers-sftp==2.3.0 apache-airflow-providers-slack==4.1.0 apache-airflow-providers-sqlite==2.0.1 apache-airflow-providers-ssh==2.3.0 apache-airflow-providers-tableau==2.1.4
Deployment
Official Apache Airflow Helm Chart
Deployment details
- K8s Rev: v1.21.12-eks-a64ea69
- helm chart version: 1.6.0
- Database: AWS RDS MySQL 8.0.28
Anything else
Full error Log first execution:
/home/airflow/.local/lib/python3.8/site-packages/airflow/configuration.py:529: DeprecationWarning: The auth_backend option in [api[] has been renamed to auth_backends - the old setting has been used, but please update your config.
option = self._get_option_from_config_file(deprecated_key, deprecated_section, key, kwargs, section)
/home/airflow/.local/lib/python3.8/site-packages/airflow/configuration.py:356: FutureWarning: The auth_backends setting in [api[] has had airflow.api.auth.backend.session added in the running config, which is needed by the UI. Please update your config before Apache Airflow 3.0.
warnings.warn(
DB: mysql+mysqldb://airflow:***@test-airflow2-db-blue.fsgfsdcfds76.eu-central-1.rds.amazonaws.com:3306/airflow
Performing upgrade with database mysql+mysqldb://airflow:***@test-airflow2-db-blue.fsgfsdcfds76.eu-central-1.rds.amazonaws.com:3306/airflow
[2022-06-17 12:19:59,724[] {db.py:920} WARNING - Found 33 duplicates in table task_fail. Will attempt to move them.
[2022-06-17 12:36:18,813[] {db.py:1448} INFO - Creating tables
INFO [alembic.runtime.migration[] Context impl MySQLImpl.
INFO [alembic.runtime.migration[] Will assume non-transactional DDL.
INFO [alembic.runtime.migration[] Running upgrade be2bfac3da23 -> c381b21cb7e4, Create a ``session`` table to store web session data
INFO [alembic.runtime.migration[] Running upgrade c381b21cb7e4 -> 587bdf053233, Add index for ``dag_id`` column in ``job`` table.
INFO [alembic.runtime.migration[] Running upgrade 587bdf053233 -> 5e3ec427fdd3, Increase length of email and username in ``ab_user`` and ``ab_register_user`` table to ``256`` characters
INFO [alembic.runtime.migration[] Running upgrade 5e3ec427fdd3 -> 786e3737b18f, Add ``timetable_description`` column to DagModel for UI.
INFO [alembic.runtime.migration[] Running upgrade 786e3737b18f -> f9da662e7089, Add ``LogTemplate`` table to track changes to config values ``log_filename_template``
INFO [alembic.runtime.migration[] Running upgrade f9da662e7089 -> e655c0453f75, Add ``map_index`` column to TaskInstance to identify task-mapping,
and a ``task_map`` table to track mapping values from XCom.
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (3780, "Referencing column 'task_id' and referenced column 'task_id' in foreign key constraint 'task_map_task_instance_fkey' are incompatible.")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/airflow/.local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/__main__.py", line 38, in main
args.func(args)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
return f(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/db_command.py", line 82, in upgradedb
db.upgradedb(to_revision=to_revision, from_revision=from_revision, show_sql_only=args.show_sql_only)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
return func(*args, session=session, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/db.py", line 1449, in upgradedb
command.upgrade(config, revision=to_revision or 'heads')
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/command.py", line 322, in upgrade
script.run_env()
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/script/base.py", line 569, in run_env
util.load_python_file(self.dir, "env.py")
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 94, in load_python_file
module = load_module_py(module_id, path)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 110, in load_module_py
spec.loader.exec_module(module) # type: ignore
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/migrations/env.py", line 107, in <module>
run_migrations_online()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/migrations/env.py", line 101, in run_migrations_online
context.run_migrations()
File "<string>", line 8, in run_migrations
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/runtime/environment.py", line 853, in run_migrations
self.get_context().run_migrations(**kw)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/runtime/migration.py", line 623, in run_migrations
step.migration_fn(**kw)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/migrations/versions/0100_2_3_0_add_taskmap_and_map_id_on_taskinstance.py", line 75, in upgrade
op.create_table(
File "<string>", line 8, in create_table
File "<string>", line 3, in create_table
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/operations/ops.py", line 1254, in create_table
return operations.invoke(op)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/operations/base.py", line 394, in invoke
return fn(self, operation)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/operations/toimpl.py", line 114, in create_table
operations.impl.create_table(table)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/ddl/impl.py", line 354, in create_table
self._exec(schema.CreateTable(table))
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/ddl/impl.py", line 195, in _exec
return conn.execute(construct, multiparams)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1200, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/sql/ddl.py", line 77, in _execute_on_connection
return connection._execute_ddl(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1290, in _execute_ddl
ret = self._execute_context(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1748, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1929, in _handle_dbapi_exception
util.raise_(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (3780, "Referencing column 'task_id' and referenced column 'task_id' in foreign key constraint 'task_map_task_instance_fkey' are incompatible.")
[SQL:
CREATE TABLE task_map (
dag_id VARCHAR(250) COLLATE utf8mb3_bin NOT NULL,
task_id VARCHAR(250) COLLATE utf8mb3_bin NOT NULL,
run_id VARCHAR(250) COLLATE utf8mb3_bin NOT NULL,
map_index INTEGER NOT NULL,
length INTEGER NOT NULL,
`keys` JSON,
PRIMARY KEY (dag_id, task_id, run_id, map_index),
CONSTRAINT task_map_length_not_negative CHECK (length >= 0),
CONSTRAINT task_map_task_instance_fkey FOREIGN KEY(dag_id, task_id, run_id, map_index) REFERENCES task_instance (dag_id, task_id, run_id, map_index) ON DELETE CASCADE
)
]
(Background on this error at: http://sqlalche.me/e/14/e3q8)
Full error Log after first execution (should caused by first execution):
/home/airflow/.local/lib/python3.8/site-packages/airflow/configuration.py:529: DeprecationWarning: The auth_backend option in [api[] has been renamed to auth_backends - the old setting has been used, but please update your config.
option = self._get_option_from_config_file(deprecated_key, deprecated_section, key, kwargs, section)
/home/airflow/.local/lib/python3.8/site-packages/airflow/configuration.py:356: FutureWarning: The auth_backends setting in [api[] has had airflow.api.auth.backend.session added in the running config, which is needed by the UI. Please update your config before Apache Airflow 3.0.
warnings.warn(
DB: mysql+mysqldb://airflow:***@test-airflow2-db-blue.cndbtlpttl69.eu-central-1.rds.amazonaws.com:3306/airflow
Performing upgrade with database mysql+mysqldb://airflow:***@test-airflow2-db-blue.cndbtlpttl69.eu-central-1.rds.amazonaws.com:3306/airflow
[2022-06-17 12:41:53,882[] {db.py:1448} INFO - Creating tables
INFO [alembic.runtime.migration[] Context impl MySQLImpl.
INFO [alembic.runtime.migration[] Will assume non-transactional DDL.
INFO [alembic.runtime.migration[] Running upgrade f9da662e7089 -> e655c0453f75, Add ``map_index`` column to TaskInstance to identify task-mapping,
and a ``task_map`` table to track mapping values from XCom.
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
MySQLdb._exceptions.OperationalError: (1091, "Can't DROP 'task_reschedule_ti_fkey'; check that column/key exists")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/airflow/.local/bin/airflow", line 8, in <module>
sys.exit(main())
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/__main__.py", line 38, in main
args.func(args)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/cli_parser.py", line 51, in command
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/cli.py", line 99, in wrapper
return f(*args, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/cli/commands/db_command.py", line 82, in upgradedb
db.upgradedb(to_revision=to_revision, from_revision=from_revision, show_sql_only=args.show_sql_only)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", line 71, in wrapper
return func(*args, session=session, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/db.py", line 1449, in upgradedb
command.upgrade(config, revision=to_revision or 'heads')
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/command.py", line 322, in upgrade
script.run_env()
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/script/base.py", line 569, in run_env
util.load_python_file(self.dir, "env.py")
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 94, in load_python_file
module = load_module_py(module_id, path)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/util/pyfiles.py", line 110, in load_module_py
spec.loader.exec_module(module) # type: ignore
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/migrations/env.py", line 107, in <module>
run_migrations_online()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/migrations/env.py", line 101, in run_migrations_online
context.run_migrations()
File "<string>", line 8, in run_migrations
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/runtime/environment.py", line 853, in run_migrations
self.get_context().run_migrations(**kw)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/runtime/migration.py", line 623, in run_migrations
step.migration_fn(**kw)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/migrations/versions/0100_2_3_0_add_taskmap_and_map_id_on_taskinstance.py", line 49, in upgrade
batch_op.drop_index("idx_task_reschedule_dag_task_run")
File "/usr/local/lib/python3.8/contextlib.py", line 120, in __exit__
next(self.gen)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/operations/base.py", line 376, in batch_alter_table
impl.flush()
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/operations/batch.py", line 111, in flush
fn(*arg, **kw)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/ddl/mysql.py", line 155, in drop_constraint
super(MySQLImpl, self).drop_constraint(const)
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/ddl/impl.py", line 338, in drop_constraint
self._exec(schema.DropConstraint(const))
File "/home/airflow/.local/lib/python3.8/site-packages/alembic/ddl/impl.py", line 195, in _exec
return conn.execute(construct, multiparams)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1200, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/sql/ddl.py", line 77, in _execute_on_connection
return connection._execute_ddl(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1290, in _execute_ddl
ret = self._execute_context(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1748, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1929, in _handle_dbapi_exception
util.raise_(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1705, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 716, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1091, "Can't DROP 'task_reschedule_ti_fkey'; check that column/key exists")
[SQL: ALTER TABLE task_reschedule DROP FOREIGN KEY task_reschedule_ti_fkey[]
(Background on this error at: http://sqlalche.me/e/14/e3q8)
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 2
- Comments: 22 (16 by maintainers)
Commits related to this issue
- Add instructions on manually fixing MySQL Charset problems Depending on the way how and when you MySQL DB was created, it might have problems with the right CHARACTER SET and COLLATION used. This PR... — committed to potiuk/airflow by potiuk 2 years ago
- Add instructions on manually fixing MySQL Charset problems (#25938) Depending on the way how and when you MySQL DB was created, it might have problems with the right CHARACTER SET and COLLATION used... — committed to apache/airflow by potiuk 2 years ago
- Merge upstream Also a chapter was added to recommend taking a backup before the migration. Based on discussions and user input from #25866, #24526 Closes: #24526 Improve cleanup of temporary files... — committed to anja-istenic/airflow by pankajastro 2 years ago
For posterity and in case it helps somebody else we fix our schema with the queries below. Note that it assumes that the migrations from 2.2 to 2.3 have not been executed and failed. Since DDL changes are not transaction in MySQL (or at least the way Alembic is configured), the state of the database will be different after a failed migration.
Hi,
I managed to run migrations with some small manual changes to database before running
airflow db upgrade
. I was upgrading from 2.2.3 to 2.3.2. The main culprit in that situation was misaligned CHARSET/COLLATE for task_id and dag_id column between multiple tables. Here is the script I used to align those columns:Thanks, @potiuk from the documentation on
sql_engine_collation_for_ids
I kew aboutdag_id
,task_id
andkey
, but it did not mentionexternal_executor_id
⚠Also thank you for clarifying the recommended CHARSET/COLLATION: 👉
DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
(as described by the database setup docs) 👉CHARACTER SET utf8mb3 COLLATE utf8mb3_bin
fordag_id
,task_id
,key
,external_executor_id
fieldsI found the easiest way to fix this is to fix it right in my
backup.sql
file.When applying the modified
backup.sql
someCREATE TABLE
failed because dependent tables only got updated later. I believe this is becausemysqldump
doesn’tDROP TABLE
everything at the beginning, but only one by one. Manually doingDROP TABLE
for these few tables works.First things first… If there’s a way to migrate the data I’ll consider it once I recovered from todays incident…
Hello, Since I’ve better understanding on migration procees by had a deeper look into it 😉 I could fix the migration manually. @dstandish It was all about the charset and at one point I had remove duplicates from xcom. Don’t know, guess due to failed migration run. @potiuk I’ve taken your offer and put into what I think could help user on it as pull to the docs.
@miimsam have a look at my modified sql dry-run code and into my pull request 24926 you should ba able fix it manual.
You can do it - sure
@itayB I think you had a different issue, the log you provided says
'Lost connection to MySQL server during query'
, however, other issues in this thread are related to CHARSET configuration. I haven’t been able to solve this after a few hours of debugging and testing multiple CHARSET configuration options.