dbt-athena: Getting random "mismatched input 'cascade'." during dbt test
First of all: Thank you for your amazing contribution and bringing Athena engine v3 features to the DBT community!
I am currently tracking a bug that seems to occur non-deterministically. Essentially, during dbt test, I get syntax errors like these:
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:100: mismatched input 'cascade'. Expecting: '.', <EOF
It happens during the initial cleanup phase when DBT tries to delete existing test tables. To me, it looks like sometimes, the macro drop_relation from this repo is not found and DBT falls back to using default__drop_relation, which does a
drop {{ relation.type }} if exists {{ relation }} cascade
Below is the relevant log output of such an error:
-- /* {"app": "dbt", "dbt_version": "1.3.1", "profile_name": "athena", "target_name": "prelive", "node_id": "test.data_importer.not_null_bestaende_cleaned_lagerbestand.d2103837db"} */
12:21:35.254829 [debug] [Thread-7 (]: Opening a new connection, currently in state closed
Failed to execute query.
Traceback (most recent call last):
File "/opt/pysetup/.venv/lib/python3.10/site-packages/pyathena/common.py", line 494, in _execute
query_id = retry_api_call(
File "/opt/pysetup/.venv/lib/python3.10/site-packages/pyathena/util.py", line 68, in retry_api_call
return retry(func, *args, **kwargs)
File "/opt/pysetup/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 406, in __call__
do = self.iter(retry_state=retry_state)
File "/opt/pysetup/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 351, in iter
return fut.result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/opt/pysetup/.venv/lib/python3.10/site-packages/tenacity/__init__.py", line 409, in __call__
result = fn(*args, **kwargs)
File "/opt/pysetup/.venv/lib/python3.10/site-packages/botocore/client.py", line 530, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/opt/pysetup/.venv/lib/python3.10/site-packages/botocore/client.py", line 960, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:102: mismatched input 'cascade'. Expecting: '.', <EOF>
12:21:35.631219 [debug] [Thread-8 (]: Athena adapter: Error running SQL: drop table if exists dbt_test__audit.dbt_expectations_expect_column_93d8b305ae3a3af2a182dc0d9423d314 cascade
-- /* {"app": "dbt", "dbt_version": "1.3.1", "profile_name": "athena", "target_name": "prelive", "node_id": "test.data_importer.dbt_expectations_expect_column_value_lengths_to_equal_rueres_wm_ek__wm_ek_IS_NOT_NULL__5.bc16ec6d3f"} */
12:21:35.631863 [debug] [Thread-8 (]: Athena adapter: Error running SQL: macro drop_relation
12:21:35.632307 [debug] [Thread-8 (]: finished collecting timing info
12:21:35.632645 [debug] [Thread-8 (]: On test.data_importer.dbt_expectations_expect_column_value_lengths_to_equal_rueres_wm_ek__wm_ek_IS_NOT_NULL__5.bc16ec6d3f: Close
12:21:35.633406 [debug] [Thread-8 (]: Runtime Error in test dbt_expectations_expect_column_value_lengths_to_equal_rueres_wm_ek__wm_ek_IS_NOT_NULL__5 (models/host/schema.yml)
I had some success on my local machine by adding a simple Jinja comment in front of this line.
Is it possible that there is a race condition when registering the macro that leads to spurious errors when the macro is not yet registered when test tables are being deleted? I am not very familiar with the underlying process of how DBT adapters work together with DBT core, just guessing here.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 16 (16 by maintainers)
Commits related to this issue
- Fix #94 by explicitly specifying project_name for drop_relation macro lookup — committed to jurgispods/dbt-athena by jurgispods 2 years ago
This seems to work for me as well, thank you @nicor88!
The nice thing about it is that it addresses another issue that I was having from time to time during development: When you switch a table type between
"hive"and"iceberg", thedrop_tablemacro would not delete the existing S3 paths on the next run, and the AthenaDROP TABLEstatement would time out. This can’t happen anymore with #119 as it is always checked with Glue what table type we are dealing with instead of relying on user config.Btw, I had other cases with
DROP TABLEstatements on Iceberg tables timing out and filed issues with AWS support. In one case, they found a bug related to trailing slashes in the S3 location property which should be fixed soon.@jurgispods After some thoughts, I don’t think that the issue will be fix in dbt-core. There are cases where we want to drop a relation and no config context is available, e.g. for some table dropping on testing.
As agreed with @Jrmyy we will try to propose a solution that ignore totally the config, and check only the table type in order to do the right drop.