datafusion: Don't ignore failed optimizer rules
Describe the bug Some logical optimizer rules, such as type coercion, are essential for correct query execution, but they are currently silently ignored when they fail.
To Reproduce TBD
Expected behavior
Don’t skip failed rules (this probably requires us to remove the non-failing `OptimizerRule::optimize` method).
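
For context, here is a minimal, hypothetical sketch of the two method shapes under discussion; `LogicalPlan` is a stand-in type and this is not the exact DataFusion trait definition. It only illustrates why an infallible `optimize` cannot report a failure, while a fallible `try_optimize` can:

```rust
// Hypothetical, simplified sketch -- not the exact DataFusion trait definition.
// `LogicalPlan` here is a stand-in for datafusion_expr::LogicalPlan.
struct LogicalPlan;

trait OptimizerRule {
    /// Infallible signature: a failing rule can only panic or silently
    /// hand the input plan back unchanged.
    fn optimize(&self, plan: &LogicalPlan) -> LogicalPlan;

    /// Fallible signature: `Err` reports a real failure that the caller can
    /// either skip or propagate; `Ok(None)` means "nothing to rewrite".
    fn try_optimize(&self, plan: &LogicalPlan) -> Result<Option<LogicalPlan>, String>;

    /// Human-readable rule name, useful when reporting skipped or failed rules.
    fn name(&self) -> &str;
}

fn main() {}
```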
Workaround
This behavior is controlled by a config option that defaults to true, meaning failed rules are silently skipped.
You can see this in the DataFusion CLI:
DataFusion CLI v20.0.0
❯ show all;
+-----------------------------------------------------------+------------+
| name                                                       | setting    |
+-----------------------------------------------------------+------------+
...
| datafusion.optimizer.skip_failed_rules                     | true       |
...
+-----------------------------------------------------------+------------+
35 rows in set. Query took 0.038 seconds.
DataFusion can be configured to fail hard when an optimizer rule fails, for example:
❯ -- override the default behavior
❯ set datafusion.optimizer.skip_failed_rules = false;
0 rows in set. Query took 0.023 seconds.
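
The same option can also be toggled programmatically. A minimal sketch, assuming a DataFusion version that exposes `SessionConfig::set_bool` and `SessionContext::with_config` (builder method names vary between releases, so check the docs for the version in use):

```rust
use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    // Assumption: this config key and these builder methods match the
    // DataFusion release in use (compare with the `show all;` output above).
    let config = SessionConfig::new()
        .set_bool("datafusion.optimizer.skip_failed_rules", false);
    let ctx = SessionContext::with_config(config);

    // With skipping disabled, a failing optimizer rule is reported as an
    // error from planning/execution instead of being logged and ignored.
    let df = ctx.sql("SELECT 1").await?;
    df.show().await?;
    Ok(())
}
```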
Additional context See this discussion.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 19 (19 by maintainers)
These are the variants for runtime behavior, not related to compile-time checking, I think. I’ve done some tests on Postgres and found that the expression `when FALSE then ...` is removed before type checking, so if you have the expression `when FALSE then 1/0` in your query, it is always valid: https://onecompiler.com/postgresql/3yskvh35x

But if the invalid expression cannot be removed at compile time, a Divide by zero error is always returned before the query executes, even if the invalid code would never be touched at runtime: https://onecompiler.com/postgresql/3yskvqaxf

I tried to remove the `skip_failed_rules` config and ran into several test failures, tracked in #4685. Feel free to update the task list if you find more, and you’re welcome to fix them if you have time!

> I would personally rather have an InternalError than a panic
Makes sense to me. It’s a good discussion point. Some rules are essential, and we shouldn’t tolerate their failure. BTW, I think we should also make optimizer rules more stable instead of ignoring errors.
IMHO, it’s not good to tolerate errors in the optimizer. If a rule wants to tolerate a failure, I think it is better to handle that inside the rule itself.

The question of when to check for errors is highly dependent on the database. I’ve worked with MS cloud SQL and Exasol before, and in some cases type errors (e.g. due to incompatible operators or functions that only accept a certain type) were only raised if at least a single row was passed through the expression (after joins and WHERE selections). These errors happened at query execution, not planning. From a user’s perspective this was highly confusing. I recommend we take a look at the SQL standard and, if it allows for it, bail out of invalid queries as early as possible, no matter whether there is any data to query.
That’s correct. Now that we have the `try_optimize` method, I support failing on errors, and we will soon fix cases where rules incorrectly return errors.
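
To make the skip-versus-fail policy concrete, here is a hypothetical driver loop (not DataFusion’s actual optimizer code) showing what `skip_failed_rules = true` versus `false` amounts to when a rule’s `try_optimize` returns an error; the types are the same simplified stand-ins as in the sketch above:

```rust
// Hypothetical illustration of the skip-vs-fail policy; this is not
// DataFusion's actual optimizer implementation.
struct LogicalPlan;

trait OptimizerRule {
    fn try_optimize(&self, plan: &LogicalPlan) -> Result<Option<LogicalPlan>, String>;
    fn name(&self) -> &str;
}

fn optimize_plan(
    mut plan: LogicalPlan,
    rules: &[Box<dyn OptimizerRule>],
    skip_failed_rules: bool,
) -> Result<LogicalPlan, String> {
    for rule in rules {
        match rule.try_optimize(&plan) {
            // The rule rewrote the plan: carry the new plan forward.
            Ok(Some(new_plan)) => plan = new_plan,
            // The rule had nothing to change: keep the current plan.
            Ok(None) => {}
            // Current default (skip_failed_rules = true): log and move on,
            // which is unsafe for essential rules such as type coercion.
            Err(e) if skip_failed_rules => {
                eprintln!("skipping failed optimizer rule {}: {e}", rule.name());
            }
            // Proposed behavior: propagate the failure to the caller.
            Err(e) => return Err(format!("optimizer rule {} failed: {e}", rule.name())),
        }
    }
    Ok(plan)
}

fn main() {}
```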