presto: Predicate pushdown may result in query failure

Projection IF (o.orderstatus = 'P', t.x[3], NULL) as y combined with o.orderstatus = 'P' AND y IS NOT NULL filter on top of a join causes pushdown of x[3] IS NOT NULL predicate into t. This may trigger Array subscript out of bounds error if t has records where x has fewer than 3 elements. In case these records don’t survive the join, the overall query should pass, but it fails.

To reproduce, add the following to com.facebook.presto.hive.TestHiveIntegrationSmokeTest (or execute queries manually):

    @Test
    public void test()
    {
        assertQuerySucceeds(getQueryRunner().getDefaultSession(),
                "CREATE TABLE test AS " +
                "SELECT\n" +
                "    l.linenumber,\n" +
                "    o.orderkey,\n" +
                "    o.orderstatus,\n" +
                "    IF (o.orderstatus = 'P', ARRAY[1, 2, 4], ARRAY[1]) AS x\n" +
                "FROM lineitem l, orders o\n" +
                "WHERE l.orderkey = o.orderkey");

        assertQuerySucceeds(getQueryRunner().getDefaultSession(),
                "SELECT *\n" +
                "FROM (\n" +
                "    SELECT\n" +
                "        o.orderstatus,\n" +
                "        IF (o.orderstatus = 'P', x[3], NULL) AS y\n" +
                "    FROM test t, orders o\n" +
                "    WHERE t.orderkey = o.orderkey\n" +
                ")\n" +
                "WHERE orderstatus = 'P' AND y IS NOT NULL");
        
        assertUpdate("DROP TABLE test");
    }

The stacktrace:

com.facebook.presto.spi.PrestoException: Array subscript out of bounds
	at com.facebook.presto.operator.scalar.ArraySubscriptOperator.checkIndex(ArraySubscriptOperator.java:166)
	at com.facebook.presto.operator.scalar.ArraySubscriptOperator.longSubscript(ArraySubscriptOperator.java:95)
	at com.facebook.presto.$gen.PageFilter_20190625_013659_979.filter(Unknown Source)
	at com.facebook.presto.$gen.PageFilter_20190625_013659_979.filter(Unknown Source)
	at com.facebook.presto.operator.project.DictionaryAwarePageFilter.filter(DictionaryAwarePageFilter.java:83)
	at com.facebook.presto.operator.project.PageProcessor.createWorkProcessor(PageProcessor.java:115)
	at com.facebook.presto.operator.project.PageProcessor.process(PageProcessor.java:101)
	at com.facebook.presto.operator.ScanFilterAndProjectOperator.processPageSource(ScanFilterAndProjectOperator.java:287)
	at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:231)

com.facebook.presto.sql.planner.optimizations.PredicatePushDown.Rewriter#processInnerJoin already makes sure not to push down non-deterministic predicates. It may need to block pushdown of predicates that may generate an error or wrap these in try.

CC: @rongrong @highker @wenleix @arhimondr

P.S. The following fix introduced in 0.221 seems to expose this issue (e.g. queries that used to pass are now failing). It affects queries that use varchar columns. E.g. it would happen if type of orderstatus column in the above repro was changed to varchar.

aa6e60648d9d7d8ddedec70e500e89575d177707 is the first bad commit
commit aa6e60648d9d7d8ddedec70e500e89575d177707
Author: Andrii Rosa <andriirosa@fb.com>
Date:   Wed Apr 24 14:21:27 2019 -0400

    Fix equality inference for VARCHAR predicates

About this issue

  • Original URL
  • State: open
  • Created 5 years ago
  • Comments: 15 (14 by maintainers)

Most upvoted comments

@rongrong @highker I don’t think this issue qualifies as a release blocker because it seems to exist for a long time. I also don’t think we should revert #12724. I suggest we remove release-blocker label and work on a fix on a timeline that makes sense.

Simpler self-contained test case:

WITH T1 AS (
    SELECT
        *
    FROM (
        VALUES
            ('P', 1),
            ('N', 2)
    ) T(x, z)
),
test AS (
    SELECT
        *,
        IF (o.x = 'P', ARRAY[1, 2, 4], ARRAY[1]) AS a
    FROM T1 o
)
SELECT
    *
FROM (
    SELECT
        o.x,
        IF (o.x = 'P', t.a[3], NULL) AS y
    FROM test t,
        T1 o
    WHERE
        t.z = o.z
)
WHERE
    x = 'P'
    AND y IS NOT NULL;

Two suggestions of similar solutions we were talking about at lunch today were

  1. push down predicates but also keep the original predicate where it was. The predicate that gets pushed down would be the original filter + any rows that get an error (so sort of like the predicate wrapped in a try) and then after the join we’d have the strict predicate that would throw an exception on errors
  2. Push down predicate, but keep in any rows that get an error along with keeping track of the errors themselves. if after the join is done the error rows are still there, then surface the exception

Both of these would require some work on the execution side. The advantage of 1 is that it’s easier to implement. The disadvantage is that it could cause a performance regression to do the filter twice if the predicate is expensive to compute and adds more overhead for the 99% of queries that won’t have any errors.

I’ll take a look.

@wenleix Wenlei, as you pointed out #12724 only exposed the existing problem. Reverting it helps with some queries, but there can still be queries that incorrectly fail.