presto: [Discussion] Lead/Lag function's behavior when the offset is NULL

// Presto TestLeadFunction.java
assertWindowQuery("lead(orderkey, null, -1) OVER (PARTITION BY orderstatus ORDER BY orderkey)",
        resultBuilder(TEST_SESSION, INTEGER, VARCHAR, INTEGER)
                .row(3, "F", null)
                .row(5, "F", null)
                .row(6, "F", null)
                .row(33, "F", null)
                .row(1, "O", null)
                .row(2, "O", null)
                .row(4, "O", null)
                .row(7, "O", null)
                .row(32, "O", null)
                .row(34, "O", null)
                .build());

In this test case, the offset is null, default value is specified as -1, but the expected value is null rather than -1, while in the doc([1]), it says that default value should be returned:

If the offset is null or larger than the window, the default_value is returned, or if it is not specified null is returned.

Lag function has similar issue.

[1]. https://prestodb.io/docs/current/functions/window.html

About this issue

  • Original URL
  • State: closed
  • Created 10 months ago
  • Comments: 16 (16 by maintainers)

Commits related to this issue

Most upvoted comments

@tdcmeehan @aditi-pandit I want to bring up this discussion again and get a conclusion this time.

I tried several databases for the lead function + null offset scenario and summarized the result here:

DB Behavior
Trino Return Null (see [1])
Presto(Current) Return Null
PostgreSQL Return Null (see [2])
Velox Return Default Value (see [3])
MySQL Raise error(see [4])
DuckDB Return Current Row (See @aditi-pandit 's comment above)

And from the results above I’d vote for Return Null because Presto(Velox’s origin?) & PostgreSQL goes this way. What do you think?

References

  1. Trino Behavior: https://trino.io/docs/current/functions/window.html
  2. PostgreSQL
postgres=# select lead(orderkey, null, -1) over (partition by orderstatus order by orderkey ) from (values (1, 's'), (2, 's')) t(orderkey, orderstatus);
 lead
------


(2 rows)
  1. Velox Behavior: https://github.com/facebookincubator/velox/issues/6413
  2. MySQL Behavior
mysql> select lead(orderkey, null, -1) over (partition by orderstatus order by orderkey ) from test;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'null, -1) over (partition by orderstatus order by orderkey ) from test' at line 1

@aditi-pandit I was referring to this sentence in Oracle doc:

The optional default value is returned if the offset goes beyond the scope of the table.

I was assuming null offset is a offset goes beyond the scope of the table, and yes, it didn’t say it explictly.

@xumingming : ‘offset that goes beyond the scope of the table’ is a situation like lead on the last row with offset 5 goes beyond the scope of the partition. null offset is not that imo.