velox: Projection got incorrect result with LazyVector

I added the following test case to ExprTest.cpp and got an incorrect result.

TEST_F(ExprTest, lazyLoadBug) {
    const vector_size_t size = 4;

    auto valueAt = [](auto row) { return row; };
    // [1, 1, 1, null]
    auto inputC0 = makeFlatVector<int64_t>(
            size, [](auto row) { return 1; }, [](auto row) { return row == 3; });
    // [0, 1, 2, 3] if fully loaded
    std::vector<vector_size_t> loadedRows;
    VectorPtr inputC1 = std::make_shared<LazyVector>(
            pool_.get(),
            BIGINT(),
            size,
            std::make_unique<test::SimpleVectorLoader>([&](auto rows) {
                for (auto row : rows) {
                    loadedRows.push_back(row);
                }
                return makeFlatVector<int64_t>(
                        rows.back() + 1, valueAt);
            }));

    // 1) The test passes if the LazyVector is loaded manually with allRows:
//    SelectivityVector allRows(size);
//    LazyVector::ensureLoadedRows(inputC1, allRows);

    // 2) The test also passes with a flat vector instead of a LazyVector:
//    auto inputC1 = makeFlatVector<int64_t>(
//            size, valueAt, nullptr);

    // isFinalSelection_ == true
    auto result = evaluate(
            "row_constructor(c0 + c1, if (c1 >= 0, c1, 0))", makeRowVector({inputC0, inputC1}));

    // [1, 2, 3, null]
    auto outputCol0 = makeFlatVector<int64_t>(
            size, [](auto row) { return row + 1; }, [](auto row) { return row == 3; });
    // [0, 1, 2, 3]
    auto outputCol1 = makeFlatVector<int64_t>(
            size, [](auto row) { return row; }, nullptr);
    // [(1, 0), (2, 1), (3, 2), (null, 3)]
    auto expected = ExprTest::makeRowVector({outputCol0, outputCol1});

    assertEqualVectors(expected, result);
}
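
The sizes involved can be modeled without Velox. The sketch below is a toy model, not Velox code: `ToyLazyColumn`, `loadUpToLastRow`, and `sizeSeenBySecondConsumer` are hypothetical names, and it assumes the column is materialized once, by whichever consumer touches it first. It also assumes, as one plausible reading of the failure, that `c0 + c1` asks only for rows 0..2 because `c0` is null in row 3.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Toy stand-in for a lazily loaded column. Hypothetical; not Velox's LazyVector.
class ToyLazyColumn {
 public:
  using Loader = std::function<std::vector<int64_t>(const std::vector<int>&)>;

  explicit ToyLazyColumn(Loader loader) : loader_(std::move(loader)) {}

  // Loads on first use only. Later calls reuse the cached values, even if
  // they ask for rows that the first call did not request.
  const std::vector<int64_t>& load(const std::vector<int>& rows) {
    if (!loaded_.has_value()) {
      loaded_ = loader_(rows);
    }
    return *loaded_;
  }

 private:
  Loader loader_;
  std::optional<std::vector<int64_t>> loaded_;
};

// Mirrors the loader in the test: the result covers rows [0, rows.back()]
// and stores the row number as the value. Assumes `rows` is non-empty.
std::vector<int64_t> loadUpToLastRow(const std::vector<int>& rows) {
  std::vector<int64_t> values(static_cast<std::size_t>(rows.back()) + 1);
  for (std::size_t i = 0; i < values.size(); ++i) {
    values[i] = static_cast<int64_t>(i);
  }
  return values;
}

// Number of values a second consumer sees when an earlier consumer has
// already triggered the load with `firstRows`.
std::size_t sizeSeenBySecondConsumer(
    const std::vector<int>& firstRows,
    const std::vector<int>& secondRows) {
  ToyLazyColumn c1(loadUpToLastRow);
  c1.load(firstRows);
  return c1.load(secondRows).size();
}
```

Under these assumptions, `sizeSeenBySecondConsumer({0, 1, 2}, {0, 1, 2, 3})` gives 3, so a consumer that needs row 3 has nothing to read. Loading all four rows first gives 4, which corresponds to workaround 1 in the test.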

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

@barsondei Thank you for the example query. I implemented it in a unit test and I’m seeing failures when evaluating this query. Will investigate. BTW, let me know if you’d like to help with investigating or fixing.


TEST_F(TableScanTest, bug) {
  auto data = makeRowVector({
      makeNullableFlatVector<int64_t>({1, 1, 1, std::nullopt}),
      makeNullableFlatVector<int64_t>({0, 1, 2, 3}),
  });

  auto filePath = TempFilePath::create();
  writeToFile(filePath->path, {data});
  createDuckDbTable({data});

  auto plan =
      PlanBuilder()
          .tableScan(asRowType(data->type()))
          .filter(
              "element_at(array_constructor(c0 + c1, if(c1 >= 0, c1, 0)), 0) > 0")
          .planNode();
  assertQuery(plan, {filePath}, "SELECT null, null");
}

The error I’m getting is:

C++ exception with description "Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (4 vs. 3)
Retriable: False
Expression: rows.end() <= vector.size()
Context: switch(gte(c1, 0:BIGINT), c1, 0:BIGINT)
Top-Level Context: gt(element_at(array_constructor(plus(c0, c1), switch(gte(c1, 0:BIGINT), c1, 0:BIGINT)), 0:BIGINT), 0:BIGINT)
Function: makeIndices
File: /Users/mbasmanova/cpp/velox-1/velox/vector/DecodedVector.cpp
Line: 98
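
The `Reason` and `Expression` fields read together suggest that the selected rows end at 4 while the vector holds only 3 values. A minimal sketch of that kind of bounds check follows. It is an illustration only: `ToySelectivity`, `checkRowsFit`, and `checkMessage` are hypothetical names and do not reproduce the code in DecodedVector.cpp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a row-selection bitmap; not Velox's SelectivityVector.
struct ToySelectivity {
  std::vector<bool> selected;

  // One past the last selected row, or 0 when nothing is selected.
  std::size_t end() const {
    for (std::size_t i = selected.size(); i > 0; --i) {
      if (selected[i - 1]) {
        return i;
      }
    }
    return 0;
  }
};

// Throws when the selection reaches past the values that are available.
void checkRowsFit(const ToySelectivity& rows, std::size_t vectorSize) {
  if (rows.end() > vectorSize) {
    throw std::out_of_range(
        "(" + std::to_string(rows.end()) + " vs. " +
        std::to_string(vectorSize) + ")");
  }
}

// Returns the failure message, or an empty string when the check passes.
std::string checkMessage(const ToySelectivity& rows, std::size_t vectorSize) {
  try {
    checkRowsFit(rows, vectorSize);
    return "";
  } catch (const std::out_of_range& e) {
    return e.what();
  }
}
```

With four selected rows and three available values, `checkMessage` returns `(4 vs. 3)`; with four available values it returns an empty string.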

OK. I'd be happy to fix it.

element_at( row_constructor(c0 + c1, if (c1 >= 0, c1, 0)), 0) > 0

I’m not clear on how the TableScan operator creates LazyVectors. If a LazyVector loads only the rows specified by the bitmap, the bug will also occur when running SQL in an end-to-end environment. SQL:

select * from 
  table
  where
  element_at( row_constructor(c0 + c1, if (c1 >= 0, c1, 0)), 0) > 0