TileDB-Py: error on call to consolidate

With the most recent release, I am getting an error on consolidate. Something odd seems to be happening on the second dimension (based upon the error message).

Traceback (most recent call last):
  File "./bug.py", line 38, in <module>
    main()
  File "./bug.py", line 34, in main
    tiledb.consolidate(name)
  File "tiledb/libtiledb.pyx", line 4420, in tiledb.libtiledb.consolidate
  File "tiledb/libtiledb.pyx", line 387, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 372, in tiledb.libtiledb._raise_tiledb_error
tiledb.libtiledb.TileDBError: [TileDB::Query] Error: Subarray out of bounds. subarray: [0, 27999, 0, 20699] domain: [0, 27999, 0, 20645]

To reproduce:

import tiledb
import numpy as np


def make_array(name, shape):

    filters = tiledb.FilterList([
        tiledb.ZstdFilter(),
    ])
    attrs = [
        tiledb.Attr(dtype=np.float32, filters=filters)
    ]
    domain = tiledb.Domain(tiledb.Dim(name="obs", domain=(0, shape[0] - 1), tile=min(shape[0], 200), dtype=np.uint32),
                           tiledb.Dim(name="var", domain=(0, shape[1] - 1), tile=min(shape[1], 100), dtype=np.uint32))

    schema = tiledb.ArraySchema(domain=domain, sparse=False, attrs=attrs,
                                cell_order='row-major', tile_order='row-major')
    tiledb.DenseArray.create(name, schema)


def main():

    shape = (28000, 20646)
    name = "X"
    make_array(name, shape)

    stride = int(np.power(10, np.around(np.log10(1e8 / shape[1]))))
    with tiledb.DenseArray(name, mode='w') as X:
        for row in range(0, shape[0], stride):
            lim = min(row+stride, shape[0])
            print(row, lim)
            X[row:lim, :] = np.random.rand(lim-row, shape[1])

        tiledb.consolidate(name)


if __name__ == '__main__':
    main()

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (8 by maintainers)

Most upvoted comments

It worked great. Thanks for the resolution.

Thank you for the fast fix. I have verified that it works on my end as well.

@bkmartinjr we identified the issue, thanks for raising it. It is an internal issue: during consolidation, dense tiles must be read in their entirety before they are written to the new consolidated fragment. Your domain is not divisible by the tile extent, so the last tile extends past the domain bounds, which is what triggers the "Subarray out of bounds" error. We’ll fix this shortly.

In the meantime, could you please expand your column domain so that it is divisible by the column tile extent? This will work around the problem for now. After my patch, you will again be able to define arbitrary domains, including ones that are not divisible by the tile extents.
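
As a rough sketch of that workaround (not the maintainers' patch), the snippet below rounds each dimension of the reproduction's shape up to the next multiple of its tile extent. The helper name round_up_to_tile is hypothetical and introduced only for illustration; the shape and tile extents are taken from the script above.

```python
def round_up_to_tile(size, tile):
    """Return the smallest multiple of `tile` that is >= `size` (hypothetical helper)."""
    return -(-size // tile) * tile  # ceiling division, then scale back up


# Shape and tile extents from the reproduction script above.
shape = (28000, 20646)
tiles = (min(shape[0], 200), min(shape[1], 100))

# Padded sizes that are exact multiples of the tile extents.
padded = tuple(round_up_to_tile(s, t) for s, t in zip(shape, tiles))

# Inclusive upper bounds to pass as domain=(0, upper) when creating each tiledb.Dim.
upper_bounds = tuple(p - 1 for p in padded)

print(padded)
print(upper_bounds)
```

The padded column bound comes out as 20699, which matches the subarray reported in the error message, while the original domain ends at 20645. With a padded domain, a write such as X[row:lim, :] would presumably need to be restricted to the real data, for example X[row:lim, :shape[1]], since the full column slice would then cover the padding as well.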