io: tfio.IODataset.from_parquet fails to load array elements

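import pandas as pd
import tensorflow_io as tfio

# Write a parquet file with a list column 'a' and a string column 'b'.
test_data = pd.DataFrame({'a':[[1,2,3],[4,5,6]], 'b':['q','p']})
test_data.to_parquet('a.parquet')
another_data = tfio.experimental.IODataset.from_parquet('a.parquet').as_numpy_iterator()
next(another_data)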

The result is:

OrderedDict([('a.list.item', 1), ('b', b'q')])

It seems that only the first element of each array is loaded.

About this issue

  • State: open
  • Created 3 years ago
  • Reactions: 14
  • Comments: 18

Most upvoted comments

I have the same need. When I use tfio.IODataset.from_parquet to load a parquet file with array elements, it always reports “tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 0 of dimension 0 out of bounds. [Op:StridedSlice] name: IOFromParquet/ParquetIODataset/strided_slice/”

Having the same issue. I’d love to be using IODataset.from_parquet instead of putting together a custom generator and creating a dataset from that. Any idea if/when this issue will be picked up?
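
For anyone landing here, the custom-generator workaround mentioned above might look roughly like the sketch below. It is not a fix for from_parquet itself. It assumes the file fits in memory, that pandas can read it (for example via pyarrow), and that the columns are the 'a' (list of integers) and 'b' (string) columns from the report. The helper names iter_rows and parquet_to_dataset are made up for illustration.

```python
def iter_rows(columns):
    """Yield one dict per row from a mapping of column name -> list of values."""
    names = list(columns)
    for values in zip(*(columns[name] for name in names)):
        yield dict(zip(names, values))


def parquet_to_dataset(path):
    """Build a tf.data.Dataset from a parquet file with list column 'a' and string column 'b'.

    Hypothetical helper: the column names, dtypes and shapes are assumptions
    taken from the example in this issue, not something tfio provides.
    """
    # Imported here so that iter_rows stays usable without these packages.
    import pandas as pd
    import tensorflow as tf

    df = pd.read_parquet(path)  # loads the whole file into memory
    columns = {
        "a": [list(item) for item in df["a"]],  # keep every element of each array
        "b": df["b"].tolist(),
    }
    signature = {
        "a": tf.TensorSpec(shape=(None,), dtype=tf.int64),  # variable-length list
        "b": tf.TensorSpec(shape=(), dtype=tf.string),
    }
    return tf.data.Dataset.from_generator(
        lambda: iter_rows(columns), output_signature=signature
    )
```

With the file from the report, next(parquet_to_dataset('a.parquet').as_numpy_iterator()) should then give the full array for 'a' rather than only its first element. This reads everything eagerly, so it loses the streaming behaviour of IODataset.from_parquet and is only a stopgap for small files.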