io: tfio.IODataset.from_parquet fails to load array elements
import pandas as pd
import tensorflow_io as tfio

test_data = pd.DataFrame({'a': [[1, 2, 3], [4, 5, 6]], 'b': ['q', 'p']})
test_data.to_parquet('a.parquet')
another_data = tfio.experimental.IODataset.from_parquet('a.parquet').as_numpy_iterator()
next(another_data)
The result is:
OrderedDict([('a.list.item', 1), ('b', b'q')])
It seems that only the first element of each array is loaded.
About this issue
- State: open
- Created 3 years ago
- Reactions: 14
- Comments: 18
I have the same need. When I use tfio.IODataset.from_parquet to load a similar parquet file with an array column, it always reports “tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 0 of dimension 0 out of bounds. [Op:StridedSlice] name: IOFromParquet/ParquetIODataset/strided_slice/”
Having the same issue. I’d love to be using IODataset.from_parquet instead of putting together a custom generator and creating a dataset from that. Any idea if/when this issue will be picked up?
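For anyone who needs a stopgap, the custom-generator approach mentioned above might look roughly like the sketch below. It is not part of tensorflow-io and does not fix the underlying issue. The helper names iter_rows and parquet_to_dataset are made up for illustration, the schema is assumed to match the 'a' and 'b' columns from the original example, and it assumes pandas can read the file (for example with pyarrow installed) and a TensorFlow version that supports output_signature (2.4 or later).

```python
def iter_rows(columns):
    """Yield one {column_name: value} dict per row of a column-oriented mapping."""
    names = list(columns)
    for values in zip(*(columns[name] for name in names)):
        yield dict(zip(names, values))


def parquet_to_dataset(path):
    """Build a tf.data.Dataset from a parquet file via a Python generator."""
    # Imported here so that iter_rows stays usable without these packages.
    import pandas as pd
    import tensorflow as tf

    # Reads the whole file into memory; fine for small files only.
    df = pd.read_parquet(path)
    columns = {name: df[name].tolist() for name in df.columns}

    # Assumed schema, matching the example in this issue:
    # 'a' is a variable-length list of integers, 'b' is a string.
    signature = {
        'a': tf.TensorSpec(shape=(None,), dtype=tf.int64),
        'b': tf.TensorSpec(shape=(), dtype=tf.string),
    }
    return tf.data.Dataset.from_generator(
        lambda: iter_rows(columns), output_signature=signature)
```

Calling parquet_to_dataset('a.parquet') should then give a dataset whose elements are dicts holding the full array for 'a'. Because the file is read eagerly with pandas, this only suits data that fits in memory; for larger files the generator would need to read in batches instead.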