tensorflow: dataset as_numpy_iterator() failing when using RaggedTensor

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 11.6 / docker tensorflow:2.5.0-gpu
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 2.5.0 and 2.6.0
  • Python version: 3.6.9
  • Bazel version (if compiling from source): N/A
  • GCC/Compiler version (if compiling from source): N/A
  • CUDA/cuDNN version: N/A
  • GPU model and memory: N/A

Describe the current behavior

Calling as_numpy_iterator() on a tf.data.Dataset whose element spec includes a RaggedTensorSpec fails with the following error:

  File "/Users/xx/Library/Caches/pypoetry/virtualenvs/sandbox-hasb3I3q-py3.6/lib/python3.6/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 4685, in to_numpy
    numpy = x._numpy()  # pylint: disable=protected-access
AttributeError: 'RaggedTensor' object has no attribute '_numpy'

Sample code to reproduce the issue:

import tensorflow as tf

def gen():
    ragged_tensor = tf.ragged.constant([[1, 2], [3]])
    yield 42, ragged_tensor

dataset = tf.data.Dataset.from_generator(gen, output_signature=(
                tf.TensorSpec(shape=(), dtype=tf.int32),
                tf.RaggedTensorSpec(shape=(2, None), dtype=tf.int32)))

iterator = dataset.as_numpy_iterator()
# Fails with: AttributeError: 'RaggedTensor' object has no attribute '_numpy'
print(next(iterator))

This regression was introduced by this commit

Describe the expected behavior

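as_numpy_iterator() should return the elements of a dataset containing RaggedTensor components without failing, just as it does for Tensor components. Alternatively, a workaround or an alternative usage that makes this work would be welcome.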

Contributing

  • Do you want to contribute a PR? (yes/no): no
  • Briefly describe your candidate solution(if contributing):


About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (5 by maintainers)

Most upvoted comments

@kingatlas,

Instead of using iterator = dataset.as_numpy_iterator(), can you try using iterator = iter(dataset) to iterate through the dataset? You can take a look at the working gist here. Thanks!
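For readers without access to the gist, here is a minimal sketch of what that workaround might look like. It reuses the generator from the report and iterates the dataset directly. The helper names to_numpy_or_list and numpy_like_iterator are hypothetical (they are not part of TensorFlow), and converting the ragged component with RaggedTensor.to_list() is an assumption about what the caller wants; it is not what as_numpy_iterator() itself would return.

```python
import tensorflow as tf


def gen():
    # Same element as in the report: a scalar and a ragged tensor.
    yield 42, tf.ragged.constant([[1, 2], [3]])


dataset = tf.data.Dataset.from_generator(
    gen,
    output_signature=(
        tf.TensorSpec(shape=(), dtype=tf.int32),
        tf.RaggedTensorSpec(shape=(2, None), dtype=tf.int32),
    ),
)


def to_numpy_or_list(x):
    # Hypothetical helper: ragged components become nested Python lists,
    # dense eager tensors become NumPy values.
    if isinstance(x, tf.RaggedTensor):
        return x.to_list()
    return x.numpy()


def numpy_like_iterator(ds):
    # Hypothetical stand-in for as_numpy_iterator(): iterate eagerly and
    # convert each component. tf.nest treats a RaggedTensor as a single
    # leaf by default, so the helper receives it whole.
    for element in ds:
        yield tf.nest.map_structure(to_numpy_or_list, element)


results = list(numpy_like_iterator(dataset))
print(results)
```

Plain iteration with iter(dataset) yields eager tensors, so callers who only need the values can also call .numpy() or .to_list() on each component themselves instead of using a wrapper.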

I was able to reproduce the issue in TF v2.6, v2.7, v2.8, and nightly (2.10.0-dev20220407). Please find the gist here.