tensorflow: RaggedTensor casting bug

Version 2.0.0-alpha. Nested RaggedTensors appear to be cast to int64 for no apparent reason:

with regular tensors (ok):

>>> tf.constant([[1]], dtype=tf.int8)
<tf.Tensor: id=98, shape=(1, 1), dtype=int8, numpy=array([[1]], dtype=int8)>

with nested RaggedTensor (not ok):

>>> tf.ragged.constant([[1]], dtype=tf.int8)
tf.RaggedTensor(values=tf.Tensor([1], shape=(1,), dtype=int8), row_splits=tf.Tensor([0 1], shape=(2,), dtype=int64))
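Reading the repr above, the values tensor keeps dtype=int8; only row_splits is int64. row_splits is an index vector marking where each row starts and ends inside the flat values, so its dtype is independent of the values dtype. The following is a minimal pure-Python sketch of that encoding, not TensorFlow's implementation; the helper names are hypothetical and exist only for illustration.

```python
def to_values_and_row_splits(rows):
    """Flatten a list of rows into (values, row_splits).

    Hypothetical helper: mimics the row_splits layout with plain lists.
    """
    values = []
    row_splits = [0]  # row i occupies values[row_splits[i]:row_splits[i + 1]]
    for row in rows:
        values.extend(row)
        row_splits.append(len(values))
    return values, row_splits


def from_values_and_row_splits(values, row_splits):
    """Rebuild the nested rows from the flat encoding."""
    return [
        values[start:end]
        for start, end in zip(row_splits[:-1], row_splits[1:])
    ]


values, row_splits = to_values_and_row_splits([[1]])
print(values, row_splits)  # [1] [0, 1]
```

The printed pair matches the values and row_splits shown in the repr: the element type applies to values, while row_splits only ever holds offsets.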

They also cannot be used to create generators, even with dtype=int64. The following code leads to:

The expected type was int64, but the yielded element was <tf.RaggedTensor [[6]]>.


import tensorflow as tf

class LineGenerator(object):
  def get_next_line(self):
    while True:
      out = [[6]]
      yield tf.ragged.constant(out, dtype=tf.int64)

class Dataset(object):
  def __init__(self, generator=LineGenerator()):
    self.next_element = self.build_iterator(generator)

  def build_iterator(self, gen: LineGenerator):
    dataset = tf.data.Dataset.from_generator(gen.get_next_line, output_types=tf.int64)
    # some other code...

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 19 (11 by maintainers)

Most upvoted comments

@stekiri This is a bug with Dataset.from_generator. #37400 is a PR to fix it, though I think it’s been stalled for a little while on making sure it doesn’t break a test.
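For readers on a newer TensorFlow release, here is a hedged sketch of how a generator of RaggedTensors can be declared. It assumes a version in which from_generator accepts the output_signature argument together with tf.RaggedTensorSpec (TensorFlow 2.4 or later, to the best of my knowledge); it does not apply to 2.0.0-alpha, and whether it corresponds to the PR mentioned above is not something this thread confirms.

```python
import tensorflow as tf


def get_next_line():
    # Same toy generator as in the report: yields one ragged row forever.
    while True:
        yield tf.ragged.constant([[6]], dtype=tf.int64)


# Assumption: output_signature and RaggedTensorSpec are available here.
dataset = tf.data.Dataset.from_generator(
    get_next_line,
    output_signature=tf.RaggedTensorSpec(shape=[None, None], dtype=tf.int64),
)

# The generator is infinite, so take a single element before iterating.
element = next(iter(dataset.take(1)))
print(element)
```

Declaring a RaggedTensorSpec tells tf.data to expect a ragged structure rather than a single dense int64 tensor, which is what output_types=tf.int64 implies.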

tf.data support for RaggedTensors was added by 5fe90dc.

I know. I’m not done yet. I’m trying out various approaches and will create a PR as soon as I’m done.

On Sun, 7 Apr 2019, 7:18 pm ARozental, notifications@github.com wrote:

Thanks, the above solution avoids the conversion bug, though it seems there is another issue with building a generator of RaggedTensors:

tensorflow/core/framework/op_kernel.cc:1419] Unimplemented: Unsupported object type RaggedTensor

stack trace:

  File "/Users/alonrozental/IdeaProjects/AGENT/create_pretraining_data.py", line 92, in
2019-04-07 16:39:02.397857: W tensorflow/core/framework/op_kernel.cc:1419] Unimplemented: Unsupported object type RaggedTensor
    tf.compat.v1.app.run()
  File "/Users/alonrozental/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 40, in run
2019-04-07 16:39:02.398489: W tensorflow/core/framework/op_kernel.cc:1419] Unimplemented: Unsupported object type RaggedTensor
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/Users/alonrozental/.pyenv/versions/3.6.5/lib/python3.6/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/Users/alonrozental/.pyenv/versions/3.6.5/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/Users/alonrozental/IdeaProjects/AGENT/create_pretraining_data.py", line 79, in main
    d = Dataset()
  File "/Users/alonrozental/IdeaProjects/AGENT/create_pretraining_data.py", line 48, in __init__
    self.next_element = self.build_iterator(generator)
  File "/Users/alonrozental/IdeaProjects/AGENT/create_pretraining_data.py", line 60, in build_iterator
    element = iter.get_next()
  File "/Users/alonrozental/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 644, in get_next
    return self._next_internal()
  File "/Users/alonrozental/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 577, in _next_internal
    output_shapes=self._flat_output_shapes)
  File "/Users/alonrozental/.pyenv/versions/3.6.5/lib/python3.6/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1954, in iterator_get_next_sync
    _six.raise_from(_core._status_to_exception(e.code, message), None)
  File "", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnimplementedError: Unsupported object type RaggedTensor [[{{node PyFunc}}]] [Op:IteratorGetNextSync]
2019-04-07 16:39:02.437561: W tensorflow/core/kernels/data/generator_dataset_op.cc:79] Error occurred when finalizing GeneratorDataset iterator: Failed precondition: Python interpreter state is not initialized. The process may be terminated. [[{{node PyFunc}}]]

my code looks like this:

class Dataset(object):
  def __init__(self, generator=LineGenerator()):
    self.next_element = self.build_iterator(generator)

  def build_iterator(self, gen: LineGenerator):
    dataset = tf.data.Dataset.from_generator(gen.get_next_line,output_types = tf.int64) #the right way
    iter = dataset.make_one_shot_iterator()
    element = iter.get_next() #this line gives the error

    return element

d = Dataset()
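One possible way to sidestep the "Unsupported object type RaggedTensor" error is to yield the ragged tensor's dense components from the generator and rebuild the RaggedTensor inside the pipeline. This is a sketch rather than a confirmed fix from this thread: it assumes a TensorFlow version in which Dataset.map can return RaggedTensors and in which the output_types/output_shapes arguments are still accepted.

```python
import tensorflow as tf


def get_next_line():
    # Yield plain arrays (values, row_splits) instead of a RaggedTensor,
    # so the generator only hands dense data to from_generator.
    while True:
        rt = tf.ragged.constant([[6]], dtype=tf.int64)
        yield rt.values.numpy(), rt.row_splits.numpy()


dataset = tf.data.Dataset.from_generator(
    get_next_line,
    output_types=(tf.int64, tf.int64),
    output_shapes=(tf.TensorShape([None]), tf.TensorShape([None])),
)

# Reassemble the ragged structure from its two dense components.
dataset = dataset.map(
    lambda values, row_splits: tf.RaggedTensor.from_row_splits(values, row_splits)
)

element = next(iter(dataset.take(1)))
print(element)
```

The trade-off is that the generator and the pipeline must agree on the component order, and deeper nesting would need one row_splits vector per ragged dimension.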


On second thought, this issue can be solved by modifying the from_generator() function. @dynamicwebpaige @alextp can I work on it?