tensorflow: InvalidArgumentError when running map_fn on strings inside a tf.function

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • TensorFlow installed from (source or binary): binary (conda)
  • TensorFlow version (use command below): 2.0-alpha (conda install tensorflow-gpu==2.0-alpha)
  • Python version: 3.7.1
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: cudatoolkit-10.0.130-0, cudnn-7.3.1-cuda10.0_0
  • GPU model and memory: GeForce RTX 2080 Ti

Describe the current behavior

Running the provided code on a GPU fails with tensorflow.python.framework.errors_impl.InvalidArgumentError: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string. Without feeding the tensor to the convolution layer, tf.summary.image succeeds.

Describe the expected behavior

The code should run without errors.

Code to reproduce the issue

import tensorflow as tf
from tensorflow.keras import layers

H, W, C = 10, 10, 3
imgs = tf.zeros([10, H, W, C])
ds = tf.data.Dataset.from_tensor_slices(imgs)
ds = ds.batch(2)
conv = layers.Conv2D(32, (4, 4), strides=(2, 2), padding='same')


@tf.function
def run(img, i):
    conv(img)
    tf.summary.image('img', img, i)


if __name__ == "__main__":
    train_summary_writer = tf.summary.create_file_writer('/tmp/testsummary')
    with train_summary_writer.as_default():
        for i, img in enumerate(ds):
            run(img, i)

Other info / logs

TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2019-04-20 14:44:30.818841: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1700000000 Hz
2019-04-20 14:44:30.819976: I tensorflow/compiler/xla/service/service.cc:162] XLA service 0x55b6fa788f50 executing computations on platform Host. Devices:
2019-04-20 14:44:30.820029: I tensorflow/compiler/xla/service/service.cc:169]   StreamExecutor device (0): <undefined>, <undefined>
2019-04-20 14:44:30.825689: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-04-20 14:44:31.062487: I tensorflow/compiler/xla/service/service.cc:162] XLA service 0x55b6fc634120 executing computations on platform CUDA. Devices:
2019-04-20 14:44:31.062554: I tensorflow/compiler/xla/service/service.cc:169]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2019-04-20 14:44:31.063894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1467] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.635
pciBusID: 0000:19:00.0
totalMemory: 10.73GiB freeMemory: 10.57GiB
2019-04-20 14:44:31.063942: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1546] Adding visible gpu devices: 0
2019-04-20 14:44:31.064034: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-04-20 14:44:31.067082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1015] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-04-20 14:44:31.067114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021]      0
2019-04-20 14:44:31.067130: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1034] 0:   N
2019-04-20 14:44:31.068283: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1149] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10284 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:19:00.0, compute capability: 7.5)
2019-04-20 14:44:33.628228: W tensorflow/core/common_runtime/base_collective_executor.cc:214] BaseCollectiveExecutor::StartAbort Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
         [[{{node img_1/encode_each_image/while/body/_1/TensorArrayV2Write/TensorListSetItem/_54}}]]
         [[img_1/encode_each_image/while/loop_body_control/_19/_33]]
2019-04-20 14:44:33.628374: W tensorflow/core/common_runtime/base_collective_executor.cc:214] BaseCollectiveExecutor::StartAbort Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
         [[{{node img_1/encode_each_image/while/body/_1/TensorArrayV2Write/TensorListSetItem/_54}}]]
2019-04-20 14:44:33.628468: E tensorflow/core/common_runtime/process_function_library_runtime.cc:764] Component function execution failed: Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
         [[{{node img_1/encode_each_image/while/body/_1/TensorArrayV2Write/TensorListSetItem/_54}}]]
         [[img_1/encode_each_image/while/loop_body_control/_19/_33]]
2019-04-20 14:44:33.628456: E tensorflow/core/common_runtime/process_function_library_runtime.cc:764] Component function execution failed: Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
         [[{{node img_1/encode_each_image/while/body/_1/TensorArrayV2Write/TensorListSetItem/_54}}]]
Traceback (most recent call last):
  File "test.py", line 21, in <module>
    run(img, i)
  File "/home/swang150/.pyenv/versions/miniconda3-latest/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 438, in __call__
    return self._stateless_fn(*args, **kwds)
  File "/home/swang150/.pyenv/versions/miniconda3-latest/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 1288, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/swang150/.pyenv/versions/miniconda3-latest/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 574, in _filtered_call
    (t for t in nest.flatten((args, kwargs))
  File "/home/swang150/.pyenv/versions/miniconda3-latest/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 627, in _call_flat
    outputs = self._inference_function.call(ctx, args)
  File "/home/swang150/.pyenv/versions/miniconda3-latest/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 415, in call
    ctx=ctx)
  File "/home/swang150/.pyenv/versions/miniconda3-latest/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 66, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
         [[{{node img_1/encode_each_image/while/body/_1/TensorArrayV2Write/TensorListSetItem/_54}}]]
         [[img_1/encode_each_image/while/loop_body_control/_19/_33]] [Op:__inference_run_343]

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 10
  • Comments: 50 (18 by maintainers)

Most upvoted comments

Sorry about my poor English. I have the same problem, but I found a solution. I'm using an Nvidia 2080 Ti, tf-nightly-gpu-2.0-preview, Python 3.7.3, Ubuntu 19.04. When I used tf.summary.image("gen", generated_images, max_outputs=25, step=0), I got the error: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string. If I wrote it like this:

with tf.device("cpu:0"): <<-- add this line
   with log["writer"].as_default():
     tf.summary.image("gen", generated_images, max_outputs=25, step=0)

everything is fine.
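For reference, here is a minimal sketch of how that workaround could be applied to the repro from the issue body (assuming the same conv, ds, and summary writer setup as above; untested, and later comments report it does not help on every TF 2.x version):

@tf.function
def run(img, i):
    conv(img)                      # the convolution still runs on the GPU
    with tf.device("/cpu:0"):      # pin only the summary op (and its string-handling ops) to the CPU
        tf.summary.image('img', img, step=i)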

The issue should be renamed to something like: “InvalidArgumentError when running map_fn on strings inside a tf.function”.

Here is an even smaller code snippet to reproduce the error (to run on GPU):

import tensorflow as tf

@tf.function
def f():
    return tf.map_fn(tf.strings.upper, tf.constant(["a", "b", "c"]))

print(f())

Well, this seems to be just a workaround to me. The main issue here is that the summary operation raises an error when running on GPU. Forcing the operation to run on the CPU doesn't really solve the problem; it just sidesteps it. I don't know how the summary operation works internally; possibly, even when running on GPU, it would still copy the tensor back to CPU memory (which would then be similar to explicitly asking it to run on the CPU). Even if that is the case (and if not, we lose some efficiency), from an API point of view I don't think this issue is solved, since someone else might encounter the same problem and not know why it happens or how to work around it without stumbling onto this thread.
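As an aside (not from the thread), device-placement logging can help confirm where the string-handling ops actually end up. A minimal sketch reusing the map_fn repro from the previous comment; on a GPU machine it still hits the error, but the log shows which device each op was assigned to:

import tensorflow as tf

# Print the device every op is placed on when it executes.
tf.debugging.set_log_device_placement(True)

@tf.function
def f():
    return tf.map_fn(tf.strings.upper, tf.constant(["a", "b", "c"]))

print(f())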

We’re looking into this now. Should have some updates soon.

I have the same error, but it is raised when I use TensorFlow Serving (GPU version).

My model includes the function below:

def preprocess_and_decode(img_str, new_shape=target_size):
    img = tf.io.decode_base64(img_str)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, new_shape, method=method)
    return img

input64 = tf.keras.layers.Input(shape=(1,), dtype="string", name=input_name)
output_tensor = tf.keras.layers.Lambda(
    lambda img: tf.map_fn(lambda im: preprocess_and_decode(im[0]), img, dtype="float32"))(input64)

It's OK to deploy with CPU serving, but I get an error like the one below with GPU serving:

'{ "error": "2 root error(s) found.\\n (0) Invalid argument: 2 root error(s) found.\\n (0) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string\\n (1) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string\\n0 successful operations.\\n0 derived errors ignored.\\n\\t [[{{node model_11/lambda_16/map/TensorArrayUnstack/TensorListFromTensor}}]]\\n (1) Invalid argument: 2 root error(s) found.\\n (0) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string\\n (1) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string\\n0 successful operations.\\n0 derived errors ignored.\\n\\t [[{{node model_11/lambda_16/map/TensorArrayUnstack/TensorListFromTensor}}]]\\n\\t [[Func/StatefulPartitionedCall/StatefulPartitionedCall/model_11/lambda_16/map/while/body/_887/input/_935/_935]]\\n0 successful operations.\\n0 derived errors ignored." }'

Any solutions for this?

I know the root cause for this issue is not in the tf.summary module, but for those who get here because of using tf.summary.image() inside @tf.function, my workaround is to return the outputs and do summaries outside:

@tf.function
def train_op(inputs):
  outputs = net(inputs)
  # handle loss and gradients...
  return outputs

def train():
  for data in dataloader:
    outputs = train_op(data)
    with summary_writer.as_default():
      tf.summary.image('image', outputs)

@ipod825 I have the same problem (I did try the TF 2.0 alphas and betas) and agree that assigning the summary op to /cpu:0 is only a workaround. Moreover, the fix does not work for me if I build the r2.0 branch from source. It would be nice if this issue were reopened so the problem can be solved.

I took a look at the code on GitHub: the map_fn on line 75 is causing the issue.

I am also facing this issue on GPU (no error on CPU) when using map_fn on a string tensor (with a float tensor everything works):

def process_string(sample):
    # Here I want to write something to a file, but even with the identity
    # function I get the "non-DMA-copy attempted..." error
    return sample

@tf.function
def f(self, y_true, y_pred):
    string_tensor = y_true["path"]  # Tensor("data_batch_19:0", shape=(2,), dtype=string)
    tf.map_fn(process_string, string_tensor)

I do not know why, but for me, manually placing map_fn on the CPU AND returning something (even a mock value) from the tf.function helped:

@tf.function
def f(self, y_true, y_pred):
    string_tensor = y_true["path"]  # Tensor("data_batch_19:0", shape=(2,), dtype=string)
    with tf.device("/cpu:0"):
        tf.map_fn(process_string, string_tensor)
    return y_pred  # does not work for me without returning something

I am using TF 2.0.1.

Hi, thanks, this trick works for my case too.

Also having the same issue using TF 2.1. It works fine on a machine with just a CPU, but fails on a machine with a GPU, even when using with tf.device('/cpu:0'). Would appreciate an update on this ASAP.

I also have this problem; it works well on CPU but not on GPU. How can I solve it?

Is this being fixed or addressed by someone? For me, in TF 2.0 not even the tf.device hint worked. The following code (TF 2.0, CUDA 10.0, GTX 1080) did not work for me and failed with the same error message as reported above (Invalid argument: During Variant Host->Device Copy: non-DMA- …):

import tensorflow as tf

writer = tf.summary.create_file_writer("/tmp/mylogs/tf_function")


@tf.function
def my_func(image, step):
    with tf.device("/cpu:0"):
        tf.summary.image("my_image_metric", image, step=step)


image = tf.constant(
    [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]]
)[None]

with writer.as_default():
    for step in tf.range(100, dtype=tf.int64):
        my_func(image, step)
        writer.flush()
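For comparison, the workaround reported earlier in this thread places the device scope outside the tf.function rather than inside it. A hedged sketch of that variant for the snippet above (the name my_func_no_device is mine; whether this avoids the error appears to depend on the TF version):

@tf.function
def my_func_no_device(image, step):
    tf.summary.image("my_image_metric", image, step=step)

# Device scope around the eager call site instead of inside the traced function.
with tf.device("/cpu:0"):
    with writer.as_default():
        for step in tf.range(100, dtype=tf.int64):
            my_func_no_device(image, step)
            writer.flush()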

@jvishnuvardhan I tried the notebook you posted, and it works because the notebook’s runtime isn’t using the GPU. After changing the runtime to a GPU-accelerated one, it fails with the error @ageron posted.

Actually, I just ran the Colab gist that I shared in my earlier post, but this time with a GPU runtime. I'm now seeing the same error message that you reported, so it seems to be a GPU-related issue. Can you open a new bug with all of this information? Thanks!

Closing this issue now since the bug has been fixed.

Is there a plan to fix this, or a suggested workaround?

I also ran into this issue. Here’s a fairly minimal piece of code that reproduces it:

import tensorflow.compat.v2 as tf
tf.enable_v2_behavior()

def decode_png(data):
  return tf.image.decode_png(data)

@tf.function  # <= No exception if you comment this line out
def decode_all(images):
  return tf.map_fn(decode_png, images, dtype=tf.uint8)

img = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01\x08\x06\x00\x00\x00\x1f\x15\xc4\x89\x00\x00\x00\rIDATx\xdac\xfc\xcf\xf0\xbf\x1e\x00\x06\x83\x02\x7f\x94\xad\xd0\xeb\x00\x00\x00\x00IEND\xaeB`\x82'
images = tf.constant([img, img])
decode_all(images)

and here's the full stack trace:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-72-a59f4c54298a> in <module>()
     11 img = b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01\x08\x06\x00\x00\x00\x1f\x15\xc4\x89\x00\x00\x00\rIDATx\xdac\xfc\xcf\xf0\xbf\x1e\x00\x06\x83\x02\x7f\x94\xad\xd0\xeb\x00\x00\x00\x00IEND\xaeB`\x82'
     12 images = tf.constant([img, img])
---> 13 decode_all(images)

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/def_function.py in __call__(self, *args, **kwds)
    465               *args, **kwds)
    466       # If we did not create any variables the trace we have is good enough.
--> 467       return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds)  # pylint: disable=protected-access
    468 
    469     def fn_with_cond(*inner_args, **inner_kwds):

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py in _filtered_call(self, args, kwargs)
   1139          if isinstance(t, (ops.Tensor,
   1140                            resource_variable_ops.BaseResourceVariable))),
-> 1141         self.captured_inputs)
   1142 
   1143   def _call_flat(self, args, captured_inputs, cancellation_manager=None):

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
   1222     if executing_eagerly:
   1223       flat_outputs = forward_function.call(
-> 1224           ctx, args, cancellation_manager=cancellation_manager)
   1225     else:
   1226       gradient_name = self._delayed_rewrite_functions.register()

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py in call(self, ctx, args, cancellation_manager)
    509               inputs=args,
    510               attrs=("executor_type", executor_type, "config_proto", config),
--> 511               ctx=ctx)
    512         else:
    513           outputs = execute.execute_with_cancellation(

/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     65     else:
     66       message = e.message
---> 67     six.raise_from(core._status_to_exception(e.code, message), None)
     68   except TypeError as e:
     69     keras_symbolic_tensors = [

/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)

InvalidArgumentError: 2 root error(s) found.
  (0) Invalid argument:  2 root error(s) found.
  (0) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
  (1) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
0 successful operations.
0 derived errors ignored.
	 [[{{node map/TensorArrayUnstack/TensorListFromTensor/_12}}]]
  (1) Invalid argument:  2 root error(s) found.
  (0) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
  (1) Invalid argument: During Variant Host->Device Copy: non-DMA-copy attempted of tensor type: string
0 successful operations.
0 derived errors ignored.
	 [[{{node map/TensorArrayUnstack/TensorListFromTensor/_12}}]]
	 [[Func/map/while/body/_1/input/_43/_24]]
0 successful operations.
0 derived errors ignored. [Op:__inference_decode_all_20554]

Function call stack:
decode_all -> decode_all

I ran this on Colab with a GPU Runtime, using TF 1.15.0rc3. It will probably bomb as well on TF 2.0.0 but I haven’t tried.
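A hedged sketch of the CPU-pinning workaround from earlier in this thread applied to that snippet (the name decode_all_cpu is mine; assumes the decode_png and images definitions above, and I have not verified it on TF 1.15 or 2.0):

@tf.function
def decode_all_cpu(images):
  # Pin the string-handling map_fn to the CPU (workaround reported earlier in this thread).
  with tf.device("/cpu:0"):
    return tf.map_fn(decode_png, images, dtype=tf.uint8)

decode_all_cpu(images)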

@nikitamaia Yes, this is fixed on TensorFlow 2.3.0. I tested this on the colab and also on my local machine (running TF 2.3.0 on Arch Linux).

This should be fixed for the simple tf.map_fn example; however, the underlying problem is still there and might be triggered in more complex use cases. The fix commit has a repro with an explanation.

I think the issue is incorrectly assigned, as it is not directly related to tf.summary. @tensorflow/dev-support, can this be reassigned to someone working on functional ops such as tf.map_fn?

@rharish101 Thanks! Got it. This is not resolved. Thanks!