tensorflow: Debugger V2 not working. Invalid argument: DebugNumericSummaryV2Op requires tensor_id to be less than or equal to (2^53). Given tensor_id:26
System information
- I have used the test example from here
- OS: Windows 10
- TensorFlow 2.3.1 (installed with pip)
- Python 3.6
- CUDA 10.1
- NVIDIA GeForce GTX 1050
I cannot make the example work with Debugger V2.
By executing the example from the link above I get the following output:
D:\src\ai\visualthing\venv\Scripts\python.exe "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2\helpers\pydev\pydevd.py" --multiproc --qt-support=auto --client 127.0.0.1 --port 50790 --file D:/src/ai/visualthing/debug_mnist_v2.py --dump_dir /tmp/tfdbg2_logdir --dump_tensor_debug_mode FULL_HEALTH
pydev debugger: process 8484 is connecting
Connected to pydev debugger (build 192.5728.105)
2020-09-27 20:31:08.451881: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
INFO:tensorflow:Enabled dumping callback in thread MainThread (dump root: /tmp/tfdbg2_logdir, tensor debug mode: FULL_HEALTH)
I0927 20:31:11.284601 1260 dumping_callback.py:871] Enabled dumping callback in thread MainThread (dump root: /tmp/tfdbg2_logdir, tensor debug mode: FULL_HEALTH)
2020-09-27 20:31:11.557685: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-09-27 20:31:11.584474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-09-27 20:31:11.584652: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-27 20:31:11.588047: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-27 20:31:11.591169: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-27 20:31:11.592204: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-27 20:31:11.595773: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-27 20:31:11.597733: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-27 20:31:11.605092: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-27 20:31:11.605244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-27 20:31:11.605644: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-09-27 20:31:11.614513: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1f4c545b410 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-27 20:31:11.614778: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-09-27 20:31:11.615119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 4.00GiB deviceMemoryBandwidth: 104.43GiB/s
2020-09-27 20:31:11.615425: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-09-27 20:31:11.615585: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-09-27 20:31:11.615691: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-09-27 20:31:11.615830: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-09-27 20:31:11.615921: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-09-27 20:31:11.616011: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-09-27 20:31:11.616099: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-09-27 20:31:11.616214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-27 20:31:12.188255: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-27 20:31:12.188425: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-09-27 20:31:12.188484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-09-27 20:31:12.188686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2987 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-09-27 20:31:12.191306: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1f4e366a9f0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-27 20:31:12.191431: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1050, Compute Capability 6.1
2020-09-27 20:31:13.537229: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2\helpers\pydev\pydevd.py", line 2060, in <module>
main()
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2\helpers\pydev\pydevd.py", line 2054, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2\helpers\pydev\pydevd.py", line 1405, in run
return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2\helpers\pydev\pydevd.py", line 1412, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.2\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:/src/ai/visualthing/debug_mnist_v2.py", line 238, in <module>
absl.app.run(main=main, argv=[sys.argv[0]] + unparsed)
File "D:\src\ai\visualthing\venv\lib\site-packages\absl\app.py", line 299, in run
_run_main(main, args)
File "D:\src\ai\visualthing\venv\lib\site-packages\absl\app.py", line 250, in _run_main
sys.exit(main(argv))
File "D:/src/ai/visualthing/debug_mnist_v2.py", line 223, in main
y = model(x_train)
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 780, in __call__
result = self._call(*args, **kwds)
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\def_function.py", line 846, in _call
return self._concrete_stateful_fn._filtered_call(canon_args, canon_kwds) # pylint: disable=protected-access
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\function.py", line 1848, in _filtered_call
cancellation_manager=cancellation_manager)
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\function.py", line 1933, in _call_flat
cancellation_manager=cancellation_manager)
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\function.py", line 550, in call
ctx=ctx)
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\execute.py", line 138, in execute_with_callbacks
tensors = quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
File "D:\src\ai\visualthing\venv\lib\site-packages\tensorflow\python\eager\execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: DebugNumericSummaryV2Op requires tensor_id to be less than or equal to (2^53). Given tensor_id:26
[[{{node StatefulPartitionedCall/MatMul/ReadVariableOp/DebugNumericSummaryV2}}]]
[[x/_1]]
(1) Invalid argument: DebugNumericSummaryV2Op requires tensor_id to be less than or equal to (2^53). Given tensor_id:26
[[{{node StatefulPartitionedCall/MatMul/ReadVariableOp/DebugNumericSummaryV2}}]]
0 successful operations.
0 derived errors ignored. [Op:__forward_model_324]
Function call stack:
model -> model
INFO:tensorflow:Disabled dumping callback in thread MainThread (dump root: /tmp/tfdbg2_logdir)
I0927 20:31:55.200698 1260 dumping_callback.py:895] Disabled dumping callback in thread MainThread (dump root: /tmp/tfdbg2_logdir)
Process finished with exit code 1
I have also tried to build my own example with no success, same error:
DebugNumericSummaryV2Op requires tensor_id to be less than or equal to (2^53)
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 25 (4 by maintainers)
Commits related to this issue
- [DebuggerV2] Enable debug_v2_ops_test & debug_events_writer_test on Windows - A test in debug_v2_ops_test previously called `np.power(2, 53)` without specifying dtype. As a result, the output had t... — committed to tensorflow/tensorflow by caisq 4 years ago
- Change to NO_TENSOR to avoid this issue https://github.com/tensorflow/tensorflow/issues/43608 on Windows 10. — committed to Gavin-Development/GavinBackend by invalid-email-address 3 years ago
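The first commit message above says a test called `np.power(2, 53)` without specifying a dtype. As a rough illustration of why that matters (not a confirmed diagnosis of the runtime error in this issue): NumPy builds on Windows before NumPy 2.0 default to a 32-bit integer, which cannot hold 2^53, so the dtype has to be requested explicitly. The helper name `tensor_id_limit` is made up for this sketch.

```python
import numpy as np

# Exact value of the limit quoted in the error message, as a Python int.
LIMIT = 2 ** 53


def tensor_id_limit():
    """Hypothetical helper: compute 2^53 with an explicit 64-bit dtype."""
    return np.power(2, 53, dtype=np.int64)


if __name__ == "__main__":
    # Platform default integer type; this differs between Windows and Linux
    # on NumPy versions before 2.0.
    print("default int dtype:", np.dtype(int))
    # A 32-bit integer cannot represent 2^53.
    print("int32 max < 2^53:", np.iinfo(np.int32).max < LIMIT)
    # With an explicit int64 dtype the value is exact.
    print("int64 result exact:", int(tensor_id_limit()) == LIMIT)
```

Running the script on the affected machine shows which default integer width NumPy is using there.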
I got the same error. Have you got a fix?
Same problem here. Searched all over for a solution and can’t find one. Any help would be appreciated.
Hi, I have just run into this issue with TensorFlow 2.9.1 on Windows 10. A workaround was to enable eager execution with tf.config.run_functions_eagerly(True). I have no idea whether it is a good workaround, but at least it runs with tensor_debug_mode='FULL_HEALTH'.
I ran into the same issue on Windows 10 with tf 2.3.0. I played around with the parameters. It seems that the debugger runs with the defaults, i.e. tf.debugging.experimental.enable_dump_debug_info("tfdbg_logs", tensor_debug_mode="NO_TENSOR"), but the other options for the tensor_debug_mode parameter fail.
Yes, but now you don't have debugging information. Am I right?
The problem is that we cannot use Debugger V2 on Windows 10. The whole purpose of this ticket is to figure out how to make it work. Of course, if you disable it the problem is gone 😄
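For anyone who wants to try the two workarounds reported in this thread, here is a minimal sketch. It only uses `tf.config.run_functions_eagerly` and `tf.debugging.experimental.enable_dump_debug_info`, both mentioned above; the helper names `choose_tensor_debug_mode` and `enable_debugger` are hypothetical, and the assumption that only Windows is affected comes from the reports here, not from a verified root cause. Neither option is a real fix: eager mode slows down `tf.function` code, and `NO_TENSOR` records no tensor values.

```python
import sys


def choose_tensor_debug_mode(platform_name, requested="FULL_HEALTH"):
    """Hypothetical helper: fall back to NO_TENSOR on Windows.

    Assumption taken from this thread: modes other than NO_TENSOR fail
    on Windows 10 when functions run in graph mode.
    """
    if platform_name.startswith("win"):
        return "NO_TENSOR"
    return requested


def enable_debugger(dump_root, platform_name=sys.platform, force_eager=False):
    """Enable Debugger V2 dumping using one of the reported workarounds."""
    # Imported lazily so the mode selection above can be used without TensorFlow.
    import tensorflow as tf

    if force_eager:
        # Workaround reported with TF 2.9.1: run tf.functions eagerly,
        # which allowed FULL_HEALTH to run for that commenter.
        tf.config.run_functions_eagerly(True)
        mode = "FULL_HEALTH"
    else:
        # Workaround reported with TF 2.3.0: keep the default NO_TENSOR mode.
        mode = choose_tensor_debug_mode(platform_name)

    tf.debugging.experimental.enable_dump_debug_info(
        dump_root, tensor_debug_mode=mode
    )
    return mode


if __name__ == "__main__":
    print(enable_debugger("/tmp/tfdbg2_logdir", force_eager=True))
```

Calling `enable_debugger(..., force_eager=True)` keeps the health information at the cost of speed, while the default path trades the tensor data away to keep graph execution.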