CRNN_Tensorflow: tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence [[{{node train_IteratorGetNext}} = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
@MaybeShewill-CV, this is not the nvidia-smi problem from #295. nvidia-smi works:
(crnntf) kspook@MLNC6:/usr/local/cuda/samples/bin/x86_64/linux/release$ nvidia-smi
Thu Jul 4 11:00:00 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.130 Driver Version: 384.130 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 0000D691:00:00.0 Off | 0 |
| N/A 47C P0 57W / 149W | 0MiB / 11439MiB | 1% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
(crnntf) kspook@MLNC6:/usr/local/cuda/samples/bin/x86_64/linux/release$
CUDA 9.0 is installed successfully.
(crnntf) kspook@MLNC6:/usr/local/cuda/samples/bin/x86_64/linux/release$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Tesla K80"
CUDA Driver Version / Runtime Version 9.0 / 9.0
CUDA Capability Major/Minor version number: 3.7
Total amount of global memory: 11440 MBytes (11995578368 bytes)
(13) Multiprocessors, (192) CUDA Cores/MP: 2496 CUDA Cores
GPU Max Clock rate: 824 MHz (0.82 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 1572864 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Enabled
Device supports Unified Addressing (UVA): Yes
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 54929 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1
Result = PASS
cuDNN 7 is installed successfully.
(crnntf) kspook@MLNC6:~/cudnn_samples_v7/mnistCUDNN$ ./mnistCUDNN
cudnnGetVersion() : 7601 , CUDNN_VERSION from cudnn.h : 7601 (7.6.1)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 13 Capabilities 3.7, SmClock 823.5 Mhz, MemSize (Mb) 11439, MemClock 2505.0 Mhz, Ecc=1, boardGroupID=0
Using device 0
Testing single precision
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.132224 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.133056 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.157568 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.244576 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.406240 time requiring 203008 memory
Resulting weights from Softmax:
0.0000000 0.9999399 0.0000000 0.0000000 0.0000561 0.0000000 0.0000012 0.0000017 0.0000010 0.0000000
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 0.9999288 0.0000000 0.0000711 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 0.9999820 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
Testing half precision (math in single precision)
Loading image data/one_28x28.pgm
Performing forward propagation ...
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 2
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.139008 time requiring 100 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.141408 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.176672 time requiring 28800 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.252768 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 5: 0.431808 time requiring 203008 memory
Resulting weights from Softmax:
0.0000001 1.0000000 0.0000001 0.0000000 0.0000563 0.0000001 0.0000012 0.0000017 0.0000010 0.0000001
Loading image data/three_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000000 0.0000000 1.0000000 0.0000000 0.0000714 0.0000000 0.0000000 0.0000000 0.0000000
Loading image data/five_28x28.pgm
Performing forward propagation ...
Resulting weights from Softmax:
0.0000000 0.0000008 0.0000000 0.0000002 0.0000000 1.0000000 0.0000154 0.0000000 0.0000012 0.0000006
Result of classification: 1 3 5
Test passed!
The error still occurs:
(crnntf) kspook@MLNC6:~/CRNN_Tensorflow$ python tools/train_shadownet.py --dataset_dir ./data/ --char_dict_path ./data/char_dict/char_dict.json --ord_map_dict_path ./data/char_dict/ord_map.json
I0704 11:05:45.903860 17823 train_shadownet.py:569] Use single gpu to train the model
2019-07-04 11:05:49.530389: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-04 11:05:54.737760: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: d691:00:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2019-07-04 11:05:54.737815: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-07-04 11:05:55.023881: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-04 11:05:55.023945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-07-04 11:05:55.023963: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-07-04 11:05:55.024223: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10295 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: d691:00:00.0, compute capability: 3.7)
I0704 11:05:55.310481 17823 train_shadownet.py:268] Training from scratch
Traceback (most recent call last):
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[{{node train_IteratorGetNext}} = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "tools/train_shadownet.py", line 575, in <module>
need_decode=args.decode_outputs
File "tools/train_shadownet.py", line 321, in train_shadownet
[optimizer, train_ctc_loss, merge_summary_op])
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[node train_IteratorGetNext (defined at /data/home/kspook/CRNN_Tensorflow/data_provider/tf_io_pipline_fast_tools.py:406) = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
Caused by op 'train_IteratorGetNext', defined at:
File "tools/train_shadownet.py", line 575, in <module>
need_decode=args.decode_outputs
File "tools/train_shadownet.py", line 153, in train_shadownet
batch_size=CFG.TRAIN.BATCH_SIZE
File "/data/home/kspook/CRNN_Tensorflow/data_provider/shadownet_data_feed_pipline.py", line 289, in inputs
num_threads=CFG.TRAIN.CPU_MULTI_PROCESS_NUMS
File "/data/home/kspook/CRNN_Tensorflow/data_provider/tf_io_pipline_fast_tools.py", line 406, in inputs
return iterator.get_next(name='{:s}_IteratorGetNext'.format(self._dataset_flag))
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 421, in get_next
name=name)), self._output_types,
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2069, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/kspook/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
OutOfRangeError (see above for traceback): End of sequence
[[node train_IteratorGetNext (defined at /data/home/kspook/CRNN_Tensorflow/data_provider/tf_io_pipline_fast_tools.py:406) = IteratorGetNext[output_shapes=[[32,32,100,3], <unknown>, [32]], output_types=[DT_FLOAT, DT_VARIANT, DT_STRING], _device="/job:localhost/replica:0/task:0/device:CPU:0"](OneShotIterator)]]
About this issue
- State: closed
- Created 5 years ago
- Comments: 20 (9 by maintainers)
I did. The answer must be #302: lack of data. In my case, though, the tfrecords were incomplete because of lexicon index errors.
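One way to check the "lack of data" explanation is to count how many records a tfrecords file actually holds and compare that with the batch size of 32 seen in the traceback's `output_shapes`. The sketch below is an assumption-laden helper, not part of CRNN_Tensorflow: `count_tfrecords` is a hypothetical name, it only walks the standard TFRecord framing (8-byte length, 4-byte length CRC, payload, 4-byte payload CRC), and it does not verify the checksums or decode the examples.

```python
import struct


def count_tfrecords(path):
    """Count records in a TFRecord file by walking its framing.

    Hypothetical helper, not part of CRNN_Tensorflow. Checksums are skipped,
    so a file with corrupted payloads can still be counted.
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            # Header: uint64 payload length + uint32 masked CRC of that length.
            header = f.read(12)
            if not header:
                return count
            if len(header) < 12:
                raise ValueError("truncated header after {} records".format(count))
            (length,) = struct.unpack("<Q", header[:8])
            # Payload followed by its uint32 masked CRC.
            payload = f.read(length + 4)
            if len(payload) < length + 4:
                raise ValueError("truncated record after {} records".format(count))
            count += 1
```

If the count for the training file is zero or smaller than the batch size, running out of data on the first `get_next` call is a plausible cause of `OutOfRangeError: End of sequence`, although that depends on how the input pipeline batches and repeats the data.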
Can you check my file? https://drive.google.com/open?id=1k0qsklB8Y1IbMUBOurnTKTEhUUxw_pwK I don’t think its format is different from syn90k.
I am also interested in how to build the files for Chinese. Unlike English, the Chinese text was converted to numbers. How did you create the Chinese words, and how do you tell two characters apart?
According to this comment, https://github.com/MaybeShewill-CV/CRNN_Tensorflow/issues/285#issuecomment-505333966, a Chinese word appears to have a single index (number). Am I right?
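For comparison, here is a minimal sketch of a generic character-level mapping, where every distinct character gets its own index regardless of script. It is only an illustration: `build_char_index` and `encode` are hypothetical names, and this is not the repository's `char_dict.json` / `ord_map.json` format, nor a claim about what the linked comment describes.

```python
def build_char_index(words):
    """Assign one integer index to every distinct character (hypothetical helper)."""
    # Sorting by code point keeps the mapping stable between runs.
    chars = sorted({ch for word in words for ch in word})
    return {ch: i for i, ch in enumerate(chars)}


def encode(word, char_index):
    """Turn a word into a list of per-character indices."""
    return [char_index[ch] for ch in word]


char_index = build_char_index(["cat", "中国"])
print(encode("cat", char_index))   # [1, 0, 2]
print(encode("中国", char_index))  # [3, 4]
```

Under this scheme a two-character Chinese word is encoded as two indices, one per character, exactly as a three-letter English word is encoded as three.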