benchmarks: tf_cnn_benchmarks.py does not support --data_dir with my imagenet1k tfrecords
I’m using the HEAD of both tensorflow and benchmarks.
I can run tf_cnn_benchmarks.py with synthetic data like this:
python3 tf_cnn_benchmarks.py --num_batches=100 --display_every=1 --device=cpu --data_format=NHWC --model=trivial --batch_size=64
But when I try to specify my own local data_dir of tfrecords for imagenet1k, it hangs sometime after printing “Running warm up”:
python3 tf_cnn_benchmarks.py --num_batches=100 --display_every=1 --device=cpu --data_format=NHWC --model=trivial --batch_size=64 --data_dir=/n0/ryan/imagenet1k_tfrecord
TensorFlow: 1.8
Model: trivial
Dataset: imagenet
Mode: training
SingleSess: False
Batch size: 64 global
64.0 per device
Num batches: 100
Num epochs: 0.00
Devices: ['/cpu:0']
Data format: NHWC
Layout optimizer: False
Optimizer: sgd
Variables: parameter_server
==========
Generating model
W0530 13:48:44.750849 140466104280896 tf_logging.py:125] From /home/ryan/sandbox/rreece/onboarding-cerebras/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py:1611: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2018-05-30 13:48:44.798403: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX512F
I0530 13:48:44.929922 140466104280896 tf_logging.py:115] Running local_init_op.
I0530 13:48:50.095620 140466104280896 tf_logging.py:115] Done running local_init_op.
Running warm up
After printing that line, it hangs.
Any ideas how I can debug using my own local dataset?
I noticed these seemingly related closed issues: #150 and #176, but they do not seem to hang at the same place as my run does.
Thanks!
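One low-cost check before digging into the hang itself is to confirm that the directory passed to --data_dir actually contains non-empty shard files. The sketch below uses only the Python standard library. It assumes the shards follow the `<subset>-NNNNN-of-NNNNN` naming convention (for example `train-00000-of-01024`); that pattern, and the helper names `summarize_shards` and `report`, are assumptions made for illustration and are not taken from this thread. It does not validate the contents of the records and may not explain the hang.

```python
import glob
import os


def summarize_shards(data_dir, subset="train"):
    """Return a sorted list of (path, size_in_bytes) for one subset's shards."""
    # Assumption: shards are named '<subset>-NNNNN-of-NNNNN'.
    pattern = os.path.join(data_dir, "%s-*-of-*" % subset)
    return [(path, os.path.getsize(path)) for path in sorted(glob.glob(pattern))]


def report(data_dir):
    """Print how many shards were found per subset and flag empty files."""
    for subset in ("train", "validation"):
        shards = summarize_shards(data_dir, subset)
        empty = [path for path, size in shards if size == 0]
        print("%s: %d shard(s), %d empty" % (subset, len(shards), len(empty)))
        for path in empty:
            print("  empty file: %s" % path)
```

Calling `report("/n0/ryan/imagenet1k_tfrecord")` would print one summary line per subset. Zero shards or a number of empty files would point at the dataset layout rather than at the benchmark script.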
About this issue
- State: closed
- Created 6 years ago
- Comments: 25 (7 by maintainers)
Sorry it took me so long to get back to this.
I tried the head of benchmarks today with tensorflow 1.9.0, and it worked! Thanks for the feedback. Closing this issue.