tensorflow: CancelledError: Loop execution was cancelled

I was running a regression analysis on Google Colab, and until two days ago everything was working fine. Now I suddenly get the error below, and I don't know why training stops partway through. On Stack Overflow I learned that TensorFlow was updated from 1.12 to 1.13.0rc0. The issue started the moment TensorFlow was updated.
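Since the failure coincides with a version change, it can help to confirm which TensorFlow release the runtime actually loads. The sketch below is only an illustration: `installed_version`, `release_tuple`, and `minor_release_changed` are hypothetical helper names, not part of TensorFlow, and the parsing assumes version strings shaped like `1.12.0` or `1.13.0rc0`.

```python
import importlib


def installed_version(package_name):
    """Return the package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(package_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def release_tuple(version):
    """Convert '1.13.0rc0' to (1, 13, 0) using the leading digits of each part."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def minor_release_changed(before, after):
    """True if the major.minor part differs between two version strings."""
    return release_tuple(before)[:2] != release_tuple(after)[:2]
```

In a notebook, `installed_version("tensorflow")` would report the release in use. If it differs from the one that last worked, reinstalling the earlier release and rerunning is one way to test whether the update is responsible.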

Here is the link to my notebook on Colab.

The following is the output from training:

 SA1 


Fold:  1



WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From <ipython-input-8-001ee577e90b>:225: DatasetV1.make_initializable_iterator (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `for ... in dataset:` to iterate over a dataset. If using `tf.estimator`, return the `Dataset` object directly from your input function. As a last resort, you can use `tf.compat.v1.data.make_initializable_iterator(dataset)`.
WARNING:tensorflow:From <ipython-input-9-1bd488b4dd9a>:239: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From <ipython-input-9-1bd488b4dd9a>:348: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
WARNING:tensorflow:From <ipython-input-9-1bd488b4dd9a>:161: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
INFO:tensorflow:Restoring parameters from data/vars/SA1/feeder.cpt
Training the model with He ...
0 MAPE: 11442.98324584961
100 MAPE: 1032.3982238769531
200 MAPE: 460.38122177124023
300 MAPE: 190.61466455459595
400 MAPE: 96.63236141204834
500 MAPE: 59.168386459350586
600 MAPE: 41.031333804130554
700 MAPE: 29.591849446296692
800 MAPE: 22.207358479499817
900 MAPE: 19.93158310651779
1000 MAPE: 17.7777960896492
1100 MAPE: 16.45900011062622
1200 MAPE: 16.250891983509064
1300 MAPE: 14.98890072107315
1400 MAPE: 14.976367354393005
1500 MAPE: 14.184540510177612
1600 MAPE: 13.81748765707016
1700 MAPE: 14.370763301849365
1800 MAPE: 13.738515973091125
1900 MAPE: 13.814045488834381
2000 MAPE: 13.300719857215881
2100 MAPE: 13.28342854976654
2200 MAPE: 13.087129592895508
2300 MAPE: 13.045386970043182
2400 MAPE: 12.72062063217163
2500 MAPE: 12.522627413272858
2600 MAPE: 12.2012197971344
2700 MAPE: 12.064149230718613
2800 MAPE: 11.975772678852081
2900 MAPE: 11.860555410385132
3000 MAPE: 11.887001246213913
3100 MAPE: 11.541987955570221
3200 MAPE: 11.460774391889572
3300 MAPE: 11.266357451677322
3400 MAPE: 11.597257852554321
3500 MAPE: 11.223434656858444
3600 MAPE: 11.532413214445114
3700 MAPE: 10.872676968574524
3800 MAPE: 11.064480245113373
3900 MAPE: 11.555147916078568
4000 MAPE: 11.46700382232666
4100 MAPE: 10.984917730093002
4200 MAPE: 10.663384199142456
4300 MAPE: 11.119940131902695
4400 MAPE: 10.703090578317642
4500 MAPE: 10.289034992456436
4600 MAPE: 9.894699603319168

---------------------------------------------------------------------------

CancelledError                            Traceback (most recent call last)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1333     try:
-> 1334       return fn(*args)
   1335     except errors.OpError as e:

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1318       return self._call_tf_sessionrun(
-> 1319           options, feed_dict, fetch_list, target_list, run_metadata)
   1320 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1406         self._session, options, feed_dict, fetch_list, target_list,
-> 1407         run_metadata)
   1408 

CancelledError: Loop execution was cancelled.
	 [[{{node while/LoopCond}}]]


During handling of the above exception, another exception occurred:

CancelledError                            Traceback (most recent call last)

<ipython-input-11-86884b56c50d> in <module>()
     65                 while True:
     66                   try:
---> 67                     _, error = sess.run([train_model.train_op, train_model.mape])
     68                   except tf.errors.OutOfRangeError:
     69                     break

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    927     try:
    928       result = self._run(None, fetches, feed_dict, options_ptr,
--> 929                          run_metadata_ptr)
    930       if run_metadata:
    931         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1150     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1151       results = self._do_run(handle, final_targets, final_fetches,
-> 1152                              feed_dict_tensor, options, run_metadata)
   1153     else:
   1154       results = []

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1326     if handle is None:
   1327       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328                            run_metadata)
   1329     else:
   1330       return self._do_call(_prun_fn, handle, feeds, fetches)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1346           pass
   1347       message = error_interpolation.interpolate(message, self._graph)
-> 1348       raise type(e)(node_def, op, message)
   1349 
   1350   def _extend_graph(self):

CancelledError: Loop execution was cancelled.
	 [[node while/LoopCond (defined at <ipython-input-9-1bd488b4dd9a>:383) ]]

Caused by op 'while/LoopCond', defined at:
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-11-86884b56c50d>", line 59, in <module>
    asgd_decay=None)
  File "<ipython-input-9-1bd488b4dd9a>", line 70, in __init__
    inp.y_feature, inp.x_feature[:, -1, 0], init)
  File "<ipython-input-9-1bd488b4dd9a>", line 383, in decoder
    _, _, _, targets_ta, outputs_ta = tf.while_loop(cond_fn, loop_fn, loop_init)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 3556, in while_loop
    return_same_structure)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 3087, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 3008, in _BuildLoop
    self._pivot = loop_cond(c, name="LoopCond")
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_control_flow_ops.py", line 339, in loop_cond
    "LoopCond", input=input, name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

CancelledError (see above for traceback): Loop execution was cancelled.
	 [[node while/LoopCond (defined at <ipython-input-9-1bd488b4dd9a>:383) ]]
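
The log above reports MAPE every 100 steps, and the last value printed before the error is for step 4600. To compare where different runs stop, the printed lines can be parsed with a small helper. This is a sketch only: `parse_mape_log` and `last_reported_step` are hypothetical names, and the pattern assumes lines of the form `<step> MAPE: <value>` as in the output shown.

```python
import re

# Matches lines such as "4600 MAPE: 9.894699603319168".
MAPE_LINE = re.compile(r"^\s*(\d+)\s+MAPE:\s*(\S+)\s*$")


def parse_mape_log(text):
    """Return a list of (step, mape) tuples found in the training output."""
    records = []
    for line in text.splitlines():
        match = MAPE_LINE.match(line)
        if match is None:
            continue  # skip warnings, traceback lines, and blank lines
        try:
            value = float(match.group(2))
        except ValueError:
            continue  # the text after "MAPE:" was not a number
        records.append((int(match.group(1)), value))
    return records


def last_reported_step(text):
    """Return the final (step, mape) pair, or None if no metric line was found."""
    records = parse_mape_log(text)
    return records[-1] if records else None
```

If the last reported step varies widely between runs, that would suggest the cancellation is not tied to a particular batch of data.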


About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 17 (13 by maintainers)

Most upvoted comments

@alxfed tf.colocate_with is going away in TF 2.0, which is why it prints the deprecation warning. The message is trying to say that you don’t need to update anything because colocation should be handled automatically, but I agree it could be clearer. I don’t believe it’s related to this issue, however.

@anshkumar As an update: I was able to reproduce the problem in a copy of your notebook, but unfortunately I have been unable to reproduce it outside the Colab environment, which makes it difficult to debug. I’m still working on figuring out what’s going on, though.

What does this line even mean? Especially with NOTHING after the colon in “Instructions for updating:”? I’m getting it too in one of the standard notebooks that I’m trying in Colab.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.