tensorflow: KeyError: 'Failed to format this callback filepath: "checkpoint_5000/checkpoint_{epoch:02d}_{batch:04d}". Reason: \'batch\''
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary): Installed using Pip
- TensorFlow version (use command below): 2.2.0
- Python version: 3.7
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:26_Pacific_Standard_Time_2019
Cuda compilation tools, release 10.1, V10.1.105
- GPU model and memory: NVIDIA MX110 2GB
Output of python -c "import tensorflow as tf; print(tf.version.GIT_VERSION, tf.version.VERSION)":
v2.2.0-rc4-8-g2b96f3662b 2.2.0
Describe the current behavior
I am following a tutorial on saving model weights in TensorFlow, where the weights are saved every 5000 training samples. My code is identical to the instructor's, but he is running version 2.0 and I am running 2.2.0. The code raises the error below, so I suspect a regression in this version.
Describe the expected behavior
The model weights should be saved every 5000 training samples.
Standalone code to reproduce the issue
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import ModelCheckpoint

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train = x_train / 255.0
x_test = x_test / 255.0

def get_new_model():
    model = Sequential([
        Conv2D(filters=16, input_shape=(32, 32, 3), kernel_size=(3, 3),
               activation='relu', name='conv_1'),
        tf.keras.layers.BatchNormalization(),
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu', name='conv_2'),
        MaxPooling2D(pool_size=(4, 4), name='pool_1'),
        tf.keras.layers.BatchNormalization(),
        Conv2D(filters=8, kernel_size=(3, 3), activation='relu', name='conv_3'),
        MaxPooling2D(pool_size=(4, 4), name='pool_2'),
        Flatten(name='flatten'),
        Dense(units=32, activation='relu', name='dense_1'),
        tf.keras.layers.Dropout(0.5),
        Dense(units=10, activation='softmax', name='dense_2')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

checkpoint_5000_path = 'checkpoint_5000/checkpoint_{epoch:02d}_{batch:04d}'
model = get_new_model()
checkpoint_5000 = ModelCheckpoint(filepath=checkpoint_5000_path, verbose=True,
                                  save_weights_only=True, save_freq=5000)
model.fit(x_train, y_train, batch_size=10, validation_data=(x_test, y_test),
          epochs=3, verbose=True, callbacks=[checkpoint_5000])
Other info / logs
Full traceback is:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in _get_file_path(self, epoch, logs)
1243 # placeholders can cause formatting to fail.
-> 1244 return self.filepath.format(epoch=epoch + 1, **logs)
1245 except KeyError as e:
KeyError: 'batch'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-11-cc68dad1ac2c> in <module>
7 checkpoint_5000 = ModelCheckpoint(filepath=checkpoint_5000_path, verbose=True, save_weights_only=True,
8 save_freq=5000)
----> 9 model.fit(x_train, y_train, batch_size=10, validation_data=(x_test,y_test), epochs=3, verbose= True, callbacks=[checkpoint_5000])
10
11
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\engine\training.py in _method_wrapper(self, *args, **kwargs)
64 def _method_wrapper(self, *args, **kwargs):
65 if not self._in_multi_worker_mode(): # pylint: disable=protected-access
---> 66 return method(self, *args, **kwargs)
67
68 # Running inside `run_distribute_coordinator` already.
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
853 context.async_wait()
854 logs = tmp_logs # No error, now safe to assign to logs.
--> 855 callbacks.on_train_batch_end(step, logs)
856 epoch_logs = copy.copy(logs)
857
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in on_train_batch_end(self, batch, logs)
388 if self._should_call_train_batch_hooks:
389 logs = self._process_logs(logs)
--> 390 self._call_batch_hook(ModeKeys.TRAIN, 'end', batch, logs=logs)
391
392 def on_test_batch_begin(self, batch, logs=None):
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in _call_batch_hook(self, mode, hook, batch, logs)
296 for callback in self.callbacks:
297 batch_hook = getattr(callback, hook_name)
--> 298 batch_hook(batch, logs)
299 self._delta_ts[hook_name].append(time.time() - t_before_callbacks)
300
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in on_train_batch_end(self, batch, logs)
613 """
614 # For backwards compatibility.
--> 615 self.on_batch_end(batch, logs=logs)
616
617 @doc_controls.for_subclass_implementers
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in on_batch_end(self, batch, logs)
1160 self._batches_seen_since_last_saving += 1
1161 if self._batches_seen_since_last_saving >= self.save_freq:
-> 1162 self._save_model(epoch=self._current_epoch, logs=logs)
1163 self._batches_seen_since_last_saving = 0
1164
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in _save_model(self, epoch, logs)
1194 int) or self.epochs_since_last_save >= self.period:
1195 self.epochs_since_last_save = 0
-> 1196 filepath = self._get_file_path(epoch, logs)
1197
1198 try:
C:\Anaconda\envs\myenv\lib\site-packages\tensorflow\python\keras\callbacks.py in _get_file_path(self, epoch, logs)
1245 except KeyError as e:
1246 raise KeyError('Failed to format this callback filepath: "{}". '
-> 1247 'Reason: {}'.format(self.filepath, e))
1248 else:
1249 # If this is multi-worker training, and this worker should not
KeyError: 'Failed to format this callback filepath: "checkpoint_5000/checkpoint_{epoch:02d}_{batch:04d}". Reason: \'batch\''
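The traceback shows that ModelCheckpoint fills the filepath template with only epoch and the keys of the per-batch logs dict (self.filepath.format(epoch=epoch + 1, **logs)), and in this version that dict carries no 'batch' key, so the {batch:04d} placeholder cannot be resolved. A minimal illustration of the failing format call, using hypothetical metric values for logs:

filepath = 'checkpoint_5000/checkpoint_{epoch:02d}_{batch:04d}'
logs = {'loss': 0.5, 'accuracy': 0.8}  # hypothetical per-batch logs; no 'batch' key
filepath.format(epoch=1, **logs)       # raises KeyError: 'batch'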
About this issue
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 15 (4 by maintainers)
Here is the solution: just replace save_freq=5000 with save_freq='epoch'.
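A minimal sketch of the fix applied to the reproduction code above. The save_freq='epoch' change is the one suggested in this thread; keeping an integer save_freq while dropping the unsupported {batch:04d} placeholder is an additional workaround inferred from the traceback, not something confirmed by maintainers here:

# Option 1 (suggested in this thread): save once per epoch.
checkpoint_5000 = ModelCheckpoint(filepath='checkpoint_5000/checkpoint_{epoch:02d}',
                                  verbose=True,
                                  save_weights_only=True,
                                  save_freq='epoch')

# Option 2 (assumption based on the traceback): keep the integer save_freq,
# but use only placeholders that ModelCheckpoint can fill, i.e. epoch and
# the metric keys present in logs (such as loss).
checkpoint_5000 = ModelCheckpoint(filepath='checkpoint_5000/checkpoint_{epoch:02d}',
                                  verbose=True,
                                  save_weights_only=True,
                                  save_freq=5000)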
welcome 👍