tensorflow: tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 'gs' not implemented (file: 'gs://tfds-data/datasets/mnist')
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): It is an example script (for distributed training) provided in TensorFlow
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
- TensorFlow installed from (source or binary): source
- TensorFlow version (use command below): v2.2.0-rc2-77-gaad398b5e9 2.2.0-rc3
- Python version: 3.6.9
- Bazel version (if compiling from source): 2.0.0
- GCC/Compiler version (if compiling from source): 7.5.0
- CUDA/cuDNN version: 10.2 / 7.6.5.32-1
- GPU model and memory: NVIDIA GeForce 940MX with 2 GB Dedicated VRAM
- Exact command to reproduce: python3 distributed_training.py
Describe the problem
I have built TensorFlow 2.2 from source (the r2.2 branch) with CUDA 10.2 and cuDNN 7.6.5 support on Ubuntu 18.04 for Python 3. During the build configuration there was no question like "Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]". After installing the resulting wheel and running a script that needs to access the gs://tfds-data/datasets/mnist data, I got the following error: tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 'gs' not implemented (file: 'gs://tfds-data/datasets/mnist'). Please advise. I am using tensorflow-datasets 2.1.0; I am not sure whether this version, used against TensorFlow 2.2, could be the cause of the problem.
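A quick way to check whether a given build includes GCS support is to list the registered filesystem schemes. Note this uses `tf.io.gfile.get_registered_schemes`, an API added in much newer TF releases than 2.2, so this is a diagnostic sketch rather than something the build above can run as-is:

```python
import tensorflow as tf

# List the filesystem schemes compiled into (or registered with) this build.
# 'gs' should appear when GCS support is present; on a build like the one
# described above it would be missing, matching the UnimplementedError.
schemes = tf.io.gfile.get_registered_schemes()
print(schemes)
print("GCS available:", 'gs' in schemes)
```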
Source code / logs
Source code of the script:
from __future__ import absolute_import, division, print_function, unicode_literals
import os
import tensorflow_datasets as tfds
import tensorflow as tf

tfds.disable_progress_bar()

def evaluate_and_get_model(pth):
    mdl = tf.keras.models.load_model(pth, compile=False)
    mdl.compile(loss='sparse_categorical_crossentropy',
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['accuracy'])
    evl_loss, evl_acc = mdl.evaluate(eval_dataset)
    print('Eval loss: {}, Eval Accuracy: {}'.format(evl_loss, evl_acc))
    return mdl

# Function for decaying the learning rate.
# You can define any decay function you need.
def decay(epoch):
    if epoch < 3:
        return 1e-3
    elif 3 <= epoch < 7:
        return 1e-4
    else:
        return 1e-5

def scale(image, label):
    image = tf.cast(image, tf.float32)
    image /= 255
    return image, label

# Callback for printing the LR at the end of each epoch.
class PrintLR(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        print('\nLearning rate for epoch {} is {}'.format(epoch + 1, model.optimizer.lr.numpy()))

print(tf.__version__)

datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True, data_dir='gs://tfds-data/datasets')
mnist_train, mnist_test = datasets['train'], datasets['test']

strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

# You can also do info.splits.total_num_examples to get the total
# number of examples in the dataset.
num_train_examples = info.splits['train'].num_examples
num_test_examples = info.splits['test'].num_examples

BUFFER_SIZE = 10000
BATCH_SIZE_PER_REPLICA = 64
BATCH_SIZE = BATCH_SIZE_PER_REPLICA * strategy.num_replicas_in_sync

train_dataset = mnist_train.map(scale).cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
eval_dataset = mnist_test.map(scale).batch(BATCH_SIZE)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=tf.keras.optimizers.Adam(),
                  metrics=['accuracy'])

# Define the checkpoint directory to store the checkpoints
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, 'ckpt_{epoch}')

callbacks = [
    tf.keras.callbacks.TensorBoard(log_dir='./logs'),
    tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_prefix,
                                       save_weights_only=True),
    tf.keras.callbacks.LearningRateScheduler(decay),
    PrintLR()
]

model.fit(train_dataset, epochs=12, callbacks=callbacks)

model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
eval_loss, eval_acc = model.evaluate(eval_dataset)
print('Eval loss: {}, Eval Accuracy: {}'.format(eval_loss, eval_acc))

path = 'saved_model/'
model.save(path, save_format='tf')

unreplicated_model = evaluate_and_get_model(path)
unreplicated_model.save(path, save_format='tf')

with strategy.scope():
    evaluate_and_get_model(path)
Command line output of running the "python3 distributed_training.py" command:
2.2.0-rc3
ERROR:absl:Failed to construct dataset mnist
Traceback (most recent call last):
File "distributed_training.py", line 48, in <module>
datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True, data_dir='gs://tfds-data/datasets')
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/api_utils.py", line 52, in disallow_positional_args_dec
return fn(*args, **kwargs)
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/registered.py", line 302, in load
dbuilder = builder(name, data_dir=data_dir, **builder_kwargs)
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/registered.py", line 172, in builder
return _DATASET_REGISTRY[name](**builder_kwargs)
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/api_utils.py", line 52, in disallow_positional_args_dec
return fn(*args, **kwargs)
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/dataset_builder.py", line 197, in __init__
self._data_dir = self._build_data_dir()
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/dataset_builder.py", line 661, in _build_data_dir
version_dirs = _other_versions_on_disk()
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow_datasets/core/dataset_builder.py", line 648, in _other_versions_on_disk
if not tf.io.gfile.exists(builder_data_dir):
File "/home/vyepishov/.local/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 267, in file_exists_v2
_pywrap_file_io.FileExists(compat.as_bytes(path))
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 'gs' not implemented (file: 'gs://tfds-data/datasets/mnist')
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 6
- Comments: 28 (11 by maintainers)
Commits related to this issue
- Fix master error during verify the whl with MNIST test The upstream issue is https://github.com/tensorflow/tensorflow/issues/38477 So we have only to downgrade the tensorflow_datasets version fro... — committed to theopenlab/openlab-zuul-jobs by bzhaoopenstack 4 years ago
- Fix master error during verify the whl with MNIST test (#922) The upstream issue is https://github.com/tensorflow/tensorflow/issues/38477 So we have only to downgrade the tensorflow_datasets vers... — committed to theopenlab/openlab-zuul-jobs by bzhaoopenstack 4 years ago
Any update on GCS support for Windows?
I don’t think GCS filesystem is compiled for Windows for TF 2.3.
We are planning to make filesystems modular (https://github.com/tensorflow/community/pull/101) but this will likely land in TF 2.5.
TF 2.4 should contain work done over the summer as part of Google Summer of Code that would enable GCS filesystems on Windows.
Any link to a tutorial or information on doing that? Most people getting this error are running the official TensorFlow tutorial. It may be a good idea to put it there as well.
You need to install tensorflow-io for the other filesystems. See #51583 for a similar issue.
Could you try with the latest version of TFDS and TF?
I downgraded tensorflow_datasets from 3.2.1 to 3.1.0, and the issue disappeared.