datasets: tensorflow_datasets.load('cats_vs_dogs') not working !
Code NOT WORKING
# load dataset module
import tensorflow_datasets as tfds
# disable the download progress bar
tfds.disable_progress_bar()
# download data - cats vs dogs
_ = tfds.load('cats_vs_dogs',  # dataset name
              as_supervised=False,  # return feature dicts, not (image, label) tuples
              )
ERROR: -> DownloadError: Failed to get url https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_3367a.zip. HTTP code: 404.
Environment - Google Colab
About this issue
- State: closed
- Created 2 years ago
- Reactions: 3
- Comments: 23
Hi. I have a temporary solution below to modify the URL:
setattr(tfds.image_classification.cats_vs_dogs, '_URL', "https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip")
Why do I get this error: module 'tensorflow_datasets' has no attribute 'image_classification'?
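One possible explanation, offered as an assumption and not a confirmed diagnosis: the installed TFDS version may not expose image_classification as an attribute of the top-level package until that submodule has been imported. The sketch below imports a module by its dotted path before patching it. set_module_attr is a hypothetical helper name, and the module path in the usage comment is taken from the workaround above, so it may differ between TFDS versions.

```python
import importlib


def set_module_attr(module_path, name, value):
    """Import `module_path` explicitly, then overwrite one of its attributes.

    Returns the previous value (or None) so the patch can be undone.
    """
    module = importlib.import_module(module_path)
    previous = getattr(module, name, None)
    setattr(module, name, value)
    return previous


# Usage sketch (assumes this module path exists in the installed TFDS version):
# set_module_attr(
#     "tensorflow_datasets.image_classification.cats_vs_dogs",
#     "_URL",
#     "https://download.microsoft.com/download/3/E/1/"
#     "3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip",
# )
```

If the import itself raises ModuleNotFoundError, the dataset module probably lives under a different path in that version, and the path would need to be adjusted.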
I didn’t attempt this with the split parameter, so I can’t comment on that.
I did not add buffer.seek(0).
Here’s the code I used to get past the issue. The line I changed is prefaced by a comment that says HACKY FIX.
import tensorflow as tf
import tensorflow_datasets as tfds
import io
import zipfile
import logging

def __generate_examples(self, archive):
    num_skipped = 0
    for fname, fobj in archive:
        res = tfds.image_classification.cats_vs_dogs._NAME_RE.match(fname)
        if not res:  # README file, ...
            continue
        label = res.group(1).lower()
        if tf.compat.as_bytes("JFIF") not in fobj.peek(10):
            num_skipped += 1
            continue

tfds.image_classification.cats_vs_dogs.CatsVsDogs._generate_examples = __generate_examples
data, metadata = tfds.load('cats_vs_dogs', as_supervised=True, with_info=True)
Regarding the last comment, I was getting the same issue. After some poking about, this looks like a problem with the method _generate_examples() on tensorflow_datasets.image_classification.cats_vs_dogs.CatsVsDogs. In that method, the following line…
new_fobj = zipfile.ZipFile(buffer).open(fname)
…is causing the exception. The problem is with fname. When the file is written into the in-memory ZipFile a few lines earlier, the stored path separator may differ from the one in the fname variable itself, which leads to the KeyError exception 'there is no item named some\path\or\other.ext in the archive'.
I managed to hack my way past it by replacing the _generate_examples method with one I generated on-the-fly that replaced the line above with…
new_fobj = zipfile.ZipFile(buffer).open(fname.replace('\\', '/'))
…but the fix that needs to be pulled into the repository would need to be a bit hardier than that.
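One sturdier variant, sketched here as an assumption and not taken from the TFDS repository, is to stop guessing the separator and instead ask the in-memory archive which name it actually stored. reopen_from_memory_zip is a hypothetical helper name; it uses only the standard-library io and zipfile modules.

```python
import io
import zipfile


def reopen_from_memory_zip(fname, payload):
    """Write `payload` into an in-memory zip under `fname` and reopen it.

    The member is reopened under the name the archive recorded, which may
    use a different path separator than `fname` on some platforms.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as new_zip:
        new_zip.writestr(fname, payload)

    archive = zipfile.ZipFile(buffer)
    # Exactly one member was written, so its stored name is the first entry.
    stored_name = archive.namelist()[0]
    return archive.open(stored_name)
```

Because the lookup uses the stored name, the same code works whether or not the zipfile module normalised the separators while writing.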
Thanks, that worked for me.