tensorflow: OutOfRangeError Unknown Error when extracting particular zip file
Please be aware this issue was originally posted in tensorflow/datasets but I got directed here as it seems that the issue is related to the GFile implementation: https://github.com/tensorflow/datasets/issues/1337
Short description
When the download of imagenet_resized finishes and tfds starts extracting/writing records, the program crashes.
You can reproduce this error by downloading the particular zip file manually and extracting it with tensorflow:
http://www.image-net.org/image/downsample/Imagenet32_train_npz.zip
Environment information
- Operating System: Windows 10
- Python version: 3.7
- tensorflow-datasets version: 1.3.2
- tensorflow-gpu version: 2.0.0
Reproduction instructions
Without TFDS:
import zipfile
import tensorflow.compat.v2 as tf

path = 'path/to/file.zip'
# Open the archive through GFile and walk its members.
with tf.io.gfile.GFile(path, 'rb') as fobj:
    z = zipfile.ZipFile(fobj)
    for member in z.infolist():
        extract_file = z.open(member)
        print(member.filename)
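One way to check whether the failure is specific to GFile is to run the same loop with Python's built-in open(). This is a minimal sketch, not a confirmed fix: the helper name list_zip_members is hypothetical, and it assumes the archive is on the local filesystem, where GFile is not needed to read it.

```python
import zipfile


def list_zip_members(path):
    """Return member names of a local zip, read via built-in open().

    Hypothetical helper for comparison with the GFile-based loop; it
    assumes `path` points to a file on the local filesystem.
    """
    names = []
    with open(path, 'rb') as fobj:
        with zipfile.ZipFile(fobj) as z:
            for member in z.infolist():
                # Open each member as in the reproduction, then close it.
                with z.open(member):
                    names.append(member.filename)
    return names
```

If this succeeds on the same path where the GFile version raises OutOfRangeError, that would point to the GFile code path rather than to the archive itself.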
With TFDS:
import tensorflow_datasets as tfds
imagenet_data, info = tfds.load(name="imagenet_resized/32x32", with_info=True, as_supervised=True)
Link to logs
Dl Size...: 100%|██████████| 3414/3414 [22:47<00:00, 2.60 MiB/s]
0 examples [00:00, ? examples/s]Traceback (most recent call last):
File "C:\Program Files\Python37\lib\contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\core\file_format_adapter.py", line 199, in incomplete_dir
yield tmp_dir
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\core\dataset_builder.py", line 333, in download_and_prepare
download_config=download_config)
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1008, in _download_and_prepare
max_examples_per_split=download_config.max_examples_per_split,
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\core\dataset_builder.py", line 871, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1033, in _prepare_split
total=split_info.num_examples, leave=False):
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tqdm\_tqdm.py", line 1005, in __iter__
for obj in iterable:
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\image\imagenet_resized.py", line 141, in _generate_examples
for fname, fobj in archive:
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_datasets\core\download\extractor.py", line 179, in iter_zip
z = zipfile.ZipFile(fobj)
File "C:\Program Files\Python37\lib\zipfile.py", line 1225, in __init__
self._RealGetContents()
File "C:\Program Files\Python37\lib\zipfile.py", line 1288, in _RealGetContents
endrec = _EndRecData(fp)
File "C:\Program Files\Python37\lib\zipfile.py", line 259, in _EndRecData
fpin.seek(0, 2)
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\lib\io\file_io.py", line 167, in seek
offset += self.size()
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\lib\io\file_io.py", line 102, in size
return stat(self.__name).length
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\lib\io\file_io.py", line 727, in stat
return stat_v2(filename)
File "C:\Users\[username]\AppData\Roaming\Python\Python37\site-packages\tensorflow_core\python\lib\io\file_io.py", line 744, in stat_v2
pywrap_tensorflow.Stat(compat.as_bytes(path), file_statistics)
tensorflow.python.framework.errors_impl.OutOfRangeError: C:\Users\[username]\tensorflow_datasets\downloads\image-net.org_image_downs_Image_train_npzlCJjN-zBsDCdn80BZxJ6qtyTFYcDX7y1OSUjXtuuxPw.zip; Unknown error
Process finished with exit code 1
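Reading the traceback from the bottom up: zipfile calls fpin.seek(0, 2) to find the end of the archive, the GFile seek resolves that end-relative offset through self.size(), and size() calls stat, which is where OutOfRangeError is raised. The sketch below imitates that chain using only the standard library. StatSeekFile, StatFailure and names_via_stat_seek are hypothetical names for illustration and are not part of TensorFlow; the stat function is injectable so that a failure can be simulated.

```python
import os
import zipfile


class StatFailure(Exception):
    """Hypothetical stand-in for the error raised by stat in the traceback.

    Deliberately not an OSError, so that it propagates out of ZipFile
    unchanged, like the TensorFlow error in the traceback.
    """


class StatSeekFile:
    """Hypothetical wrapper whose end-relative seek depends on a stat call."""

    def __init__(self, path, stat_fn=os.stat):
        self._path = path
        self._stat_fn = stat_fn
        self._f = open(path, 'rb')

    def seek(self, offset, whence=0):
        if whence == 2:
            # Mirrors the traceback: an end-relative seek needs the file
            # size, and the size comes from stat-ing the path.
            offset += self._stat_fn(self._path).st_size
            whence = 0
        return self._f.seek(offset, whence)

    def __getattr__(self, name):
        # Delegate read(), tell(), close(), etc. to the real file object.
        return getattr(self._f, name)


def names_via_stat_seek(path, stat_fn=os.stat):
    """List zip members through the wrapper, always closing the file."""
    fobj = StatSeekFile(path, stat_fn)
    try:
        return zipfile.ZipFile(fobj).namelist()
    finally:
        fobj.close()
```

With a working stat function the archive opens normally. With one that raises, the exception surfaces from the ZipFile constructor before any member is read, which matches where the traceback above fails.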
Expected behavior
No error
About this issue
- State: closed
- Created 4 years ago
- Comments: 20 (5 by maintainers)
Changed the file extractor.py in line 131
into
which becomes
and everything works by using
Chiming in because I am receiving this same OutOfRangeError on Windows 10 with:
The offending file is tensorflow_datasets\downloads\images.cocodataset.org_zips_train2017aai7WOpfj5nSSHXyFBbeLp3tMXjpA_H3YD4oO54G2Sk.zip.
I can provide the full traceback if you’d like, but it’s the same as the one above and I’d rather not spam the issue.