tensorflow: tf.io.gfile.glob missing some patterns. Using tf-nightly
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary):
- TensorFlow version (use command below): tf-nightly
- Python version:
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version:
- GPU model and memory:
Describe the current behavior
cc @Conchylicultor
Please have a look at TFDS issue tensorflow/datasets#1670. Tests are failing for the PlantVillage and The300wLp datasets because, in the _generate_example function of both plant_village.py and the300w_lp.py, tf.io.gfile.glob() does not match all of the example patterns. Python's glob does match them; see PR tensorflow/datasets#1684.
Describe the expected behavior
tf.io.gfile.glob() should match all of the provided patterns so that all required examples are generated.
Standalone code to reproduce the issue
Please have a look at this colab notebook. It contains all tracebacks, shows the problem with tf.io.gfile.glob(), and shows how Python's glob solves it.
Although glob fixes the issue, we have to use tf.io.gfile because we need to support GCS and other distributed file systems.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 21 (9 by maintainers)
@Eshan-Agarwal For the future, here is what a minimal reproducible example looks like:
This allows the team to easily understand what the issue is. They can just copy and paste the code and experiment with it. This saves many hours, as everyone working on the issue can get started immediately without having to go through the 10,000+ lines of TFDS code.
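A reproduction along those lines might look like the sketch below. It is not the snippet from the original comment. It assumes a hypothetical two-directory layout in which one directory name contains parentheses (the trigger reported elsewhere in this thread), creates zero-byte files, and compares Python's glob with tf.io.gfile.glob. The helper names make_empty_files and compare_globs are made up for illustration, and the TensorFlow comparison is skipped if TensorFlow is not installed.

```python
import glob
import os
import tempfile


def make_empty_files(root, rel_paths):
    """Create zero-byte files under root, making parent directories as needed."""
    for rel in rel_paths:
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()


def compare_globs(pattern):
    """Return (stdlib_matches, tf_matches); tf_matches is None without TensorFlow."""
    py_matches = sorted(glob.glob(pattern))
    try:
        import tensorflow as tf
    except ImportError:
        return py_matches, None
    return py_matches, sorted(tf.io.gfile.glob(pattern))


root = tempfile.mkdtemp()
# Hypothetical layout: one directory name contains parentheses.
make_empty_files(root, ["plain/a.jpg", "with (parens)/b.jpg"])

# Escape the temp root so only the trailing components act as wildcards.
pattern = os.path.join(glob.escape(root), "*", "*.jpg")
py_matches, tf_matches = compare_globs(pattern)
print("glob.glob:       ", py_matches)
print("tf.io.gfile.glob:", tf_matches)
```

Because the files are created by the script itself, the same code can be turned into a regression test by asserting that both lists contain the same two files.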
@mihaimaruseac The bug is that tf.io.gfile.glob fails when "(" characters are present in the path. This is a regression, as it only appears in TF nightly, not in TF 2.1. It makes some TFDS tests fail because some datasets rely on this glob pattern to generate the dataset.
I confirm this fixed our tests. Thank you very much!
TF 2.2.0-rc2 has been released and this issue should be fixed now.
@Eshan-Agarwal thank you for confirming.
@mihaimaruseac I believe this should be prioritised. This impacts not only TFDS but potentially every user of tf.io.gfile.glob. As the issue is silent, users may not even notice there is a bug. In our case we were lucky to have good unit tests. Note: The issue only happened externally. Internally, our tests work fine.
@Conchylicultor @mihaimaruseac thanks for your quick responses. I had uploaded a temp folder containing some examples, which you can download from here, but it is better to use the code provided by @Conchylicultor, which needs no external upload.
@Eshan-Agarwal the difference between the colab and the example template suggested is that we need to have the exact same setup for the colab, whereas the suggested template creates the files (with zero bytes) so it can be easily converted into a test case that now fails and after fixing will succeed.
But it’s ok, I’ll take care of this issue.
@Conchylicultor @mihaimaruseac please look at this colab notebook.
Yes, TFDS tests have started failing for patterns like:
tf.io.gfile.glob('/path/to/file/[!Code]*[!_Flip]/[!_]*.jpg')
or
tf.io.gfile.glob('/path/to/*.[jJ][pP][gG]')
@Eshan-Agarwal Could you provide a small self-contained code snippet to reproduce the issue?
Something like:
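For the two patterns quoted above, a self-contained snippet might look like the following sketch. It is not the snippet originally posted here. The directory and file names are hypothetical, chosen only to exercise each character class, and Python's glob is used as the reference because the issue reports that it returns the expected matches. Swapping glob.glob for tf.io.gfile.glob on an affected build is how one would check for a discrepancy.

```python
import glob
import os
import tempfile

root = tempfile.mkdtemp()

# Hypothetical file layout, chosen only to exercise the two patterns.
rel_paths = [
    "AFW/img.jpg",       # expected to match the first pattern
    "AFW/_skip.jpg",     # file name starts with "_", excluded by [!_]
    "AFW_Flip/img.jpg",  # directory ends in "p", excluded by [!_Flip]
    "Code/img.jpg",      # directory starts with "C", excluded by [!Code]
    "ext/a.jpg",         # directory starts with "e", excluded by [!Code]
    "ext/b.JPG",
    "ext/c.png",
]
for rel in rel_paths:
    path = os.path.join(root, *rel.split("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()  # zero-byte file


def relative_matches(pattern):
    """Glob under root and return sorted matches relative to root, using '/'."""
    full = os.path.join(glob.escape(root), *pattern.split("/"))
    hits = glob.glob(full)
    return sorted(os.path.relpath(h, root).replace(os.sep, "/") for h in hits)


# [!Code] and [!_Flip] each exclude single characters, not whole words.
nested = relative_matches("[!Code]*[!_Flip]/[!_]*.jpg")
# [jJ][pP][gG] accepts any capitalisation of the "jpg" extension.
extensions = relative_matches("ext/*.[jJ][pP][gG]")
print(nested)
print(extensions)
```

Note that [!Code] is a character class rather than a negated word: it rejects any name whose first character is C, o, d or e. On a case-sensitive file system, Python's glob should report only AFW/img.jpg for the first pattern, and ext/a.jpg and ext/b.JPG for the second.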