data: running FSSpecFileLister in ikernel doesn't work
🐛 Describe the bug
Hi, this bug report follows the conversation on discuss.pytorch.org. When running the following code in a Jupyter kernel, fs.protocol is not consistent.
To reproduce, first update the url_to_fs call in the file /torchdata/datapipes/iter/load/fsspec.py:
fs, path = fsspec.core.url_to_fs(self.root, token='/Path/to/creds/credentials.json')
Then run the following code:
from torchdata.datapipes.iter import FSSpecFileLister
image_bucket = "gs://path/to/folder"
datapipe = FSSpecFileLister(root=image_bucket, masks=['*.png'])
file_dp = datapipe.open_file_by_fsspec(mode='rb')
list(file_dp)
When this code is run a second time without restarting the kernel, the URI is returned without the gs:// prefix and instead carries the full path of the environment.
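A quick way to confirm the symptom is to check every listed path for the expected protocol prefix. The following is only a minimal sketch: the helper names `has_protocol` and `find_stripped_paths` and the sample paths are hypothetical and are not part of torchdata or fsspec.

```python
from urllib.parse import urlsplit


def has_protocol(path: str, protocol: str = "gs") -> bool:
    """Return True if `path` carries the given URL scheme, e.g. gs://..."""
    return urlsplit(path).scheme == protocol


def find_stripped_paths(paths, protocol: str = "gs"):
    """Return the paths that have lost their protocol prefix."""
    return [p for p in paths if not has_protocol(p, protocol)]


# Illustrative values only: what a first and a second run might return.
first_run = ["gs://path/to/folder/a.png"]
second_run = ["/Users/me/project/path/to/folder/a.png"]

print(find_stripped_paths(first_run))
print(find_stripped_paths(second_run))
```

In a notebook, calling `find_stripped_paths(list(datapipe))` after each run would show whether the lister has started returning local-looking paths.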
Versions
Collecting environment information…
PyTorch version: 1.11.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 12.3.1 (x86_64)
GCC version: Could not collect
Clang version: 13.1.6 (clang-1316.0.21.2.3)
CMake version: Could not collect
Libc version: N/A
Python version: 3.9.13 (main, May 24 2022, 21:28:31) [Clang 13.1.6 (clang-1316.0.21.2)] (64-bit runtime)
Python platform: macOS-12.3.1-x86_64-i386-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] facenet-pytorch==2.5.2
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.22.4
[pip3] pytorch-ignite==0.4.9
[pip3] torch==1.11.0
[pip3] torchdata==0.3.0
[pip3] torchvision==0.12.0
About this issue
- Original URL
- State: open
- Created 2 years ago
- Comments: 21 (9 by maintainers)
For the token argument, we added kwargs to FSSpecFileLister. With TorchData 0.4.0 or nightly release, you should be able to add your token there: https://github.com/pytorch/data/blob/f1a128ec789f078852943e8c58377a99b42a7b45/torchdata/datapipes/iter/load/fsspec.py#L57
Based on the discussion on the forum, it seems that there are two issues.
Just want to confirm that you mean the process hangs forever, right?
3. Re-iterating over your pipeline would raise FileNotFoundError in an IPython kernel. But there won't be such a problem when running it as a script…
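As a rough illustration of the kwargs suggestion for the token argument, the credentials path could be passed when building the datapipe instead of patching fsspec.py. This is a sketch under assumptions, not a verified recipe: the function name `build_png_pipeline` is hypothetical, it assumes TorchData 0.4.0 or later forwards extra keyword arguments to fsspec, and it assumes the opener accepts the same `token` keyword.

```python
def build_png_pipeline(root: str, token_path: str):
    """Build a lister + opener pipeline that passes a GCS token via kwargs."""
    # Imported lazily so this sketch can be loaded without torchdata installed.
    from torchdata.datapipes.iter import FSSpecFileLister

    # Assumption: with TorchData >= 0.4.0, extra keyword arguments such as
    # `token` are forwarded to fsspec when the filesystem is created.
    lister = FSSpecFileLister(root=root, masks=["*.png"], token=token_path)

    # Assumption: the opener forwards `token` in the same way.
    return lister.open_file_by_fsspec(mode="rb", token=token_path)


# Example call (needs torchdata, gcsfs and valid credentials):
# file_dp = build_png_pipeline("gs://path/to/folder", "/Path/to/creds/credentials.json")
# list(file_dp)
```

If the opener in the installed release does not accept the keyword, dropping `token=` from the `open_file_by_fsspec` call and relying on default credentials would be the fallback.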