tensorflow: Unable to access files on S3 with tf.io.gfile.GFile
System information
- Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS Mojave
- Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
- TensorFlow installed from (source or binary):
- TensorFlow version (use command below): 2.0.0
- Python version:
- Bazel version (if compiling from source):
- GCC/Compiler version (if compiling from source):
- CUDA/cuDNN version:
- GPU model and memory:
Describe the current behavior
Executing the following code:
import tensorflow as tf
with tf.io.gfile.GFile("s3://path/to/my/file", mode="r") as f:
    data = f.read()
results in the following error message:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 124, in read
length = self.size() - self.tell()
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 102, in size
return stat(self.__name).length
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 727, in stat
return stat_v2(filename)
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 744, in stat_v2
pywrap_tensorflow.Stat(compat.as_bytes(path), file_statistics)
tensorflow.python.framework.errors_impl.NotFoundError: Object s3://[REDACTED]/train.txt does not exist
Describe the expected behavior
The contents of the file should be read into the variable data.
Standalone code to reproduce the issue
This points to a public S3 file, but it still fails:
https://colab.research.google.com/drive/1VSlfzRPdFNSGGI8wd6RdhH9uFj7fkaFG
Other info / logs
2020-03-30 23:22:36.465054: I tensorflow/core/platform/s3/aws_logging.cc:54] Initializing config loader against fileName /Users/neelabh//.aws/config and using profilePrefix = 1
2020-03-30 23:22:36.465103: I tensorflow/core/platform/s3/aws_logging.cc:54] Initializing config loader against fileName /Users/neelabh//.aws/credentials and using profilePrefix = 0
2020-03-30 23:22:36.465129: I tensorflow/core/platform/s3/aws_logging.cc:54] Setting provider to read credentials from /Users/neelabh//.aws/credentials for credentials file and /Users/neelabh//.aws/config for the config file , for use with profile default
2020-03-30 23:22:36.465207: I tensorflow/core/platform/s3/aws_logging.cc:54] Creating AWSHttpResourceClient with max connections2 and scheme http
2020-03-30 23:22:36.465336: I tensorflow/core/platform/s3/aws_logging.cc:54] Initializing CurlHandleContainer with size 2
2020-03-30 23:22:36.465391: I tensorflow/core/platform/s3/aws_logging.cc:54] Creating Instance with default EC2MetadataClient and refresh rate 300000
2020-03-30 23:22:36.465427: I tensorflow/core/platform/s3/aws_logging.cc:54] Added EC2 metadata service credentials provider to the provider chain.
2020-03-30 23:22:36.465682: I tensorflow/core/platform/s3/aws_logging.cc:54] Successfully reloaded configuration.
2020-03-30 23:22:36.465889: I tensorflow/core/platform/s3/aws_logging.cc:54] Initializing CurlHandleContainer with size 25
2020-03-30 23:22:36.466559: I tensorflow/core/platform/s3/aws_logging.cc:54] Pool grown by 2
2020-03-30 23:22:36.466586: I tensorflow/core/platform/s3/aws_logging.cc:54] Connection has been released. Continuing.
2020-03-30 23:22:37.536377: E tensorflow/core/platform/s3/aws_logging.cc:60] HTTP response code: 301
Exception name:
Error message: No response body.
7 response headers:
content-type : application/xml
date : Mon, 30 Mar 2020 17:52:37 GMT
server : AmazonS3
transfer-encoding : chunked
x-amz-bucket-region : eu-north-1
x-amz-id-2 : 021phnKX0e6e9R+N9sMrXZHViGoHdzJrTT5rnyHyWsP8d9ErkPMZT02RbTZcjVeCVrI/3hDkWk8=
x-amz-request-id : 7DAC97A633CA370B
2020-03-30 23:22:37.536443: W tensorflow/core/platform/s3/aws_logging.cc:57] If the signature check failed. This could be because of a time skew. Attempting to adjust the signer.
2020-03-30 23:22:37.536681: I tensorflow/core/platform/s3/aws_logging.cc:54] Connection has been released. Continuing.
2020-03-30 23:22:37.818954: W tensorflow/core/platform/s3/aws_logging.cc:57] Encountered Unknown AWSError 'PermanentRedirect': The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
2020-03-30 23:22:37.819073: E tensorflow/core/platform/s3/aws_logging.cc:60] HTTP response code: 301
Exception name: PermanentRedirect
Error message: Unable to parse ExceptionName: PermanentRedirect Message: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.
7 response headers:
content-type : application/xml
date : Mon, 30 Mar 2020 17:52:37 GMT
server : AmazonS3
transfer-encoding : chunked
x-amz-bucket-region : eu-north-1
x-amz-id-2 : ob9TL15Q4Y7/idUzUTWvurB3Z4nxfVYRV2V+9ly88HrVGuHytuZA1U02rhcL0vFUpv83vUxeO9o=
x-amz-request-id : 51E3695245C81463
2020-03-30 23:22:37.819138: W tensorflow/core/platform/s3/aws_logging.cc:57] If the signature check failed. This could be because of a time skew. Attempting to adjust the signer.
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 124, in read
length = self.size() - self.tell()
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 102, in size
return stat(self.__name).length
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 727, in stat
return stat_v2(filename)
File "/Users/neelabh/opt/anaconda3/envs/tftrt/lib/python3.6/site-packages/tensorflow_core/python/lib/io/file_io.py", line 744, in stat_v2
pywrap_tensorflow.Stat(compat.as_bytes(path), file_statistics)
tensorflow.python.framework.errors_impl.NotFoundError: Object s3://[REDACTED]/train.txt does not exist
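Both 301 responses in the log above carry an x-amz-bucket-region header (eu-north-1) together with a PermanentRedirect error, which suggests the request reached an endpoint outside the bucket's region. A minimal sketch for pulling that header out of a captured log follows. The function name bucket_region_from_log is hypothetical, and the regex assumes the "name : value" header layout shown in the log.

```python
import re
from typing import Optional

# Matches header lines such as "x-amz-bucket-region : eu-north-1".
_REGION_HEADER = re.compile(r"x-amz-bucket-region\s*:\s*(\S+)")


def bucket_region_from_log(log_text: str) -> Optional[str]:
    """Return the first bucket region reported in the log, or None if absent."""
    match = _REGION_HEADER.search(log_text)
    return match.group(1) if match else None


# A few lines copied from the log in this issue.
sample = """HTTP response code: 301
server : AmazonS3
x-amz-bucket-region : eu-north-1
"""
print(bucket_region_from_log(sample))
```

The region reported here is the value a client would need to target when addressing the bucket.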
About this issue
- State: closed
- Created 4 years ago
- Reactions: 6
- Comments: 26 (3 by maintainers)
For my part it was an SSL problem whenever I ran TensorFlow on an EC2 instance (curl 77 errors in the logs). I temporarily fixed it by setting S3_VERIFY_SSL=0.

I fixed this issue by adding the AWS_REGION and S3_ENDPOINT environment variables. For example:
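The example from the original comment is not preserved here. As a minimal sketch of what such a setup could look like, the snippet below assumes the bucket region eu-north-1 reported in the x-amz-bucket-region header of the log and the standard regional endpoint naming s3.<region>.amazonaws.com. The helper name configure_s3_env is hypothetical, and it is safest to set the variables before importing TensorFlow.

```python
import os


def configure_s3_env(region: str, verify_ssl: bool = True) -> dict:
    """Set environment variables read by TensorFlow's S3 filesystem."""
    settings = {
        "AWS_REGION": region,
        # Assumption: standard AWS regional endpoint naming.
        "S3_ENDPOINT": f"s3.{region}.amazonaws.com",
    }
    if not verify_ssl:
        # Temporary workaround from the comment above; it disables
        # certificate verification, so avoid it outside of debugging.
        settings["S3_VERIFY_SSL"] = "0"
    os.environ.update(settings)
    return settings


configure_s3_env("eu-north-1")

# With the variables in place, the read from the issue can be retried:
# import tensorflow as tf
# with tf.io.gfile.GFile("s3://path/to/my/file", mode="r") as f:
#     data = f.read()
```

Whether this resolves the error depends on the TensorFlow build. At least one later comment reports that the problem persisted.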
Here is what I did:

I ran the following code in a tensorflow/tensorflow:nightly container. The TF version was 2.2.0-dev20200402. This produced the following error:

Now, when I install boto3 in the same container and use s3.Object().get(), I am able to download the file. It also works with wget and the HTTP link to the file.

Not fixed in my case.

tensorflow.python.framework.errors_impl.UnknownError: {{function_node __wrapped__ReadFile_device_/job:localhost/replica:0/task:0/device:CPU:0}} : curlCode: 77, Problem with the SSL CA cert (path? access rights?) [Op:ReadFile]
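For the boto3 route mentioned in a comment above, here is a minimal sketch of reading the same object without going through TensorFlow's S3 filesystem. The helper names split_s3_uri and read_s3_object are hypothetical. It assumes boto3 is installed and that valid AWS credentials are available.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def read_s3_object(uri: str, region: Optional[str] = None) -> bytes:
    """Download the whole object into memory using boto3."""
    import boto3  # imported lazily so split_s3_uri works without boto3

    bucket, key = split_s3_uri(uri)
    s3 = boto3.resource("s3", region_name=region)
    return s3.Object(bucket, key).get()["Body"].read()


print(split_s3_uri("s3://path/to/my/file"))
```

A call such as read_s3_object("s3://path/to/my/file", region="eu-north-1") returns the raw bytes, which can be decoded or passed to TensorFlow from memory.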