tensorflow: Unable access S3 using the S3 filesystem


System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): docker container based on centos7
  • TensorFlow installed from (source or binary): binary, from pip
  • TensorFlow version (use command below): (‘v1.4.0-rc1-11-g130a514’, ‘1.4.0’)
  • Python version: 2.7.5
  • Bazel version (if compiling from source): n/a
  • GCC/Compiler version (if compiling from source): n/a
  • CUDA/cuDNN version: n/a
  • GPU model and memory: n/a
  • Exact command to reproduce:
from tensorflow.python.lib.io import file_io
file_io.stat('s3://datasets.elasticmapreduce/ngrams/books/20090715/eng-1M/1gram/data')

Describe the problem

I’m unable to read files using the s3 filesystem. I believe my example above should work, please correct my usage of the api if not. I tried using both an object in a public bucket and one in a private bucket that I have credentials in my ~/.aws/config for.

I’m not quite sure what the error is, or how to diagnose it further. Setting the log level to Debug had no effect. tf.logging.set_verbosity(tf.logging.DEBUG)

Source code / logs

Traceback from file_io.stat

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 540, in stat
    return file_statistics
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Object s3://datasets.elasticmapreduce/ngrams/books/20090715/eng-1M/1gram/data does not exist

Using the aws cli, you can verify the object exists

aws s3 ls s3://datasets.elasticmapreduce/ngrams/books/20090715/eng-1M/1gram/data

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

Added a PR #15493 for the S3 logging. Please take a look if interested.