tfx: TFX >= 1.4.0 fails with S3 as backend due to tensorflow-io not being imported
TFX >= 1.4.0 fails with S3 as the backend because tensorflow-io is not imported. Up to TensorFlow 2.5.*, the additional filesystems were part of TensorFlow itself, but from TF 2.6 they have been moved to tensorflow-io (tf-io). However, tensorflow-io isn’t imported in tfx/orchestration/kubeflow/container_entrypoint.py, and hence the S3 (and several other) filesystems can’t be used.
- Have I specified the code to reproduce the issue (Yes, No): No
- Environment in which the code is executed (e.g., Local(Linux/MacOS/Windows), Interactive Notebook, Google Cloud, etc): KubeFlow, Ubuntu image
- TensorFlow version: 2.7
- TFX Version: 1.5
- Python version: 3.7
- Python dependencies (from pip freeze output):
Describe the current behavior: TFX >= 1.4.0 fails with S3 as the backend because tensorflow-io is not imported.
Describe the expected behavior: the S3 filesystem should work.
Standalone code to reproduce the issue: any simple pipeline which uses S3 as the storage backend.
Other info / logs
INFO:absl:Going to run a new execution 27735
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/root/pyenv/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 476, in <module>
main(sys.argv[1:])
File "/root/pyenv/lib/python3.7/site-packages/tfx/orchestration/kubeflow/container_entrypoint.py", line 468, in main
execution_info = component_launcher.launch()
File "/root/pyenv/lib/python3.7/site-packages/tfx/orchestration/portable/launcher.py", line 524, in launch
execution_preparation_result = self._prepare_execution()
File "/root/pyenv/lib/python3.7/site-packages/tfx/orchestration/portable/launcher.py", line 384, in _prepare_execution
self._output_resolver.get_executor_output_uri(execution.id)),
File "/root/pyenv/lib/python3.7/site-packages/tfx/orchestration/portable/outputs_utils.py", line 169, in get_executor_output_uri
fileio.makedirs(execution_dir)
File "/root/pyenv/lib/python3.7/site-packages/tfx/dsl/io/fileio.py", line 80, in makedirs
_get_filesystem(path).makedirs(path)
File "/root/pyenv/lib/python3.7/site-packages/tfx/dsl/io/plugins/tensorflow_gfile.py", line 71, in makedirs
tf.io.gfile.makedirs(path)
File "/root/pyenv/lib/python3.7/site-packages/tensorflow/python/lib/io/file_io.py", line 515, in recursive_create_dir_v2
_pywrap_file_io.RecursivelyCreateDir(compat.path_to_bytes(path))
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 's3' not implemented (file: 's3://pipelines/tfx/trace_model_pipeline/TimeBasedExampleGen/.system/executor_execution/27735')
About this issue
- State: closed
- Created 3 years ago
- Reactions: 4
- Comments: 44 (22 by maintainers)
The root cause is https://github.com/tensorflow/tensorflow/issues/51583. TF dropped S3 / HDFS support as of 2.6, and I believe that all our packages are affected by this. We could support S3 by importing the tensorflow_io dependency in the repo.
This fix could potentially be included in the next release.
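As a rough sketch of the workaround discussed here (not the actual TFX patch), the idea is to import tensorflow_io purely for its side effect of registering the extra filesystem schemes before any tf.io.gfile call touches an s3:// path. The helper name register_extra_filesystems and its module_name parameter are made up for illustration, and the sketch assumes the tensorflow-io package is installed in the container image.

```python
import importlib


def register_extra_filesystems(module_name: str = "tensorflow_io") -> bool:
    """Import a module for its side effects and report whether it was found.

    Hypothetical helper: with TF >= 2.6, importing tensorflow_io is what makes
    schemes such as s3:// available to tf.io.gfile. Returns False instead of
    raising when the package is not installed, so the caller decides whether
    that is fatal.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Intended use, early in an entrypoint and before any s3:// path is opened:
#   if not register_extra_filesystems():
#       raise RuntimeError("tensorflow-io is required for s3:// paths")
```

Because the registration happens at import time, this has to run in every process that touches the remote filesystem, not only in the process that launches the pipeline.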
@varshaan Mostly great news: the issue in tf Transform seems to be resolved!
@jiyongjung0 Slightly worse news: a similar issue is still present in Evaluator (see log below). Can someone look at this ASAP? This is the same issue as in Transform before, so a simple import tensorflow_io will probably do the trick. This is the final component, so when this is fixed, TFX is officially S3 certified again.
@ConverJens - Could you try with a nightly post https://github.com/tensorflow/transform/commit/6f082654050dc6b49b8c3e2549445487c30f3c75 and let me know if that works?
@jiyongjung0 @varshaan Indeed, I used TFMA 0.37.0 and when upgrading to 0.38.0 it worked! Thank you very much for your time and effort! I consider this issue closed.
I’m still working on it. I’ll have something out by early next week.
@varshaan It did. Moreover, TFX 1.6 also works if one force-installs tensorflow 2.5.1, which is the last version where filesystem support was still a part of tensorflow.
@ConverJens Thank you for the explanation. I think that your insight is correct. But the use of the TF API might be hard to change because TF-Transform cannot depend on TFX.
It seems like Beam calls the transform libraries in a separate worker process and tensorflow_io is not imported in it. We might need to add an import (for example, similar to what TFX did) in TFT. (CC @varshaan)