pipelines: kfp.compiler.Compiler.compile() fails under Bazel because kfp.TYPE_CHECK is not found

What steps did you take:

I use Bazel to compile the pipeline. This worked with kfp==1.0.0 but fails with kfp>=1.1.2.

Environment:

python==3.8.5 bazel==3.2.0

KFP SDK version: kfp==1.1.2 as well as kfp==1.3.0

Anything else you would like to add:

Traceback from kfp==1.1.2

.../pypi_kfp/kfp/compiler/compiler.py", line 917, in compile
    type_check_old_value = kfp.TYPE_CHECK
AttributeError: module 'kfp' has no attribute 'TYPE_CHECK'

I suspect that the pkgutil-style namespace packaging breaks the import (at least in Bazel): https://github.com/kubeflow/pipelines/blob/1.1.2/sdk/python/kfp/__init__.py#L17
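One way to check this suspicion is to list every kfp/__init__.py that is visible on sys.path inside the Bazel sandbox. The snippet below is only a minimal sketch: find_package_inits is a hypothetical helper written for this illustration and is not part of kfp or rules_python.

```python
import sys
from pathlib import Path


def find_package_inits(package, search_path=None):
    """Return every <entry>/<package>/__init__.py found on the search path.

    Hypothetical helper for illustration; defaults to sys.path.
    """
    entries = sys.path if search_path is None else search_path
    hits = []
    for entry in entries:
        # An empty sys.path entry means the current working directory.
        candidate = Path(entry or ".") / package / "__init__.py"
        if candidate.is_file() and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # More than one line of output means several directories claim to be
    # the regular package "kfp"; the first one on sys.path wins the import.
    for init_file in find_package_inits("kfp"):
        print(init_file)
```

If more than one file is listed, comparing the first entry with kfp.__file__ after import should show which copy Python actually loaded.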

/kind bug /area sdk

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 15 (4 by maintainers)

Most upvoted comments

I confirm that bazel added pip_parsed_deps_kfp_pipeline_spec/site-packages/kfp/__init__.py and removing it as @cristifalcas suggested works as a work-around.

Nonetheless, this workaround does not seem to conform to the documentation for namespace packages, since kfp_pipeline_spec does not have an __init__.py while kfp does?

This error comes from 2 things:

rules_python turns the kfp folder from kfp-pipeline-spec into a module. After that, Python tries to load ‘TYPE_CHECK’ from there, and the lookup fails.
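The shadowing described above can be reproduced without Bazel. The following is a sketch under stated assumptions: demo_ns, sdk_dist and spec_dist are hypothetical stand-ins for kfp, the kfp distribution and the kfp-pipeline-spec distribution, and the generated file is simulated as an empty __init__.py (what rules_python actually writes is not shown in this thread).

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "demo_ns"  # hypothetical stand-in for "kfp"


def build_layout(root):
    """Create two site-packages-like directories that both contain demo_ns/."""
    sdk = root / "sdk_dist"    # plays the role of the kfp distribution
    spec = root / "spec_dist"  # plays the role of kfp-pipeline-spec
    (sdk / PKG).mkdir(parents=True)
    (spec / PKG / "pipeline_spec").mkdir(parents=True)
    # The "real" __init__.py: pkgutil-style namespace line plus an attribute.
    (sdk / PKG / "__init__.py").write_text(
        "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"
        "TYPE_CHECK = True\n"
    )
    (spec / PKG / "pipeline_spec" / "__init__.py").write_text("")
    return sdk, spec


def import_fresh(search_path):
    """Import demo_ns from scratch with search_path prepended to sys.path."""
    for name in [m for m in sys.modules if m == PKG or m.startswith(PKG + ".")]:
        del sys.modules[name]
    importlib.invalidate_caches()
    saved = list(sys.path)
    sys.path[:0] = search_path
    try:
        return importlib.import_module(PKG)
    finally:
        sys.path[:] = saved


with tempfile.TemporaryDirectory() as tmp:
    sdk, spec = build_layout(Path(tmp))
    order = [str(spec), str(sdk)]  # the spec directory comes first on the path

    # No __init__.py in spec_dist/demo_ns yet: the regular package in sdk_dist
    # is the one that gets imported, so the attribute is present.
    has_attr_before = hasattr(import_fresh(order), "TYPE_CHECK")

    # Simulate the generated file (assumed empty here).
    (spec / PKG / "__init__.py").write_text("")
    has_attr_after = hasattr(import_fresh(order), "TYPE_CHECK")

print(has_attr_before, has_attr_after)  # prints: True False
```

Once the extra __init__.py exists in the directory that comes first on sys.path, that directory is imported as the whole package and the attribute defined in the other copy is never seen, which matches the AttributeError in the traceback.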

As a workaround, you need to delete the generated __init__.py file from the Bazel path.

Something like this before loading the kfp module:

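import os
import sys

# Remove the __init__.py that Bazel generated inside the kfp-pipeline-spec
# directory so that it no longer shadows the real kfp package.
for module_path in sys.path:
    if module_path.endswith("nameused_kfp_pipeline_spec"):
        init_file = os.path.join(module_path, "kfp", "__init__.py")
        if os.path.exists(init_file):
            os.remove(init_file)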

Replace the path with whatever name you gave your pip_parse/pip_install call in WORKSPACE.

Or open an issue with rules_python for a proper fix.

Even without rules_python, don't namespace packages need to include identical __init__.py files, per the Python doc I quoted above?
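For reference, here is a minimal sketch of the layout that the pkgutil-style convention describes, where every distribution ships the same __init__.py containing only the extend_path line. The names demo_shared_ns, dist_a, dist_b, alpha and beta are hypothetical and exist only for this illustration; the sketch does not settle whether kfp's deviation is what causes the Bazel failure.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The single line that pkgutil-style namespace packages put in __init__.py.
NAMESPACE_INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_portion(site_dir, namespace, module, body):
    """Create one distribution's share of the namespace package."""
    pkg_dir = Path(site_dir) / namespace
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(NAMESPACE_INIT)  # identical everywhere
    (pkg_dir / (module + ".py")).write_text(body)


with tempfile.TemporaryDirectory() as tmp:
    first, second = Path(tmp) / "dist_a", Path(tmp) / "dist_b"
    make_portion(first, "demo_shared_ns", "alpha", "VALUE = 'from dist_a'\n")
    make_portion(second, "demo_shared_ns", "beta", "VALUE = 'from dist_b'\n")

    saved = list(sys.path)
    sys.path[:0] = [str(first), str(second)]
    try:
        importlib.invalidate_caches()
        # extend_path merges both directories into the package's __path__,
        # so modules from either distribution can be imported.
        alpha = importlib.import_module("demo_shared_ns.alpha")
        beta = importlib.import_module("demo_shared_ns.beta")
    finally:
        sys.path[:] = saved

print(alpha.VALUE, "/", beta.VALUE)
```

Because both copies of __init__.py are identical, it does not matter which directory comes first on sys.path. That property is lost as soon as one copy carries extra content such as a module-level attribute.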

We were aware of the deviation from what’s suggested in the Python doc when we made the namespace package change. My memory of why we had to have this difference is hazy. But based on our research at that time and on experience in the field, the difference didn’t seem to cause any issues until this Bazel case, which hasn’t yet been proven to be the result of the __init__.py difference. That being said, @connor-mccarthy on our team is currently investigating this topic for a different scenario.

Would it be possible to merge pipeline spec into kfp, as suggested in https://github.com/kubeflow/pipelines/issues/5087#issuecomment-1040721497?

Likely not an option. Pipeline spec used to be part of the KFP package, and a copy was part of the TFX package. As a result, users could not install KFP and TFX in the same environment, because the proto-generated Python code in the two packages conflicted. Due to that issue, we made pipeline spec a standalone package that both KFP and TFX depend on, so TFX doesn’t need to take the entire KFP package as a dependency.

I use Bazel to work with Python, C++, etc., just as TensorFlow does. I understand that Google internally uses something similar to Bazel. What build tool would you recommend in open source if not Bazel? What does kfp use without Bazel?

For KFP, the SDK part is implicitly “built” through tests and packaged using setuptools. The rest (backend and frontend) is built into containers using Dockerfiles. I don’t have any recommendation for build tools in open source. Maybe you could ask this question in the KFP Slack channel?