pipelines: Problem running BigQuery example from AI Hub
What steps did you take:
Hi, I am trying to execute a BigQuery to Google Cloud Storage example, which can be found in the Google AI Hub here:
https://aihub.cloud.google.com/p/products%2F4700cd7e-2826-4ce9-a1ad-33f4a5bf7433/v/1/downloadpage
What happened:
I downloaded the example from the AI Hub and imported the zip file using the Kubeflow UI. When creating a run using this pipeline, I get the following error message:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/local/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/ml/kfp_component/launcher/__main__.py", line 34, in <module>
main()
File "/ml/kfp_component/launcher/__main__.py", line 31, in main
launch(args.file_or_module, args.args)
File "kfp_component/launcher/launcher.py", line 45, in launch
return fire.Fire(module, command=args, name=module.__name__)
File "/usr/local/lib/python2.7/site-packages/fire/core.py", line 127, in Fire
component_trace = _Fire(component, args, context, name)
File "/usr/local/lib/python2.7/site-packages/fire/core.py", line 366, in _Fire
component, remaining_args)
File "/usr/local/lib/python2.7/site-packages/fire/core.py", line 542, in _CallCallable
result = fn(*varargs, **kwargs)
File "kfp_component/google/bigquery/_query.py", line 45, in query
client = bigquery.Client(project=project_id)
File "/usr/local/lib/python2.7/site-packages/google/cloud/bigquery/client.py", line 142, in __init__
project=project, credentials=credentials, _http=_http
File "/usr/local/lib/python2.7/site-packages/google/cloud/client.py", line 224, in __init__
Client.__init__(self, credentials=credentials, _http=_http)
File "/usr/local/lib/python2.7/site-packages/google/cloud/client.py", line 130, in __init__
credentials, _ = google.auth.default()
File "/usr/local/lib/python2.7/site-packages/google/auth/_default.py", line 305, in default
credentials, project_id = checker()
File "/usr/local/lib/python2.7/site-packages/google/auth/_default.py", line 165, in _get_explicit_environ_credentials
os.environ[environment_vars.CREDENTIALS])
File "/usr/local/lib/python2.7/site-packages/google/auth/_default.py", line 98, in _load_credentials_from_file
six.raise_from(new_exc, caught_exc)
File "/usr/local/lib/python2.7/site-packages/six.py", line 737, in raise_from
raise value
google.auth.exceptions.DefaultCredentialsError: ('File /secret/gcp-credentials/user-gcp-sa.json is not a valid json file.', ValueError('No JSON object could be decoded',))
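For context on where that path comes from: the traceback shows google.auth reading the key file named by GOOGLE_APPLICATION_CREDENTIALS, which the downloaded pipeline sets by mounting the user-gcp-sa secret onto each step. A minimal sketch of how that is typically wired up in KFP v1 pipelines (the op and image below are placeholders; only the secret name and mount path match the traceback):

import kfp.dsl as dsl
from kfp import gcp

@dsl.pipeline(name="bq-to-gcs-sketch")
def bq_to_gcs_pipeline():
    # Placeholder op standing in for the "Bigquery - Query" component of the example.
    query_op = dsl.ContainerOp(
        name="bigquery-query",
        image="gcr.io/ml-pipeline/ml-pipeline-gcp:latest",  # assumed image, not the example's exact tag
        command=["python", "-m", "kfp_component.launcher"],
    )
    # use_gcp_secret mounts the user-gcp-sa secret at /secret/gcp-credentials and points
    # GOOGLE_APPLICATION_CREDENTIALS at user-gcp-sa.json inside it -- the exact file the
    # traceback reports as "not a valid json file".
    query_op.apply(gcp.use_gcp_secret("user-gcp-sa"))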
What did you expect to happen:
I expected the pipeline run to succeed and generate a file in my Google Cloud Storage bucket, and for the generated secret to contain valid data.
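(Roughly, once credentials load, the failing query step is supposed to do something like the following with the google-cloud-bigquery client; the project, query and bucket names below are placeholders, not the example's real values.)

from google.cloud import bigquery

# Sketch of the BigQuery-to-GCS flow the component implements.
client = bigquery.Client(project="my-project")  # placeholder project

# Run the query; the results land in a (temporary) destination table.
job = client.query(
    "SELECT name FROM `bigquery-public-data.usa_names.usa_1910_current` LIMIT 10"
)
job.result()

# Export that table to a file in a GCS bucket (placeholder path).
extract_job = client.extract_table(job.destination, "gs://my-bucket/bq_output.csv")
extract_job.result()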
How did you deploy Kubeflow Pipelines (KFP)?
Kubeflow is deployed on GCP using the AI Platform Pipelines UI, which uses the installer from here:
Anything else you would like to add:
Looking into the Kubernetes secret, it seems that the definition is not correct, as the Data fields are empty:
kubectl describe secrets user-gcp-sa
Name: user-gcp-sa
Namespace: default
Labels: app=gcp-sa
app.kubernetes.io/name=kubeflow-pipelines
Annotations:
Type: Opaque
Data
====
application_default_credentials.json: 0 bytes
user-gcp-sa.json: 0 bytes
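If the mounted key really is empty, one workaround (an assumption about the setup, not something from the example) is to download a fresh service-account key, check that it is valid JSON, and recreate the secret with real contents, e.g. with the Kubernetes Python client:

import json

from google.oauth2 import service_account
from kubernetes import client, config

KEY_PATH = "user-gcp-sa.json"  # placeholder: path to a key downloaded from the GCP console

# Fail fast if the key is not valid JSON -- the same check google.auth runs inside the step.
with open(KEY_PATH) as f:
    key_json = f.read()
json.loads(key_json)
service_account.Credentials.from_service_account_file(KEY_PATH)

# Recreate the secret with non-empty data (delete the empty user-gcp-sa secret first).
config.load_kube_config()
core = client.CoreV1Api()
secret = client.V1Secret(
    metadata=client.V1ObjectMeta(name="user-gcp-sa", labels={"app": "gcp-sa"}),
    type="Opaque",
    string_data={"user-gcp-sa.json": key_json},
)
core.create_namespaced_secret(namespace="default", body=secret)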
Any hints on this would be greatly appreciated.
/kind bug
About this issue
- State: closed
- Created 4 years ago
- Comments: 17 (12 by maintainers)
@Bobgy thank you very much for your help. Finally, I was able to run this example successfully.
For anyone coming to this, here is what I did:
dsl-compile --py component.py --output ./output/pipeline.zip
I also needed to make sure the Compute Engine default service account has the following roles:
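On the compile step itself: dsl-compile is the CLI entry point of the KFP SDK compiler, so the same package can be produced from Python. A minimal sketch, assuming the @dsl.pipeline function defined in component.py is named pipeline (that name is an assumption; adjust it to whatever the example defines):

import kfp.compiler

# "pipeline" is assumed to be the @dsl.pipeline-decorated function in component.py.
from component import pipeline

kfp.compiler.Compiler().compile(pipeline, "output/pipeline.zip")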