pipelines: [Multi User] failed to call `kfp.Client().create_run_from_pipeline_func` in an in-cluster Jupyter notebook
What steps did you take:
In a multi-user enabled env, I created a notebook server in the user's namespace, launched a notebook, and tried to call the Python SDK from there. When I execute the code below:
pipeline = kfp.Client().create_run_from_pipeline_func(mnist_pipeline, arguments={}, namespace='mynamespace')
What happened:
The API call was rejected with the following errors:
~/.local/lib/python3.6/site-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
236
237 if not 200 <= r.status <= 299:
--> 238 raise ApiException(http_resp=r)
239
240 return r
ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'content-length': '19', 'content-type': 'text/plain', 'date': 'Tue, 01 Sep 2020 00:58:39 GMT', 'server': 'envoy', 'x-envoy-upstream-service-time': '8'})
HTTP response body: RBAC: access denied
What did you expect to happen:
A pipeline run should be created and executed
Environment:
How did you deploy Kubeflow Pipelines (KFP)?
I installed KFP on IKS with multi-user support.
KFP version: v1.1.0
KFP SDK version: v1.0.0
Anything else you would like to add:
/kind bug
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 32
- Comments: 128 (71 by maintainers)
Commits related to this issue
- downgrade to kubeflow 1.0.1 until https://github.com/kubeflow/pipelines/issues/4440 is resolved — committed to pachyderm/kfdata by lukemarsden 4 years ago
- Revert "downgrade to kubeflow 1.0.1 until https://github.com/kubeflow/pipelines/issues/4440 is resolved" This reverts commit 49d2170f7b1a6a3424ed48fb59c91a3740d53742. — committed to pachyderm/kfdata by lukemarsden 4 years ago
@Bobgy
I studied the envoy filter more and here is a better version: it directly uses the custom request header feature that http_connection_manager provides. Because the header name/value are fixed, there is no need to use a lua filter.

@Bobgy sure! The envoy filter injects the kubeflow-userid header from the notebook to the ml-pipeline service. In the example below, the notebook server's name is mynotebook and the userid for namespace mynamespace is user@example.com. The envoy filter above only injects the kubeflow-userid HTTP header for traffic going to the ml-pipeline service.

@ErikEngerd The AuthorizationPolicy and EnvoyFilter below work for me with k8s 1.18.19, kubeflow 1.3.0 and istio 1.9.5.

Function to create and run pipeline.
@Bobgy I thought about this issue again and I think this is where an istio envoy filter could be used without any application change. I added an envoy filter to add the kubeflow-userid header to HTTP traffic going to ml-pipeline.kubeflow.svc.cluster.local:8888, and then it works. So you actually don't need to do the authentication trick for the in-cluster use case; it's weird to me to perform authentication in the in-cluster scenario. The kubeflow-userid I inject in the HTTP header is the namespace owner's userid, which I think totally makes sense.

In conclusion, I added two config objects to make it work: they inject the kubeflow-userid header so that ml-pipeline-api-server can validate the incoming request. If these two configs could be created along with the notebook server, it would be perfect!
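The original YAML did not survive the copy here. A minimal sketch of such an EnvoyFilter, assuming the notebook pod carries the label notebook-name: mynotebook and the namespace owner is user@example.com (both values taken from the comments in this thread; field layout is my reconstruction for recent istio versions, verify against your istio release):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: add-kubeflow-userid        # hypothetical name
  namespace: mynamespace
spec:
  workloadSelector:
    labels:
      notebook-name: mynotebook    # only apply to this notebook pod's sidecar
  configPatches:
  - applyTo: VIRTUAL_HOST
    match:
      context: SIDECAR_OUTBOUND
      routeConfiguration:
        vhost:
          # only patch traffic going to the pipelines API service
          name: ml-pipeline.kubeflow.svc.cluster.local:8888
    patch:
      operation: MERGE
      value:
        request_headers_to_add:
        - append: true
          header:
            name: kubeflow-userid
            value: user@example.com   # the namespace owner's userid
```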
After the lib update it works fine; the exact code that works for me is the variant without anything really special:
So all in all for me it was:
many thanks to @grapefruitL
@arshashi The envoyfilter should be added to the user's namespace. You may check your notebook server pod and see if it has the label notebook-name: mynotebook; that will make sure the envoyfilter applies to the notebook server in the user's namespace. Also check the user's namespace and make sure the owner is brainyapps@example.com. In your case, no kubeflow-userid is injected; my guess is that the notebook-name: mynotebook label is wrong.

edit: another possibility is that in your kubeflow, the identity header name is not kubeflow-userid. You may double-check your kubeflow config.

Result:
Following the instructions in "Access Kubeflow Pipelines from inside your cluster" solved my access issue.
PodDefault from the docs
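The PodDefault approach from the documentation mounts a projected service-account token with the pipelines audience into the notebook pod and points the SDK at it via an environment variable. A sketch along those lines (names and the audience value reflect my reading of the Kubeflow docs and may differ across Kubeflow versions):

```yaml
apiVersion: kubeflow.org/v1alpha1
kind: PodDefault
metadata:
  name: access-ml-pipeline          # hypothetical name
  namespace: mynamespace            # the user's profile namespace
spec:
  desc: Allow access to Kubeflow Pipelines
  selector:
    matchLabels:
      access-ml-pipeline: "true"    # notebooks opting in carry this label
  env:
  # the KFP SDK reads the token from this path when constructing kfp.Client()
  - name: KF_PIPELINES_SA_TOKEN_PATH
    value: /var/run/secrets/kubeflow/pipelines/token
  volumeMounts:
  - mountPath: /var/run/secrets/kubeflow/pipelines
    name: volume-kf-pipeline-token
    readOnly: true
  volumes:
  - name: volume-kf-pipeline-token
    projected:
      sources:
      - serviceAccountToken:
          path: token
          expirationSeconds: 7200
          audience: pipelines.kubeflow.org
```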
Our official feature to support this use case is https://github.com/kubeflow/pipelines/issues/5138. The PR has been merged and released: https://github.com/kubeflow/pipelines/pull/5676. Let's keep tracking the task of finishing documentation in #5138, and close this issue.
@yhwang You are right. I fixed the above error based on this comment: https://github.com/kubeflow-kale/kale/issues/210#issuecomment-727018461. Thank you for your reply!
Well, I think now it’s working correctly:
@yhwang thank you. I think the error above is related to Kale, so I'll try to fix it on the Kale side.
@mr-yaky in your ServiceRoleBinding you should change source.principal: cluster.local/ns/anonymous/default-editor to source.principal: cluster.local/ns/anonymous/sa/default-editor. Can you try it?
thanks @Bobgy! Got it working now.
Also note the header value has to be formatted like accounts.google.com:<email>

@jonasdebeukelaer are you on GCP? The header should be 'x-goog-authenticated-user-email'
Another scenario (I guess this is more of what you want to support) is that we have the same users A and B and namespaces A and B, and namespace A is shared with User B. User B uses a notebook server in namespace B and tries to kfp.Client.create_run_from_pipeline_func() for namespace=A. The notebook server in namespace B (despite User B being the owner and namespace A being shared with User B) won't have access to pipelines in namespace A.

To fix the permission issue, user A should also invite namespace B's default-editor service account as a collaborator to namespace A. So conceptually, in this security model, a service account in the cluster has its own identity, different from the user, and access is managed by the identity sending the request. Arguably, Kubernetes has the same auth model: if you run a notebook in the cluster, you cannot use your own account's permissions; the notebook has its service account's permissions.
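A toy sketch (not KFP code) of the security model described above: authorization is decided by the identity that *sends* the request. A notebook pod sends as its service account, so sharing namespace A with user B is not enough — the namespace-B service account must be granted access too.

```python
# Toy model of "access is managed by the identity sending the request".
# All names (bindings shape, identities) are illustrative assumptions.
def can_access(bindings, identity, namespace):
    """bindings maps namespace -> set of identities allowed to use its pipelines."""
    return identity in bindings.get(namespace, set())

bindings = {
    "namespace-a": {
        "user-a@example.com",                                   # owner
        "user-b@example.com",                                   # namespace A shared with user B
        "system:serviceaccount:namespace-a:default-editor",     # A's own notebook SA
    },
}

# User B through the UI, authenticated as themselves: allowed.
assert can_access(bindings, "user-b@example.com", "namespace-a")

# User B's notebook in namespace B sends as its service account: denied,
# until user A also invites that service account as a collaborator.
assert not can_access(
    bindings, "system:serviceaccount:namespace-b:default-editor", "namespace-a")
```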
Coming a little late, but let me explain my current thoughts.
First, we don't want to ask users to configure istio for RBAC access; istio configs are really brittle and require a lot of knowledge to use and debug. Therefore, my ideal setup is like:

1. X-Forwarded-Client-Cert injected by istio sidecar: https://stackoverflow.com/a/58099997
2. X-Forwarded-Client-Cert should contain auth information like spiffe://cluster.local/ns/<namespace>/sa/<service account>.
3.1 If a request comes from the istio gateway (probably we can configure this), then the KFP api server interprets it as coming from users, reading the special header like kubeflow-userid.
3.2 If a request comes from other istio mTLS enabled sources, we know which service account initiated the request.
3.3 If a request doesn't have X-Forwarded-Client-Cert, it's not authed by istio; we may develop other ways for auth, like providing a service account token with pipeline audience.
4. Use SubjectAccessReview to test if the corresponding user/service account can access a certain resource representing KFP using Kubernetes RBAC.

With this setup, access to all KFP resources is backed by Kubernetes RBAC. If users have notebook servers in-cluster with an istio sidecar (mTLS enabled), they only need to grant K8s RBAC permissions to those servers' service accounts.
And we should provide an option to disable all of the authz checks, so if it’s not useful for an org, they can just disable it.
What are your thoughts? @yanniszark @yhwang
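Step 3.2 above hinges on extracting the SPIFFE identity from the X-Forwarded-Client-Cert (XFCC) header. A small sketch (an assumption for illustration, not KFP source) of what that parsing could look like:

```python
import re

# An XFCC element set by an istio sidecar typically looks like:
#   By=spiffe://cluster.local/ns/kubeflow/sa/ml-pipeline;URI=spiffe://cluster.local/ns/mynamespace/sa/default-editor
# The URI= pair carries the calling workload's SPIFFE identity.
def parse_xfcc_identity(xfcc_header):
    """Return (namespace, service_account) from the URI= SPIFFE id, or None."""
    match = re.search(r"URI=spiffe://[^/]+/ns/([^/;,]+)/sa/([^/;,]+)", xfcc_header)
    if match is None:
        return None  # request was not mTLS-authenticated by istio (case 3.3)
    return match.group(1), match.group(2)

xfcc = ("By=spiffe://cluster.local/ns/kubeflow/sa/ml-pipeline;"
        "URI=spiffe://cluster.local/ns/mynamespace/sa/default-editor")
print(parse_xfcc_identity(xfcc))  # ('mynamespace', 'default-editor')
```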
The following config is (finally) working for me. Note that my use case isn't for notebooks, but rather an in-cluster kfp client, so switching the listenerType from GATEWAY to SIDECAR_INBOUND was necessary so that the header gets added on in-cluster traffic as well.

Hi all! I will try to answer some of the questions that came up in this issue:
Istio RBAC is deprecated so I’m going to talk about Istio AuthorizationPolicy. Istio AuthorizationPolicy is a useful tool, but besides the obvious disadvantages (tied to Istio, harder to use, etc.), it doesn’t have the flexibility that the Pipelines model requires right now. Consider that in the current code:
By contrast, we use Kubernetes RBAC as an authorization database and perform whatever complex logic we want in the API server. And as an authorization database, Kubernetes RBAC makes much more sense. Does this answer your question @yhwang? Please tell me if something is not clear! cc @Bobgy
Please also take a look at the Kubeflow-wide guideline for authorization, which prescribes RBAC and SubjectAccessReview: https://github.com/kubeflow/community/blob/master/guidelines/auth.md
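To make the SubjectAccessReview idea concrete, here is a hypothetical sketch of the payload an API server could POST to the Kubernetes authorization API to check an identity against RBAC. The group/resource names below are illustrative assumptions, not necessarily the exact ones KFP uses:

```python
import json

def build_subject_access_review(identity, verb, namespace):
    """Build a SubjectAccessReview body asking: may `identity` do `verb`
    on pipelines in `namespace`? (resource group/name are assumptions)."""
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": {
            "user": identity,
            "resourceAttributes": {
                "group": "pipelines.kubeflow.org",   # assumed resource group
                "resource": "pipelines",             # assumed resource name
                "verb": verb,
                "namespace": namespace,
            },
        },
    }

sar = build_subject_access_review(
    "system:serviceaccount:mynamespace:default-editor", "create", "mynamespace")
print(json.dumps(sar, indent=2))
```

In a cluster this body would be submitted to the apiserver (e.g. via `kubectl create -f -` or a Kubernetes client), and `status.allowed` in the response decides the request.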
I want to make it clear that the config I see outlined here is NOT secure. The sidecar can impersonate ANY identity. The correct way to enable programmatic access is to:
@Bobgy but we don’t need to have a notion of a ServiceAccount that “represents” a namespace. All ServiceAccount identities will be able to prove themselves to the Pipelines API Server with the design outlined above.
@yhwang My notebook server name was incorrect and now it works for me with the above two changes. Thanks a lot for your time and suggestions; I was stuck on this issue for a long time.
Since the notebook server uses the serviceaccount default-editor in my user's namespace, I fixed the RBAC issue by adding a servicerolebinding to allow the serviceaccount to access ml-pipeline-service. However, the request is still rejected by ml-pipeline-api-server:

@rustam-ashurov-mcx try updating kfp to 1.8.1: pip install kfp --upgrade

Trying this out now on kubernetes 1.21.0 with kubeflow 1.3.0 and istio 1.9.0 (docker.io/istio/pilot:1.9.0).
Unfortunately, it seems that istio no longer supports rbac.istio.io/v1alpha1 and has replaced the ServiceRoleBinding object with something else (https://istio.io/latest/blog/2019/v1beta1-authorization-policy/). The instructions for migrating from v1alpha1 to v1beta1 are a bit complex. Do you have an example of an equivalent yaml file for an authorization policy that replaces the ServiceRoleBinding?
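For reference, a hedged sketch of what a v1beta1 AuthorizationPolicy roughly equivalent to the old ServiceRoleBinding might look like — untested; the selector labels and principal are illustrative, reusing the names mentioned elsewhere in this thread:

```yaml
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-notebook-to-ml-pipeline   # hypothetical name
  namespace: kubeflow
spec:
  selector:
    matchLabels:
      app: ml-pipeline                  # apply to the pipelines API workload
  rules:
  - from:
    - source:
        # the user namespace's notebook service account (mTLS principal)
        principals: ["cluster.local/ns/mynamespace/sa/default-editor"]
```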
For people tracking this issue, the correct solution will come from issue: https://github.com/kubeflow/pipelines/issues/5138
@kosehy my guess is you need to specify the namespace since it complains about empty namespace
@pedrocwb I have a similar question about 1.1.6 vs 1.3.1. For me the more relevant case is being able to authenticate from outside the cluster. Currently this requires passing the Cognito cookies. I have managed to get this to work with 1.1.6, but it actually looks like this currently doesn’t work with 1.3.1.
Even though I pass the correct cookies, I still get the Request header error: there is no user identity header error. I am going to spend some more time on this today and will give you feedback if I know a bit more.

For our clients the two most important KF components are KFP and KFServing. At the moment we can use KFP with 1.1.6, but not 1.3.1. And only a very old version of KFServing seems to be compatible with 1.1.6.
I got the same problem as @karlschriek. On further investigation, I discovered that KF v1.1 and above uses a very outdated istio version (1.1.6), so the EnvoyFilter @yhwang provided is not compatible with this version.
I tried to port the filter to be compatible with version 1.1.6 but it still doesn’t work.
Error:
EDIT: Very important to mention I’m on AWS.
This is what I used:
EDIT: fixed after @DavidSpek's comment below.
I would ping the appropriate WG that owns the config. I currently don’t have the bandwidth to work on this
That's right, and I'd rather consider that expected behavior. The scenario is based on the assumption that users A/B didn't authenticate as themselves when using the notebook server; therefore, they should only be able to access what the notebook server's service account can access.

User B will still have the choice to use the KFP SDK to connect to the cluster's public endpoint and use his user credentials for authentication. In that case, User B will have access to namespace B, but the notebook server is shared between users A and B, so user B needs to be aware that his credentials may be used by anyone else having access to namespace A (which should be avoided).
@Bobgy yes, what you describe is pretty much how I planned to use Istio mTLS for authentication (via the XFCC header). As for SubjectAccessReview, we plan on delivering it after Kubecon is over. We have also refactored the authentication code a bit in order to support multiple auth methods (we call them authenticators). We’ll be pushing that upstream as well, after SubjectAccessReview. Does this sound good?
@Bobgy
In the RBAC config I posted above, the notebook would be cluster.local/ns/mynamespace/sa/default-editor. So ml-pipeline-apiserver could use that and only allow that serviceaccount to access specific resources, for example pipelines created by mynamespace's owner.

Thanks @yhwang! The solution sounds secure and reasonable!
I guess the only concern I have is that other users sharing the namespace can act with the namespace owner's permissions.
So if we can make up a service account that represents this namespace and only grant it permission to access the same namespace, that would be ideal; I believe that is totally possible, at least for GCP.
/cc @yanniszark @IronPan what do you think about this approach to grant cluster workload access to KFP api?
How does this work for non-GCP clusters? I saw this issue where it was stated that auth is only possible using GCP IAP, and that someone using AWS should use the kfp client from within the cluster.