training-operator: helm permission issue on 1.8.1
When I try to install the operator on a 1.8.1 cluster (GKE) like so:
helm install https://storage.googleapis.com/tf-on-k8s-dogfood-releases/latest/tf-job-operator-chart-latest.tgz -n tf-job --wait --replace --set cloud=gke
I get the error:
Error: release tf-job failed: namespaces "default" is forbidden: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "default": Unknown user "system:serviceaccount:kube-system:default"
This looks like an RBAC issue. Previously I was using K8s 1.7 so I guess something changed with 1.8 which is why I’m hitting this now.
@sozercan Any idea what’s going on? Is the problem that helm needs to be granted appropriate permissions, as mentioned here?
helm version
Client: &version.Version{SemVer:"v2.4.2", GitCommit:"82d8e9498d96535cc6787a6a9194a76161d29b4c", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.7.0", GitCommit:"08c1144f5eb3e3b636d9775617287cc26e53dba4", GitTreeState:"clean"}
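One way to confirm that this is an RBAC problem is to ask the API server directly whether the service account named in the error is allowed to perform the action. The sketch below wraps `kubectl auth can-i` with impersonation (`--as`). The function name `can_sa_do` and the `KUBECTL` override variable are hypothetical helpers introduced for illustration, and the check assumes your own user is allowed to impersonate service accounts.

```shell
#!/bin/sh
# KUBECTL is an illustrative override so the binary can be swapped out;
# it defaults to the kubectl found on PATH.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: ask whether a service account may perform a verb
# on a resource.
# Usage: can_sa_do <namespace> <serviceaccount> <verb> <resource>
can_sa_do() {
  "$KUBECTL" auth can-i "$3" "$4" --as="system:serviceaccount:$1:$2"
}
```

For the account in the error above, the call would be `can_sa_do kube-system default get namespaces`. A "no" answer would be consistent with Tiller running under a service account that has not been granted the permissions it needs.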
About this issue
- State: closed
- Created 7 years ago
- Comments: 18 (6 by maintainers)
@foxish what’s the proper way to set up helm on a GKE cluster running 1.8? Should it just work, or is it expected that I have to run commands like the following (from this post)?
Just run these commands and it’ll work:
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
helm init --service-account tiller --upgrade
The Tiller that is bundled with Azure includes a service account and role bindings (as cluster-admin). I am guessing this doesn’t come with GKE?
The TfJob CRD sets up its own service account and role bindings, so that shouldn’t be an issue. It sounds like this is a permissions problem for Tiller itself. Maybe we can update the docs to include something like this in case it doesn’t exist.
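If the docs were to cover the "in case it doesn’t exist" case, a minimal sketch might look like the following. It reuses the service account name (`tiller`) and binding name (`tiller-cluster-rule`) from the commands earlier in this thread and only creates each object when a `kubectl get` for it fails. The function name `ensure_tiller_rbac` and the `KUBECTL` override variable are hypothetical, and binding to `cluster-admin` is simply what the thread used, not a recommendation for every cluster.

```shell
#!/bin/sh
# KUBECTL is an illustrative override so the binary can be swapped out;
# it defaults to the kubectl found on PATH.
KUBECTL="${KUBECTL:-kubectl}"

# Hypothetical helper: create the Tiller service account and its
# cluster-admin binding only if they are not already present.
ensure_tiller_rbac() {
  # Create the service account when "get" reports it is missing.
  if ! "$KUBECTL" get serviceaccount tiller --namespace kube-system >/dev/null 2>&1; then
    "$KUBECTL" create serviceaccount tiller --namespace kube-system
  fi
  # Create the cluster role binding when "get" reports it is missing.
  if ! "$KUBECTL" get clusterrolebinding tiller-cluster-rule >/dev/null 2>&1; then
    "$KUBECTL" create clusterrolebinding tiller-cluster-rule \
      --clusterrole=cluster-admin \
      --serviceaccount=kube-system:tiller
  fi
}
```

After calling `ensure_tiller_rbac`, the `helm init --service-account tiller --upgrade` command quoted above would point Tiller at that account. On clusters where Tiller already ships with these objects, as described for Azure, the function should leave them untouched.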