ingress-nginx: GLBC: Ingress can't be properly created: Insufficient Permission
I recently upgraded to Kubernetes 1.7 with RBAC on GKE, and I am seeing this problem:
FirstSeen  LastSeen  Count  From                     SubObjectPath  Type      Reason      Message
---------  --------  -----  ----                     -------------  --------  ------      -------
6h         6m        75     loadbalancer-controller                 Warning   GCE :Quota  googleapi: Error 403: Insufficient Permission, insufficientPermissions
I have double-checked my quotas, and they are all green.
I have also tried granting the Node service account Project > Editor
permissions, and I have added the Node service account to the cluster-admin
ClusterRole, just in case it had anything to do with that (which it should not, right?).
GKE Cluster logs (slightly redacted):
{
  insertId: "x"
  jsonPayload: {
    apiVersion: "v1"
    involvedObject: {
      apiVersion: "extensions"
      kind: "Ingress"
      name: "ingress-testing"
      namespace: "default"
      resourceVersion: "425826"
      uid: "x"
    }
    kind: "Event"
    message: "googleapi: Error 403: Insufficient Permission, insufficientPermissions"
    metadata: {
      creationTimestamp: "2017-07-15T12:54:37Z"
      name: "ingress-testing.x"
      namespace: "default"
      resourceVersion: "53520"
      selfLink: "/api/v1/namespaces/default/events/ingress-testing.14d1822c5ed30595"
      uid: "x"
    }
    reason: "GCE :Quota"
    source: {
      component: "loadbalancer-controller"
    }
    type: "Warning"
  }
  logName: "projects/x/logs/events"
  receiveTimestamp: "2017-07-15T19:11:59.117152623Z"
  resource: {
    labels: {
      cluster_name: "app-cluster"
      location: ""
      project_id: "x"
    }
    type: "gke_cluster"
  }
  severity: "WARNING"
  timestamp: "2017-07-15T19:11:54Z"
}
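Note that the event's reason reads "GCE :Quota" while its message is a 403 permission error, which may be why checking quotas turns up nothing. As a minimal sketch, assuming events have been exported as JSON objects with the same fields as the jsonPayload above, a check along these lines could separate permission failures from genuine quota problems. The function name and the sample dict are hypothetical and only mirror the fields shown in the log:

```python
def is_permission_error(event: dict) -> bool:
    """Return True for a loadbalancer-controller warning whose message is a 403."""
    component = event.get("source", {}).get("component", "")
    message = event.get("message", "")
    return (
        event.get("type") == "Warning"
        and component == "loadbalancer-controller"
        and "Error 403" in message
        and "insufficientPermissions" in message
    )


# Hypothetical sample built from the fields of the log entry above.
sample_event = {
    "kind": "Event",
    "type": "Warning",
    "reason": "GCE :Quota",
    "message": "googleapi: Error 403: Insufficient Permission, insufficientPermissions",
    "source": {"component": "loadbalancer-controller"},
}

# The reason field mentions quota, but the message points at permissions.
print(is_permission_error(sample_event))
```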
I have tried figuring out what the cause might be, but have not found anything that was applicable.
What can I do to get Ingress working again in my cluster?
Thanks!
About this issue
- State: closed
- Created 7 years ago
- Reactions: 8
- Comments: 60 (4 by maintainers)
This might just be the result of a failed upgrade from the issue @nicksardo referenced, but here is where my cluster was breaking down.
Apparently something went awry with the cluster upgrade and it removed my GKE instances from the k8s-ig-<guid> instance group, thus blocking all traffic, including health checks, to the cluster’s NodePorts and bringing everything offline.
Manually adding the GKE instances back via the command line seems to have resolved the issue.
NOTE: I could not select any instance via the web console interface.
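The comment does not say which command was used. One plausible form, assuming the k8s-ig-<guid> group is an unmanaged instance group, is `gcloud compute instance-groups unmanaged add-instances`. The sketch below only builds and prints that command so it can be reviewed before running; the helper name, zone, and node names are hypothetical placeholders, not values from this thread:

```python
import shlex


def build_add_instances_command(group: str, zone: str, instances: list[str]) -> list[str]:
    """Build the gcloud argv that adds instances to an unmanaged instance group."""
    if not instances:
        raise ValueError("at least one instance name is required")
    return [
        "gcloud", "compute", "instance-groups", "unmanaged", "add-instances",
        group,
        "--zone", zone,
        "--instances", ",".join(instances),
    ]


# Placeholder values: substitute your own group, zone, and node names.
cmd = build_add_instances_command(
    group="k8s-ig-EXAMPLE",
    zone="us-central1-a",
    instances=["gke-node-1", "gke-node-2"],
)

# Print the command for review instead of executing it.
print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to run anywhere; pass `cmd` to `subprocess.run` only after confirming the group and instance names against your project.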
Version 1.7.1 of GKE is now rolling out to Google Cloud Platform zones. It contains a fix for the problem described in this thread.
You can check the availability of the version in a certain zone with
I just checked and the issue has disappeared for us too.
I was able to recreate ingresses as well and they work as expected.
(Maybe some angel is watching this discussion 😄)
The planned rollout dates for version 1.7.1 are listed in the release notes: https://cloud.google.com/container-engine/release-notes#july_18_2017
@zquestz, fyi looks like google upgraded the master overnight w/ a custom build. Creating a new ingress appears to work as expected now.
I’ve faced that problem too. The only option I found is to recreate the cluster on 1.6.7. Took time but you know, it’s better than just “wait for a fix”… Awful experience.
Google should have mentioned the problem with ingresses on https://status.cloud.google.com/ or even pulled 1.7 from the upgrade choices.
Also, I agree with @mironov - Google should have pulled the 1.7.0 upgrade option until this was fixed, or labeled it as having a known issue. Even now, you can select it for new clusters and upgrade to it and there is no warning anywhere in the UI.
@iMelnik, nothing other than posting in this issue.
yea I wasted most of my day trying to figure out why my Load Balancers weren’t being created. We just started moving over to GKE today.
Great, I will contact support!
Is there anything I can do to avoid this on the production cluster, when I upgrade it to 1.7?
@nicksardo
Just following up on this, as of 2 hours ago, my ingress controllers started behaving normally again (?)
@zquestz No, I didn’t do anything except watch this thread. I don’t have a Premium Support plan either. I am using asia-east1.
@zquestz We have a support ticket in for the same issue, and our representative indicated that the GKE team was able to mitigate the problem for clusters at 1.7.0. They still haven’t actually performed the mitigation (might be getting a lot of tickets on this?), but perhaps what @icereval reported is the result of it.
@icereval Did you do anything special to reach this version? I am stuck on 1.7.0
@zquestz, the change will only get your existing Ingress back online. New Ingresses cannot be deployed until there is a software release that fixes this issue.
Google now controls the GLBC L7 controller via the master, so it is not easy to just roll back a version either.
In theory it should be possible to manually set up, or even fix, the existing ingress for your last site and point it at the NodePort for the service. A good reference for all the components is at https://github.com/kubernetes/ingress/tree/master/controllers/gce#l7-load-balancing-on-kubernetes
After a lot of back-and-forth with Google support, the issue is still unresolved and will not be resolved until tomorrow, when the support person comes back to work… Not very satisfied with my “gold level support” at this point, to be quite honest.