ingress-nginx: GLBC: Ingress can't be properly created: Insufficient Permission

I recently upgraded to Kubernetes 1.7 with RBAC on GKE, and I am seeing this problem:

  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  6h		6m		75	loadbalancer-controller			Warning		GCE :Quota	googleapi: Error 403: Insufficient Permission, insufficientPermissions

I have double-checked my quotas, and they are all green.
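
For reference, this is roughly how I checked them (project and region are placeholders):

$ gcloud compute project-info describe --project <project-id>   # per-project quota usage
$ gcloud compute regions describe <region>                      # per-region quota usage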

I have also tried granting the Node service account Project > Editor permissions, and I have added the Node service account to the cluster-admin ClusterRole, just in case it had anything to do with that (which it should not, right?).
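
Roughly what I ran for those grants (the service account and project IDs are placeholders):

  # Project > Editor on the node service account
  $ gcloud projects add-iam-policy-binding <project-id> \
      --member=serviceAccount:<node-sa>@<project-id>.iam.gserviceaccount.com \
      --role=roles/editor

  # cluster-admin binding for the same service account
  $ kubectl create clusterrolebinding node-sa-cluster-admin \
      --clusterrole=cluster-admin \
      --user=<node-sa>@<project-id>.iam.gserviceaccount.com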

GKE Cluster logs (slightly redacted):

{
 insertId:  "x"   
 jsonPayload: {
  apiVersion:  "v1"    
  involvedObject: {
   apiVersion:  "extensions"     
   kind:  "Ingress"     
   name:  "ingress-testing"     
   namespace:  "default"     
   resourceVersion:  "425826"     
   uid:  "x"     
  }
  kind:  "Event"    
  message:  "googleapi: Error 403: Insufficient Permission, insufficientPermissions"    
  metadata: {
   creationTimestamp:  "2017-07-15T12:54:37Z"     
   name:  "ingress-testing.x"     
   namespace:  "default"     
   resourceVersion:  "53520"     
   selfLink:  "/api/v1/namespaces/default/events/ingress-testing.14d1822c5ed30595"     
   uid:  "x"     
  }
  reason:  "GCE :Quota"    
  source: {
   component:  "loadbalancer-controller"     
  }
  type:  "Warning"    
 }
 logName:  "projects/x/logs/events"   
 receiveTimestamp:  "2017-07-15T19:11:59.117152623Z"   
 resource: {
  labels: {
   cluster_name:  "app-cluster"     
   location:  ""     
   project_id:  "x"     
  }
  type:  "gke_cluster"    
 }
 severity:  "WARNING"   
 timestamp:  "2017-07-15T19:11:54Z"   
}

I have tried to figure out what the cause might be, but have not found anything applicable.

What can I do to get Ingress working again in my cluster?

Thanks!

About this issue

  • Original URL
  • State: closed
  • Created 7 years ago
  • Reactions: 8
  • Comments: 60 (4 by maintainers)

Most upvoted comments

This might just be the result of a failed upgrade from the issue @nicksardo referenced, but here is where my cluster was breaking down.

Apparently something went awry with the cluster upgrade: it removed my GKE instances from the k8s-ig--<guid> instance group, blocking all traffic (including health checks) to the cluster’s NodePorts and taking everything offline.

Manually re-adding the GKE instances via the command line seems to have resolved the issue.

NOTE: I could not select any instance via the web console interface.

$ gcloud compute instance-groups unmanaged add-instances k8s-ig--<guid> --zone us-east1-d --instances=gke-cluster-1-default-pool-abc1234-wxyz
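
To see which instances are actually in the group before re-adding anything, something like this works (names are placeholders):

$ gcloud compute instance-groups unmanaged list-instances k8s-ig--<guid> --zone us-east1-d
$ gcloud compute instances list --filter="name~gke-cluster-1"   # compare against the cluster's nodes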

Version 1.7.1 of GKE is now rolling out to Google Cloud Platform zones. It contains a fix for the problem described in this thread.

You can check the availability of the version in a certain zone with

gcloud container get-server-config --zone us-central1-b
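
Once 1.7.1 is listed for your zone, the master can be upgraded with something along these lines (cluster name and zone are placeholders):

gcloud container clusters upgrade <cluster-name> --master --cluster-version 1.7.1 --zone us-central1-b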

I just checked, and the issue disappeared for us too.

I was able to recreate ingresses as well and they work as expected.

(Maybe some angel is watching this discussion 😄)

The planned rollout dates for version 1.7.1 are listed in the release notes: https://cloud.google.com/container-engine/release-notes#july_18_2017

@zquestz, FYI it looks like Google upgraded the master overnight with a custom build. Creating a new Ingress appears to work as expected now.


I’ve faced that problem too. The only option I found was to recreate the cluster on 1.6.7. It took time, but it’s better than just waiting for a fix… An awful experience.

Google should have mentioned the problem with Ingresses on https://status.cloud.google.com/, or even pulled 1.7 from the upgrade choices.

Also, I agree with @mironov - Google should have pulled the 1.7.0 upgrade option until this was fixed, or labeled it as having a known issue. Even now, you can select it for new clusters and upgrade to it and there is no warning anywhere in the UI.

@iMelnik, nothing other than posting in this issue.

Yeah, I wasted most of my day trying to figure out why my load balancers weren’t being created. We just started moving over to GKE today.

Great, I will contact support!

Is there anything I can do to avoid this on the production cluster when I upgrade it to 1.7?

@nicksardo

Just following up on this, as of 2 hours ago, my ingress controllers started behaving normally again (?)

LASTSEEN   FIRSTSEEN   COUNT   NAME               KIND      SUBOBJECT   TYPE     REASON    SOURCE                    MESSAGE
3m         2h          18      my-magic-ingress   Ingress               Normal   Service   loadbalancer-controller   default backend set to some-backend:32516

@zquestz No, I didn’t do anything except watching this thread. I don’t have Premium Support plan either. I am using asia-east1.

@zquestz We have a support ticket in for the same issue, and our representative indicated that the GKE team was able to mitigate the problem for clusters at 1.7.0. They still haven’t actually performed the mitigation (might be getting a lot of tickets on this?), but perhaps what @icereval reported is the result of it.

@icereval Did you do anything special to reach this version? I am stuck on 1.7.0

@zquestz, the change will only get your existing Ingresses back online; no new Ingress can be deployed until there is a software release to fix this issue.

Google controls the GLBC L7 controller via the master now, so it is not easy to just roll back a version either.

In theory it should be possible to manually set up or even fix the existing ingress for your last site and point it at the NodePort for the service. A good reference for all the components is https://github.com/kubernetes/ingress/tree/master/controllers/gce#l7-load-balancing-on-kubernetes
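
If someone wants to try that, here is a rough sketch of manually wiring a GCE HTTP load balancer to a service’s NodePort; every name, port, path, and zone below is a placeholder, not necessarily what the controller itself would create:

# 1. Health check against the service's NodePort
$ gcloud compute health-checks create http manual-hc --port 32516 --request-path /healthz

# 2. Name the NodePort on the cluster's instance group and build a backend service on it
$ gcloud compute instance-groups unmanaged set-named-ports k8s-ig--<guid> \
    --named-ports manual-port:32516 --zone us-east1-d
$ gcloud compute backend-services create manual-bes --protocol HTTP \
    --port-name manual-port --health-checks manual-hc --global
$ gcloud compute backend-services add-backend manual-bes \
    --instance-group k8s-ig--<guid> --instance-group-zone us-east1-d --global

# 3. URL map, HTTP proxy, and global forwarding rule to get a public IP
$ gcloud compute url-maps create manual-um --default-service manual-bes
$ gcloud compute target-http-proxies create manual-proxy --url-map manual-um
$ gcloud compute forwarding-rules create manual-fr --global \
    --target-http-proxy manual-proxy --ports 80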

After a lot of back-and-forth with Google support, the issue is still unresolved and will not be resolved until tomorrow, when the support person comes back to work… Not very satisfied with my “gold level support” at this point, to be quite honest.