spinnaker: GCE: False Traffic guard errors while downscaling server groups

Issue Summary:

We are getting intermittent error in our pipeline execution, while performing deployments. After deploying server group when we try and disable/down scale the previous sever group it gives error.

Cloud Provider(s):

GCP We have spinnaker setup on GCE

Environment:

Ubuntu 16.04 xenial

Feature Area:

Orca, Clouddriver

Description:

Error Details

stackTrace: "java.lang.IllegalStateException: This cluster ('http' in deploy-my-project/asia-east1) has traffic guards enabled. Disabling http-v002 would leave the cluster with no instances taking traffic. at com.netflix.spinnaker.orca.clouddriver.utils.TrafficGuard.verifyOtherServerGroupsAreTakingTraffic(TrafficGuard.java:144) at com.netflix.spinnaker.orca.clouddriver.utils.TrafficGuard.verifyTrafficRemoval(TrafficGuard.java:105) at sun.reflect.GeneratedMethodAccessor548.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSite.invoke(PojoMetaMethodSite.java:192) at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56) at com.netflix.spinnaker.orca.clouddriver.tasks.servergroup.DisableServerGroupTask.validateClusterStatus(DisableServerGroupTask.groovy:43) at sun.reflect.GeneratedMethodAccessor573.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93) at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325) at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:384) at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022) at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:69) at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:174) at com.netflix.spinnaker.orca.clouddriver.tasks.servergroup.AbstractServerGroupTask$_execute_closure1.doCall(AbstractServerGroupTask.groovy:74) at com.netflix.spinnaker.orca.clouddriver.tasks.servergroup.AbstractServerGroupTask$_execute_closure1.doCall(AbstractServerGroupTask.groovy) at sun.reflect.GeneratedMethodAccessor539.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93) at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325) at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294) at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022) at groovy.lang.Closure.call(Closure.java:414) at org.codehaus.groovy.runtime.ConvertedClosure.invokeCustom(ConvertedClosure.java:54) at org.codehaus.groovy.runtime.ConversionHandler.invoke(ConversionHandler.java:124) at com.sun.proxy.$Proxy166.get(Unknown Source) at com.netflix.spinnaker.kork.core.RetrySupport.retry(RetrySupport.java:26) at sun.reflect.GeneratedMethodAccessor451.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoCachedMethodSiteNoUnwrap.invoke(PojoMetaMethodSite.java:213) at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56) at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:149) at com.netflix.spinnaker.orca.clouddriver.tasks.servergroup.AbstractServerGroupTask.execute(AbstractServerGroupTask.groovy:73) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler$handle$1$1.invoke(RunTaskHandler.kt:82) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler$handle$1$1.invoke(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.AuthenticationAwareKt$sam$Callable$7d29dd26.call(AuthenticationAware.kt) at com.netflix.spinnaker.security.AuthenticatedRequest.lambda$propagate$1(AuthenticatedRequest.java:79) at com.netflix.spinnaker.orca.q.handler.AuthenticationAware$DefaultImpls.withAuth(AuthenticationAware.kt:49) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.withAuth(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler$handle$1.invoke(RunTaskHandler.kt:81) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler$handle$1.invoke(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler$withTask$1.invoke(RunTaskHandler.kt:173) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler$withTask$1.invoke(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$withTask$1.invoke(OrcaMessageHandler.kt:47) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$withTask$1.invoke(OrcaMessageHandler.kt:31) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$withStage$1.invoke(OrcaMessageHandler.kt:57) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$withStage$1.invoke(OrcaMessageHandler.kt:31) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$DefaultImpls.withExecution(OrcaMessageHandler.kt:66) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.withExecution(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$DefaultImpls.withStage(OrcaMessageHandler.kt:53) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.withStage(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$DefaultImpls.withTask(OrcaMessageHandler.kt:40) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.withTask(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.withTask(RunTaskHandler.kt:166) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.handle(RunTaskHandler.kt:63) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.handle(RunTaskHandler.kt:51) at com.netflix.spinnaker.q.MessageHandler$DefaultImpls.invoke(MessageHandler.kt:36) at com.netflix.spinnaker.orca.q.handler.OrcaMessageHandler$DefaultImpls.invoke(OrcaMessageHandler.kt) at com.netflix.spinnaker.orca.q.handler.RunTaskHandler.invoke(RunTaskHandler.kt:51) at com.netflix.spinnaker.orca.q.audit.ExecutionTrackingMessageHandlerPostProcessor$ExecutionTrackingMessageHandlerProxy.invoke(ExecutionTrackingMessageHandlerPostProcessor.kt:47) at com.netflix.spinnaker.q.QueueProcessor$pollOnce$1$1.run(QueueProcessor.kt:74) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)" 

Steps to Reproduce:

This is an intermittent issue. 2 out of 5 execution we get this issue. We have configured traffic guard on all the server groups for our pipelines. After new server group is deployed and enabled for taking traffic and scaled up, we disable the old sever group followed by down scaling it.

Not sure what is going wrong and where.

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 15 (10 by maintainers)

Most upvoted comments

@ezimanyi We have been using spinnaker 1.8.4 from past few weeks we have not come across this issue as of now. I will close this issue. Will re open if we see this happening again. Thanks