spinnaker: Error fetching new jobs from Travis and Travis trigger stops working after upgrading to >= 1.16

Issue Summary:

Travis pipeline trigger stopped working for us after upgrading to 1.16.

Cloud Provider(s):

N/A

Environment:

Pipeline runs are triggered after Travis (travis-ci.com) builds finish.

Feature Area:

Travis service of Igor

Description:

Travis pipeline triggers stopped working for us after upgrading to >= 1.16. We tried 1.17 and 1.18; both have the same issue. We used to keep the Travis config in igor-local.yaml, but the issue persists after re-applying the config with Halyard. The Igor logs show this error continuously:

2020-02-18 17:57:40.957  WARN 1 --- [ix-travis-ci-10] c.n.s.igor.travis.service.TravisService  : An error occurred while fetching new jobs from Travis.

retrofit.RetrofitError: 500 Internal Server Error
	at retrofit.RetrofitError.httpError(RetrofitError.java:40) ~[retrofit-1.9.0.jar:na]
	at retrofit.RestAdapter$RestHandler.invokeRequest(RestAdapter.java:388) ~[retrofit-1.9.0.jar:na]
	at retrofit.RestAdapter$RestHandler.invoke(RestAdapter.java:240) ~[retrofit-1.9.0.jar:na]
	at com.sun.proxy.$Proxy160.jobs(Unknown Source) ~[na:na]
	at com.netflix.spinnaker.igor.travis.service.TravisService.lambda$null$6(TravisService.java:317) ~[igor-web.jar:na]
	at java.base/java.util.stream.IntPipeline$1$1.accept(IntPipeline.java:180) ~[na:na]
	at java.base/java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:108) ~[na:na]
	at java.base/java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:699) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[na:na]
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:na]
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[na:na]
	at com.netflix.spinnaker.igor.travis.service.TravisService.lambda$getJobs$8(TravisService.java:327) ~[igor-web.jar:na]
	at com.netflix.spinnaker.hystrix.SimpleJava8HystrixCommand.run(SimpleJava8HystrixCommand.java:52) ~[kork-hystrix-7.5.1.jar:7.5.1]
	at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:302) ~[hystrix-core-1.5.18.jar:1.5.18]
	at com.netflix.hystrix.HystrixCommand$2.call(HystrixCommand.java:298) ~[hystrix-core-1.5.18.jar:1.5.18]
	at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:46) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:51) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeDefer.call(OnSubscribeDefer.java:35) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:41) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeDoOnEach.call(OnSubscribeDoOnEach.java:30) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:48) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OnSubscribeLift.call(OnSubscribeLift.java:30) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.Observable.unsafeSubscribe(Observable.java:10327) ~[rxjava-1.3.8.jar:1.3.8]
	at rx.internal.operators.OperatorSubscribeOn$SubscribeOnSubscriber.call(OperatorSubscribeOn.java:100) ~[rxjava-1.3.8.jar:1.3.8]
	at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:56) ~[hystrix-core-1.5.18.jar:1.5.18]
	at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction$1.call(HystrixContexSchedulerAction.java:47) ~[hystrix-core-1.5.18.jar:1.5.18]
	at com.netflix.hystrix.strategy.concurrency.HystrixContexSchedulerAction.call(HystrixContexSchedulerAction.java:69) ~[hystrix-core-1.5.18.jar:1.5.18]
	at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:55) ~[rxjava-1.3.8.jar:1.3.8]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[na:na]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[na:na]
	at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]

Steps to Reproduce:

Downgrading to 1.15 fixes the Travis trigger. The issue returns after upgrading to >= 1.16 again.

Additional Details:

About this issue

State: closed
Created 4 years ago
Reactions: 1
Comments: 27

Commits related to this issue

docs(travis): Use travis-ci.com rather than travis-ci.org https://github.com/spinnaker/spinnaker/issues/5459 — committed to jervi/igor by jervi 4 years ago
docs(travis): Change public Travis url from org to com https://github.com/spinnaker/spinnaker/issues/5459 https://blog.travis-ci.com/2018-05-02-open-source-projects-on-travis-ci-com-with-github-apps — committed to jervi/halyard by jervi 4 years ago
docs(travis): Use travis-ci.com rather than travis-ci.org (#642) https://github.com/spinnaker/spinnaker/issues/5459 Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com> — committed to spinnaker/igor by jervi 4 years ago

Most upvoted comments

I’ll have a look 👍

jervi on Feb 26, 2020

I asked our support guy from Travis in Slack, and I just got confirmation they have fixed it at their end, so I’m gonna close this issue. Please re-open if you still have issues!

jervi on Apr 22, 2020

I got the following response from Travis support:

We are working on this issue. We will let you know asap. Thanks.

Hopefully they’ll have this fixed on their end soon.

jervi on Mar 16, 2020

@jervi Hi, I got this confirmation back from Travis support: “In summary, the issue is that there are a bunch of complex database queries, which crash the entire system when left to run beyond a specific time frame thus making the entire system unusable for everyone. We have explored options to optimize this, however, there are so many interconnected parts that it will require significant engineering work to improve this as it is.” Looks like we should just remove the state param to work around this?

CB-GuangyaoXie on Feb 28, 2020

Thanks! I tried a couple of times. None of the permutations of the state param works. I also submitted a ticket to Travis asking about the error.

CB-GuangyaoXie on Feb 27, 2020
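
For anyone who wants to check whether the 500 comes from Travis itself rather than from Igor, here is a minimal sketch of the comparison discussed in the comments above: the same jobs query built with and without the state filter. Everything in it is an assumption made for illustration, not Igor's actual code: the `https://api.travis-ci.com` host, the `/jobs` path, the comma-separated `state` value, and the helper names `build_jobs_url`, `build_headers` and `fetch_status`. It assumes the Travis API v3 header conventions (`Travis-API-Version: 3` and `Authorization: token ...`).

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: public API host for travis-ci.com; adjust for an enterprise install.
API_BASE = "https://api.travis-ci.com"


def build_jobs_url(states=None, limit=100, offset=0):
    """Build the jobs query URL, adding the state filter only when given."""
    params = {"limit": limit, "offset": offset}
    if states:
        # Assumption: multiple states are passed as one comma-separated value.
        params["state"] = ",".join(states)
    return f"{API_BASE}/jobs?{urlencode(params)}"


def build_headers(token):
    """Headers for a Travis API v3 request authenticated with an API token."""
    return {
        "Travis-API-Version": "3",
        "Authorization": f"token {token}",
    }


def fetch_status(url, token):
    """Send the request and return the HTTP status code (performs network I/O)."""
    request = Request(url, headers=build_headers(token))
    try:
        with urlopen(request) as response:
            return response.status
    except HTTPError as err:
        # A 500 here would match the RetrofitError seen in the Igor logs.
        return err.code
```

Calling `fetch_status(build_jobs_url(states=["passed", "failed"]), token)` and `fetch_status(build_jobs_url(), token)` with a real token and comparing the two status codes would show whether the state filter is what triggers the error.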