spinnaker: Waiting executions don't follow FIFO
Issue Summary:
Waiting executions don't follow FIFO order.
Cloud Provider(s):
Environment:
Spinnaker 1.21.4, Spinnaker 1.25
Feature Area:
Orca queue
Description:
The queue of waiting executions should be FIFO, but its order changes when both of the following conditions are met:
- The pipeline has parallel branches. When one branch completes, the running execution waits for the other branches to complete.
- Concurrent executions are disabled for the pipeline, so pending executions wait in a queue while one execution is running.
We checked the Redis queue: 01F0WQV9NR65J9F9N2FWN8FRVK moved from the oldest position to the newest, and its attempts attribute was incremented by 1.
127.0.0.1:6379> LRANGE orca.pipeline.queue.8b4f6fbf-df5b-4ac9-9b0e-a63b3dd38c97 0 10
1) "{\"kind\":\"startExecution\",\"executionType\":\"PIPELINE\",\"executionId\":\"01F0WQVEY9XK52Y47W2DD1F91W\",\"application\":\"issuedebug\",\"attributes\":[{\"kind\":\"attempts\",\"attempts\":1}]}"
2) "{\"kind\":\"startExecution\",\"executionType\":\"PIPELINE\",\"executionId\":\"01F0WQVCB8EPRJAJNQ55VF3QTN\",\"application\":\"issuedebug\",\"attributes\":[{\"kind\":\"attempts\",\"attempts\":1}]}"
3) "{\"kind\":\"startExecution\",\"executionType\":\"PIPELINE\",\"executionId\":\"01F0WQV9NR65J9F9N2FWN8FRVK\",\"application\":\"issuedebug\",\"attributes\":[{\"kind\":\"attempts\",\"attempts\":1}]}"
127.0.0.1:6379>
127.0.0.1:6379>
127.0.0.1:6379> LRANGE orca.pipeline.queue.8b4f6fbf-df5b-4ac9-9b0e-a63b3dd38c97 0 10
1) "{\"kind\":\"startExecution\",\"executionType\":\"PIPELINE\",\"executionId\":\"01F0WQV9NR65J9F9N2FWN8FRVK\",\"application\":\"issuedebug\",\"attributes\":[{\"kind\":\"attempts\",\"attempts\":2}]}"
2) "{\"kind\":\"startExecution\",\"executionType\":\"PIPELINE\",\"executionId\":\"01F0WQVEY9XK52Y47W2DD1F91W\",\"application\":\"issuedebug\",\"attributes\":[{\"kind\":\"attempts\",\"attempts\":1}]}"
3) "{\"kind\":\"startExecution\",\"executionType\":\"PIPELINE\",\"executionId\":\"01F0WQVCB8EPRJAJNQ55VF3QTN\",\"application\":\"issuedebug\",\"attributes\":[{\"kind\":\"attempts\",\"attempts\":1}]}"
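The two dumps above can be compared mechanically. The following is a minimal sketch, assuming the execution IDs are ULIDs (so plain string comparison orders them by creation time) and that a healthy queue lists entries newest-first from index 0, as in the first dump. The helper names `entry` and `is_fifo` are hypothetical and not part of Orca.

```python
import json


def entry(execution_id, attempts):
    # Hypothetical helper: builds a queue entry shaped like the ones in the dumps.
    return json.dumps({
        "kind": "startExecution",
        "executionType": "PIPELINE",
        "executionId": execution_id,
        "application": "issuedebug",
        "attributes": [{"kind": "attempts", "attempts": attempts}],
    })


def is_fifo(raw_entries):
    """Return True if the entries run newest-to-oldest from index 0.

    Assumption: execution IDs are ULIDs, which sort lexicographically
    by creation time.
    """
    ids = [json.loads(raw)["executionId"] for raw in raw_entries]
    return ids == sorted(ids, reverse=True)


# First LRANGE output: the oldest execution sits at the tail.
before = [
    entry("01F0WQVEY9XK52Y47W2DD1F91W", 1),
    entry("01F0WQVCB8EPRJAJNQ55VF3QTN", 1),
    entry("01F0WQV9NR65J9F9N2FWN8FRVK", 1),
]

# Second LRANGE output: the oldest execution has moved to the head.
after = [
    entry("01F0WQV9NR65J9F9N2FWN8FRVK", 2),
    entry("01F0WQVEY9XK52Y47W2DD1F91W", 1),
    entry("01F0WQVCB8EPRJAJNQ55VF3QTN", 1),
]

print(is_fifo(before), is_fifo(after))
```

Feeding the function the strings returned by LRANGE instead of the hand-built lists would give the same check against a live queue.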
We checked the Redis messages. The "completeExecution" message of the currently running execution appears to be related to the reordering, but we have no idea why it pops the oldest waiting execution and pushes it back.
{"kind":"completeExecution","executionType":"PIPELINE","executionId":"01F0WQV6WFRXCS2JQ5C8X043WP","application":"issuedebug","attributes":[{"kind":"attempts","attempts":3}]}
{"kind":"runTask","executionType":"PIPELINE","executionId":"01F0WQV6WFRXCS2JQ5C8X043WP","application":"issuedebug","stageId":"01F0WQV6WTS33MK59K55FBEQ3S","taskId":"1","taskType":"com.netflix.spinnaker.orca.pipeline.tasks.WaitTask","attributes":[{"kind":"attempts","attempts":1}],"ackTimeoutMs":600000}
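The change between the two queue dumps is consistent with the oldest entry being popped from the tail of the list and pushed back at the head with its attempts counter incremented. The toy model below only illustrates that observed effect; it is not Orca's code, and `requeue_oldest` is a hypothetical name.

```python
from collections import deque


def requeue_oldest(queue):
    """Pop the oldest entry (tail) and push it back at the head with attempts + 1."""
    execution_id, attempts = queue.pop()
    queue.appendleft((execution_id, attempts + 1))


# Index 0 is the newest entry and the tail is the oldest, as in the first dump.
queue = deque([
    ("01F0WQVEY9XK52Y47W2DD1F91W", 1),
    ("01F0WQVCB8EPRJAJNQ55VF3QTN", 1),
    ("01F0WQV9NR65J9F9N2FWN8FRVK", 1),
])

requeue_oldest(queue)

for execution_id, attempts in queue:
    print(execution_id, attempts)
```

After one call the order and attempts values match the second dump, which is why the re-queued execution ends up behind executions that were started after it.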
We have been hitting this issue since 2019 (previous issue: https://github.com/spinnaker/spinnaker/issues/4587). It still exists in 1.21 and in the latest release, 1.25. We also tried MySQL as the queue backend, and it did not help.
Steps to Reproduce:
- Create a pipeline with two wait stages as two parallel branches. Set wait stage 1 to about 10 seconds and wait stage 2 to 300 seconds.
- Disable concurrent executions for this pipeline.
- Run the pipeline 5 times so that one execution is running and 4 are waiting. At the very beginning, the 4 waiting executions should be in FIFO order in the Redis queue (with attempts=1).
- When wait stage 1 completes while wait stage 2 is still running, a message with "kind":"completeExecution" appears among the Orca messages.
- Now monitor the 4 waiting executions in both the UI and the Redis queue; you can see the reordering.
Additional Details:
About this issue
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 31
Commits related to this issue
- fix(queue): treat waiting pipelines queue as FIFO if keepWaitingPipelines (#1677) — committed to spinnaker/orca by anotherchrisberry 7 years ago
I have created a PR to fix this issue. Can someone help review and approve it? https://github.com/spinnaker/orca/pull/4356
Hi Arjun, there are two workarounds to bypass the issue.
Both require redesigning the pipeline. We are still waiting for the fix.