ec2-fleet-plugin: Planned capacity snapshot bug
During the setup of this plugin I ran into the following issue: no instances are spawned because the plugin thinks it already has enough capacity in the form of so-called ‘planned capacity snapshots’. I’m not quite sure what this is, and I can’t find any clear documentation on it. The logs give me the following statement:
currentDemand -1 availableCapacity 2 (availableExecutors 0 connectingExecutors 0 plannedCapacitySnapshot 2 additionalPlannedCapacity 0)
Current demand is -1, which is correct since one build is queued up and awaiting execution. But no instance will spawn so it never executes.
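To make the numbers in that log line easier to check, here is a minimal sketch that parses the statement and recomputes the totals. The sum of the four values in parentheses can be verified from the log itself. The relation `currentDemand = queue length - availableCapacity` is an assumption on my part; it is consistent with this one log line and with one queued build, but I have not confirmed it against the plugin source. The function names are hypothetical.

```python
import re

# The log statement quoted above, copied verbatim.
LOG_LINE = (
    "currentDemand -1 availableCapacity 2 "
    "(availableExecutors 0 connectingExecutors 0 "
    "plannedCapacitySnapshot 2 additionalPlannedCapacity 0)"
)

# The four components listed in parentheses in the log line.
COMPONENTS = (
    "availableExecutors",
    "connectingExecutors",
    "plannedCapacitySnapshot",
    "additionalPlannedCapacity",
)


def parse_capacity_log(line):
    """Return the 'name value' pairs of the log line as a dict of ints."""
    return {name: int(value) for name, value in re.findall(r"(\w+) (-?\d+)", line)}


def explain(fields):
    """Recompute available capacity and the queue length it implies."""
    available = sum(fields[name] for name in COMPONENTS)
    # ASSUMPTION: currentDemand = queue length - available capacity,
    # so the queue length is currentDemand + available capacity.
    implied_queue_length = fields["currentDemand"] + available
    return available, implied_queue_length


fields = parse_capacity_log(LOG_LINE)
available, queue_length = explain(fields)
print("available capacity:", available)
print("implied queue length:", queue_length)
```

Under that assumption, the two units of `plannedCapacitySnapshot` alone outweigh the single queued build, which would explain why the demand is negative and nothing is provisioned even though no executor is available or connecting.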
Some additional information: I previously had 2 on-demand instances running alongside the spot fleet so that builds would not break while the spot fleet was being set up. Perhaps this is related to the planned capacity snapshot? I’m not quite sure whether this is relevant, though.
Am I missing something here? Is there documentation on the different capacity types?
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 26
Fixes released in 1.16.2
Will keep it open, please post feedback if possible, thx
OK, I think I fixed both problems, the widget and the actual provisioning, in https://github.com/jenkinsci/ec2-fleet-plugin/pull/159. I will update here when it is merged.
Hi, we recently updated the plugin from version 1.11.1 to version 1.16.1.
Since this update we have encountered a lot of problems: it works for several hours, and then, when no jobs are running, the Fleets scale down to 0 (all our EC2 Fleets are configured with Min Cluster Size equal to 0).
At this point new jobs begin to queue up and all our Fleets stay at 0 indefinitely.
Yesterday I restarted my Jenkins slave; after the restart, the jobs which were queued restarted correctly. This morning our job queue was clean and everything was good.
But today at 12 am we started to see a big job queue again, and all our Spot Fleets stay at 0. The Spot Fleets do not scale up anymore.
I’m not 100% sure, but it seems several people have reported similar problems.
On our side we tried the following without success:
- No Delay Provision Strategy: checked or unchecked, the effect is the same; the Spot Fleet stays intact.
- Minimum Cluster Size changed from 0 to 1: no effect (and nothing appears in the Spot Fleet history on the AWS side).
The only thing that seems to make the Spot Fleet scale up again is a Jenkins restart. But after a few hours the problem is there again (it seems to appear after our Spot Fleet scales down to 0).
Today I finally managed to spot some strange exceptions in our logs.
Here is the configuration associated with the EC2 Fleet micro-1.1.0.
The other strange thing I see is that the EC2 Fleet Status panel cannot be opened.
I don’t know if it’s related, but perhaps it is also caused by a property which cannot be read somewhere.
Finally, just after a Jenkins restart I can also see the following exception in my logs.
This exception is related to the Github plugin, but I found it mentioned for the EC2 Plugin too, so I’m wondering if it could be linked: https://issues.jenkins-ci.org/browse/JENKINS-54041
I hope there is enough context to track down the problem. We absolutely need a fix for this because it impacts our whole team.
Thanks for your help.
Hi, after one day it seems to work, at least the scale-up is not locked as before. The Widget bug is also solved.
This is pretty much the same issue as in #149. There is something off with the capacity calculation. Our solution for now is to add a Jenkins restart every hour in crontab.