ec2-fleet-plugin: Plugin does not scale up when needed
I am currently using version 1.9.2.
I have 2 cases where Scaling up does not go into effect.
- The Jenkins plugin does not scale up automatically when I set the minimum cluster size to 2 while starting out with 0 nodes/instances.
- To get around this I had to set the target to 2 in AWS, or run enough builds that the Jenkins plugin scales up to 2 nodes.
- Once I have at least 2 nodes running (the minimum for the cluster), if I run enough jobs where the executors for the current nodes fill up and a queue begins to form, there is no scale up.
My current configuration is:
- Minimum Cluster Size: 2
- Maximum Cluster Size: 20
- Number of Executors: 5
- Max Idle Minutes Before Scaledown: 2
- Connect Using Private IP: true
- Maximum Init Connection Timeout in sec: 60
Checking AWS CloudTrail, the Jenkins logs, or the Spot Request history, there is never any information about scaling nodes up.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Reactions: 3
- Comments: 25
To all,
Since we have so many questions around "plugin doesn't scale up", I think it's time to explain a little about how it works, and to ask the big question about that.
Intro
The current implementation of the plugin fully depends on the Jenkins scale-up strategy. The plugin scales up capacity only at Jenkins' request. The plugin can decline to scale up when Jenkins requests it, but it cannot scale up without a request from Jenkins. The plugin has no logic for skipping a scale-up other than max fleet size. And no excessworkload in the Jenkins log means no request to the plugin from Jenkins!
Jenkins scale-up
One of the main goals of the default Jenkins scale-up strategy is to avoid capacity spikes. That's why you don't see capacity increase immediately when you run tons of jobs. By default Jenkins tries to postpone scaling up as long as possible, so that existing nodes can work through the queue. In other words, Jenkins focuses on throughput rather than execution latency. The direct result is a smaller bill.
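A minimal sketch of why a smoothing strategy delays scale-up, assuming demand is tracked as an exponentially weighted moving average of the queue length. The function names, the decay factor of 0.9, and the threshold of 1.0 are illustrative assumptions, not the values or code Jenkins actually uses.

```python
def smoothed_queue(samples, decay=0.9):
    """Return the moving average of the queue length after each sample.

    `decay` is a hypothetical smoothing factor: the closer to 1.0,
    the longer a burst takes to show up in the average.
    """
    avg = 0.0
    history = []
    for queue_length in samples:
        avg = decay * avg + (1 - decay) * queue_length
        history.append(avg)
    return history


def first_scale_up_tick(samples, threshold=1.0, decay=0.9):
    """Return the 0-based index of the first sample at which the smoothed
    queue length reaches `threshold`, or None if it never does."""
    for tick, avg in enumerate(smoothed_queue(samples, decay)):
        if avg >= threshold:
            return tick
    return None


if __name__ == "__main__":
    # Three jobs sit in the queue for 30 consecutive samples.
    print(first_scale_up_tick([3] * 30))
    # A single queued job never pushes the average over the threshold.
    print(first_scale_up_tick([1] * 30))
    # With no smoothing (decay=0) the same queue triggers at once.
    print(first_scale_up_tick([3] * 30, decay=0.0))
```

In this model a steady queue of three jobs only crosses the threshold on the fourth sample, and a single waiting job never does, which matches the behaviour people report as "builds are queuing but nothing scales up".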
Possible solution
In the modern world, this approach can be too conservative, as we usually expect fast results. To address that, the plugin could override the default Jenkins scale-up strategy and provision almost immediately. The downside would be a higher AWS bill than with the default strategy.
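As a rough sketch of what provisioning "almost immediately" could look like, the function below sizes the request directly from the current queue instead of waiting for a smoothed average, and caps it at the maximum fleet size. The function name and all parameters are hypothetical and are not taken from the plugin's code.

```python
import math


def nodes_to_provision(queued_jobs, idle_executors, pending_executors,
                       executors_per_node, current_nodes, max_nodes):
    """Return how many nodes to request right now (hypothetical helper).

    Capacity that is idle or already on its way is subtracted first,
    so the same queue is not provisioned for twice.
    """
    shortfall = queued_jobs - idle_executors - pending_executors
    if shortfall <= 0:
        return 0  # existing or pending capacity already covers the queue
    wanted = math.ceil(shortfall / executors_per_node)
    headroom = max(max_nodes - current_nodes, 0)
    return min(wanted, headroom)  # never exceed the maximum fleet size


if __name__ == "__main__":
    # 12 queued jobs, 2 busy nodes with 5 executors each, maximum of 20 nodes.
    print(nodes_to_provision(12, 0, 0, 5, 2, 20))
```

With the configuration from the original report (5 executors per node, maximum of 20 nodes), 12 queued jobs and no free executors would ask for 3 more nodes in one step. That is the latency gain, and also where the higher bill comes from.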
Question
Do you prefer fast allocation time over a small bill?
Please put 👍 for this comment if you want to have a choice in plugin configuration to use default Jenkins scale-up or custom fast allocation.
No-delay provisioning is now available in 1.13.0; just check No Delay Provision Strategy in the configuration. It significantly improves Jenkins' response time to new load in the queue when existing capacity is not enough.
I started noticing this same issue on multiple Jenkins instances that we have, even with different plugin versions.
I’m having the exact same problem with version 1.1.9.
On a different Jenkins I tried to upgrade to the latest versions (both Jenkins and ec2-fleet-plugin) and still find the same issue 🙂
Same, here with an entire AWS region at my disposal and 3-4 digit quotas for some instance types, but builds via EC2 Fleet are queuing for a single instance.
hello from 2023!!! Same issue on latest jenkins and ec2fleet versions, heeeeelp!
@smastrorocco then you should vote on this comment: https://github.com/jenkinsci/ec2-fleet-plugin/issues/125#issuecomment-519368220
These are my notes from my second test:
- Minimum Cluster Size: 4
- Maximum Cluster Size: 10
- Number of Executors: 5
- Max Idle Minutes Before Scaledown: 2
- Connect Using Private IP: true
- Maximum Init Connection Timeout in sec: 0
ec2-fleet label. I began to see excessworkload = 1 in the logs.
excessworkload = 5.
excessworkload)
Would my issue be related to the way Jenkins load balances jobs? Have there been reports of this plugin causing issues: https://wiki.jenkins.io/display/JENKINS/Least+Load+Plugin ?