keda: Prometheus Scaler - Maintain last known state if prometheus is unavailable
I am in the process of migrating to Keda. I currently use https://github.com/DirectXMan12/k8s-prometheus-adapter and it has a very useful feature. In the event that Prometheus goes down, prom-adapter maintains the last known state of the metric. This means scaling is not triggered either up or down.
With Keda, if Prometheus is unavailable, my deployments are scaled to zero once the cooldownPeriod has expired, regardless of whether the last known value was above 0.
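To make the requested behaviour concrete, here is a minimal sketch of "maintain last known state", assuming a metric query that raises when the backend is unreachable. The class name `LastKnownValueSource` and the `query` callable are hypothetical and used only for illustration; this is not how Keda or prom-adapter is implemented.

```python
from typing import Callable, Optional


class LastKnownValueSource:
    """Wraps a metric query and replays the last good value on failure.

    Hypothetical helper for illustration; not part of Keda or prom-adapter.
    """

    def __init__(self, query: Callable[[], float]) -> None:
        self._query = query
        self._last: Optional[float] = None  # None until the first success

    def read(self) -> Optional[float]:
        try:
            self._last = self._query()
        except Exception:
            # Metric source unreachable: keep whatever value was seen last,
            # so the autoscaler neither scales up nor down.
            pass
        return self._last
```

With a wrapper like this, a scaler that saw a queue depth of 5 before the outage would keep reporting 5 until the backend answers again. Before the first successful read it returns None, so a caller still has to decide what to do in that case.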
Use-Case
We are using prom-adapter to scale Google Pub/Sub subscribers and RabbitMQ workers. In the unlikely event that Prometheus goes down, we would want the existing workload to continue processing based on the numbers it knew before Prometheus stopped responding.
About this issue
- State: open
- Created 4 years ago
- Reactions: 5
- Comments: 24 (13 by maintainers)
“Maintain last known state” - I think this approach has its drawbacks, especially when autoscaling to zero via minReplicaCount: 0. Imagine that you can’t wake up your system because the Keda Operator temporarily can’t reach the source of metrics.
I just hit this problem with the postgresql trigger. After a security group change in our AWS account, the Keda Operator suddenly couldn’t reach our Postgres database and the whole system scaled down to zero, making the service unavailable.
I propose a new (optional) field, onErrorReplicaCount, that would serve as a default value when the Operator can’t read current values.
@bschaeffer you can use https://github.com/kedacore/keda/issues/1872 to mitigate this problem.
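As a rough sketch of the onErrorReplicaCount idea proposed above, the decision could look like the following. The function `desired_replicas` and all of its parameters are hypothetical names chosen for illustration, and falling back to the minimum when no fallback is set is a simplification of the scale-to-zero behaviour described in this thread, not Keda's actual code path.

```python
import math
from typing import Optional


def desired_replicas(
    metric_value: Optional[float],
    target_per_replica: float,
    min_replicas: int,
    max_replicas: int,
    on_error_replicas: Optional[int] = None,
) -> int:
    """Pick a replica count; metric_value=None means the source was unreachable."""
    if metric_value is None:
        if on_error_replicas is not None:
            # Proposed behaviour: use the configured fallback count.
            return on_error_replicas
        # Simplified stand-in for the behaviour reported in this thread:
        # with no fallback, the workload drops to its minimum (possibly 0).
        return min_replicas
    wanted = math.ceil(metric_value / target_per_replica)
    # Clamp the computed count into the configured range.
    return max(min_replicas, min(max_replicas, wanted))
```

For example, with a minimum of 0 and a fallback of 2, an unreachable Postgres database would leave two replicas running instead of taking the service down.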
Not that I’m aware of. Are you interested in contributing this, @lambohamp?
Hi @zroubalik, thank you. I think I’ll step away from this one, but will be watching the progress.