keda: Possible memory leak in Keda-operator with Kafka scaler
Hi everyone, I started to use KEDA with the Kafka scaler, defined it pretty simply according to the example, and after deploying it to production I noticed that every 3 days the pod reaches the Kubernetes limits and gets OOM-killed. The memory increases constantly and I'm not really sure why. This is the deployment description (I added a few parameters such as limits, priorityClass and others):
```
Name:                   keda-operator
Namespace:              keda
CreationTimestamp:      Mon, 27 Apr 2020 12:56:42 +0300
Labels:                 app=keda-operator
                        app.kubernetes.io/component=operator
                        app.kubernetes.io/name=keda-operator
                        app.kubernetes.io/part-of=keda-operator
                        app.kubernetes.io/version=1.4.1
Annotations:            deployment.kubernetes.io/revision: 2
Selector:               app=keda-operator
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=keda-operator
  Service Account:  keda-operator
  Containers:
   keda-operator:
    Image:      docker.io/kedacore/keda:1.4.1
    Port:       <none>
    Host Port:  <none>
    Command:
      keda
    Args:
      --zap-level=info
    Limits:
      cpu:     100m
      memory:  200Mi
    Requests:
      cpu:     100m
      memory:  200Mi
    Environment:
      WATCH_NAMESPACE:
      POD_NAME:        (v1:metadata.name)
      OPERATOR_NAME:   keda-operator
    Mounts:            <none>
  Volumes:             <none>
  Priority Class Name: line-of-business-service
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   keda-operator-fd678455c (1/1 replicas created)
Events:          <none>
```
The heap starts at around 30-40M and rises until almost 200M, then jumps to 240M and above, at which point the pod is OOM-killed and restarted by the kubelet.
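For context, the limit being hit corresponds to the following resources stanza for the operator container. This is a sketch reconstructed from the describe output above, not a copy of the original manifest:

```yaml
# Reconstructed from the describe output above; the reported heap growth
# runs into the 200Mi limit, after which the container is OOMKilled.
resources:
  requests:
    cpu: 100m
    memory: 200Mi
  limits:
    cpu: 100m
    memory: 200Mi
```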
Steps to Reproduce the Problem
- Create a ScaledObject that reads from a Kafka topic (see the sketch after this list).
- After hours of running, the memory keeps increasing until it reaches the pod limit.
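A minimal ScaledObject along these lines, based on the KEDA v1 (v1alpha1) Kafka scaler docs, might look like the sketch below; the deployment name, broker address, consumer group, topic, and lag threshold are illustrative placeholders, not values from the original report:

```yaml
apiVersion: keda.k8s.io/v1alpha1      # KEDA v1 API group, matching the 1.4.1 operator above
kind: ScaledObject
metadata:
  name: kafka-consumer-scaler
  labels:
    deploymentName: kafka-consumer    # KEDA v1 expects the target deployment name as a label too
spec:
  scaleTargetRef:
    deploymentName: kafka-consumer    # hypothetical consumer deployment
  triggers:
    - type: kafka
      metadata:
        brokerList: my-kafka.kafka.svc:9092   # illustrative broker address
        consumerGroup: my-consumer-group      # illustrative consumer group
        topic: my-topic                       # illustrative topic
        lagThreshold: "50"                    # scale out when consumer lag exceeds this value
```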
Specifications
- KEDA Version: 1.4.1
- Platform & Version: Kafka 2.3.0
- Kubernetes Version: 1.15
- Scaler(s): Kafka
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 2
- Comments: 15 (6 by maintainers)
Commits related to this issue
- Add Cisco as a user + Add TLSv1.3 docs (#814) — committed to SpiritZhou/keda by dgibbard-cisco 2 years ago
@jeli8-cor thanks a lot for the testing!
We should speed up the development (and release) of v2 in case we are not able to find the cause and provide a fix for this issue in v1.
I also experienced the same memory leak issue using KEDA v1.4.1 with the Redis list scaler, but I upgraded to v1.5.0 and it looks like that resolved it.