keda: keda-operator memory leak when the prometheus scaler has errors

Report

When the prometheus scaler hits errors while fetching metrics, memory on keda-operator starts to grow until the pod gets OOMKilled.

Behaviour as follows: (Grafana graph)

KEDA is installed from the plain manifest: https://github.com/kedacore/keda/releases/download/v2.11.2/keda-2.11.2.yaml

Expected Behavior

Memory does not grow when any of the scalers has errors.

Actual Behavior

Memory grows when the prometheus scaler has errors (for example, when fetching metrics from Prometheus fails).

Steps to Reproduce the Problem

  1. Deploy a service with a prometheus scaler whose serverAddress does not exist:
          - type: prometheus
            metadata:
              query: sum(rate(rabbitmq_client_messages_published_total{service_name=~'kafka-api-events-to-rabbitmq'}[2m]))
              threshold: '200'
              serverAddress: https://non-existing-prometheus-url # that returns 404
  2. keda-operator will start logging errors to stderr
  3. Memory usage will start to grow
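
For anyone trying to reproduce this, the trigger from step 1 has to sit inside a ScaledObject. The following is a minimal sketch, not the reporter's actual manifest: the name and namespace are taken from the operator log in this report, while the scaleTargetRef (a Deployment also called debug-service) is an assumption made only for illustration.

```yaml
# Minimal sketch of a ScaledObject wrapping the failing trigger.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: debug-service          # from the operator log in this report
  namespace: tooling           # from the operator log in this report
spec:
  scaleTargetRef:
    name: debug-service        # assumption: a Deployment with this name exists
  triggers:
    - type: prometheus
      metadata:
        query: sum(rate(rabbitmq_client_messages_published_total{service_name=~'kafka-api-events-to-rabbitmq'}[2m]))
        threshold: '200'
        serverAddress: https://non-existing-prometheus-url # that returns 404
```

Applying this and watching the keda-operator pod's memory over time should be enough to see whether the growth described here occurs on a given version.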

Logs from KEDA operator

2023-12-04T14:38:18Z	ERROR	prometheus_scaler	error executing prometheus query	{"type": "ScaledObject", "namespace": "tooling", "name": "debug-service", "error": "prometheus query api returned error. status: 404 response: "}
github.com/kedacore/keda/v2/pkg/scalers.(*prometheusScaler).GetMetricsAndActivity
	/workspace/pkg/scalers/prometheus_scaler.go:359
github.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).GetMetricsAndActivityForScaler
	/workspace/pkg/scaling/cache/scalers_cache.go:139
github.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScaledObjectMetrics
	/workspace/pkg/scaling/scale_handler.go:508
github.com/kedacore/keda/v2/pkg/metricsservice.(*GrpcServer).GetMetrics
	/workspace/pkg/metricsservice/server.go:47
github.com/kedacore/keda/v2/pkg/metricsservice/api._MetricsService_GetMetrics_Handler
	/workspace/pkg/metricsservice/api/metrics_grpc.pb.go:99
google.golang.org/grpc.(*Server).processUnaryRPC
	/workspace/vendor/google.golang.org/grpc/server.go:1343
google.golang.org/grpc.(*Server).handleStream
	/workspace/vendor/google.golang.org/grpc/server.go:1737
google.golang.org/grpc.(*Server).serveStreams.func1.1
	/workspace/vendor/google.golang.org/grpc/server.go:986

KEDA Version

2.12.1

Kubernetes Version

1.26

Platform

Amazon Web Services

Scaler Details

prometheus

Anything else?

No response

About this issue

  • Original URL
  • State: closed
  • Created 7 months ago
  • Comments: 29 (21 by maintainers)

Most upvoted comments

Nice! The fix is already merged so it’ll be included as part of next release 😁