keda: KEDA operator fails to find offset block for topic and partition

Report

The KEDA operator is logging the error "error finding offset block for topic XXX-XXX-XXX and partition 1":

{"level":"error","ts":"2023-02-08T16:47:13Z","logger":"kafka_scaler","msg":"","type":"ScaledObject","namespace":"XXX","name":"XXX-XXX-XXX","error":"error finding offset block for topic XXX.XXX-XXX and partition 1","stacktrace":"github.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getLagForPartition\n\t/workspace/pkg/scalers/kafka_scaler.go:448\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getTotalLag\n\t/workspace/pkg/scalers/kafka_scaler.go:597\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).GetMetricsAndActivity\n\t/workspace/pkg/scalers/kafka_scaler.go:568\ngithub.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).GetMetricsForScaler\n\t/workspace/pkg/scaling/cache/scalers_cache.go:77\ngithub.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScaledObjectMetrics\n\t/workspace/pkg/scaling/scale_handler.go:439\ngithub.com/kedacore/keda/v2/pkg/metricsservice.(*GrpcServer).GetMetrics\n\t/workspace/pkg/metricsservice/server.go:45\ngithub.com/kedacore/keda/v2/pkg/metricsservice/api._MetricsService_GetMetrics_Handler\n\t/workspace/pkg/metricsservice/api/metrics_grpc.pb.go:79\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/workspace/vendor/google.golang.org/grpc/server.go:1340\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/workspace/vendor/google.golang.org/grpc/server.go:1713\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/workspace/vendor/google.golang.org/grpc/server.go:965"}

Expected Behavior

The KEDA operator should find the offset block for each topic and partition without errors.

Actual Behavior

The operator throws "error finding offset block for topic XXX-XXX-XXX and partition 1".

Steps to Reproduce the Problem

  1. Deploy KEDA 2.9.0
  2. Check the logs of the KEDA operator

Logs from KEDA operator

{"level":"error","ts":"2023-02-08T16:47:13Z","logger":"kafka_scaler","msg":"","type":"ScaledObject","namespace":"YYY","name":"XXX-XXX,"error":"error finding offset block for topic XX.XXX-XXX and partition 2","stacktrace":"github.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getLagForPartition\n\t/workspace/pkg/scalers/kafka_scaler.go:448\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getTotalLag\n\t/workspace/pkg/scalers/kafka_scaler.go:597\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).GetMetricsAndActivity\n\t/workspace/pkg/scalers/kafka_scaler.go:568\ngithub.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).GetMetricsForScaler\n\t/workspace/pkg/scaling/cache/scalers_cache.go:77\ngithub.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScaledObjectMetrics\n\t/workspace/pkg/scaling/scale_handler.go:439\ngithub.com/kedacore/keda/v2/pkg/metricsservice.(*GrpcServer).GetMetrics\n\t/workspace/pkg/metricsservice/server.go:45\ngithub.com/kedacore/keda/v2/pkg/metricsservice/api._MetricsService_GetMetrics_Handler\n\t/workspace/pkg/metricsservice/api/metrics_grpc.pb.go:79\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/workspace/vendor/google.golang.org/grpc/server.go:1340\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/workspace/vendor/google.golang.org/grpc/server.go:1713\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/workspace/vendor/google.golang.org/grpc/server.go:965"}

KEDA Version

2.9.0

Kubernetes Version

1.25

Platform

Google Cloud

Scaler Details

Kafka Scaler

Anything else?

No response

About this issue

  • State: closed
  • Created a year ago
  • Comments: 31 (12 by maintainers)

Most upvoted comments

I experienced it again, also with 2.10.0 - same log entries as @johnnytardin. Sadly I recreated the pod without getting the logs out before doing so -.- However, a restart of the keda-operator pod fixed it again, and scaling started working correctly again.

@johnnytardin - Actually I haven’t seen it since. However, I only noticed it once with the earlier version. I’ll be doing some intensive testing on our Kafka workloads in the coming month, so if it’s still present I should encounter it again soonish.

As it looks solved, I’m going to close the issue. If the problem happens again, comment here and I can reopen it. Just to announce it: we have added another “flavour” of the Kafka scaler as an experimental scaler: https://keda.sh/docs/2.12/scalers/apache-kafka-go/ It’s experimental because it doesn’t have all the functionality yet, but it uses kafka-go instead of the sarama client. It’s covered by e2e test cases, and even though there could be a missing test, the majority of them are the same (I mean, it’s not just random code placed there). You can give it a try if the issue persists.

Any update on that, @oshmoun, with the new version?

Sorry for the lack of updates. Ever since updating, the issue has not occurred, and KEDA scaling has been working smoothly. I guess in this scenario the saying “no news is good news” applies perfectly 🙂

Sorry for the lack of updates on this matter. I have updated KEDA to 2.11.1, and since then the issue has not occurred. Hopefully it’s not just a fluke, since the issue is random in nature. But so far so good 🤞

I did not need to restart KEDA to fix the issue; I simply disabled idle scaling by removing idleReplicaCount from the configured ScaledObjects.

idleReplicaCount doesn’t have any impact at the scaler level; it’s handled by KEDA itself, not by the Kafka trigger. I think your problem went away because, by modifying the ScaledObject, you triggered a rebuild of the scaler.

@dttung2905 Updated this weekend: helm.sh/chart=keda-2.10.0. Will post the output when/if it happens again.