keda: KEDA operator fails to find offset block
Report
Getting an error from the KEDA operator saying "error":"error finding offset block for topic XXX-XXX-XXX and partition 1"
{"level":"error","ts":"2023-02-08T16:47:13Z","logger":"kafka_scaler","msg":"","type":"ScaledObject","namespace":"XXX","name":"XXX-XXX-XXX","error":"error finding offset block for topic XXX.XXX-XXX and partition 1","stacktrace":"github.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getLagForPartition\n\t/workspace/pkg/scalers/kafka_scaler.go:448\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getTotalLag\n\t/workspace/pkg/scalers/kafka_scaler.go:597\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).GetMetricsAndActivity\n\t/workspace/pkg/scalers/kafka_scaler.go:568\ngithub.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).GetMetricsForScaler\n\t/workspace/pkg/scaling/cache/scalers_cache.go:77\ngithub.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScaledObjectMetrics\n\t/workspace/pkg/scaling/scale_handler.go:439\ngithub.com/kedacore/keda/v2/pkg/metricsservice.(*GrpcServer).GetMetrics\n\t/workspace/pkg/metricsservice/server.go:45\ngithub.com/kedacore/keda/v2/pkg/metricsservice/api._MetricsService_GetMetrics_Handler\n\t/workspace/pkg/metricsservice/api/metrics_grpc.pb.go:79\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/workspace/vendor/google.golang.org/grpc/server.go:1340\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/workspace/vendor/google.golang.org/grpc/server.go:1713\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/workspace/vendor/google.golang.org/grpc/server.go:965"}
Expected Behavior
The KEDA operator should not fail to find the offset block for the topic and partition.
Actual Behavior
The operator throws "error finding offset block for topic ... and partition ..." while retrieving metrics for the ScaledObject.
Steps to Reproduce the Problem
- Deploy KEDA 2.9.0
- Check the logs of the KEDA operator
Logs from KEDA operator
{"level":"error","ts":"2023-02-08T16:47:13Z","logger":"kafka_scaler","msg":"","type":"ScaledObject","namespace":"YYY","name":"XXX-XXX,"error":"error finding offset block for topic XX.XXX-XXX and partition 2","stacktrace":"github.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getLagForPartition\n\t/workspace/pkg/scalers/kafka_scaler.go:448\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).getTotalLag\n\t/workspace/pkg/scalers/kafka_scaler.go:597\ngithub.com/kedacore/keda/v2/pkg/scalers.(*kafkaScaler).GetMetricsAndActivity\n\t/workspace/pkg/scalers/kafka_scaler.go:568\ngithub.com/kedacore/keda/v2/pkg/scaling/cache.(*ScalersCache).GetMetricsForScaler\n\t/workspace/pkg/scaling/cache/scalers_cache.go:77\ngithub.com/kedacore/keda/v2/pkg/scaling.(*scaleHandler).GetScaledObjectMetrics\n\t/workspace/pkg/scaling/scale_handler.go:439\ngithub.com/kedacore/keda/v2/pkg/metricsservice.(*GrpcServer).GetMetrics\n\t/workspace/pkg/metricsservice/server.go:45\ngithub.com/kedacore/keda/v2/pkg/metricsservice/api._MetricsService_GetMetrics_Handler\n\t/workspace/pkg/metricsservice/api/metrics_grpc.pb.go:79\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/workspace/vendor/google.golang.org/grpc/server.go:1340\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/workspace/vendor/google.golang.org/grpc/server.go:1713\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/workspace/vendor/google.golang.org/grpc/server.go:965"}
KEDA Version
2.9.0
Kubernetes Version
1.25
Platform
Google Cloud
Scaler Details
Kafka Scaler
Anything else?
No response
About this issue
- State: closed
- Created a year ago
- Comments: 31 (12 by maintainers)
I experienced it again, also with 2.10.0, with the same log entries as @johnnytardin. Sadly, I recreated the pod without getting the logs out before doing so. However, a restart of the keda-operator pod fixed it again, and scaling started working correctly again.
@johnnytardin Actually, I haven't seen it since. However, I only noticed it once with the earlier version. I will be doing some intensive testing on our Kafka workloads in the coming month, so if it's still present I should encounter it again soon.
As it looks solved, I'm going to close the issue. If the problem happens again, comment here and I can reopen it. Just to announce it, we have added another "flavour" of Kafka scaler as an experimental scaler: https://keda.sh/docs/2.12/scalers/apache-kafka-go/ It's experimental because it doesn't have all the functionality yet, but it uses kafka-go instead of the sarama client. It's covered by e2e test cases, and even though there could be a missing test, the majority of them are the same (I mean, it's not just random code placed there). You can give it a try too if the issue persists.
Sorry for the lack of updates. Ever since updating, the issue has not occurred, and KEDA scaling has been working smoothly. I guess in this scenario the saying "no news is good news" applies perfectly 🙂
Sorry for the lack of updates on this matter. I have updated KEDA to 2.11.1, and since then the issue has not occurred. Hopefully it's not just a fluke, since the issue is random in nature. But so far so good 🤞
idleReplicaCount doesn't have any impact at the scaler level; it's handled by KEDA itself, not by the Kafka trigger. I think your problem has gone away because modifying the ScaledObject triggered a rebuild of the scaler.
@dttung2905 Updated this weekend: helm.sh/chart=keda-2.10.0. Will post the output when/if it happens again.