sarama: brokerProducer is nil, panic

Versions
Sarama: 1.32.0
Kafka: 2.8.7
Go: 1.18
Problem Description

This does not happen on every run, but it happens often. The crash is triggered by brokerProducer being nil.

Test scenario: a message is sent every second, and the network connection is interrupted while sending. During this time the producer is closed asynchronously, and in some cases a panic occurs after about 3-5 minutes.

The relevant code is around line 560 of github.com/Shopify/sarama/async_producer.go:

if msg.retries > pp.highWatermark {
	// a new, higher, retry level; handle it and then back off
	pp.newHighWatermark(msg.retries)
	pp.backoff(msg.retries)
}

func (pp *partitionProducer) newHighWatermark(hwm int) {
	Logger.Printf("producer/leader/%s/%d state change to [retrying-%d]\n", pp.topic, pp.partition, hwm)
	pp.highWatermark = hwm

	pp.retryState[pp.highWatermark].expectChaser = true
	pp.parent.inFlight.Add(1)
	// Note: this send panics when pp.brokerProducer is nil.
	pp.brokerProducer.input <- &ProducerMessage{Topic: pp.topic, Partition: pp.partition, flags: fin, retries: pp.highWatermark - 1}

	Logger.Printf("producer/leader/%s/%d abandoning broker %d\n", pp.topic, pp.partition, pp.leader.ID())
	pp.parent.unrefBrokerProducer(pp.leader, pp.brokerProducer)
	pp.brokerProducer = nil
}
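
To make the failure mode concrete, here is a minimal, self-contained sketch of the same pattern. The types message, brokerProducer and partitionProducer below are hypothetical stand-ins that only mirror the shape of the code quoted above; they are not sarama's definitions, and the nil check shown is just one possible guard, not necessarily how the library should fix or has fixed the problem.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins that mirror the shape of the quoted sarama code.
type message struct{ retries int }

type brokerProducer struct {
	input chan *message
}

type partitionProducer struct {
	brokerProducer *brokerProducer
}

var errNoBrokerProducer = errors.New("no broker producer assigned")

// sendFinUnchecked mirrors the quoted code: it reads the input field through
// pp.brokerProducer without a nil check, so it panics when the pointer is nil.
func (pp *partitionProducer) sendFinUnchecked(m *message) {
	pp.brokerProducer.input <- m
}

// sendFinChecked is one possible guard (an assumption, not sarama's fix):
// it returns an error instead of dereferencing a nil pointer.
func (pp *partitionProducer) sendFinChecked(m *message) error {
	if pp.brokerProducer == nil {
		return errNoBrokerProducer
	}
	pp.brokerProducer.input <- m
	return nil
}

// panics reports whether calling f resulted in a panic.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	pp := &partitionProducer{} // brokerProducer is nil, as in the report

	fmt.Println("unchecked send panics:",
		panics(func() { pp.sendFinUnchecked(&message{retries: 1}) }))
	fmt.Println("checked send error:",
		pp.sendFinChecked(&message{retries: 1}))
}
```

A guard like this only hides the symptom. The open question in this issue is why brokerProducer is nil when newHighWatermark runs at all.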

I also took a quick look at the dispatch method but could not pinpoint the cause. I have only seen this problem in 1.32.0; the other versions I tested did not show it.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 19 (4 by maintainers)

Most upvoted comments

As this seems to be related, on our production system (sarama v1.36.0, go 1.19; kafka v2.8.0) a pod just crashed with:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x877330]

goroutine 1168 [running]:
github.com/Shopify/sarama.(*partitionProducer).newHighWatermark(0x40008e38c0, 0x2)
	/go/pkg/mod/github.com/!shopify/sarama@v1.36.0/async_producer.go:620 +0x1e0
github.com/Shopify/sarama.(*partitionProducer).dispatch(0x40008e38c0)
	/go/pkg/mod/github.com/!shopify/sarama@v1.36.0/async_producer.go:564 +0x4a4
github.com/Shopify/sarama.withRecover(0xdf5190?)
	/go/pkg/mod/github.com/!shopify/sarama@v1.36.0/utils.go:43 +0x40
created by github.com/Shopify/sarama.(*asyncProducer).newPartitionProducer
	/go/pkg/mod/github.com/!shopify/sarama@v1.36.0/async_producer.go:515 +0x208

Are there any updates on this?

@crcms You mentioned that this bug reproduces every time? So is this panic expected behavior when there is a connection issue lasting a few minutes?

We are having the same issue, any updates on this?

edit: sarama v1.34.0 (go 1.19) with kafka v2.4.1