pulsar: [Proto] java.lang.IllegalStateException: Some required fields are missing

Describe the bug: Parsing message metadata throws an exception. [image]

To Reproduce: Steps to reproduce the behavior:

  1. Start a standalone Pulsar whose version is later than 2.10.0.
  2. Change the broker config: transactionCoordinatorEnabled=true, managedLedgerMaxEntriesPerLedger=3, managedLedgerMinLedgerRolloverTimeMinutes=1.
  3. Configure the namespace TTL: ./pulsar-admin namespaces set-message-ttl -ttl 10 public/default
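  4. Run the following client code:

         import java.util.ArrayList;
         import java.util.List;
         import java.util.concurrent.TimeUnit;
         import org.apache.pulsar.client.api.Consumer;
         import org.apache.pulsar.client.api.Producer;
         import org.apache.pulsar.client.api.PulsarClient;
         import org.apache.pulsar.client.api.Schema;
         import org.apache.pulsar.client.api.transaction.Transaction;

         public class Repro {
             public static void main(String[] args) throws Exception {
                 PulsarClient pulsarClient = PulsarClient.builder().enableTransaction(true)
                         .serviceUrl("pulsar://127.0.0.1:6650").build();
                 int sendTopicsCount = 200;
                 String topicName = "test55";
                 List<Producer<String>> producers = new ArrayList<>(sendTopicsCount);

                 // One producer per topic; the consumer only creates the "test" subscription and is then closed.
                 for (int i = 0; i < sendTopicsCount; i++) {
                     producers.add(pulsarClient.newProducer(Schema.STRING).sendTimeout(0, TimeUnit.SECONDS).topic(i + topicName).create());
                     Consumer<String> consumer = pulsarClient.newConsumer(Schema.STRING).subscriptionName("test").topic(i + topicName).subscribe();
                     consumer.closeAsync();
                 }
                 // Send one transactional message to each topic in turn, 10000 rounds in total.
                 for (int j = 0; j < 10000; j++) {
                     for (int i = 0; i < sendTopicsCount; i++) {
                         Transaction transaction = pulsarClient.newTransaction()
                                 .withTransactionTimeout(10, TimeUnit.SECONDS).build().get();
                         System.out.println("send one transaction messageId" + producers.get(i)
                                 .newMessage(transaction).value(i + topicName).send() + "   topic name : " + i);
                         transaction.commit(); // asynchronous; the result is not awaited, as in the original report
                         Thread.sleep(10);
                     }
                 }
             }
         }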
    

  5. Execute the above code multiple times.
  6. While the code is running, execute ./pulsar-admin namespaces unload public/default multiple times.
  7. See the error. [image]

Expected behavior: No exception should be thrown.


Additional context: The problem does not reproduce reliably; it takes multiple attempts.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 38 (37 by maintainers)

Most upvoted comments

So the problem may be in netty-tc-native.version 2.0.48.Final.

Do you happen to use TLS in your setup? If not, it seems odd that the netty-tc-native version would cause the problem to reproduce.

btw. A major difference between Netty 4.1.68.Final and 4.1.74.Final is the Netty Recycler rewrite that was made in 4.1.71.Final. There’s more info in this comment: #13328 (comment)

I am not sure the problem is in 2.0.48.Final, so I will test again: I will change the Netty version to 4.1.68.Final and leave the netty-tcnative version unchanged.

@congbobo184 Thank you. Based on this experiment we cannot determine the issue, but since disabling the Netty Recycler has an impact, there’s a possibility that recycling instances is causing issues in one way or another.

The next experiment could be to leave the Netty Recycler setting unchanged (enabled) and instead comment out the line recyclerHandle.recycle(this); in OpReadEntry: https://github.com/apache/pulsar/blob/9032a9afc15e029d8f205761b5e104f44f9c4be0/man[…]c/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java

That will disable recycling of OpReadEntry instances. If that makes the problem go away, it’s again more information about the possible problem.
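To illustrate why turning recycling off can be informative, here is a minimal sketch of the general hazard, not of Pulsar's or Netty's actual code. `ReadOp` and `Pool` are hypothetical stand-ins for a pooled operation object and for the recycler. The sketch assumes some code path keeps using an object after returning it to the pool, which is only one possible explanation for the symptoms in this thread.

```java
import java.util.ArrayDeque;

public class RecycleSketch {
    /** Hypothetical stand-in for a pooled operation object such as OpReadEntry. */
    static final class ReadOp {
        long entryId;
    }

    /** Tiny single-threaded pool; Netty's Recycler is far more elaborate. */
    static final class Pool {
        private final ArrayDeque<ReadOp> free = new ArrayDeque<>();
        private final boolean recyclingEnabled;

        Pool(boolean recyclingEnabled) {
            this.recyclingEnabled = recyclingEnabled;
        }

        ReadOp acquire(long entryId) {
            // Reuse a pooled instance if one is available, otherwise allocate.
            ReadOp op = free.isEmpty() ? new ReadOp() : free.pop();
            op.entryId = entryId;
            return op;
        }

        void recycle(ReadOp op) {
            // With recycling disabled the instance is simply dropped,
            // which mirrors commenting out the recycle call.
            if (recyclingEnabled) {
                free.push(op);
            }
        }
    }

    /** Returns the entry id that a stale reference observes after its op was recycled. */
    static long staleRead(boolean recyclingEnabled) {
        Pool pool = new Pool(recyclingEnabled);
        ReadOp first = pool.acquire(1L);
        pool.recycle(first);   // released, but 'first' is still referenced (the assumed bug)
        pool.acquire(2L);      // another read may now receive the same instance
        return first.entryId;
    }

    public static void main(String[] args) {
        System.out.println("recycling on:  " + staleRead(true));  // prints 2
        System.out.println("recycling off: " + staleRead(false)); // prints 1
    }
}
```

With recycling on, the stale reference sees the second operation's entry id. With recycling off, it still sees its own. If the real problem disappears when recycling is disabled, that points toward a use-after-recycle of this kind, though it does not prove it.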

@congbobo184 Would you be able to continue with this experiment?

@lhotari Hi, I added some logging. I found that when the ByteBuf is read from the bookie client, the data length is incorrect and the data does not belong to the entry being read. So the problem may simply be that the ByteBuf is not cleared properly; it may be a Netty bug or a bookie problem.
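The two symptoms reported here (wrong length, and contents belonging to a different entry) are what a reader would see if the buffer it holds were reused for another entry before the read finished. The sketch below shows that effect with a plain `java.nio.ByteBuffer` as a stand-in for a pooled Netty `ByteBuf`. The payload strings and helper names are made up for illustration, and this is an assumption about the mechanism, not a confirmed diagnosis.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StaleBufferSketch {
    /** Overwrites the buffer with a new payload and makes it readable from the start. */
    static ByteBuffer write(ByteBuffer buf, String payload) {
        buf.clear();
        buf.put(payload.getBytes(StandardCharsets.UTF_8));
        buf.flip();
        return buf;
    }

    /** Decodes the readable bytes without moving the buffer's own position. */
    static String text(ByteBuffer buf) {
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    /** Returns {expectedLength, observedLength, observedText} for a reader of entry A. */
    static Object[] readAfterReuse() {
        ByteBuffer pooled = ByteBuffer.allocate(64);
        ByteBuffer entryA = write(pooled, "entry-A-payload");
        int expectedLength = entryA.remaining();  // length recorded when the entry was handed out
        write(pooled, "B");                        // buffer reused before A's reader is done
        return new Object[] {expectedLength, entryA.remaining(), text(entryA)};
    }

    public static void main(String[] args) {
        Object[] r = readAfterReuse();
        System.out.println("expected length " + r[0] + ", observed length " + r[1]
                + ", observed data \"" + r[2] + "\"");
    }
}
```

Logging the expected length next to the observed one, as above, is one way to tell a reused buffer apart from data that was already wrong when it arrived from the bookie.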