pulsar: Log compaction fails due to timeout

Describe the bug Log compaction fails due to a timeout.

To Reproduce

  1. Start pulsar standalone
  2. Create producer
  3. Produce messages at high throughput
  4. Trigger topic compaction
  5. Read compaction status

Detailed steps to reproduce:

  1. Clear all state
# rm -r data logs # !!! careful !!! this wipes all your topics etc.
  1. Start pulsar standalone
bin/pulsar standalone
  1. Create producer
PulsarClient pulsarClient = PulsarClient.builder()
    .serviceUrl("pulsar://192.168.11.22:6650")
    .build();
Producer<byte[]> producer = pulsarClient.newProducer()
    .topic("my-topic")
    .create();
  1. Produce messages at high throughput
int msgCount = 500000;
long before = System.currentTimeMillis();
for (int i = 0; i < msgCount; i++) {
    producer.newMessage()
            .key("asdf")
            .value(new byte[] {1, 2, 3, 4, 5})
            .sendAsync();
}
System.out.println("Sent " + msgCount + " messages in " + (System.currentTimeMillis() - before) + " msec");
  1. Trigger topic compaction
$ bin/pulsar-admin topics compact persistent://public/default/my-topic
Topic compaction requested for persistent://public/default/my-topic
  1. Read compaction status
$ bin/pulsar-admin topics compaction-status persistent://public/default/my-topic
Compaction is currently running

$ # WAIT FOR 10 SECONDS

$ bin/pulsar-admin topics compaction-status persistent://public/default/my-topic
Error in compaction
null

Reason: Error compacting: java.util.concurrent.TimeoutException: Timeout

Expected behavior Compaction and no error

Desktop (please complete the following information):

  • Pulsar version: 2.5.0 (tar.gz from website)
  • OS: Linux 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 unknown unknown GNU/Linux

Logs The complete logs from the start of compaction until the error was read via compaction status

09:47:54.409 [pulsar-web-57-8] INFO  org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - Starting Pulsar consumer status recorder with config: {
  "topicNames" : [ "persistent://public/default/my-topic" ],
  "topicsPattern" : null,
  "subscriptionName" : "__compaction",
  "subscriptionType" : "Exclusive",
  "receiverQueueSize" : 1000,
  "acknowledgementsGroupTimeMicros" : 100000,
  "negativeAckRedeliveryDelayMicros" : 60000000,
  "maxTotalReceiverQueueSizeAcrossPartitions" : 50000,
  "consumerName" : null,
  "ackTimeoutMillis" : 0,
  "tickDurationMillis" : 1000,
  "priorityLevel" : 0,
  "cryptoFailureAction" : "FAIL",
  "properties" : { },
  "readCompacted" : true,
  "subscriptionInitialPosition" : "Latest",
  "patternAutoDiscoveryPeriod" : 1,
  "regexSubscriptionMode" : "PersistentOnly",
  "deadLetterPolicy" : null,
  "autoUpdatePartitions" : true,
  "replicateSubscriptionState" : false,
  "resetIncludeHead" : false,
  "keySharedPolicy" : null
}
09:47:54.410 [pulsar-web-57-8] INFO  org.apache.pulsar.client.impl.ConsumerStatsRecorderImpl - Pulsar client config: {
  "serviceUrl" : "pulsar://localhost:6650",
  "authPluginClassName" : null,
  "authParams" : null,
  "operationTimeoutMs" : 30000,
  "statsIntervalSeconds" : 60,
  "numIoThreads" : 1,
  "numListenerThreads" : 1,
  "connectionsPerBroker" : 1,
  "useTcpNoDelay" : true,
  "useTls" : false,
  "tlsTrustCertsFilePath" : null,
  "tlsAllowInsecureConnection" : false,
  "tlsHostnameVerificationEnable" : false,
  "concurrentLookupRequest" : 5000,
  "maxLookupRequest" : 50000,
  "maxNumberOfRejectedRequestPerConnection" : 50,
  "keepAliveIntervalSeconds" : 30,
  "connectionTimeoutMs" : 10000,
  "requestTimeoutMs" : 60000,
  "initialBackoffIntervalNanos" : 100000000,
  "maxBackoffIntervalNanos" : 60000000000
}
09:47:54.411 [pulsar-web-57-8] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:47:54 +0900] "PUT /admin/v2/persistent/public/default/my-topic/compaction HTTP/1.1" 204 0 "-" "Pulsar-Java-v2.5.0" 6
09:47:54.411 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribing to topic on cnx [id: 0xeb973738, L:/127.0.0.1:60180 - R:localhost/127.0.0.1:6650]
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Subscribing on topic persistent://public/default/my-topic / __compaction
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-__compaction] Rewind from 1544:100 to 1544:0
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated subscriptions on the topic
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][__compaction] Created new subscription for 1
09:47:54.412 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Created subscription on topic persistent://public/default/my-topic / __compaction
09:47:54.413 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribed to topic on localhost/127.0.0.1:6650 -- consumer: 1
09:47:54.413 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Seek subscription to message id -1:-1:-1
09:47:54.415 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.Consumer - Disconnecting consumer: Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:47:54.416 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.AbstractDispatcherSingleActiveConsumer - Removing consumer Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:47:54.416 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ClientCnx - [localhost/127.0.0.1:6650] Broker notification of Closed consumer: 1
09:47:54.416 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/my-topic] [__compaction] Closed connection [id: 0xeb973738, L:/127.0.0.1:60180 - R:localhost/127.0.0.1:6650] -- Will try again in 0.1 s
09:47:54.416 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentSubscription - [persistent://public/default/my-topic][__compaction] Successfully disconnected consumers from subscription, proceeding with cursor reset
09:47:54.416 [bookkeeper-ml-workers-OrderedExecutor-2-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic] Initiate reset position to 1544:-1 on cursor __compaction
09:47:54.424 [BookKeeperClientWorker-OrderedExecutor-1-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic] reset position to 1544:-1 before current read position 1544:0 on cursor __compaction
09:47:54.425 [BookKeeperClientWorker-OrderedExecutor-1-0] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] [persistent://public/default/my-topic][__compaction] Reset subscription to message id -1:-1
09:47:54.426 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Successfully reset subscription to message id -1:-1:-1
09:47:54.516 [pulsar-timer-87-1] INFO  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://public/default/my-topic] [__compaction] Reconnecting after timeout
09:47:54.517 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribing to topic on cnx [id: 0xeb973738, L:/127.0.0.1:60180 - R:localhost/127.0.0.1:6650]
09:47:54.517 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Subscribing on topic persistent://public/default/my-topic / __compaction
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [public/default/persistent/my-topic-__compaction] Rewind from 1544:-1 to 1544:0
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic] There are no replicated subscriptions on the topic
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://public/default/my-topic][__compaction] Created new subscription for 1
09:47:54.518 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Created subscription on topic persistent://public/default/my-topic / __compaction
09:47:54.518 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Subscribed to topic on localhost/127.0.0.1:6650 -- consumer: 1
09:47:54.526 [pulsar-external-listener-86-1] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic] [__compaction] Could not get connection while getLastMessageId -- Will try again in 100 ms
09:47:54.527 [pulsar-external-listener-86-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Get topic last message Id
09:47:54.527 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic][__compaction] Successfully getLastMessageId 1544:37344
09:47:54.527 [pulsar-client-io-85-1] INFO  org.apache.pulsar.compaction.TwoPhaseCompactor - Commencing phase one of compaction for persistent://public/default/my-topic, reading to 1544:37344:-1
09:47:57.416 [pulsar-web-57-12] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:47:57 +0900] "GET /admin/v2/persistent/public/functions/coordinate/stats HTTP/1.1" 200 887 "-" "Pulsar-Java-v2.5.0" 5
09:48:02.838 [pulsar-web-57-1] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:48:02 +0900] "GET /admin/v2/persistent/public/default/my-topic/compaction HTTP/1.1" 200 35 "-" "Pulsar-Java-v2.5.0" 2
09:48:04.646 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Closing consumer: 1
09:48:04.646 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.AbstractDispatcherSingleActiveConsumer - Removing consumer Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:48:04.646 [pulsar-io-50-7] INFO  org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:60180] Closed consumer Consumer{subscription=CompactorSubscription{topic=persistent://public/default/my-topic, name=__compaction}, consumerId=1, consumerName=b411f, address=/127.0.0.1:60180}
09:48:04.646 [pulsar-client-io-85-1] INFO  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://public/default/my-topic] [__compaction] Closed consumer
09:48:05.941 [pulsar-web-57-3] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:48:05 +0900] "GET /admin/v2/persistent/public/default/my-topic/compaction HTTP/1.1" 200 79 "-" "Pulsar-Java-v2.5.0" 2
09:48:27.417 [pulsar-web-57-6] INFO  org.eclipse.jetty.server.RequestLog - 127.0.0.1 - - [31/1/2020:09:48:27 +0900] "GET /admin/v2/persistent/public/functions/coordinate/stats HTTP/1.1" 200 887 "-" "Pulsar-Java-v2.5.0" 5

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 15 (12 by maintainers)

Commits related to this issue

Most upvoted comments

@sijie You are right, I will write an issue.