disruptor: Dead lock observed in BlockingWaitStrategy in Log 4J

We have seen this behavior in during high load. Where Logging Got Stropped and Application Went to not responsive state. log4J Version : 2.2 Disruptor Version : 3.3.2 Ring Buffer Size : 128 Producer(Multiples Threads) and Consumer Threads(Single Thread As per Log 4J Configuration) Started Waiting on each other.

Here is the one of the Trace from Thread Dump:

Producer : “[ACTIVE] ExecuteThread: ‘7’ for queue: ‘weblogic.kernel.Default (self-tuning)’” TIMED_WAITING sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:349) com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136) com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105) com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:444) com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:256) org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:285) org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:722) org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:693) org.apache.logging.log4j.jcl.Log4jLog.debug(Log4jLog.java:81)

Consumer Thread :

“AsyncLogger-1” waiting for lock java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@5d972983 WAITING sun.misc.Unsafe.park(Native Method) java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043) com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:45) com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:55) com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:123) java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) java.lang.Thread.run(Thread.java:744)

Is this is known issue which got already fixed in recent build ?

Before raising this issue in Log4J thought of checking here since this dead lock related to LMAX implementation.

Can you please take a look into it ?, it seems to be critical for our application. Let me know if any other details required ?. Your Quick Support is much appreciated.

Code Snippet for BlockingWaitStrategy :

public final class BlockingWaitStrategy implements WaitStrategy
{
    private final Lock lock = new ReentrantLock();
    private final Condition processorNotifyCondition = lock.newCondition();

    @Override
    public long waitFor(long sequence, Sequence cursorSequence, Sequence dependentSequence, SequenceBarrier barrier)
        throws AlertException, InterruptedException
    {
        long availableSequence;
        if ((availableSequence = cursorSequence.get()) < sequence)
        {
           lock.lock();
            try
            {
                while ((availableSequence = cursorSequence.get()) < sequence)
                {
                    barrier.checkAlert();
                    processorNotifyCondition.await();
                }
            }
            finally
            {
                lock.unlock();
            }
        }

        while ((availableSequence = dependentSequence.get()) < sequence)
        {
            barrier.checkAlert();
        }

        return availableSequence;
    }

    @Override
    public void signalAllWhenBlocking()
    {
        lock.lock();
        try
        {
            processorNotifyCondition.signalAll();
        }
        finally
        {
            lock.unlock();
        }
    }
}

Regards, Sakumar

About this issue

  • Original URL
  • State: closed
  • Created 9 years ago
  • Comments: 47 (22 by maintainers)

Most upvoted comments

And we had this deadlock observed in Linux environment…