pgjdbc: Loom compatible - replace synchronized block with for example j.u.c.ReentrantLock

Loom project: https://openjdk.java.net/projects/loom/

With Loom there is a currently known limitation that code should ideally avoid performing IO inside a synchronized block, because a virtual thread that blocks while holding a monitor is pinned to its carrier thread.

The suggestion for making Java code “loom friendly” is to replace the use of synchronized blocks with a java.util.concurrent.locks.Lock (such as ReentrantLock).

An example of the simple case is to replace code like:

void foo() {
  synchronized(this) {
     ... // performs IO or something blocking
  }
}

With code like:

// loom friendly ...
// (requires import java.util.concurrent.locks.Lock and
//  import java.util.concurrent.locks.ReentrantLock)

private final Lock lock = new ReentrantLock();

void foo() {
  lock.lock();
  try {
     ... // performs IO or something blocking
  } finally {
     lock.unlock();
  }
}

PgStatement simple case example

An example use of synchronized(this) in PgStatement is: https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/PgStatement.java#L244 https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/PgStatement.java#L264

More difficult changes

e.g. PgStatement synchronized(connection)

There are more difficult changes in the case where for example PgStatement has synchronized on connection. In these cases there are more design options to consider. https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/PgStatement.java#L906


  • Are we happy to prepare and submit PRs for the simple-case replacement of synchronized(this)?
  • Are we happy to find and replace code using synchronized(this) with code that uses ReentrantLock specifically, or is there some other pattern/lock that people would prefer?

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Reactions: 18
  • Comments: 43 (26 by maintainers)

Commits related to this issue

Most upvoted comments

(I’m working on Project Loom).

If you are testing with Project Loom EA builds then it might be useful to run with -Djdk.tracePinnedThreads=full. This will print a stack trace when a thread is pinned due to attempting a blocking operation while holding a monitor. It might help you here to identify the monitors that are problematic.
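To make the flag concrete, here is a hypothetical minimal reproducer (the class name and sleep duration are invented for illustration; requires a Loom-enabled JDK, e.g. JDK 21+). A virtual thread that blocks while holding a monitor should produce the `<== monitors` trace when the program is run with `-Djdk.tracePinnedThreads=full`:

```java
// Hypothetical sketch. Run with:
//   java -Djdk.tracePinnedThreads=full PinnedDemo.java   (JDK 21+)
// and the JVM prints a stack trace marking the frame that holds the
// monitor with "<== monitors".
public class PinnedDemo {
    private static final Object MONITOR = new Object();

    public static void main(String[] args) throws InterruptedException {
        Thread vt = Thread.ofVirtual().start(() -> {
            synchronized (MONITOR) {       // virtual thread now holds a monitor
                try {
                    Thread.sleep(100);     // blocks while pinned to its carrier
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        vt.join();
    }
}
```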

I’d love to see real world data from folks using this

The meta-reason you’re seeing different results is that the implementation of the OpenJDK VM (HotSpot) changes frequently. Some micro-operation can be faster than another in one release, slower by the same amount in the next, and then faster again in the next. While we treat performance regressions as bugs, 1. changes in the relative speed between two operations is not in itself a regression, and 2. we only consider something a regression if we consider the benchmark “realistic”. So, as a general rule, it is not a good idea to optimise for specific VM behaviour, but if you must, you should repeat all considerations with every release, as things change. I.e. it is incorrect to assume that if A is faster than B in JDK 17 it will also be faster than B in JDK 20.

In this case, not only are you examining a microbenchmark, but one that measures precisely the situation where virtual threads do not benefit from ReentrantLock, i.e. in-memory operations. Because j.u.c locks are needed only when guarding frequent and long-running I/O, I doubt you’ll see some noticeable difference one way or another.

As JDK 19 is GA today, can you please re-evaluate this PR?

@rbygrave and @vlsi from me, only a big thank you.

@cescoffier, please review the stacktrace. You need to watch for <== monitors lines.

In this case the line is

    org.hibernate.id.enhanced.PooledLoOptimizer.generate(PooledLoOptimizer.java:61) <== monitors:1

The corresponding code is https://github.com/hibernate/hibernate-orm/blob/2e73795e4b4fc00bd29686843af346629aa8e7f4/hibernate-core/src/main/java/org/hibernate/id/enhanced/PooledLoOptimizer.java#L56

I would suggest asking hibernate team rather than pgjdbc team so they could clarify if synchronized there could be replaced with java.util.concurrent.locks.Lock

Yes. The microbenchmark guards a very short-lived in-memory operation, and so irrelevant for the places where ReentrantLock is actually helpful for virtual threads. Virtual threads work well with either ReentrantLock or synchronized in these cases. It’s only the cases where synchronized guards a long-running and frequent operation that might adversely affect virtual threads’ scalability.

FWIW, Project Loom is scheduled as a preview feature for JDK 19, still with the limitation in place that synchronized keyword usage leads to carrier (kernel) thread pinning. This limitation seems likely to stay around for the foreseeable future.

That would be awesome if you provided PRs

@Aleksandr-Filichkin I believe we are waiting for me to fix the tests for the PR #2635 . Unfortunately I’ve been very busy in recent weeks and have not got a chance to look at those failing tests yet (+ other things like license text & format, if I recall correctly).

FWIW I ran the hibernate-orm test suite on Loom with the pgsql driver. For that workload, the key problem areas for thread pinning appear to be

  • QueryExecutorImpl execute method
  • PgStatement execute/executeQuery/executeInternal methods
  • PgPreparedStatement executeUpdate/executeWithFlags methods

so if you favour a more targeted approach over replacing all synchronized use, then maybe start with those.

Hibernate - Loom (virtual threads) support tracking issue.

It’s only the cases where synchronized guards a long-running and frequent operation that might adversely affect virtual threads’ scalability

I do not think we would be able to use two types of locks for different cases.

For instance:

  • PgPreparedStatement#setString() is an in-memory op, so in theory it could be covered via synchronized
  • PgPreparedStatement#execute does IO, so it should probably use ReentrantLock

I’m not sure we can use both there.
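To illustrate why mixing the two mechanisms on the same state is problematic, here is a hypothetical sketch (class and method names invented): a thread inside the synchronized block does not hold the ReentrantLock, and vice versa, so the two methods do not exclude each other and concurrent callers can race on the shared field. Any split would have to partition state, not just methods.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: the same counter guarded by two different lock
// mechanisms. There is NO mutual exclusion between incrementSynchronized()
// and incrementWithLock() running concurrently.
public class MixedLocking {
    private final ReentrantLock lock = new ReentrantLock();
    private int counter;

    public void incrementSynchronized() {
        synchronized (this) {   // excludes other synchronized(this) blocks only
            counter++;
        }
    }

    public void incrementWithLock() {
        lock.lock();            // excludes other ReentrantLock holders only
        try {
            counter++;
        } finally {
            lock.unlock();
        }
    }

    public int get() {
        synchronized (this) {
            return counter;
        }
    }
}
```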

However, currently we do not synchronize methods like PgPreparedStatement#setString, PgResultSet#getString, so “degrading ResultSet#getString” is not a question currently.

We do have synchronized on ResultSet#update*; however, I believe “updateable resultset” is not really performance-critical (I believe the feature is rarely used, and it does not support “batch updates”, so “updateable resultset” is an inherently slow API anyway).

So I’m +1 on doing this.

I see it is on main, after the latest release, so I can test from a source build of main.

We push -SNAPSHOT builds to Maven Central on each commit, so you could try 42.6.0-SNAPSHOT: https://oss.sonatype.org/content/repositories/snapshots/org/postgresql/postgresql/42.6.0-SNAPSHOT/

Q: what benefits do you hope to see

I’ll have a crack at that. TL;DR: fewer resources (CPU/memory), which translates to saving money (reducing the cloud hosting cost).

As I see it there are 2 ways that we can look at it which are:

  • Reduced CPU/Memory resources required to process a given load
  • Higher throughput in ops/s with the same amount of CPU/Memory resources.

… I’d suggest these are 2 ways of saying the same thing.

The amount of CPU we will be looking to save will be a function of how much IO wait we are actually incurring. The more IO wait that is actually going on, the bigger the benefit we will see in practice.

We can see these benefits today when we simulate IO wait (using say LockSupport or Thread.sleep). The goal is to try and observe a difference in scenarios with actual JDBC IO that look like our production loads.

How are we going to measure or detect this in the JDBC case? I’d love to hear other people’s plans (please share). My plan rests on the belief that it will be easier to try to observe higher throughput as ops/s: restrict the CPU via Docker/K8s resource limits (e.g. restrict to 8 cores / 8000 millicores) and observe max ops/s with/without Loom on some workload that is a good approximation of production. There are lots of variables around the nature of the queries, JDBC buffering, latency of first response, etc., so there is a lot of detail to look at if we get into that.
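As a sketch of the measurement idea above (all names and numbers here are invented for illustration, not pgjdbc code), one can simulate IO wait with Thread.sleep inside tasks submitted to an executor, then swap the executor implementation to compare a fixed platform-thread pool against virtual threads on the same workload:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical measurement harness: run N tasks that each simulate IO wait,
// and count completions. Swapping the ExecutorService lets you compare
// throughput of platform vs virtual threads under the same simulated load.
public class ThroughputSketch {

    static int run(ExecutorService pool, int tasks, long ioWaitMillis)
            throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(ioWaitMillis); // stand-in for JDBC IO wait
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                done.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // On JDK 21+, swap in Executors.newVirtualThreadPerTaskExecutor()
        // here to compare against the fixed platform-thread pool.
        int completed = run(Executors.newFixedThreadPool(8), 100, 10);
        System.out.println("completed: " + completed);
    }
}
```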


In terms of the ReentrantLock vs synchronized JMH benchmark, I have JDK 1.8.0_291 on Linux x86_64, which gives me:

1 Thread

Benchmark                    Mode  Cnt          Score         Error  Units
LockTest.testReentrantLock  thrpt    6   53457313.215 ± 1000715.641  ops/s
LockTest.testSynchronized   thrpt    6   51451417.213 ± 1494911.154  ops/s
LockTest.testWithoutLock    thrpt    6  400951047.669 ± 1206444.298  ops/s

10 Threads

Benchmark                    Mode  Cnt          Score         Error  Units
LockTest.testReentrantLock  thrpt    6   28008868.789 ±  843736.962  ops/s
LockTest.testSynchronized   thrpt    6   22142469.915 ± 2239755.751  ops/s
LockTest.testWithoutLock    thrpt    6  171936509.097 ± 1767850.946  ops/s

# JMH version: 1.33
# VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
# VM invoker: /usr/lib/jvm/jdk1.8.0_291/jre/bin/java
# VM options: <none>
# Blackhole mode: full + dont-inline hint (default, use -Djmh.blackhole.autoDetect=true to auto-detect)
# Warmup: 5 iterations, 10 s each
# Measurement: 3 iterations, 10 s each
...

Code at: https://github.com/rbygrave/reentrant-vs-synchronized

@microhardsmith I’m curious; what benefits do you hope to see

When it comes to blocking, the most common case is running a JDBC call. I am personally changing my threading model to virtual threads, and I believe a lot more developers will do so in the future. It’s important for the JDBC library to fit Loom so that we can build our applications on the new threading model. It’s not a benefit we are talking about; it’s about availability 😃

ReentrantLock performance compared to synchronized is really not a problem here; as the JMH results show, they don’t differ that much. Plus, ReentrantLock provides more precise control over the code flow and more readability, so I think switching from synchronized to ReentrantLock would be a good idea.
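As one illustration of that “more precise control” point (a hypothetical sketch, not pgjdbc code): ReentrantLock supports timed and interruptible acquisition via tryLock, which plain synchronized cannot express.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a timed, interruptible lock attempt. A caller that
// cannot acquire the lock within the timeout can back off instead of
// blocking indefinitely, something a monitor cannot offer.
public class TimedLockExample {
    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    // Returns true if the update happened within the timeout.
    public boolean tryUpdate(int newValue, long timeoutMillis)
            throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false; // could not acquire the lock in time
        }
        try {
            value = newValue;
            return true;
        } finally {
            lock.unlock();
        }
    }

    public int get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```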

I also don’t think we should keep synchronized and wait for the JDK developers to fix the problem. synchronized in the JVM is implemented in C++; they have currently fixed the pure-Java-level problems, but JNI and the JVM internals are still big challenges, so waiting for that to come would be unpromising (we need you as soon as possible haha)

@rbygrave I /think/ it might be important that the lock objects are final. (It is certainly important for Atomic*FieldUpdaters; I think some of the JVM optimizations apply here also, but am not certain). Another option would be to create a locally scoped copy:

  public long testReentrantLock(){
    final ReentrantLock localLock = reentrantLock;
    localLock.lock();
    try {
      return doSomething();
    } finally {
      localLock.unlock();
    }
  }

Also, I think the testWithoutLock should return the result of doSomething

The benefit is using pgjdbc on virtual threads without performance degradation due to carrier thread pinning 😃 I think replacing synchronized blocks and await/notify with java.util.concurrent structures is safe, it was even done years ago in the JDK itself in the Socket API to prepare for virtual thread support:

https://openjdk.org/jeps/353

It uses java.util.concurrent locks rather than synchronized methods so that it can play well with fibers in the future. In JDK 11, the NIO SocketChannel and the other SelectableChannel implementations were mostly re-implemented with the same goal in mind.
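The same rewrite applies to the await/notify cases mentioned above. Here is a hypothetical sketch (a one-slot queue invented for illustration) using Lock + Condition in place of synchronized + Object.wait()/notifyAll(); await/signal map one-to-one onto wait/notify:

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a bounded one-slot queue. With synchronized this
// would use wait()/notify(); here each wait condition gets its own
// Condition object, which is also clearer than a single shared monitor.
public class OneSlotQueue<T> {
    private final Lock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Condition notFull = lock.newCondition();
    private T item;

    public void put(T value) throws InterruptedException {
        lock.lock();
        try {
            while (item != null) {
                notFull.await();      // was: wait()
            }
            item = value;
            notEmpty.signal();        // was: notify()
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (item == null) {
                notEmpty.await();
            }
            T value = item;
            item = null;
            notFull.signal();
            return value;
        } finally {
            lock.unlock();
        }
    }
}
```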

For example, here is an Aug 2022 benchmark/blog post showing ReentrantLock faster than synchronized using JMH: https://www.mo4tech.com/comparison-of-synchronized-and-reentrantlock-performance-in-java.html

private void doSomething() {
    cnt += 1;
    if (cnt == (Long.MAX_VALUE - 1)) {
        cnt = 0;
    }
}

The benchmark probably suffers from dead code elimination: https://github.com/melix/jmh-gradle-example/blob/a8f204f0f4c48817ed1a575295409f2c181bab4d/src/jmh/java/org/openjdk/jmh/samples/JMHSample_08_DeadCode.java#L63-L64
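One hedged sketch of the DCE-safe fix suggested by that JMH sample (class name invented): have doSomething return the computed value so the benchmark method can return it rather than discard the work. In JMH proper, returning the value from the @Benchmark method (or sinking it into a Blackhole) prevents the JIT from eliminating the computation.

```java
// Hypothetical DCE-safe rewrite: the computed value is returned, so the
// JIT cannot prove the work is unused and eliminate it.
public class Counter {
    private long cnt;

    public long doSomething() {
        cnt += 1;
        if (cnt == (Long.MAX_VALUE - 1)) {
            cnt = 0;
        }
        return cnt;   // returned, not discarded
    }
}
```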

Notes: (A): We could perhaps ask for an indication from the likes of Alan or Ron but it’s likely a question that can’t be answered today or perhaps anytime soon.

Right, impossible to provide a good answer now. Exploration continues, it is a significant effort.

FWIW Just in case it helps, I think there are 3 paths to choose from going forward:

  1. Do nothing and wait for Loom to support synchronized, noting that at this point we have no idea how long that will take (A)
  2. Look at making synchronized to ReentrantLock changes like these PRs into the main branch, noting that there is a detail here (B)
  3. Look at maintaining a semi-official loom fork? Get a “loom branch” and look to merge the synchronized to ReentrantLock changes there. Likely maintained by the community / interested parties? (C)

Notes:
(A): We could perhaps ask for an indication from the likes of Alan or Ron but it’s likely a question that can’t be answered today or perhaps anytime soon.
(B): Although semantically they are the same, synchronized is implemented differently, as a special flag on the object header. I suspect this is the detail that stopped these PRs in the first place?
(C): I’d suggest this could be quite a lot of (mostly rebasing?) work and we could be looking at maintaining this branch for quite a long time. I’d suggest we can’t really ask the maintainers to do this work on top of what they already do. Would the community step up here to maintain this loom branch?

There might be another path I can’t see, otherwise I think we are looking to choose one of those options.

How about this? Can you squash all of these into a new PR just for testing? I’d like to see it pass all of the tests with all of the PRs that you have.

So they all seem to pass the tests individually. I’m wondering how to exercise these changes more rigorously?