pgjdbc: PgJDBC can experience client/server deadlocks during batch execution

PgJDBC can encounter client/server deadlocks during batch execution, where the server is waiting for the client and the client is waiting for the server. Neither can progress and one must be terminated.

The client cannot continue until the server consumes some input from the server’s receive buffer (the client’s send buffer).

The server cannot continue until the client consumes some input from the client’s receive buffer (the server’s send buffer).

Each is blocked trying to send to the other. Neither can receive until the other sends.

PgJDBC tries to prevent this case from arising with some heuristics in its batch facilities, where it attempts to limit the number of queries that may be queued; see org.postgresql.core.v3.QueryExecutorImpl and the comments around MAX_BUFFERED_QUERIES. The coarse heuristic of assuming 250 bytes per server reply and a 64kb server send buffer can be defeated by large numbers of asynchronous messages like NOTIFYs or non-fatal RAISE messages, which are not counted against the estimate. The heuristic was introduced in commit c1a939f6, with a follow-up commit 985c0472 that restricts batching to queries that don’t return generated keys.
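For illustration, here is a minimal sketch of the kind of accounting that heuristic performs. The class, field, and method names are invented for this sketch and the flush mechanics are simplified; this is not the actual QueryExecutorImpl code:

// Simplified sketch of the reply-size heuristic described above.
// All names here are illustrative assumptions, not pgjdbc's real code.
class ReplySizeHeuristicSketch {
    static final int ASSUMED_SERVER_SEND_BUFFER_BYTES = 64000; // assumed 64kb server send buffer
    static final int ESTIMATED_REPLY_BYTES_PER_QUERY = 250;    // coarse per-query reply estimate

    private int estimatedPendingReplyBytes = 0;

    /** Called once per batch query before its messages are queued. */
    boolean mustDrainRepliesBeforeQueueing() {
        if (estimatedPendingReplyBytes + ESTIMATED_REPLY_BYTES_PER_QUERY
                > ASSUMED_SERVER_SEND_BUFFER_BYTES) {
            // The server's send buffer is estimated to be nearly full of
            // pending replies; the client must read them before sending more.
            // Asynchronous traffic such as NOTIFYs or non-fatal RAISE
            // messages is NOT counted here, which is exactly how the
            // estimate can be defeated.
            estimatedPendingReplyBytes = 0;
            return true;
        }
        estimatedPendingReplyBytes += ESTIMATED_REPLY_BYTES_PER_QUERY;
        return false;
    }
}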

The main reason deadlocks are rare in practice is that the assumed 64kb buffer size is now unrealistically small; on my Linux system, default buffers are 200kb for both send and receive, giving us 400kb of buffer space to work with.
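To see what buffer sizes your JVM actually gets, you can ask the socket directly; a quick check, with the host and port here being placeholder assumptions:

import java.net.Socket;

// Prints the OS-provided socket buffer sizes a JDBC connection would get.
public class SocketBufferCheck {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket("localhost", 5432)) {
            System.out.println("SO_SNDBUF: " + s.getSendBufferSize());
            System.out.println("SO_RCVBUF: " + s.getReceiveBufferSize());
        }
    }
}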

I’ve produced a very artificial test case showing that a deadlock can still occur; see TestDeadlock.java in https://github.com/ringerc/pgjdbc-batchtest . While that test is very heavy-handed in producing the deadlock, real-world cases can and do arise.
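For a feel of the shape of such a reproduction, here is a condensed sketch (not the actual TestDeadlock.java; connection details, row counts, and message sizes are placeholder assumptions): a per-row trigger that RAISEs a large non-fatal message generates the asynchronous traffic the 250-bytes-per-reply estimate misses, while a large batch keeps the client busy sending.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

// Condensed sketch of the deadlock reproduction described above.
public class DeadlockSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "test", "test")) {
            try (Statement st = conn.createStatement()) {
                st.execute("CREATE TABLE IF NOT EXISTS deadlock_demo1"
                        + "(id int, largetext text)");
                // Per-row trigger that RAISEs a large non-fatal WARNING:
                // the asynchronous traffic the heuristic doesn't count.
                st.execute("CREATE OR REPLACE FUNCTION noisy_trigger()"
                        + " RETURNS trigger LANGUAGE plpgsql AS $$ BEGIN"
                        + " RAISE WARNING '%', repeat('abcdefgh', 10000);"
                        + " RETURN NEW; END $$");
                st.execute("DROP TRIGGER IF EXISTS noisy ON deadlock_demo1");
                st.execute("CREATE TRIGGER noisy AFTER INSERT ON deadlock_demo1"
                        + " FOR EACH ROW EXECUTE PROCEDURE noisy_trigger()");
            }
            conn.setAutoCommit(false);
            String bigValue = new String(new char[8192]).replace('\0', 'x');
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO deadlock_demo1(id, largetext) VALUES (?, ?)")) {
                for (int i = 0; i < 10000; i++) {
                    ps.setInt(1, i);
                    ps.setString(2, bigValue);
                    ps.addBatch();
                }
                // With enough rows and large enough warnings, both sides
                // block mid-send here: the deadlock described above.
                ps.executeBatch();
            }
            conn.commit();
        }
    }
}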

The client’s stack looks like:

SocketOutputStream.socketWrite(byte[], int, int) line: 113  
SocketOutputStream.write(byte[], int, int) line: 159    
BufferedOutputStream.write(byte[], int, int) line: 122  
BufferedOutputStream(FilterOutputStream).write(byte[]) line: 97 
PGStream.Send(byte[]) line: 224 
SimpleParameterList.writeV3Value(int, PGStream) line: 344   
QueryExecutorImpl.sendBind(SimpleQuery, SimpleParameterList, Portal, boolean) line: 1387    
QueryExecutorImpl.sendOneQuery(SimpleQuery, SimpleParameterList, int, int, int) line: 1622  
QueryExecutorImpl.sendQuery(V3Query, V3ParameterList, int, int, int, QueryExecutorImpl$ErrorTrackingResultHandler) line: 1137   
QueryExecutorImpl.execute(Query[], ParameterList[], ResultHandler, int, int, int) line: 396 
Jdbc4PreparedStatement(AbstractJdbc2Statement).executeBatch() line: 2897

The server’s stack looks something like:

#0  0x0000003bc68f5dad in __libc_send (fd=10, buf=0x20eee90, n=8192, flags=-1, flags@entry=0) at ../sysdeps/unix/sysv/linux/x86_64/send.c:27
#1  0x00000000005a6c47 in secure_write (port=<optimized out>, ptr=ptr@entry=0x20eee90, len=len@entry=8192) at be-secure.c:396
#2  0x00000000005b008b in internal_flush () at pqcomm.c:1252
#3  0x00000000005b01dd in internal_putbytes (
    s=0x2b452a1 "abcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefghabcdefgh"..., s@entry=0x2ae52c0 "SWARNING", len=655444) at pqcomm.c:1198
#4  0x00000000005b11f0 in pq_putmessage (msgtype=78 'N', s=0x2ae52c0 "SWARNING", len=<optimized out>) at pqcomm.c:1395
#5  0x00000000005b1e64 in pq_endmessage (buf=buf@entry=0x7fff12cf9300) at pqformat.c:346
#6  0x0000000000721ff7 in send_message_to_frontend (edata=<optimized out>) at elog.c:2887
#7  EmitErrorReport () at elog.c:1429
#8  0x000000000072190d in errfinish (dummy=<optimized out>) at elog.c:488
#9  0x00007fc526f1f47b in exec_stmt_raise (estate=estate@entry=0x7fff12cf9780, stmt=stmt@entry=0x21eecb0) at pl_exec.c:3027
#10 0x00007fc526f20e55 in exec_stmt (stmt=0x21eecb0, estate=0x7fff12cf9780) at pl_exec.c:1442
#11 exec_stmts (estate=0x7fff12cf9780, stmts=<optimized out>) at pl_exec.c:1345
#12 0x00007fc526f22d1b in exec_stmt_block (estate=estate@entry=0x7fff12cf9780, block=0x21ef758) at pl_exec.c:1283
#13 0x00007fc526f237c3 in plpgsql_exec_trigger (func=func@entry=0x21e9310, trigdata=0x7fff12cf9f30) at pl_exec.c:689
#14 0x00007fc526f18b3a in plpgsql_call_handler (fcinfo=0x7fff12cf9ac0) at pl_handler.c:120
#15 0x00000000005672c7 in ExecCallTriggerFunc (trigdata=trigdata@entry=0x7fff12cf9f30, tgindx=tgindx@entry=0, finfo=finfo@entry=0x21eb240, instr=instr@entry=0x0, 
    per_tuple_context=per_tuple_context@entry=0x21e90f0) at trigger.c:1871
#16 0x000000000056857a in AfterTriggerExecute (trigdesc=0x21eb320, trigdesc=0x21eb320, per_tuple_context=0x21e90f0, instr=0x0, finfo=0x21eb240, rel=0x7fc527133858, 
    event=0x21e7100) at trigger.c:3454
#17 afterTriggerInvokeEvents (events=events@entry=0x21c1420, firing_id=4, estate=estate@entry=0x21eb100, delete_ok=delete_ok@entry=1 '\001') at trigger.c:3633
#18 0x000000000056c8be in AfterTriggerEndQuery (estate=estate@entry=0x21eb100) at trigger.c:3823
#19 0x000000000057f3dc in standard_ExecutorFinish (queryDesc=0x21d0998) at execMain.c:379
#20 0x0000000000658fe8 in ProcessQuery (plan=<optimized out>, sourceText=0x21d08a0 "INSERT INTO deadlock_demo1(id, largetext) VALUES ($1,$2)", params=0x21d0908, 
    dest=0xb1a160 <donothingDR>, completionTag=0x7fff12cfa420 "INSERT 0 1") at pquery.c:225
#21 0x000000000065918c in PortalRunMulti (portal=portal@entry=0x21b1060, isTopLevel=isTopLevel@entry=1 '\001', dest=0xb1a160 <donothingDR>, dest@entry=0x217f1a0, 
    altdest=0xb1a160 <donothingDR>, altdest@entry=0x217f1a0, completionTag=completionTag@entry=0x7fff12cfa420 "INSERT 0 1") at pquery.c:1279
#22 0x0000000000659c7c in PortalRun (portal=0x21b1060, count=1, isTopLevel=<optimized out>, dest=0x217f1a0, altdest=0x217f1a0, completionTag=0x7fff12cfa420 "INSERT 0 1")
    at pquery.c:816
#23 0x00000000006573bc in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:1931
#24 0x000000000045d9ad in BackendRun (port=0x20ed950) at postmaster.c:4011
#25 BackendStartup (port=0x20ed950) at postmaster.c:3685
#26 ServerLoop () at postmaster.c:1586
#27 0x00000000006161f3 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x20c72f0) at postmaster.c:1253
#28 0x000000000045e558 in main (argc=5, argv=0x20c72f0) at main.c:226

Most upvoted comments

Changing the size of the output buffer has been shown to help, but currently changing timeouts won’t help.

Can you please explain how to change the size of the output buffer? 😃
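One option, assuming a driver version that supports them, is pgjdbc’s sendBufferSize and receiveBufferSize connection properties, which set SO_SNDBUF/SO_RCVBUF on the connection’s socket; a sketch with placeholder credentials and arbitrary sizes:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

// Sketch: enlarging the connection's socket buffers via pgjdbc's
// sendBufferSize/receiveBufferSize properties (values in bytes; check
// that your driver version actually supports these).
public class BufferSizeExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "test");                 // placeholder credentials
        props.setProperty("password", "test");
        props.setProperty("sendBufferSize", "1048576");    // 1 MB SO_SNDBUF
        props.setProperty("receiveBufferSize", "1048576"); // 1 MB SO_RCVBUF
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", props)) {
            // ... run batches as usual; larger buffers make the deadlock
            // less likely but do not eliminate it.
        }
    }
}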

We’ve found that reducing the batch size avoids the behaviour (a sketch of chunked batching follows the settings below). In addition, we made sure the operating system’s TCP configuration allocates larger buffers, e.g.:

net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.core.rmem_default = 134217728
net.core.wmem_default = 134217728
net.ipv4.tcp_mem = 18569037 24758719 37138074
net.ipv4.tcp_rmem = 10240 87380 134217728
net.ipv4.tcp_wmem = 10240 87380 134217728
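For the batch-size reduction, the idea is simply to flush and drain results every N statements rather than queueing the whole batch; a minimal sketch, with the chunk size chosen arbitrarily:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of bounding the batch size: call executeBatch() every CHUNK rows
// so the server never accumulates more than a chunk's worth of pending
// replies. CHUNK is an arbitrary illustrative value; tune for the workload.
class ChunkedBatchSketch {
    static final int CHUNK = 128;

    static void insertInChunks(Connection conn, String[] values) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO deadlock_demo1(id, largetext) VALUES (?, ?)")) {
            int queued = 0;
            for (int i = 0; i < values.length; i++) {
                ps.setInt(1, i);
                ps.setString(2, values[i]);
                ps.addBatch();
                if (++queued == CHUNK) {
                    ps.executeBatch(); // drain the server's replies before queueing more
                    queued = 0;
                }
            }
            if (queued > 0) {
                ps.executeBatch(); // flush the remainder
            }
        }
    }
}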

This is really just fiddling around the edges and trying to second-guess the behaviour. It isn’t a solution.