neo4j-java-driver: Intermittent SSL socket connection failure

Neo4j Java driver version: 1.0.4
Neo4j server version: 3.0.3

We are using a 3-node HA cluster setup in AWS with 2 ELBs: one for read operations pointing to the slave nodes, and one for write operations pointing to the master node. The ELBs are configured to use the Neo4j management endpoints for health checks and to fail over when one of the nodes goes down and the ‘master’ moves. The ELBs are also configured to pass SSL traffic through to the back-end servers, so SSL termination is done on the Neo4j server instances. Our application code has separate Neo4j Driver object instances for read and write operations; each connects to the corresponding ELB using the Bolt protocol with encryption required.

The problem is that the Neo4j Driver periodically fails to establish an SSL connection. It seems that after some period of inactivity, a request to read something from the graph fails with an SSL connection error. Issuing the same request again succeeds.

Here is the relevant stack trace:

org.neo4j.driver.v1.exceptions.ClientException: Failed to establish SSL socket connection.
    at org.neo4j.driver.internal.connector.socket.TLSSocketChannel.unwrap(TLSSocketChannel.java:179)
    at org.neo4j.driver.internal.connector.socket.TLSSocketChannel.read(TLSSocketChannel.java:374)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.readNextPacket(BufferingChunkedInput.java:408)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.readChunkSize(BufferingChunkedInput.java:344)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.read(BufferingChunkedInput.java:246)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.fillScratchBuffer(BufferingChunkedInput.java:215)
    at org.neo4j.driver.internal.connector.socket.BufferingChunkedInput.readByte(BufferingChunkedInput.java:109)
    at org.neo4j.driver.internal.packstream.PackStream$Unpacker.unpackStructHeader(PackStream.java:441)
    at org.neo4j.driver.internal.messaging.PackStreamMessageFormatV1$Reader.read(PackStreamMessageFormatV1.java:397)
    at org.neo4j.driver.internal.connector.socket.SocketClient.receiveOne(SocketClient.java:130)
    at org.neo4j.driver.internal.connector.socket.SocketClient.receiveAll(SocketClient.java:124)
    at org.neo4j.driver.internal.connector.socket.SocketConnection.receiveAll(SocketConnection.java:121)
    at org.neo4j.driver.internal.connector.socket.SocketConnection.sync(SocketConnection.java:100)
    at org.neo4j.driver.internal.connector.ConcurrencyGuardingConnection.sync(ConcurrencyGuardingConnection.java:122)
    at org.neo4j.driver.internal.pool.PooledConnection.sync(PooledConnection.java:144)
    at org.neo4j.driver.internal.InternalSession.close(InternalSession.java:130)

Here are relevant code snippets:

Driver neo4jReadDriver = GraphDatabase.driver(serverURI,
            AuthTokens.basic(username, password),
            Config.build()
                    .withEncryptionLevel(Config.EncryptionLevel.REQUIRED)
                    .toConfig());

private StatementResult run(Driver neo4jDriver, String statementTemplate, Map<String, Object> statementParameters) {
    try (Session neo4jSession = neo4jDriver.session()) {
        return neo4jSession.run(statementTemplate, statementParameters);
    }
}

String cypherStatement = "<cypher>";
HashMap<String, Object> params = new HashMap<>();
StatementResult result = run(neo4jReadDriver, cypherStatement, params);

The SSL connection failure happens at the end of the ‘try’ block, when the session is closed. An immediate retry of the same call succeeds.

Are there any recommended configuration settings for using the Neo4j driver when deploying into AWS behind ELBs? Have the Neo4j drivers been tested in HA configurations using AWS and ELBs?

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 15 (5 by maintainers)

Most upvoted comments

Hello,

I just want to update this old ticket with references to a couple of new APIs which might be helpful here:

  1. Connection liveness check timeout configuration setting: https://github.com/neo4j/neo4j-java-driver/blob/1.2.0/driver/src/main/java/org/neo4j/driver/v1/Config.java#L291-L317. This should force the driver to re-acquire a connection when it is set to a value lower than the load balancer's idle connection timeout.
  2. Transaction function APIs that allow retries with exponential backoff. More information can be found in the docs: https://neo4j.com/docs/developer-manual/current/drivers/sessions-transactions/#driver-transactions-transaction-functions.

Hope this helps.
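
To illustrate how the two APIs mentioned above might be combined, here is a minimal sketch against the 1.2+ driver. The class name `ElbFriendlyDriver`, the helper names, and the 30-second value are illustrative assumptions, not part of the original report; the value only makes sense if the ELB idle timeout is longer (60 seconds is assumed here). It needs the neo4j-java-driver dependency on the classpath.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Config;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Record;
import org.neo4j.driver.v1.Session;

public class ElbFriendlyDriver {

    // Assumption: the load balancer drops idle connections after 60 seconds,
    // so pooled connections that were idle for more than 30 seconds are
    // tested for liveness before being handed out again.
    static Driver create(String serverURI, String username, String password) {
        Config config = Config.build()
                .withEncryptionLevel(Config.EncryptionLevel.REQUIRED)
                .withConnectionLivenessCheckTimeout(30, TimeUnit.SECONDS)
                .toConfig();
        return GraphDatabase.driver(serverURI, AuthTokens.basic(username, password), config);
    }

    // Runs a read query inside a transaction function. The driver may invoke
    // the lambda more than once when it retries, so the work should be
    // idempotent. The records are collected before the session is closed.
    static List<Record> read(Driver driver, String cypher, Map<String, Object> params) {
        try (Session session = driver.session()) {
            return session.readTransaction(tx -> tx.run(cypher, params).list());
        }
    }
}
```

Which failures the transaction function actually retries depends on the driver version, so this should be checked against the documentation linked above rather than taken from this sketch.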

@tavolate I posted my retry logic above. In my case, I encapsulated this code in a base class used by all of my Neo4j data access classes so that all Cypher queries are executed with retry capability.
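
The retry logic referred to in that comment is not reproduced here. As a rough sketch of the general idea only (not the commenter's actual code), a base-class helper could wrap each query in a retry loop with exponential backoff. The class name `RetryWithBackoff` and its parameters are hypothetical, and the sketch uses only the Java standard library.

```java
import java.util.concurrent.Callable;

public class RetryWithBackoff {

    /**
     * Calls the operation until it succeeds or maxAttempts is reached,
     * doubling the pause between attempts. Rethrows the last failure.
     */
    public static <T> T call(Callable<T> operation, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Assumption: every exception is treated as retryable here.
                // Real code would likely retry only connection failures.
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                Thread.sleep(delay);
                delay *= 2; // exponential backoff
            }
        }
        throw last;
    }
}
```

A data access class could then call something like `RetryWithBackoff.call(() -> runQuery(...), 3, 100L)`, where `runQuery` stands in for whatever method executes the Cypher statement and fully consumes its result.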