postgres-nio: Connections to Azure Postgres dbs raise `NIOSSLError.uncleanShutdown` on shutdown

Description

This is a follow-up to a discussion on Discord where I first asked about the error and was advised to raise an issue.

We’ve recently switched the Swift Package Index’s db from a non-SSL docker Postgres instance to Azure-hosted Postgres, which uses SSL. Seemingly every command (Vapor.Command.run) that we’re running in batch jobs is reporting

[ ERROR ] Channel error caught. Closing connection. [psql_connection_id: E70F3BE3-BBE1-4CD7-AE49-83DFF7877C13, psql_error: PSQLError(base: PostgresNIO.PSQLError.Base.connectionError(underlying: NIOSSL.NIOSSLError.uncleanShutdown))]

on shutdown.

This doesn’t seem to affect operation except that it’s really hard to diagnose issues with all the error noise.

It’s not happening in the request loops of the webserver, so it seems to be an issue with Vapor.Command.run exiting uncleanly.

It only seems to be happening with Azure dbs. Running against a local dockerized PG db with SSL (same version, 11.6) does not exhibit the problem.

I’ve raised an issue with Azure to see if it’s a config/server issue.

To Reproduce

One example command that triggers the error is here, but there’s also a stand-alone recipe to observe the error, which I’m attaching as a playground.

It can also be re-created as follows:

Run Arena to create a new playground with the required dependencies:

arena https://github.com/vapor/vapor https://github.com/vapor/fluent https://github.com/vapor/fluent-postgres-driver -o ssl-error-repro

and add:

import Vapor
import Fluent
import FluentPostgresDriver

final class Test: Model, Content {
    static let schema = "test"

    @ID(key: .id)
    var id: UUID?
}

func run() throws {
    let app = Application(.testing)
    defer { app.shutdown() }

    let tlsConfig: TLSConfiguration = .forClient()
    app.databases.use(.postgres(hostname: "pgnio-debug.postgres.database.azure.com",
                                port: 5432,
                                username: "test@pgnio-debug",
                                password: "<ask me for the password>",
                                database: "test",
                                tlsConfiguration: tlsConfig), as: .psql)

    let db = app.db

    let t = try Test.find(UUID("cafecafe-cafe-cafe-cafe-cafecafecafe"), on: db).wait()
    print("t: \(String(describing:  t))")
}

try run()

When running, the console should print

t: Optional(Test(output: ["test_id": CAFECAFE-CAFE-CAFE-CAFE-CAFECAFECAFE]))
2021-04-19T12:05:25+0200 error codes.vapor.postgres : psql_connection_id=A8F4884F-DCB4-47CD-AB80-A48A3D320F00 psql_error=uncleanShutdown Channel error caught.
2021-04-19T12:05:25+0200 error codes.vapor.postgres : psql_connection_id=A8F4884F-DCB4-47CD-AB80-A48A3D320F00 psql_error=PSQLError(base: PostgresNIO.PSQLError.Base.connectionError(underlying: NIOSSL.NIOSSLError.uncleanShutdown)) Channel error caught. Closing connection.

And the expectation is that it shouldn’t 😆

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 2
  • Comments: 15 (11 by maintainers)

Most upvoted comments

(It’d probably be good if trace masked these though)

Whoops, forgot to set the trace - hold on.

FYI, I’ve received a reply from Azure support:

I checked the telemetry of your database and everything seems to be fine. The type of error you are getting is most likely related to the Vapor layer or how it sets up the connection. Postgres connections using TLS sometimes get closed after extended periods of inactivity. The next request handler that fetches such a connection from the connection pool will then fail with an NIOOpenSSL.OpenSSLError.uncleanShutdown error.

Highly unlikely, as the repro case runs within a second or less.

From the Database side, you can also try to configure the parameters statement_timeout and idle_in_transaction_session_timeout. Please check whether these are equal in your docker implementation. As a reference, check this article: Tracking and managing your Postgres connections - Craig Kerstiens.

From the Vapor/Swift side, please try to use ‘ignoreUncleanShutdown = true’ and see whether the error is bypassed by the interface.

I’m not sure what ignoreUncleanShutdown they’re referring to - it’s not a Postgres setting as far as I can see. I’ll follow up to ask. Maybe they’re just speculating there might be a flag like that in PostgresNIO.

More on the connection management topic for Application Layer: Troubleshoot PostgreSQL: ‘An existing connection was forcibly closed by the remote host’ - Microsoft Tech Community
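
On the ignoreUncleanShutdown suggestion above: I don’t know of such a flag in PostgresNIO, but here’s a minimal sketch of what it could boil down to, assuming the only goal is to stop reporting an unclean TLS shutdown as an error once the client itself has already started closing the connection. Everything in it is hypothetical and made up for illustration: TLSError is a stand-in for NIOSSLError (only the relevant case is modelled), and ConnectionState, LogLevel and logLevel(for:in:) are not part of any library.

```swift
// Hypothetical stand-in for NIOSSLError; not the real type.
enum TLSError: Error, Equatable {
    case uncleanShutdown   // peer closed the socket without sending TLS close_notify
    case handshakeFailed
}

// Hypothetical connection lifecycle, reduced to the two states that matter here.
enum ConnectionState {
    case active    // connection is in use or idle in the pool
    case closing   // the client has already initiated the close
}

// Hypothetical log severity.
enum LogLevel: String {
    case debug
    case error
}

/// Decide how loudly a channel error should be reported.
/// An unclean TLS shutdown is downgraded to debug only while we are
/// closing the connection ourselves; every other combination stays an error.
func logLevel(for error: Error, in state: ConnectionState) -> LogLevel {
    if let tlsError = error as? TLSError,
       tlsError == .uncleanShutdown,
       state == .closing {
        return .debug
    }
    return .error
}

// Usage: the same error is noise during shutdown but a real problem mid-query.
let duringShutdown = logLevel(for: TLSError.uncleanShutdown, in: .closing)
let duringQuery = logLevel(for: TLSError.uncleanShutdown, in: .active)
print(duringShutdown.rawValue, duringQuery.rawValue)
```

The point of keying on the state is that an unclean shutdown on an active connection would still surface as an error, so dropped connections in the request loop wouldn’t be hidden.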