postgres-nio: Error detail regression

I’m not sure if this is a fluent-postgres-driver issue or one in postgres-kit - but FWIW, I can avoid it by pinning to .package(url: "https://github.com/vapor/fluent-postgres-driver.git", exact: "2.6.2").

We have a test that asserts on the error description when violating a not-null constraint:

        do {  // commit unset
            let v = Version()
            v.commitDate = .distantPast
            v.reference = .branch("main")
            try await v.save(on: app.db)
            XCTFail("save must fail")
        } catch {
            // validation
            XCTAssertEqual(error.localizedDescription,
                           #"server: null value in column "commit" of relation "versions" violates not-null constraint (ExecConstraints)"#)
        }

With the latest version of fluent-postgres-driver this starts failing because the error message is now The operation couldn’t be completed. (PostgresNIO.PSQLError error 1.),

Of course the exact error message isn’t important but the point of the test is to ensure that we have relevant information about an error in the logs.

We have another test that checks unique key violations:

            XCTAssert($0.localizedDescription.contains(
                #"duplicate key value violates unique constraint "idx_repositories_owner_name""#),
                      "was: \($0.localizedDescription)"
            )

That test is also failing due to that same new error message, The operation couldn’t be completed. (PostgresNIO.PSQLError error 1.).

Ideally, both violations would still be reported as such (with some detail as to which kind of constraint was violated) and also include the name of the constraint that was violated (if available).

Is there a way to recover this error information with the latest version of fluent-postgres-driver/postgres-kit?

About this issue

Original URL
State: open
Created a year ago
Comments: 40 (39 by maintainers)

Most upvoted comments

@fabianfett asked me to write up a summary of the “ideal” error interface from our point of view. Here’s what I’ve come up with 🙂

The original interface worked just fine for us, so I think in terms of an API that was pretty much what we needed.

One thing that stood out (but I’m not sure how fixable that is), was that we were reaching beyond Fluent into a PG specific error. That’s certainly fine and we’re so tied to PG that this isn’t really a problem. It’s just slightly odd that for specific error handling you then need to import PG modules. A Fluent wrapper (like @gwynne suggested above) would be closer what I’d have wished for.

I haven’t thought through how you’d pass along specific error codes but I think preserving any explicit codes as they are thrown by the underlying driver would be helpful. A “self-documenting” enum (or struct with static) that gives them labels would be great but getting at the underlying raw value also feels quite important for web searches etc when something goes wrong. Old PostgresError had all this and was pretty much what we needed.

My understanding is that new PSQLError’s main raison d’être is to prevent leaking of sensitive data, which of course is desirable. But I wonder if there isn’t a narrower way of addressing this than masking all errors?

I’m sure you’ve considered all this but just for the sake of thinking out loud it seems the options are:

Mask all errors (new PSQLError)
Mask no errors (old PostgresError)
Audit sensitive errors and mask only those
Make all of the above configurable via env variable and/or tied to build settings (release/debug)

I’m not sure if 3. is feasible but I think it’d be the ideal version of this.

Earlier, we discussed how an env variable could configure this and the suggestion was to tie this to debug mode. However, I can see that making trouble-shooting of live issues quite hard. I don’t think that’s a setting we’d want, having thought about it some more.

I can see how bigger orgs / setups than ours cannot afford to treat their logs as secret and would have a need for this, maybe even as the default. But turning every error into The operation couldn’t be completed. (PostgresNIO.PSQLError error 1.) is really problematic operationally, where clear messages help tremendously to diagnose and fix things quickly.

I’m wondering: How would I actually get an informative error message in case something goes wrong in prod? Re-deploy with a new error handler somewhere that does the String(describing:) logging? How would I even know where to insert that in a hurry? The info to guide me where to put it is exactly what’s hidden 😧

And if it’s tied to an env variable with a debug build, that would mean a ~30min turnaround and running a live system with -c debug. That doesn’t feel like a great solution.

Also, there mere act of redeploying might make errors go away and now you might be constantly fiddling with the logging until you catch the error with the right setup.

So I think if at all possible, only masking the actually sensitive errors is the best option (i.e. 3.).

Does that description make sense and help?

I hope it does, and thanks a lot for your time looking into this and for your help.

finestructure on Aug 9, 2023

@rausnitz So there are a lot of factors in play here:

Fluent consists of lots of moving parts; it and SQLKit both try to be abstraction layers over multiple database backends (with mediocre at best success in many ways, it must be admitted). Most of those backends have been relatively stable for some time, so when change does happen, it almost can’t help but be disruptive.
PostgresNIO has, to date, easily seen more changes and updates than any of the others, in no smart part because it’s the most popular of Fluent’s drivers. (Although the MySQL backend is about to undergo a similar upheaval.) During all the work done in its overhaul, PostgresNIO tried desperately to not break adopters. Fabian would be the first to admit that this wasn’t always successful and that sometimes small patch releases were necessary towards this end, but he’d also point out that the intent was always there. I’d also add that there are very few people actively contributing code to these packages right now; we can only do so much at a time (hence how long it’s taken me to respond to this properly!).
PostgresNIO introduced a number of new, enormously improved APIs in the last year, and started to deprecate the existing APIs. The new APIs, which FluentPostgresDriver and PostgresKit have been rewritten to take as much advantage of as is practical, return the new PSQLError because it is vastly better than the old PostgresError, containing considerably more information; Fabian and I both made the call that it was better to provide the new error type than not, despite the lack of any higher-level abstraction.
In an ideal world, Fluent would have never forwarded errors from the lower level database drivers verbatim at all; it would have wrapped them in a higher-level Fluent error type that gave access to the underlying error at need. Whether this would have allowed better abstraction for this change is unclear, but it probably would have at the very least resulted in fewer people feeling the side effects. Unfortunately, this is not an ideal world, and present-day Fluent 4 is still stuck with several design choices Tanner made several years ago.

Also, to specifically address this:

I wonder why this change (throwing PSQLError instead of PostgresError) was not considered a semver major change, since it was a breaking change. Is there another way I can find out when breaking changes are introduced to packages I’m using? Should I generally not be relying on Error types remaining unchanged?

Swift does not consider the specific errors thrown by a given API part of that API’s contract (notwithstanding APIs which use Result); as such, it is not a semver-major break - although it’s obviously more than a little inconvenient in cases like this.

That being said, it’s easily arguable that the overhaul of PostgresNIO should have been a new major version. We went back and forth on this several times during the development, and at the end of the day the fact of the matter is that I probably made the wrong call when I talked Fabian into doing it the way we did; it was a decision made based on the lack of any consensus in Vapor as a whole (at least at the time) on how to handle versioning, and concern about how to leverage the new APIs in higher layers (PostgresKit and FluentPostgresDriver). Since that time there has been considerably more discussion and benefit of experience, and if we were doing it now, I think we’d go the other way. I probably will do exactly that with the MySQLNIO updates - but even if I don’t, I can promise right now there will not be a repeat of this problem 😰.

As for solutions to what we have with PostgresNIO now, there’s considerable reluctance to provide a way to change PostgresNIO’s default behavior, but the current plan is actually better than that - we’re going to make PSQLError.description show at least some information. Exactly what information hasn’t quite been nailed down yet, but I’m going to transfer this issue over to postgres-nio, and we’ll link it to the PR as soon as there is one.

gwynne on Aug 4, 2023

@finestructure We’re still trying to come up with a good answer to this that’s both safe and helpful. Our current thinking is to add an environment variable that switches on verbose-always errors (something like POSTGRES_NIO_VERBOSE_ERRORS; bikeshedding on the name is welcome) and only works in Debug builds - would this be workable for your case?

gwynne on Jul 13, 2023