rails: PostgreSQL prepared statement failure inside transaction results in a permanently cached prepared statement that's never cleaned up.

ActiveRecord 4.0. Verified with PostgreSQL 9.2 and 9.3 and pg gem 0.16 and 0.17

Assume you have some basic table; we’ll use User for this example.

In a Rails console:

User.find(1)
=> .....

At this point, the PostgreSQL adapter has created a prepared statement for selecting a User by ID and cached it in the connection. To verify:

ActiveRecord::Base.connection.instance_variable_get(:@statements).to_a
=> [
  [0] [
    [0] "\"$user\",public-SELECT  \"users\".* FROM \"users\"  WHERE \"users\".\"id\" = $1 LIMIT 1",
    [1] "a1"
  ]
]

Now from a separate session, let’s modify the schema for the User table. The use case here is adding a column to a live system without causing downtime to users:

ALTER TABLE users ADD new_metric_column integer;

Back in the previous Rails console, we then run this:

User.transaction { User.find(1) }
=> [2013-09-23 17:53:03] DEBUG ActiveRecord::Base        :    (0.2ms)  BEGIN
[2013-09-23 17:53:03] DEBUG ActiveRecord::Base        :   User Load (0.6ms)  SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT 1  [["id", 1]]
[2013-09-23 17:53:03] ERROR ActiveRecord::Base        : PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block
: SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
[2013-09-23 17:53:03] DEBUG ActiveRecord::Base        :    (0.5ms)  ROLLBACK
ActiveRecord::StatementInvalid: PG::InFailedSqlTransaction: ERROR:  current transaction is aborted, commands ignored until end of transaction block
: SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
from /Users/nixme/Projects/ML/flabongo/vendor/bundle/ruby/2.0.0/gems/activerecord-4.0.0/lib/active_record/connection_adapters/postgresql_adapter.rb:512:in `exec'

What’s happening is that since the shape of users has changed, the a1 prepared statement is no longer valid. The AR PostgreSQL adapter correctly sees this and tries to run DEALLOCATE a1. But the transaction’s already invalid at that point, and so even the DEALLOCATE fails, leaving that connection’s @statements pool with a permanently bad prepared statement.
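The ordering problem can be sketched with a toy model in plain Ruby (no database involved; FakeConnection and all of its methods are invented purely for illustration):

```ruby
# Toy model of the failure ordering. In PostgreSQL, once a transaction is
# aborted, every subsequent command (including DEALLOCATE) is rejected
# until ROLLBACK runs.
class FakeConnection
  attr_reader :statements

  def initialize
    @statements = { "a1" => "SELECT ... WHERE id = $1" } # stale prepared statement
    @aborted    = false
  end

  def exec_prepared(name)
    # The table's shape changed, so the cached plan is invalid and the
    # transaction becomes aborted.
    @aborted = true
    raise "cached plan must not change result type"
  end

  def deallocate(name)
    raise "current transaction is aborted" if @aborted
    @statements.delete(name)
  end

  def rollback
    @aborted = false
  end
end

conn = FakeConnection.new

# What AR 4.0 does: attempt DEALLOCATE while still inside the aborted
# transaction -- it fails too, so the stale entry survives in the cache.
begin
  conn.exec_prepared("a1")
rescue RuntimeError
  begin
    conn.deallocate("a1")
  rescue RuntimeError
  end
end
puts conn.statements.keys.inspect # => ["a1"]  (still poisoned)

# What this issue proposes: ROLLBACK first, then DEALLOCATE succeeds.
conn.rollback
conn.deallocate("a1")
puts conn.statements.keys.inspect # => []  (cache is clean again)
```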

For reference, on the PostgreSQL side, the logs show:

ERROR:  cached plan must not change result type
STATEMENT:  SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
ERROR:  current transaction is aborted, commands ignored until end of transaction block
STATEMENT:  DEALLOCATE a1

Now that might seem like an arbitrary transaction, but every AR save is wrapped in a transaction, so a validation or callback might be selecting a User. Imagine a production system after such a migration: every app server still running the old code is now permanently broken until restarted. This makes live migrations without downtime basically impossible, unless I’m missing something.

Ideally, I think, the transaction would be aborted, then DEALLOCATE run outside any transaction so it successfully removes that prepared statement from the connection. The original transaction would then be retried automatically, re-creating the necessary prepared statements along the way.

Another option is to just let this one transaction blow up, but at the very least get that DEALLOCATE to succeed, so that any follow-up transactions will work and the fewest possible requests fail.

For now, we’ve disabled prepared_statements after monkey-patching some fixes from 4-0-STABLE (see #12023).

/cc @tenderlove since I think you’ve done some of the PG prepared statement support?

About this issue

  • State: closed
  • Created 11 years ago
  • Reactions: 3
  • Comments: 78 (26 by maintainers)

Most upvoted comments

This issue appears to be getting a little stale again, but I’m seeing this as well on Rails 4.2.5 with pg gem 0.18.4 and PostgreSQL server 9.4.4.

Thanks for the backport, @HoyaBoya! Took me a while to figure out how to plug it in.

I ended up sticking this in config/initializers/rails_recoverable_transactions.rb:

# https://github.com/rails/rails/issues/12330#issuecomment-215203613

module TransactionRecoverable
  module ClassMethods
    raise "We may no longer need the monkeypatch #{__FILE__}!" if Rails::VERSION::MAJOR > 4

    def transaction(*args)
      super(*args) do
        yield
      end
    rescue PG::InFailedSqlTransaction => e
      # The transaction was poisoned by a stale prepared statement:
      # roll back, flush this connection's statement cache, and retry once.
      connection.rollback_db_transaction
      connection.clear_cache!

      super(*args) do
        yield
      end
    end
  end
end

class << ActiveRecord::Base
  prepend TransactionRecoverable::ClassMethods
end

If you want to verify that it’s plugged in correctly, do this on your local machine:

  • Add puts "hello" on the line before the first super(*args) do.

  • Run something like this in a console (replacing Item with some model in your app):

    rails runner 'Item.transaction { puts Item.count }'

    Make sure it outputs “hello” before the count.

  • Remove your puts "hello" line again.

I’m having a similar issue in my Rails 6 app. Had to disable prepared_statements to get pg out of gridlock.
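If you need the same escape hatch, prepared statements can be disabled per environment in config/database.yml (a minimal sketch; the database name is illustrative):

```yaml
production:
  adapter: postgresql
  database: myapp_production    # illustrative
  prepared_statements: false    # AR sends plain parameterized queries instead
```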

Yo, this should be fixed in master… my PR was merged in March.

See: https://github.com/rails/rails/commit/50c53340824de2a8000fd2d5551cbce2603dc34a

I reckon this issue can be closed now.

Unlocking since it became stale. Please do not comment on this issue with +1. We know people are having the problem, and it will not cause it to be solved any faster.

+1. As a workaround, we are currently selecting all columns explicitly.

Restarting is great in theory, but when you’re doing gradual deploys it gets more tricky. We’ve seen a migration run, but requests being handled by a server that hasn’t been rebooted yet start throwing errors. One strategy I think I’ll give a try is to temporarily disable prepared statements while deploying and then re-enable them again after the migrations have run. That’ll involve restarting 3 times, though…

I’m facing the same issue with Rails 4.1.9 and Postgres 9.4.

We’re facing similar issues (but when dropping columns).

Before dropping columns, we’re ensuring that AR is not writing to any of them by overriding the AR::Base.columns method.

But dropping columns as part of a migration is causing issues because of cached prepared statements. See https://gist.github.com/ketan/577ca47774364d8e4add for a reproducible bug (uses PG).

I’m seeing the following error in the postgres log when this happens.

ERROR:  cached plan must not change result type
STATEMENT:  SELECT  "users".* FROM "users"  WHERE "users"."id" = $1 LIMIT 1
ERROR:  current transaction is aborted, commands ignored until end of transaction block
STATEMENT:  DEALLOCATE a2