rails: `ThreadError: can't create Thread: Resource temporarily unavailable` after multiple calls to `Model.establish_connection`

Steps to reproduce

I appreciate that this issue may be filed under “yes, but is this really a problem for anyone” so I have added an ‘Explanation’ section below.

  • Use the mysql2 database connection on development and test
  • Create a model called DummyTable
  • Create and then run the following test:
require 'test_helper'

class DummyTableTest < ActiveSupport::TestCase
  test "lots of connects and disconnect" do
    2000.times do |i|
      puts i
      Rails.cache.clear
      DummyTable.establish_connection :development
      Rails.cache.clear
      DummyTable.establish_connection :test
    end
  end
end

Expected behavior

The test runs and switches the database connection back and forth 2000 times. This works in Rails 5.1.

Actual behavior

...
1020
1021
E

Error:
DummyTableTest#test_lots_of_connects_and_disconnect:
ThreadError: can't create Thread: Resource temporarily unavailable
    test/models/dummy_table_test.rb:11:in `block (2 levels) in <class:DummyTableTest>'
    test/models/dummy_table_test.rb:5:in `times'
    test/models/dummy_table_test.rb:5:in `block in <class:DummyTableTest>'

System configuration

Rails version: 5.2.1

Ruby version: 2.5.1

Explanation

In our app we have two databases: the main Rails application database and a read-only database that we are given from other parts of the business. For historical (or ‘technical debt’) reasons, our tests need to connect to the read-only database. To get reliable test data, we have a test helper that temporarily switches from this database to the test database, where all the tables are mirrored. This helper looks like:

  def connect_to db
    MODELS_FROM_WEBDB.each { |model|
      model.establish_connection db
    }
  end

  def clean_database
    MODELS_FROM_WEBDB.each(&:delete_all)
  end

  def with_clean_test_db
    Rails.cache.clear
    connect_to :test
    clean_database
    yield
  ensure
    clean_database
    connect_to :webdb
  end

so we can write tests like:

  with_clean_test_db do
    # Set up test data
    # Run tests
  end

When we run our tests after upgrading to Rails 5.2.1 (or 5.2.0), we consistently start getting errors after 2034 `establish_connection` calls. The errors are actually `ActiveRecord::ConnectionNotEstablished: No connection pool with 'Branch' found.`, which is different from what I have reported above, but I have not yet managed to reproduce the ‘no connection pool’ error. I am guessing that there is a resource that does not get cleared down when a database connection is dropped.
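One way to test the guess that something is left behind per connection is to count live threads before and after a block of work. The sketch below is plain Ruby and does not touch Rails: `LeakyPool` and `thread_growth` are hypothetical names, and `LeakyPool` only imitates a pool that starts a background thread and never stops it. In a real app the block would wrap the `establish_connection` calls instead.

```ruby
# Hypothetical stand-in for a pool that starts a background thread
# on creation and never stops it (an assumption for illustration only).
class LeakyPool
  def initialize(frequency)
    # One sleeping thread per pool; it stays alive even after the
    # pool object itself is no longer referenced.
    @reaper = Thread.new(frequency) do |t|
      loop { sleep t }
    end
  end
end

# Returns how many more live threads exist after the block than before it.
def thread_growth
  before = Thread.list.size
  yield
  Thread.list.size - before
end

growth = thread_growth do
  50.times { LeakyPool.new(60) }
end
puts "threads added: #{growth}"
```

If the number keeps climbing with every reconnect in the real test, the process will eventually hit the operating system's thread limit, which would be consistent with the `ThreadError` reported above.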

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 17 (5 by maintainers)

Most upvoted comments

If anyone is still facing this issue, I back-ported the fix with this monkey patch, since setting reaping_frequency: 0 didn’t help.

It worked for me as a temporary solution before updating the application to Rails 6.

config/initializers/rails6_backports.rb

raise "Remove no-longer-needed #{__FILE__}!" if Rails::VERSION::MAJOR >= 6

require "weakref"

module ActiveRecord
  # Backport https://github.com/rails/rails/pull/36998 and https://github.com/rails/rails/pull/36999
  # to avoid `ThreadError: can't create Thread: Resource temporarily unavailable` issues
  module ConnectionAdapters
    class ConnectionPool
      class Reaper
        @mutex = Mutex.new
        @pools = {}
        @threads = {}

        class << self
          def register_pool(pool, frequency) # :nodoc:
            @mutex.synchronize do
              unless @threads[frequency]&.alive?
                @threads[frequency] = spawn_thread(frequency)
              end
              @pools[frequency] ||= []
              @pools[frequency] << WeakRef.new(pool)
            end
          end

          private
            def spawn_thread(frequency)
              Thread.new(frequency) do |t|
                running = true
                while running
                  sleep t
                  @mutex.synchronize do
                    @pools[frequency].select!(&:weakref_alive?)
                    @pools[frequency].each do |p|
                      p.reap
                      p.flush
                    rescue WeakRef::RefError
                    end

                    if @pools[frequency].empty?
                      @pools.delete(frequency)
                      @threads.delete(frequency)
                      running = false
                    end
                  end
                end
              end
            end
        end

        def run
          return unless frequency && frequency > 0
          self.class.register_pool(pool, frequency)
        end
      end

      def reap
        stale_connections = synchronize do
          return unless @connections
          @connections.select do |conn|
            conn.in_use? && !conn.owner.alive?
          end.each do |conn|
            conn.steal!
          end
        end

        stale_connections.each do |conn|
          if conn.active?
            conn.reset!
            checkin conn
          else
            remove conn
          end
        end
      end

      def flush(minimum_idle = @idle_timeout)
        return if minimum_idle.nil?

        idle_connections = synchronize do
          return unless @connections
          @connections.select do |conn|
            !conn.in_use? && conn.seconds_idle >= minimum_idle
          end.each do |conn|
            conn.lease

            @available.delete conn
            @connections.delete conn
          end
        end

        idle_connections.each do |conn|
          conn.disconnect!
        end
      end
    end
  end
end
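The core idea of the patch above, one reaper thread per frequency shared by all pools and holding only weak references to them, can be tried outside Rails. The following is a minimal sketch under stated assumptions: `FakePool` and `SharedReaper` are hypothetical names, the pool only counts how often it was reaped, and unlike the patch this version never stops the thread when its pool list becomes empty.

```ruby
require "weakref"

# Hypothetical pool stand-in that counts how often it has been reaped.
class FakePool
  attr_reader :reaped

  def initialize
    @reaped = 0
  end

  def reap
    @reaped += 1
  end
end

# One background thread per frequency, shared by every registered pool.
class SharedReaper
  @mutex = Mutex.new
  @pools = Hash.new { |hash, key| hash[key] = [] }
  @threads = {}

  class << self
    def register(pool, frequency)
      @mutex.synchronize do
        # Weak references let unused pools be garbage collected.
        @pools[frequency] << WeakRef.new(pool)
        @threads[frequency] = spawn_thread(frequency) unless @threads[frequency]&.alive?
      end
    end

    def thread_count
      @mutex.synchronize { @threads.count { |_, thread| thread.alive? } }
    end

    private

    def spawn_thread(frequency)
      Thread.new do
        loop do
          sleep frequency
          @mutex.synchronize do
            @pools[frequency].select!(&:weakref_alive?)
            @pools[frequency].each do |pool|
              begin
                pool.reap
              rescue WeakRef::RefError
                # The pool was collected between the check and the call.
              end
            end
          end
        end
      end
    end
  end
end

# Strong references keep the pools alive for the duration of the demo.
pools = Array.new(100) { FakePool.new }
pools.each { |pool| SharedReaper.register(pool, 0.05) }
sleep 0.5
puts "reaper threads: #{SharedReaper.thread_count}"
```

Registering 100 pools with the same frequency reuses a single thread, so repeated reconnects no longer add a thread each time.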

This issue has been automatically marked as stale because it has not been commented on for at least three months. The resources of the Rails team are limited, and so we are asking for your help. If you can still reproduce this error on the 6-0-stable branch or on master, please reply with all of the information you have about it in order to keep the issue open. Thank you for all your contributions.

It is failing on the 1021st iteration for revision e559d6dcecb871c174870fdeeed1560a0353cc49 in 5-2-stable. Thank you very much @jrmhaig for the repository.

I can still reproduce this with 5-2-stable and master, failing my example test on the 2045th iteration in both cases. The top lines of the Gemfile.lock files in each case are:

# 5-2-stable
GIT
  remote: https://github.com/rails/rails.git
  revision: b667758483f1c6480ddf9c7385d41c2f1d78c054
  branch: 5-2-stable
# master
GIT
  remote: https://github.com/rails/rails.git
  revision: a04a757e5da646c6a3b81d879f5c11b1329b67d2

I have put my test example here, mainly for my own reference for the next time that @rails-bot marks this as stale.