mongoose: No retries are made after "failed to connect on first connect"

By default, mongoose throws an Error if the first connect fails, which crashes node.

So to reproduce this bug, you will need the following code in your app, to catch the error and prevent the crash:

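// db is your mongoose connection, e.g. mongoose.connection or the result of mongoose.createConnection()
db.on('error', console.error.bind(console, 'connection error:'));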

Now we can reproduce this bug as follows:

  1. Shut down your MongoDB
  2. Start up your node app that uses mongoose
  3. Your app will log: [MongoError: failed to connect to server [localhost:27017] on first connect [MongoError: connect ECONNREFUSED 127.0.0.1:27017]]
  4. Start up your MongoDB again
  5. Observe that mongoose still does not connect to the now-working MongoDB. The only way to get reconnected is to restart your app, or to use a manual workaround.

Expected behaviour: Since autoreconnect defaults to true, I would expect mongoose to establish a connection soon after the MongoDB is accessible again.

Note: If the first connect succeeds, but the connection to MongoDB is lost during runtime, then autoreconnect works fine, as expected. The problem is the inconsistency if MongoDB is not available when the app starts up.

(If this is the desired behaviour, and developers are recommended to handle this situation by not catching the error, and letting node crash, then I can accept that, but it is worth making it clear.)

node v4.4.1, mongoose@4.9.4, mongodb@2.2.19, mongodb-core@2.1.4

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 10
  • Comments: 40 (10 by maintainers)

Most upvoted comments

For anyone wanting auto-reconnection when first connect fails, this is how I handle it:

function createConnection (dbURL, options) {
    var db = mongoose.createConnection(dbURL, options);

    db.on('error', function (err) {
        // If first connect fails because mongod is down, try again later.
        // This is only needed for first connect, not for runtime reconnects.
        // See: https://github.com/Automattic/mongoose/issues/5169
        if (err.message && err.message.match(/failed to connect to server .* on first connect/)) {
            console.log(new Date(), String(err));

            // Wait for a bit, then try to connect again
            setTimeout(function () {
                console.log("Retrying first connect...");
                db.openUri(dbURL).catch(() => {});
                // Why the empty catch?
                // Errors from db.openUri() are also emitted to .on('error'),
                // so we handle them there; no need to log anything in the catch here.
                // But we still need this empty catch to avoid unhandled rejections.
            }, 20 * 1000);
        } else {
            // Some other error occurred.  Log it.
            console.error(new Date(), String(err));
        }
    });

    db.once('open', function () {
        console.log("Connection to db established.");
    });

    return db;
}

// Use it like
var db = createConnection('mongodb://...', options);
var User = db.model('User', userSchema);

For mongoose < 4.11, use db.open() instead of db.openUri(). For mongoose 4.11.7 this technique does not work. For mongoose 4.13.4 it works again.


Edit 2019/09/02: There is also a shorter solution using promiseRetry here.
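A promise-based retry along those lines might look roughly like the sketch below. It deliberately does not use the promise-retry package; `retryAsync` and its `retries` and `delayMs` options are hypothetical names introduced only for illustration, and the fixed delay is an assumption (a real setup might prefer backoff).

```javascript
// Hypothetical helper, not part of mongoose or promise-retry.
// Runs `operation` until it resolves, or until the retry budget is spent.
function retryAsync(operation, options) {
    var retries = options.retries;   // extra attempts allowed after the first one
    var delayMs = options.delayMs;   // fixed wait between attempts (assumption)

    function attempt(attemptNumber) {
        return Promise.resolve()
            .then(function () {
                return operation(attemptNumber);
            })
            .catch(function (err) {
                if (attemptNumber > retries) {
                    // Budget exhausted: surface the last error to the caller.
                    throw err;
                }
                // Wait for a bit, then try again.
                return new Promise(function (resolve) {
                    setTimeout(resolve, delayMs);
                }).then(function () {
                    return attempt(attemptNumber + 1);
                });
            });
    }

    return attempt(1);
}

// Assumed usage, for a mongoose version where connect() returns a promise:
// retryAsync(function () { return mongoose.connect(dbURL, options); },
//            { retries: 10, delayMs: 20 * 1000 })
//     .then(function () { console.log("Connection to db established."); });
```

Unlike the event-based workaround above, this gives the caller a single promise that rejects once the retry budget is exhausted, so the app can still decide to exit if MongoDB never becomes reachable.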

This recent post from the Strongloop Loopback mongodb connector may be relevant. Their lazyConnect flag defers first connection until the endpoint is hit. If first connection fails in that case, the default connection loss settings will take effect (it will retry).

My interest is container orchestration, where “container startup order” can often be set and expected but “order of service availability” cannot. An orchestration tool might confirm that the mongo container is “up” even though the mongo service isn’t available yet.

So, if my mongo container takes 1s to start but 5s for the service to become available, and my app container takes 1s to start and 1s for the service to be available, the app service will outrun the mongo service, causing a first connection failure as originally described.
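To make that timing concrete, here is a small arithmetic sketch. It assumes one reading of the numbers above (the app makes its first connect attempt 1s after launch, and mongo accepts connections 5s after launch), treats connection attempts as instantaneous, and uses a hypothetical function name.

```javascript
// Hypothetical helper for illustration only; all times are in seconds.
// Attempt k (0-based) is made at firstAttemptAt + k * retryDelay.
// retryDelay must be greater than 0.
function firstSuccessfulAttempt(firstAttemptAt, dbReadyAt, retryDelay) {
    if (dbReadyAt <= firstAttemptAt) {
        // The database was already available: the first connect succeeds.
        return { attempt: 0, at: firstAttemptAt };
    }
    var k = Math.ceil((dbReadyAt - firstAttemptAt) / retryDelay);
    return { attempt: k, at: firstAttemptAt + k * retryDelay };
}

// Retrying every second: the fourth retry, at t = 5s, succeeds.
console.log(firstSuccessfulAttempt(1, 5, 1));   // → { attempt: 4, at: 5 }

// Retrying every 20 seconds: the first retry, at t = 21s, succeeds.
console.log(firstSuccessfulAttempt(1, 5, 20));  // → { attempt: 1, at: 21 }
```

With no retry at all, the app stays disconnected after the failure at t = 1s, which is the behaviour described in the original report.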

The Docker Compose documentation has this to say:

Compose will not wait until a container is “ready” (whatever that means for your particular application) - only until it’s running. There’s a good reason for this.

The problem of waiting for a database (for example) to be ready is really just a subset of a much larger problem of distributed systems. In production, your database could become unavailable or move hosts at any time. Your application needs to be resilient to these types of failures.

To handle this, your application should attempt to re-establish a connection to the database after a failure. If the application retries the connection, it should eventually be able to connect to the database.

The best solution is to perform this check in your application code, both at startup and whenever a connection is lost for any reason.

So there’s a definite gap here in the context of container orchestration, but both of these stances appear to be valid:

  1. Mongoose could support an option to retry on first connect (perhaps defaulted to false with some cautionary documentation), or
  2. Mongoose could place the responsibility on the developer to write code to retry if first connect fails.

Sure, there’s a gap, but then the responsibility falls to you to decide whether to retry if initial connection fails. All mongoose tells you is that it failed. If you make the questionable decision to use docker compose in production (or in any context for that matter), it’s up to you to handle retrying initial connection failures.

I think this is a decent idea actually. I’ll label this a feature request.

Thanks for the repro. I have looked into mongodb-core. It is the intended behaviour of the driver:

The driver will fail on first connect if it cannot connect to the host. This is by design to ensure quick failure on unreachable hosts. Reconnect behavior only kicks in once the driver has performed the initial connect.

It’s up to the application to decide what to do. This is by design to ensure the driver fails fast and does not sit there making you think it’s actually working.

So I suspect we won’t get any different behaviour from the driver.

I actually think that behaviour is reasonable for a low-level driver. It will help developers who accidentally try to connect to the wrong host or the wrong port.

But if we want to do something more developer-friendly in mongoose, we could consider:

  • When auto-reconnect options are enabled, keep trying to reconnect until the Mongo server can be contacted (by building in something like the workaround linked above).
  • Log when mongoose is doing this, so in the case of a connection that never establishes, the developer will at least know where the problem is. (The log could be postponed, e.g. 30 seconds. Actually instead of direct logging, I guess we should emit an advisory error event, but still try the reconnect automatically.)
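The two suggestions above could be sketched roughly as follows. This is not mongoose's API: `connectWithAdvisory`, `retryDelayMs`, `advisoryAfterMs` and the `'advisory'` event name are all hypothetical. The sketch uses a separate `'advisory'` event rather than `'error'` because Node's EventEmitter throws when `'error'` is emitted with no listener attached.

```javascript
var EventEmitter = require('events');

// Hypothetical sketch: keep calling `connectOnce` until it succeeds.
// Once `advisoryAfterMs` has elapsed without success, emit a single
// 'advisory' event carrying the latest error, and keep retrying.
function connectWithAdvisory(connectOnce, options) {
    var emitter = new EventEmitter();
    var startedAt = Date.now();
    var advised = false;

    function attempt() {
        Promise.resolve().then(connectOnce).then(
            function () {
                emitter.emit('open');
            },
            function (err) {
                var elapsed = Date.now() - startedAt;
                if (!advised && elapsed >= options.advisoryAfterMs) {
                    advised = true;
                    emitter.emit('advisory', err);
                }
                // Still try the reconnect automatically.
                setTimeout(attempt, options.retryDelayMs);
            }
        );
    }

    // connectOnce runs in a later microtask, so listeners attached
    // right after this function returns will not miss any event.
    attempt();
    return emitter;
}

// Assumed usage, where connectOnce wraps whatever opens the connection:
// var conn = connectWithAdvisory(connectOnce,
//     { retryDelayMs: 20 * 1000, advisoryAfterMs: 30 * 1000 });
// conn.on('advisory', function (err) { console.error('Still cannot connect:', String(err)); });
// conn.on('open', function () { console.log("Connection to db established."); });
```

Postponing the advisory keeps a brief startup race silent while still telling the developer where the problem is when the connection never establishes.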

If I remember correctly, when I used the workaround and finally connected after a few failed attempts, queries already queued by the app did get executed as desired. (But this is worth testing again.)

No, failing fast on initial connection is a pretty consistent behavior across MongoDB drivers and there isn’t much benefit to mongoose supporting it.

I’m guessing this is a mongodb-core issue. I think it should attempt to reconnect even if the first try fails, since I’m not sure why that would be different from subsequent attempts.

Can you also report this issue there?

Here’s a full repro script:

const mongoose = require('mongoose');
const co = require('co');
mongoose.Promise = global.Promise;
const GITHUB_ISSUE = `gh-5169`;


exec()
  .catch(error => {
    console.error(`Error: ${ error }\n${ error.stack }`);
  });


function exec() {
  return co(function*() {
    const db = mongoose.createConnection(`mongodb://localhost:27017/${ GITHUB_ISSUE }`);
    db.on('error', (error) => {
      console.error(`in an error ${ error }\n${ error.stack }`);
    });
  });
}