mongoose: No retries are made after "failed to connect on first connect"
By default, mongoose throws an Error if the first connect fails, which crashes node.
So to reproduce this bug, you will need the following code in your app, to catch the error and prevent the crash:
db.on('error', console.error.bind(console, 'connection error:'));
Now we can reproduce this bug as follows:
- Shut down your MongoDB
- Start up your node app that uses mongoose
- Your app will log: [MongoError: failed to connect to server [localhost:27017] on first connect [MongoError: connect ECONNREFUSED 127.0.0.1:27017]]
- Start up your MongoDB again
- Observe that mongoose does not connect to the now-working MongoDB. The only way to get reconnected is to restart your app, or to use a manual workaround.
Expected behaviour: Since autoreconnect defaults to true, I would expect mongoose to establish a connection soon after the MongoDB is accessible again.
Note: If the first connect succeeds, but the connection to MongoDB is lost during runtime, then autoreconnect works fine, as expected. The problem is the inconsistency if MongoDB is not available when the app starts up.
(If this is the desired behaviour, and developers are recommended to handle this situation by not catching the error, and letting node crash, then I can accept that, but it is worth making it clear.)
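One way to see the difference between the two cases is to log the connection's lifecycle events in both scenarios. This is only a sketch: it assumes the connection object emits connected, disconnected, reconnected and error events, and the helper name watchConnection and its log callback are made up for illustration.

```javascript
// Hypothetical helper: reports which lifecycle events a connection emits.
// `db` is expected to be an event emitter such as mongoose.connection;
// `log` is any function taking (eventName, detail).
function watchConnection(db, log) {
  ['connected', 'disconnected', 'reconnected', 'error'].forEach(function (name) {
    db.on(name, function (detail) {
      log(name, detail);
    });
  });
}

// Assumed usage:
// watchConnection(mongoose.connection, function (name, detail) {
//   console.log(new Date().toISOString(), name, detail ? String(detail) : '');
// });
```

Going by the behaviour described above, a connection lost at runtime should show disconnected followed later by reconnected, while a failed first connect should show only the error event.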
node v4.4.1, mongoose@4.9.4, mongodb@2.2.19, mongodb-core@2.1.4
About this issue
- State: closed
- Created 7 years ago
- Reactions: 10
- Comments: 40 (10 by maintainers)
For anyone wanting auto-reconnection when first connect fails, this is how I handle it:
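A minimal sketch of such a handler, not necessarily the commenter's exact code: it listens for the error event, checks for the "on first connect" message quoted in the report, and calls db.openUri() again after a delay. The helper names (isFirstConnectError, attachFirstConnectRetry), the uri argument and the delay are assumptions for illustration, and the sketch assumes that a failed openUri() is reported through the connection's error event.

```javascript
// Hypothetical helpers for illustration; not part of mongoose itself.

// True when the error looks like the first-connect failure quoted in the report.
function isFirstConnectError(err) {
  return Boolean(
    err && err.message &&
    /failed to connect to server .* on first connect/.test(err.message)
  );
}

// `db` is expected to behave like mongoose.connection: an event emitter
// with an openUri(uri) method that returns a promise.
function attachFirstConnectRetry(db, uri, delayMs) {
  db.on('error', function (err) {
    if (isFirstConnectError(err)) {
      console.log(new Date(), String(err));
      // Try the first connect again later; a new failure arrives here again.
      setTimeout(function () {
        db.openUri(uri).catch(function () {});
      }, delayMs);
    } else {
      console.error(new Date(), String(err));
    }
  });
  // Swallow the rejection; failures are handled through the 'error' event.
  db.openUri(uri).catch(function () {});
}

// Assumed usage (URI and delay are placeholders):
// attachFirstConnectRetry(mongoose.connection, 'mongodb://localhost:27017/test', 20 * 1000);
```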
For mongoose < 4.11 use db.open() instead of db.openUri().
For mongoose 4.11.7 this technique does not work. For mongoose 4.13.4 it is working again!
Edit 2019/09/02: There is also a shorter solution using promiseRetry.
This recent post from the Strongloop Loopback mongodb connector may be relevant. Their lazyConnect flag defers first connection until the endpoint is hit. If first connection fails in that case, the default connection loss settings will take effect (it will retry).
My interest is container orchestration, where “container startup order” can often be set and expected but “order of service availability” cannot. An orchestration tool might confirm that the mongo container is “up” even though the mongo service isn’t available yet.
So, if my mongo container takes 1s to start but 5s for the service to become available, and my app container takes 1s to start and 1s for the service to be available, the app service will outrun the mongo service, causing a first connection failure as originally described.
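With those timings the app is ready about 4s before mongo is, so the app would have to keep retrying the first connect for at least that long. A rough sketch of retrying with exponential backoff follows. The helper names, delays and the connectFn callback are assumptions for illustration, and the commented usage assumes a mongoose version where connect() returns a promise.

```javascript
// Hypothetical helpers; names, defaults and delays are illustrative assumptions.

// Delays between attempts: minDelayMs, multiplied by `factor` each time,
// capped at maxDelayMs.
function backoffDelays(retries, minDelayMs, factor, maxDelayMs) {
  const delays = [];
  for (let i = 0; i < retries; i += 1) {
    delays.push(Math.min(minDelayMs * Math.pow(factor, i), maxDelayMs));
  }
  return delays;
}

function sleep(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Calls connectFn() until it resolves, waiting between failed attempts.
// Rejects with the last error once all retries are used up.
async function connectWithRetry(connectFn, delays) {
  let lastError;
  for (let attempt = 0; attempt <= delays.length; attempt += 1) {
    try {
      return await connectFn();
    } catch (err) {
      lastError = err;
      if (attempt < delays.length) {
        await sleep(delays[attempt]);
      }
    }
  }
  throw lastError;
}

// Assumed usage: waits of 500, 1000, 2000 and 4000 ms between attempts.
// connectWithRetry(() => mongoose.connect(uri), backoffDelays(4, 500, 2, 4000));
```

The schedule in the usage comment waits about 7.5s in total before giving up, which would cover the 4s gap in the example; a slower database would need more retries or a larger cap.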
The Docker Compose documentation has this to say:
So there’s a definite gap here in the context of container orchestration, but both of these stances appear to be valid:
Sure, there’s a gap, but then the responsibility falls to you to decide whether to retry if initial connection fails. All mongoose tells you is that it failed. If you make the questionable decision to use docker compose in production (or in any context for that matter), it’s up to you to handle retrying initial connection failures.
i think this is a decent idea actually, i’ll label this a feature request
Thanks for the repro. I have looked into mongodb-core. It is the intended behaviour of the driver:
So I suspect we won’t get any different behaviour from the driver.
I actually think that behaviour is reasonable for a low-level driver. It will help developers who accidentally try to connect to the wrong host or the wrong port.
But if we want to do something more developer-friendly in mongoose, we could consider:
error event, but still try the reconnect automatically.)
If I remember correctly, when I used the workaround and finally connected after a few failed attempts, queries already queued by the app did get executed as desired. (But this is worth testing again.)
No, failing fast on initial connection is a pretty consistent behavior across MongoDB drivers and there isn’t much benefit to mongoose supporting it.
i’m guessing this is a mongodb-core issue. It should attempt to reconnect even if the first try fails I think, since I’m not sure why that would be different from subsequent attempts.
Can you also report this issue there?
Here’s a full repro script: