cluster: Workers Keep Crashing Randomly

Randomly, all of the workers will kill themselves with errors similar to these:

[104.236.27.101] Error: write after end
    at writeAfterEnd (_stream_writable.js:133:12)
    at Socket.Writable.write (_stream_writable.js:181:5)
    at Socket.write (net.js:616:40)
    at Socket.Writable.end (_stream_writable.js:341:10)
    at Socket.end (net.js:397:31)
    at App.exports.GenericApp.GenericApp.handle_error (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:148:13)
    at execute_request (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:33:30)
    at Object.req.next_filter (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:105:18)
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:107:13)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
Cluster: Exiting worker 7 with exitCode=7 signalCode=null
[104.236.27.101] TypeError: Cannot read property 'writeHead' of undefined
    at Listener.webjs_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/webjs.js:78:21)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:147:12)
    at Listener.handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:6:61)
    at Server.<anonymous> (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/sockjs.js:154:24)
    at Server.new_handler (/opt/meteor/app/programs/server/npm/ddp/node_modules/sockjs/lib/utils.js:86:19)
    at packages/ddp/stream_server.js:133:1
    at Array.forEach (native)
    at Function._.each._.forEach (packages/underscore/underscore.js:105:1)
    at Server.newListener (packages/ddp/stream_server.js:132:1)
    at packages/meteorhacks:cluster/lib/server/utils.js:11:1

This error may be unrelated:

[104.236.27.101]     at Object.Meteor._nodeCodeMustBeInFiber (packages/meteor/dynamics_nodejs.js:9:1)
    at [object Object]._.extend.get (packages/meteor/dynamics_nodejs.js:21:1)
    at [object Object].RouteController.lookupOption (packages/iron:router/lib/route_controller.js:66:1)
    at new Controller.extend.constructor (packages/iron:router/lib/route_controller.js:26:1)
    at [object Object].ctor (packages/iron:core/lib/iron_core.js:88:1)
    at Function.Router.createController (packages/iron:router/lib/router.js:201:1)
    at Function.Router.dispatch (packages/iron:router/lib/router_server.js:39:1)
    at Object.router (packages/iron:router/lib/router.js:15:1)
    at next (/opt/meteor/app/programs/server/npm/webapp/node_modules/connect/lib/proto.js:190:15)
    at Object.Package [as handle] (packages/cfs:http-methods/http.methods.server.api.js:420:1)

It is causing instability on our website. Is there a known fix for this?

About this issue

  • State: open
  • Created 9 years ago
  • Comments: 47 (11 by maintainers)

Most upvoted comments

I have the same issue. Does anyone have an idea?

I would recommend PM2 + NGINX. You need to start every instance in fork mode (not PM2 cluster mode, because it doesn’t support sticky sessions) and then do the load balancing via an NGINX upstream.

@btoueg Reduce the number of workers from auto to something you think the server can handle. I counted the number running at crash time and cut it in half.

Remind us why you would want to set the workers to a value greater than the number of cores?
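
To make the two suggestions above concrete, here is a small sketch that halves the worker count seen at crash time and never goes above the number of cores. The function name `suggestWorkerCount` is made up for illustration, and the sample value 8 is not from this thread. It also assumes, as far as I understand the package, that the worker count is passed to the app through the `CLUSTER_WORKERS_COUNT` environment variable.

```javascript
const os = require("os");

// Hypothetical helper: take the number of workers that were running when
// the crash happened, cut it in half, and cap it at the number of cores.
// Always returns at least 1 so the app still has a worker.
function suggestWorkerCount(runningAtCrash, cores) {
  const halved = Math.floor(runningAtCrash / 2);
  return Math.max(1, Math.min(halved, cores));
}

// Number of logical cores on this machine (at least 1 for the math below).
const cores = Math.max(1, os.cpus().length);

// Example: 8 workers were running at crash time (assumed value).
const suggested = suggestWorkerCount(8, cores);

// Print a line that could be used when starting the app instead of "auto".
console.log(`CLUSTER_WORKERS_COUNT=${suggested}`);
```

This is only a starting point. The right number still depends on how much memory each worker uses on your server.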