engine.io: CORS pre-flight breaks socket.io behind load balancer
I ran into an issue on our servers. We are running socket.io v1.0.6 on multiple server instances behind a load balancer. For polling, the requests go through the ELB with sticky sessions turned on. Our real-time service is on a subdomain, and thanks to CORS pre-flight requests, socket.io fucks up. Here is what happens on the client when the polling transport is used:
- A socket.io handshake POST request occurs. The response comes back valid with an `sid`, and the headers include the AWS ELB cookie.
- Next, a pre-flight `OPTIONS` request is made by the browser. The ELB cookie is not included by the browser here. As a result, the `OPTIONS` request is routed to a potentially different server which will not recognize the `sid` in the query string.
- When the request is routed to the wrong server, socket.io responds with a 400 HTTP status code and a `Session ID unknown` error.
- Since the pre-flight request fails, the browser also fails the actual GET polling request, and tries to re-do the handshake from the beginning.
- Possibly due to the headers being sent, the browser sends the `OPTIONS` pre-flight request fairly regularly as opposed to doing it only once, so this cycle repeats over and over.
The fix on our end currently is to respond to all OPTIONS requests with a 200 and all the usual Access-Control-Allow-… headers the browser knows and loves. We do this before they even get to socket.io in our nginx config.
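For setups without nginx in front, the same idea can be sketched in Node. This is only an illustration under assumptions: the `ALLOWED_ORIGINS` list, the `answerPreflight` name and the exact header values are made up for the example and are not taken from the nginx config mentioned above. It answers every `OPTIONS` request with a 200 and only adds the CORS headers for origins it knows.

```javascript
// Hypothetical allowlist; replace with the origin(s) that load the client.
const ALLOWED_ORIGINS = new Set(['https://app.example.com']);

// Answers a pre-flight before it can reach socket.io.
// Returns true if the request was handled here, false otherwise.
function answerPreflight(req, res) {
  if (req.method !== 'OPTIONS') return false;

  const origin = req.headers.origin;
  const headers = {};
  if (origin && ALLOWED_ORIGINS.has(origin)) {
    headers['Access-Control-Allow-Origin'] = origin;
    headers['Access-Control-Allow-Credentials'] = 'true';
    headers['Access-Control-Allow-Methods'] = 'GET, POST';
    // Echo back whatever headers the browser asked permission for.
    headers['Access-Control-Allow-Headers'] =
      req.headers['access-control-request-headers'] || 'Content-Type';
    headers['Vary'] = 'Origin';
  }

  // Always 200: no sid lookup happens, so any instance can answer.
  res.writeHead(200, headers);
  res.end();
  return true;
}
```

Since no session state is consulted, it does not matter which instance behind the load balancer receives the pre-flight. Where this runs (a reverse proxy, an outer request handler) depends on the deployment.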
Now, engine.io appears to already handle this case here: https://github.com/Automattic/engine.io/blob/master/lib/transports/polling-xhr.js#L40
However, that check is only reached if the sid is valid here: https://github.com/Automattic/engine.io/blob/master/lib/server.js#L180
which it isn’t, of course. I can submit a PR but I’d like to know how you guys think it’d be best to handle this. AFAIK, if a request method is OPTIONS, we can make the assumption that we are polling. But, since we don’t have a valid sid to look up a client by, this might mean moving fairly transport-specific logic into server.js which sounds less than ideal.
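To make the ordering problem concrete, here is a minimal sketch of the idea. It is a simplified stand-in, not engine.io's actual `server.js`: the `clients` map, the `verify` name, the pre-parsed `req.query` and the return shape are all assumptions for illustration.

```javascript
// Sessions known to THIS server instance only (sid -> session).
const clients = new Map();

// Simplified request check. The OPTIONS branch comes first because a
// pre-flight carries no cookie and may land on an instance that has
// never seen the sid.
function verify(req) {
  if (req.method === 'OPTIONS') {
    return { ok: true, preflight: true };
  }
  const sid = req.query.sid;
  if (sid && !clients.has(sid)) {
    return { ok: false, status: 400, message: 'Session ID unknown' };
  }
  return { ok: true, preflight: false };
}
```

With the branches in the opposite order, a pre-flight with an unknown `sid` gets the 400 described above.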
Thoughts?
About this issue
- State: closed
- Created 10 years ago
- Reactions: 6
- Comments: 43 (25 by maintainers)
I’m trying to reproduce the issue, but so far the connection seems actually stable.

I’m using https://github.com/socketio/engine.io/tree/a63c7b787c54b3a47da7f355826bf2770139c62b.
For anyone else with this issue, the following code will fix it:
Applied after socket.io:
e.g:
This is one of the best issue descriptions I’ve read in a while. Thanks for taking the time.
Can you send me the complete headers of the request that yields `OPTIONS`? Ideally we wouldn’t need the pre-flight. I’m suspecting you’re sending binary data which is resulting in a `Content-Type` switch?
Requests that do not need pre-flight according to MDN are:
Before proceeding with the OPTIONS fix, I want to make sure it’s happening for a good reason.
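As a rough aid for checking this, the "simple request" rule can be approximated in a few lines. This is a simplified sketch, not the full Fetch algorithm: `needsPreflight` is a hypothetical helper, it only looks at the method and the request headers set by script, and it ignores finer points such as header value restrictions or upload listeners.

```javascript
const SIMPLE_METHODS = new Set(['GET', 'HEAD', 'POST']);
const SIMPLE_HEADERS = new Set([
  'accept', 'accept-language', 'content-language', 'content-type',
]);
const SIMPLE_CONTENT_TYPES = new Set([
  'application/x-www-form-urlencoded', 'multipart/form-data', 'text/plain',
]);

// Returns true if a cross-origin request with this method and these
// script-set headers would trigger an OPTIONS pre-flight (approximation).
function needsPreflight(method, headers = {}) {
  if (!SIMPLE_METHODS.has(method.toUpperCase())) return true;
  for (const [name, value] of Object.entries(headers)) {
    const key = name.toLowerCase();
    if (!SIMPLE_HEADERS.has(key)) return true;
    if (key === 'content-type') {
      // Compare only the MIME type, dropping parameters like charset.
      const mime = String(value).split(';')[0].trim().toLowerCase();
      if (!SIMPLE_CONTENT_TYPES.has(mime)) return true;
    }
  }
  return false;
}
```

Under this approximation, a POST with `Content-Type: text/plain;charset=UTF-8` needs no pre-flight, while the same POST with `application/octet-stream` does, which would match the suspected binary `Content-Type` switch.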
The flow I currently see is:
1. /socket.io/?EIO=3&transport=polling&t=...
2. /socket.io/?EIO=3&transport=polling&t=...&sid=...
3. /socket.io/?EIO=3&transport=polling&t=...&sid=...
4. /socket.io/?EIO=3&transport=polling&t=...&sid=...
In 1., the session is created. socket.io assigns a sid and ALB creates a new cookie.
In 2., we send a new request with the sid provided by socket.io and the cookie sent by ALB.
In 3., we send an OPTIONS request before doing the POST request used in the ping process. But it seems that OPTIONS requests are not sending the cookies (https://fetch.spec.whatwg.org/#cors-protocol-and-credentials: "Note that even so, a CORS-preflight request never includes credentials."). Because of that, ALB sends a new cookie, which might associate the client with the same server or with a new server.
In 4., if the cookie links the user to a new server, the ping request will arrive at a server without the session, and we have to re-establish the connection from 1.
All of that is happening because the ALB cookie is not stable ("The contents of these cookies are encrypted using a rotating key.", meaning a different cookie is sent with every request).
The only way to make the polling transport work with ALB in its current state would be to avoid the preflight requests.
However, it should work fine with any loadbalancer using a consistent hashing algorithm, or without setting a new cookie on preflight requests.
Basically, the issue didn’t change much since the original post. #484 is just making it fail at the POST request instead of the OPTIONS request.
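The difference between the two routing strategies can be shown with a toy simulation. Everything here is an assumption for illustration: the server names, the round-robin fallback for cookieless requests, and a plain modulo hash on the client IP standing in for a real consistent-hashing scheme. It does not model how ALB actually works internally.

```javascript
const SERVERS = ['server-a', 'server-b', 'server-c']; // hypothetical targets

// Cookie stickiness: a request without a valid cookie is routed by the
// balancer's normal distribution (round robin in this toy model).
function makeCookieBalancer() {
  let next = 0;
  return function route(req) {
    if (req.cookie && SERVERS.includes(req.cookie)) return req.cookie;
    const server = SERVERS[next % SERVERS.length];
    next += 1;
    return server;
  };
}

// Hash-based routing: the same client IP always maps to the same server,
// whether or not the request carries a cookie.
function hashRoute(req) {
  let h = 0;
  for (const ch of req.ip) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return SERVERS[h % SERVERS.length];
}

const route = makeCookieBalancer();
const ip = '203.0.113.7';
const handshake = route({ ip });                 // no cookie yet
const poll = route({ ip, cookie: handshake });   // sticky via cookie
const preflight = route({ ip });                 // OPTIONS: no cookie sent
console.log({ handshake, poll, preflight });
console.log(hashRoute({ ip }) === hashRoute({ ip, cookie: handshake }));
```

In the cookie model the pre-flight lands on a different server than the handshake, while the hash-based route is unaffected by the missing cookie.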
Hello,
I’m trying to set up socket.io servers behind AWS ALB. The stickiness is using cookies as stated in http://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-target-groups.html#sticky-sessions
Because the web application and the ALB are not on the same domain, OPTIONS requests are sent during the handshake phase for long-polling. However, the OPTIONS requests don’t have cookies attached to them, and are routed following the ALB distribution method instead of being sticky. Because OPTIONS requests are only handled late, inside the `polling-xhr` transport, an earlier verification step fails first, since the session is not valid on all the servers. All of that leads to the handshake failure.
While I agree that the aforementioned points should be addressed to avoid doing OPTIONS requests, OPTIONS requests should not trigger all the described logic.
The rfc2616 says:
I’ve come up with a way to bypass the verification steps and delegate the processing of OPTIONS requests to the transport earlier, with https://github.com/MiLk/engine.io/commit/c8012eb24b20dda7b9a679cde8629361cadf2f12. I’m not convinced that’s the best way to handle it. Maybe handling the OPTIONS requests directly in the listener is a better idea, as proposed by other people in this thread.
I’m willing to spend more time on this issue. What are your views on that?
I’ll explore the other solution which is trying to disable CORS by changing the content-type and see where I go with that.
Edit: I’ve added an option to let the application handle OPTIONS requests instead of engine.io. https://github.com/socketio/engine.io/pull/484
We should fix this. OPTIONS preflights can happen when doing CORS stuff and this behavior is really annoying to an end user to deal with. It basically renders this module unusable behind the amazon ELB even when sticky sessions are on. I do not think this has anything to do with careless coding and is happening under the simplest uses of this module.
Using `2.0.3` on the server and the following on the client, we successfully serve WebSockets and long polling with an HA setup in production behind ALB (about 2k conn / server).
We haven’t made any changes on the server since last August (except a Node version upgrade).
The setup is what I described earlier in https://github.com/socketio/engine.io/issues/279#issuecomment-309203357
What I meant is adding `Access-Control-Allow-Methods: GET, POST` would remove the OPTIONS request in 3/, right?
1. /socket.io/?EIO=3&transport=polling&t=... (allow GET and POST, not only GET)
2. /socket.io/?EIO=3&transport=polling&t=...&sid=...
3. /socket.io/?EIO=3&transport=polling&t=...&sid=...
4. /socket.io/?EIO=3&transport=polling&t=...&sid=...
+1. Big time priority.
On Mon Dec 01 2014 at 3:17:05 AM Mark Mokryn notifications@github.com wrote: