dotcom-rendering: Fragmented packets causing SSL Handshake to hang on IPv6
As reported to userhelp, a user who is unable to downgrade to IPv4 cannot complete an SSL handshake with Fastly.
Unable to reproduce locally, but the command that hangs is: openssl s_client -6 -host '2a04:4e42::367' -port 443
This address comes from our dualstack:
> host www.theguardian.com
www.theguardian.com is an alias for dualstack.guardian.map.fastly.net.
dualstack.guardian.map.fastly.net has address 151.101.65.111
dualstack.guardian.map.fastly.net has address 151.101.129.111
dualstack.guardian.map.fastly.net has address 151.101.193.111
dualstack.guardian.map.fastly.net has address 151.101.1.111
dualstack.guardian.map.fastly.net has IPv6 address 2a04:4e42:200::367
dualstack.guardian.map.fastly.net has IPv6 address 2a04:4e42:400::367
dualstack.guardian.map.fastly.net has IPv6 address 2a04:4e42:600::367
dualstack.guardian.map.fastly.net has IPv6 address 2a04:4e42::367
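For anyone trying to reproduce this without `openssl`, here is a minimal Python sketch of the same check: open a TCP connection to one of the addresses above and attempt a TLS handshake, with a timeout so that a hang is reported instead of blocking forever. The function name `probe_tls`, the 5 second default timeout and the outcome strings are illustrative assumptions, not part of any existing tooling.

```python
import socket
import ssl


def probe_tls(address, port=443, server_name="www.theguardian.com", timeout=5.0):
    """Attempt a TCP connect plus TLS handshake; return a short outcome string.

    Hypothetical helper: the name, defaults and return values are assumptions.
    """
    context = ssl.create_default_context()
    try:
        # create_connection accepts IPv4 or IPv6 literals; the timeout applies
        # to the connect and to subsequent reads on the socket.
        with socket.create_connection((address, port), timeout=timeout) as sock:
            # wrap_socket performs the handshake immediately by default.
            with context.wrap_socket(sock, server_hostname=server_name) as tls:
                return f"ok: {tls.version()}"
    except socket.timeout:
        # Connect or handshake did not finish in time (the reported symptom).
        return "timeout"
    except OSError as exc:
        # Covers ssl.SSLError, refused connections, unreachable networks, etc.
        return "error: " + exc.__class__.__name__
```

Calling `probe_tls("2a04:4e42::367")` from an affected IPv6 network would be the closest equivalent of the hanging command; a `"timeout"` result corresponds to the handshake stalling.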
About this issue
- State: closed
- Created 2 years ago
- Comments: 29 (11 by maintainers)
I’m the original reporter of the problem to userhelp. Thanks for tracking this. Here’s the packet capture (gzip’ed pcapng file) showing what happened. theguardian.pcapng.gz
@nbriggs Excellent investigation!
I think I’m able to replicate this now to a degree by using ping and sending large payloads. Your idea of packet fragmentation seems like it could quite likely be the cause! Thanks!
I’ll raise the issue with Fastly. I’m not sure if I can make the ticket public, but if not I’ll relay any useful information here!
Yes… as I mentioned, my ISP provides IPv6 access through a 6RD deployment (https://en.wikipedia.org/wiki/IPv6_rapid_deployment), which is based on a 6to4 tunnel.
@AshCorr – I wonder if Fastly is engaging in a similar hack to what Cloudflare used to. https://blog.cloudflare.com/increasing-ipv6-mtu/
An MTU <= 1472 on my ethernet interface seems to be OK; higher than that and it gets slow to no response.
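To put the MTU figures in context, the following sketch works through the standard header arithmetic for IPv6, including the 20 bytes that a 6RD (IPv6-in-IPv4) tunnel adds. It assumes no IPv4 options, no TCP options and no IPv6 extension headers, and it does not by itself explain why 1472 is the observed threshold; the actual path MTU between the reporter and Fastly is not established in this thread.

```python
IPV6_HEADER = 40    # fixed IPv6 header, bytes
IPV4_HEADER = 20    # IPv4 header without options, added by 6RD encapsulation
TCP_HEADER = 20     # TCP header without options
ICMPV6_HEADER = 8   # ICMPv6 echo request header


def tunnel_mtu(ipv4_mtu):
    """IPv6 MTU left inside an IPv6-in-IPv4 tunnel such as 6RD."""
    return ipv4_mtu - IPV4_HEADER


def max_tcp_mss(ipv6_mtu):
    """Largest TCP segment payload that fits in one unfragmented IPv6 packet."""
    return ipv6_mtu - IPV6_HEADER - TCP_HEADER


def max_ping_payload(ipv6_mtu):
    """Largest ICMPv6 echo payload that fits in one unfragmented IPv6 packet."""
    return ipv6_mtu - IPV6_HEADER - ICMPV6_HEADER


# 1500: plain Ethernet; 1480: 6RD over a 1500-byte IPv4 path;
# 1472: the value reported above; 1280: the IPv6 minimum link MTU.
for mtu in (1500, tunnel_mtu(1500), 1472, 1280):
    print(mtu, max_tcp_mss(mtu), max_ping_payload(mtu))
```

A ping payload just above `max_ping_payload(mtu)` for a given MTU is the smallest one that forces fragmentation (or a Packet Too Big response) at that hop, which is one way to choose the large payload sizes used when testing.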
I’m running some more experiments, and I’m beginning to wonder if the problem is somewhere in my ISP’s (sonic.com) IPv6 network.
I just tried the same client via another ISP (comcast.net) and haven’t been able to reproduce the failure (100/100 success at 2s interval), yet on sonic’s network it took only 5 attempts to get a failure. Attached is the packet capture of the 5 attempts. Even the successful connections look pretty ugly, and the approx 4s delay that I experience when it does work is obvious in the trace.
I’m not sure what the next step in debugging this is – perhaps see if I can move the termination of the 6RD gateway off the ISP’s router onto something else in case it’s the router’s crap 6RD implementation.
sonic5-theguardian.pcapng.gz
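The repeated-attempt comparison above (100 attempts at a 2s interval on one ISP, a failure within 5 on the other) can be scripted. This is a rough sketch only: `run_trials` and its parameters are hypothetical names, and `attempt` stands for whatever connection check is being used, supplied as a zero-argument callable that returns True on success. The 4 second "slow" threshold mirrors the delay described above.

```python
import time


def run_trials(attempt, count, interval_s=2.0, slow_after_s=4.0):
    """Call attempt() count times; tally successes, failures and slow runs.

    attempt is any zero-argument callable returning True on success.
    An OSError raised by attempt() is counted as a failure.
    """
    results = {"ok": 0, "failed": 0, "slow": 0, "first_failure": None}
    for i in range(1, count + 1):
        start = time.monotonic()
        try:
            succeeded = bool(attempt())
        except OSError:
            succeeded = False
        elapsed = time.monotonic() - start
        if succeeded:
            results["ok"] += 1
            # Successful but delayed connections are tracked separately.
            if elapsed >= slow_after_s:
                results["slow"] += 1
        else:
            results["failed"] += 1
            if results["first_failure"] is None:
                results["first_failure"] = i
        # Wait between attempts, but not after the last one.
        if i < count:
            time.sleep(interval_s)
    return results
```

Running the same script on both networks gives directly comparable counts, and `first_failure` records how many attempts it took to hit the problem.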