kubernetes: WebSocket doesn't seem to be working in GKE

Is this a BUG REPORT or FEATURE REQUEST?: /kind bug

What happened: Running a socket.io app on GKE, configured with an ingress, service, deployment, etc. With only one pod in the cluster everything works well, but after scaling to more than one pod the browser console starts printing errors like the ones below:

[Error] Failed to load resource: the server responded with a status of 400 (HTTP/2.0 400) (socket.io, line 0)
[Error] WebSocket connection to 'wss://MY_HOST/socket.io/?EIO=3&transport=websocket&sid=jyeHSgEkSz1TzadNAABW' failed: Unexpected response code: 501

Sometimes, after refreshing the site several times, the errors go away and everything behaves normally.

What you expected to happen: WebSocket connections work without errors in a cluster with multiple pods.

How to reproduce it (as minimally and precisely as possible): Create a simple socket.io app running in a single pod, then create a Service with sessionAffinity: ClientIP configured, plus an Ingress as the load balancer; GKE will provision a public IP for you. Point a web browser at that IP and the app should work correctly. Then scale the app to multiple pods and you should see errors in the browser console.

Anything else we need to know?:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    run: web
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      run: web
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        run: web
    spec:
      containers:
      - image: registry.gitlab.com/myimg:latest
        name: web
        env:
        - name: PORT
          value: "80"
        ports:
        - containerPort: 80
          protocol: TCP
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: web
  name: web
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: web
  sessionAffinity: ClientIP
  type: NodePort
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: staging-ingress
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
  - hosts:
    - example.com
    secretName: dev-tls
  rules:
  - host: example.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: web
          servicePort: 80

Environment:

  • Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.6", GitCommit:"4bc5e7f9a6c25dc4c03d4d656f2cefd21540e28c", GitTreeState:"clean", BuildDate:"2017-09-14T06:55:55Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"7+", GitVersion:"v1.7.6-gke.1", GitCommit:"407dbfe965f3de06b332cc22d2eb1ca07fb4d3fb", GitTreeState:"clean", BuildDate:"2017-09-27T21:21:34Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

  • Cloud provider or hardware configuration: GKE
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

About this issue

  • State: closed
  • Created 7 years ago
  • Comments: 22 (9 by maintainers)

Most upvoted comments

@bschwartz757 Your link appears to be dead now. Is this the page you are referring to?

https://github.com/kubernetes/ingress-gce/tree/master/examples/websocket

(thanks btw 🙂 )

With that solution, are you able to have more than one pod per node?
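
If that is the right page, the gist of the example is raising the timeout on the GCE backend service that the Ingress controller creates, since the HTTP(S) load balancer defaults to 30 seconds and drops longer-lived WebSocket connections after that. A hedged sketch; BACKEND_SERVICE_NAME below is a placeholder, list the real auto-generated names first:

# Backend service names are auto-generated by the GCE Ingress controller
gcloud compute backend-services list

# Placeholder name; raise the timeout so long-lived WebSocket connections are not cut off
gcloud compute backend-services update BACKEND_SERVICE_NAME --global --timeout=3600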

@Kenblair1226 Keeping the session affinity on the GCE backend service, try using externalTrafficPolicy: Local on the Kubernetes Service instead of setting sessionAffinity (sketched below). Again, you'll be constrained to a max of one pod per node.
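
A minimal sketch of what that suggestion could look like, reusing the run: web selector from the manifests above; this is untested and not a confirmed fix:

apiVersion: v1
kind: Service
metadata:
  labels:
    run: web
  name: web
spec:
  type: NodePort
  # Keep traffic on the node that received it instead of relying on sessionAffinity
  externalTrafficPolicy: Local
  selector:
    run: web
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP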

I'm guessing that even though sessionAffinity on the backend service is routing your browser's traffic to the same machine, the remote address of the connections may differ. Kubernetes sessionAffinity doesn't look at the X-Forwarded-For address, only at the inconsistent proxy address.

If that hypothesis is correct, then this isn't a WebSocket-specific issue. If you ran a normal socket.io web page that occasionally polled the service (assuming connections are killed between requests), you would hit different pods. We're working on a project which may make GCE ingress sessionAffinity a non-issue, but it won't be ready anytime soon.