istio: Istio proxy does not complete TCP connections until data is sent

Describe the bug

When using a TCP server and client with istio-proxy sidecars in place, the connection does not complete to the server until the client sends data. Istio is deployed as a sidecar in all pods in this scenario.

Expected behavior

Expected the TCP server to see the connection immediately when the client establishes it.

Steps to reproduce the bug

kubectl apply the following YAML to a namespace with istio-proxy sidecars enabled. This creates a netcat server listening on port 4444 and a Kubernetes Service for it.

apiVersion: v1
kind: Service
metadata:
  name: tcp-server
  labels:
    app: "tcp-server"
spec:
  type: ClusterIP
  ports:
    - port: 4444
      name: tcp-server
  selector:
    app: "tcp-server"
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: tcp-server
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tcp-server
        version: v1
    spec:
      containers:
      - name: tcp-server
        image: alpine:3.8
        imagePullPolicy: IfNotPresent
        command:
        - "nc"
        args:
        - -v
        - -lk
        - -p
        - "4444"
        ports:
        - containerPort: 4444

Use the following to watch the logs of netcat:

> kubectl logs -f $(kubectl get pods -l "app=tcp-server" -o name) tcp-server
listening on [::]:4444 ...

Next, run another Alpine Linux container in the same namespace and use netcat as a client to connect to the running server:

> kubectl run tcp-client -it --rm=true --image alpine:3.8 ash
> nc -w5 -v tcp-server 4444
tcp-server (10.109.191.242:4444) open

The client now has an open connection, but if you look at the server's logs, it is unaware of this connection until the client sends its first message.

Start the client again, but this time send the following message and watch the server's logs:

> nc -w5 -v tcp-server 4444
hello

The server should now see the connection and log the message that was sent:

connect to [::ffff:127.0.0.1]:4444 from localhost:35376 ([::ffff:127.0.0.1]:35376)
hello

If you attempt the exact same test in a namespace with the Istio sidecar disabled, the netcat server sees the connection as soon as the client connects.

This means applications (like NATS) in which a TCP client connects to a server but the server sends the first message will not work.
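For illustration, here is a minimal Python sketch of a server-first protocol of the kind described above (the greeting string is made up, not the actual NATS wire protocol): the server sends data as soon as it accepts the connection, and the client blocks on a read without ever sending first. If a proxy withholds the connection from the server until the client sends bytes, this handshake can never complete.

```python
import socket
import threading

def banner_server(sock):
    """Accept one connection and immediately send a greeting,
    before the client has sent any bytes (server-first protocol)."""
    conn, _ = sock.accept()
    conn.sendall(b'INFO {"server":"demo"}\r\n')  # server speaks first
    conn.close()

# Listen on an ephemeral localhost port.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
threading.Thread(target=banner_server, args=(srv,), daemon=True).start()

# The client connects and only *reads*; it never sends first.
cli = socket.create_connection(("127.0.0.1", port), timeout=5)
greeting = cli.recv(1024)  # fine directly; hangs behind a proxy that
cli.close()                # waits for client bytes before dialing upstream
print(greeting.decode().strip())
```

Run directly, the client receives the greeting immediately; behind the buggy proxy path, the recv() would time out instead.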

Version

istio version: 1.0.5

k8s version: 1.10.3

Installation

git clone https://github.com/istio/istio
git checkout 1.0.5
helm install install/kubernetes/helm/istio --name istio --namespace istio-system

Environment

Attempted with Kubernetes on Docker-for-Mac and Kubernetes Engine on GCP

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 13
  • Comments: 15 (7 by maintainers)

Most upvoted comments

The PERMISSIVE mode (TLS/no-TLS detection) is implemented using TLS Inspector, which peeks at the first few bytes transmitted from the client, and everything else (including connection to upstream) is delayed until those first few bytes arrive and we can detect whether the connection is using TLS or not. If nothing arrives, then the connection is terminated after 15s.
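This detection idea can be sketched as follows (this is an illustration in Python, not Envoy's actual TLS Inspector): a TLS connection begins with a handshake record whose first byte is 0x16, so peeking at the first byte without consuming it is enough to classify the connection. Crucially, the classification cannot happen until the client sends something.

```python
import socket
import threading

TLS_HANDSHAKE = 0x16  # first byte of a TLS handshake record (ClientHello)

def sniff_protocol(conn):
    """Peek at the first byte without consuming it, like a TLS inspector.
    Blocks until the client sends *something* -- the crux of this issue."""
    first = conn.recv(1, socket.MSG_PEEK)
    return "tls" if first and first[0] == TLS_HANDSHAKE else "plaintext"

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

results = []
def accept_and_sniff():
    conn, _ = srv.accept()
    results.append(sniff_protocol(conn))
    conn.close()

t = threading.Thread(target=accept_and_sniff)
t.start()

cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"\x16\x03\x01\x00\x05hello")  # fake TLS handshake record header
t.join()
cli.close()
print(results[0])
```

With a client that sends nothing, sniff_protocol() would block indefinitely, which is exactly the behavior reported here (until the proxy's 15s cutoff).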

However, this should only affect legacy clients (without sidecars) connecting to Istio services that use protocols where the server sends the first packet (e.g. NATS, MySQL), and not communication between Istio services, because sidecars are supposed to communicate with each other using mTLS in both PERMISSIVE and STRICT modes (but this isn’t currently happening, and it’s the root cause of your problems, see: #11494).

Note: To work around the issue with TLS Inspector, we could introduce a “short” timeout, after which it would give up waiting and assume no-TLS, but this introduces a race condition for slow clients using TLS and/or adds a significant delay to legacy connections that are using protocols where server sends the first packet (though, that’s better than not working at all, IMHO)… but like I said, you shouldn’t be hitting this path when communicating between Istio services in the first place.
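The "short timeout" idea mentioned above could look roughly like this (a Python sketch of the concept, not Istio/Envoy code): peek for the first byte for a bounded time, and assume plaintext if nothing arrives, trading the deadlock for a fixed delay on server-first protocols.

```python
import socket

def detect_with_timeout(conn, timeout=0.5):
    """Peek at the first byte for up to `timeout` seconds; if the client
    stays silent (a server-first protocol), fall back to plaintext
    instead of blocking forever."""
    conn.settimeout(timeout)
    try:
        first = conn.recv(1, socket.MSG_PEEK)
        return "tls" if first and first[0] == 0x16 else "plaintext"
    except socket.timeout:
        return "plaintext"  # silent client: assume no TLS, proceed upstream
    finally:
        conn.settimeout(None)

# A silent client: connects, sends nothing.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
cli = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()
result = detect_with_timeout(conn)  # returns after ~0.5s
print(result)
conn.close(); cli.close(); srv.close()
```

The race condition mentioned above is visible here: a slow TLS client whose ClientHello arrives after the timeout would be misclassified as plaintext.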

cc @duderino

This issue has been automatically closed because it has not had activity in the last month and a half. If this issue is still valid, please ping a maintainer and ask them to label it as “help wanted”. Thank you for your contributions.

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days unless it is tagged “help wanted” or other activity occurs. Thank you for your contributions.

@Will2817 just installed istio-1.3.5 on macOS/Docker Desktop with Kubernetes 1.14.7 and was able to see the tcp-server log a connect message as soon as the tcp-client connected, before any message was sent.

If you feel this issue or pull request deserves attention[…]

I still think it deserves attention… We never did successfully deploy Istio at all because of this issue. In a way that’s not a bad thing though, we realized we probably never did need a service mesh to begin with 😉