istio: Service without selector does not work with mTLS when switching the endpoint
Bug description
- When using services without selectors with mTLS, switching the endpoint results in
upstream connect error or disconnect/reset before headers. reset reason: connection failure
- This is a technique the Knative Serving project relies on: it switches the Service's endpoint for routing. Because of this bug, Knative Serving does not work with mTLS in STRICT mode.
Affected product area (please put an X in all that apply)
[ ] Configuration Infrastructure [ ] Docs [ ] Installation [x] Networking [ ] Performance and Scalability [ ] Policies and Telemetry [ ] Security [ ] Test and Release [ ] User Experience [ ] Developer Infrastructure
Expected behavior
- When switching the endpoint of services without selectors, mTLS should also work fine.
Steps to reproduce the bug
1. Create a testing namespace bug
$ kubectl create ns bug
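Note that the manifests below enable sidecar injection per pod via the sidecar.istio.io/inject: "true" annotation, so no namespace label is required. If your mesh relies on namespace-wide automatic injection instead (and the injection webhook is installed), labeling the namespace should be equivalent for this repro (an optional suggestion, not part of the original report):
$ kubectl label namespace bug istio-injection=enabled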
2. Add the mTLS Policy & DestinationRule
cat <<EOF | kubectl apply -f -
apiVersion: "authentication.istio.io/v1alpha1"
kind: "Policy"
metadata:
name: "default"
namespace: "bug"
spec:
peers:
- mtls:
mode: STRICT
---
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
name: "mtls-services"
namespace: "bug"
spec:
host: "*.local"
trafficPolicy:
tls:
mode: ISTIO_MUTUAL
EOF
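Once the workloads from steps 3-6 are running, you can optionally sanity-check that STRICT mTLS and the ISTIO_MUTUAL client policy are in effect (a suggestion, not part of the original report; the authn tls-check command ships with istioctl 1.2/1.3 but has since been removed, and the pod name comes from the listing in step 5):
$ istioctl authn tls-check sleep-67769569f9-knlsq.bug httpbin.bug.svc.cluster.local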
3. Create the sleep deployment for the test client
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
name: sleep
namespace: bug
---
apiVersion: v1
kind: Service
metadata:
name: sleep
namespace: bug
labels:
app: sleep
spec:
ports:
- port: 80
name: http
selector:
app: sleep
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: sleep
namespace: bug
spec:
replicas: 1
selector:
matchLabels:
app: sleep
template:
metadata:
labels:
app: sleep
annotations:
sidecar.istio.io/inject: "true"
spec:
serviceAccountName: sleep
containers:
- name: sleep
image: governmentpaas/curl-ssl
command: ["/bin/sleep", "3650d"]
imagePullPolicy: IfNotPresent
volumeMounts:
- mountPath: /etc/sleep/tls
name: secret-volume
volumes:
- name: secret-volume
secret:
secretName: sleep-secret
optional: true
EOF
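To confirm the sidecar was injected into the sleep pod (it should report 2/2 ready containers), a quick check:
$ kubectl get pod -n bug -l app=sleep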
4. Create the httpbin1 & httpbin2 deployments for the test servers (note that the httpbin Service below intentionally has no selector)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
name: httpbin
namespace: bug
labels:
app: httpbin
spec:
ports:
- name: http
port: 8000
targetPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: httpbin1
namespace: bug
spec:
replicas: 1
selector:
matchLabels:
app: httpbin1
version: v1
template:
metadata:
labels:
app: httpbin1
version: v1
annotations:
sidecar.istio.io/inject: "true"
spec:
containers:
- image: docker.io/kennethreitz/httpbin
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 80
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: httpbin2
namespace: bug
spec:
replicas: 1
selector:
matchLabels:
app: httpbin2
version: v1
template:
metadata:
labels:
app: httpbin2
version: v1
annotations:
sidecar.istio.io/inject: "true"
spec:
containers:
- image: docker.io/kennethreitz/httpbin
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 80
EOF
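Because the httpbin Service has no selector, Kubernetes does not create an Endpoints object for it automatically; that is done by hand in step 6. You can verify this (optional checks, not part of the original report):
$ kubectl get svc httpbin -n bug -o jsonpath='{.spec.selector}'   # prints nothing: no selector
$ kubectl get endpoints httpbin -n bug                            # NotFound until step 6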
5. Check the pod IPs and pod names
$ kubectl get pod -n bug -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
httpbin1-6fd7cccfd7-2tmmc 2/2 Running 0 4m15s 172.20.89.153 ip-172-20-88-161.ap-southeast-1.compute.internal <none> <none>
httpbin2-56f9bd7876-7q467 2/2 Running 0 4m15s 172.20.54.190 ip-172-20-49-92.ap-southeast-1.compute.internal <none> <none>
sleep-67769569f9-knlsq 2/2 Running 0 7m1s 172.20.68.201 ip-172-20-64-95.ap-southeast-1.compute.internal <none> <none>
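The pod names and IPs above are from this particular run; yours will differ. As a convenience for the later steps, they can be captured into shell variables (a sketch, not part of the original report; assumes one replica each):
$ HTTPBIN1_POD=$(kubectl get pod -n bug -l app=httpbin1 -o jsonpath='{.items[0].metadata.name}')
$ HTTPBIN1_IP=$(kubectl get pod -n bug -l app=httpbin1 -o jsonpath='{.items[0].status.podIP}')
$ HTTPBIN2_POD=$(kubectl get pod -n bug -l app=httpbin2 -o jsonpath='{.items[0].metadata.name}')
$ HTTPBIN2_IP=$(kubectl get pod -n bug -l app=httpbin2 -o jsonpath='{.items[0].status.podIP}')
$ SLEEP_POD=$(kubectl get pod -n bug -l app=sleep -o jsonpath='{.items[0].metadata.name}')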
6. Create an Endpoints object for httpbin1 (you need to replace the pod IP & name with yours)
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Endpoints
metadata:
name: httpbin
namespace: bug
subsets:
- addresses:
- ip: 172.20.89.153 ### Replace your httpbin1 pod's IP
targetRef:
kind: Pod
name: httpbin1-6fd7cccfd7-2tmmc ### Replace your httpbin1 pod's name
namespace: bug
ports:
- name: http
port: 80
protocol: TCP
EOF
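Before testing, you can confirm that the Endpoints object carries the httpbin1 address and that the sleep sidecar has received it (optional checks, not part of the original report; the exact istioctl proxy-config subcommand and its output format may vary between istioctl versions):
$ kubectl get endpoints httpbin -n bug
$ istioctl proxy-config endpoints sleep-67769569f9-knlsq.bug | grep 172.20.89.153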
7. Test access to the service (it works fine)
$ kubectl -n bug exec -it sleep-67769569f9-knlsq -- curl httpbin.bug.svc:8000
8. Switch the endpoint to httpbin2
$ kubectl edit ep -n bug httpbin
...
subsets:
- addresses:
- ip: 172.20.54.190 ### Replace your httpbin2 pod's IP
targetRef:
kind: Pod
name: httpbin2-56f9bd7876-7q467 ### Replace your httpbin2 pod's name
namespace: bug
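As a non-interactive alternative to kubectl edit, a merge patch along these lines should perform the same switch (a convenience sketch, not part of the original report; substitute your httpbin2 pod's IP and name):
$ kubectl patch endpoints httpbin -n bug --type merge -p \
  '{"subsets":[{"addresses":[{"ip":"172.20.54.190","targetRef":{"kind":"Pod","name":"httpbin2-56f9bd7876-7q467","namespace":"bug"}}],"ports":[{"name":"http","port":80,"protocol":"TCP"}]}]}'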
9. Now the request fails with the connection error:
$ kubectl -n bug exec -it sleep-67769569f9-knlsq -- curl httpbin.bug.svc:8000
upstream connect error or disconnect/reset before headers. reset reason: connection failure
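This error body is what the client-side Envoy returns, together with an HTTP 503, when it cannot establish the upstream connection. A quick way to confirm the status code (a diagnostic suggestion, not part of the original report):
$ kubectl -n bug exec -it sleep-67769569f9-knlsq -- curl -s -o /dev/null -w '%{http_code}\n' httpbin.bug.svc:8000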
Version (include the output of istioctl version --remote and kubectl version)
Istio 1.3.2 / k8s 1.14
Command output:
$ istioctl version --remote
client version: 1.2.4
cluster-local-gateway version:
cluster-local-gateway version:
citadel version: 1.3.2
galley version: 1.3.2
ingressgateway version: 1.3.2
ingressgateway version: 1.3.2
pilot version: 1.3.2
pilot version: 1.3.2
policy version: 1.3.2
sidecar-injector version: 1.3.2
telemetry version: 1.3.2
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.0-alpha.0.1451+896c901684e774", GitCommit:"896c901684e774169dbd477aecd880df1be6bdd0", GitTreeState:"clean", BuildDate:"2019-06-25T04:30:19Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.6", GitCommit:"96fac5cd13a5dc064f7d9f4f23030a6aeface6cc", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:16Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
How was Istio installed?
Istio was installed from this template: https://github.com/knative/serving/blob/75f0f8775f99357d6a3d23bf478cc87c4d577987/third_party/istio-1.3.2/istio.yaml
Environment where bug was observed (cloud vendor, OS, etc)
- Kubernetes on AWS, but I’m sure it does not matter.
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 67 (38 by maintainers)
@nak3 yep. I figured that was the case, but I thought I would ask since that would be much easier to support 🙂
Originally it was due to the lack of status on the Istio resources, i.e. we were reprogramming virtual services during scale-to-zero to point at the Activator (a buffering proxy), and we never knew when that would actually be done, so we had random timeouts throughout the code. But afterwards it turned out to be an extremely useful tool for other things, like backend overload protection, ideal load balancing (e.g. when you want a single request in flight to each pod), etc.