istio: Pod to Pod access is blocked with STRICT mtls

Bug Description

There is a use case where we need direct pod-to-pod access inside the mesh. Before we enabled STRICT PeerAuthentication, it worked as expected; after enabling STRICT mTLS, the access is broken. The flow was shown in an attached screenshot (Screen Shot 2022-02-18 at 6 24 22 PM).

Here is the error message when pod 1 curls the IP of pod 2:

curl  http://10.0.0.2:8082 -v
*   Trying 10.0.0.2:8082...
* TCP_NODELAY set
* Connected to 10.0.0.2 (10.0.0.2) port 8082 (#0)
> GET / HTTP/1.1
> Host: 10.0.0.2:8082
> User-Agent: curl/7.68.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host 10.0.0.2 left intact
curl: (52) Empty reply from server

I checked the sidecar log on the client side (pod 1); here is the error:

2022-02-18T09:53:52.338146Z debug   envoy pool  [C1649] connecting
2022-02-18T09:53:52.338153Z debug   envoy connection    [C1649] connecting to 10.0.0.2:8082
2022-02-18T09:53:52.338194Z debug   envoy connection    [C1649] connection in progress
2022-02-18T09:53:52.338210Z debug   envoy pool  queueing request due to no available connections
2022-02-18T09:53:52.338215Z debug   envoy conn_handler  [C1648] new connection
2022-02-18T09:53:52.339039Z debug   envoy connection    [C1649] connected
2022-02-18T09:53:52.339056Z debug   envoy pool  [C1649] assigning connection
2022-02-18T09:53:52.339070Z debug   envoy filter    [C1648] TCP:onUpstreamEvent(), requestedServerName:
2022-02-18T09:53:52.339988Z debug   envoy misc  Unknown error code 104 details Connection reset by peer
2022-02-18T09:53:52.340001Z debug   envoy connection    [C1649] remote close
2022-02-18T09:53:52.340004Z debug   envoy connection    [C1649] closing socket: 0
2022-02-18T09:53:52.340017Z debug   envoy pool  [C1649] client disconnected

I also checked the sidecar log on the server side (pod 2) and found entries like these:

2022-02-18T10:18:07.114417Z	debug	envoy connection	[C49805] closing data_to_write=143 type=2
2022-02-18T10:18:07.114429Z	debug	envoy connection	[C49805] setting delayed close timer with timeout 1000 ms
2022-02-18T10:18:07.114438Z	debug	envoy pool	[C6] response complete
2022-02-18T10:18:07.114443Z	debug	envoy pool	[C6] destroying stream: 0 remaining
2022-02-18T10:18:07.114488Z	debug	envoy connection	[C49805] write flush complete
2022-02-18T10:18:07.114633Z	debug	envoy connection	[C49805] remote early close
2022-02-18T10:18:07.114643Z	debug	envoy connection	[C49805] closing socket: 0
2022-02-18T10:18:07.114669Z	debug	envoy conn_handler	[C49805] adding to cleanup list
2022-02-18T10:18:07.520497Z	debug	envoy main	flushing stats
2022-02-18T10:18:08.051608Z	debug	envoy filter	original_dst: New connection accepted
2022-02-18T10:18:08.051676Z	debug	envoy filter	tls inspector: new connection accepted
2022-02-18T10:18:08.051720Z	debug	envoy conn_handler	closing connection: no matching filter chain found
2022-02-18T10:18:08.315775Z	debug	envoy connection	[C49800] remote close
2022-02-18T10:18:08.315816Z	debug	envoy connection	[C49800] closing socket: 0
2022-02-18T10:18:08.315885Z	debug	envoy conn_handler	[C49800] adding to cleanup list
2022-02-18T10:18:09.113576Z	debug	envoy conn_handler	[C49806] new connection
2022-02-18T10:18:09.113872Z	debug	envoy http	[C49806] new stream
2022-02-18T10:18:09.113989Z	debug	envoy http	[C49806][S9254331503083015686] request headers complete (end_stream=true):
':authorit

Expected behavior: Pod-to-pod access should work regardless of whether the mode is PERMISSIVE or STRICT. In STRICT mode, the client sidecar should intercept the request and originate TLS, and the server sidecar should terminate it, so that traffic between the two sidecars uses mTLS.
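
For reference, a STRICT policy of the kind described in this report typically looks like the sketch below. The exact policy used here was not posted, so the policy name `default` and the namespace `my-namespace` are assumptions for illustration only.

```yaml
# Minimal sketch of a namespace-wide STRICT mTLS policy.
# "my-namespace" is a hypothetical namespace, not taken from the report.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: my-namespace
spec:
  mtls:
    # STRICT: server sidecars in this namespace accept only mTLS traffic.
    # PERMISSIVE would accept both plaintext and mTLS.
    mode: STRICT
```

With a policy like this in place, a plaintext connection that reaches the server sidecar is rejected, which would be consistent with the "no matching filter chain found" line in the pod 2 log above.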

Version

istio version: 1.10
kubectl version: 1.18.8

Additional Information

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 15 (9 by maintainers)

Most upvoted comments

We are still interested, but it's very hard to implement due to backwards compatibility. Ambient mesh (https://istio.io/latest/blog/2022/introducing-ambient-mesh/) will support direct-to-pod traffic.

@Champ-Goblem if it's just a few Services, I would make them headless Services, where this does work, even today. That of course doesn't work for something like Prometheus, which needs to hit every pod, but if it's just a couple like ZooKeeper or whatever, it should be fine. Best case, that doc is implemented in 1.14, so released in ~2 months, but realistically it won't land until 1.15+. To implement it, we will likely need substantial changes to both Istio and Envoy.
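
As a rough illustration of the headless Service workaround suggested above, a manifest might look like the following. This is a sketch under assumptions: the Service name `pod2-headless`, the namespace `my-namespace`, and the label `app: pod2-app` are hypothetical, and only port 8082 comes from the original report.

```yaml
# Minimal sketch of a headless Service fronting the target pods.
# All names and labels are hypothetical; only port 8082 is from the report.
apiVersion: v1
kind: Service
metadata:
  name: pod2-headless
  namespace: my-namespace
spec:
  # clusterIP: None makes the Service headless, so DNS resolves to the
  # individual pod IPs instead of a single virtual IP.
  clusterIP: None
  selector:
    app: pod2-app
  ports:
    # The "http" name lets Istio infer the protocol from the port name.
    - name: http
      port: 8082
      targetPort: 8082
```

Because a headless Service exposes each pod IP as an endpoint, the client sidecar has an endpoint to match for the pod's address and can originate mTLS to it, rather than passing the connection through as plaintext.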

Following, as this is making it difficult to run Apache Flink clusters with istio enabled.