istio: Istio does not route traffic correctly if it's deployed in a namespace other than istio-system

Bug description

When I deploy Istio to a namespace other than istio-system, it is unable to route traffic from the ingress-gateway to application pods in another namespace. It can only route traffic when the application pods are in the same namespace as the Istio control plane.

cc @GregHanson who has confirmed that it is reproducible

Expected behaviour It should route traffic to the bookinfo pods in the default namespace as suggested by the tutorial here.

Please note that if I deploy Istio in istio-system, it works as expected: it is able to route traffic from the ingress-gateway to the bookinfo pods in the default namespace.

Steps to reproduce the bug

Deploy Istio to a custom namespace with the following istioctl command:

istioctl manifest apply --set components.cni.enabled=true --set values.global.istioNamespace=<custom_ns> --set values.global.configNamespace=<custom_ns> --set values.global.prometheusNamespace=<custom_ns>

Next, I deploy the Bookinfo application, as directed by the steps here.

When I try to hit the productpage endpoint via the ingress-gateway LB, the request never goes through. After enabling the Envoy access logs, I found instances of 400s:

GET /productpage HTTP/1.1" 400 - "-" "-" 0 0 24 4 "10.2.8.52" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:75.0) Gecko/20100101 Firefox/75.0" "d1f05cd7-2bac-48bd-abef-f06a97dc2014" "aff84ac935eb447afa9b1f649a782b77-1672155039.us-east-1.elb.amazonaws.com" "10.244.65.85:9080" outbound|9080||productpage.default.svc.cluster.local 10.244.65.105:60994 10.244.65.105:80 10.2.8.52:5176 - -
[2020-04-28T23:02:50.983Z] "GET /productpage HTTP/1.1" 400 - "-" "-" 0 0 0 0 "10.2.8.52" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:75.0) Gecko/20100101 Firefox/75.0" "218d05a1-7627-4d63-a0e6-a46e87a9bf68" "aff84ac935eb447afa9b1f649a782b77-1672155039.us-east-1.elb.amazonaws.com" "10.244.65.85:9080" outbound|9080||productpage.default.svc.cluster.local 10.244.65.105:32784 10.244.65.105:80 10.2.8.52:5176 - -
[2020-04-28T23:02:52.515Z] "GET /productpage HTTP/1.1" 400 - "-" "-" 0 0 0 0 "10.2.8.52" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:75.0) Gecko/20100101 Firefox/75.0" "29d24187-43fa-4c61-a3fc-9a4b7e07a20f" "aff84ac935eb447afa9b1f649a782b77-1672155039.us-east-1.elb.amazonaws.com" "10.244.65.85:9080" outbound|9080||productpage.default.svc.cluster.local 10.244.65.105:32802 10.244.65.105:80 10.2.8.52:5176 - -

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)

v1.5.1

How was Istio installed? istioctl manifest apply

Environment where bug was observed (cloud vendor, OS, etc) AWS

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 22 (22 by maintainers)

Most upvoted comments

Appears to be a legitimate bug requiring some resolution. I marked as a P0, although this feels like a blocker.

Cheers -steve

The problem is that the envoyfilters only apply to the gateway, not the default ns. That's also why it works when they are in the same namespace, since the filters are scoped to that namespace. The reasoning is that Istio has a concept of a “root” namespace, where (some) configs are treated as global objects. It's not the best documented feature there is… Basically, it's istio-system by default, so the envoyfilters apply to all namespaces. However, in this case the envoyfilters are now deployed to <my-ns>, while the rootNamespace remains istio-system. The fix would be to pass --set meshConfig.rootNamespace=my-ns.

What are the expected istioctl flags/settings for installing istio in a different namespace with demo or default profile?

--set values.global.istioNamespace=<custom_ns> 
--set values.global.configNamespace=<custom_ns> 
--set values.meshConfig.rootNamespace=<custom_ns>

Are there any that we should add deprecation messages for?
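
The three flags listed above can be combined into a single install command. This is only a sketch: the namespace name my-ns is a placeholder, and the values.meshConfig.rootNamespace path is taken from the flag list in this thread (an earlier comment writes it as meshConfig.rootNamespace, so the right path may depend on the Istio version). The script only prints the command for review and does not run istioctl.

```shell
# Placeholder namespace; substitute the namespace Istio is installed into.
CUSTOM_NS="${CUSTOM_NS:-my-ns}"

# Flags as listed in this thread. rootNamespace is the one that was
# missing from the original reproduction command.
ISTIO_FLAGS="--set values.global.istioNamespace=${CUSTOM_NS}"
ISTIO_FLAGS="${ISTIO_FLAGS} --set values.global.configNamespace=${CUSTOM_NS}"
ISTIO_FLAGS="${ISTIO_FLAGS} --set values.meshConfig.rootNamespace=${CUSTOM_NS}"

# Assemble and print the command instead of executing it (dry run).
INSTALL_CMD="istioctl manifest apply ${ISTIO_FLAGS}"
echo "${INSTALL_CMD}"
```

Once the printed command looks right, it can be pasted into a shell that has cluster access. Other flags from the original report (for example the CNI or Prometheus namespace settings) would be appended in the same way.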

btw to reproduce this I had to fix the webhook config - did you guys both do something?

Oh yes, I did do that. A kubectl edit deploy to point to the right MutatingWebhookConfiguration.

Looks like this fix hasn’t gone in yet: https://github.com/istio/istio/pull/22828

envoy logs:

[Envoy (Epoch 0)] [2020-04-29 23:58:37.915][31][debug][connection] [external/envoy/source/extensions/transport_sockets/tls/ssl_socket.cc:176] [C72] handshake complete
[Envoy (Epoch 0)] [2020-04-29 23:58:37.915][31][debug][http] [external/envoy/source/common/http/conn_manager_impl.cc:278] [C72] dispatch error: http/1.1 protocol error: HPE_INVALID_METHOD
[Envoy (Epoch 0)] [2020-04-29 23:58:37.915][31][debug][connection] [external/envoy/source/common/network/connection_impl.cc:101] [C72] closing data_to_write=66 type=2
