istio: Multicluster Mixer Error with Single Control Plane
Describe the bug After following the instructions at https://istio.io/docs/setup/kubernetes/multicluster-install/ with 1.1.0-snapshot4 (also tested with 1.0.5), I deployed the bookinfo application and moved reviews-v3 to the remote cluster. Traffic flows fine, the application works, and I can see traces in the Jaeger UI; however, Mixer is not able to identify the pod, and I'm seeing "unknown" entries in Kiali and ServiceGraph.
Expected behavior According to PR https://github.com/istio/istio/pull/8536, in version 1.1.0-snapshot4 the kubernetesenv adapter in Mixer should be able to connect to the remote cluster to get pod information.
Steps to reproduce the bug Install Istio multicluster, with routes on both clusters so that pods in each cluster can reach each other. Deploy the bookinfo application, remove reviews-v3 from the control-plane cluster, and install it on the remote cluster.
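The move of reviews-v3 can be sketched like this (the kubeconfig context names `cluster-a`/`cluster-b` are assumptions, not from my setup):

```shell
# Assumes kubeconfig contexts "cluster-a" (control plane) and "cluster-b" (remote).
# Remove reviews-v3 from the control-plane cluster...
kubectl --context=cluster-a -n default delete deployment reviews-v3

# ...and deploy only reviews-v3 on the remote cluster,
# with sidecar injection enabled in its namespace.
kubectl --context=cluster-b label namespace default istio-injection=enabled --overwrite
kubectl --context=cluster-b -n default apply \
  -f samples/bookinfo/platform/kube/bookinfo.yaml \
  -l app=reviews,version=v3
```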
Version istio-1.0.5 and istio-1.1.0-snapshot4.
Installation
On master:
helm template install/kubernetes/helm/istio --name istio --namespace istio-system \
--set tracing.enabled=true \
--set ingress.enabled=true \
--set gateways.enabled=true \
--set gateways.istio-ingressgateway.enabled=true \
--set gateways.istio-egressgateway.enabled=true \
--set sidecarInjectorWebhook.enabled=true \
--set galley.enabled=true \
--set mixer.enabled=true \
--set mixer.istio-policy.autoscaleEnabled=true \
--set mixer.istio-telemetry.autoscaleEnabled=true \
--set pilot.enabled=true \
--set telemetry-gateway.grafanaEnabled=true \
--set telemetry-gateway.prometheusEnabled=true \
--set grafana.enabled=true \
--set prometheus.enabled=true \
--set servicegraph.enabled=true \
--set tracing.ingress.enabled=true \
--set kiali.enabled=true > /root/istio-control-plane.yaml
On Remote:
helm template install/kubernetes/helm/istio-remote --namespace istio-system \
--name istio-remote \
--set global.remotePilotAddress=${PILOT_POD_IP} \
--set global.remotePolicyAddress=${POLICY_POD_IP} \
--set global.remoteTelemetryAddress=${TELEMETRY_POD_IP} \
--set global.remoteZipkinAddress=${ZIPKIN_POD_IP} > /root/istio-remote.yaml
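The four addresses above can be captured from the control-plane cluster roughly as in the multicluster install docs (label selectors below are from memory of the 1.1-era charts and may differ in your version):

```shell
# Run against the control-plane cluster before templating istio-remote.
export PILOT_POD_IP=$(kubectl -n istio-system get pod -l istio=pilot \
  -o jsonpath='{.items[0].status.podIP}')
export POLICY_POD_IP=$(kubectl -n istio-system get pod \
  -l istio=mixer,istio-mixer-type=policy \
  -o jsonpath='{.items[0].status.podIP}')
export TELEMETRY_POD_IP=$(kubectl -n istio-system get pod \
  -l istio=mixer,istio-mixer-type=telemetry \
  -o jsonpath='{.items[0].status.podIP}')
export ZIPKIN_POD_IP=$(kubectl -n istio-system get pod -l app=jaeger \
  -o jsonpath='{.items[0].status.podIP}')
```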
Environment It's a private cloud, using kubeadm with 1 master and 1 node in each cluster. Both run CentOS 7.
Cluster state Cluster A:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube01-01 Ready master 6h32m v1.13.1 172.17.10.51 <none> CentOS Linux 7 (Core) 3.10.0-957.1.3.el7.x86_64 docker://1.13.1
kube01-02 Ready <none> 6h31m v1.13.1 172.17.10.52 <none> CentOS Linux 7 (Core) 3.10.0-957.1.3.el7.x86_64 docker://1.13.1
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
grafana-68886b9575-pw544 1/1 Running 0 128m 10.51.0.5 kube01-01 <none> <none>
istio-citadel-5687dd56-v2p5k 1/1 Running 0 128m 10.51.1.9 kube01-02 <none> <none>
istio-cleanup-secrets-master-latest-daily-hpmv9 0/1 Completed 0 128m 10.51.1.5 kube01-02 <none> <none>
istio-egressgateway-84757ff994-ccklk 1/1 Running 0 128m 10.51.0.4 kube01-01 <none> <none>
istio-galley-7b8d6bcdc6-xtbpq 1/1 Running 0 128m 10.51.1.13 kube01-02 <none> <none>
istio-grafana-post-install-master-latest-daily-d9999 0/1 Completed 0 128m 10.51.0.2 kube01-01 <none> <none>
istio-ingress-56b9b96fdb-p5v6g 1/1 Running 0 128m 10.51.0.6 kube01-01 <none> <none>
istio-ingressgateway-55b7dbcc65-nk72n 1/1 Running 0 128m 10.51.1.6 kube01-02 <none> <none>
istio-pilot-f56c6774b-xh85f 2/2 Running 1 128m 10.51.0.9 kube01-01 <none> <none>
istio-policy-6ccddfd88c-xqq8c 2/2 Running 5 128m 10.51.1.7 kube01-02 <none> <none>
istio-security-post-install-master-latest-daily-cw4pq 0/1 Completed 0 128m 10.51.0.3 kube01-01 <none> <none>
istio-sidecar-injector-94cdb784-lvdtm 1/1 Running 0 128m 10.51.1.14 kube01-02 <none> <none>
istio-telemetry-8bf65698d-9z857 2/2 Running 0 19m 10.51.0.81 kube01-01 <none> <none>
istio-tracing-657b76bb5f-lhm2v 1/1 Running 0 128m 10.51.1.11 kube01-02 <none> <none>
kiali-6f5785c57b-h2894 1/1 Running 0 89m 10.51.0.77 kube01-01 <none> <none>
prometheus-797f6564bb-bg7mm 1/1 Running 0 128m 10.51.1.8 kube01-02 <none> <none>
servicegraph-648456876f-ttm77 1/1 Running 0 128m 10.51.1.10 kube01-02 <none> <none>
Cluster B:
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kube02-01 Ready master 6h32m v1.13.1 172.17.10.61 <none> CentOS Linux 7 (Core) 3.10.0-957.1.3.el7.x86_64 docker://1.13.1
kube02-02 Ready <none> 6h32m v1.13.1 172.17.10.62 <none> CentOS Linux 7 (Core) 3.10.0-957.1.3.el7.x86_64 docker://1.13.1
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
istio-citadel-58d5cb57-snfjk 1/1 Running 0 48m 10.61.1.4 kube02-02 <none> <none>
istio-cleanup-secrets-master-latest-daily-q8tsd 0/1 Completed 0 48m 10.61.1.3 kube02-02 <none> <none>
istio-sidecar-injector-bcf8b9dc5-49fns 1/1 Running 0 48m 10.61.1.6 kube02-02 <none> <none>
OTHER INFORMATION
Here are some pieces of information I figure might help:
Without stack trace:
2019-01-12T19:28:06.007672Z debug api Dispatching Preprocess
2019-01-12T19:28:06.007748Z debug api Dispatching Preprocess
2019-01-12T19:28:06.007901Z debug begin dispatch: destination='kubernetes:kubernetesenv.istio-system(kubernetesenv)'
2019-01-12T19:28:06.007996Z debug begin dispatch: destination='kubernetes:kubernetesenv.istio-system(kubernetesenv)'
2019-01-12T19:28:06.007987Z debug adapters could not find pod for (uid: kubernetes://reviews-v3-5b994cb49d-6xhxz.default, key: default/reviews-v3-5b994cb49d-6xhxz) {"adapter": "kubernetesenv.istio-system"}
2019-01-12T19:28:06.008305Z debug complete dispatch: destination='kubernetes:kubernetesenv.istio-system(kubernetesenv)' {err:<nil>}
2019-01-12T19:28:06.008391Z debug complete dispatch: destination='kubernetes:kubernetesenv.istio-system(kubernetesenv)' {err:<nil>}
With stack trace turned on:
2019-01-12T19:38:20.858882Z debug adapters could not find pod for (uid: kubernetes://reviews-v3-5b994cb49d-6xhxz.default, key: default/reviews-v3-5b994cb49d-6xhxz) {"adapter": "kubernetesenv.istio-system"}
istio.io/istio/pkg/log.(*Scope).emit
/workspace/go/src/istio.io/istio/pkg/log/scope.go:281
istio.io/istio/pkg/log.(*Scope).Debug
/workspace/go/src/istio.io/istio/pkg/log/scope.go:229
istio.io/istio/mixer/pkg/runtime/handler.logger.Debugf
/workspace/go/src/istio.io/istio/mixer/pkg/runtime/handler/logger.go:66
istio.io/istio/mixer/adapter/kubernetesenv.(*handler).findPod
/workspace/go/src/istio.io/istio/mixer/adapter/kubernetesenv/kubernetesenv.go:242
istio.io/istio/mixer/adapter/kubernetesenv.(*handler).GenerateKubernetesAttributes
/workspace/go/src/istio.io/istio/mixer/adapter/kubernetesenv/kubernetesenv.go:211
istio.io/istio/mixer/template.glob..func4
/workspace/go/src/istio.io/istio/mixer/template/template.gen.go:333
istio.io/istio/mixer/pkg/runtime/dispatcher.(*dispatchState).invokeHandler
/workspace/go/src/istio.io/istio/mixer/pkg/runtime/dispatcher/dispatchstate.go:143
istio.io/istio/mixer/pkg/runtime/dispatcher.(*dispatchState).(istio.io/istio/mixer/pkg/runtime/dispatcher.invokeHandler)-fm
/workspace/go/src/istio.io/istio/mixer/pkg/runtime/dispatcher/session.go:271
istio.io/istio/mixer/pkg/pool.(*GoroutinePool).AddWorkers.func1
/workspace/go/src/istio.io/istio/mixer/pkg/pool/goroutine.go:82
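For reference, the uid in the failing log line maps onto the cache key like this (a pure-shell sketch of the transformation, not Mixer's actual Go code):

```shell
# "kubernetes://<pod>.<namespace>" -> "<namespace>/<pod>"
uid="kubernetes://reviews-v3-5b994cb49d-6xhxz.default"
stripped=${uid#kubernetes://}   # drop the scheme prefix
ns=${stripped##*.}              # text after the last dot: the namespace
pod=${stripped%.*}              # text before the last dot: the pod name
key="$ns/$pod"
echo "$key"                     # default/reviews-v3-5b994cb49d-6xhxz
```

The adapter looks this key up in its informer cache, so the "could not find pod" message means the remote cluster's pods never made it into that cache.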
Telemetry POD:
Name: istio-telemetry-8bf65698d-9z857
Namespace: istio-system
Priority: 0
PriorityClassName: <none>
Node: kube01-01/172.17.10.51
Start Time: Sat, 12 Jan 2019 17:37:27 -0200
Labels: app=telemetry
chart=mixer
heritage=Tiller
istio=mixer
istio-mixer-type=telemetry
pod-template-hash=8bf65698d
release=istio
version=1.1.0
Annotations: scheduler.alpha.kubernetes.io/critical-pod:
sidecar.istio.io/inject: false
Status: Running
IP: 10.51.0.81
Controlled By: ReplicaSet/istio-telemetry-8bf65698d
Containers:
mixer:
Container ID: docker://23f034d425c27334155391160fee105e154e85ec42bba71f1a7346674bfc1641
Image: gcr.io/istio-release/mixer:master-latest-daily
Image ID: docker-pullable://gcr.io/istio-release/mixer@sha256:860d4a96f67104ba35bb6c9843b24aed6d9165d2965d011bff4c0afb0642d1d5
Ports: 9093/TCP, 42422/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--monitoringPort=9093
--address
unix:///sock/mixer.socket
--configStoreURL=mcp://istio-galley.istio-system.svc:9901
--configDefaultNamespace=istio-system
--trace_zipkin_url=http://zipkin:9411/api/v1/spans
State: Running
Started: Sat, 12 Jan 2019 17:37:30 -0200
Ready: True
Restart Count: 0
Requests:
cpu: 10m
Liveness: http-get http://:9093/version delay=5s timeout=1s period=5s #success=1 #failure=3
Environment:
GODEBUG: gctrace=2
Mounts:
/etc/certs from istio-certs (ro)
/sock from uds-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from istio-mixer-service-account-token-tzvct (ro)
istio-proxy:
Container ID: docker://afad1e72fddbb660176bbe1331e7ec5f674afaef42c68c41b323045e9933ac3b
Image: gcr.io/istio-release/proxyv2:master-latest-daily
Image ID: docker-pullable://gcr.io/istio-release/proxyv2@sha256:718563628a051e5f690c4939eb9f8af66c26f25106042f09065cb207385d176c
Ports: 9091/TCP, 15004/TCP, 15090/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
proxy
--serviceCluster
istio-telemetry
--templateFile
/etc/istio/proxy/envoy_telemetry.yaml.tmpl
--controlPlaneAuthPolicy
NONE
State: Running
Started: Sat, 12 Jan 2019 17:37:30 -0200
Ready: True
Restart Count: 0
Requests:
cpu: 10m
Environment:
POD_NAME: istio-telemetry-8bf65698d-9z857 (v1:metadata.name)
POD_NAMESPACE: istio-system (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
Mounts:
/etc/certs from istio-certs (ro)
/sock from uds-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from istio-mixer-service-account-token-tzvct (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
istio-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio.istio-mixer-service-account
Optional: true
uds-socket:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
istio-mixer-service-account-token-tzvct:
Type: Secret (a volume populated by a Secret)
SecretName: istio-mixer-service-account-token-tzvct
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 17m default-scheduler Successfully assigned istio-system/istio-telemetry-8bf65698d-9z857 to kube01-01
Normal Pulled 17m kubelet, kube01-01 Container image "gcr.io/istio-release/mixer:master-latest-daily" already present on machine
Normal Created 17m kubelet, kube01-01 Created container
Normal Started 17m kubelet, kube01-01 Started container
Normal Pulled 17m kubelet, kube01-01 Container image "gcr.io/istio-release/proxyv2:master-latest-daily" already present on machine
Normal Created 17m kubelet, kube01-01 Created container
Normal Started 17m kubelet, kube01-01 Started container
Secrets on Control Plane:
NAME TYPE DATA AGE
clusterb Opaque 1 47m
default-token-k4xv6 kubernetes.io/service-account-token 3 131m
istio-ca-secret istio.io/ca-root 5 129m
istio-citadel-service-account-token-8zww5 kubernetes.io/service-account-token 3 131m
istio-cleanup-secrets-service-account-token-9hllg kubernetes.io/service-account-token 3 131m
istio-egressgateway-service-account-token-2b987 kubernetes.io/service-account-token 3 131m
istio-galley-service-account-token-wn8jv kubernetes.io/service-account-token 3 131m
istio-grafana-post-install-account-token-zw497 kubernetes.io/service-account-token 3 131m
istio-ingress-service-account-token-7jzhd kubernetes.io/service-account-token 3 131m
istio-ingressgateway-service-account-token-2jjmb kubernetes.io/service-account-token 3 131m
istio-mixer-service-account-token-tzvct kubernetes.io/service-account-token 3 131m
istio-pilot-service-account-token-mf9g2 kubernetes.io/service-account-token 3 131m
istio-security-post-install-account-token-v5s62 kubernetes.io/service-account-token 3 131m
istio-sidecar-injector-service-account-token-mjj86 kubernetes.io/service-account-token 3 131m
istio.default istio.io/key-and-cert 3 129m
istio.istio-citadel-service-account istio.io/key-and-cert 3 129m
istio.istio-cleanup-secrets-service-account istio.io/key-and-cert 3 129m
istio.istio-egressgateway-service-account istio.io/key-and-cert 3 129m
istio.istio-galley-service-account istio.io/key-and-cert 3 129m
istio.istio-grafana-post-install-account istio.io/key-and-cert 3 129m
istio.istio-ingress-service-account istio.io/key-and-cert 3 129m
istio.istio-ingressgateway-service-account istio.io/key-and-cert 3 129m
istio.istio-mixer-service-account istio.io/key-and-cert 3 129m
istio.istio-pilot-service-account istio.io/key-and-cert 3 129m
istio.istio-security-post-install-account istio.io/key-and-cert 3 129m
istio.istio-sidecar-injector-service-account istio.io/key-and-cert 3 129m
istio.kiali-service-account istio.io/key-and-cert 3 129m
istio.prometheus istio.io/key-and-cert 3 129m
kiali Opaque 2 113m
kiali-service-account-token-kdq6q kubernetes.io/service-account-token 3 131m
prometheus-token-mg8tw kubernetes.io/service-account-token 3 131m
Secrets on Remote:
NAME TYPE DATA AGE
default-token-qxbb4 kubernetes.io/service-account-token 3 51m
istio-ca-secret istio.io/ca-root 5 50m
istio-citadel-service-account-token-bg9xm kubernetes.io/service-account-token 3 51m
istio-cleanup-secrets-service-account-token-5g542 kubernetes.io/service-account-token 3 51m
istio-multi-token-jpdbr kubernetes.io/service-account-token 3 51m
istio-sidecar-injector-service-account-token-h8blp kubernetes.io/service-account-token 3 51m
istio.default istio.io/key-and-cert 3 50m
istio.istio-citadel-service-account istio.io/key-and-cert 3 50m
istio.istio-cleanup-secrets-service-account istio.io/key-and-cert 3 50m
istio.istio-multi istio.io/key-and-cert 3 50m
istio.istio-sidecar-injector-service-account istio.io/key-and-cert 3 50m
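The clusterb secret listed on the control plane above was created along these lines, following the multicluster install docs (the KUBECONFIG_FILE variable holds a kubeconfig generated on the remote cluster from the istio-multi service account; the exact generation steps are in the docs):

```shell
# On the control-plane cluster: register the remote cluster's credentials
# so Mixer/Pilot can watch it.
CLUSTER_NAME=clusterb
kubectl -n istio-system create secret generic ${CLUSTER_NAME} \
  --from-file ${KUBECONFIG_FILE}
kubectl -n istio-system label secret ${CLUSTER_NAME} istio/multiCluster=true
```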
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 15 (8 by maintainers)
I spent quite a bit of time since my last comment investigating and here’s what I came up with. Note that I’m not an istio expert nor a kubernetes go client api expert, so I may be making some incorrect assumptions.
All metrics reported by Mixer seem to report the replicaset as the workload ~10-20% of the time across all workloads. There does not seem to be any time correlation, i.e. it's not that we see the deployment name for a period of time followed by the replicaset name for a period of time. Here's an example for istio_requests_total:
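A breakdown of this shape can also be pulled straight from the Prometheus API (the `destination_workload` label name is from the default Istio telemetry config; adjust if yours differs):

```shell
# Forward the in-cluster Prometheus, then query request totals by reported
# workload; a pod appearing under both its deployment and replicaset name
# shows the inconsistency described above.
kubectl -n istio-system port-forward svc/prometheus 9090:9090 &
sleep 2
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode \
  'query=sum(rate(istio_requests_total[5m])) by (destination_workload)'
```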
Looking at the access logs in istio-telemetry, we can see examples that occur within milliseconds of each other reporting different workloads:
The entries above:
During the above time periods I have not updated the deployment or changed the kubernetes configuration in any way.
I have exactly one istio-telemetry pod running. I verified that there are only 2 cache controllers running by observing that the corresponding log line appears exactly twice, and I also verified that the line for my remote cluster appears exactly once. I take this to mean that it's configured correctly, i.e. I don't have two kubernetesenv attribute instances configured. I am currently running release-1.1-20190130-09-16.
Looking at the code, it appears this might happen for a couple of reasons; one candidate is the err != nil condition. I'm not sure under what conditions we might get an error, but this is probably where I'd add logging to see if something wonky was happening. If you could provide or point me to instructions on how to build the image, I would be happy to debug further.
@gbaufake, basically I've created 2 clusters with kubeadm:
Cluster1:
Cluster2:
After that, I added routes to the pod and service networks on each side:
Cluster1:
Cluster2:
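The routes were of this shape (the CIDRs and next hop below are illustrative examples only, not my actual networks):

```shell
# On each node of Cluster1: route Cluster2's pod and service CIDRs
# via Cluster2's master. CIDRs/next-hop shown here are made up.
ip route add 10.61.0.0/16 via 172.17.10.61   # Cluster2 pod network
ip route add 10.97.0.0/16 via 172.17.10.61   # Cluster2 service network
```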
On Cluster1:
On Cluster2:
On Cluster1:
After that, I edited the istio-reader ClusterRole on Cluster2, changed apiGroups to *, and added replicasets and replicationcontrollers to the resources. I also created the Kiali secret on the master. Discussion of those two steps here: https://discuss.istio.io/t/istio-multicluster/210/
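The istio-reader change amounted to something like this (a sketch only; the rule list in the chart-installed ClusterRole may contain more resources than shown here):

```shell
# On Cluster2: widen istio-reader so the kubernetesenv adapter can read
# replicasets/replicationcontrollers when resolving workload owners.
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: istio-reader
rules:
- apiGroups: ['*']
  resources:
  - nodes
  - pods
  - services
  - endpoints
  - replicasets
  - replicationcontrollers
  verbs: ['get', 'watch', 'list']
EOF
```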
Also, an important note: remotePilotAddress, remotePolicyAddress, remoteTelemetryAddress and remoteZipkinAddress, which are used by istio-remote, point to the ClusterIP addresses instead of the pod IPs, so that I can have some level of high availability. If I am not wrong, istio-remote.yaml was edited before it was applied so that the ports and addresses work with the ClusterIP addresses instead of the pod IPs.
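The ClusterIPs can be read off the control-plane services like this (service names match the default 1.1-era chart; confirm them in your install):

```shell
# Resolve control-plane service ClusterIPs to feed into the istio-remote
# chart in place of the pod IPs.
export PILOT_IP=$(kubectl -n istio-system get svc istio-pilot \
  -o jsonpath='{.spec.clusterIP}')
export POLICY_IP=$(kubectl -n istio-system get svc istio-policy \
  -o jsonpath='{.spec.clusterIP}')
export TELEMETRY_IP=$(kubectl -n istio-system get svc istio-telemetry \
  -o jsonpath='{.spec.clusterIP}')
export ZIPKIN_IP=$(kubectl -n istio-system get svc zipkin \
  -o jsonpath='{.spec.clusterIP}')
```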
If interested, I can provide access to the environment for further troubleshooting.