istio: ambient does not work on minikube

Is this the right place to submit this?

  • This is not a security vulnerability or a crashing bug
  • This is not a question about how to use Istio

Bug Description

I followed the ambient getting-started guide: https://istio.io/latest/docs/ops/ambient/getting-started/

I did not install Gateway APIs. I followed the “Istio APIs” instructions to install:

istioctl install --set profile=ambient --set components.ingressGateways[0].enabled=true --set components.ingressGateways[0].name=istio-ingressgateway --skip-confirmation

The cluster is minikube. I’m using an Istio 1.19-dev build (see the “Version” field for details).

Things look installed properly:

$ kubectl get pods -n bookinfo 
NAME                              READY   STATUS    RESTARTS   AGE
details-v1-7745b6fcf4-79m8s       1/1     Running   0          68s
productpage-v1-6f89b6c557-ccpth   1/1     Running   0          68s
ratings-v1-77bdbf89bb-ndv8b       1/1     Running   0          68s
reviews-v1-667b5cc65d-dzslp       1/1     Running   0          68s
reviews-v2-6f76498fc8-2mdfv       1/1     Running   0          68s
reviews-v3-5d8667cc66-j6m8x       1/1     Running   0          68s
$ kubectl get pods,ds -n istio-system
NAME                                       READY   STATUS    RESTARTS   AGE
pod/istio-cni-node-9nd56                   1/1     Running   0          9m55s
pod/istio-ingressgateway-7d67669df-9dgpg   1/1     Running   0          9m55s
pod/istiod-7c6f4d8478-26gc8                1/1     Running   0          10m
pod/ztunnel-tcpg8                          1/1     Running   0          10m

NAME                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/istio-cni-node   1         1         1       1            1           kubernetes.io/os=linux   9m55s
daemonset.apps/ztunnel          1         1         1       1            1           kubernetes.io/os=linux   10m

But the CNI daemonset logs contain errors. See cni-errors.log, which was captured with: kubectl logs -n istio-system daemonset/istio-cni-node > cni-errors.log

The first error in the logs is:

2023-07-25T12:41:04.140628Z	warn	ambient	unable to list IPSet: failed to list ipset ztunnel-pods-ips: no such file or directory

followed by many of these:

2023-07-25T12:41:16.745442Z	warn	ambient	Error running command iptables-legacy: iptables: No chain/target/match by that name.

and then:

2023-07-25T12:41:16.773229Z	error	controllers	error handling istio-system/ztunnel-tcpg8, retrying (retry count: 1): failed to get veth device: no routes found for 10.244.0.10	controller=ambient

Version

$ istioctl version
client version: 1.19-alpha.c641d08aa437381c3678805e17c0479f247e714a
control plane version: 1.19-alpha.c641d08aa437381c3678805e17c0479f247e714a
data plane version: 1.19-alpha.c641d08aa437381c3678805e17c0479f247e714a (2 proxies)
$ kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.26.1
Kustomize Version: v4.5.7
Server Version: v1.26.3

Minikube (relevant for this issue; this error doesn’t happen with KinD):

$ minikube version
minikube version: v1.30.1
commit: 08896fd1dc362c097c925146c4a0d0dac715ace0

Operating System/Hardware:

$ uname -a
Linux jmazzite-thinkpadp1gen3.ttn.csb 6.3.12-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jul  6 04:05:18 UTC 2023 x86_64 GNU/Linux

$ cat /etc/redhat-release 
Fedora release 38 (Thirty Eight)


Additional Information

[bug-report.tar.gz](https://github.com/istio/istio/files/12161334/bug-report.tar.gz)


Affected product area

- [X] Ambient
- [ ] Docs
- [ ] Installation
- [ ] Networking
- [ ] Performance and Scalability
- [ ] Extensions and Telemetry
- [ ] Security
- [ ] Test and Release
- [ ] User Experience
- [ ] Developer Infrastructure
- [ ] Upgrade
- [ ] Multi Cluster
- [ ] Virtual Machine
- [ ] Control Plane Revisions

About this issue

  • State: open
  • Created a year ago
  • Reactions: 3
  • Comments: 23 (23 by maintainers)

Most upvoted comments

The initial warnings come from node cleanup. As the logs themselves mention, they are not relevant and are only logged at WARN level:

ambient Node-level network rule cleanup started
2023-07-25T11:49:32.823752Z info    ambient If rules do not exist in the first place, warnings will be triggered - these can be safely ignored
...(warn)
...(warn)
2023-07-25T11:49:32.899316Z info    ambient Node-level cleanup done
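
The cleanup window can be filtered out mechanically, so that only the remaining warnings and errors are left to read. This is a minimal sketch, assuming the agent always logs the start and end markers quoted in the excerpt above and that the log level is the second whitespace-separated field. `filter_cni_log` is a hypothetical helper name, not an Istio or kubectl command.

```shell
# filter_cni_log (hypothetical helper): reads istio-cni-node logs on stdin and
# prints only warn/error lines that fall OUTSIDE the node cleanup window.
filter_cni_log() {
  awk '
    # Everything between these two markers is cleanup noise; skip it.
    /Node-level network rule cleanup started/ { in_cleanup = 1; next }
    /Node-level cleanup done/                 { in_cleanup = 0; next }
    in_cleanup                                { next }
    # Field 2 is the level in lines shaped like "<timestamp> <level> <scope> ...".
    $2 == "warn" || $2 == "error"             { print }
  '
}
```

A possible invocation, reusing the log command from the report: `kubectl logs -n istio-system daemonset/istio-cni-node | filter_cni_log`. Whatever survives the filter was logged outside cleanup and deserves a closer look.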

This is an actual error, however:

2023-07-25T11:49:32.899398Z error   controllers error handling istio-system/ztunnel-7vggw, retrying (retry count: 1): failed to get veth device: no routes found for 10.244.0.9 controller=ambient

The CNI agent can’t seem to find a valid route on the node for the ztunnel pod IP that k8s assigned, so node initialization fails before it gets to creating the ipset.

This is effectively a catastrophic failure (though it doesn’t put the CNI agent into an unready state; it probably should, but that’s a bit tricky given that the CNI agent does double duty for sidecar and ambient).

Check istio-system; I bet your ztunnel pods are unhealthy. If the ztunnel pod is actually running in a correctly configured k8s cluster, there must be a route to its pod IP.
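
One way to run that check is to pull the pod IPs out of the "no routes found" errors and ask the minikube node which route it would pick for each. This is a rough sketch rather than an official procedure: `no_route_ips` and `check_routes` are hypothetical helper names, and it assumes the `ip` tool is available on the minikube node image.

```shell
# no_route_ips (hypothetical helper): reads CNI agent logs on stdin and prints
# each distinct IP named in a "no routes found for <ip>" error.
no_route_ips() {
  grep -o 'no routes found for [0-9.]*' | awk '{ print $5 }' | sort -u
}

# check_routes (hypothetical helper): for each IP on stdin, asks the minikube
# node's kernel which route it would use. Assumes `ip` exists on the node.
check_routes() {
  while read -r ip; do
    echo "== $ip"
    # Redirect stdin so the ssh session does not swallow the remaining IPs.
    minikube ssh -- ip route get "$ip" </dev/null
  done
}
```

For example: `kubectl logs -n istio-system daemonset/istio-cni-node | no_route_ips | check_routes`. Comparing the printed IPs against `kubectl get pods -n istio-system -o wide` shows whether they belong to the ztunnel pod and whether that pod is actually healthy.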