emissary: Ambassador not working on fresh installed kubernetes

Describe the bug: The Ambassador container within the ambassador pod does not start properly on a local cluster running Kubernetes 1.10.2.

To Reproduce: Steps to reproduce the behavior: on a fresh install of Kubernetes 1.10.2 on a local cluster, follow the guidance to install ambassador-no-rbac.

Expected behavior: Ambassador should start, but instead it exits with error code 137. Both the liveness probe and the readiness probe fail with getsockopt: connection refused.

ambassador-6c7dd7799b-2mjzx   1/2       CrashLoopBackOff   13         37m
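Exit code 137 means the container was killed with SIGKILL, which for a freshly started pod usually means the kubelet restarted it after repeated liveness-probe failures (or the node OOM-killed it). If the probes are simply firing before the process has finished booting, relaxing the probe timings is one thing to try. A hedged sketch of such probe settings for the ambassador container; the port and endpoint paths match stock Ambassador's admin interface, but the timing values are only a starting point, not a verified fix:

```yaml
# Hypothetical probe tuning for the ambassador container spec.
livenessProbe:
  httpGet:
    path: /ambassador/v0/check_alive
    port: 8877
  initialDelaySeconds: 30   # give the process time to boot before the first check
  periodSeconds: 10
  failureThreshold: 5       # tolerate a few failures before the kubelet restarts the pod
readinessProbe:
  httpGet:
    path: /ambassador/v0/check_ready
    port: 8877
  initialDelaySeconds: 30
  periodSeconds: 10
```

`kubectl describe pod ambassador-6c7dd7799b-2mjzx` will show whether the restarts are probe-driven (probe failure events) or memory-driven (reason: OOMKilled).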

Versions (please complete the following information):

  • Ambassador: 0.32.1
  • Kubernetes environment: local cluster installed using kubeadm. Configured with calico.
  • Version 1.10.2
  • Ubuntu 16.04

Additional context: @richarddli suggested this is due to a DNS problem, but local DNS (/etc/resolv.conf) is configured with 8.8.8.8 and 8.8.4.4, and I am sure I can at least reach Google. (By deploying a busybox pod into my cluster, I checked that the busybox container can ping google.com, but I am not sure about the others.) Here are some extra logs for debugging:
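Pinging google.com only exercises external DNS; the probes here depend on in-cluster resolution through kube-dns. A generic debugging sketch (not from the original report) for checking cluster DNS from inside a pod, using a throwaway busybox manifest:

```yaml
# dns-test.yaml - disposable pod for checking in-cluster DNS resolution.
apiVersion: v1
kind: Pod
metadata:
  name: dns-test
spec:
  containers:
  - name: busybox
    image: busybox:1.28   # 1.28 avoids known nslookup quirks in newer busybox images
    command: ["sleep", "3600"]
  restartPolicy: Never
```

After `kubectl apply -f dns-test.yaml`, running `kubectl exec dns-test -- nslookup kubernetes.default` should resolve via kube-dns; if that times out while `nslookup google.com` succeeds, the problem is in-cluster DNS rather than the upstream resolvers.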

kubernetes@local-cluster-0:~/my-kubeflow$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns
I0503 10:46:23.833551       1 dns.go:48] version: 1.14.8
I0503 10:46:23.862220       1 server.go:71] Using configuration read from directory: /kube-dns-config with period 10s
I0503 10:46:23.862374       1 server.go:119] FLAG: --alsologtostderr="false"
I0503 10:46:23.862433       1 server.go:119] FLAG: --config-dir="/kube-dns-config"
I0503 10:46:23.862466       1 server.go:119] FLAG: --config-map=""
I0503 10:46:23.862485       1 server.go:119] FLAG: --config-map-namespace="kube-system"
I0503 10:46:23.862504       1 server.go:119] FLAG: --config-period="10s"
I0503 10:46:23.862532       1 server.go:119] FLAG: --dns-bind-address="0.0.0.0"
I0503 10:46:23.862551       1 server.go:119] FLAG: --dns-port="10053"
I0503 10:46:23.862592       1 server.go:119] FLAG: --domain="cluster.local."
I0503 10:46:23.862621       1 server.go:119] FLAG: --federations=""
I0503 10:46:23.862648       1 server.go:119] FLAG: --healthz-port="8081"
I0503 10:46:23.862668       1 server.go:119] FLAG: --initial-sync-timeout="1m0s"
I0503 10:46:23.862688       1 server.go:119] FLAG: --kube-master-url=""
I0503 10:46:23.862742       1 server.go:119] FLAG: --kubecfg-file=""
I0503 10:46:23.862761       1 server.go:119] FLAG: --log-backtrace-at=":0"
I0503 10:46:23.862794       1 server.go:119] FLAG: --log-dir=""
I0503 10:46:23.862815       1 server.go:119] FLAG: --log-flush-frequency="5s"
I0503 10:46:23.862835       1 server.go:119] FLAG: --logtostderr="true"
I0503 10:46:23.862854       1 server.go:119] FLAG: --nameservers=""
I0503 10:46:23.862872       1 server.go:119] FLAG: --stderrthreshold="2"
I0503 10:46:23.862891       1 server.go:119] FLAG: --v="2"
I0503 10:46:23.862911       1 server.go:119] FLAG: --version="false"
I0503 10:46:23.862953       1 server.go:119] FLAG: --vmodule=""
I0503 10:46:23.863269       1 server.go:201] Starting SkyDNS server (0.0.0.0:10053)
I0503 10:46:23.864231       1 server.go:220] Skydns metrics enabled (/metrics:10055)
I0503 10:46:23.864282       1 dns.go:146] Starting endpointsController
I0503 10:46:23.864363       1 dns.go:149] Starting serviceController
I0503 10:46:23.901849       1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0503 10:46:23.901936       1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0503 10:46:24.364803       1 dns.go:170] Initialized services and endpoints from apiserver
I0503 10:46:24.364871       1 server.go:135] Setting up Healthz Handler (/readiness)
I0503 10:46:24.364935       1 server.go:140] Setting up cache handler (/cache)
I0503 10:46:24.364962       1 server.go:126] Status HTTP port 8081
I0504 04:33:43.169511       1 dns.go:555] Could not find endpoints for service "tf-hub-0" in namespace "kubeflow". DNS records will be created once endpoints show up.
I0515 01:02:23.572540       1 dns.go:555] Could not find endpoints for service "tf-hub-0" in namespace "kubeflow". DNS records will be created once endpoints show up.
kubernetes@local-cluster-0:~/my-kubeflow$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
I0503 10:46:24.860136       1 main.go:76] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0503 10:46:24.870496       1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I0503 10:46:25.334307       1 nanny.go:116] dnsmasq[12]: started, version 2.78 cachesize 1000
I0503 10:46:25.334618       1 nanny.go:116] dnsmasq[12]: compile time options: IPv6 GNU-getopt no-DBus no-i18n no-IDN DHCP DHCPv6 no-Lua TFTP no-conntrack ipset auth no-DNSSEC loop-detect inotify
I0503 10:46:25.334632       1 nanny.go:116] dnsmasq[12]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0503 10:46:25.334638       1 nanny.go:116] dnsmasq[12]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0503 10:46:25.334643       1 nanny.go:116] dnsmasq[12]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0503 10:46:25.334651       1 nanny.go:116] dnsmasq[12]: reading /etc/resolv.conf
I0503 10:46:25.334657       1 nanny.go:116] dnsmasq[12]: using nameserver 127.0.0.1#10053 for domain ip6.arpa
I0503 10:46:25.334662       1 nanny.go:116] dnsmasq[12]: using nameserver 127.0.0.1#10053 for domain in-addr.arpa
I0503 10:46:25.334668       1 nanny.go:116] dnsmasq[12]: using nameserver 127.0.0.1#10053 for domain cluster.local
I0503 10:46:25.334672       1 nanny.go:116] dnsmasq[12]: using nameserver 8.8.8.8#53
I0503 10:46:25.334678       1 nanny.go:116] dnsmasq[12]: using nameserver 8.8.4.4#53
I0503 10:46:25.334716       1 nanny.go:116] dnsmasq[12]: read /etc/hosts - 7 addresses
I0503 10:46:25.334851       1 nanny.go:119]
W0503 10:46:25.334861       1 nanny.go:120] Got EOF from stdout
kubernetes@local-cluster-0:~/my-kubeflow$ kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c sidecar
I0503 10:46:25.240827       1 main.go:51] Version v1.14.8
I0503 10:46:25.240868       1 server.go:45] Starting server (options {DnsMasqPort:53 DnsMasqAddr:127.0.0.1 DnsMasqPollIntervalMs:5000 Probes:[{Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33} {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}] PrometheusAddr:0.0.0.0 PrometheusPort:10054 PrometheusPath:/metrics PrometheusNamespace:kubedns})
I0503 10:46:25.240906       1 dnsprobe.go:75] Starting dnsProbe {Label:kubedns Server:127.0.0.1:10053 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
I0503 10:46:25.240992       1 dnsprobe.go:75] Starting dnsProbe {Label:dnsmasq Server:127.0.0.1:53 Name:kubernetes.default.svc.cluster.local. Interval:5s Type:33}
W0503 10:46:25.241374       1 server.go:64] Error getting metrics from dnsmasq: read udp 127.0.0.1:58432->127.0.0.1:53: read: connection refused

Can anyone help me take a look? Many thanks in advance!

About this issue

  • Original URL
  • State: closed
  • Created 6 years ago
  • Comments: 41 (21 by maintainers)

Most upvoted comments

@victortrac can you please provide more detail on how you fixed this?

OK, so here’s a better Ambassador deployment YAML for Seldon. I’ll figure out how to create a PR for this for the Seldon folks.

---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: ambassador
rules:
- apiGroups: [""]
  resources:
  - services
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["create", "update", "patch", "get", "list", "watch"]
- apiGroups: [""]
  resources:
  - secrets
  verbs: ["get", "list", "watch"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ambassador
  namespace: seldon
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ambassador
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ambassador
subjects:
- kind: ServiceAccount
  name: ambassador
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: ambassador
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ambassador
subjects:
- kind: ServiceAccount
  name: ambassador
  namespace: seldon
---
apiVersion: v1
kind: Service
metadata:
  name: ambassador
  namespace: seldon
spec:
  selector:
    service: ambassador
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
  type: NodePort
---
apiVersion: v1
kind: Service
metadata:
  labels:
    service: ambassador-admin
  name: ambassador-admin
  namespace: seldon
spec:
  ports:
  - name: ambassador-admin
    port: 8877
    targetPort: 8877
  selector:
    service: ambassador
  type: NodePort
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ambassador
  namespace: seldon
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: 'false'
      labels:
        service: ambassador
    spec:
      containers:
      - image: quay.io/datawire/ambassador:0.34.1
        name: ambassador
        env:
        - name: AMBASSADOR_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        resources:
          limits:
            cpu: 1
            memory: 400Mi
          requests:
            cpu: 200m
            memory: 100Mi
      - image: quay.io/datawire/statsd:0.34.1
        name: statsd
      restartPolicy: Always
      serviceAccountName: ambassador

Augh. So. Seldon deploys into the seldon namespace, indeed, and Ambassador will need to be tweaked for that.

@AdrianLsk Are you on our Slack channel? If not, www.getambassador.io has instructions. I’d like to give you a different ambassador.yaml to try, but it’ll likely be easier to interact there.