digitalocean-cloud-controller-manager: DigitalOcean implementation of Kubernetes does not allow metrics-server to function

Hello,

I raised an issue for this last night, but now that I’ve had more time to sleep, I wanted to raise it again and provide some more information.

I currently have a Digital Ocean Managed Kubernetes Cluster. I have some applications deployed and running on it.

I have configured Horizontal Pod Autoscaling for one of my deployments, but when running the kubectl get hpa command, I noticed the following in my output (<unknown> in the targets column):

NAME                              REFERENCE                                    TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
benjamin-maynard-io-fe            Deployment/benjamin-maynard-io-fe            <unknown>/80%   1         20        3          10h
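
For completeness, the HPA itself looks roughly like this (reconstructed from the output above, so treat the exact manifest as illustrative):

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: benjamin-maynard-io-fe
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: benjamin-maynard-io-fe
  minReplicas: 1
  maxReplicas: 20
  # target 80% average CPU utilization across the deployment's pods
  targetCPUUtilizationPercentage: 80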

I identified this was because I did not have either Heapster or metrics-server running on my cluster, so I went to install metrics-server as per the instructions at https://github.com/kubernetes-incubator/metrics-server

metrics-server successfully installs, and is running in the kube-system namespace:

NAME                                READY   STATUS    RESTARTS   AGE
csi-do-controller-0                 3/3     Running   0          42h
csi-do-node-dbvg5                   2/2     Running   0          42h
csi-do-node-lq97x                   2/2     Running   1          42h
csi-do-node-mvnrw                   2/2     Running   0          42h
kube-dns-55cf9576c4-4r466           3/3     Running   0          42h
kube-proxy-upbeat-lichterman-3mz4   1/1     Running   0          42h
kube-proxy-upbeat-lichterman-3mzh   1/1     Running   0          42h
kube-proxy-upbeat-lichterman-3mzi   1/1     Running   0          42h
metrics-server-7fbd9b8589-64x86     1/1     Running   0          9m48s

However, I am still getting no metrics.

Running kubectl get apiservice v1beta1.metrics.k8s.io -o yaml reveals:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: 2018-11-27T08:32:26Z
  name: v1beta1.metrics.k8s.io
  resourceVersion: "396557"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: f88f3576-f21e-11e8-8aed-fab39051e242
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: 2018-11-27T08:32:26Z
    message: 'no response from https://10.245.219.253:443: Get https://10.245.219.253:443:
      net/http: request canceled while waiting for connection (Client.Timeout exceeded
      while awaiting headers)'
    reason: FailedDiscoveryCheck
    status: "False"
    type: Available

The latter part of the output is the interesting bit: message: 'no response from https://10.245.219.253:443: Get https://10.245.219.253:443: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)'

I believe the above error message means that the kube-apiserver cannot reach the metrics-server service, and that this is due to the specifics of how the DigitalOcean-managed Kubernetes master is networked.
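
To check whether the service is even reachable from inside the cluster (as opposed to from the managed master), a quick test can be run from a throwaway pod; the pod name and curl image here are just illustrative:

kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl --command -- \
  curl -k https://metrics-server.kube-system.svc.cluster.local/healthz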

I’ve performed some other general validation:

Service is configured:

Benjamins-MacBook-Pro:metrics-server benmaynard$ kubectl get service --namespace=kube-system
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)         AGE
kube-dns         ClusterIP   10.245.0.10      <none>        53/UDP,53/TCP   2d19h
metrics-server   ClusterIP   10.245.219.253   <none>        443/TCP         14m

metrics-server is up and running:

Benjamins-MacBook-Pro:metrics-server benmaynard$ kubectl logs metrics-server-7fbd9b8589-64x86 --namespace=kube-system
I1127 08:32:30.665197       1 serving.go:273] Generated self-signed cert (apiserver.local.config/certificates/apiserver.crt, apiserver.local.config/certificates/apiserver.key)
[restful] 2018/11/27 08:32:32 log.go:33: [restful/swagger] listing is available at https://:443/swaggerapi
[restful] 2018/11/27 08:32:32 log.go:33: [restful/swagger] https://:443/swaggerui/ is mapped to folder /swagger-ui/
I1127 08:32:32.981732       1 serve.go:96] Serving securely on [::]:443

Another customer has reported similar things: https://www.digitalocean.com/community/questions/cannot-get-kubernetes-horizonal-pod-autoscaler-or-metrics-server-working

About this issue

  • State: closed
  • Created 6 years ago
  • Reactions: 5
  • Comments: 49 (4 by maintainers)

Most upvoted comments

The following works for me now: helm install --name metrics stable/metrics-server --namespace kube-system -f values.yaml

values.yaml:

args:
  - --logtostderr
  - --kubelet-preferred-address-types=InternalIP
  - --kubelet-insecure-tls
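
For anyone verifying the result, the usual sanity checks after giving it a minute or two to scrape would be along these lines (assuming the chart registered the v1beta1.metrics.k8s.io APIService):

kubectl get apiservice v1beta1.metrics.k8s.io   # Available should flip to True
kubectl top nodes                               # should print per-node CPU/memory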

I had the same problem. Here is how I solved it, using the metrics-server chart from the Bitnami repository.

helm repo add bitnami https://charts.bitnami.com/bitnami
helm template metrics-server bitnami/metrics-server --values metrics-server.yaml -n kube-system
#metrics-server.yaml
apiService:
  create: true # this solves the permission problem

extraArgs:
  kubelet-preferred-address-types: InternalIP
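
Note that helm template only renders the manifests without installing anything, so the output presumably still has to be applied; a sketch of that missing step (alternatively, swap helm template for helm install with the same arguments):

helm template metrics-server bitnami/metrics-server --values metrics-server.yaml -n kube-system \
  | kubectl apply -n kube-system -f -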

@maxwedwards totally agree that basic metrics collection in a secure way is a must-have. Apologies if this came across as “we don’t care about this”, we genuinely do. What I was trying to express is that CCM is likely not the place where the fix should (maybe even can) happen: the project’s primary purpose is to implement the cloud provider interface that is defined by upstream Kubernetes. I checked again but don’t see a way to hook into host name resolution. We could presumably hack it into the project, but it might not be the right place to do so.

Regardless, I have filed an internal ticket to track progress on the matter and did some initial investigation myself that confirms your findings. Will keep you posted in this issue since it has become the place to go for most users who run into the problem.

By the way, we also have something else in the pipeline at DO to improve on the observability front, which we intend to release in the not-too-distant future. It only partially touches on the subject discussed here, though; proper metrics-server integration is still a given to support the features built on top of it (like autoscaling).

Sorry again in case I have sent the wrong message. Appreciate the feedback!

Adding these flags to the metrics-server deployment manifest works.

      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        imagePullPolicy: Always
        command:
        - /metrics-server
        - --logtostderr
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls

        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
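
If you’d rather not edit the manifest by hand, the same change can be made with a JSON patch; a sketch that assumes the stock deployment name metrics-server in the kube-system namespace:

kubectl -n kube-system patch deployment metrics-server --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/containers/0/command", "value": [
    "/metrics-server",
    "--logtostderr",
    "--kubelet-preferred-address-types=InternalIP",
    "--kubelet-insecure-tls"
  ]}
]'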

1.19.3-do.2 was just released and should fix the problem. Please report back if that’s not the case.

Sorry for the inconvenience!

I just ran tests with all versions of DOKS currently supported (1.11.5-do.2, 1.12.3-do.2, and 1.13.1-do.2 as of this writing). In each case, I was able to read metrics properly.

Here’s what I did (mostly summarizing what was mentioned before in this issue):

  1. Apply the 1.8+ manifests of metrics-server, adding the extra flags suggested by @dwdraju above
  2. Run an HPA-managed deployment of Nginx like this: kubectl run --image=nginx nginx --requests=cpu=200m && kubectl autoscale deploy nginx --min=1 --max=10 --cpu-percent=80 (please take note that you need to specify CPU requests for HPA to take action)
  3. Wait for metrics to be collected and finally see the target numbers in kubectl get hpa nginx populated (or kubectl top node for a more basic check unrelated to HPA)

The waiting part in the last step is important: It takes 1-2 minutes for metrics to show up.
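
To actually watch the HPA scale up rather than just report a target, the nginx deployment can be exposed and put under load; the service and load-generator names below are illustrative:

kubectl expose deploy nginx --port=80
kubectl run load-gen --rm -it --restart=Never --image=busybox -- \
  /bin/sh -c "while true; do wget -q -O /dev/null http://nginx; done"
kubectl get hpa nginx --watch   # replicas should climb once CPU passes 80%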

FWIW, I created my Kubernetes clusters in the FRA1 region.

Is anyone not able to reproduce a successful setup with my steps outlined above?

@timoreimann you’re welcome, I checked my cluster’s status page a few minutes before you posted and started the upgrade. Was kinda hoping to beat you to it, but I’m more glad that this is fixed 🎉

@WyriHaximus thanks for confirming! 💙

@timoreimann just started the upgrade and can confirm that this is now fixed

@Simwar and others: we just opened up a new repository to track more general feature requests and bug reports related to DOKS (but not specific to any one of our other repos, like this one). I created digitalocean/DOKS#2 to address the issue around metrics-server not being supported with TLS on DOKS.

Please continue discussions on the new issue. Thanks!

Sorry folks, we’re working through this, it’s been a bit busy with KubeCon coming up. Hoping to have more details soon (by next week maybe?) 😃