kubernetes: GKE 1.6 kubelet /metrics endpoint unauthorized over https
BUG REPORT
After upgrading a GKE cluster from 1.5.6 to 1.6.0, Prometheus stopped being able to scrape the node /metrics endpoint due to a 401 Unauthorized error.
This is likely due to RBAC being enabled. In order to give Prometheus access to the node metrics I added the following ClusterRole and ClusterRoleBinding and created a dedicated service account that is used by the pod.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""] # "" indicates the core API group
  resources:
  - nodes
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: monitoring
Although the mounted token is now the one for the prometheus service account (verified at https://jwt.io/), it can't access the node metrics (they're served by the kubelet, right?).
If I execute the following command it returns a 401 Unauthorized:
KUBE_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://<node ip>:10250/metrics
Any tips on how to get to the bottom of this and figure out what's needed to get it to work? I already went through the issue with the Prometheus contributors in https://github.com/prometheus/prometheus/issues/2606, but since the curl doesn't work either it's probably not a Prometheus issue.
Kubernetes version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:36:33Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Environment:
clusterIpv4Cidr: 10.248.0.0/14
createTime: '2016-11-14T19:26:49+00:00'
currentMasterVersion: 1.6.0
currentNodeCount: 14
currentNodeVersion: 1.6.0
endpoint: **REDACTED**
initialClusterVersion: 1.4.5
instanceGroupUrls:
- **REDACTED**
locations:
- europe-west1-c
loggingService: logging.googleapis.com
masterAuth:
  clientCertificate: **REDACTED**
  clientKey: **REDACTED**
  clusterCaCertificate: **REDACTED**
  password: **REDACTED**
  username: **REDACTED**
monitoringService: monitoring.googleapis.com
name: development-europe-west1-c
network: development
nodeConfig:
  diskSizeGb: 250
  imageType: COS
  machineType: n1-highmem-8
  oauthScopes:
  - https://www.googleapis.com/auth/compute
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/service.management
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  serviceAccount: default
nodeIpv4CidrSize: 24
What happened:
With a ClusterRole configured I would expect to be able to scrape the /metrics endpoint on each node, but this fails with 401 Unauthorized.
What you expected to happen:
The service account token, with the appropriate ClusterRole bound to it, to give access to the /metrics endpoint.
How to reproduce it (as minimally and precisely as possible):
- create a namespace, serviceaccount, clusterrole, clusterrolebinding and a deployment with the linked serviceaccount (a minimal sketch of these manifests is shown below the steps)
- get the IP of one of the nodes
- run
KUBE_TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token) and curl -sSk -H "Authorization: Bearer $KUBE_TOKEN" https://<node ip>:10250/metrics from the container in your deployment
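For reference, a minimal sketch of the manifests used in the reproduction (the names match the RBAC objects above; the image and labels are illustrative placeholders):

apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitoring
---
apiVersion: extensions/v1beta1   # Deployment API group available in 1.6
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus   # mounts this service account's token into the pod
      containers:
      - name: prometheus
        image: prom/prometheus   # illustrative image

The ClusterRole and ClusterRoleBinding shown at the top are applied alongside these.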
Anything else we need to know:
This failed with the default service account as well, whereas I initially thought GKE would still be very liberal with its access control settings.
About this issue
- State: closed
- Created 7 years ago
- Reactions: 8
- Comments: 16 (9 by maintainers)
Links to this issue
Commits related to this issue
- very important line letting prom read kubelet /metrics https://github.com/kubernetes/kubernetes/issues/44330#issuecomment-293729622 — committed to drewp/prometheus by drewp 3 years ago
Querying the same endpoint over http to port 10255 actually works. Any idea why there’s a difference?
Could the cause be similar to https://github.com/coreos/coreos-kubernetes/issues/714 ?
Ahhhh, ya that’s not going to work. We don’t plan on enabling token review API in GKE. You can either configure prometheus to pull metrics by hitting the apiserver proxy directly or you can create a client certificate using the certificates API for prometheus to use when contacting kubelets.
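For the apiserver-proxy route mentioned there, a scrape config along the following lines is one common approach (a sketch, assuming the in-cluster service account token and CA certificate are mounted at their default paths; the job name is illustrative):

- job_name: kubernetes-nodes
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  # copy node labels onto the target
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  # send every scrape to the apiserver instead of the node itself
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  # rewrite the path so the apiserver proxies through to the kubelet's /metrics
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics

With this, authentication and authorization happen at the apiserver, so what matters is the RBAC granted to the prometheus service account (including get on the nodes/proxy subresource) rather than the kubelet's own authentication settings.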
GKE doesn’t enable service account token authentication to the kubelet
cc @mikedanese @cjcullen
Take a look at the following parameter in the kubelet exporter: https://github.com/coreos/prometheus-operator/blob/master/helm/exporter-kubelets/values.yaml#L2 Hope it helps
If GKE is using the GCE cluster up scripts, it isn’t enabling service account token authentication:
https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/gci/configure-helper.sh#L699
To authenticate to the kubelet with API tokens, these steps would be needed (from https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authentication):
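(The exact steps from that page aren't quoted above. Roughly, they boil down to enabling webhook token authentication, and typically webhook authorization, on the kubelet, which isn't something you can configure yourself on GKE. Purely as a hedged illustration, on a cluster where you control the kubelet and it reads a config file, the equivalent settings would look something like this; the path is a placeholder:)

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false                 # reject requests with no credentials
  webhook:
    enabled: true                  # verify bearer tokens via the TokenReview API
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt   # placeholder path for the client CA
authorization:
  mode: Webhook                    # authorize requests via SubjectAccessReview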
I’m fairly certain we do…
In your ClusterRole I think the nonResourceURLs: ["/metrics"] rule should be a resource rule, like this: https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/rbac/kubelet-api-admin-role.yaml#L16
Your nonResourceURLs doesn't make sense.
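For comparison, a minimal sketch of that suggestion, granting the kubelet metrics as node subresources instead of via nonResourceURLs (the role name here is illustrative; bind it to the prometheus service account like the binding above):

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: prometheus-kubelet-api   # illustrative name
rules:
- apiGroups: [""]
  resources:
  - nodes/metrics   # what the kubelet's webhook authorizer checks for /metrics
  - nodes/proxy     # needed when scraping via /api/v1/nodes/<node>/proxy/metrics
  verbs: ["get"]

As noted in the earlier comments, on GKE 1.6 this alone doesn't make token-authenticated scrapes of port 10250 work, since the kubelet there doesn't validate service account tokens; it is relevant for the apiserver-proxy approach or for clusters with kubelet webhook authentication enabled.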