kubernetes: HPA "invalid label value" for an external metric matchLabels value
/kind bug /sig scalability
What happened: I am on GKE and I use the GCE ingress to expose my application with the HTTPS Global Load Balancer. To autoscale the app, I am using an HPA that reacts to a Stackdriver external metric (loadbalancing.googleapis.com|https|request_count) exposed by the k8s-stackdriver controller.
My problem occurs when I want to use a metricSelector based on the matched_url_path_rule field, which can contain "/" characters. kubectl apply succeeds, but the HPA won't detect the metric. When I try to edit the HPA, I see the following error:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-08-27T00:40:37Z","reason":"SucceededGetScale","message":"the
      HPA controller was able to get the target''s current scale"},{"type":"ScalingActive","status":"False","lastTransitionTime":"2019-08-27T00:40:38Z","reason":"FailedGetExternalMetric","message":"the
      HPA was unable to compute the replica count: invalid label value: \"/api/\":
      a valid label must be an empty string or consist of alphanumeric characters,
      ''-'', ''_'' or ''.'', and must start and end with an alphanumeric character
      (e.g. ''MyValue'', or ''my_value'', or ''12345'', regex used for validation
      is ''(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?'')"}]'
[...]
And yet, the field contains ‘/api/’ when checking it with this command:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/loadbalancing.googleapis.com|https|request_count" | jq | grep resource.labels.matched_url_path_rule
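The same check can be done without jq and grep. Below is a minimal Python sketch, assuming the response has the usual ExternalMetricValueList shape (an "items" list whose entries carry a "metricLabels" map); the helper name and the sample payload are hypothetical and only illustrate the idea.

```python
import json


def matched_url_path_rules(raw_json: str) -> set:
    """Collect the distinct matched_url_path_rule label values from a response."""
    payload = json.loads(raw_json)
    rules = set()
    for item in payload.get("items", []):
        # Assumption: each item exposes its labels under "metricLabels".
        labels = item.get("metricLabels") or {}
        rule = labels.get("resource.labels.matched_url_path_rule")
        if rule is not None:
            rules.add(rule)
    return rules


# Hypothetical, trimmed-down payload for illustration only (not real output).
sample = json.dumps({
    "kind": "ExternalMetricValueList",
    "items": [
        {
            "metricName": "loadbalancing.googleapis.com|https|request_count",
            "metricLabels": {"resource.labels.matched_url_path_rule": "/api/"},
            "value": "0",
        },
    ],
})
print(matched_url_path_rules(sample))
```

In practice the input would be the output of the kubectl get --raw command above, saved to a file or piped in.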
The problem comes from the matchLabels validator. It validates metric label values as if they were ordinary Kubernetes label values, but I think external metrics should be allowed a wider set of characters. If not, I will open a bug against the k8s-stackdriver controller to work around this.
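To see why "/api/" is rejected, the check can be reproduced outside the cluster. This is a rough Python sketch, not the actual Kubernetes code: the regex is copied from the error message above, the 63-character limit is the standard label-value length limit, and the function name is made up for illustration.

```python
import re

# Regex copied verbatim from the error message in this report.
LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")
MAX_LABEL_VALUE_LENGTH = 63  # Kubernetes label values may be at most 63 characters


def is_valid_label_value(value: str) -> bool:
    """Return True if value would pass the label-value check quoted above."""
    if len(value) > MAX_LABEL_VALUE_LENGTH:
        return False
    # fullmatch: the whole value must match, not just a prefix.
    return LABEL_VALUE_RE.fullmatch(value) is not None


for candidate in ["/api/", "api", "my_value", ""]:
    print(repr(candidate), is_valid_label_value(candidate))
```

Any value that starts or ends with "/" fails, so no matched_url_path_rule that looks like a URL path can be used in matchLabels as long as this validation applies.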
What you expected to happen: The HPA to scale my app based on the RPS metric filtered by url path.
How to reproduce it (as minimally and precisely as possible): You will need to:
- Create a GKE cluster with the GCE ingress activated
- Install the stackdriver controller
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/prom-to-sd-v0.6.0/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml
- Create a deployment, a service and an ingress.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 1
  strategy:
  selector:
    matchLabels:
      app.kubernetes.io/name: test
  template:
    metadata:
      labels:
        app.kubernetes.io/name: test
    spec:
      containers:
      - name: test
        image: nginx
        imagePullPolicy: Always
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: test
  name: test
spec:
  ports:
  - name: http
    port: 80
  selector:
    app.kubernetes.io/name: test
  type: NodePort
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: gce
  name: test
spec:
  backend:
    serviceName: test
    servicePort: 80
  rules:
  - http:
      paths:
      - backend:
          serviceName: test
          servicePort: 80
        path: /api/*
- Create the HPA (replace the url map name with the one created by the ingress)
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: 'test'
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 20
  - external:
      metricName: loadbalancing.googleapis.com|https|request_count
      metricSelector:
        matchLabels:
          resource.labels.url_map_name: <url map name>
          resource.labels.matched_url_path_rule: "/api/"
      targetAverageValue: "40"
    type: External
- Send a bit of traffic to your app url to make the stackdriver metrics appear.
- You should get the error after a few minutes when editing the hpa.
kubectl edit hpa test
Anything else we need to know?:
Environment:
- Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:23:09Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.7-gke.8", GitCommit:"7d3d6f113e933ed1b44b78dff4baf649258415e5", GitTreeState:"clean", BuildDate:"2019-06-19T16:37:16Z", GoVersion:"go1.11.5b4", Compiler:"gc", Platform:"linux/amd64"}
- Cloud provider or hardware configuration: GCP (GKE)
- OS (e.g: cat /etc/os-release):
- Kernel (e.g. uname -a):
- Install tools:
- Network plugin and version (if this is a network-related bug):
- Others:
About this issue
- State: closed
- Created 5 years ago
- Reactions: 8
- Comments: 30 (8 by maintainers)
It would've been awesome if this issue got fixed. Right now I have to run an automated function that creates a custom monitoring metric for each 'matched_url_path_rule' value and copies the data over from the load balancer metric; I then use that custom metric as the target value. It's pretty crazy, but it works, albeit with a small delay and some workarounds for duplicate data caused by fast updates. I love stuff that works out of the box, though.
Edit: Here is the function… https://pduchnovsky.com/2021/05/kube-lb-rpm-scaling/