application-gateway-kubernetes-ingress: Failed to refresh the Token
Describe the bug: The ingress controller ran for almost 40 days without restarts or issues. Then, a few days ago, it started failing to refresh the token for no apparent reason. The resources around it, such as mic and nmi, did not change, and neither did the managed identity in the meantime. Note that the managed identity has the Reader role on the resource group where the gateway is located.
I haven't tried to recreate the pod, because I would like to find the root cause first. I have the same setup on the production cluster, and I am afraid the same thing could happen there and break my applications.
Does anyone have an idea what might have gone wrong, and where to look?
To Reproduce: Not sure how to reproduce.
Ingress Controller details
- Output of `kubectl describe pod <ingress controller>`
Name: fantastic-waterbuffalo-ingress-azure-66b968bbbc-wds6z
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: aks-agentpool-15375443-6/10.0.0.159
Start Time: Fri, 25 Oct 2019 14:02:06 +0200
Labels: aadpodidbinding=fantastic-waterbuffalo-ingress-azure
app=ingress-azure
pod-template-hash=66b968bbbc
release=fantastic-waterbuffalo
Annotations: <none>
Status: Running
IP: 10.0.0.172
Controlled By: ReplicaSet/fantastic-waterbuffalo-ingress-azure-66b968bbbc
Containers:
ingress-azure:
Container ID: docker://553e5262ac6537629f4a90aaf26b38648a8ba287938df454b044c31b84f7d820
Image: mcr.microsoft.com/azure-application-gateway/kubernetes-ingress:0.10.0-rc4
Image ID: docker-pullable://mcr.microsoft.com/azure-application-gateway/kubernetes-ingress@sha256:4579e970084e58ce84f85e783c2e57e2e38fbf22b4204076bc80f7e464475917
Port: <none>
Host Port: <none>
State: Running
Started: Fri, 25 Oct 2019 14:02:30 +0200
Ready: True
Restart Count: 0
Liveness: http-get http://:8123/health/alive delay=15s timeout=1s period=20s #success=1 #failure=3
Readiness: http-get http://:8123/health/ready delay=5s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
fantastic-waterbuffalo-cm-ingress-azure ConfigMap Optional: false
Environment:
AZURE_CONTEXT_LOCATION: /etc/appgw/azure.json
AGIC_POD_NAME: fantastic-waterbuffalo-ingress-azure-66b968bbbc-wds6z (v1:metadata.name)
AGIC_POD_NAMESPACE: default (v1:metadata.namespace)
KUBERNETES_PORT_443_TCP_ADDR: xxxxxxx
KUBERNETES_PORT: xxxxxx
KUBERNETES_PORT_443_TCP: xxxxx
KUBERNETES_SERVICE_HOST: xxxx
Mounts:
/etc/appgw/azure.json from azure (rw)
/var/run/secrets/kubernetes.io/serviceaccount from fantastic-waterbuffalo-sa-ingress-azure-token-5gwd7 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
azure:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/azure.json
HostPathType: File
fantastic-waterbuffalo-sa-ingress-azure-token-5gwd7:
Type: Secret (a volume populated by a Secret)
SecretName: fantastic-waterbuffalo-sa-ingress-azure-token-5gwd7
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
- Output of `kubectl logs <ingress controller>`
```
]
E1210 12:27:27.393881 1 worker.go:49] Error mutating AKS from k8s event. unable to get specified AppGateway (CTRL001)
E1210 12:27:27.529038 1 mutate_app_gateway.go:34] unable to get specified AppGateway [xxx-xx-xx], check AppGateway identifier, error=[azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/xxx-xx-xx/resourceGroups/xxx-xx-xx/providers/Microsoft.Network/applicationGateways/xxx-xx-xx?api-version=2019-06-01: StatusCode=403 -- Original Error: adal: Refresh request failed. Status Code = '403'. Response body: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
]
E1210 12:27:27.529097 1 worker.go:53] Error mutating App Gateway config from k8s event. unable to get specified AppGateway (CTRL001)
E1210 12:27:32.687335 1 mutate_app_gateway.go:34] unable to get specified AppGateway [xxx-xx-xx], check AppGateway identifier, error=[azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/xxx-xx-xx/resourceGroups/xxx-xx-xx/providers/Microsoft.Network/applicationGateways/xxx-xx-xx?api-version=2019-06-01: StatusCode=403 -- Original Error: adal: Refresh request failed. Status Code = '403'. Response body: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
]
E1210 12:27:32.687756 1 worker.go:49] Error mutating AKS from k8s event. unable to get specified AppGateway (CTRL001)
E1210 12:27:32.947537 1 mutate_app_gateway.go:34] unable to get specified AppGateway [xxx-xx-xx], check AppGateway identifier, error=[azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/xxx-xx-xx/resourceGroups/xxx-xx-xx/providers/Microsoft.Network/applicationGateways/xxx-xx-xx?api-version=2019-06-01: StatusCode=403 -- Original Error: adal: Refresh request failed. Status Code = '403'. Response body: adal: Refresh request failed. Status Code = '400'. Response body: {"error":"invalid_request","error_description":"Identity not found"}
]
E1210 12:27:32.947736 1 worker.go:53] Error mutating App Gateway config from k8s event. unable to get specified AppGateway (CTRL001)```
- Any Azure support tickets associated with this issue: none. Maybe related: https://github.com/Azure/application-gateway-kubernetes-ingress/issues/117
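The log lines above wrap several errors inside each other, which makes it easy to stop reading at the outer 403. A minimal Python sketch that pulls the status codes and the innermost response body out of one of those lines might look like this. The function name `summarize_refresh_error` is made up for illustration, and the regular expressions only assume the message layout seen in this particular log, not a documented format.

```python
import json
import re

# One log line copied from the output above (identifiers were already redacted).
LOG_LINE = (
    "E1210 12:27:27.529038 1 mutate_app_gateway.go:34] unable to get specified "
    "AppGateway [xxx-xx-xx], check AppGateway identifier, "
    "error=[azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token "
    "for request to https://management.azure.com/subscriptions/xxx-xx-xx/"
    "resourceGroups/xxx-xx-xx/providers/Microsoft.Network/applicationGateways/"
    "xxx-xx-xx?api-version=2019-06-01: StatusCode=403 -- Original Error: adal: "
    "Refresh request failed. Status Code = '403'. Response body: adal: "
    "Refresh request failed. Status Code = '400'. Response body: "
    '{"error":"invalid_request","error_description":"Identity not found"}'
)


def summarize_refresh_error(line: str) -> dict:
    """Extract status codes (outermost first) and the innermost JSON error body."""
    # Matches both "StatusCode=403" and "Status Code = '400'".
    codes = [int(c) for c in re.findall(r"Status\s?Code\s?=\s?'?(\d{3})'?", line)]
    # The innermost response body is the only JSON object in the line.
    match = re.search(r"\{.*\}", line)
    body = json.loads(match.group(0)) if match else {}
    return {
        "status_codes": codes,
        "error": body.get("error"),
        "error_description": body.get("error_description"),
    }


summary = summarize_refresh_error(LOG_LINE)
print(summary)
```

Read this way, the innermost failure is the 400 `invalid_request` / `Identity not found` returned while refreshing the token, which suggests the request never got as far as a role check against the Application Gateway resource.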
About this issue
- State: closed
- Created 5 years ago
- Comments: 16 (12 by maintainers)
Not a very satisfying resolution of the ticket… I still don't know what to do…
Hello, I'm having the same problem after re-creating my storage accounts. How can I fix the caching issues?
@aleksmark this looks like an issue in either AAD Pod identity or IMDS (instance metadata service) that is responsible for responding to token requests. I am investigating this further with the identity team. Will soon provide an update.
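One way to narrow down whether the token path itself is failing is to request a token by hand from a pod that carries the same `aadpodidbinding` label. The sketch below builds that request against the Azure instance metadata endpoint. It assumes the standard IMDS token URL, the `2018-02-01` API version and the `Metadata: true` header; the helper names `build_imds_token_request` and `fetch_token` are hypothetical, and `client_id` only applies if you want to name a specific user-assigned identity.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumed: the standard IMDS managed identity token endpoint.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_imds_token_request(resource="https://management.azure.com/",
                             client_id=None,
                             api_version="2018-02-01"):
    """Build (but do not send) a token request for the given resource."""
    params = {"api-version": api_version, "resource": resource}
    if client_id:
        # Optional: select a specific user-assigned identity.
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # IMDS rejects requests that do not carry this header.
    return urllib.request.Request(url, headers={"Metadata": "true"})


def fetch_token(request, timeout=5.0):
    """Send the request; return (status, parsed JSON) or (status, error text)."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # A 400 with "Identity not found" here would match the controller log.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `fetch_token(build_imds_token_request())` from inside such a pod should, as far as I understand the setup, go through nmi. If it also comes back with `Identity not found`, the problem is likely in the identity binding or assignment rather than in the ingress controller itself.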