kubernetes: PVC usage metrics incorrect in /stats/summary kubelet endpoint for Windows Azure Files

What happened?

Curling https://<Node-IP>:10250/stats/summary for a Windows node returns the wrong usage and capacity bytes for an azurefile-csi PVC used by a Windows pod on that node:

 "pods": [
  {
   "podRef": {
    "name": "win-webserver-5d5d4966f5-zrcf6",
    "namespace": "default",
    "uid": "4b2a7fec-26d4-482c-adc8-2c25baa1d054"
   },
  ...
   "volume": [
    {
     "time": "2022-05-26T23:37:24Z",
     "availableBytes": 91617021952,
     "capacityBytes": 136912564224,
     "usedBytes": 45295542272,
     "inodesFree": 0,
     "inodes": 0,
     "inodesUsed": 0,
     "name": "volume",
     "pvcRef": {
      "name": "azure-file-win",
      "namespace": "default"
     }
    }
   ]
  },

These values look like the usage of the node's own disk, or of some other filesystem, rather than of the file share. azurefile-csi PVCs used by Linux pods report the correct usage.

What did you expect to happen?

The PVC is empty, so usedBytes should be 0 and capacityBytes should correspond to the requested 2Gi.
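
As a quick sanity check of the numbers, here is a minimal Python sketch that converts the requested 2Gi to bytes and compares it with the values reported above. It assumes Gi means 2^30 bytes, as in Kubernetes quantity notation; the helper name gib_to_bytes is hypothetical and only used for illustration.

```python
# Values copied from the /stats/summary excerpt in this report.
REPORTED = {
    "availableBytes": 91617021952,
    "capacityBytes": 136912564224,
    "usedBytes": 45295542272,
}

GIB = 1024 ** 3  # one Gi expressed in bytes


def gib_to_bytes(gib: int) -> int:
    """Convert a whole number of Gi to bytes (hypothetical helper)."""
    return gib * GIB


# The PVC in the repro manifest requests 2Gi.
expected_capacity = gib_to_bytes(2)
reported_capacity_gib = REPORTED["capacityBytes"] / GIB

# Check whether available + used adds up to the reported capacity,
# i.e. whether the three numbers describe one and the same filesystem.
self_consistent = (
    REPORTED["availableBytes"] + REPORTED["usedBytes"]
    == REPORTED["capacityBytes"]
)

print(f"expected capacityBytes: {expected_capacity}")
print(f"reported capacityBytes: {REPORTED['capacityBytes']} "
      f"(about {reported_capacity_gib:.1f} Gi)")
print(f"available + used == capacity: {self_consistent}")
```

The reported capacity is roughly 127.5 Gi, far more than the 2Gi share, and the three values add up exactly, which fits the suspicion that they describe a different filesystem than the one backing the PVC.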

How can we reproduce it (as minimally and precisely as possible)?

Apply the following to create an azurefile PVC and a Windows pod that uses it:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-file-win
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: azurefile
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  name: win-webserver
  labels:
    app: win-webserver
spec:
  ports:
    # the port that this service should serve on
  - port: 80
    targetPort: 80
  selector:
    app: win-webserver
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: win-webserver
  name: win-webserver
spec:
  selector:
    matchLabels:
      app: win-webserver
  replicas: 1
  template:
    metadata:
      labels:
        app: win-webserver
      name: win-webserver
    spec:
      containers:
      - name: windowswebserver
        image: mcr.microsoft.com/windows/servercore:ltsc2019
        imagePullPolicy: IfNotPresent
        command:
        - powershell.exe
        - -command
        - $listener = New-Object System.Net.HttpListener; $listener.Prefixes.Add('http://*:80/'); $listener.Start();Write-Host('Listening at http://*:80/'); while ($listener.IsListening) { $context = $listener.GetContext(); $response = $context.Response; $content='<html><body><H1>Windows Container Web Server</H1></body></html>'; $buffer = [System.Text.Encoding]::UTF8.GetBytes($content); $response.ContentLength64 = $buffer.Length; $response.OutputStream.Write($buffer, 0, $buffer.Length); $response.Close(); };
        volumeMounts:
        - mountPath: "C:\\Data"
          name: volume
      volumes:
      - name: volume
        persistentVolumeClaim:
          claimName: azure-file-win
      nodeSelector:
        beta.kubernetes.io/os: windows

Then run the following command, where <NODE_IP> is the IP of the Windows node the pod is running on, and view the output for the pod:

curl -s -k -H "Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" https://<NODE_IP>:10250/stats/summary
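
The full response can be long, so a small filter may help when looking for the affected volume. The following Python sketch assumes only the JSON shape visible in the excerpt earlier in this report (pods, podRef, volume, pvcRef); the function name pvc_volume_stats and the trimmed SAMPLE payload are illustrative and not part of any Kubernetes client.

```python
import json


def pvc_volume_stats(summary: dict) -> list:
    """Return one row per PVC-backed volume in a parsed /stats/summary document."""
    rows = []
    for pod in summary.get("pods", []):
        pod_ref = pod.get("podRef", {})
        for vol in pod.get("volume", []):
            pvc = vol.get("pvcRef")
            if pvc is None:
                continue  # skip volumes that are not backed by a PVC
            rows.append({
                "pod": f"{pod_ref.get('namespace')}/{pod_ref.get('name')}",
                "pvc": f"{pvc.get('namespace')}/{pvc.get('name')}",
                "usedBytes": vol.get("usedBytes"),
                "capacityBytes": vol.get("capacityBytes"),
            })
    return rows


# Sample payload trimmed from the output shown in this report.
SAMPLE = json.loads("""
{
  "pods": [
    {
      "podRef": {
        "name": "win-webserver-5d5d4966f5-zrcf6",
        "namespace": "default",
        "uid": "4b2a7fec-26d4-482c-adc8-2c25baa1d054"
      },
      "volume": [
        {
          "time": "2022-05-26T23:37:24Z",
          "availableBytes": 91617021952,
          "capacityBytes": 136912564224,
          "usedBytes": 45295542272,
          "name": "volume",
          "pvcRef": {"name": "azure-file-win", "namespace": "default"}
        }
      ]
    }
  ]
}
""")

for row in pvc_volume_stats(SAMPLE):
    print(row)
```

In practice the curl output could be saved to a file and loaded with json.load instead of the embedded sample.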

Anything else we need to know?

No response

Kubernetes version

1.21.7

Cloud provider

Azure

OS version

Caption: Microsoft Windows Server 2019 Datacenter Version: 10.0.17763 BuildNumber:17763 OSArchitecture: 64-bit

Install tools

No response

Container runtime (CRI) and version (if applicable)

Docker

Related plugins (CNI, CSI, …) and versions (if applicable)

AzureFile CSI Driver version: 1.17.0

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 2
  • Comments: 28 (20 by maintainers)

Most upvoted comments

@andyzhangx thank you very much for the information provided here. Is this already on the roadmap, and is there an ETA for the fix?

Thank you so much.

@lualvare it mainly depends on when the csi-proxy removal is completed: https://github.com/kubernetes-csi/csi-proxy/issues/217. We are not publishing new versions of csi-proxy now, so this kind of change will be easier once that work is done. There is no clear ETA yet.