kubernetes: Creation of large aws-ebs volume with xfs file system fails sporadically

What happened: Creating large aws-ebs volumes (~2.6 TB) with an xfs file system fails sporadically. As a consequence, mounting the volume into the pod fails:

Events:
  Type     Reason       Age                    From                                                    Message
  ----     ------       ----                   ----                                                    -------
  Warning  FailedMount  11m (x249 over 7h35m)  kubelet, ip-10-42-66-162.eu-central-1.compute.internal  (combined from similar events): MountVolume.MountDevice failed for volume "pvc-14ceff2e-401b-48eb-9d0d-ea3a2da8f9a5" : failed to mount the volume as "xfs", it already contains unknown data, probably partitions. Mount error: mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/eu-central-1a/vol-01769a536e60bc232 --scope -- mount -t xfs -o defaults /dev/xvdbj /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/eu-central-1a/vol-01769a536e60bc232
Output: Running scope as unit: run-r2c28bced23c548059e73186ac213f30a.scope
mount: /var/lib/kubelet/plugins/kubernetes.io/aws-ebs/mounts/aws/eu-central-1a/vol-01769a536e60bc232: wrong fs type, bad option, bad superblock on /dev/xvdbj, missing codepage or helper program, or other error.
  Warning  FailedMount  2m9s (x169 over 7h35m)  kubelet, ip-10-42-66-162.eu-central-1.compute.internal  Unable to mount volumes for pod "41c4e262-72f2-426d-a4c7-48eb8295bf56-574b5bdfbb-pcmsd_default(729ff4c5-932e-498e-9a12-4b9c9054f469)": timeout expired waiting for volumes to attach or mount for pod "default"/"41c4e262-72f2-426d-a4c7-48eb8295bf56-574b5bdfbb-pcmsd". list of unmounted volumes=[storage-volume]. list of unattached volumes=[config-volume storage-volume backint-volume hana-ssl-secret]
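
A quick way to check what the kubelet's filesystem probe sees is to run the standard block-device tools on the affected node; a diagnostic sketch, assuming the device name /dev/xvdbj from the mount arguments above:

# On the affected node: a healthy volume reports an XFS signature,
# a broken one reports no (or a bogus) filesystem type.
lsblk -f /dev/xvdbj        # filesystem type as seen by util-linux
blkid -p /dev/xvdbj        # low-level superblock/partition-table probe
file -s /dev/xvdbj         # independent read of the first blocks
xfs_repair -n /dev/xvdbj   # read-only check of the XFS superblocks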

Output of parted for a volume where file system creation succeeded:

(parted) select /dev/xvdbl                                                
 Using /dev/xvdbl
(parted) p                                                                
 Model: Xen Virtual Block Device (xvd)
 Disk /dev/xvdbl: 2792GB
 Sector size (logical/physical): 512B/512B
 Partition Table: loop
 Disk Flags:
Number  Start  End     Size    File system  Flags
 1      0.00B  2792GB  2792GB  xfs

Output of parted for a volume with an invalid file system:

(parted) select /dev/xvdbt                                                
 Using /dev/xvdbt
 (parted) p                                                                
 Error: /dev/xvdbt: unrecognised disk label
 Model: Xen Virtual Block Device (xvd)                                     
 Disk /dev/xvdbt: 2792GB
 Sector size (logical/physical): 512B/512B
 Partition Table: unknown
 Disk Flags: 
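
When a freshly provisioned volume ends up in this state, one possible manual recovery (a sketch, assuming the volume holds no data yet; wiping signatures is destructive) is to recreate the file system on the node and let the kubelet retry the mount:

# Destructive: only for a newly provisioned, still-empty volume.
wipefs -a /dev/xvdbt      # remove any stale filesystem/partition signatures
mkfs.xfs -f /dev/xvdbt    # recreate the XFS file system, then let the mount retry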

What you expected to happen: Volume creation should result in a volume with a file system that can be mounted. If file system creation fails, volume provisioning should fail as well.

How to reproduce it (as minimally and precisely as possible):

  1. Create storage class
kind: StorageClass
metadata:
  labels:
    name: sample
  name: sample
  selfLink: /apis/storage.k8s.io/v1/storageclasses/sample
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
parameters:
  encrypted: "true"
  fsType: xfs
  type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
  2. Create PVC (the pv.kubernetes.io/* and volume.kubernetes.io/* annotations and volumeName below were populated by the control plane on the original claim and can be omitted when creating a fresh one)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
    volume.kubernetes.io/selected-node: ip-10-XXX-XXX-XXX.eu-central-1.compute.internal
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/component: Sample
  name: sample
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2600Gi
  storageClassName: sample
  volumeMode: Filesystem
  volumeName: pvc-5558dce7-9070-4b43-b2d9-3b59d4680a22
  3. Create pod (a minimal manifest; the image is a placeholder, any workload that uses the mount will do)
apiVersion: v1
kind: Pod
metadata:
  name: sample
spec:
  containers:
  - name: sample
    image: <your-image>  # placeholder image
    volumeMounts:
    - mountPath: /sample/mounts
      name: storage-volume
  volumes:
  - name: storage-volume
    persistentVolumeClaim:
      claimName: sample
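
To verify the reproduction (file names are illustrative; object names match the manifests above):

# Apply the manifests from the steps above.
kubectl apply -f storageclass.yaml -f pvc.yaml -f pod.yaml
# The PVC only binds once the pod is scheduled (WaitForFirstConsumer).
kubectl get pvc sample
# On a bad volume the pod stays in ContainerCreating with FailedMount events.
kubectl describe pod sample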

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.6", GitCommit:"7015f71e75f670eb9e7ebd4b5749639d42e20079", GitTreeState:"clean", BuildDate:"2019-11-13T11:11:50Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: self-managed cluster on AWS
  • OS (e.g: cat /etc/os-release):
NAME="SLES"
VERSION="15-SP1"
VERSION_ID="15.1"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp1"
  • Kernel (e.g. uname -a): 4.12.14-197.18-default #1 SMP Tue Sep 17 14:26:49 UTC 2019 (d75059b) x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

About this issue

  • State: closed
  • Created 5 years ago
  • Comments: 31 (9 by maintainers)

Most upvoted comments

I’m leaving this info here for future users. I’ve hit this problem multiple times within us-east-1 specifically. This occurs on encrypted EBS volumes only.

From the AWS EBS support team:

“This behavior is observed when an application reads unwritten blocks on an encrypted EBS volume. These unwritten blocks return random data. If the state of these blocks is important to the application, we would recommend writing to those unwritten blocks. Additionally, we are also in the process of evaluating whether we should change our behavior in the light of your feedback.”
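
Taking that advice literally, one mitigation sketch (my own reading, not an AWS-provided procedure) is to write zeros over the start of the raw device, where most file-system signature probes read, before the volume is formatted and mounted:

# Zero the first 16 MiB (an arbitrary choice) so signature probes read
# deterministic data instead of random bytes from unwritten encrypted blocks.
dd if=/dev/zero of=/dev/xvdbt bs=1M count=16 conv=fsync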

We see this issue randomly when provisioning >1TB GP2 volumes. Our current workaround has been to delete the PVC and let Kubernetes recreate the PV; this has worked for our use case so far. I think the choice of operating system also plays a part: Amazon Linux doesn’t seem to hit this, while Flatcar and Ubuntu both report data being present on the volume.
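
For reference, that workaround boils down to something like the following (assumes reclaimPolicy: Delete as in the storage class above, so deleting the claim also removes the bad EBS volume):

# Delete the pod first so the pvc-protection finalizer releases the claim.
kubectl delete pod sample
kubectl delete pvc sample
# Recreate the claim and pod; a new volume is provisioned and formatted.
kubectl apply -f pvc.yaml -f pod.yaml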

EBS has deployed a fix for the latest-generation Nitro instances, so that unwritten blocks on an encrypted EBS volume no longer return random data. We are actively working to deploy the fix on Xen instances later this year.

Please let us know if you are still running into this issue on Nitro instances.

The fix on Nitro was deployed on July 6th.