kubernetes: StatefulSet Pod stuck in pending waiting for volumes to attach/mount (AWS)
BUG REPORT
Kubernetes version (use kubectl version): v1.5.1 (hyperkube v1.5.1_coreos.0)
Environment:
- Cloud provider or hardware configuration: AWS (private subnets across 3 AZs)
- OS (e.g. from /etc/os-release): CoreOS stable
- Kernel (e.g. uname -a): 4.7.3-coreos-r2
- Install tools: Cloudinit + Hyperkube
- Others:
What happened: After creating a storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
and a StatefulSet:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: consul
  namespace: core
spec:
  serviceName: consul
  replicas: 3
  template:
    metadata:
      labels:
        app: consul
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - name: consul
        image: consul:0.7.2
        ports:
        - containerPort: 8500
        - containerPort: 8600
        volumeMounts:
        - name: consul-data
          mountPath: /var/lib/consul
  volumeClaimTemplates:
  - metadata:
      name: consul-data
      annotations:
        volume.alpha.kubernetes.io/storage-class: ssd
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
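For completeness, the created objects can be confirmed after applying the manifests above. This is a hedged sketch, not part of the original report; the `kn` command used in the outputs further down appears to be a `kubectl --namespace core` alias, but that is an assumption, so plain kubectl with an explicit namespace is shown here.
# Hedged sanity checks (assumed commands, not from the original report)
kubectl get storageclass ssd
kubectl --namespace core get statefulset consul
kubectl --namespace core get pvc                  # claims should bind to dynamically provisioned PVs
kubectl --namespace core get pods -l app=consul -w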
All three requested EBS volumes are created and the first pod is scheduled. The volume is attached to the instance, but then nothing else happens: the mount eventually times out and the pod is stuck in Pending:
λ kn describe pod consul-0
Name:           consul-0
Namespace:      core
Node:           ip-172-20-12-10.eu-west-1.compute.internal/172.20.12.10
Start Time:     Tue, 20 Dec 2016 17:12:00 +0000
Labels:         app=consul
Status:         Pending
IP:
Controllers:    StatefulSet/consul
Containers:
  consul:
    Container ID:
    Image:              consul:0.7.2
    Image ID:
    Ports:              8500/TCP, 8600/TCP
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Volume Mounts:
      /var/lib/consul from consul-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-vtb0f (ro)
    Environment Variables:      <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  consul-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  consul-data-consul-0
    ReadOnly:   false
  default-token-vtb0f:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-vtb0f
QoS Class:      BestEffort
Tolerations:    <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
2m 2m 1 {default-scheduler } Normal Scheduled Successfully assigned consul-0 to ip-172-20-12-10.eu-west-1.compute.internal
48s 48s 1 {kubelet ip-172-20-12-10.eu-west-1.compute.internal} Warning FailedMount Unable to mount volumes for pod "consul-0_core(6ba1ec70-c6d7-11e6-a1f4-0622fbf400d9)": timeout expired waiting for volumes to attach/mount for pod "consul-0"/"core". list of unattached/unmounted volumes=[consul-data]
48s 48s 1 {kubelet ip-172-20-12-10.eu-west-1.compute.internal} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "consul-0"/"core". list of unattached/unmounted volumes=[consul-data]
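The next place worth looking is the kubelet on the node that received the pod. The following is a hedged sketch, assuming the kubelet runs as a systemd unit named kubelet.service on the CoreOS node, which is common for hyperkube-on-CoreOS installs but is not confirmed by this report:
# Hedged diagnostic (assumed unit name, not from the original report)
ssh core@172.20.12.10
journalctl -u kubelet.service --since "20 min ago" | grep -iE "consul-data|xvdba|attach|mount"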
What you expected to happen: The mount should have completed and the pod started.
How to reproduce it (as minimally and precisely as possible): Apply the manifests above.
Anything else we need to know:
➜ kn get pv
NAME                                       CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                        REASON    AGE
pvc-240c5146-c6cf-11e6-a1f4-0622fbf400d9   1Gi        RWO           Delete          Bound     core/consul-data-consul-0              1h
pvc-240d7caa-c6cf-11e6-a1f4-0622fbf400d9   1Gi        RWO           Delete          Bound     core/consul-data-consul-1              1h
pvc-240e4cd0-c6cf-11e6-a1f4-0622fbf400d9   1Gi        RWO           Delete          Bound     core/consul-data-consul-2              1h
➜ kn get pvc
NAME                   STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
consul-data-consul-0   Bound     pvc-240c5146-c6cf-11e6-a1f4-0622fbf400d9   1Gi        RWO           1h
consul-data-consul-1   Bound     pvc-240d7caa-c6cf-11e6-a1f4-0622fbf400d9   1Gi        RWO           1h
consul-data-consul-2   Bound     pvc-240e4cd0-c6cf-11e6-a1f4-0622fbf400d9   1Gi        RWO           1h
➜ kn describe pv pvc-240c5146-c6cf-11e6-a1f4-0622fbf400d9
Name:           pvc-240c5146-c6cf-11e6-a1f4-0622fbf400d9
Labels:         failure-domain.beta.kubernetes.io/region=eu-west-1
                failure-domain.beta.kubernetes.io/zone=eu-west-1c
StorageClass:
Status:         Bound
Claim:          core/consul-data-consul-0
Reclaim Policy: Delete
Access Modes:   RWO
Capacity:       1Gi
Message:
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://eu-west-1c/vol-06a263cc5dbe35d74
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
No events.
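Note that this PV was provisioned in eu-west-1c. Since an EBS volume can only be attached to an instance in its own availability zone, a hedged cross-check (not part of the original report) is to compare the zone label of the node the pod was scheduled on:
# Hedged cross-check (assumed command, not from the original report)
kubectl get node ip-172-20-12-10.eu-west-1.compute.internal \
  -L failure-domain.beta.kubernetes.io/zone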
➜ aws ec2 describe-volumes --volume-ids vol-06a263cc5dbe35d74
{
    "Volumes": [
        {
            "VolumeId": "vol-06a263cc5dbe35d74",
            "Size": 1,
            "CreateTime": "2016-12-20T16:12:45.206Z",
            "State": "in-use",
            "Iops": 100,
            "Encrypted": false,
            "VolumeType": "gp2",
            "Tags": [
                {
                    "Key": "KubernetesCluster",
                    "Value": "eu-west-1.kube.usw.co"
                },
                {
                    "Key": "Name",
                    "Value": "kubernetes-dynamic-pvc-240c5146-c6cf-11e6-a1f4-0622fbf400d9"
                },
                {
                    "Key": "kubernetes.io/created-for/pv/name",
                    "Value": "pvc-240c5146-c6cf-11e6-a1f4-0622fbf400d9"
                },
                {
                    "Key": "kubernetes.io/created-for/pvc/name",
                    "Value": "consul-data-consul-0"
                },
                {
                    "Key": "kubernetes.io/created-for/pvc/namespace",
                    "Value": "core"
                }
            ],
            "Attachments": [
                {
                    "VolumeId": "vol-06a263cc5dbe35d74",
                    "State": "attached",
                    "Device": "/dev/xvdba",
                    "InstanceId": "i-0baea1a1746a216d7",
                    "DeleteOnTermination": false,
                    "AttachTime": "2016-12-20T16:12:48.000Z"
                }
            ],
            "SnapshotId": "",
            "AvailabilityZone": "eu-west-1c"
        }
    ]
}
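So AWS reports the volume as in-use and attached to i-0baea1a1746a216d7 as /dev/xvdba, yet the kubelet still times out waiting for attach/mount. A hedged cross-check (not part of the original report) is to confirm that this instance is in fact the node the pod was scheduled on, and that the block device is visible there:
# Hedged cross-check (assumed commands, not from the original report)
aws ec2 describe-instances --instance-ids i-0baea1a1746a216d7 \
  --query 'Reservations[].Instances[].PrivateDnsName'
ssh core@172.20.12.10 lsblk    # xvdba should be listed if the attach really completed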
About this issue
- State: closed
- Created 8 years ago
- Comments: 22 (9 by maintainers)
Still having the same issue on k8s v1.9.