rook: OSD upgrade failure from 1.0 to 1.1 with "managedFields" on
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: During the upgrade from 1.0.2 to 1.1.7, the mon/mgr deployments were upgraded successfully, but the OSD deployments failed. It seems the Rook upgrade does not handle the managedFields metadata present on the existing objects.
Expected behavior: OSD upgrade should succeed.
How to reproduce it (minimal and precise):
1. Deploy a Rook 1.0.2 cluster.
2. Upgrade the operator image from 1.0.2 to 1.1.7 (a rough sketch of this step is shown after the log below).
3. The OSD deployments fail to upgrade, with the failure below shown in the operator log. Note: the mon/mgr deployments are upgraded successfully.
2019-11-22 15:54:24.340219 I | op-osd: starting 1 osd daemons on node olive-1906-e-s-01
2019-11-22 15:54:24.369473 W | op-osd: failed to create osd deployment for node olive-1906-e-s-01, osd {1 /var/lib/rook/osd1 /var/lib/rook/osd1/rook-ceph.config ceph /var/lib/rook/osd1/keyring 15dcf396-b263-4145-90d7-4a1fe7ea188c false false true false}: failed to update object (Create for apps/v1, Kind=Deployment) managed fields: failed to create typed new object: .spec.template.spec.containers[name="osd"].env: duplicate entries for key [name="NODE_NAME"]
2019-11-22 15:54:24.377129 I | op-osd: osd orchestration status for node olive-1906-e-s-02 is completed
2019-11-22 15:54:24.377148 I | op-osd: starting 1 osd daemons on node olive-1906-e-s-02
2019-11-22 15:54:24.381468 W | op-osd: failed to create osd deployment for node olive-1906-e-s-02, osd {0 /var/lib/rook/osd0 /var/lib/rook/osd0/rook-ceph.config ceph /var/lib/rook/osd0/keyring bd8cdeb6-4419-422f-a5ec-2a49a35d1db4 false false true false}: failed to update object (Create for apps/v1, Kind=Deployment) managed fields: failed to create typed new object: .spec.template.spec.containers[name="osd"].env: duplicate entries for key [name="NODE_NAME"]
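For reference, the operator image bump in step 2 looks roughly like this. This is a minimal sketch assuming the default names from the Rook 1.0 manifests (operator deployment rook-ceph-operator, container of the same name, namespace rook-ceph-system); adjust them to your installation.

```
kubectl -n rook-ceph-system set image deploy/rook-ceph-operator \
  rook-ceph-operator=rook/ceph:v1.1.7
```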
# kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o jsonpath='{range .items[*]}{.metadata.name}{"\trook-version="}{.metadata.labels.rook-version}{"\n"}{end}'
rook-ceph-mgr-a rook-version=v1.1.7
rook-ceph-mon-a rook-version=v1.1.7
rook-ceph-mon-b rook-version=v1.1.7
rook-ceph-mon-c rook-version=v1.1.7
rook-ceph-osd-0 rook-version=v1.0.2
rook-ceph-osd-1 rook-version=v1.0.2
rook-ceph-osd-2 rook-version=v1.0.2
rook-ceph-osd-3 rook-version=v1.0.2
rook-ceph-osd-4 rook-version=v1.0.2
rook-ceph-rgw-rook-ceph-store rook-version=v1.0.2
4. The failed OSD deployments have managedFields included in their metadata (excerpt below; a sketch for checking this across deployments follows it). In a different k8s cluster, OSDs without managedFields upgrade successfully.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
  creationTimestamp: "2019-11-22T07:15:38Z"
  generation: 1
  labels:
    app: rook-ceph-osd
    ceph-osd-id: "0"
    ceph-version: 14.2.1
    rook-version: v1.0.2
    rook_cluster: rook-ceph
  managedFields:
  - apiVersion: apps/v1
    fields:
      f:metadata:
        f:annotations:
          .: null
          f:deployment.kubernetes.io/revision: null
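A quick way to see which deployments carry managedFields (a rough sketch, assuming jq is available; adjust the namespace and label selector as needed):

```
kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph -o json \
  | jq -r '.items[] | "\(.metadata.name)\tmanagedFields=\(.metadata.managedFields != null)"'
```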
5. The first NODE_NAME match (line 66) is inside managedFields; the second (line 755) is the regular env entry. A sketch for counting the env entries directly follows the JSON excerpt below.
# kubectl -n rook-ceph get pod rook-ceph-osd-0-57b64b6587-4g9lr -o json | grep -n "\"NODE_NAME"
66: "k:{\"name\":\"NODE_NAME\"}": {
755: "name": "NODE_NAME",
"managedFields": [
{
"apiVersion": "v1",
"fields": {
"f:metadata": {
"f:annotations": {
".": null,
"f:kubernetes.io/psp": null
},
"f:generateName": null,
"f:labels": {
".": null,
"f:app": null,
"f:ceph-osd-id": null,
"f:pod-template-hash": null,
"f:rook_cluster": null
},
"f:ownerReferences": {
".": null,
"k:{\"uid\":\"e2673ba6-0cf7-11ea-b07c-fa163e064919\"}": {
".": null,
"f:apiVersion": null,
"f:blockOwnerDeletion": null,
"f:controller": null,
"f:kind": null,
"f:name": null,
"f:uid": null
}
}
},
"f:spec": {
"f:affinity": {
".": null,
"f:nodeAffinity": {
".": null,
"f:requiredDuringSchedulingIgnoredDuringExecution": null
}
},
"f:containers": {
"k:{\"name\":\"osd\"}": {
".": null,
"f:args": null,
"f:command": null,
"f:env": {
".": null,
"k:{\"name\":\"CONTAINER_IMAGE\"}": {
".": null,
"f:name": null,
"f:value": null
},
"k:{\"name\":\"NODE_NAME\"}": {
".": null,
"f:name": null,
"f:valueFrom": {
File(s) to submit:
- Cluster CR (custom resource), typically called cluster.yaml, if necessary
- Operator's logs, if necessary
- Crashing pod(s) logs, if necessary

To get logs, use kubectl -n <namespace> logs <pod name>
When pasting logs, always surround them with backticks or use the insert code button from the Github UI.
Read Github documentation if you need help.
Environment:
- OS (e.g. from /etc/os-release): centos 7.6
- Kernel (e.g. uname -a):
- Cloud provider or hardware configuration: openstack
- Rook version (use rook version inside of a Rook Pod): rook 1.1.7
- Storage backend version (e.g. for ceph do ceph -v): 14.2.1
- Kubernetes version (use kubectl version): 14.3
- Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift): Tectonic
- Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 17 (9 by maintainers)
On Kubernetes 14.3, managedFields show up when the apiserver runs with the AllAlpha=true feature gate (which turns on the alpha ServerSideApply feature). They are disabled again with AllAlpha=true,ServerSideApply=false.
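A minimal sketch of the corresponding kube-apiserver settings (how the flag is actually passed depends on how the apiserver is deployed, e.g. static pod manifest or systemd unit):

```
# managedFields populated: all alpha gates on, including ServerSideApply
kube-apiserver --feature-gates=AllAlpha=true ...

# managedFields suppressed: other alpha gates stay on, ServerSideApply off
kube-apiserver --feature-gates=AllAlpha=true,ServerSideApply=false ...
```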