kubectl: kubectl rollout undo should warn about undefined behaviour with kubectl apply
What would you like to be added:
Make `kubectl rollout undo` and `kubectl apply` work together. `kubectl rollout undo` doesn't revert the `kubectl.kubernetes.io/last-applied-configuration` annotation, which leads to problems if named objects in arrays are intended to be removed.
Why is this needed: I ran into the issue that I had an old container still being present after applying a rename of a sidecar. Symptoms look like these issues: https://github.com/kubernetes/kubernetes/issues/63982 and https://github.com/kubernetes/kubernetes/issues/25390.
What happened to me is as follows:

- deployment has containers `a`+`b`
- I wanted to rename `b` to `c` and change `c`'s implementation
- I ran `kubectl apply` with the changed yaml, with a version of `c` which contains an error
- rollout status doesn't finish, because the pod goes into a CrashLoopBackOff
- I see the failure occur and decide to revert to the last version: `kubectl rollout undo`
- the deployment is reverted and the new replica set is scaled down
- however, the `kubectl.kubernetes.io/last-applied-configuration` annotation on the deployment's metadata is not rolled back
- the next time I deploy the `a`+`c` yaml for the deployment, `b` is not in the last-applied configuration, so the Kubernetes config merge algorithm doesn't remove `b`
- the pods are running with containers `a`+`b`+`c`
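The steps above can be simulated with a minimal sketch of the relevant part of the three-way strategic merge (this is not kubectl's real implementation, just an illustration of the deletion rule): an item in a `name`-keyed list is removed only if it appears in last-applied but not in the new config, so a stale last-applied annotation means the old container is never marked for deletion.

```python
def merge_containers(last_applied, new_config, live):
    """Simplified three-way merge of a name-keyed container list.

    An item is deleted only if it is present in last-applied but absent
    from the new config; everything else in the live list is kept.
    Returns the resulting container names, sorted, for brevity.
    """
    deleted = {c["name"] for c in last_applied} - {c["name"] for c in new_config}
    desired = {c["name"]: c for c in live if c["name"] not in deleted}
    for c in new_config:
        desired[c["name"]] = c  # add/update items from the new config
    return sorted(desired)


# After `rollout undo`, the live spec is back to a+b, but last-applied
# still describes the failed a+c revision. Re-applying a+c then computes
# no deletions, so `b` survives:
live = [{"name": "a"}, {"name": "b"}]
new = [{"name": "a"}, {"name": "c"}]
stale_last_applied = [{"name": "a"}, {"name": "c"}]
print(merge_containers(stale_last_applied, new, live))  # ['a', 'b', 'c']

# Had the annotation been rolled back to a+b, `b` would be deleted:
correct_last_applied = [{"name": "a"}, {"name": "b"}]
print(merge_containers(correct_last_applied, new, live))  # ['a', 'c']
```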
The rollout undo command doesn't roll back `kubectl.kubernetes.io/last-applied-configuration`, which doesn't make sense imho.
I reported this to Google Support, with a reproduction; their answer was this:

> We do not typically recommend using "rollout" to be honest, in general we would recommend re-applying the old config.
There are a lot of cases where the old config is no longer available to the user, especially since rollout undo is only done when something is going wrong. At these times you want kubectl to be your friend who has your back…
About this issue
- State: open
- Created 4 years ago
- Reactions: 2
- Comments: 42 (31 by maintainers)
`kubectl rollout undo` is an imperative command, similar to `kubectl edit`, and was not intended to be used with `kubectl apply` in its original design. For `kubectl apply` users, modifying configs (e.g. reverting changes) and then applying them through `kubectl apply` is the declarative way to do a rollback.

/retitle kubectl rollout undo should warn about undefined behaviour with kubectl apply
(I think it is misleading to accept this as previously titled; alternatively we can revert the title and the triage and create a separate issue for the warning)
We discussed this at last week’s SIG meeting and will follow up again next week after taking some time to reflect. You can listen to the conversation starting at https://youtu.be/Wh6eU1AfHBY?t=294. Notes follow.
This is happening because the implementation of `kubectl rollout undo` essentially copies the spec of the targeted ReplicaSet into the Deployment, without touching the last-applied annotation. That means last-applied continues to reflect the spec from the apply operation that created the newer ReplicaSet we moved off of, not the older one whose spec is now back in the Deployment. Because of this, the strategic merge calculation performed by the next `kubectl apply` will be making an illogical and potentially invalid comparison. In the case where the set of fields under management differs among the three points in time in question, `apply` may draw conclusions about field ownership that are perfectly correct given the inputs, but incorrect given the real history / actual user intent.

In @johngmyers' example, when the rollback happens, the Deployment ends up in a state where the spec has container "foo", and the last-applied annotation has container "bar". When `apply` subsequently runs with container "bar" added again, it sees only "bar" in the annotation. It therefore assumes that a different actor added "foo" and that it should be left alone.

This is impossible to fix with code alone: we'd need to start keeping more historical data than we do today. Specifically, we'd need to preserve the last-applied information by copying it to the ReplicaSet. It is deliberately not being copied over today, because doing so would cause `apply` to think it manages the ReplicaSet directly (it doesn't). However, we could copy the data over into a different annotation, and then perform the reverse translation when copying it back during a rollback. Something similar would need to be done for the newer `metadata.managedFields` used by server-side apply as well. Overall, it would be a significant increase in the size of all ReplicaSet objects created from Deployments managed by `apply`, for all users. We need to carefully consider whether this bug fix is worth that non-trivial cost.

SIG-CLI often faces challenges related to the interaction between the imperative and declarative commands kubectl provides. For example, making changes with `kubectl edit` between `apply` operations is capable of causing a very similar problem. My personal opinion is that `rollout undo` falls into a grey area between the two types of command. The user's intent in this case isn't to make a manual change; it's to roll back the entire Deployment object to its previous state. Where the `apply` data is stored is an implementation detail they may have no awareness of, and there's nothing that signals to them that the rollback will be incomplete in an important way. Also, unlike the CRUD imperative commands, `rollout` does not strike me as something we can call intended for experimental rather than production use. On the contrary, it sounds like something you should be able to reach for in an emergency.

Points to reflect on and come back to at next week's meeting:
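The annotation-stashing idea discussed above could be sketched roughly as follows. This is a hypothetical illustration in Python, not kubectl's code, and the `deployment.kubernetes.io/last-applied` key is invented here for illustration: the point is only that the data lives under a key `apply` ignores, and is translated back during rollback.

```python
LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"
STASHED = "deployment.kubernetes.io/last-applied"  # hypothetical annotation key


def stash_last_applied(deployment: dict, replicaset: dict) -> None:
    """When a ReplicaSet is created for a Deployment, copy the Deployment's
    last-applied config under a neutral key, so that `apply` does not
    mistake the ReplicaSet for an object it manages directly."""
    value = deployment.get("metadata", {}).get("annotations", {}).get(LAST_APPLIED)
    if value is not None:
        replicaset.setdefault("metadata", {}).setdefault("annotations", {})[STASHED] = value


def restore_on_rollback(deployment: dict, replicaset: dict) -> None:
    """During `rollout undo`, perform the reverse translation: restore the
    stashed copy so last-applied matches the spec being rolled back to."""
    value = replicaset.get("metadata", {}).get("annotations", {}).get(STASHED)
    if value is not None:
        deployment.setdefault("metadata", {}).setdefault("annotations", {})[LAST_APPLIED] = value
```

The cost mentioned above is visible here: every ReplicaSet created from an `apply`-managed Deployment would carry a full copy of the last-applied configuration under the stash key.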
cc @seans3 @eddiezane @ant31
Reproduction procedure:

Expected result: Deployment containing one container named `bar`.
Actual result: Deployment containing two containers: `foo` and `bar`.
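A minimal reproduction along the lines of the foo/bar example in the discussion might look like the manifests below (the Deployment name, labels, and images are illustrative, not from the original report):

```yaml
# step1.yaml: initial Deployment with a single container "foo"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rollout-undo-demo
spec:
  replicas: 1
  selector:
    matchLabels: {app: rollout-undo-demo}
  template:
    metadata:
      labels: {app: rollout-undo-demo}
    spec:
      containers:
      - name: foo
        image: nginx:1.25        # illustrative image
---
# step2.yaml: container renamed to "bar", but with a broken image,
# so the rollout never completes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rollout-undo-demo
spec:
  replicas: 1
  selector:
    matchLabels: {app: rollout-undo-demo}
  template:
    metadata:
      labels: {app: rollout-undo-demo}
    spec:
      containers:
      - name: bar
        image: nginx:does-not-exist
```

Apply `step1.yaml`, apply `step2.yaml`, run `kubectl rollout undo deployment/rollout-undo-demo` when the rollout stalls, then fix the image in `step2.yaml` and apply it again. Because last-applied was not rolled back, `foo` is not scheduled for deletion and both containers end up in the Deployment.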