rook: init-copy-binaries can leave container in a bad state
Is this a bug report or feature request?
- Bug Report
Deviation from expected behavior: If the program crashes (due to OOM, segfault, etc.), restarting the container should not leave things in a bad state, but currently it can.
Expected behavior: If the copy fails, it is re-attempted on the next init container run.
How to reproduce it (minimal and precise):
- Set a limit range on the namespace:
  apiVersion: v1
  kind: LimitRange
  metadata:
    name: rook-ceph
    namespace: rook-ceph
  spec:
    limits:
    - type: Container
      defaultRequest:
        memory: 8Mi
      default:
        memory: 64Mi
- Restart rook-ceph-operator so it triggers the rook-ceph-csi-detect-version job
- Watch the job get OOMKilled, then restarted
Environment:
- Rook version: 1.3.1
- Ceph CSI version: 2.0.1
Details: Looking at the code, it looks like you don’t do an atomic copy, and you skip the copy if the destination file already exists. So when the program crashes mid-copy, the incomplete file is left in place with the wrong permissions, and the next run skips it. This manifests as a “Permission Denied” error when trying to run the file, because the executable bits were never set.
Aside from the issue above, why does copying files take over 64MiB of memory!? Can’t you just make this a cp -a instead of implementing a simple function like this in Go?
About this issue
- State: closed
- Created 4 years ago
- Comments: 32 (17 by maintainers)
Hey @parth-gr, I will take it up. Please assign it to me. I have already looked at the code once before. I will send a PR for this this week.