podman: kube play: cannot unmarshal ... PodRmReport.RmReport.Err

[+1526s] not ok 392 podman kube play - hostport
...
$ podman-remote kube down /tmp/podman_bats.rYmm67/testpod.yaml
Error: json: cannot unmarshal string into Go struct field PodRmReport.RmReport.Err of type error
[ rc=125 (** EXPECTED 0 **) ]

Three instances, first in August. Root/rootless. Always remote.

About this issue

  • State: open
  • Created 2 years ago
  • Comments: 18 (3 by maintainers)

Most upvoted comments

It definitely does.


The underlying cause, in RmReport:

1 error occurred:
	* removing container 919c374a24cadedcba2d95849dcf5c4aa17d0e757965436b5530d14fa28ab695 from pod 23925178e788af05632a6613f214378c0d2a3c64ca078339fc0bd906b3ae4820: removing container 919c374a24cadedcba2d95849dcf5c4aa17d0e757965436b5530d14fa28ab695 root filesystem: 1 error occurred:
	* unlinkat /var/lib/containers/storage/overlay-containers/919c374a24cadedcba2d95849dcf5c4aa17d0e757965436b5530d14fa28ab695/userdata/shm: device or resource busy

That looks rather like some other flakes, e.g. https://github.com/containers/podman/issues/11594#issuecomment-1276409450


And then there seems to be a design issue in the error handling of the Kube commands. Disclaimer: I’m very new to this code, and I am only reading it, not testing it in practice.

Summary: The error interface can’t be directly transferred over JSON.

E.g. consider how https://github.com/containers/podman/blob/90b18d2d9c6cbd5f2281176452ee74db3b771d70/pkg/domain/infra/tunnel/pods.go#L171 works (outside of Kube): the PodRmReport.Err field is set manually by the caller on the remote client side, i.e. it is not transferred directly as JSON. (Well, actually it is: https://github.com/containers/podman/blob/90b18d2d9c6cbd5f2281176452ee74db3b771d70/pkg/bindings/pods/pods.go#L212 does end up unmarshaling the Err field, but as long as there is no error, the value is nil, and that passes through JSON just fine.)

If the server fails, a different HTTP status triggers transfer not of PodRmReport, but of a specialized https://github.com/containers/podman/blob/90b18d2d9c6cbd5f2281176452ee74db3b771d70/pkg/errorhandling/errorhandling.go#L93 JSON format; the client detects that and manually populates PodRmReport.Err.

That’s the way this seems to work for the standalone PodRmReport. But that mechanism only works at the top level of the request. Meanwhile, play kube just embeds the full PodRmReport (and other structs) into the total result as a single unit, with no special-case handling of error values. Because PodRmReport can’t actually transfer Err through raw JSON, that then breaks.

Fixing this seems to mean changing the structure of entities.PlayKubeTeardown so that it does not naively embed the top-level “Report” structures, and instead uses some variant that contains no raw error field, perhaps with instances of errorhandling.ErrorModel in its place. (That would, in theory, be an API break, but the existing API can’t be correctly consumed by Podman itself in error situations, so…)

And as much as I originally looked into this bug because of a stupid oversight (https://github.com/containers/podman/issues/16154#issuecomment-1355306783 mixes up type and field names; there is actually no mystery in that), I’d like to leave this play kube reorganization to someone familiar with the code.