podman: Cannot determine if a container was unable to start

/kind feature

Description

Say I execute a pod with the following pod spec:

        restartPolicy: OnFailure
        containers:
        - image: docker.io/library/alpine:edge
          name: hello
          args:
          - efcho
          - Hello world
          tty: true

The resulting container, of course, cannot start, because the command is not found:

podman start container-id
Error: unable to start container "container-id": crun: executable file `efcho` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found

However, this error does not appear anywhere in the “podman inspect” output:

[
     {
          "State": {
               "OciVersion": "1.0.2-dev",
               "Status": "created",
               "Running": false,
               "Paused": false,
               "Restarting": false,
               "OOMKilled": false,
               "Dead": false,
               "Pid": 0,
               "ExitCode": 0,
               "Error": "",
               "StartedAt": "0001-01-01T00:00:00Z",
               "FinishedAt": "0001-01-01T00:00:00Z",
               "Health": {
                    "Status": "",
                    "FailingStreak": 0,
                    "Log": null
               },
               "CheckpointedAt": "0001-01-01T00:00:00Z",
               "RestoredAt": "0001-01-01T00:00:00Z"
          }
     }
]

… the container just appears as “created”, and there’s no way to tell that it failed to start, or why.

Describe the results you received:

What is the correct way to check via the API whether a container is in this kind of failed state, and how do I get the error message?

Is the best approach simply to check whether the container is in the “created” state, try to start it via the API, and check for an error?

Describe the results you expected:

This seems like a bit of a hack/workaround; it would be best if the error were recorded somewhere in the container status.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 21 (16 by maintainers)

Most upvoted comments

I currently have a partial fix for this, but it doesn’t cover the entire scope of the issue. In https://github.com/containers/podman/blob/639efd86e3d9ea9d5c580c2f7f3bbcb8b9a549fe/libpod/container_api.go#L85 I currently have something along these lines:

// saveErrorState records the error in the container state and persists it.
saveErrorState := func(e error) error {
    c.state.Error = e.Error()
    return c.save()
}

This closure is called whenever an error occurs in that function. However, it does not solve the whole issue: if an error occurs in the container engine’s ContainerRun function, I have no way (to the best of my knowledge) to modify the container’s state and save it without changing the API in some way. Is there a way to do this without making any changes to the API?

For reference, with the above fix applied, podman inspect reports the following state for the container from the issue:

         "State": {
              "OciVersion": "1.0.2-dev",
              "Status": "created",
              "Running": false,
              "Paused": false,
              "Restarting": false,
              "OOMKilled": false,
              "Dead": false,
              "Pid": 0,
              "ExitCode": 0,
              "Error": "crun: executable file `efcho` not found in $PATH: No such file or directory: OCI runtime attempted to invoke a command that was not found",
              "StartedAt": "0001-01-01T00:00:00Z",
              "FinishedAt": "0001-01-01T00:00:00Z",
              "Health": {
                   "Status": "",
                   "FailingStreak": 0,
                   "Log": null
              },
              "CheckpointedAt": "0001-01-01T00:00:00Z",
              "RestoredAt": "0001-01-01T00:00:00Z"
         },
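
Assuming the partial fix above is in place, so that State.Error is populated, a client could detect containers that failed to start by parsing the “podman inspect” JSON. The following is a rough sketch; inspectEntry and startErrors are hypothetical names, and only the two fields shown in the output above are decoded.

```go
package main

import "encoding/json"

// inspectEntry mirrors only the fields of the inspect output used here.
type inspectEntry struct {
	State struct {
		Status string `json:"Status"`
		Error  string `json:"Error"`
	} `json:"State"`
}

// startErrors returns the recorded State.Error of every inspected
// container that is still in the "created" state but carries an error.
func startErrors(inspectJSON []byte) ([]string, error) {
	var entries []inspectEntry
	if err := json.Unmarshal(inspectJSON, &entries); err != nil {
		return nil, err
	}
	var failed []string
	for _, e := range entries {
		if e.State.Status == "created" && e.State.Error != "" {
			failed = append(failed, e.State.Error)
		}
	}
	return failed, nil
}
```

Without the fix, Error stays empty and this check finds nothing, which is exactly the ambiguity the issue describes.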