argo-workflows: executor fails to process resource with a manifest including multiple resources, which works in v3.0.7

Summary

Currently, the workflow executor fails with “Kind and name are both required but at least one of them is missing from the manifest” when it processes a manifest that contains multiple resources.

What happened/what you expected to happen? Expected all resources within the manifest to be created.

What version of Argo Workflows are you running? v3.2.7

Diagnostics

Either a workflow that reproduces the bug, or paste your whole workflow YAML, including status, something like:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    workflows.argoproj.io/pod-name-format: v1
  creationTimestamp: "2022-02-01T17:32:53Z"
  generation: 4
  labels:
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/controller-instanceid: addon-manager-workflow-controller
    workflows.argoproj.io/phase: Failed
  name: event-router-prereqs-f349a582-wf
  namespace: addon-manager-system
  ownerReferences:
  - apiVersion: addonmgr.keikoproj.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Addon
    name: event-router
    uid: 0a27f1ce-843b-476b-8a28-c0739f73a4e3
  resourceVersion: "47574"
  uid: 356b5e48-2a9a-4aa7-89c3-57e5d925ba6a
spec:
  activeDeadlineSeconds: 300
  arguments:
    parameters:
    - name: namespace
      value: addon-event-router-ns
    - name: pkgChannel
      value: ""
    - name: pkgName
      value: event-router
    - name: pkgVersion
      value: v0.2
    - name: pkgType
      value: composite
    - name: pkgDescription
      value: Event router
    - name: clusterName
      value: knock-knock
    - name: clusterRegion
      value: us-west-2
  entrypoint: entry
  serviceAccountName: addon-manager-workflow-installer-sa
  templates:
  - inputs: {}
    metadata: {}
    name: entry
    outputs: {}
    steps:
    - - arguments: {}
        name: prereq-resources
        template: submit
  - inputs: {}
    metadata: {}
    name: submit
    outputs: {}
    resource:
      action: apply
      manifest: |-
        {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"labels":{"app.kubernetes.io/managed-by":"addonmgr.keikoproj.io","app.kubernetes.io/name":"event-router","app.kubernetes.io/part-of":"event-router","app.kubernetes.io/version":"v0.2"},"name":"addon-event-router-ns"}}
        {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"labels":{"app.kubernetes.io/managed-by":"addonmgr.keikoproj.io","app.kubernetes.io/name":"event-router","app.kubernetes.io/part-of":"event-router","app.kubernetes.io/version":"v0.2"},"name":"event-router-sa","namespace":"addon-event-router-ns"}}
  ttlStrategy:
    secondsAfterCompletion: 259200
status:
  artifactRepositoryRef:
    artifactRepository: {}
    default: true
  conditions:
  - status: "False"
    type: PodRunning
  - status: "True"
    type: Completed
  finishedAt: "2022-02-01T17:33:13Z"
  message: child 'event-router-prereqs-f349a582-wf-3187746742' failed
  nodes:
    event-router-prereqs-f349a582-wf:
      children:
      - event-router-prereqs-f349a582-wf-2041065729
      displayName: event-router-prereqs-f349a582-wf
      finishedAt: "2022-02-01T17:33:13Z"
      id: event-router-prereqs-f349a582-wf
      message: child 'event-router-prereqs-f349a582-wf-3187746742' failed
      name: event-router-prereqs-f349a582-wf
      outboundNodes:
      - event-router-prereqs-f349a582-wf-3187746742
      phase: Failed
      progress: 1/1
      resourcesDuration:
        cpu: 1
        memory: 1
      startedAt: "2022-02-01T17:32:53Z"
      templateName: entry
      templateScope: local/event-router-prereqs-f349a582-wf
      type: Steps
    event-router-prereqs-f349a582-wf-2041065729:
      boundaryID: event-router-prereqs-f349a582-wf
      children:
      - event-router-prereqs-f349a582-wf-3187746742
      displayName: '[0]'
      finishedAt: "2022-02-01T17:33:13Z"
      id: event-router-prereqs-f349a582-wf-2041065729
      message: child 'event-router-prereqs-f349a582-wf-3187746742' failed
      name: event-router-prereqs-f349a582-wf[0]
      phase: Failed
      progress: 1/1
      resourcesDuration:
        cpu: 1
        memory: 1
      startedAt: "2022-02-01T17:32:53Z"
      templateScope: local/event-router-prereqs-f349a582-wf
      type: StepGroup
    event-router-prereqs-f349a582-wf-3187746742:
      boundaryID: event-router-prereqs-f349a582-wf
      displayName: prereq-resources
      finishedAt: "2022-02-01T17:33:12Z"
      hostNodeName: knock-knock-worker2
      id: event-router-prereqs-f349a582-wf-3187746742
      message: 'Error (exit code 1): Kind and name are both required but at least
        one of them is missing from the manifest'
      name: event-router-prereqs-f349a582-wf[0].prereq-resources
      outputs:
        exitCode: "1"
      phase: Failed
      progress: 1/1
      resourcesDuration:
        cpu: 1
        memory: 1
      startedAt: "2022-02-01T17:32:53Z"
      templateName: submit
      templateScope: local/event-router-prereqs-f349a582-wf
      type: Pod
  phase: Failed
  progress: 1/1
  resourcesDuration:
    cpu: 1
    memory: 1
  startedAt: "2022-02-01T17:32:53Z"

What Kubernetes provider are you using? v1.20

What executor are you running? Docker/K8SAPI/Kubelet/PNS/Emissary

Docker quay.io/argoproj/argoexec

# Logs from the workflow controller:
time="2022-02-01T17:28:27.954Z" level=warning msg="Non-transient error: configmaps \"artifact-repositories\" not found"
time="2022-02-01T17:28:27.955Z" level=info msg="resolved artifact repository" artifactRepositoryRef=default-artifact-repository
time="2022-02-01T17:28:27.955Z" level=info msg="Updated phase  -> Running" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.955Z" level=info msg="Steps node event-router-prereqs-f349a582-wf initialized Running" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.955Z" level=info msg="StepGroup node event-router-prereqs-f349a582-wf-2041065729 initialized Running" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.955Z" level=info msg="Pod node event-router-prereqs-f349a582-wf-3187746742 initialized Pending" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.970Z" level=info msg="Create events 201"
time="2022-02-01T17:28:27.978Z" level=info msg="Create pods 201"
time="2022-02-01T17:28:27.980Z" level=info msg="Created pod: event-router-prereqs-f349a582-wf[0].prereq-resources (event-router-prereqs-f349a582-wf-3187746742)" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.980Z" level=info msg="Workflow step group node event-router-prereqs-f349a582-wf-2041065729 not yet completed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.980Z" level=info msg="TaskSet Reconciliation" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.980Z" level=info msg=reconcileAgentPod namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:27.992Z" level=info msg="Update workflows 200"
time="2022-02-01T17:28:27.993Z" level=info msg="Workflow update successful" namespace=addon-manager-system phase=Running resourceVersion=46874 workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:28.002Z" level=info msg="Create events 201"
time="2022-02-01T17:28:28.010Z" level=info msg="Create events 201"
time="2022-02-01T17:28:32.938Z" level=info msg="Get leases 200"
time="2022-02-01T17:28:32.946Z" level=info msg="Update leases 200"
time="2022-02-01T17:28:33.772Z" level=info msg="Alloc=8522 TotalAlloc=183531 Sys=25425 NumGC=127 Goroutines=196"
time="2022-02-01T17:28:37.951Z" level=info msg="Get leases 200"
time="2022-02-01T17:28:37.957Z" level=info msg="Update leases 200"
time="2022-02-01T17:28:37.978Z" level=info msg="Processing workflow" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.978Z" level=info msg="Pod failed: Error (exit code 1): Kind and name are both required but at least one of them is missing from the manifest" displayName=prereq-resources namespace=addon-manager-system pod=event-router-prereqs-f349a582-wf-3187746742 templateName=submit workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.978Z" level=info msg="Updating node event-router-prereqs-f349a582-wf-3187746742 exit code 1" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.978Z" level=info msg="Updating node event-router-prereqs-f349a582-wf-3187746742 status Pending -> Failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.978Z" level=info msg="Updating node event-router-prereqs-f349a582-wf-3187746742 message: Error (exit code 1): Kind and name are both required but at least one of them is missing from the manifest" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.978Z" level=info msg="pod progress" namespace=addon-manager-system progress=1/1 workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Step group node event-router-prereqs-f349a582-wf-2041065729 deemed failed: child 'event-router-prereqs-f349a582-wf-3187746742' failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="node event-router-prereqs-f349a582-wf-2041065729 phase Running -> Failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="node event-router-prereqs-f349a582-wf-2041065729 message: child 'event-router-prereqs-f349a582-wf-3187746742' failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="node event-router-prereqs-f349a582-wf-2041065729 finished: 2022-02-01 17:28:37.9794411 +0000 UTC" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="step group event-router-prereqs-f349a582-wf-2041065729 was unsuccessful: child 'event-router-prereqs-f349a582-wf-3187746742' failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Outbound nodes of event-router-prereqs-f349a582-wf-3187746742 is [event-router-prereqs-f349a582-wf-3187746742]" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Outbound nodes of event-router-prereqs-f349a582-wf is [event-router-prereqs-f349a582-wf-3187746742]" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="node event-router-prereqs-f349a582-wf phase Running -> Failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="node event-router-prereqs-f349a582-wf message: child 'event-router-prereqs-f349a582-wf-3187746742' failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="node event-router-prereqs-f349a582-wf finished: 2022-02-01 17:28:37.9795951 +0000 UTC" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Checking daemoned children of event-router-prereqs-f349a582-wf" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="TaskSet Reconciliation" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg=reconcileAgentPod namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Updated phase Running -> Failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Updated message  -> child 'event-router-prereqs-f349a582-wf-3187746742' failed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Marking workflow completed" namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf
time="2022-02-01T17:28:37.979Z" level=info msg="Checking daemoned children of " namespace=addon-manager-system workflow=event-router-prereqs-f349a582-wf

# If the workflow's pods have not been created, you can skip the rest of the diagnostics.

# The workflow's pods that are problematic:
kubectl get pod -o yaml -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

# Logs from your workflow's wait container, something like:
kubectl logs -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 31 (13 by maintainers)

Most upvoted comments

maybe the original author intends to support multiple resources within a manifest

The original author already responded https://github.com/argoproj/argo-workflows/issues/7721#issuecomment-1028250815.

And we just need to expand the check of GRV/kind/name to a group of resources rather than one.

It’s not just the check. There’s also success and failure conditions that we need to specify to watch statuses for each of the resources.
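For context, the conditions mentioned here are the `successCondition` and `failureCondition` fields of a resource template, which the executor evaluates against the single object it created. A minimal sketch, assuming a hypothetical Job (the names and image are illustrative, not taken from this issue):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: resource-conditions-   # hypothetical name
spec:
  entrypoint: create-job
  templates:
  - name: create-job
    resource:
      action: create
      # Both expressions are evaluated against the one object created below,
      # which is why the executor needs exactly one kind and name to watch.
      successCondition: status.succeeded > 0
      failureCondition: status.failed > 3
      manifest: |
        apiVersion: batch/v1
        kind: Job
        metadata:
          generateName: example-job-   # hypothetical
        spec:
          backoffLimit: 4
          template:
            spec:
              restartPolicy: Never
              containers:
              - name: main
                image: alpine:3.15     # illustrative image
                command: [sh, -c, "echo done"]
```

With several resources in one manifest, it would be unclear which object these expressions should apply to.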

We probably want to fail fast at kubectl command. Would you like to submit a PR to improve that?

No, you could template it using parameters
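One reading of this suggestion is a single reusable resource template that takes the manifest as an input parameter and is invoked once per resource. The sketch below assumes that reading; the template name `apply-one`, the parameter name `manifest`, and the workflow name are hypothetical, while the resource names and service account come from the workflow in this issue:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: prereqs-   # hypothetical name
spec:
  entrypoint: entry
  serviceAccountName: addon-manager-workflow-installer-sa
  templates:
  - name: entry
    steps:
    # Sequential step groups: the Namespace is applied before the ServiceAccount.
    - - name: namespace
        template: apply-one
        arguments:
          parameters:
          - name: manifest
            value: |
              apiVersion: v1
              kind: Namespace
              metadata:
                name: addon-event-router-ns
    - - name: service-account
        template: apply-one
        arguments:
          parameters:
          - name: manifest
            value: |
              apiVersion: v1
              kind: ServiceAccount
              metadata:
                name: event-router-sa
                namespace: addon-event-router-ns
  # One reusable template; each invocation receives exactly one resource.
  - name: apply-one
    inputs:
      parameters:
      - name: manifest
    resource:
      action: apply
      manifest: "{{inputs.parameters.manifest}}"
```

Each invocation still handles exactly one kind and name, so the executor's check passes and per-resource conditions could be added if needed.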

@ccfishk Your PR https://github.com/argoproj/argo-workflows/pull/7793 seems to change the original behavior and design, which is not what I suggested so I am going to close it for now.

The current workaround is to use multiple resource templates or use a script template that executes a kubectl apply.
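The script-template variant of this workaround might look like the sketch below. The image name is a placeholder (any image that ships `kubectl` and `sh` should do), the workflow and template names are hypothetical, and the workflow's service account is assumed to have permission to create both resources:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: prereqs-kubectl-   # hypothetical name
spec:
  entrypoint: apply-all
  serviceAccountName: addon-manager-workflow-installer-sa
  templates:
  - name: apply-all
    script:
      # Placeholder image: substitute one that contains kubectl and sh.
      image: example.registry/kubectl:latest
      command: [sh]
      source: |
        set -e
        # kubectl accepts a multi-document manifest on stdin.
        kubectl apply -f - <<'EOF'
        apiVersion: v1
        kind: Namespace
        metadata:
          name: addon-event-router-ns
        ---
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: event-router-sa
          namespace: addon-event-router-ns
        EOF
```

The trade-off is that the step succeeds or fails on the exit code of `kubectl apply` alone; there is no `successCondition` or `failureCondition` watching the created objects.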

Also note that we will be switching to replace kubectl with pure Go. See https://github.com/argoproj/argo-workflows/issues/7797.

I don’t think we should change this behaviour. Instead, I think we should discuss alternative options and workarounds.

@jessesuen do you know if a resource template was meant to support multiple resources in the manifest?

It was not. Resource templates were intended to operate on individual resources, which is needed in order to wait on resource status. So the fact that this was working before was actually a bug.

Understood, this is a current limitation. 👍

But one manifest with multiple resources works in v3.0.7, as well as in previous versions. I am not sure which change introduced the regression, but it would be very nice to have the full functionality back.

I am looking at the v3.0.7 source code, and it always requires a single resource name unless you are running a “delete” action. @ccfishk Were you running a “delete” action by any chance with v3.0.7?

I’ll also defer to @jessesuen to comment on the original design of resource template.

As @terrytangyuan said, this is intentional. We cannot have more than one resource.

This is intentional as we need to monitor the status/conditions (success/failure) for the created resource.