Kubernetes Pod Stuck in ContainerCreating State
A Kubernetes pod remains indefinitely in the 'ContainerCreating' state and never transitions to 'Running', making kubectl logs unavailable for diagnosis. Root causes span image pull failures, insufficient cluster resources, PersistentVolumeClaim binding issues, CNI misconfiguration, and container runtime errors on the node. The primary diagnostic tool is 'kubectl describe pods', which exposes the Events section detailing the specific failure preventing container creation.
Indicators
- Pod status shows 'ContainerCreating' indefinitely after creation
- kubectl logs returns an error stating that the container is waiting to start (reason: ContainerCreating)
- Pod does not transition to 'Running' state within the expected timeframe
- No container output or application logs are available
- kubectl get pods shows the pod age increasing with no status change
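The status indicator above can be checked across a namespace by filtering the STATUS column of 'kubectl get pods'. The following is a minimal sketch, not a definitive tool: the function name list_stuck_pods is hypothetical, and it assumes the default namespace-scoped column layout (NAME READY STATUS RESTARTS AGE).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the names of pods whose STATUS column is ContainerCreating.
# Assumes the default namespace-scoped layout: NAME READY STATUS RESTARTS AGE.
list_stuck_pods() {
  # NR > 1 skips the header row; $3 is the STATUS column in this layout.
  awk 'NR > 1 && $3 == "ContainerCreating" { print $1 }'
}

# Example (replace the namespace placeholder before running):
#   kubectl get pods -n NAMESPACE | list_stuck_pods
```

With '--all-namespaces' the output gains a leading NAMESPACE column, so the STATUS field would shift to the fourth column and the filter would need adjusting.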
Likely causes
- Docker/OCI image pull failure due to incorrect image name, wrong tag, or unavailable registry
- Missing or misconfigured imagePullSecret preventing authentication to a private registry
- Insufficient cluster node resources (CPU or memory) causing scheduling or runtime failure (a pod that cannot be scheduled at all reports 'Pending' rather than 'ContainerCreating')
- PersistentVolumeClaim not bound or volume mount failure preventing pod initialisation
- Network plugin (CNI) misconfiguration or failure preventing network interface assignment
- Secret or ConfigMap referenced in the pod spec is missing or inaccessible
- Container runtime errors (Docker or containerd) at the node level
- Node-level issues such as disk pressure, memory pressure, or PID exhaustion
Diagnostic steps
- Run 'kubectl describe pods <pod-name> -n <namespace>' to retrieve detailed pod events and status messages, including image pull status and container start attempts.
- Review the 'Events' section at the bottom of the describe output for specific error messages such as ErrImagePull, ImagePullBackOff, FailedMount, FailedScheduling, or FailedCreatePodSandBox.
- Run 'kubectl get events --namespace <namespace> --sort-by=.lastTimestamp' to list recent events in the namespace in chronological order and identify any that correlate with the stuck pod.
- Check node status with 'kubectl get nodes' to confirm all nodes are Ready. Run 'kubectl describe node <node-name>' on the node the pod is scheduled to and check for resource pressure conditions (DiskPressure, MemoryPressure, PIDPressure).
- If a volume mount issue is indicated in events, run 'kubectl get pvc -n <namespace>' and 'kubectl describe pvc <pvc-name> -n <namespace>' to verify PersistentVolumeClaim binding status and associated StorageClass.
- If an image pull error is indicated, verify the image name and tag in the pod spec. Confirm imagePullSecrets are correctly referenced and that the secret exists with 'kubectl get secret <secret-name> -n <namespace>'.
- SSH into the affected node and inspect container runtime logs with 'journalctl -u containerd --since "10 minutes ago"' (or 'journalctl -u docker') to identify lower-level container creation errors not surfaced by kubectl.
- If CNI failure is suspected, check that the CNI plugin pods (e.g., Calico, Flannel, Weave) are running on the affected node with 'kubectl get pods -n kube-system -o wide' and review CNI plugin logs.
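The kubectl-based steps above can be collected into one pass. The following is a minimal sketch under stated assumptions: the function name triage_pod is hypothetical, it expects kubectl to be configured for the target cluster, and it leaves out the node-level SSH and CNI checks, which depend on the environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper that runs the kubectl-based diagnostic steps in order.
# Usage: triage_pod <pod-name> <namespace>
triage_pod() {
  local pod="$1" ns="$2" node

  echo "== Pod description and events =="
  kubectl describe pod "$pod" -n "$ns"

  echo "== Namespace events in chronological order =="
  kubectl get events -n "$ns" --sort-by=.lastTimestamp

  # .spec.nodeName is empty while the pod has not been assigned to a node.
  node="$(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}')"
  if [ -n "$node" ]; then
    echo "== Conditions on node $node =="
    kubectl describe node "$node"
  else
    echo "Pod is not assigned to a node yet; check scheduling events."
  fi

  echo "== PersistentVolumeClaims in $ns =="
  kubectl get pvc -n "$ns"
}
```

Redirecting the output to a file, for example 'triage_pod web-1 prod > triage.txt' with illustrative names, keeps a record of the cluster state at the time of the incident.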
Resolution path
- Run 'kubectl describe pods <pod-name> -n <namespace>' and locate the Events section to identify the specific error preventing container creation.
- If ErrImagePull or ImagePullBackOff is reported: correct the image name/tag in the pod spec, create or fix the imagePullSecret, and confirm registry reachability from the cluster.
- If FailedScheduling is reported: free cluster resources by scaling down other workloads, adding nodes, or adjusting the pod's resource requests and limits.
- If FailedMount is reported: verify the PVC is bound ('kubectl get pvc'), confirm the StorageClass provisioner is functional, and check for underlying storage errors.
- If a Secret or ConfigMap reference error is reported: create the missing resource in the pod's namespace with 'kubectl create secret generic' (or the 'docker-registry' or 'tls' variant) or 'kubectl create configmap', using the name and keys the pod spec expects.
- If node or container runtime errors are indicated: SSH to the node, review journalctl output for actionable errors, and restart the container runtime service only if appropriate ('systemctl restart containerd').
- Delete the stuck pod with 'kubectl delete pod <pod-name> -n <namespace>' after the root cause is resolved and allow the controller (Deployment, StatefulSet, etc.) to recreate it, or apply an updated manifest.
- Monitor pod startup with 'kubectl get pods -n <namespace> -w' to confirm the pod transitions to 'Running' and passes readiness checks.
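The branching in the resolution path above can be expressed as a simple lookup from event reason to next action. This is an illustrative sketch only: the function name next_action is hypothetical, and the mapping covers just the event reasons named in this document.

```shell
#!/usr/bin/env bash
# Hypothetical lookup: maps an event reason from `kubectl describe pod`
# to the matching resolution step.
next_action() {
  case "$1" in
    ErrImagePull|ImagePullBackOff)
      echo "Fix the image name/tag or imagePullSecret; confirm registry reachability" ;;
    FailedScheduling)
      echo "Free resources, add nodes, or adjust resource requests and limits" ;;
    FailedMount)
      echo "Verify the PVC is bound and the StorageClass provisioner is functional" ;;
    *)
      echo "No mapping; inspect node and container runtime logs" ;;
  esac
}
```

For example, 'next_action FailedMount' prints the PVC and StorageClass check. The reasons recorded for a single pod can usually be listed with 'kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>'.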
Prevention
- Validate container image names, tags, and registry accessibility as part of CI/CD pipeline checks before deployment.
- Pre-configure imagePullSecrets on the service account that pods use in each relevant namespace so all pods inherit them without explicit pod-level configuration.
- Define appropriate resource requests and limits on all pod specs to ensure the scheduler can make accurate placement decisions and prevent node resource exhaustion.
- Test PersistentVolumeClaim provisioning and binding in a non-production environment before deploying stateful workloads to production.
- Monitor cluster node health, resource utilisation, and pressure conditions proactively using tools such as Prometheus, Grafana, or cloud-native monitoring (e.g., Azure Monitor, AWS CloudWatch).
- Ensure CNI plugin DaemonSets are included in cluster upgrade and maintenance runbooks and are validated after any node or cluster version change.
- Use liveness and readiness probes on all production pods to enable early detection of containers that start but fail to become healthy.
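The service account recommendation above can be applied with 'kubectl patch serviceaccount'. The following is a minimal sketch, assuming the pull secret already exists in the namespace: the function names pull_secret_patch and attach_pull_secret are hypothetical, and the service account defaults to 'default' unless another name is passed.

```shell
#!/usr/bin/env bash
# Builds the JSON patch that sets an imagePullSecret on a service account.
# The secret name is supplied by the caller.
pull_secret_patch() {
  printf '{"imagePullSecrets":[{"name":"%s"}]}' "$1"
}

# Hypothetical wrapper.
# Usage: attach_pull_secret <namespace> <secret-name> [service-account]
attach_pull_secret() {
  local ns="$1" secret="$2" sa="${3:-default}"
  kubectl patch serviceaccount "$sa" -n "$ns" -p "$(pull_secret_patch "$secret")"
}
```

A call such as 'attach_pull_secret prod regcred' (illustrative names) patches the default service account in that namespace. Check the service account's existing imagePullSecrets first, since the patch may overwrite the current list. Only pods created after the change pick up the secret.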
Tools
- kubectl describe pods
- kubectl get events
- kubectl get nodes
- kubectl describe node
- kubectl get pvc
- kubectl describe pvc
- kubectl get secret
- kubectl get pods (with -o wide and -w flags)
- journalctl (containerd or docker service logs on the node)