The Triage Manual: Technical Guides for IT Emergencies
P2 · Network Infrastructure

HTTP 503 Service Unavailable — Web Server Process Down or Upstream Backend Unreachable (IIS / Nginx / Apache / Load Balancer)

HTTP 503 Service Unavailable is returned when the web server or reverse proxy cannot reach a healthy backend to fulfil a request. The failure originates at the web server process, application pool, or load balancer layer — not the client. Common causes include a crashed or stopped application pool (IIS), a failed Linux service, resource exhaustion (CPU, memory, thread pool, file descriptors), a bad deployment causing startup failure, or an unavailable downstream dependency (database, cache, message queue). Resolution follows a layered approach: confirm the serving process state, review logs for crash or startup errors, test backend dependency reachability, address resource exhaustion, then restart or roll back as appropriate.

Indicators

Likely causes

Diagnostic steps

  1. Check the status of the web server process and application pool. On IIS (PowerShell): `Get-WebConfiguration system.applicationHost/applicationPools/add | Select-Object name, state` or open IIS Manager → Application Pools and look for Stopped state. On Linux: `systemctl status <service-name>` and `ps aux | grep <process-name>`. For containers: `docker ps -a` or `kubectl get pods -n <namespace>`.
    Establishes immediately whether the serving process is running at all or has crashed/stopped — the most common cause of 503.
  2. Review web server error logs for 503 entries and upstream/backend error messages from around the time of onset. IIS: `%SystemDrive%\inetpub\logs\LogFiles\W3SVC<SiteID>\`, filtering for sc-status=503; when the application pool itself is stopped, HTTP.sys answers the request and records the 503 in `%SystemRoot%\System32\LogFiles\HTTPERR\` instead. Nginx: `/var/log/nginx/error.log`, filtering for 'connect() failed' or 'upstream'. Apache: `/var/log/apache2/error.log` (Debian/Ubuntu) or `/var/log/httpd/error_log` (RHEL/CentOS). Use: `Get-Content <logpath> | Select-String ' 503 '` (Windows) or `grep ' 503 ' /var/log/nginx/access.log` (Linux).
    Determines whether 503 is generated locally (process down) or proxied from a failing upstream backend, and provides exact timestamps and error context for root cause narrowing.
  3. Check application-level and system event logs for startup exceptions, unhandled errors, or crash events coinciding with 503 onset. On Windows: open Event Viewer → Windows Logs → Application, filter for sources 'ASP.NET', 'W3SVC', '.NET Runtime', or 'Application Error'. Also check Windows Logs → System for source 'WAS', in particular Event ID 5002 (application pool automatically disabled after repeated worker process failures). On Linux: `journalctl -u <service-name> --since '30 minutes ago'` and application-specific log files.
    Pinpoints whether an application crash, bad deployment artifact, missing dependency, or configuration error is preventing successful process startup.
  4. Test direct connectivity to backend dependencies from the application server. For SQL Server: `Test-NetConnection -ComputerName <db-host> -Port 1433`. For Redis: `Test-NetConnection -ComputerName <cache-host> -Port 6379` (Memcached listens on 11211 by default). On Linux hosts, `nc -zv <host> <port>` performs the equivalent TCP check. For HTTP upstream: `curl -o /dev/null -s -w "%{http_code} %{time_total}\n" http://<upstream-host>:<port>/health`. Confirm that the status code indicates a healthy backend and that the response time is within acceptable bounds.
    Isolates whether the 503 root cause is on this host or in a downstream dependency — determines whether the fix is here or elsewhere in the stack.
  5. Check system resource utilisation on the application host. Windows: `Get-Process | Sort-Object CPU -Descending | Select-Object -First 10 Name, CPU, WorkingSet` (the CPU column is cumulative processor seconds, so use the Task Manager Performance tab for current load). Linux: `top` or `htop`, `free -m`, `ulimit -n` (open file descriptor limit for the current shell; `cat /proc/<pid>/limits` shows the limit applied to a running process), `ss -s` (socket summary). For IIS specifically, check connection and queue counters in Performance Monitor: 'Web Service\Current Connections' and 'ASP.NET Applications\Requests In Application Queue'.
    Determines whether CPU saturation, memory exhaustion, file descriptor limits, or request queue overflow is causing the service to refuse new connections even if the process is running.
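
The log review in step 2 and the dependency check in step 4 can be sketched in Python as below. This is a minimal illustration under stated assumptions, not part of the original procedure: the function names `count_503_per_minute` and `check_tcp` are hypothetical, the parser assumes Nginx or Apache access logs in the default common/combined format, and the 3-second timeout is an arbitrary starting point.

```python
import re
import socket
import time
from collections import Counter

# Assumes the common/combined access log format, for example:
# 10.0.0.5 - - [12/Mar/2024:14:03:27 +0000] "GET / HTTP/1.1" 503 197
ACCESS_LINE = re.compile(r'\[(?P<ts>[^\]]+)\] "[^"]*" (?P<status>\d{3}) ')


def count_503_per_minute(lines):
    """Return a Counter mapping 'dd/Mon/yyyy:HH:MM' to the number of 503s."""
    per_minute = Counter()
    for line in lines:
        match = ACCESS_LINE.search(line)
        if match and match.group("status") == "503":
            # Keep 'dd/Mon/yyyy:HH:MM' and drop the seconds and time zone.
            per_minute[match.group("ts")[:17]] += 1
    return per_minute


def check_tcp(host, port, timeout=3.0):
    """Attempt a TCP connect; return (reachable, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution failures.
        return False, time.monotonic() - start, str(exc)
```

Passing an open access log file to `count_503_per_minute` gives a per-minute tally that helps pin down the time of onset, and `check_tcp("<db-host>", 1433)` mirrors the `Test-NetConnection` check for SQL Server. A successful TCP connect only shows that the port is accepting connections; it does not prove the dependency is healthy at the application level.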

Resolution path

Prevention

Tools

References

http, 503, service-unavailable, iis, application-pool, nginx, apache, reverse-proxy, load-balancer, web-server, availability, incident-response, event-id-5002