The Triage Manual: Technical Guides for IT Emergencies
P1 · Active Directory

Failed Domain Controller — recovery without making it worse

A DC has failed (hardware, OS, NTDS corruption, or network isolation). The danger isn't the failure — it's the recovery shortcut that breaks the rest of the forest.

Indicators

Likely causes

Diagnostic steps

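  1. Decide: repair this DC, or seize its FSMO roles and remove it with a metadata-only cleanup? Deciding factors: time available, criticality of the FSMO roles it holds, and how long ago it last replicated successfully (a DC that has been silent for longer than the forest's tombstone lifetime should not be brought back online)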
  2. Run dcdiag /v /c /e and repadmin /showrepl /errorsonly from a healthy DC — establish the actual fault before action
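  3. On the failed DC, check free disk space on the volume that holds NTDS.dit, then review the Directory Service event log, especially event IDs 1644 (expensive LDAP searches), 2087 and 2088 (DNS lookup failures affecting replication), and 1311 (KCC cannot build a complete replication topology)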
  4. If repair is viable: stop the NTDS service, run esentutl /g to check database integrity, and use esentutl /p to repair only as a last resort (it discards damaged pages, so back up the .dit first)
  5. If demotion is viable: run ntdsutil metadata cleanup from a healthy DC, then remove whatever the failed DC left behind: its DNS/SRV records, computer object, and NTDS Settings object
  6. Never restore a DC from a VM snapshot/checkpoint into the production network — USN rollback poisons replication. Always demote and rebuild
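
Step 6 is the rule people are most tempted to skip, so it helps to see the mechanism. The following is a toy model in Python, not how AD is implemented: the DC class, its fields, and simulate() are all hypothetical names, and real AD tracks replication state per invocation ID with up-to-dateness vectors rather than the single high-water mark used here. Under those simplifying assumptions, it shows why a reverted DC's new changes never reach its partners.

```python
from dataclasses import dataclass, field


@dataclass
class DC:
    """Hypothetical stand-in for a domain controller (illustration only)."""
    name: str
    usn: int = 0                                 # highest USN this DC has issued
    changes: dict = field(default_factory=dict)  # usn -> change description
    seen: dict = field(default_factory=dict)     # partner name -> highest USN pulled

    def write(self, what: str) -> None:
        """Originate a local change and stamp it with the next USN."""
        self.usn += 1
        self.changes[self.usn] = what

    def pull_from(self, partner: "DC") -> list:
        """Request only the partner's changes above the USN already held."""
        mark = self.seen.get(partner.name, 0)
        new = [c for u, c in sorted(partner.changes.items()) if u > mark]
        # The high-water mark never moves backwards.
        self.seen[partner.name] = max(mark, partner.usn)
        return new


def simulate() -> dict:
    dc1, dc2 = DC("DC1"), DC("DC2")

    for i in range(50):
        dc1.write(f"change-{i}")
    checkpoint = (dc1.usn, dict(dc1.changes))    # VM checkpoint taken at USN 50

    for i in range(50, 100):
        dc1.write(f"change-{i}")
    before = dc2.pull_from(dc1)                  # DC2 now holds DC1 up to USN 100

    # Checkpoint revert: the USN counter rewinds, but DC1's identity is
    # unchanged, so DC2 has no reason to reset its high-water mark.
    dc1.usn, dc1.changes = checkpoint[0], dict(checkpoint[1])
    dc1.write("new-user-after-restore")          # stamped with reused USN 51
    after = dc2.pull_from(dc1)                   # DC2 asks for USN > 100

    return {
        "pulled_before_restore": len(before),
        "pulled_after_restore": after,
        "dc1_usn": dc1.usn,
        "dc2_mark_for_dc1": dc2.seen["DC1"],
    }


if __name__ == "__main__":
    print(simulate())
```

In this model the post-restore change exists only on DC1: DC2 already believes it has everything up to USN 100, so nothing stamped 51 through 100 is ever requested again, and no error is raised on either side. A supported restore avoids this by giving the restored database a new invocation ID, which a hypervisor checkpoint revert does not do on its own.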

Resolution path

Prevention

Tools

References

active-directory · domain-controller · ntdsutil · replication · fsmo