The Triage Manual: Technical Guides for IT Emergencies
P2 · Cloud & Hybrid Infrastructure

Azure Site Recovery Replication Broken — RPO Breach or Health Critical

Azure Site Recovery replication health degrades to Critical or Warning, causing RPO to exceed the configured threshold and leaving the protected workload without a valid recovery point. This is typically caused by a Mobility Service agent version mismatch, process server health issues, or sustained high disk churn.
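To make the breach condition concrete, the following is a minimal sketch of the arithmetic: the current RPO is the time elapsed since the last recovery point, and a breach occurs when that exceeds the configured threshold. The function name `rpo_status` and the 15-minute threshold are illustrative assumptions, not values taken from the vault or from any Azure SDK.

```python
from datetime import datetime, timedelta, timezone


def rpo_status(last_recovery_point, rpo_threshold, now=None):
    """Return (current_rpo, breached) for one replicated item.

    last_recovery_point: timezone-aware datetime of the latest recovery point
    rpo_threshold:       timedelta configured in the replication policy
    now:                 optional reference time (defaults to current UTC time)
    """
    if now is None:
        now = datetime.now(timezone.utc)
    current_rpo = now - last_recovery_point
    return current_rpo, current_rpo > rpo_threshold


# Illustrative values only: a recovery point 45 minutes old
# against a hypothetical 15-minute threshold.
reference = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
current, breached = rpo_status(
    reference - timedelta(minutes=45), timedelta(minutes=15), now=reference
)
print(current, breached)  # prints: 0:45:00 True
```

The last recovery point time would be read from the replicated item's details in the vault; passing `now` explicitly keeps the check reproducible when comparing against a timestamp copied from the portal.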

Indicators

Likely causes

Diagnostic steps

  1. Recovery Services vault > Replicated items > select the VM: review replication health, error details, and the last recovery point time.
  2. Check the Mobility Service version: compare the agent version on the source VM against the version the vault currently recommends; if it is behind, update via Vault > Replicated item > Update Mobility Service.
  3. On the process server: check C:\ProgramData\ASR\home\svsystems\var\log\outbound.log and inbound.log for connection errors.
  4. Vault > Site Recovery Infrastructure > Process Servers: verify heartbeat status, CPU, memory, and cache disk free space (should be >30%).
  5. If churn is the suspected cause: run the ASR Deployment Planner against the workload to calculate the required bandwidth and determine whether a process server upgrade is needed.
  6. If replication is stuck: restart the 'Microsoft Azure Recovery Services Agent' and 'Microsoft Azure Site Recovery Process Server' services on the process server.
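Steps 3 and 4 can be partly scripted on the process server. The sketch below is an assumption-laden helper, not an ASR tool: it checks free space on a given volume against the 30% guideline from step 4 and scans a log file for lines that look like connection errors. The keyword list is a guess at what such lines contain, since the log format is not documented here, and the cache disk location must be supplied by the operator.

```python
import shutil

MIN_FREE_FRACTION = 0.30  # guideline from step 4: cache disk free space > 30%

# Assumption: connection problems surface as lines containing one of these
# words. Adjust after inspecting the real outbound.log / inbound.log content.
ERROR_KEYWORDS = ("error", "failed", "timed out", "refused")


def free_fraction(path):
    """Return the free fraction (0.0 to 1.0) of the volume holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def cache_disk_ok(path, minimum=MIN_FREE_FRACTION):
    """True when the volume holding `path` has more than `minimum` free."""
    return free_fraction(path) > minimum


def find_connection_errors(log_path, keywords=ERROR_KEYWORDS, limit=20):
    """Return up to `limit` of the most recent log lines matching a keyword."""
    hits = []
    # errors="replace" avoids crashing on bytes that are not valid UTF-8.
    with open(log_path, "r", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            lowered = line.lower()
            if any(keyword in lowered for keyword in keywords):
                hits.append(line.rstrip("\n"))
    return hits[-limit:]
```

On the process server this could be called as `find_connection_errors(r"C:\ProgramData\ASR\home\svsystems\var\log\outbound.log")` and `cache_disk_ok(<cache disk root>)`, where the cache disk root depends on how the server was installed. The portal remains the authoritative view of process server health; the script only gives a quick local reading.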

Resolution path

Prevention

Tools

azure · asr · site-recovery · disaster-recovery · replication · rpo · mobility-agent