Back to overview
Downtime

ECR Image Failure

Dec 04 at 12:44am UTC
Affected services
API health

Resolved
Dec 04 at 01:19am UTC

Incident resolved

Created
Dec 04 at 12:44am UTC

Incident Summary

We experienced a service disruption lasting approximately 35 minutes. The issue was identified and resolved the same evening.

Impact

Some users experienced temporary service unavailability during the incident window.

Resolution

The service was restored by deploying a corrected build, which resolved the underlying issue and returned the system to normal operation.

Root Cause (High-level)

The disruption was caused by an infrastructure configuration issue related to how application artifacts were retained. Under specific conditions, this led to the service being unable to restart correctly.

Mitigation & Prevention

  • Configuration has been updated to prevent recurrence.
  • Ongoing work continues to strengthen monitoring, alerting, and recovery procedures.