Customer impact: Public APIs and investigation tools experienced elevated error rates between 06:18 and 07:15 UTC. Enterprise customers running ECSD on-prem were unaffected by design.
What happened: From 06:18 to 07:15 UTC, an internal certificate that authenticates service-to-service traffic inside our cluster expired. New mesh-internal connections briefly failed authentication while existing connections kept working. We rotated the certificate, refreshed the affected workloads, and confirmed full recovery.
What we're doing: Migrating the affected certificate to automated rotation, adding cluster-wide expiry monitoring with multi-tier alerts, and publishing dashboards on our public ingress so we have a definitive customer-impact signal on day one of any future incident.