Please be advised, the incident Sunday morning UTC where labs were not accessible from either ATL or CDC locations has been successfully resolved on 2024-02-04 11:30am UTC.
What went wrong
An automated maintenance task overran, causing locks on the production database. Subsequently the database failed over and due to large trans log growth, we were unable to recover in a timely fashion.
Who was impacted
All Users were unable to login to the Practice Labs platform. It’s possible that activity from users already logged into the platform may not be complete.
Why it went wrong
Automated task was introduced to production, executing a trans log clean-up and index defragmentation task. The size of the database caused the index to rebuild and trans log reduction jobs to overrun causing locks in the production database.
How did we fix it
We began a database replication from last known working backup and built a physical node with twice the resources whilst new primary is building.
This incident is now closed. We appreciate your patience and understanding during this process.