Issues accessing the Practice Labs platform

Updates

Resolved
September 27, 2023 at 11:10 AM
Symptoms

Users attempting to access the Practice Labs platform were unable to log in.

What went wrong

Our database cluster had a spike in memory usage, which caused the primary node to fail over to one of the secondary nodes. During this period the cluster became unresponsive.

Who was impacted

All users attempting to log in to the Practice Labs platform.

Why it went wrong

Memory exhaustion in our database cluster.

How we fixed it

We have upgraded the memory in all 4 nodes in our database cluster. One node is running with less RAM than we have allocated because of a minor hardware fault, which is being addressed by our maintenance provider.

With the additional RAM, our database cluster is now operating with reduced processing times, in some cases up to 60% faster. We have monitored it closely for 7 days and are now comfortable ending the monitoring period.
Update
September 20, 2023 at 2:03 PM
We have completed our emergency maintenance to upgrade our CDC database servers as part of the remediation plan from yesterday's outage. We will continue to closely monitor the platform.
Update
September 20, 2023 at 9:04 AM
We are performing emergency maintenance at 9am UTC to upgrade our CDC database servers as part of the remediation plan from yesterday's outage. This should not impact users, but we are monitoring closely.
Monitoring
September 19, 2023 at 1:44 PM
Our database cluster's primary node automatically failed over to one of its secondary nodes, which restored user access to the platform.

We are investigating further corrective actions and will continue to monitor. We appreciate your understanding and patience during this incident.
Identified
September 19, 2023 at 1:15 PM
We are currently investigating an issue where users are unable to access the platform either by logging in or launching a lab from another platform.

We apologize for any inconvenience this has caused.
Investigating
September 19, 2023 at 12:15 PM
We are investigating an issue where users are unable to access the platform either by logging in or launching a lab from another platform.

ACI Learning - Issues accessing the Practice Labs platform – Incident details
