Practice Labs platform unavailable

Resolved

Operational

Started almost 3 years ago

Lasted 8 days

Affected

Practice Labs

www.practice-labs.com

Lab Services

Intranet.practice-labs.com

Lab RDP

Azure Labs

Updates

Resolved
September 30, 2021 at 8:27 AM
This incident will now be marked resolved. We have observed continued stability in the platform since Saturday 25th. We will continue to work with our vendors on the root cause and their recommendations; any future changes arising from that work will be communicated through our Maintenance Schedule notifications on this status page.
Monitoring
September 28, 2021 at 9:09 AM
Update
September 28, 2021 at 9:09 AM
Daily update:

The platform remains stable and our hardware vendors are providing feedback on problem resolutions. All solutions received so far are being evaluated and, where possible, tested offline in secondary environments without affecting the production platform.

Next update due Sept 29.
Monitoring
September 27, 2021 at 10:33 AM
Update
September 27, 2021 at 10:33 AM
Please note that we implemented stability improvements and have been monitoring the platform over the weekend without issue. We are continuing to work with our vendors to isolate root causes for the platform instability. This incident will remain open in a monitoring state with at least daily updates. All services are available and operational.
Monitoring
September 24, 2021 at 8:17 PM
We are continuing to work on a fix for this incident. The system has now recovered and we are monitoring our systems closely.
Investigating
September 24, 2021 at 7:45 PM
We are currently investigating this incident and collecting data on the issue. We will send another update in 30 minutes.

Next update expected 20:45 BST 20:15 UTC
Monitoring
September 24, 2021 at 4:03 PM
Update
September 24, 2021 at 4:03 PM
All services are now operational in the platform, including screenshots from the DRT01 data centre. This incident remains in a monitoring state whilst we pursue hardware failures with our vendors. The system will remain under close monitoring and the appropriate teams remain on standby.
Monitoring
September 24, 2021 at 2:39 PM
We implemented a fix for the accessibility issues and are currently monitoring the result.
Identified
September 24, 2021 at 2:07 PM
Update
September 24, 2021 at 2:07 PM
We are aware of a repeat of a previous incident affecting core platform access. The engineering team are working to restore access to the platform.
Identified
September 24, 2021 at 12:11 PM
We are aware of issues logging into the platform right now. We apologise for this disruption. Our engineering team are working on this situation.
Monitoring
September 24, 2021 at 9:57 AM
Update
September 24, 2021 at 9:57 AM
This incident remains in a monitoring state; there are no further updates at this time. Technical teams continue to work with vendors to establish root causes. We still have workarounds in place, and separate emergency maintenance notifications will be distributed as and when changes to them are expected. There is currently no ETA on any restorative changes to the environment.

Next update expected 1400 UTC
Monitoring
September 24, 2021 at 12:48 AM
Services have been restored and the platform is accessible. This incident will be escalated to our vendor for investigation and root cause analysis.

Taking screenshots from lab devices in the DRT01 data centre still requires switching to the HTML client via the Settings menu (disable the Connect toggle), as per previous incident notes.
Identified
September 24, 2021 at 12:28 AM
Update
September 24, 2021 at 12:28 AM
We are continuing to work on a fix for this incident. We are in the process of restoring services at this time. We are hoping to complete this process within 30 minutes.

Next update expected at 2:00 AM UK time (02:00 UTC)
Identified
September 24, 2021 at 12:16 AM
Update
September 24, 2021 at 12:16 AM
We are continuing to work on a fix for this incident. Our engineer is currently performing recovery actions.

Next update is expected at 1:45 UK time (01:45 UTC)
Identified
September 23, 2021 at 11:42 PM
Update
September 23, 2021 at 11:42 PM
The engineer has arrived on-site; we are awaiting a further update from them.

Next update expected at 1:15 AM UK time (01:15 UTC)
Identified
September 23, 2021 at 11:17 PM
Update
September 23, 2021 at 11:17 PM
We are continuing to work on a fix for this incident. Our engineer should arrive on-site within the hour to investigate the issue further.

Next update is expected at 12:45 AM UK time (00:45 UTC)
Identified
September 23, 2021 at 10:42 PM
Update
September 23, 2021 at 10:42 PM
The next update is expected at 12:15 AM UK time (00:15 UTC)
Identified
September 23, 2021 at 10:16 PM
We are continuing to work on a fix for this incident. We have sent a technical engineer on-site to investigate the issue further and collect more data on the incident.

Next update is expected at 11:40PM UK time (23:40 UTC)
Investigating
September 23, 2021 at 9:39 PM
We are aware of a major incident impacting access to the platform. We are convening teams to investigate this issue.
Monitoring
September 23, 2021 at 4:41 PM
Update
September 23, 2021 at 4:41 PM
Please note that our secondary data centre is now operational again. However, we are aware of issues capturing screenshots using the Connect RDP client. If you are allocated the DRT01 data centre (as shown in the top-level device menu bar of the lab device), please use the workaround mentioned in our previous incident update and switch to the HTML5 client to complete screenshots in the labs.

This issue remains in a monitoring state as we continue to work with our vendors to resolve our data centre connectivity in full.
Monitoring
September 23, 2021 at 2:03 PM
Update
September 23, 2021 at 2:03 PM
We continue to investigate the platform stability issues. We are aware of connection issues affecting RDP sessions. As a workaround, users who are facing issues can disable "Connect" and switch to HTML5 from the Settings menu, as documented here: https://help.practice-labs.com/practice-lab/settings-tab
Monitoring
September 23, 2021 at 10:52 AM
Update
September 23, 2021 at 10:52 AM
This incident remains under investigation. We continue to work with our vendors and development teams to identify root causes of open issues affecting the platform.

Next update due 1400 UTC
Monitoring
September 22, 2021 at 10:54 PM
Update
September 22, 2021 at 10:54 PM
The platform is now in a monitoring state. We are aware of an issue affecting screenshot capability with lab devices serviced from the DRT01 data centre as indicated in the device menu bar. This issue remains under investigation.

Next update expected Thursday, 0900 UTC.
Monitoring
September 22, 2021 at 9:25 PM
We implemented a workaround and Cisco labs are now available again. We are currently investigating an issue with taking screenshots from DRT01-sourced lab devices. This is the final health check item before full service restoration.
Identified
September 22, 2021 at 8:04 PM
Update
September 22, 2021 at 8:04 PM
We are continuing to work on a fix for this incident. We continue to investigate access issues with Cisco-based labs.
Identified
September 22, 2021 at 6:14 PM
Update
September 22, 2021 at 6:14 PM
We continue to work on restoring the secondary data centre lab capacity. There is no further progress or ETA to advise at this time. This continues to directly impact Cisco lab availability.
Identified
September 22, 2021 at 4:15 PM
Update
September 22, 2021 at 4:15 PM
Cisco labs remain unavailable; however, we are continuing to identify and implement a workaround to restore access.
Identified
September 22, 2021 at 1:44 PM
Update
September 22, 2021 at 1:44 PM
We have resumed lab services from our primary data centre. We are in the process of monitoring and validating overall platform health and are aware that Cisco labs are currently unavailable. We are continuing to investigate this.
Identified
September 22, 2021 at 1:30 PM
Update
September 22, 2021 at 1:30 PM
We have reverted services back to our primary data centre and are in the process of recommissioning lab servers. Access to Labs should be available within 30 minutes from this notification.
Identified
September 22, 2021 at 12:57 PM
Update
September 22, 2021 at 12:57 PM
Unfortunately, the situation is the same as in the last update.

Next update expected at 2:30 UK time (1:30 UTC)

Once again we sincerely apologise for the disruption this is causing.
Identified
September 22, 2021 at 12:17 PM
Update
September 22, 2021 at 12:17 PM
The platform is still currently unable to establish access to Lab Devices and the team are continuing to work on this.

Our core data centre is also in the process of restoration so teams are working across both streams to restore services as soon as possible.

Unfortunately, there is not a notable update to be provided at this time.

Next update expected at 1:45 UK time (12:45 UTC)

Once again we sincerely apologise for the disruption this is causing.
Identified
September 22, 2021 at 11:40 AM
Update
September 22, 2021 at 11:40 AM
The platform is still currently unable to establish access to Lab Devices and the team are continuing to work on this.

Our core data centre is also in the process of restoration so teams are working across both streams to restore services as soon as possible.

Next update expected at 1:15 UK time (12:15 UTC)

Once again we sincerely apologise for the disruption this is causing.
Identified
September 22, 2021 at 10:59 AM
Update
September 22, 2021 at 10:59 AM
We are continuing to configure the disaster recovery platform at this time. Unfortunately, we do not have a notable update to provide. Next update expected in 30 minutes on overall status. We sincerely apologise for the disruption caused.
Identified
September 22, 2021 at 10:32 AM
Update
September 22, 2021 at 10:32 AM
We are continuing to configure the disaster recovery platform at this time. Next update expected in 30 minutes on overall status.
Identified
September 22, 2021 at 10:01 AM
Update
September 22, 2021 at 10:01 AM
The Practice Labs disaster recovery site is now accessible and the team are provisioning lab services and RDP servers. Next update expected in 30 minutes. Please note that additional services have been identified as impacted by this incident.
Identified
September 22, 2021 at 9:29 AM
Please note that the Practice Labs platform is currently unavailable. This relates to a previous emergency maintenance window listed on this status page. We are currently in the process of activating our disaster recovery site, which will have a reduced lab title capacity. Please click this incident to see affected service details, such as Persistent Labs.

We will update this incident once the disaster recovery site is operational. The current ETA for platform access is 11:00 PM UK time.

We are working to restore the Primary Data Centre as quickly as possible to recover full services.

ACI Learning - Practice Labs platform unavailable – Incident details
