ACI Learning - Integration issues with LTI 1.0, 1.1 and 1.2 – Incident details

Integration issues with LTI 1.0, 1.1 and 1.2

Resolved
Degraded performance
Started 10 months ago
Lasted about 2 months

Affected

Practice Labs

Degraded performance from 2:30 AM to 12:54 PM

api.practice-labs.com

Degraded performance from 2:30 AM to 12:54 PM

Updates
  • Resolved
    Resolved

    Symptoms
    Users integrating via LTI 1.0, 1.1 or 1.2 into api.practice-labs.com may have noticed an increase in loading times, eventually resulting in an HTTP 503 'Service Unavailable' status code when accessing Practice Labs or Assessments.

    What Went Wrong
    A package called "RestSharp" was identified as not correctly disposing of used TCP connections. This could lead to socket exhaustion, which eventually left the process unable to accept incoming connections.
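
    The failure mode described above can be illustrated with a minimal simulation. This is a sketch only: it does not reproduce RestSharp's behaviour or the LTI gateway's code, and every name in it (SocketBudget, Connection, leaky_requests, disposed_requests) is hypothetical. It shows how connections that are never closed use up a fixed socket budget, while connections that are always closed do not.

```python
class SocketBudget:
    """Simulates the finite number of sockets a process may hold open."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def acquire(self):
        if self.in_use >= self.limit:
            raise OSError("socket budget exhausted")
        self.in_use += 1

    def release(self):
        self.in_use -= 1


class Connection:
    """A simulated outbound connection that holds one socket until closed."""

    def __init__(self, budget):
        budget.acquire()
        self._budget = budget
        self._closed = False

    def close(self):
        # Closing twice is harmless; the socket is only released once.
        if not self._closed:
            self._budget.release()
            self._closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()


def leaky_requests(budget, count):
    """Open one connection per request and never close it."""
    leaked = []
    for _ in range(count):
        leaked.append(Connection(budget))  # held open, never released
    return leaked


def disposed_requests(budget, count):
    """Open one connection per request and always close it afterwards."""
    for _ in range(count):
        with Connection(budget):
            pass  # the request itself would happen here
```

    With a budget of three sockets, disposed_requests can run any number of times, whereas leaky_requests raises OSError on the fourth request.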

    Who Was Impacted
    Any customers or users using LTI 1.0, 1.1 or 1.2 integration methods from their LMS.

    Why It Went Wrong
    Our monitoring platform reported a large increase in established TCP connections on two separate occasions, which prompted the investigation into the cause. However, by that point the issue had already occurred and caused the process to crash.
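
    As an illustration of how a sudden rise in established TCP connections might be flagged from a series of samples, the sketch below compares each sample against the median of the samples before it. The function name, window size and factor are assumptions made for this example; they do not describe the monitoring platform's actual rules.

```python
from statistics import median


def find_spikes(samples, window=5, factor=3.0):
    """Return the indexes of samples that exceed `factor` times the
    median of the `window` samples immediately preceding them."""
    spikes = []
    for i in range(window, len(samples)):
        baseline = median(samples[i - window:i])
        if baseline > 0 and samples[i] > factor * baseline:
            spikes.append(i)
    return spikes


# Hypothetical connection counts sampled at regular intervals.
counts = [100, 110, 105, 95, 100, 102, 900, 1500]
print(find_spikes(counts))
```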

    How We Fixed It
    We have removed all references to the "RestSharp" package and replaced it with a more robust HTTP client and retry logic to help prevent this issue from recurring.
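
    Retry logic of the kind mentioned above is often written as a small wrapper with exponential backoff. The following is a minimal sketch under stated assumptions, not the implementation deployed in the LTI gateway: the function name, the default attempt count, the delays and the choice of OSError as the retryable error are all hypothetical.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.5,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is reached.

    The wait doubles after each failure (base_delay, 2 * base_delay, ...).
    The final failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt)
```

    Passing sleep as a parameter keeps the wrapper easy to exercise without real delays. Capping the number of attempts matters here: unbounded retries against a struggling service would add to the connection load rather than relieve it.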

  • Identified
    Identified

    A potential root cause has been identified within the LTI gateway: a package used in this application called "RestSharp" is known to consume excessive server resources, leading to instability.

    This package has since been removed and we are completing further testing in our Staging environment prior to an expected Production release on January 17th 2024.

  • Investigating
    Update

    An automated workaround has been implemented to help remediate potential recurrences of this incident. Further investigations into the root cause are continuing and updates will be provided as more information becomes available.

  • Investigating
    Investigating

    What is wrong

    Users of LTI v1.0, 1.1 and 1.2 integrations via api.practice-labs.com may notice an increase in loading times or, in certain instances, see an HTTP 503 'Service Unavailable' error when accessing Practice Labs or Assessments.

    Who is impacted

    Any customers or users using LTI v1.0, 1.1 or 1.2 integration methods from their LMS.

    Why is it wrong

    Our monitoring platform has confirmed two occurrences of this issue, overnight on consecutive days, during approximately the following time windows:

    Monday 20th November, 02:30am UTC - 07:30am UTC
    Tuesday 21st November, 02:00am UTC - 05:00am UTC

    How will we fix it

    We are continuing to investigate this issue and will provide updates as more information becomes available.