Playback Link Issues for VOD
Incident Report for Livepeer Studio
Postmortem

Overview

On Friday, November 8, 2024, our storage provider experienced a disruption in service across the North American and European regions. This affected our asset links, causing failures for video-on-demand (VOD), clipping, thumbnails, and livestream recordings.

Incident Details

The outage stemmed from an issue with our storage provider’s services. After their investigation, they reported that links for assets were inaccessible due to a broader regional service outage. Specific impacts included:

  • Link Accessibility

    • Links generated for VODs, clipping, thumbnails, and livestream recordings were inaccessible, returning an “Access Denied” error.
  • Asset Generation

    • No new assets (including clips, recordings, and thumbnails) were generated during the incident window.

Resolution

Once the storage provider resolved the regional outage, our services began to recover. However, we discovered that a high volume of requests was repeatedly hitting our livestream thumbnail link path. This request rate exceeded the provider’s limits, and the provider blocked our account as a result. We have implemented a solution to reduce the number of requests, restore livestream thumbnail functionality, and unblock our account.

Mitigation Steps

To minimize service impact and restore functionality, we:

  • Blocked access to the shared link path specifically for livestream thumbnails, allowing VOD, clipping, and recording paths to function as usual.
  • Planned a rate limit on the livestream thumbnail path, to be implemented once the request volume normalizes.
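
As an illustration of the rate limit described above, here is a minimal token-bucket sketch. It is not the implementation used in production; the path prefix `/thumbnails/live/`, the class `TokenBucket`, and the helper `should_forward` are hypothetical names chosen for the example. The idea is that only requests to the livestream thumbnail path are throttled, while VOD, clipping, and recording paths pass through untouched.

```python
import time

# Hypothetical prefix for the livestream thumbnail path (an assumption,
# not the real path used by the service).
THUMBNAIL_PREFIX = "/thumbnails/live/"


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate              # tokens refilled per second
        self.capacity = capacity      # maximum number of stored tokens
        self.tokens = capacity        # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True and consume a token if one is available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def should_forward(path, bucket, now=None):
    """Throttle only the thumbnail path; all other asset paths always pass."""
    if not path.startswith(THUMBNAIL_PREFIX):
        return True
    return bucket.allow(now)
```

A request handler would call `should_forward(path, bucket)` before contacting the storage provider and return an error such as HTTP 429 when it yields `False`. The `now` argument exists only so the behavior can be checked with fixed timestamps.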

Root Cause

  • Primary Cause: Storage provider outage in North America and Europe.
  • Secondary Cause: Excessive requests to the livestream thumbnail path following service restoration, triggering a rate-limit response from the provider.

Impact Assessment

  • Users Affected: Users attempting to access VOD, clipping, thumbnails, and livestream recordings received an “Access Denied” error.
  • Service Downtime: More than 12 hours, with the issue partially resolved upon blocking the livestream thumbnail path.

Follow-up Actions

  • Implement Rate Limiting

    • Deploy CDNs in front of the targeted links to manage and distribute incoming requests efficiently.
  • Monitoring and Alerts

    • Set up monitoring for high request volumes on specific asset paths to detect and address potential bottlenecks before they impact service.
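
The monitoring item above could be prototyped with a sliding-window counter per asset path. The sketch below is an assumption about one possible approach, not a description of the deployed alerting; `PathVolumeMonitor`, the window length, and the threshold are hypothetical and would need tuning against real traffic.

```python
from collections import defaultdict, deque


class PathVolumeMonitor:
    """Flag a path when its request count within a time window exceeds a threshold."""

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        # One queue of request timestamps per path prefix.
        self._hits = defaultdict(deque)

    def record(self, path_prefix, timestamp):
        """Record one request; return True if the path is now over the threshold."""
        hits = self._hits[path_prefix]
        hits.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while hits and hits[0] <= cutoff:
            hits.popleft()
        return len(hits) > self.threshold
```

In use, each incoming request would call `record` with its path prefix and arrival time, and a `True` result would trigger an alert so that a spike on one path is noticed before it reaches the provider's limits.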
Posted Nov 14, 2024 - 17:07 UTC

Resolved
This incident has been resolved.
Posted Nov 14, 2024 - 17:00 UTC
Update
The links to clips and VODs are up and running again. The path to thumbnails is still being worked on. We will continue working to resolve this as soon as we can.
Posted Nov 11, 2024 - 14:34 UTC
Identified
One of our service providers is currently addressing issues that are affecting the playback links used to access assets. They are aware of the impact on our service and are working to resolve it as soon as possible.
At the moment, links used to access assets are not working. We are in close communication with the provider and will post an update when we hear more.
Posted Nov 08, 2024 - 17:02 UTC