We confirm that the rollback resolved the problem. We are now looking into a permanent fix and will resubmit the failed jobs soon. As explained before, some jobs will have to be resubmitted manually.
We are very sorry for the disruption.
Apr 16, 19:02 UTC
All sites have been rolled back. We have observed no transcoding failures since then. We will continue to monitor the situation.
We will re-trigger transcoding for jobs that have failed since Friday. However, in some cases we won't be able to re-trigger the jobs, and those jobs will have to be resubmitted manually.
Apr 16, 18:39 UTC
We identified a regression in the transcoding process between versions 22.214.171.124 and 126.96.36.199. The mitigation measures we tried to put in place did not give the expected results.
We are working on a fix, but in the meantime, all sites will be rolled back to 188.8.131.52 in the next few minutes. There will be no downtime during the rollback.
Apr 16, 18:03 UTC
Our initial workaround did not fully mitigate the issue, so we are now investigating other avenues for a fix. Thank you for your patience.
Apr 16, 15:55 UTC
We have identified with high confidence a specific use case where clients uploading media using Shotgun API versions prior to v3.0.33 may experience intermittent transcoding failures. Please update to the latest version of the Shotgun API to avoid this issue: https://github.com/shotgunsoftware/python-api/releases
We are working to deploy a fix for clients using older versions of the Shotgun API as soon as possible.
Apr 16, 14:19 UTC
Our investigation of recent transcoding issues affecting some clients is ongoing and we are confident that we are close to identifying the root cause. Thank you for your patience.
Apr 16, 13:40 UTC
We are currently investigating an issue where transcoding jobs fail intermittently.
Apr 16, 11:34 UTC
We are currently investigating this issue.
Apr 16, 11:34 UTC