On Wednesday, July 22nd, 2020, from 6:25 to 7:10 PDT, Shotgun suffered a partial outage that made some clients' sites intermittently unavailable.
The incident happened during a routine release of Shotgun. The root cause was a database utilisation spike that prevented some sites from initializing as expected.
Affected clients experienced up to 45 minutes of interruption to the availability of their Shotgun site.
We have scaled up our database components to make them more resilient to unexpected connection spikes.
We are also exploring ways to improve our database load distribution and deployment methods to further mitigate the risk of this type of incident in the future.
Our monitoring is also being revised in light of this incident to help us identify this type of issue earlier.