Some Confluence sites are failing to load
Incident Report for Confluence
Postmortem

SUMMARY

On June 22, 2022, from 01:08 AM UTC to 03:42 AM UTC, some customers using Bitbucket Pipelines, Confluence Cloud, Forge, and the Jira Cloud family of products (Jira Software, Jira Service Management, Jira Work Management) were impacted. Bitbucket Pipelines saw an increase in build failures, while Jira, Confluence, and Forge experienced performance and functionality degradation. The event was triggered by our internal Artifact Repository Manager becoming unavailable during a scheduled multi-availability zone disaster recovery test. Customers across all regions were affected. The incident was detected within two minutes by monitoring and mitigated by restarting the Artifact Repository service, which recovered the affected products. The total time to resolution was about three hours.

IMPACT

The overall impact occurred between June 22, 2022, 01:08 AM UTC, and June 22, 2022, 05:58 AM UTC and affected Bitbucket Pipelines, Confluence Cloud, Forge, and the Jira Cloud family of products (Jira Software, Jira Service Management, Jira Work Management). The outage of the internal Artifact Repository Manager caused scalability problems in these products and left us unable to build or deploy new versions of our services. As a result, performance and functionality were degraded for most of these products.

ROOT CAUSE

The issue was caused by an outage of the internal Artifact Repository Manager during the planned multi-availability zone disaster recovery test. As a result, the products listed above could not access the Docker images and other artifacts they needed to scale up, which caused partial degradation or complete unavailability of services for some customers. Restarting the internal Artifact Repository Manager caused downtime for that service but led to a successful recovery.

REMEDIAL ACTIONS PLAN & NEXT STEPS

We know that outages may impact your productivity. After the immediate impact of this outage was resolved, the incident response team completed a technical analysis of the root cause and contributing factors. The team has conducted a post-incident review to determine how we can avoid the impact of this kind of outage in the future.

We are prioritizing the following improvement actions to avoid repeating this type of incident:

  • We raised a critical issue with the vendor who provides our Artifact Management software to improve the application's resilience to availability zone failures.
  • We are working on improving our disaster recovery plan to be able to mitigate such incidents faster.
  • We are reviewing our test strategies to be able to catch similar issues in the early stages.

To minimize the impact of such incidents on our customers, we will implement additional preventative measures such as:

  • Development of a redundant caching mechanism for our platform system to improve the scalability and reliability of our products.
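As an illustration only, the general idea behind such a caching layer can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of our implementation: the names CachingArtifactClient, ArtifactUnavailable, and fetch_primary are hypothetical, the cache is a simple in-memory dictionary, and a ConnectionError stands in for the repository being unreachable.

```python
from typing import Callable, Dict


class ArtifactUnavailable(Exception):
    """Raised when an artifact cannot be served from any source."""


class CachingArtifactClient:
    """Hypothetical client that falls back to a local cache when the
    primary artifact repository cannot be reached."""

    def __init__(self, fetch_primary: Callable[[str], bytes]) -> None:
        # fetch_primary is an assumed callable that downloads an artifact
        # by name and raises ConnectionError if the repository is down.
        self._fetch_primary = fetch_primary
        self._cache: Dict[str, bytes] = {}

    def get(self, name: str) -> bytes:
        try:
            data = self._fetch_primary(name)
        except ConnectionError as exc:
            # Primary is unavailable: serve the last known copy if we have one.
            cached = self._cache.get(name)
            if cached is None:
                raise ArtifactUnavailable(name) from exc
            return cached
        # Primary answered: refresh the cache and return the fresh copy.
        self._cache[name] = data
        return data
```

With a layer like this, a service that has already pulled an artifact once could keep scaling from the cached copy while the repository is down. Artifacts that were never fetched would still fail, so a real design would also need cache warming, expiry, and redundancy for the cache itself.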

We apologize to customers whose services were impacted during this incident; we are taking immediate steps to improve the platform’s performance and availability.

Thanks,

Atlassian Customer Support

Posted Jul 07, 2022 - 23:54 UTC

Resolved
Between 2022-06-22 01:30 UTC and 2022-06-22 04:30 UTC, some Confluence Cloud sites failed to load. The issue has been resolved and the service is operating normally.
Posted Jun 22, 2022 - 05:40 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jun 22, 2022 - 04:35 UTC
Identified
The cause of this issue has been identified and our team is working to implement a fix. We will provide more details within the next hour.
Posted Jun 22, 2022 - 04:06 UTC
Investigating
We are investigating an issue with Confluence sites not loading that is impacting some Confluence Cloud customers. We will provide more details within the next hour.
Posted Jun 22, 2022 - 03:08 UTC
This incident affected: View Content, Create and Edit, Comments, Authentication and User Management, Search, Administration, Notifications, Marketplace Apps, Purchasing & Licensing, Signup and Mobile (iOS App, Android App).