Jira, Confluence, and JSM Automation rules triggered or scheduled to run between 10am and 5pm UTC on March 17, 2025, and between 1pm UTC on March 18 and 12:30am UTC on March 19, were delayed by 1.5 hours on average and by up to 12 hours. The incident was triggered by the deployment of a monitoring library upgrade, which slowed the execution of all rules. This reduced the throughput of rule processing, causing rule executions to back up and be delayed. The change degraded rule performance only during periods of high traffic. The incident occurred over two time windows. In the first window, the backlog of rule executions began to drain 4 hours 15 minutes after the first alert, and all rules had caught up 7 hours after the first alert. In the second window, the backlog began to drain 1 hour 50 minutes after the first alert, and all rules had caught up 10 hours after the first alert.
The root cause of both incident windows was identified, and a change to address it was deployed by 10am UTC on March 19, 2025.
Customers’ rules were delayed by 1.5 hours on average and by up to 12 hours during both incident windows. A very small number of rules encountered the following error: “The rule actor doesn't have permission to view the event that triggered this rule.” This error occurred because an internal Atlassian service rate-limited requests when throughput increased during our mitigation efforts. These rules did not complete successfully. All other rules eventually ran successfully.
The issue was caused by a change introduced to an Atlassian monitoring library, which significantly degraded the Automation rule engine's performance. The performance degradation prevented Automation's system from keeping pace with processing throughput, causing a back-up of executions and subsequent customer rule delays.
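The backlog dynamic described above can be sketched with a toy queue model (illustrative only; the rates and function name are hypothetical, not Atlassian's actual figures): once per-rule execution slows enough that the service rate falls below the arrival rate at peak load, the queue of pending executions grows for as long as peak traffic lasts, and every newly triggered rule inherits that accumulated delay.

```python
# Toy queue model (hypothetical rates): backlog grows whenever rule
# executions arrive faster than the degraded engine can process them.

def backlog_after(minutes, arrivals_per_min, serviced_per_min, start=0):
    """Return queue depth after `minutes` of constant arrival/service rates."""
    backlog = start
    for _ in range(minutes):
        # Backlog can shrink only to zero, never below it.
        backlog = max(0, backlog + arrivals_per_min - serviced_per_min)
    return backlog

# Normal operation: service keeps pace, so no delay accumulates.
print(backlog_after(60, arrivals_per_min=100, serviced_per_min=120))  # 0

# Degraded engine at peak: 100 executions/min arrive but only 80 complete,
# so one hour of peak traffic leaves 1200 executions queued.
print(backlog_after(60, arrivals_per_min=100, serviced_per_min=80))   # 1200
```

Off-peak, the same slowdown is invisible because the reduced service rate still exceeds the arrival rate, which is consistent with the issue only manifesting under high traffic.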
We know that outages impact your productivity. While we have a number of testing and preventative processes in place, this specific issue did not manifest until our systems were at peak load.
The change to the Atlassian monitoring library that was the root cause of the incident has been fixed.
We are prioritizing the following improvement actions, designed to avoid a repeat of this type of incident:
We apologize to customers whose automation rules were impacted during this incident; we are taking immediate steps designed to improve the platform’s performance and availability.
Thanks,
Atlassian Customer Support