On April 25, 2025, between 18:18 and 18:33 UTC, Atlassian customers using Trello may have experienced service interruptions. The event was triggered by temporarily reduced capacity following a rollback deployment, with insufficient nodes to handle the load. Automated monitoring systems detected the incident within one minute and mitigated it by scaling up the deployment, which put Atlassian systems into a known-good state. The total time to resolution was about 15 minutes.
The impact occurred on April 25, 2025, between 18:18 and 18:33 UTC, and was limited to Trello. The incident disrupted service for Trello users, resulting in reduced functionality, slower response times, and errors when performing key actions such as loading boards and cards.
The root cause of the incident was a release rollback that left our compute nodes scaled below the capacity needed to handle the load.
If an issue is found during a deployment, we can roll back to a previous release. In this case, the rollback targeted a previous release that had already been scaled down. When the rollback took effect, that release did not have enough compute nodes available to handle the high traffic.
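The capacity gap described above can be illustrated with a minimal sketch of a pre-rollback check. This is not Atlassian's tooling: the names `Release`, `nodes_required`, and `plan_rollback`, the per-node throughput figure, and the 25% headroom are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical view of a release and its currently ready capacity."""
    name: str
    ready_nodes: int  # nodes able to serve traffic right now


def nodes_required(current_rps: float, rps_per_node: float, headroom: float = 0.25) -> int:
    """Return the node count needed to serve current_rps with a safety margin.

    headroom is an assumed fractional buffer (0.25 means 25% extra capacity).
    """
    if rps_per_node <= 0:
        raise ValueError("rps_per_node must be positive")
    return math.ceil(current_rps * (1 + headroom) / rps_per_node)


def plan_rollback(target: Release, current_rps: float, rps_per_node: float) -> dict:
    """Compare the rollback target's ready capacity against current demand."""
    needed = nodes_required(current_rps, rps_per_node)
    shortfall = max(0, needed - target.ready_nodes)
    return {
        "target": target.name,
        "needed": needed,
        "shortfall": shortfall,
        # Only shift traffic once the target release has enough ready nodes.
        "safe_to_shift_traffic": shortfall == 0,
    }


# Example: a previous release that was already scaled down to 4 nodes.
plan = plan_rollback(Release("previous", ready_nodes=4), current_rps=1000, rps_per_node=100)
```

In this sketch, a non-zero `shortfall` signals that the target release should be scaled up before traffic is shifted to it, rather than after the rollback has already taken effect.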
We know that outages impact your productivity and strive to avoid incidents like these.
We are prioritizing the following efforts as next steps:
We apologize to customers whose services were impacted during this incident; we are taking steps designed to improve the platform’s performance and availability.
Thanks,
Atlassian Customer Support