Multiple Atlassian services experiencing degraded performance

Incident Report for Trello

Postmortem

SUMMARY

On November 21, 2025, between 13:44 and 15:16 UTC, Trello customers were intermittently unable to view and update data on their boards. Customers may also have experienced issues authenticating with Atlassian products and creating new GitHub and Slack integrations.

The event was triggered by a bug in the software running our edge proxy fleet, which proxies customer traffic to Atlassian cloud services. The bug surfaced after we migrated the edge proxy fleet to hosts running an ARM CPU architecture, rather than the AMD64 CPU architecture they had previously been running. The migration impacted US East customers.

The incident was detected within 1 minute by our automated monitoring systems and mitigated by scaling up the fleet size, which returned Atlassian systems to a known good state. This was followed by a global migration of edge proxy fleet hosts back to AMD64 CPU architecture the following day.

IMPACT

During the impact window, US East customers intermittently could not view or update data in Trello.

The same underlying issue also impacted our Identity services and integrations with GitHub and Slack, meaning some customers had trouble signing in to Atlassian products or creating new integrations. At its peak, the incident impacted up to:

52% of new Trello network connections.
9% of new GitHub and Slack integrations.
8% of new Identity network connections.

ROOT CAUSE

The issue was caused by a change to CPU architecture from AMD64 to ARM on our edge proxy fleet. This triggered a bug that caused these instances to stall under high load and refuse up to 52% of new connections. As a result, some customers of the products above could not make new connections to Atlassian services and received CloudFront 504 gateway timeout error responses.

REMEDIAL ACTIONS PLAN & NEXT STEPS

We know that outages impact your productivity. While we deploy our changes progressively by cloud region to avoid broad impact, on this occasion, our pre-change load testing had not accurately reflected production loads.

As part of our response to this incident, and to help prevent recurrence, we rolled back all edge proxy fleets from ARM to AMD64 CPU architecture globally.

To minimize the impact of breaking changes to our environments, we plan to implement additional preventative measures such as:

Adding improved load tests into edge proxy fleet deployment pipelines to catch load-related bugs before deployment to production.
Adding alerts to our edge proxy fleet to catch rises in TCP connect times before customer impact.
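
As an illustration of the second measure, the sketch below shows one way a TCP connect-time alert could work. It is not Atlassian's monitoring implementation. The names measure_connect_ms and ConnectTimeAlert, the window size, and both thresholds are hypothetical values chosen for the example. It times new TCP connections to a proxy endpoint and raises an alert when a full window of samples shows either a high median connect time or too many refused or timed-out connections.

```python
import socket
import statistics
import time
from collections import deque
from typing import Optional


def measure_connect_ms(host: str, port: int, timeout_s: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if the
    connection is refused or times out."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None


class ConnectTimeAlert:
    """Rolling-window check over connect-time samples.

    All thresholds are illustrative assumptions, not values from the report.
    """

    def __init__(self, window: int = 20, max_median_ms: float = 50.0,
                 max_failure_ratio: float = 0.05) -> None:
        self.samples = deque(maxlen=window)
        self.max_median_ms = max_median_ms
        self.max_failure_ratio = max_failure_ratio

    def record(self, sample_ms: Optional[float]) -> None:
        """Store one sample; None represents a failed connection."""
        self.samples.append(sample_ms)

    def should_alert(self) -> bool:
        """Alert only once the window is full, to avoid noise at startup."""
        if len(self.samples) < self.samples.maxlen:
            return False
        succeeded = [s for s in self.samples if s is not None]
        failure_ratio = 1.0 - len(succeeded) / len(self.samples)
        if failure_ratio > self.max_failure_ratio:
            return True
        return bool(succeeded) and statistics.median(succeeded) > self.max_median_ms
```

A probe loop would call measure_connect_ms against each edge proxy host at a fixed interval, pass the result to record, and page when should_alert returns True. Tracking failed connections alongside latency matters here because a stalled proxy may refuse connections outright rather than accept them slowly.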

We apologize to customers whose services were impacted during this incident; we are taking steps to help improve the platform’s performance and availability.

Thanks,

Atlassian Customer Support

Posted Dec 08, 2025 - 00:13 EST

Resolved

The Trello performance degradation has been resolved. Intermittent errors on some of our other products have also been resolved.

The issue has now been resolved, and the service is operating normally for all affected customers.

Posted Nov 21, 2025 - 11:49 EST

Monitoring

The Trello performance degradation affecting some customers has been resolved.

A subset of customers may have experienced intermittent errors on some of our other products, but these should now be resolved too.

We'll continue to monitor closely to confirm stability.

Posted Nov 21, 2025 - 11:14 EST

Update

Multiple Atlassian services are experiencing degraded performance. We are investigating and will provide an update within the hour.

Posted Nov 21, 2025 - 10:30 EST

Investigating

Multiple Atlassian services are seeing outages, and we are investigating. We will provide an update within 60 minutes, if not sooner.

Posted Nov 21, 2025 - 09:55 EST

This incident affected: Trello.com, API, Atlassian Support - Support Portal, Atlassian Support Ticketing, and Atlassian Support Knowledge Base.