Trello is slow or unavailable
Incident Report for Trello
Postmortem

On September 28, 2020, between 14:55 and 16:13 UTC, Atlassian customers using Trello may have experienced slowness or unavailability in both the web and mobile apps.

The addition of new routes to our load balancing tier caused some of our load balancer CPU cores to become saturated at 100% utilization. As a result, nearly half of all requests to Trello returned errors. Due to a configuration in our monitoring, we were not alerted to early indicators of this problem; however, Amazon CloudWatch monitoring did alert us within 9 minutes that several of our load balancers were unhealthy. We mitigated the issue by reconfiguring our load balancers to use additional CPU cores. We have also substantially optimized the process by which new routes are added to the configuration, which has lowered overall CPU usage.
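
As an illustration of how saturation on a few CPU cores can escape an alert, the sketch below compares a check on host-wide average utilization with a check on individual cores. The sample values, thresholds, and function names are hypothetical and chosen only for illustration; this is not a description of the monitoring configuration involved in this incident.

```python
from typing import List, Sequence


def mean_utilization(per_core: Sequence[float]) -> float:
    """Average CPU utilization (percent) across all cores."""
    return sum(per_core) / len(per_core)


def saturated_cores(per_core: Sequence[float], threshold: float = 95.0) -> List[int]:
    """Indices of cores at or above the given utilization threshold."""
    return [i for i, used in enumerate(per_core) if used >= threshold]


# Hypothetical sample: two cores pinned at 100%, six mostly idle.
sample = [100.0, 100.0, 12.0, 8.0, 10.0, 9.0, 11.0, 10.0]

# An alert on the host-wide average (assumed threshold: 80%) stays silent,
# because the idle cores pull the average down.
avg_alert = mean_utilization(sample) >= 80.0

# An alert on any single saturated core fires.
core_alert = len(saturated_cores(sample)) > 0

print(f"average={mean_utilization(sample):.1f}% avg_alert={avg_alert}")
print(f"saturated cores={saturated_cores(sample)} core_alert={core_alert}")
```

In this hypothetical sample the average is 32.5%, so an average-based alert would not fire even though two cores have no headroom left, while the per-core check reports both of them.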

We know that outages impact your productivity. We deploy our changes progressively to avoid broad impact, but in this case our load balancers did not perform as expected. Moving forward, in addition to the fixes described above, we have implemented improved monitoring, alerting, and oversight for load balancer metrics to minimize the blast radius of breaking changes to our environments.

We apologize for any inconvenience this may have caused. Please let us know if there are additional details we can provide.

Posted Oct 06, 2020 - 16:09 UTC

Resolved
Trello's Engineering team has resolved the problem. As always, please reach out via https://trello.com/contact if you're still having any trouble!
Posted Sep 28, 2020 - 16:57 UTC
Monitoring
Engineering has a fix in place, and a full browser refresh should get things back up and running. If you're still having any trouble, please don't hesitate to reach out via https://trello.com/contact!
Posted Sep 28, 2020 - 16:29 UTC
Update
Engineering is actively investigating the problem and is making progress. We'll post updates here as we have them and will get Trello back up and running ASAP!
Posted Sep 28, 2020 - 16:08 UTC
Investigating
Trello is currently slow or unavailable.

Our engineering team is actively investigating this incident and working to bring Trello back up as quickly as possible.

Users affected by this incident may notice that Trello is slow or completely unavailable in both the web and mobile apps.

We will update this page as we have additional information.
Posted Sep 28, 2020 - 15:16 UTC
This incident affected: Trello.com.