Trello is slow or unavailable
Incident Report for Trello
Postmortem

SUMMARY

On Thursday, March 24, 2022, between 15:56 and 16:12 UTC, Trello customers were unable to use the product. The event was triggered by a deployment misconfiguration. As the faulty deploy propagated across our server fleet, each server failed until the entire fleet was unresponsive. Our automated monitoring detected the incident within 1 minute, and we resolved it by correcting the misconfiguration and redeploying the fleet, which restored Trello for all users. The total time to resolution was about 16 minutes.

IMPACT

The impact lasted from Mar 24, 2022, 15:56 UTC to Mar 24, 2022, 16:12 UTC. During these 16 minutes, Trello was completely unavailable to all customers and to anything that depends on it.

ROOT CAUSE

The issue was caused by an error made while manually setting configuration information for Trello's deployments. Because of the resulting misconfiguration, Trello's servers could not fully deploy a new version of the code and failed to start.

REMEDIAL ACTIONS PLAN & NEXT STEPS

We know that outages impact your productivity. While we have a number of testing and preventative processes in place, this specific issue wasn’t identified because the error occurred during an optional, manual step in our deployment process.

We are prioritizing the following improvement actions to avoid repeating this type of incident:

  • Remove the manual configuration step in our deployment process.
  • Ensure that the Trello server startup process does not fail when encountering this class of misconfiguration.
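
As an illustration of the second action, the following is a minimal sketch of a startup routine that tolerates bad configuration values instead of failing to start. It is not Trello's actual code: the setting names (`port`, `worker_count`), the defaults, and the `load_config` function are all hypothetical, chosen only to show the general idea of validating each value and falling back to a known-good default.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical defaults; the report does not describe Trello's real settings.
DEFAULTS = {"port": 8080, "worker_count": 4}


@dataclass
class ServerConfig:
    port: int
    worker_count: int
    warnings: List[str]


def load_config(raw: dict) -> ServerConfig:
    """Build a config from raw values, replacing invalid entries with defaults."""
    values = {}
    warnings = []
    for key, default in DEFAULTS.items():
        candidate = raw.get(key, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        valid = (
            isinstance(candidate, int)
            and not isinstance(candidate, bool)
            and candidate > 0
        )
        if valid:
            values[key] = candidate
        else:
            # Record the problem instead of raising, so startup can continue.
            warnings.append(f"invalid {key}={candidate!r}; using default {default}")
            values[key] = default
    return ServerConfig(
        port=values["port"],
        worker_count=values["worker_count"],
        warnings=warnings,
    )
```

In a setup like this, the server would log the collected warnings and keep serving with default values. Whether falling back is safer than rejecting the deploy before it reaches the fleet depends on the setting, so this sketch covers only one possible approach.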

We apologize to Trello customers who were impacted by this incident; we are taking immediate steps to improve Trello's availability.

Thanks,

Atlassian Customer Support

Posted Mar 31, 2022 - 18:31 EDT

Resolved
This incident has been resolved. Thanks for your patience!
Posted Mar 24, 2022 - 12:35 EDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Mar 24, 2022 - 12:24 EDT
Investigating
We've noticed that Trello is slow or unavailable. This affects both the web and mobile apps.

Our engineering team is actively investigating this incident and working to bring Trello back up to speed as quickly as possible.

We'll keep you posted with further updates on this page.
Posted Mar 24, 2022 - 12:03 EDT
This incident affected: Trello.com.