Today Basecamp suffered the worst outage / slowdown that it has had in a very long time. We're very sorry that this disrupted your day. We know that you depend on Basecamp and any downtime, no matter how rare, is a serious problem. That's why we want to share all the details with you, so we can be as transparent as possible and you can trust that we're doing everything we can to prevent this from happening again.
If you don't care about the technical details, the summary is that a server went down and our redundant backup server wasn't operational as fast as we expected it to be. No data was lost. We're making sure that both issues are taken care of and will not be repeated. If you do care about the technical details, read on.
It all started when we ran what should have been an innocent diagnostics tool to check the configuration of some hardware on the main database server. This tool wasn't so innocent after all and somehow managed to crash the server. (We're still investigating the root cause to find out exactly what happened.)
This shouldn't have been a problem. We run a big, beefy slave database that's supposed to be ready at a moment's notice. It has all the latest data, so even if the main machine is nuked from orbit, we don't lose any customer data and can go straight back to trucking. Or so we thought.
The problem is that it's been way too long since we last had to fail over to the slave server, and in the meantime we've grown a ton. So while the slave database server has all the latest data and is just as beefy as the main machine, its read cache was cold because it isn't used to serve live requests. That means it had to build up the appropriate cache after we switched over to it, which took way longer than we anticipated (last time there was much less data) and made everything painfully slow while it happened.
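For readers who want to see the cold-cache effect in miniature, here is a toy simulation in Python. It is only a sketch under made-up assumptions, not a model of the real database: the `LRUCache` class, the page counts, the cache capacity, and the 90/10 split between "hot" and "everything else" reads are all invented for illustration.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache keyed by a hypothetical page id."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def read(self, page_id):
        """Return True on a cache hit, False on a miss (a slow disk read)."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return True
        self.pages[page_id] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return False


def make_traffic(rng, n, hot_pages=500, total_pages=100_000):
    """Assumed workload: about 90% of reads hit a small hot set of pages."""
    return [
        rng.randrange(hot_pages) if rng.random() < 0.9 else rng.randrange(total_pages)
        for _ in range(n)
    ]


def hit_rate(cache, requests):
    """Feed the requests through the cache and return the fraction that hit."""
    hits = sum(cache.read(page) for page in requests)
    return hits / len(requests)


def compare(seed=1):
    """Return (cold_hit_rate, warm_hit_rate) for the same burst of live reads."""
    rng = random.Random(seed)
    warmup = make_traffic(rng, 20_000)
    live = make_traffic(rng, 1_000)

    cold = LRUCache(capacity=1_000)  # standby that has served no reads yet
    warm = LRUCache(capacity=1_000)  # server that has been taking traffic
    hit_rate(warm, warmup)           # warm it up; the result is ignored

    return hit_rate(cold, live), hit_rate(warm, live)


if __name__ == "__main__":
    cold_rate, warm_rate = compare()
    print(f"cold cache hit rate: {cold_rate:.0%}")
    print(f"warm cache hit rate: {warm_rate:.0%}")
```

Running the script prints the two hit rates. The cold one should come out well below the warm one, because every first touch of a page is a miss that has to go to disk. That is the same effect described above, just at toy scale.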
The solution is to make sure that the slave server sits with a pre-heated cache at all times, so it's ready to take over from the master immediately. There are a number of options available for this and we're looking at all of them to pick the best one.
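As an illustration of what one such option could look like, here is a minimal Python sketch that replays a sample of read queries against the standby so its cache sees roughly the same pages as the master. This is an assumption for illustration only, not the approach that was chosen: `warm_standby`, `is_read`, the `run_on_standby` callable, and the `sample_every` parameter are all hypothetical names, and the check for reads is deliberately naive.

```python
def is_read(query):
    """Naive check: treat only statements starting with SELECT as reads."""
    return query.lstrip().upper().startswith("SELECT")


def warm_standby(queries, run_on_standby, sample_every=10):
    """Replay every Nth read query against the standby.

    `queries` is any iterable of SQL strings (for example, taken from a
    query log) and `run_on_standby` is a caller-supplied function that
    executes one query on the standby. Both are placeholders here.
    Returns the number of queries replayed.
    """
    seen_reads = 0
    replayed = 0
    for query in queries:
        if not is_read(query):
            continue  # never replay writes against a replica
        if seen_reads % sample_every == 0:
            run_on_standby(query)
            replayed += 1
        seen_reads += 1
    return replayed


if __name__ == "__main__":
    sample_log = [
        "SELECT * FROM projects WHERE id = 1",
        "UPDATE projects SET name = 'x' WHERE id = 1",
        "SELECT * FROM messages WHERE project_id = 1",
    ]
    # Stand-in for a real database call: just print what would be run.
    count = warm_standby(sample_log, print, sample_every=1)
    print(f"replayed {count} read queries")
```

Sampling keeps the extra load on the standby small, at the cost of a cache that is only approximately as warm as the master's. How much to sample would depend on how quickly the standby needs to be at full speed after a switch.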
Once again, we're really sorry for this. It was very painful to watch Basecamp be unavailable or slow for that long when we knew what we had to do to prevent this in the future. Please accept our apologies. Thanks for being a Basecamp customer.




