MySQL Bug Caused a Brief Transmission Interruption

Hey Blippers,

This is just a quick notice to let you know what caused the downtime today. We encountered a couple of little problems in quick succession that led to a site outage of somewhere between 30 minutes and 1 hour, depending on how often you hit refresh. Everything is up and running again. Sorry to take away your music and force you to focus on work on a Friday. It was all Yan’s fault though: he walked in this morning with a jolly excitement and exclaimed, “Happy Friday!”, completely jinxing the day. Thanks, Yan, really smooth.

So here’s what went down. First, we had unusually heavy traffic on one of our web servers, causing it to run out of memory and lock itself in an infinite death spiral. Luckily, it was caught by our awesome support staff at Contegix (best supported hosting evar) before it became a real problem, and they quickly reconfigured the server. Threat averted, right? Not quite.

All of a sudden the main database started locking up and wouldn’t allow any new connections, which meant no blipping, no giving props, and no logging in if you weren’t logged in already. To troubleshoot that problem, we swapped the master database for the understudy, the database waiting in the wings. Under normal circumstances this would go unnoticed by anyone except the people backstage. But this time the same thing started happening to the backup database. Crap.

Fast forward through checking logs for DoS (no), checking the data layer logic (fine), finding another site with the same problem on a different database (so it’s not Blip), scratch head, Google, MySQL support forum, repeat, find bug, find fix, reboot. Breathe…

So there you go, that was our exciting Friday morning. There’s still a little residual database stuff to take care of, but everything should be just about back to normal. In the meantime, some of you finally got to see our special little error page. Yes, it’s just for fun. Blip.fm wasn’t really rebooting right on your screen, but thanks for the emails!

Have a great weekend, everyone. I think it’s almost beer o’clock around here. If you see me out tonight, I’ll buy you a cold man soda (or woman soda) for depriving you of Blip.fm for what I’m sure felt like an eternity.

Update: There are still some problems with the data on one of the databases. We’re going to seed it from the main database now. There might be some slowness throughout today (Saturday), but it should all be cleared up in a couple of hours. Also, there have been reports of props-to-give being lower than expected. I will recalculate the props-to-give once the above is completed, so don’t worry, you’ll get them all back!

Update: Things have been stable for the past few hours. I just recalculated the props-to-give for all blippers and gave back any that were missing. If you still see any residual strangeness, please email support@blip.fm and we’ll help you get it figured out.