our first hacker
Our website was taken offline by a hacker for about three hours this morning.
I'd just released some important new functionality to the site and tested everything out to my satisfaction. We now offer sample rings for people to borrow—this allows our customers to try before they buy.
But then a couple of hours later the site was behaving sluggishly and occasionally refusing requests. I didn't know if it was just a lousy 'net connection, but I could log into the host, and everything seemed to check out, more or less.
Then the site became completely inaccessible, and an investigation showed me that the application server had chewed up all of the system's resources. It had done so because someone was sending a flood of bogus requests, and they were clearly intrusion attempts. There was no breach, but the machine had become unusable. Our tormentor had, in effect, launched a denial-of-service attack.
Rebooting our virtual server to clear up the memory problem, I set about improving the server configuration. It wasn't really difficult to sort out what I'd done to allow this to happen: rather than setting up the webserver to ignore any request it didn't know how to deal with, I had it passing those requests on to the application server, which was then doing a lot of work to handle each of them as if it were a legitimate application function.
The real problem is that the application server chews up a LOT more resources than the webserver, so with every bogus request I was asking the application server to expend the same resources it would if, for instance, someone were buying a product.
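The principle is simple enough to express in webserver terms: let the webserver answer (or refuse) everything it can, and only proxy the routes the application actually serves. Here's a rough sketch, using nginx syntax purely as an illustration; the paths, port, and hostname are made up, not our actual configuration:

    server {
        listen 80;
        server_name example.com;
        root /var/www/app/public;

        # Static assets are served straight from the webserver.
        location ~ ^/(images|stylesheets|javascripts)/ {
            expires 1d;
        }

        # The home page and the application's known paths get proxied.
        location = / {
            proxy_set_header Host $host;
            proxy_pass http://127.0.0.1:8000;
        }
        location ~ ^/(store|cart|orders|samples)($|/) {
            proxy_set_header Host $host;
            proxy_pass http://127.0.0.1:8000;
        }

        # Anything else gets a cheap 404 from the webserver rather than
        # a full, expensive trip through the application server.
        location / {
            return 404;
        }
    }

With something along these lines in place, the bogus requests die at the front door at almost no cost.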
So how did I come to this crappy design? By following every example I found on the 'net. This is one of the areas where adopting "leading edge" technologies really bites you in the ass. Too many people are advocating too many poorly-thought-out things.
I think I'll write up what I've learned and post the results to this website. I realize it will be just another voice in the wilderness, but honestly, at this point that's all the rails/ruby world seems to have.
Spending time with a production server is always an opportunity to learn, and as it happens I sorted out two other technical issues that I'd been meaning to look into. We're now handling outbound email in a much saner fashion, which gives us better performance, better system resource allocation, and (again) better security. I was also able to get the new sample-ring ordering under SSL (a silly oversight, but one that only lasted a few hours).
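For the record, the SSL fix amounts to a webserver-level redirect that pushes the ordering pages onto HTTPS; in the same illustrative nginx terms (again, the path is made up):

    # Added to the port-80 server block: bounce the ordering pages to HTTPS.
    location /samples {
        return 301 https://$host$request_uri;
    }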
Three good things that happened today:
1. ended the day with a much more secure, better-performing website
2. got to do some hands-on techie stuff which is always fun
3. fixed a long-standing secondary technical oddity in how we handle email that had really been bothering me