David Cramer's Blog

Scaling Your Clouds

My post yesterday seems to have gotten all the cloud fanboys’ panties in a twist, so I figured I’d give them something else to rage about.

There were lots of claims that without the cloud you can’t scale, or you don’t have redundancy, or you can’t come up with the result of 2 + 2. I can’t even explain the level of ignorance I’ve seen come out of the woodwork.

So let’s clarify some things.

“The Cloud”

There are many definitions that float around for “the cloud”, and what it means, and more specifically what it’s supposed to do for you. When I talk about it, I’m not talking about you setting up hundreds of your own servers and virtualizing them. We do that too. I’m talking about the notion that there’s some mythical provider that is going to cater to your needs and you’re never going to have to worry about operational concerns.

There is nothing wrong with using Heroku, AWS, Dotcloud, or any of the hundreds of other cloud providers out there. They all provide you with some level of relaxed operational requirements. That said, you’re still restricted to whatever completely fucking shit hardware they decide is right for virtualization. I’m not talking about AWS so much, as they do offer reasonably sized instances, but you’re still restricted to what they’re willing to offer. You never have the option to order custom hardware.

Scale

A bunch of the internet hipsters on Hacker News and elsewhere seem to think that if you use the cloud, your application is going to magically scale by adding more servers to it. That may be true if you’re using MongoDB, but we don’t live in a fairy tale here and it will never work. There are very few systems that I’m aware of that can scale from one machine to tens to hundreds to thousands without a massive rearchitecture of how you use the system.

One of the first things I pointed out in my article was the fact that I had to spin up a large number of instances to handle a temporary workload. Too bad the database was bottlenecking on concurrent writes to the same row. You can’t ignore one important factor: I can’t just “spin up more database”. There are many amazing systems out there that are built on the notion of distributed data with the goal of some level of horizontal scalability (Riak, Cassandra). Even they do not let you spin up more servers and gain more capacity immediately.
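To put rough numbers on that bottleneck, here is a back-of-the-envelope sketch in Python. The timings (20 ms of application work per request, 5 ms spent holding the row lock) are made-up assumptions, not measurements from getsentry.com, and `max_writes_per_second` is a hypothetical helper written purely for illustration. It models only the simplest case, where every write has to update the same row and the database lets one writer through at a time.

```python
def max_writes_per_second(workers: int, app_ms: float, row_lock_ms: float) -> float:
    """Rough upper bound on writes/sec when every write hits the same row.

    All inputs are illustrative assumptions, not measured values.
    """
    # A single worker finishes one request every (app_ms + row_lock_ms) ms.
    per_worker = 1000.0 / (app_ms + row_lock_ms)
    # The row lock admits one writer at a time, so it caps total throughput
    # no matter how many workers are waiting on it.
    lock_ceiling = 1000.0 / row_lock_ms
    return min(workers * per_worker, lock_ceiling)


for workers in (1, 10, 100, 1000):
    print(workers, round(max_writes_per_second(workers, app_ms=20.0, row_lock_ms=5.0)))
# prints:
# 1 40
# 10 200
# 100 200
# 1000 200
```

Under these assumptions the ceiling is hit somewhere between one and ten workers. Every instance added after that just waits in line for the same row.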

Operations Complexity

Another argument that was brought up was the fact that I now personally have to deal with redundancy, monitoring, security fixes, OS upgrades, bringing up more servers, etc. Sure, that’s true. Except that it will cost me far less time than I would have spent trying to create a SQL database that can scale horizontally to infinity.

  • Redundancy is easy, especially at small scale. Cloud hosting is not going to solve your database redundancy for you.
  • Just because I’m hosting my own machines doesn’t mean I can’t use New Relic, or in my case Scout.
  • I don’t need to frequently bring up additional servers to handle the load because my actual hardware performs 2000 times better than my old virtualized hardware.
  • Security updates? OS reloads? It’s not like I’m compiling shit by hand, and through the convenience of configuration management this is unbelievably easy.

If you ignore the entirety of operations, you will never have any idea what’s going on when there’s a problem.

The Time/Cost Tradeoff

In my original post I stated it took me about three days to get everything into Chef, and have the new hardware ordered and online. Even if this was three full days of my time, I had just spent four days in a previous week trying to get the infinitely scalable cloud solution to perform well enough. Simple math, right? Four is more than three. Not worth it.

I built getsentry.com specifically with the goal of optimizing cost vs profit margins. This is the first month that it’s been profitable, and unless every single customer jumps ship at once, it’s unlikely that I will ever have to put my own money (excluding my time) into the project again.

tl;dr

Virtualized computing has many great uses, but you do not need it, especially if you’re just starting a business. If you want to try out a provider, don’t let me stop you. Make your own decisions. That said, you can be anyone at any random company and tell me you use the cloud successfully, and I’ll give you a pat on the back. I’ll then tell you that we rent servers successfully, and by we, I mean DISQUS.
