The Cloud: don’t let a spell of bad weather get you down

Easter of 2011 will probably be remembered as the time the Cloud went down. Bad as it was for Amazon’s EC2, the sky didn’t actually fall on anybody.
Bad for some, but not that many. How Amazon's EC2 Easter failure actually makes the Cloud safer...

Maybe the great bank holiday weather took many writers away for the weekend. But the number of “it’s all over for Cloud” rants was mercifully small.

So, what should be taken away from Amazon’s failure? What have we learnt? Well, it’s shown that intelligent system design is as vital for Cloud as anywhere else. And it’s shown how many “experts” can still talk through their back-ends…

Blowing away some cloudy thinking

The problem with any great idea is the hype and sheer hyperbole it often carries. Cloud’s a great example. All things to all men, or the devil’s work. Black or white.

Cloud isn’t a radical new technology. It’s just a way of hosting infrastructures. What sits in the Cloud is no different from what sits in a data centre.

All the failures, outages, grey-outs, black-outs and poor performance need to be mitigated in exactly the same way as you’d have to for that server rack in your basement. Anyone who thinks otherwise doesn’t understand system architecture design.

Take, for example, this tweet from a guy who leads a major bank’s innovation team:
I guess we may have to...

This isn’t from some out-of-touch bank guy – he works on cutting-edge bank stuff. I respect his views a lot; maybe he’s just saying what his bosses want to hear.

What galled me was that he’d forgotten that his own bank – one of the biggest – had suffered two major system outages of its own in the last few years.

Later, he pointed out that system failures do happen, negating his own tweet.

Maybe it’s just as well banks aren’t into Cloud just yet, with such thinking!

Planning for failure in the Cloud

Amazon’s EC2 is a great Cloud option. It’s cheap and flexible, but it has one drawback. Run the simple way, an EC2 deployment is single instance. It exists in one place only.

Single instance means all your eggs in one basket. So much to fail. And fail it will. That’s no good for a high-profile infrastructure. That’s not software as a service. That’s Software-as-a-Disaster-Waiting-to-Happen.

I can point to many Cloud services that use multi-homed, resilient infrastructures. SalesForce is perhaps the one everyone thinks of.

Clearly, high resilience is vital for any mission-critical corporate IT infrastructure. That goes for Cloud or a conventional on-premises design.
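For anyone who wants the idea in concrete terms, here is a rough sketch of what "not all your eggs in one basket" can look like from the caller's side: try each copy of a service in turn and only give up when every one of them has failed. It is an illustration only, not how Amazon or any particular provider does it. The function names, the `AllEndpointsFailed` exception and the `example.invalid` host names are all made up for the example, and the request function is a stand-in for a real network call.

```python
from typing import Callable, List, Sequence


class AllEndpointsFailed(Exception):
    """Raised when every replica has been tried and none answered."""


def call_with_failover(endpoints: Sequence[str],
                       request: Callable[[str], str]) -> str:
    """Try each endpoint in order and return the first successful reply."""
    errors: List[str] = []
    for endpoint in endpoints:
        try:
            return request(endpoint)
        except OSError as exc:  # ConnectionError and timeouts are OSErrors
            errors.append(f"{endpoint}: {exc}")
    raise AllEndpointsFailed("; ".join(errors))


def fake_request(endpoint: str) -> str:
    """Hypothetical stand-in for a network call; one site is 'down'."""
    if endpoint == "us-east.example.invalid":
        raise ConnectionError("site unavailable")
    return f"served from {endpoint}"


# Two hypothetical copies of the same service in different places.
REPLICAS = ["us-east.example.invalid", "eu-west.example.invalid"]

if __name__ == "__main__":
    print(call_with_failover(REPLICAS, fake_request))
```

With only the first host in the list, the same call raises `AllEndpointsFailed`, which is the single-basket case. A real design would also need health checks, data replication and timeouts, none of which this sketch attempts.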

To look at one failure and write off an entire generation of development is crass. But if it cuts through the hype and leads to real-world discussion about Cloud, then maybe every cloud has a silver lining after all.