
The Inevitable Cloud Outage: 5 Key Essentials to Safe Guard Your Application
A while back, I was starting up an EC2 instance on the AWS cloud when it entered an endless restart loop. All the application deployment efforts we’d made (installation and service configuration) over two weeks just went down the drain. So we called support. The support rep redirected us to his team leader who simply […]

Disaster Recovery for Amazon EC2 in a Single Click
In my journey through the cloud I often come across great new initiatives. Interestingly, although the cloud is a pure revolution, terms such as SLA, TCO and ROI remain valid, and new methodologies and techniques are being introduced to support them in the cloud. I recently met Uri Wolloch, the founder of N2W […]

CloudBees Adds HA To Jenkins Enterprise Edition
CloudBees (previous CloudAve coverage), the PaaS company behind the Open Source Jenkins project, today announced that they are offering a high availability plugin for their Jenkins Enterprise product. They made this announcement at the Jenkins User Conference in New York City. This plugin should help deliver better uptime and improved governance/oversight, along with increased productivity. […]

Eucalyptus Bets On High Availability With Their Exclusive Enterprise Focus
Eucalyptus Systems (previous CloudAve coverage), the cloud platform player with a core enterprise focus, today announced the forthcoming release of Eucalyptus 3.0. With this new release, Eucalyptus is using the “high availability” mantra to lure enterprises at a time when the buzz in the cloud world is all about the Amazon outage. More than the buzz surrounding the outages, Eucalyptus […]

Can We Take Availability Off Cloud Concerns List?
One of the concerns cited by people who believe in traditional ways of computing is the issue of service availability in the public clouds. For reasons known only to psychiatrists, they associate availability with the presence of the software inside their organizational boundaries. If we talk to enterprise users who use an email system hosted on-premise, […]

The Need For Speed
Yesterday was a busy day for me. It started at 4:30 AM when I had to do an interview with a reporter from Bloomberg who covers the European Stock Exchanges. There was then coverage of the goings-on with some of my clients in the Wall Street Journal, Financial Times, and many […]

How To Build an SOA Based, High Performance, Scalable and Reliable Twitter on Steroids
Over the past few days I have been having some issues with my Twitter account. Beyond the well-known pauses in the service, outages, etc., there are some lesser-known but more annoying problems with Twitter search. It turns out that many accounts don’t show up in search at all. Therefore, if you are one […]

The Evolution Of Reliability and High Availability
Over the last few decades, the technologies we used and the approaches we took to make our systems reliable have undergone a steady evolution. In some cases the technology has just gotten more reliable through quality control at the hardware level (consider an Intel Blade today compared to my 1986 Zenith 8088 that I wrote […]

High Availability Series: Series Outline
With all of the talk about the reliability, or lack thereof, of SaaS and Cloud-based applications, I thought I would write a series on designing applications to be Resilient and Highly Available. The series sort of started with this post: “It’s Inadequate Design That Lets Systems Fail, Not Whether They Are SaaS or Deployed in […]

It’s Inadequate Design That Lets Systems Fail, Not Whether They Are SaaS or Deployed in The Cloud
There have been many high-profile outages lately which have caught people’s attention. These failures are being used as an argument for why critical systems should remain internal and not be deployed as SaaS or in the Cloud. Some of these outages included Google App Engine’s performance issues in early July, Rackspace’s loss of […]