With all of the talk about the reliability, or lack thereof, of SaaS and Cloud-based applications, I thought I would write a series on designing applications to be Resilient and Highly Available. The series sort of started with this post: “It’s Inadequate Design That Lets Systems Fail, Not Whether They Are SaaS or Deployed in The Cloud”.
As any of you who have read my bio are aware, I have spent most of my career designing very large, high-volume, high-performance applications for the world’s largest financial institutions. In these systems High Availability and Reliability are key, as the systems I have been involved in designing carry trillions of dollars of transactions each day. In the Financial Markets world, downtime can also cost millions of dollars per minute. We have been center stage in the evolution of technology and design best practice when it comes to performance and reliability. We have gone from just using a robust mainframe with hot-swap hardware and assuming it stays up, to high-performance distributed applications handling millions of transactions per second in stateful applications (much harder to make HA than stateless Web apps), where the time from failure to detection and takeover by a hot standby can be as little as 7 milliseconds.
The articles that follow in this series represent my personal opinion on how this is done. It is by no means the only way to do it, and I am sure others will have other opinions.
Topics will tentatively be the following:
- The Evolution Of Reliability and High Availability
- Guaranteeing No Loss Of Data
- Designing For Disaster Recovery
- Designing For Maximum Uptime In A Distributed World
- High Availability in a High Volume Transactional Environment
Other topics will be considered based on feedback, user requests, or if something just pops into my head. So if you have a particular question or topic you would like answered, just ask, and if it is something I feel I can write about, I will.
We will start in the next article in the series with a brief discussion of The Evolution Of Reliability and High Availability.
(Guest post by Paul Michaud, Global Executive IT Architect for the Financial Markets sector at IBM. Paul blogs @ Technology Musings)