Yes, the data blowout at Microsoft's Danger data center should have everyone taking a hard look at the way they do data recovery and disaster preparedness. The problem is not so much the outage as the data loss, data loss caused by backups that did not work. This is not a cloud computing issue just because it happened in a data center; it is a disaster recovery issue.
The press is once again talking about how bad the cloud is for computing, and once again I find myself pointing out that cloud computing is a platform, just like the platform in a company's data center. The only difference between the two is that one is hosted locally and one is hosted somewhere else, outside the company.
Just because the computers and operating systems reside outside the company does not mean that a company can skimp on backups, skimp on security, and skimp on the million and one other things that have to happen regardless of where user data or company data resides. The problem here is that the backups failed, and this is hardly the first time backups have failed since they were invented. This is really a matter of poor data management going mainstream, not a problem with cloud computing.
Cloud computing does mean a lower cost of operations, but it does not mean that the standards and practices of good data backup, good disaster recovery, and good system administration, including regular testing of the backups, can be ignored. There is no good reason those backups could not have been tested by spinning up a handful of new VMs and restoring to them. Backup testing is an essential part of good information security and good disaster planning. The fact that this was not done is a procedural issue, not a cloud computing issue.
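That kind of restore testing is easy to automate. Here is a minimal sketch, in Python, of the sort of check any shop could run on a schedule: restore a backup archive into a scratch directory and verify it against a checksum manifest. The paths and manifest format below are hypothetical examples, and a full test of the kind described above would go further and boot a fresh VM from the restored data, but even this much would catch a backup that cannot be restored.

```python
#!/usr/bin/env python3
"""Minimal backup restore test: restore an archive into a scratch
directory and verify every file against a checksum manifest.
The paths and manifest format here are hypothetical examples."""

import hashlib
import tarfile
import tempfile
from pathlib import Path

BACKUP_ARCHIVE = Path("/backups/userdata-2009-10-05.tar.gz")  # hypothetical backup archive
MANIFEST = Path("/backups/userdata-2009-10-05.sha256")        # lines of "checksum  relative/path"


def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore() -> bool:
    """Extract the backup into a temp dir and compare checksums; True if everything matches."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(BACKUP_ARCHIVE, "r:gz") as archive:
            archive.extractall(scratch)

        failures = 0
        for line in MANIFEST.read_text().splitlines():
            expected, _, rel_path = line.partition("  ")
            restored = Path(scratch) / rel_path
            if not restored.exists() or sha256(restored) != expected:
                failures += 1
                print(f"MISMATCH: {rel_path}")

        print(f"Restore test complete: {failures} failure(s)")
        return failures == 0


if __name__ == "__main__":
    raise SystemExit(0 if verify_restore() else 1)
```

Run something like this from cron or a build server every night and a dead backup shows up as a failed job the next morning, not as a headline months later.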
While it is bad that this happened, what it points to is that we have hit a crisis point in how companies approach IT, and it may be driven directly by the cost-cutting measures put in place because the economy is so poor. Cutting IT costs can be done, but we eventually pay a price, because things do not get done when we are overworking our IT staffs. The recent parade of outages, crashes, and other computing problems suggests we may have cut too far, and it is time to hire people so that we do not burn them out and so that we have that extra margin of safety when it comes to running our systems.
With this in mind, it is time to start using the standards and practices that companies have developed over the last few years and ensure that companies are completely recoverable in the event of a major outage. The recent spate of failures, from Ma.gnolia to the Fisher Data Center fire and now Microsoft's Danger system, suggests that maybe we have gone too far, and it is time to hire some new IT staff.
(Cross-posted @ IT Toolbox)