According to a note on the Smugmug Blog, a critical piece of infrastructure failed at Smugmug this morning, taking down pretty much everything connected to it. At the time of writing, logins are not working and the site is in read-only mode.
A critical part of our infrastructure failed this morning and its backup is having issues as well. Our engineers are looking into it and will restore service as quickly as possible. Sorry for the inconvenience.
07:26 PDT – we’ve identified the problem and were able to fix it. now to bring everything else back online, sorry for the delay
07:45 – we’re in read-only, verifying everything is OK before we put everything back
07:50 – sorry, spoke too soon. a few components still need attention before even read-only. we’re working as fast as we can, sorry for the continued delay
Source: Smugmug Blog
No, this is not a problem with cloud computing; this could have happened anywhere. Still, it is interesting to see what happens when a critical chunk of infrastructure fails and the backups are not working correctly. I am a rabid fan of Smugmug, and even pay for the service, so an outage there basically brings me to a standstill, because I use it as the picture host for many of the things I do. This morning, for example, I am building web pages and wanted Smugmug to host the pictures for them.
While Smugmug continues to repair the service, the outage shows one of the more interesting problems for anyone using any kind of remote hosting: what happens when the remote host fails? Normally we do not have alternative sites for pictures that fail to load, nor alternative paths for any of the other media we use. The same applies to YouTube: if a video gets taken down, it leaves an ugly hole in the web page. That goes for just about any content that relies on remote rather than local hosting. Nothing in the IMG tag specification says there has to be a failover link for the image, let alone how to fail over from one image to another image or to some other remotely hosted media file.
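One common workaround, since the IMG tag has no built-in failover, is to hang a small `onerror` handler on the image that swaps in a second source when the first fails to load. Below is a minimal sketch in Python that generates such a tag while building a page. The function name `img_with_fallback`, the `data-fallback` attribute, and the example URLs are all made up for illustration; this is one possible approach, not anything Smugmug provides.

```python
from html import escape


def img_with_fallback(primary: str, fallback: str, alt: str = "") -> str:
    """Build an <img> tag that switches to `fallback` if `primary` fails to load.

    Hypothetical helper: the name and the data-fallback attribute are
    illustrative choices, not part of any standard or hosting service.
    """
    # The browser fires onerror when the image cannot be loaded. Clearing the
    # handler first avoids an endless loop if the fallback fails as well.
    handler = "this.onerror=null;this.src=this.dataset.fallback;"
    return (
        f'<img src="{escape(primary, quote=True)}" '
        f'data-fallback="{escape(fallback, quote=True)}" '
        f'alt="{escape(alt, quote=True)}" '
        f'onerror="{handler}">'
    )


if __name__ == "__main__":
    # Both URLs are placeholders: a remotely hosted picture and a local copy.
    print(img_with_fallback(
        "https://photos.example.com/header.jpg",
        "/images/header.jpg",
        alt="Page header",
    ))
```

This only helps if a copy of the picture is kept somewhere other than the remote host, and it relies on JavaScript being enabled in the reader's browser.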
Smugmug will be back online soon, but when we rely on remote infrastructure, we are more beholden to those remote sites than we might want, or ever intended, to be.
(Editor’s note: Smugmug is up again now. The image used in this post is from Louis Gray, who wrote Simultaneous Downtime for SmugMug and Twitter. Ironically, that’s a post from November 2008, but he could have written it today…)
Related articles:
- Twitter Down, Again, Just Like the Old Days
- Twitter’s Non-Failwhale Fail
- Simultaneous Downtime for SmugMug and Twitter
(Cross-posted @ TechWag)