When several seemingly unrelated threads on TechMeme discuss the same thing, you know they are onto something big. Today it’s data: where you store it and how accessible and safe it is. And no, today’s safety discussion isn’t about protecting it from intruders: it’s about whether you, the rightful owner, can access it safely.
Solid-State, Hard-disks and the Cloud
Computing Without a Whirring Drive is an interesting article in The New York Times about how hard-disk industry executives are worrying about their future with the advent of netbooks, smartphones and other devices. Some data may be on SSDs in these gadgets, the rest in the Cloud – and today’s teenagers increasingly don’t care where it is, they trust it will always be available.
Which operating system is best for solid-state drives? – asks ComputerWorld, ironically on the same day a new forecast predicts Netbook SSD usage may fall in 2009. Whichever way the trend goes, hard-disk execs don’t really have to lose a lot of sleep: data has to be stored somewhere, even in the cloud. In fact, if you buy into IDC’s Storage Paradox (I don’t), we’ll soon create more data than we can store.
But whether you store it locally or in the Cloud, how certain are you that you can always access your data? Not only computer-generated data, but your music, photos, videos – even paper. The paperless office was a popular phrase that remained largely a dream for decades, but it’s now happening for real. Hardly any new information is generated on paper, so we are finally willing to digitize old stuff and shred the paper originals. Soon all our information will exist only electronically. But there’s a problem, as discussed over @ Technium:
Storage Media Decay
The storage medium itself can decay. Turns out that paper is much more stable over the long term than most digital media. Magnetic surfaces flake, peel, shatter. And the supposed durable CDs and DVDs aren’t very stable either.
The proposed solution: move your data regularly, probably at least once every 5 years. For most of us this has never been a problem with computer-generated data, as we keep replacing old computers with new ones, copying data over and over again. But do you have digital archives of really old data (rarely accessed documents, music, photos, etc.), typically stored on CDs and DVDs? They may not be perfectly readable after a few years.
We don’t know what the natural movage respiration cycle is for digital media yet since it is still very new, but I suspect the cycle is much shorter than we think. I would guess it is 5 years. No matter what digital format you have your precious stored on, you should expect to move it onto new media in five years — and five years after that forever!
Move it, move it, move it.
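If you script your own "movage", it is worth checking that every copy actually matches its original, since a copy from a decaying disc can succeed while delivering damaged bytes. Here is a minimal Python sketch of that idea. The function names `sha256_of` and `migrate` are my own, and it assumes the new location is not inside the old one; treat it as an illustration, not a backup tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(old_root: Path, new_root: Path) -> list[Path]:
    """Copy every file under old_root to new_root.

    Returns the source files whose copy did not match the original.
    Assumes new_root is not located inside old_root.
    """
    failed = []
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        target = new_root / source.relative_to(old_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(source) != sha256_of(target):
            failed.append(source)
    return failed
```

Calling `migrate(Path("/media/old_dvd"), Path("/media/new_disk/archive"))` (both paths are placeholders) returns an empty list when everything was copied intact. Note that this only proves the copy equals what the old medium returned today; it cannot tell you whether the old medium had already lost data.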
For some of us the solution will be keeping everything on large hard-disks, syncing it between multiple machines, backing it up locally or online. But there is some naivete regarding online backups, as illustrated by this comment to the Technium post (emphasis mine):
i keep all my data on the built-in hard drive of my laptop. no dvds, no external drives. this way i can always be sure the physical media is still working. whenever i run out of disk space i buy a larger hard disc (and one for backup of course) and copy everything over.
in addition i push the really important stuff to amazon s3 regularly – in the cloud there are no physical media. oh wonderful world.
“In the Cloud there are no physical media” – really? Information still has to be stored somewhere, Cloud or local. When we move it to the Cloud, we’re simply pushing the responsibility onto others, trusting they do a better job than we would. But let’s face it, Cloud services have not been around long enough to face the issue of storage decay, and until they start talking about it, we don’t really know how they safeguard rarely accessed old data.
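Until providers tell us more, one thing you can do yourself is keep a list of checksums for your archive and recheck it from time to time, wherever the files live. The sketch below is one way to do that in Python; `build_manifest` and `changed_files` are names I made up, and for brevity it reads each file fully into memory, which is an assumption that will not suit very large files.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return recorded paths that are now missing or whose content differs."""
    current = build_manifest(root)
    return [name for name, digest in manifest.items() if current.get(name) != digest]
```

Build the manifest once while the data is known to be good and store it somewhere separate, for example as a JSON file. Running `changed_files` later against a local copy, or against files downloaded back from the Cloud, lists anything that has silently changed or disappeared.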
Format Problems Where You Least Expect Them
Then there’s the issue of formats and continuity. I’m not even talking about media formats (last year I found my old University Thesis on a 5 1/4” floppy disk – I keep it as a souvenir, but can no longer access it). No, let’s assume you moved all your stuff from floppies, VHS tapes, etc.: you are savvy enough to take care of media format conversions before they become obsolete. I keep all digitized (formerly paper) documents in PDF format, hoping that will last forever.
But here’s the hidden trap: your application data may become inaccessible, even while you keep on upgrading to the most recent release of the application itself. A few examples:
- I found old 3.5” diskettes with some MS Works information from the late 80s. The current MS Works version that comes pre-installed on many laptops can’t open it.
- Service Pack 3 to Microsoft Office 2003 blocked several older formats, including their very own – yes, that means Word, Excel and PowerPoint docs you may very well have on your hard disk. This came without warning at the time of installing SP3, resulting in somewhat of an uproar once it was discovered. Some Microsofties resorted to name-calling, but others came to their senses and Microsoft released quick tools to re-enable the blocked formats.
- Microsoft Money users may be in trouble. I have a lot of financial data in Microsoft Money and prior to that in Quicken files. Both applications used to recommend you keep the data files small by archiving earlier years. Every time you “upgrade” Money your current data file is upgraded to the new format – but what happens to the archive files? If you’re on Vista, you’re out of luck: Money’s import/conversion routine is incompatible with Vista, despite what the documentation states. Read the gory details here.
The list could probably go on and on. The point is that you’re in danger of losing access to data where you expect it the least: when you’re a “good customer” upgrading to new releases, and think you are safe, since all your data is created by the same application.
Digital continuity is important, and not something you can take for granted. Whether you take care of it yourself or outsource it to a provider “in the Cloud”, make sure your data is:
- moved periodically (physical preservation)
- updated to currently readable formats.
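For the second item, a simple starting point is an inventory of files in formats you no longer trust. The Python sketch below walks an archive and groups such files by extension. The extension list is purely my illustrative assumption (file types commonly associated with MS Works, Microsoft Money and Quicken), and `files_needing_conversion` is a made-up name; you would maintain your own list.

```python
from collections import defaultdict
from pathlib import Path

# Illustrative assumption: extensions the owner has decided to treat as at risk.
AT_RISK_EXTENSIONS = frozenset({".wks", ".wps", ".mny", ".qdf"})


def files_needing_conversion(
    root: Path, at_risk: frozenset[str] = AT_RISK_EXTENSIONS
) -> dict[str, list[Path]]:
    """Group files under root by extension, keeping only at-risk extensions."""
    found = defaultdict(list)
    for path in sorted(root.rglob("*")):
        suffix = path.suffix.lower()  # compare case-insensitively
        if path.is_file() and suffix in at_risk:
            found[suffix].append(path)
    return dict(found)
```

The result only tells you where to look. Actually converting the files still means opening them in an application that can read them, ideally before the next upgrade takes that ability away.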