The Importance of Archiving

-by Errin

Did you know that the media for the feature film Avatar required 52 petabytes of storage space?

Me neither.

Did you know that there was such a thing as a petabyte until just a second ago?

OK, I’ll confess.  I’d never heard about petabytes myself until I sat in on the Archive panel at last week’s SF Music Tech Summit.  Feeling like a bona fide member of the professional geek squad with my laptop perched on my knees, I quickly Googled the unfamiliar term, and learned the following:

There are 1024 gigabytes in a terabyte.
There are 1024 terabytes in a petabyte.
There are roughly 1 million gigabytes (1,048,576, to be exact) in a petabyte.
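
For the arithmetic-minded, here is a quick sketch of those conversions in Python.  It assumes the binary (1024-based) convention used in the figures above, and the variable names are my own, purely for illustration:

```python
# Binary (1024-based) units, matching the figures quoted above.
GB_PER_TB = 1024
TB_PER_PB = 1024

# Gigabytes in one petabyte: 1024 * 1024.
gb_per_pb = GB_PER_TB * TB_PER_PB
print(gb_per_pb)  # 1048576, i.e. roughly 1 million

# Avatar's reported 52 petabytes, expressed in gigabytes.
avatar_pb = 52
avatar_gb = avatar_pb * gb_per_pb
print(avatar_gb)  # 54525952
```

In other words, Avatar's media would come to about 54.5 million gigabytes.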

And then I scrolled down and saw this written in huge block text:

A PETABYTE IS A LOT OF DATA.

No kidding.

I also learned that 1 petabyte is the equivalent of 20 million four-drawer filing cabinets filled with text, or 13.3 years of HD-TV video (about 58,292 movies).  20 petabytes is the amount of data that Google processes on any given day.  And the entire written works of mankind, from the beginning of recorded history, in all languages, take up only 50 petabytes of space.

Wow.  It’s odd to consider history as something so compressible, but there you have it.  By this time next year they’ll probably be selling 50-petabyte flash drives, and you can carry the entire written works of mankind around in your pocket.

I was a little embarrassed not to have known what a petabyte was, but as an archivist in the TV and travel industries, I never had to deal in storage capacities that exceeded a couple of terabytes.  Of course, much of TV’s archives are still stored on tape.  The animated programs that I used to work on were laid off to tape at the end of the series, and the hard drives were repurposed for the next program.  Times have changed.  I don’t even want to get started on exabytes, zettabytes and yottabytes, but you should know: they’re out there.

On the archiving panel sat Frederic Lieberman, who was representing the Grateful Dead Archives at UC Santa Cruz, Brewster Kahle of the Internet Archive, John Spencer of BMS/Chace LLC, Mike Wells of Mike Wells Mastering, Elizabeth Cohen of Cohen Acoustical, and moderator Stephen Hart, of NARAS.  It quickly came to light that they had differing opinions about archiving practices.  But what they had in common was a shared frustration with the lack of an archiving standard.

“How many of you are storing collections in your basement?  Or in your attic?”  Hands went up around the room.  “We all know that it’s logical to make multiple copies of our digital assets, but are you storing your copies in a separate location?”  There were a few sheepish faces in the crowd.

Beyond the risks of fire and flood, of insufficient storage facilities and weather damage, the panel lamented the general misunderstanding of basic archiving principles.  Frederic Lieberman told a story about an ethnomusicologist in Bali who contacted him with this urgent question: “My tapes got some fungi on them, so I wiped them down with alcohol and now they won’t hold any sound.  What do I do?”

“She’d wiped the oxide off the tapes,” he said, “and made them unusable.  So even the most basic archiving information is not yet common knowledge to most people.  It needs to be taught.”

I’m guessing that story predated the petabyte, but he still made a very good point.  Not only do basic archiving practices need teaching, but the notion of archiving at all has been slow to spread.  Lots of folks aren’t interested in the preservation of their materials because it costs more money, said Mike Wells.

“There is an ever-growing population that deals with digital media that has no concept of archiving, nor any inclination to do so, because it doesn’t fit into their budgets.”  I knew that to be true from personal experience.  When one of the animated programs I’d been working on at Nickelodeon was canceled, the producers had no interest in preserving the animated files, which represented two years’ worth of work.  The show was over, the staff was laid off and the budget didn’t allow for the man-hours necessary to neatly wrap the archive.  I’d had to argue persuasively to keep my job long enough to properly store all the data.

But although the show had ceased production, the brand was just kicking off.  Episodes were re-airing and products were being created around the characters of this children’s show.  The Creative Resources department was calling me daily, asking for imagery on which to base their new product designs.  Meanwhile, I’d been ordered to box up all our assets and stick them in a closet.  It was beyond frustrating because I understood that we were essentially putting a lid on the project’s potential.  But my job was over and I had to move on.

“Without preservation there is no monetization,” emphasized the panel.  “Opportunities don’t end with your deliverable.  If you want to repurpose or remonetize the output, you’ve got to know where your data is.  If you can’t find your asset, how will you make money off of it?”

I thought sadly of those boxes of tapes in that closet.

Every company, every individual, has their own methods of archiving.  Until some sort of standard can be introduced, we’re all on our own in deciding the best way to preserve our personal histories.  But the panel did give us some things to think about:

When conceptualizing how you’re going to create your archiving system, consider how many dependencies you’ll have, and how you’ll be able to access those dependencies in the future.  Are you using proprietary software?  Keep in mind that companies get bought, or go out of business.  What sort of media are you storing your files on, and will that media be accessible in the future?  Understand that almost all media collapses before its projected expiration date.  Backing up is essential.  Consider how easy or difficult your system makes it to locate your desired assets.  People are much more likely to archive when their assets are easily accessible.  Who wants to spend hours poring through that box of tapes in the closet?

When the panel ended I wondered if I’d been left with more questions than answers.  I wasn’t sure if I’d learned much more than I already knew.  But it did re-emphasize to me the importance of archiving.  “Archiving means preserving the past, as well as the present and the future,” said Frederic Lieberman.  It means the availability of educational resources such as lectures, papers, interviews, and conferences.  Or the enjoyment of cultural events like programs and concerts.  If we preserve these materials, then we can share them, and ultimately broaden our experiences.

“You choose not to archive at your own peril.”  Although I wouldn’t phrase it quite the way the panelists did, I strongly agree that our history is our duty to preserve.  We all have something to leave behind for the education and enjoyment of future generations.  I personally have a whole lot of data to drop on my grandkids.

Probably not as much as a petabyte, but still.