FOCUS: TECHNOLOGY – Digital memory threatened as file formats evolve
By ANICK JESDANUN Associated Press
NEW YORK — You may have recently discovered priceless photographs of your childhood, yellowing but still tangible. Your grandkids probably won’t fare as well with your digital photos.
The computer files may survive, but the equipment to make sense of them might not. This era could become a “digital dark age,” with part of its collective memory lost forever.
At risk are your electronic tax files, your e-mail and music. And that’s just for starters. Institutions, meanwhile, are grappling with ensuring the longevity of digital art, electronic court filings, online journals and much more.
“There’s kind of a common misperception about digital lasting forever,” said Howard Besser, director of the Moving Image Archiving and Preservation Program at New York University. “It comes out of the fact that a digital copy is a perfect copy.”
Consider the fate of the British Broadcasting Corp.’s computer-based collection of photographs, writings and other snapshots of life in 1986, the 900th anniversary of the written English survey, the Domesday Book. While scholars can still read the 1086 tome, the digital version needs customized software and hardware that are breaking down from old age, meaning records from just 17 years ago are rapidly vanishing.
NASA’s early space records are suffering a similar destiny, as Joe Miller recently discovered. The University of Southern California neurobiologist couldn’t read magnetic tapes from the 1976 Viking landings on Mars. With the data in an unknown format, he had to track down printouts and hire students to retype everything.
“All the programmers had died or left NASA,” Miller said. “It was hopeless to try to go back to the original tapes.”
Elsewhere, businesses haven’t been able to read electronic records needed for lawsuits. Professors have lost old research papers.
“Every now and then, a faculty member would come in in tears having some boxes of completely unreadable tapes,” said MacKenzie Smith, associate director for technology at Massachusetts Institute of Technology Libraries. “They’ve lost their life’s work.”
To preserve old files, you have to do more than just move documents to the latest storage medium, such as the current CDs and DVDs. Your computer also needs to understand the document’s file structure. That means moving away from the once-common WordStar format and always using the latest version of Microsoft Word, because even the newest software reads only a few versions back.
But as you migrate, you lose something: coloration here, formatting there. “The spacing of the characters and stuff on pages may be off, so lines get a little bit longer and carry over onto the next line,” said Steve Gilheany of Archive Builders, a records-management consultancy. “Gradually those errors become more of a problem.”
Image formats like JPEG use compression that can sacrifice details unnoticeable to the naked eye. Loss gets compounded converting to JPEG 2000, a newer standard that compresses differently.
Jeff Rothenberg, a preservation specialist at Rand Corp., compares migration to preserving a Picasso by repainting it every few years. He instead favors emulation – imitating old platforms to run old software, properly formatted and all.
That’s what researchers at the University of Michigan and the University of Leeds in England did with Domesday. The project took a year and a half, cost hundreds of thousands of dollars and required tracking down original programmers.
“It wasn’t something that your average PC user would want to do to get their photo,” said Margaret Hedstrom, a Michigan professor who helped coordinate the restoration.
Alternatives include the universal virtual computer. IBM Corp. researcher Raymond Lorie wants to create a common set of instructions that computers of today and tomorrow can all understand. Companies issuing new formats would simply write tools compatible with those instructions – something easier said than done when dealing with a format like “.pdf,” which takes 978 pages to describe.
“There are theories and a few test beds but no one has shown it in practice,” said Kenneth Thibodeau, who oversees electronic archiving for the National Archives.
More vexing are the social considerations, such as the legality of copying and adapting obsolete software, said Anne Kenney, assistant librarian at Cornell University.
The task would be much easier if software companies committed to open standards that remain fairly constant, Thibodeau said. But the market drives innovation and differentiation from competitors. Microsoft, for instance, constantly changes formats to handle new features.
So a Word file from 1994 is tougher to open today than an image based on JPEG, which hasn’t changed. And because JPEG is an open standard, software developers can always develop tools based on it, even if JPEG 2000 or a future version comes to dominate. With Microsoft formats, future programmers must guess.
Recognizing the growing reliance on the proprietary “.pdf” format from Adobe Systems, an international group is working with the company to develop an archival version as a standard. These standards help but do not eliminate “digital dark age” worries. Someone in the future will still have to write tools for old formats.
While museums and private companies have increasingly digitized their collections, the primary goal is to improve access to them, not preserve them. Rick Prelinger, who manages an archive of home movies and other films, won’t toss the old reels just yet.
And when The New York Times Magazine created a millennium Times Capsule to be opened in the year 3000, it turned to archival-quality paper and miniaturization on long-lasting nickel plates, which can be read with regular microscopes.
Jack Rosenthal, the magazine’s editor at the time, said the staff initially assumed digital would be ideal but was quickly discouraged by experts: “If your aim is to have something lasting 1,000 years from now, you can’t plan on electronics doing the job.”