Roundtable

Please, My Digital Archive. It’s Very Sick.

Our past on the internet is disappearing before we can make it history.

By Tanner Howard

Wednesday, September 04, 2019

Untitled (Plaster Man with Sony Watchman Head), by Nam June Paik, 2005. Smithsonian American Art Museum, Gift to the Nam June Paik Archive from the Nam June Paik Estate.

 


Digital history isn’t history at all—until, without warning, it is. In an age in which any internet user is a creator-in-the-making, reaching a handful of virtual friends or entire corners of the web at a moment’s notice, the line between archive-worthy material and the detritus that populates our feeds grows vanishingly thin. Thus, a paradox emerges: whatever measure of historical value our digital traces may or may not leave behind for future researchers, each individual is capable of becoming a digital archivist, holding on to whatever materials made their online lives consequential, even if such material means nothing to another human soul.

On paper, the tools to facilitate easy digital archiving already exist. We’re told that the wonders of cloud computing, Google Drive, and the endless memory of our Facebook profiles will hold our past lives in place for posterity, a constant reminder of selves we’d rather forget or those we wish had never left us. But in reality, the web remains a treacherous place for users keen on holding on to remnants of themselves, particularly in ways that escape corporate capture. As platforms and technologies reach obsolescence, abandoned by users eager to find the newest, most relevant home for their virtual selves, the cost of maintaining millions of photos, videos, songs, and memories overwhelms failing tech companies that aren’t in the business of remaining archivists of abandoned profiles.

It’s in this climate that Myspace announced earlier this year that it had lost its catalogue of user-uploaded music: some 50 million tracks disappeared in a moment of digital file corruption. While publications like the Guardian predicted a dozen years ago that Myspace would exist in perpetuity, its slow death was already underway just as the digital ink dried. (Digital newspapers are just as susceptible as everything else online to disappearance: one Columbia Journalism Review report found that the “majority of news outlets [interviewed] had not given any thought to even basic strategies for preserving their digital content.”)

Although Myspace still exists as a shell of its former self, the destruction of its entire music catalogue served as a reminder that the once-dominant social network hasn’t vanished completely. And yet, thanks to its poor digital stewardship, the first musical experiments of countless artists, amateur and professional, were lost in an instant. Even the announcement that 450,000 songs had been preserved by a team at the Internet Archive underlines the scale at which digital history can endure, precariously. While a one percent sample of the now-deleted archive could give any professional historian a lifetime’s worth of researchable material for study, it means little to those who made the nearly 50 million tracks that were lost, never to return. While the missing tracks may be remembered by many as the kind of digital debris that constantly churns from device to device, lacking any enduring value, the rate at which the internet swallows itself whole means that future historians may never be able to decide for themselves what was worth keeping for posterity.

 

At best, our digital past can feel like some half-remembered dream, in theory reachable but only occasionally dredged up by something like Facebook’s “On This Day” feature. While there’s a power in forgetting, this quasi-accessible personal archive makes it unlikely that we will ever choose to return to the digital past, especially when huge chunks of our online presence, at least on Facebook, can be bound up in other people’s decisions. Your ex can delete their page, or at least the photos of your shared past. Suddenly a piece of your digital life would be lost forever.

For Molly Soda, a digital artist whose work engages with questions of revisiting one’s virtual legacy, however messy, preservation is an active concern. Soda, whose teenage experiences with Myspace and Facebook were recently featured in the video game Wrong Box, has had to try to preserve several of her digital archives, most recently her three-thousand-post Tumblr collection, downloaded in the wake of the service’s draconian crackdown on acceptable material. At the same time, she expresses a sense of acceptance when questioned about her archival efforts, in full knowledge that the web companies tasked with mediating our computerized identities are at best ambivalent in their roles as stewards of such intimate materials.

“There’s this feeling of a loss of control that I’ve always sort of signed up for online,” Soda said. “On the one hand, I’m very upset when these websites go down, but on the other hand I’m like, ‘Yeah, I knew that going into it.’ This stuff is very impermanent, in new ways that I wouldn’t have conceived of.”

Of course, the internet is not the first new media form that’s proven unruly for historians hoping to make sense of the past. For those interested in the early days of filmmaking, the silence of the historical archive is perhaps even more absolute than that of the early web. According to the Film Foundation, an archival nonprofit founded by Martin Scorsese in 1990, more than 90 percent of pre-1929 films, and half of all films made before 1950, are gone forever.

As with the web, the gaps in historic film preservation have both technical and cultural causes. Before 1952, movies were primarily shot on nitrate film, whose brittle and flammable nature resulted in several blazes that destroyed entire archives. And for years, studios and filmmakers refused to see the value in keeping their creations after their theatrical run. In testimony presented at the Library of Congress in 1993, film preservationist Robert Harris, who was responsible for producing the best-kept editions of classic films like Lawrence of Arabia and Rear Window, told officials: “Most of the early films did not survive because of wholesale junking by the studios. There was no thought of ever saving these films.”

Cowboy with amateur movie camera in Montana, 1939. Photograph by Arthur Rothstein. Library of Congress, Prints and Photographs Division.

The near-total loss of filmmaking’s early history is a monumental blow for researchers, who are unable to fully grasp the cultural and aesthetic impacts of many of the era’s defining movies. But while our digital traces may feel quotidian in comparison to these lost artistic works, researchers who study everyday life on the internet nevertheless lament a general cultural apathy toward digital preservation. These experiences may in time gain a significance that is not obvious to us today.

“I ask myself if people do really understand that they invest so much in their Instagram life. It’s a lot of work, but it will be gone because there is no way to save it in the way that it is,” said Olia Lialina, one of the archivists and researchers behind the Geocities Institute, which investigates the legacy of the once-beloved early webpage-hosting website. “All the interactions, the likes and the fights and everything, at this moment there is no capacity to preserve it for anybody.”

Web archiving is always a form of approximation, especially since internet preservation has primarily been the domain of code-savvy computer users or of the Internet Archive web crawler—decentralized processes driven by the obsessions and perspectives of individuals and the uncaring reach of algorithms. For all future historians of the digital age, whether on an individual or institutional level, a recognition of such limitations is essential. Jason Scott, whose work with the Internet Archive has saved many traces of the early web, calls this selective preservation “ice-core drilling,” a necessary reduction of what’s preserved in the face of so much abundance. In many ways, it mirrors concerns raised by Harris in his film-preservation testimony. Discussing the overwhelming number of films to be saved, he argued, “It simply is not all worth saving with today’s limited funding.”

Film preservation depends on a number of fragmented but overlapping technical concerns. A quality film reproduction often requires extensive reconstruction based on disparate source materials. Similarly, any living webpage relies on a host of contingent, external services, with plug-ins frequently contributing to a site’s existence. As a result, the act of holding on to webpages in perpetuity is almost certainly a losing task. Consider the retroactive destruction of Lialina’s Geocities archive: when Geocities Japan, the last vestige of the once-popular service, closed earlier this year, it had an impact on many portions of the saved digital archive that still depended on the living site in a number of unforeseen ways. It’s as if a malevolent patron entered a library and scattered priceless books in unknown new locations—except that the loss of any given digital archive is far more likely to be permanent.

 

According to University of Waterloo history professor Ian Milligan, most historians have yet to take seriously the fact that the explosive growth of the web into everyday use has passed into historical memory. Without considering the ways in which normal people made sense of late twentieth-century life—mediated for the first time on sites like Geocities, Yahoo, and AOL—one would have a simplistic understanding of many of the era’s defining events, from the fallout of the Clinton impeachment to the contested election of 2000 and 9/11. The internet has irrevocably reshaped our lives. For now, the historical record still struggles to reflect that shift.

It’s a concern that animates Milligan’s new book, History in the Age of Abundance? “Our society used to forget—put another way, we did not leave so many traces of ourselves behind,” he writes. While the archive has always privileged the interests of those most capable of leaving something for posterity, the internet age should, at least in theory, serve to upend this historic imbalance. Even with a recognition of the unevenness of digital adoption—divided along familiar lines of class, race, gender, and national origin—the significance of digital technologies in political unrest in the Arab Spring, or during the Occupy movement, means that any historical examination that doesn’t account for the internet’s impact on these events is incomplete.

That hasn’t stopped researchers like Lialina and Milligan from trying to challenge this imbalance. For Milligan, one of the biggest difficulties in working with something like the Geocities archive is the need to respect privacy. Even if something had been posted publicly, its creator, especially in the web’s infancy, may not have believed that their materials would be available, decades later, for a researcher’s perusal. In determining an appropriate approach, Milligan refers to the notion of an “intimate public,” which may technically grant anyone access to publicly posted materials, but in practice is intended for a select, self-determined audience online. He calls it a “new landscape for research,” with potentially monumental consequences for academics. The increasingly invasive and pervasive gaze of Google and Facebook, recently documented in detail in Shoshana Zuboff’s The Age of Surveillance Capitalism, means that detailed personal-data collections exist in inaccessible servers, an invisible archive that Zuboff calls the “shadow text.”

“It’s shocking how open people are, but mainly because even though they’re writing on a public website, in the 1990s it was probably just ten, twenty, thirty internet friends connecting with them,” Milligan said. “But I can come along in 2018, and it’s a public document, maybe taken out of context. I don’t even need to go to my ethics board to use this material.”

For Lialina, one attempt to get closer to her sources, and to enrich our understanding of their early homepages, is to contact Geocities users via email. She’s reached out to hundreds of accounts, most of which are disconnected, tied to users who have died or to people in disbelief that a researcher would have questions about their decades-old homepage. While the three successful interviews she’s conducted thus far speak to the significance of everyday users’ experiences as crucial to the historical record, their wider absence speaks volumes. We still don’t know how to value normal people’s places in telling the story of the internet, even as oral histories are otherwise essential in tracing lived experiences through epochal changes.

“I think people have heard so many times that what they made was just ridiculous or wrong, that nobody needs it, and that we should move to the next, better platform,” Lialina said. “People are quite modest, actually.”

 

The examples are endless, and seem to suggest that, even with clear goals and a desire to find specific materials in our digital pasts, the virtual archive is just as fragile as anything in the physical realm. I’ve found this to be true in my own research, in awe of potential research topics from the now-distant past that have vanished thanks to the difficulties of careful archiving. One such project was The East Village, a “cybersoap” that shot several dozen episodes in the mid-1990s. It’s now vanished entirely from the web. Even if the page it called home, theeastvillage.com, had been preserved, most of the video content would have gone unsaved because of dead plug-ins and data-dense video files. As it stands, I don’t even have that. There is no surviving evidence from which to understand its character, narrative form, or view of the neighborhood in question.

Still, my digging led me to find the email of Charles James Platkin, now a health researcher but once the show’s creator. When I received a response, asking me to call, the hope that I had unraveled a small historical mystery gave me immeasurable joy.

In moments, this happiness was dashed. Platkin informed me that the entire project—all the scripts, films, everything that once went on the web in its infancy—had been destroyed in a flood.

Woman uses computer, 1994. Photograph by Martha Cooper. Library of Congress, Prints and Photographs Division.

An archivist’s dream is immaculate preservation, documentation, accessibility, the chance for our shared history to speak to us once more in the present. But if the preservation of digital documents remains an unsolvable puzzle, ornery in ways that print materials often aren’t, what good will our archiving do should it become impossible to inhabit the world we attempt to preserve?

The neglect of our digital past is mirrored by our increasingly tenuous hold on the physical world and its archives. These self-destructive tendencies were nowhere more vivid than in last year’s Brazilian National Museum fire, a readily preventable inferno that’s been described by cultural historians as “a lobotomy in Brazilian memory.” Especially with the ascendancy of protofascist Brazilian president Jair Bolsonaro, who has targeted humanities programs in Brazilian universities with sweeping cuts, the massive cultural death wrought by the museum’s destruction suggests a bleak future in which political and environmental forces unite to ravage our collective historical memory in unforeseen but unmistakable ways.

Both digital and physical archiving are rooted in the same fundamental impulse: hold on to the past so that it may retain meaning in the future. Or, as a voiceover from the recent documentary Recorder: The Marion Stokes Project, a film about a woman who saved decades of broadcast news, said: “All archives create futures.”

 

For every effort I’ve made to save digital objects of personal significance—rushing to get screenshots of my posts in a goofy Facebook meme group called Post Aesthetics, before the posts were deleted in the wake of a controversy around the “Dat Boi” frog and African American vernacular English—other artifacts are likely gone forever.

In 2016 I took a course at Northwestern University called “Art, Writing, Technology.” It was an unusual class, taught by an unusual professor. Danny Snelson is an English PhD by trade, but in practice someone who blurs scholarly boundaries and wouldn’t be caught dead teaching a course on Chaucer or Shakespeare. Our homework was a set of weekly art-making experiments, carried out primarily on a website called NewHive, a drag-and-drop pagemaking tool that made digital art readily accessible, perfect for making vivid webpages with tiled GIFs, flashing text, and embedded YouTube videos. While I mostly made use of the site during those eleven weeks—strange, discursive projects, reflective of a period of intense growth and change—my NewHive profile ever since has been a source of comfort and a testament to some of the most important months of my life.

Like many things that seemed too good to be true, NewHive had its warning signs. Mostly: it was free. Without monetizing my attention or charging me for server space to host my data-heavy creations, it seemed for years that the service was in a sort of suspended animation, needing growth and visibility to survive but lacking a clear path toward sustainability. Over time, I watched the service falter, as tiled GIFs went missing and audio files stopped functioning. Still, my homepage remained. Despite all advance warning, I never got around to preserving my pages.

My mind returned to NewHive a few weeks ago. Saving those old files now felt essential. I loaded my old NewHive URL, username hannertoward, into WebRecorder, the most accessible web preservation software available to the general public.

But almost thirty seconds after typing the URL into my browser, my fears were confirmed. My entire collection was replaced with a white page reading,

Home of the NEW NewHive...

 

We thank you for your patience

Everything was gone. I didn’t even get a warning.