GW Disrupting DH and “The End of the Archive”

This is the full text and audio of my paper for the “Disrupting the Archive” panel at George Washington University’s Disrupting DH Symposium, including slides in the appropriate places. The podcast was recorded by Eileen Joy and first posted on her blog at In The Middle.


I’m not going to be giving the remarks I was originally planning to give today, thanks in large part to the “disrupting DH” panel at MLA, and in particular, the attention paid to the role of disruption in digital humanities.

@adelinekoh’s metaphor for DH and what it means to different groups

There it was drawn to our attention that disruption is a double-edged sword—it can be both revolutionary and liberatory in the same moment it is conservative or reifying; that disrupting “business as usual” depends very much on how we define both “business” and “usual.” When we talk about disrupting the archive with and within Digital Humanities, then, what is it we interrupt, impede, rupture, co-opt or break?

In his work Archive Fever, Derrida draws our attention to the dual role of the arkhē as both an origin and an authority. It is both the source and the power that a source can purvey. But that power is localized—it is conditional and even domestic. The archive has its origins in the documents of law collected together and stored in the house of the local authority. The archon—the keeper of documents—was thus the collector, guardian, and reader of documents, and it was in his power—for it usually was a him—to both keep the law by collecting and interpreting documents and to write the law through the production of documents. Derrida points to the way in which documents and power are mutually supportive of each other through their domestication. “The archontic function,” he says, operates by “unification, identification…and classification” within the archive, the repository itself. (Library science is clearly a form of tyranny.) Thus the power of the archive—both its authority and its ability to act as an origin for law or letter or meaning—lies in its ability to be maintained and to maintain; to secure a history or an authority by its orderly submission to the protocols of unity, identity, and classification. Organization for a purpose, with a function of being able to find, access, and recall any of the authentic—authority-bearing—documents at any time.

Indeed, recall is one of the archive’s main functions. Derrida suggests that the archontic principle is one of “gathering together.” Interestingly, the Latin word that came to mean “to read,” “legere,” etymologically means “to gather together.” To read in the Middle Ages is to gather together, to co-lect, not only as an individual that brings things—rēs—together, but also to lect-together, to engage in the communal production, consumption, and dissemination of texts and ideas that is the cornerstone of “medieval culture,” which according to Mary Carruthers is “fundamentally memorial,” particularly in its literary, textual and artifactual facets. Thus we find in the model of medieval reading and memory the flipside of the archive’s institutional power: the medieval reader gathers together his rēs—his imaginēs—and stores them all up to re-collect them from memory, to regurgitate them and recompose.

For every order that is achieved within an archive, there is a dis-order, an un-order, a re-order. A new unity or identity always threatens in each potential new organization or classification. “Order is no longer assured.” Every archive is at once conservative, institutional and also revolutionary. The “end” of the archive lies thus in its order. It is the origin of whatever unity or identity has been imposed upon it by its domicile and the authority it upholds.

So, on the one hand, the power of digital humanities to disrupt the archive lies in its ability to disrupt the physical objects’ domiciles and thus the authority of the cultural institutions that maintain them—or that they maintain. The wider availability of access to digital archives (or digitized archives) is supposed to support the democratization of access to cultural documents and the diffusion of certain locales’ cultural cachet away from the centers of European learning that also serve as the major repositories for premodern archives. In the UK, that would of course be the oldest universities, Oxford and Cambridge, as well as the national archive that has largely migrated to the British Library. In addition to diffusing their cachet, and effectively redistributing or socializing their cultural capital, this also pragmatically deprives institutions and local economies of the monies expended by those seeking access. It quite literally changes the economics of access, shifting access away from only those with institutional or personal financial means and away from the institutions and localities that house the archive. While access, as I mentioned, then shifts toward “everyone”—or everyone with computers and internet at their disposal—money also shifts. Local economies get dropped out, but the institution continues to see money coming in in the form of grants or payments for digital copies and/or permissions to use them.

On the other hand, the much wider dissemination of both the visual image and texts within these archives, as well as the dissemination of their organizing principle throughout the globe, only reifies the power of the archive granted by its archontic rather than local function. First, as Mary Carruthers herself and Lara Farina have both pointed out to me on different occasions, many forms of digital manuscript studies risk a disproportionate privileging of the visual and textual aspect of the object while erasing its other material, sensory, and symbolic registers. Second, a digital “reproduction” of an archive or collection only translates the powers of its unity, identity and classification into a new medium that is more mobile.

You see, the archive and the documents that are its foundation have a material form. To take medieval manuscripts as an example, they are objects, they usually contain a text or texts ordered in some fashion and most often bound—gathered and sewn together and then enclosed by two covers that literally keep all the pages in line and protect the integrity of this unified object. The object is then identified by a shelfmark, set on a particular shelf within a particular library, and classified in a catalogue based upon its contents, and in particular, its highest profile work or author. But manuscripts are not simply documents—they aren’t merely records of history and local governance. They also participate in a modern apparatus of textual authority. The collated edition, for example, disrupts the archive’s order by extracting from multiple libraries many unified objects and from within them only so many fragments of each as helps the collator to gather together the sources of the “text” and to help him—and it usually is a him—to extract from all these historical, specific and local documents an a-historical text, preferably one traceable to the mind of a single Author, the literary genius. This, for example, is the Kane-Donaldson edition of the Piers Plowman B-text that compares the Piers text in nearly twenty manuscripts and produces a conjectured ur-text from the superposition and comparison of these multiple witnesses.

Now the twentieth-century editions did all this by hand, but with all the new tools—with digitization and optical character recognition and Text Encoding Initiative (TEI) transcriptions—we can now automate parts of the process of identifying and categorizing errata for genetic analysis. When we do this, what do we disrupt? What do we reify? We destroy or negate or erase the immediate historical and material power of the archive in order to conjecturally reconstruct another history—this one about authors rather than scribes, about composition rather than re-composition. We use the originary authority not to project that authority into our present or future, but to project a yet more elusive authority farther into the past.

OBSOLESCENCE

The archive is limited. It is bounded, by its very form and materiality. Its power is imaginary and an accident of organization. So what do we disrupt now? What is the end of an archive after the end? After manuscripts have ceased to be documents of their own history, culture and production? After the death of the author and the archive’s function to recover him—and it usually was a him—from within the texts in the manuscripts? If the limits of an archive are in its spatiality, materiality, and boundaries, so is its power. But is that power reproduced when we reproduce its limits en masse? Let’s take the Piers Plowman Electronic Archive, for example—an early digitization project with all the right ends and disruptions at the beginning of the digital movement. The project aims to digitize every single manuscript copy of Piers Plowman and provide an edited transcription of it complete with index, cross-referencing tools, gloss and notes. It is working on translating the archive into a new medium but it has disrupted nothing. In “The Task of the Translator,” Benjamin—himself echoing Jerome’s own conundrum sixteen centuries earlier—reminds us that translation always comes down to a tradeoff between fidelity and meaning, and I would argue that the same is true for a translation into a new medium. The Piers Plowman Electronic Archive has opted for fidelity, but to what? To the Piers text in the manuscripts. As a result, it reproduces both the limits of the archive itself and the limits of the collated edition.* Each digitization is physically separate from the others, bound into its serial isolation by the edges of a CD-ROM, a technology that is rapidly becoming obsolete before the project is even half finished.

So we have now updated our obsolete and inaccessible archive to another, more mobile and portable but no less obsolete and inaccessible archive. In order to disrupt yet again, we will have to break down this unity (and its fidelity to manuscript fragments only) and the isolation of data within it imposed by its material form. If we let go of the author projection, if we allow the text to be just a rhizomatic shoot in a much larger assemblage, if we organize our fidelity around the emergent materiality of a textual culture moored in the past but extending into the present, we interrupt business as usual yet again and the end of the archive becomes its end again—its finish/finale—its obsolescence becomes its purpose. The goal, now, is for the digital work on and in the archive to transcend the archive itself—for it to yield information and data that can be re-organized into a networked format that allows the material archive to maintain its integrity and locality while lending its authority to identifying a new phenomenon and decoding it—BY encoding it. To translate now for meaning, the sententia, rather than verbatim, if we use Jerome to help us out.

If we zoom out for a moment from the single manuscript, or single author, or single repository archive, we can see that Piers Plowman, for instance, even in all its complexity is just “a littil thing, the quantitye of an hesil nutt in the palme of my hand.” Piers itself is immediately connected to a whole constellation of other works within the archival objects themselves. This graph, to give you an idea, is a network data visualization. It shows the contents of all the manuscripts that contain some form of Piers Plowman, and it helps us to see what else Piers circulated with and how often. Each node represents a work or a category of work—compare “The Siege of Jerusalem” node with “Histories of Britain”—and it is connected to other nodes based on whether the two works co-occur within one of the manuscripts. Both nodes and connections grow larger the more often a work or co-occurrence happens (respectively). The graph is thus not only a map of the manuscripts’ contents, but also a visual index of the frequency of works’ appearance and their co-occurrence within manuscripts. But this too is only a much smaller selection of a larger phenomenon. Piers here is the boundary condition of this graph, which means the edges of this network we are seeing are entirely determined based on whether a work is in a Piers manuscript. This certainly doesn’t show us all the Sieges of Jerusalem, for example, which would exist in its own network of texts that we cannot fully see here. Indeed, any one of these texts that exists in more than one manuscript could be used to expand both our graph and our network. Not to mention that each of these objects is part of the “long fifteenth century” and the unusual and still under-examined phenomenon of prolific manuscript production in England. Most surviving Middle English manuscripts come from the fifteenth century, even in the case of fourteenth-century authors as varied as the early Yorkshire writer Richard Rolle and the much later Ricardian poets, Chaucer, Gower and Langland. In all these cases the vast majority of manuscripts are fifteenth-century.
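For readers who want to see how a graph of this kind can be assembled, here is a minimal sketch in Python. It is not the code behind the visualization in the talk; it only illustrates the logic described above, on the assumption that each manuscript's contents have already been entered by hand as a list of works. The function name `build_cooccurrence`, the `anchor` parameter, and the manuscript labels "MS A" through "MS D" with their contents are all hypothetical placeholders, not real shelfmarks or catalogue data.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(manuscripts, anchor=None):
    """Count works (node sizes) and co-occurring pairs (edge sizes).

    manuscripts: dict mapping a manuscript label to a list of works it contains.
    anchor: optional work used as the boundary condition; when given, only
            manuscripts that contain this work are counted.
    """
    node_weights = Counter()
    edge_weights = Counter()
    for works in manuscripts.values():
        unique = sorted(set(works))  # count each work once per manuscript
        if anchor is not None and anchor not in unique:
            continue  # outside the boundary of the graph
        node_weights.update(unique)
        # every unordered pair of works in one manuscript is one co-occurrence
        edge_weights.update(combinations(unique, 2))
    return node_weights, edge_weights


# Placeholder data for illustration only; labels and contents are invented.
SAMPLE = {
    "MS A": ["Piers Plowman", "The Siege of Jerusalem"],
    "MS B": ["Piers Plowman", "Histories of Britain"],
    "MS C": ["Piers Plowman", "The Siege of Jerusalem", "Histories of Britain"],
    "MS D": ["The Siege of Jerusalem", "Histories of Britain"],
}

if __name__ == "__main__":
    nodes, edges = build_cooccurrence(SAMPLE, anchor="Piers Plowman")
    for work, count in nodes.most_common():
        print(f"node {work}: {count}")
    for (a, b), count in edges.most_common():
        print(f"edge {a} / {b}: {count}")
```

With `anchor="Piers Plowman"` the invented "MS D" drops out of the counts entirely, which is the boundary condition in miniature: the Siege of Jerusalem has a life outside Piers manuscripts that this graph cannot show. Calling the function without an anchor widens the network to every manuscript that has been entered.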

And this, at long last, is where my final disruption, medium data, comes in. Medium data, by the way, is a term I use to define quantities of data that are large enough to be useful to certain types of machine reading or computational analysis, but small enough to warrant or even require human intervention—not just in interpretation, but in the creation and collection of data itself. If DH has built its reputation on big data, medium data, in archives numbering in the dozens or hundreds rather than the tens of thousands, throws a wrench in the project. For pre-print archives to participate in what digital methods make available to us, we will have to be creative. We are now at this very moment experiencing a media shift and its attendant cultural revolution. In the same way that all knowledge in manuscripts that did not make it into printed texts was lost—as in it became literally inaccessible to most people—so will all the knowledge of both print and manuscript cultures be lost if they don’t make the transition to digital in a sustainable and forward-compatible way. This is the great knowledge migration. And today’s revolution is tomorrow’s institution. If something isn’t online and you want it to make this transition, it’s literally your job to put it there. Now, print is going to make the jump just fine. It makes for good machine reading and big data. Optical character recognition can read and transcribe all but some of the earliest printed books and those texts can be set up to be text-mined and analyzed, all more or less automatically. But manuscript culture is a hand-inscribed textual tradition; it’s too human for any automation to function without human input. It’s too varied, too irregular, too individual and idiosyncratic. It requires readers. And skilled copyists. It requires digital scribes.

And not just a blind reproduction—digital facsimiles do that already; transcriptions take more engagement, but they will treat the text like a dead object. But translation into data, that helps keep the object alive, that makes it malleable and pulsing within the hands of digital compilers who are every bit as creative as medieval ones—this requires intervention and interpretation on the part of those who create data from the archive, who mine it and its objects for information that can be aggregated in meaningful inscriptions. To translate a hand-made, hand-written archive into a usable digital object, we must engage each object by hand. We must write code or do data entry for each individual object, by hand. The process cannot be automated or relegated to a machine because manuscripts require not only professional writers but professional readers. We’ll need a veritable army of paleographers, experts, art historians and codicologists in addition to programmers, coders and data specialists. Being an expert in manuscript data requires expertise in both manuscripts and data.

Now, no one is arguing for a destruction of the archive or of printed books, or even for replacing the physical archive with a digital one, but if it is to survive, the archive as it is must end. This process of endlessly reproducing it must end. Or at least have a new end. We need to be building—we need to be creating a digital infrastructure and manually transmitting the data from our objects, our archives, into the text-nology of the future. Or it will die; it will reach its final end, and disappear into the abyss of obsolescence.


*I want to point out that PPEA does necessary work to make information available in traditional formats. Moreover, they are definitely moving toward thinking about their own limitations, so a critique of the limits of their format is not directed at the archive’s shortcomings but merely thinking about the epistemology of their format and of early digitization projects. I’m merely using the PPEA as a stand-in for early digitization projects and the way they reproduced the frameworks of print culture and the limits of manuscripts, because that’s all we could think when we imagined the earliest digital projects.


