Data Humanities I: Data v. Publication

When I started thinking about this post, I was not going to mention what made it so urgent to write right now.  The truth is that, before the project had gone anywhere at all, I had already received my first very nasty comment from a Piers Plowman troll (who knew such things existed!) telling me that I should “seek alternate employment” because I was doing such a bad job of… well… whatever it is she thought I was doing.

The basis for this criticism? The JSON-encoded description of the Z-text manuscript failed to include in the MS contents two works that Rigg and Brewer identify in the MS in their edition of the Z-text.  It only included the contents that are listed in the Bodleian catalogue, which is admittedly a little sketchy in some areas–a few of which appear not to have been updated since the 18th century, when the hand-written descriptions of early collections were first written!! Which is not at all a criticism of the Bodley’s catalogues.  Indeed, I get a great kick out of telling my Victorianist friends that the catalogues for my materials are older than their archives.  In the game of whose-stuff-is-oldest one-upmanship, I usually win.

I fantasized about this post simply being a beautiful and compelling manifesto calling for generosity and collectivity without having to even acknowledge the negative comment that made this a more pressing issue to address than I originally thought.  But then I realized that not to acknowledge the difficulties, failures, and mistakes I make is to undermine the project of radical transparency that I am embarking upon. It also protects this very courageous “Jane Doe” from ever having to feel that perhaps her comments went wide of the mark.  So, without further ado, Ms. Jane Doe, this blog is dedicated to you.  You have inspired a great many productive reflections and it would be disingenuous of me not to give you due credit for that.

Buckle up, folks. I’m going to use the word “radical” a lot here, inspired by other radicals advocating for radicality.



@context for Trinity College Dublin MS 212

Last week, we went over how to write simple JSON to describe a manuscript object.  It wasn’t a perfect description (in fact, if you noticed, I used the same “name” in two different “name”/value pairs to mean two different things! I used “folios” to refer to how many folios the MS contained and which folios Piers occupied), but it was valid code.
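To see why the repeated name matters, here is a minimal Python sketch. It is not the code from last week's post: the shelfmark, the folio count, and the folio range are placeholders, and the replacement names "folioCount" and "piersFolios" are only one possible choice. It relies on the documented behavior of Python's standard json module, which accepts a repeated name but keeps only the last value.

```python
import json

# Placeholder values throughout: these are NOT catalogue data for any real MS.
ambiguous = '{"shelfmark": "Example MS 1", "folios": 200, "folios": "10r-90v"}'

# Python's json module accepts the repeated name but keeps only the last value.
parsed = json.loads(ambiguous)
print(parsed["folios"])  # the total folio count has been silently dropped

# Distinct names (hypothetical ones, chosen for illustration) keep both facts.
clear = '{"shelfmark": "Example MS 1", "folioCount": 200, "piersFolios": "10r-90v"}'
parsed_clear = json.loads(clear)
print(parsed_clear["folioCount"], parsed_clear["piersFolios"])
```

Other parsers may treat a repeated name differently, which is one more reason to keep every name unique within an object.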

"folios" fail

 

What I want to talk about this week is how to write descriptions in JSON that can be incorporated directly into a linked data framework.  Now, I’m going to talk at greater length about linked data and its basic principles in a later post.  To define it in brief, though, I’ll share this definition from W3C:

Linked Data is a way to create a network of standards-based machine interpretable data across different documents and Web sites. 

Today, I simply want to show you how to go from JSON to JSON-LD in a few simple steps that aren’t very much harder than what we did last time.
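As a rough preview of that step, here is a minimal Python sketch that adds an "@context" to a plain description. The choice of schema.org as the vocabulary is an assumption made for illustration, and the helper name to_json_ld is hypothetical; the context actually used for this manuscript may map its terms differently.

```python
import json

# Plain JSON description: just a name, taken from the shelfmark in the title.
plain = {"name": "Trinity College Dublin MS 212"}

# Assumed vocabulary: schema.org. The post's actual @context may differ.
context = {"name": "http://schema.org/name"}

def to_json_ld(description, context):
    """Return a copy of the description with an @context placed first."""
    return {"@context": context, **description}

linked = to_json_ld(plain, context)
print(json.dumps(linked, indent=2))
```

The description itself does not change. The "@context" simply tells a machine which shared definition each name points to.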


Seeing the Body of Piers Plowman with Digital Eyes

Body. Corpus. The material manifestation, the incarnation, even, of the Piers Plowman text in real, fleshy (or pulpy), material objects.

If we think of the corpus of Piers Plowman as inclusive of all its various instantiations and incarnations, how do we think about seeing all of it at once? And moreover, how can we know the specificities of those discrete bodies that contain the poem?  What else might be in those same bodies with Piers ? What other limbs, organs, or members might this body have? What becomes visible if we decide to take the entirety of the manuscript corpus as intrinsic to our definition of what the Piers Plowman corpus–its material and textual body–really is? 

In an effort to try to see what kinds of texts make up this corporeal phenomenon, I attempted to create a single graphic that displays all of the contents of all of the various Piers Plowman manuscripts and their material relation to one another as well as the frequency of their occurrence in the corpus.

The result of that endeavor is this data visualization network, created in R with the igraph package.

PiersGeneralOverviewPlot
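The network itself was built in R with igraph. The sketch below is not that code. It is a small Python illustration of the kind of data such a graph rests on, using placeholder manuscript and text labels rather than real contents lists: each manuscript is linked to each text it contains, and a text's frequency is the number of manuscripts linked to it.

```python
from collections import Counter

# Hypothetical contents lists; the manuscript and text labels are placeholders.
contents = {
    "MS A": ["Piers Plowman", "Text 1", "Text 2"],
    "MS B": ["Piers Plowman", "Text 1"],
    "MS C": ["Piers Plowman"],
}

# Each (manuscript, text) pair is one edge of a two-mode network.
edges = [(ms, text) for ms, texts in contents.items() for text in texts]

# A text's frequency in the corpus is the number of manuscripts linked to it.
frequency = Counter(text for _, text in edges)

for text, count in frequency.most_common():
    print(text, count)
```

An edge list like this is a common starting point for graph tools, and the frequency counts could be used to size the nodes.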



The Digital-Material Nexus

Or, simply put, what can you possibly learn about a material phenomenon from digital data visualizations? 

There are a number of overly simplistic answers to that question: “Latour!” or the less polite “Haven’t you read Latour?” or the naive optimist’s “lots of stuff!” or the scientific-method-minded mother “You won’t know until you try it, will you?”  But the case I hope to make to you today is not that data visualizations are useful but that they are necessary in order to better comprehend the material phenomenon of manuscript production, particularly for a single text.

Yes, I said it, necessary. Now I could drown you in theory discussing the gap between language and material phenomenon and the insuperability of that gap,* but what I want to discuss instead is a way not to overcome that gap, but to dissolve it entirely by understanding the interconnected way in which matter and form must inevitably work together in the production of knowledge.


Bodley MS 851

That’s right, let’s begin with the Z-text.  For a first JSON post we are going to start with one of the earliest manuscripts, and I am going to do nothing but talk about JSON and describe the Z-text manuscript in valid JSON code.

So, to get going, let’s talk for a second about this horrifying acronym, “JSON,” which stands for JavaScript Object Notation. JSON is a simple way to store and send STRUCTURED DATA. It is typically used to let a web page exchange data and messages with the server without the whole page having to refresh or update.  It’s simple, it’s complete, and it doesn’t interfere with your ongoing activity on a page but allows you to see more.  Think about things that pop up when you hover over an object, or a shopping cart that may show what’s in your cart without leaving the page you’re on.

Why JSON for data? Well, we are going to talk more about JSON-LD for linked data in the next blog post, but the simple answer is that it allows us to describe a real world object in code in such a way that the resulting script is BOTH human- and machine-readable.

I swear, it’s not scary at all.  Simple JSON often looks like this:

{
  "name": "Jane Medievalist",
  "institution": "Medieval University",
  "books": [
    { "name": "The Middle Ages Rock" },
    { "name": "You wish you were medieval!" },
    { "name": "So you think you can alliterate?" }
  ]
}

What is this? Well, it’s a JSON Object, which we know because the whole thing is enclosed in the curly brackets { }. JSON operates on objects, which can be as simple or as elaborate as you like.  Today, I’m presenting the Z-text manuscript as a SINGLE JSON OBJECT.
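One way to check whether a snippet is valid JSON is to hand it to a parser. The sketch below uses Python's standard json module, with a helper name, is_valid_json, that is my own. It tests a two-pair version of the Jane Medievalist object, once as it should be and once with the comma between the pairs left out.

```python
import json

valid = '{"name": "Jane Medievalist", "institution": "Medieval University"}'
# The same text with the comma between the two pairs left out.
broken = '{"name": "Jane Medievalist" "institution": "Medieval University"}'

def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json(valid))
print(is_valid_json(broken))

# A JSON Object in curly brackets becomes a Python dict once parsed.
print(isinstance(json.loads(valid), dict))
```

Note that a parser needs straight double quotes around names and values. The curly quotes a word processor or blog editor substitutes will not parse.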


Piers Manuscripts Visually Categorized

Piers Manuscripts Grouped by Textual Variety

This is a circle packing diagram that, as a static image, doesn’t look like much.

Each of the white circles represents an individual Piers Plowman manuscript. Circles are ROUGHLY sized proportional to the number of lines of Piers poetry contained in each manuscript. Lines are estimated based on variations accounted for in the Kane editions and can be viewed in the JSON on the live bl.ock.

Grades of blue circles represent sub-categories of manuscripts based on the textual contents of each. Categorizations take note both of distinct combinations of A, B, and C text and of how much text, or how much of each version, a given sample of Piers Plowman contains.
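A circle packing diagram is usually fed nested data: groups that contain sub-groups that contain individual items, each with a number that sets its size. The Python sketch below shows that shape under stated assumptions. The group names, manuscript labels, and line counts are placeholders, and the key name "lines" is my own choice, so none of it should be read as the JSON behind the live diagram.

```python
# Hypothetical hierarchy: group names and line counts are placeholders.
corpus = {
    "name": "Piers Plowman manuscripts",
    "children": [
        {"name": "Group 1", "children": [
            {"name": "MS A", "lines": 2500},
            {"name": "MS B", "lines": 1800},
        ]},
        {"name": "Group 2", "children": [
            {"name": "MS C", "lines": 7000},
        ]},
    ],
}

def total_lines(node):
    """Sum the line counts of all manuscripts at or below a node."""
    if "children" in node:
        return sum(total_lines(child) for child in node["children"])
    return node["lines"]

for group in corpus["children"]:
    print(group["name"], total_lines(group))
```

The totals for each group are what let an enclosing circle grow to hold the manuscripts inside it.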


A Year in Piers

One year, mucking around in the manuscript corpus of Piers Plowman.  One manuscript. One visualization. Every week. For FIFTY-TWO weeks.

This blog marks the commencement of a year-long intensive digital exhibition of the manuscript corpus of Piers Plowman. Now the fifty-two physical manuscripts themselves are distributed across twenty-one different libraries or repositories in four different countries on two continents. Due to these physical constraints, I’m clearly not putting them all in one room (though I do think that would be a fabulous and exciting endeavor) and then blogging about it.  Nor am I simply going to be creating an “online exhibition” of each manuscript, though that kind of collecting is a part of this endeavor. Neither the material nor the digital “presence” of the Piers Plowman manuscripts is really on offer here. This blog is by no means an attempt to translate either the physical body of Piers, or the text each one instantiates, into digital form.

Most importantly, I’m not offering a surrogate (or surrogates) for the manuscripts themselves–something that would allow you to do some kind of substantive research upon a manuscript object, just at a distance. Surrogacy and text encoding are both projects that are already well underway at the Piers Plowman Electronic Archive and even from some of the various repositories that have digitized their Piers manuscript(s).

What I’m offering instead are slices of data and a data infrastructure for making both the text and the materiality of Piers Plowman accessible online and as part of the growing semantic web of linked data.

This project, then, is twofold.  For the next year, I will be making a minimum of two posts per week.


Visual studies in the Piers Plowman manuscript corpus.