Group for Experimental Methods in Humanistic Research
at Columbia University

Epigraphing the 19th Century

experiment
  • Aaron Plasek
updates ↓

09/01/15 Collected 7,689 epigraphs from 3,000 XML-encoded 19th-century US novels from the Early American Fiction and Wright American Fiction archives.
09/01/15 Newest version of XML scraper collects a range of bibliographic and epigraphic information, and identifies novels that are potentially mistagged.
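
The project's scraper is not reproduced here, but the basic idea of collecting tagged epigraphs can be sketched as follows. This is a minimal sketch, not the project's code. It assumes epigraphs are marked with the TEI `epigraph` element and attributions with a nested `bibl` element. The sample fragment, the function names, and the output fields are all hypothetical. Namespaces are stripped so that the same code handles files with or without the TEI namespace.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return an element's tag without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def text_without(elem, skip):
    """Concatenate the text inside elem, leaving out child elements named in skip."""
    parts = [elem.text or ""]
    for child in elem:
        if local_name(child.tag) not in skip:
            parts.append(text_without(child, skip))
        parts.append(child.tail or "")
    return "".join(parts)


def squeeze(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def extract_epigraphs(xml_string):
    """Return a list of {'quote', 'attribution'} dicts, one per <epigraph> element."""
    root = ET.fromstring(xml_string)
    results = []
    for elem in root.iter():
        if local_name(elem.tag) != "epigraph":
            continue
        bibls = [b for b in elem.iter() if local_name(b.tag) == "bibl"]
        attribution = squeeze(" ".join("".join(b.itertext()) for b in bibls))
        quote = squeeze(text_without(elem, {"bibl"}))
        results.append({"quote": quote, "attribution": attribution})
    return results


# Made-up fragment for illustration only; not taken from either archive.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div type="chapter">
    <epigraph>
      <cit>
        <quote>All the world's a stage.</quote>
        <bibl>Shakspeare.</bibl>
      </cit>
    </epigraph>
    <p>Chapter text.</p>
  </div></body></text>
</TEI>"""

for record in extract_epigraphs(SAMPLE):
    print(record)
```

A file for which this returns an empty list is not necessarily free of epigraphs. It may simply be one of the mistagged novels, which is why such cases would need a separate check.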

Frequently ignored and occasionally made up, the epigraph is a textual genre defined both by its physical placement on the page and by the absence of the textual object being signposted. An epigraph attribution situates the text it prefaces within a larger constellation of texts and authors, and in this way serves an indexical function similar to scanning the spines of books on a shelf, flipping through a card catalog, or examining a record in a digital relational database. The affordances of citation networks cannot replace other critical methods, but comparing the citation practices made visible by different networks of attribution provides an opportunity to reconsider how the shared concepts that constitute a (disciplinary) field are produced, transmitted, and inhibited through our media-centric notions of consensus. From working with many nineteenth-century novels, we know that the Bible and Shakespeare were frequently quoted in epigraphs, but little is known about who else was being quoted, and nothing is known about the larger patterns of affiliation and citation assembled by these texts.

Calls for Participation

  • Finding untagged epigraphs in TEI-XML files. Although human transcribers have tagged epigraphs in the XML, some novels in the digitized collection (such as James Fenimore Cooper’s Wyandotte) are mistagged [1, 2]. Want some experience editing TEI? Have an algorithmic approach for tagging “rogue” epigraphs? Drop us a line!

  • Name disambiguation. Nineteenth-century texts show little standardization in how names were spelled. (To wit, there are at least eight spellings of “Shakespeare.”) Epigraph attribution formats are similarly unpredictable: separating titles (when present) from author names (when present) is a difficult algorithmic problem. Want to try out a clever “identity resolution” algorithm on a new data set? Our epigraph collection will be made open access shortly.
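
As a starting point for the name disambiguation problem, one naive baseline is fuzzy string matching against a list of known authors. The sketch below uses Python's standard `difflib` module. The authority list, the similarity cutoff, and the function names are assumptions made for illustration, not part of the project. It compares each word of an attribution against the list and keeps the closest match, so it does not separate titles from author names, and it ignores initials, pseudonyms, and non-ASCII spellings.

```python
import difflib
import re

# Hypothetical authority list; a real one would be far larger.
CANONICAL_AUTHORS = ["Shakespeare", "Byron", "Milton", "Scott"]


def tokenize(attribution):
    """Lowercase an attribution string and split it into alphabetic tokens."""
    return re.sub(r"[^a-z\s]", " ", attribution.lower()).split()


def resolve_author(attribution, canonical=CANONICAL_AUTHORS, cutoff=0.8):
    """Return the canonical name closest to any token, or None if nothing is close.

    The cutoff is a difflib similarity ratio between 0 and 1; the value 0.8
    is an arbitrary assumption and would need tuning on real data.
    """
    by_lowercase = {name.lower(): name for name in canonical}
    best_name, best_score = None, 0.0
    for token in tokenize(attribution):
        matches = difflib.get_close_matches(
            token, list(by_lowercase), n=1, cutoff=cutoff
        )
        if not matches:
            continue
        score = difflib.SequenceMatcher(None, token, matches[0]).ratio()
        if score > best_score:
            best_name, best_score = by_lowercase[matches[0]], score
    return best_name


print(resolve_author("Shakspeare."))            # Shakespeare
print(resolve_author("Byron's Childe Harold"))  # Byron
print(resolve_author("Old Play"))               # None
```

Attributions that resolve to None are the interesting residue: they may name authors missing from the list, or carry only a title.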

Selected Talks / Publications / Press

1 Aaron Plasek and Rob Koehler. “Mediating Genres of Prestige, Credit, and Authority: The Epigraph and the Citation.” Conference paper. Digital Crucible: Arts, Humanities, & Computation. Leslie Center for the Humanities & Neukom Institute for Computational Science. Dartmouth College, Hanover, NH. 7 October 2014.

2 Aaron Plasek. “Fail Better: On Algorithmic ‘Transparency’ as Critical Procedure.” Media Res 1: DH Lightning Talks. CUNY Graduate School, New York, NY. 8 May 2015.