Group for Experimental Methods in Humanistic Research
at Columbia University

Building a Digital Bāburnāma

dataset
  • Moacir P. de Sá Pereira
  • Manan Ahmed

The Embodied Space Lab at Columbia University is constructing a collaborative research dataset of the sixteenth-century memoir Bāburnāma. The project is led by Research Data Librarian Moacir P. de Sá Pereira and Associate Professor of History Manan Ahmed.

The project aims to build a digital version of Annette Beveridge’s 1912–1922 translation of the Bāburnāma, improving on the geospatial data already collected as part of the earlier XPMethod project “Mapping Mughal Hindustan, 1500–1600 CE.” In addition, the researchers will assemble a collection of datasets around the work that track the places and people mentioned, the relationships among those people, and the sentiments expressed, laying the groundwork for a gazetteer of medieval Central Asia.

Beveridge’s translation is available for download from HathiTrust to users at partner institutions and, importantly for our purposes, is also available as a human-edited text file from Project Gutenberg, thanks to the efforts of Barbara Tozier, Turgut Dincer, Bill Tozier, and the Online Distributed Proofreading Team.

Preliminary Research Questions

  • How does named entity recognition compare with manual annotation on a twentieth-century translation of a sixteenth-century text?
  • What geospatial narrative does Bābur encode in the memoir?
  • How turbulent are Bābur’s relationships with other people in the text?
  • How does Bābur relate to place? Is there a map of “Happy Bābur” and a map of “Sad Bābur”?

Phase I (2019–2020)

  • Geospatially annotated digital Beveridge translation
  • Comparison of manual and computer-driven geospatial entity recognition
  • Co-authored publication
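
One common way to frame the comparison of manual and computer-driven geospatial entity recognition is to treat the manual annotations as the reference set and score the automatic output with precision, recall, and F1. The sketch below is only an illustration of that idea, not the project's method: the PlaceMention record, the compare_annotations function, the passage numbering, and the sample mentions are all hypothetical, and matching is by exact (passage, name) pair, which ignores spelling variants and character offsets.

```python
from typing import Iterable, NamedTuple


class PlaceMention(NamedTuple):
    """Hypothetical record: one place name found in one passage."""
    passage: int  # assumed passage index, not an identifier used by the project
    name: str


def compare_annotations(manual: Iterable[PlaceMention],
                        automatic: Iterable[PlaceMention]) -> dict:
    """Score automatic place mentions against manual ones (exact match only)."""
    manual_set, auto_set = set(manual), set(automatic)
    matched = manual_set & auto_set
    precision = len(matched) / len(auto_set) if auto_set else 0.0
    recall = len(matched) / len(manual_set) if manual_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Mentions a human found but the software did not.
        "missed": sorted(manual_set - auto_set),
        # Mentions the software produced that no human marked as a place.
        "spurious": sorted(auto_set - manual_set),
    }


# Made-up sample data for illustration only; not drawn from the project's dataset.
manual = [PlaceMention(1, "Farghana"), PlaceMention(1, "Andijan"),
          PlaceMention(2, "Samarkand"), PlaceMention(3, "Kabul")]
automatic = [PlaceMention(1, "Andijan"), PlaceMention(2, "Samarkand"),
             PlaceMention(2, "Khan"), PlaceMention(3, "Kabul")]

report = compare_annotations(manual, automatic)
print(report)
```

The "missed" and "spurious" lists are often more informative than the scores themselves, since they show which kinds of names the software handles poorly. A fuller comparison would probably need to reconcile transliteration variants before matching.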

Later Phases

  • Biographical annotation
  • Network of persons and relations
  • API access to datasets
  • Reconciliation/synthesis with other editions