Unlocking 18th- and 19th-century Playbills with AI: An Experiment in Qualitative Data Categorization

Authors

  • Deven Parker, University of Glasgow
  • Kaiwen Zheng, University of Glasgow
  • Michael Gamer, University of Pennsylvania
  • Joemon M. Jose, University of Glasgow

DOI:

https://doi.org/10.60923/issn.2532-8816/21718

Keywords:

playbill, artificial intelligence, theatre, drama, Large Language Models, genre, performance

Abstract

Recent advances in large language models (LLMs) have generated excitement among humanities scholars about how our research might be transformed by their successful integration, particularly for tasks that require multiclass classification or categorization. This paper offers an interdisciplinary exploration, by researchers in drama and computer science, of using generative AI techniques to categorize, tag, and analyze qualitative information from eighteenth- and nineteenth-century British playbills. Our goal was to replicate the process of manually tagging playbills via categorization in a manner that preserves playbills’ intricate nested data structures. Our results indicate that there is promising work to be done on humanities data where an underlying classification structure is already in place; at the same time, our attempts revealed unexpected layers of complexity and ambiguity in this dataset, particularly around genre categorization and nested performance data. Ultimately, we wish to highlight not just LLMs’ capacity to categorize historical data at scale but also their capacity to shed new light on that data’s existing intricacies.

Published

2025-10-21

How to Cite

Parker, D., Zheng, K., Gamer, M., & Jose, J. M. (2025). Unlocking 18th- and 19th-century Playbills with AI: An Experiment in Qualitative Data Categorization. Umanistica Digitale, 9(21), 33–83. https://doi.org/10.60923/issn.2532-8816/21718

Section

Articles