Key Pronouns through Wmatrix in a Novel of Formation: Conrad’s The Shadow-Line


  • Giuseppina Balossi, Liceo Scientifico e Musicale 'G.B. Grassi' (Lecco), independent scholar



Joseph Conrad, fiction, corpus linguistics, Wmatrix, corpus-assisted methods, POS keyness analysis.


The advent of the digital age has had an enormous impact on the way we research, teach and think about language (Leech 1992: 1). In digital humanities, computer-assisted methods can facilitate the investigation of corpora (i.e. digitised samples of language in use), lead to discoveries barely detectable with the naked eye, and help put intuition-based interpretation to the test. Such methods can also assist in investigating the language of single texts and make close reading more effective. This article suggests how the narrative voices in a work of fiction may be investigated with the program Wmatrix (Rayson 2009). The case study is Conrad’s Bildungsroman The Shadow-Line, A Confession (1917), a story revolving around a young, inexperienced sea-captain who, during his first command of a ship, has to overcome a series of difficulties to accomplish his mission. In this work, the neat divide between the I protagonist-narrator’s internal world and the external adult world he has to confront lends itself to an investigation of the first-person and other personal pronouns across the whole work. Given that stylistic analysis of literary texts is fundamentally a comparative process, the target text, The Shadow-Line, is compared with two comparison texts by the same author. The keyness statistics for the pronouns, together with a detailed analysis of their concordances, help to place the I-voice at the centre of the narration and to identify the foregrounded patterns of pronoun use that convey the I-voice and the other narrative voices throughout the story.
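As a rough illustration of what a keyness statistic for a pronoun involves, the following minimal Python sketch computes the log-likelihood measure described in Rayson (2008), comparing the frequency of one item in a target text against a comparison text. It is a sketch under stated assumptions, not the Wmatrix implementation: the function name and all counts below are hypothetical and are not taken from the study.

```python
import math


def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness of one item across two texts.

    freq_* are raw counts of the item; size_* are total token counts.
    """
    total_freq = freq_target + freq_ref
    total_size = size_target + size_ref

    # Expected counts if the item were equally frequent in both texts.
    expected_target = size_target * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size

    ll = 0.0
    for observed, expected in (
        (freq_target, expected_target),
        (freq_ref, expected_ref),
    ):
        # A zero observed count contributes nothing to the sum.
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2.0 * ll


def is_overused(freq_target, size_target, freq_ref, size_ref):
    """True if the item is relatively more frequent in the target text."""
    return freq_target / size_target > freq_ref / size_ref


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not data from the article).
    counts = {"freq_target": 300, "size_target": 10_000,
              "freq_ref": 200, "size_ref": 10_000}
    score = log_likelihood(**counts)
    direction = "overused" if is_overused(**counts) else "underused"
    # 3.84 is the chi-square critical value for p < 0.05 with 1 d.f.
    print(f"LL = {score:.2f}, {direction}, significant = {score > 3.84}")
```

The score alone does not say whether an item is over- or under-represented, which is why the direction is checked separately by comparing relative frequencies.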





Balossi, G. (2014). A Corpus Linguistic Approach to Literary Language and Characterization: Virginia Woolf’s The Waves. Amsterdam: John Benjamins.

Balossi, G. (2015). A computer-aided approach to I and the World in Conrad’s The Shadow Line. On-line Proceedings of the Annual Conference of the Poetics and Linguistics Association (PALA) pp.1-18. [ conferences/proceedings/2015/Balossi2015.pdf]

Benson, C. (1954). Conrad’s Two Stories of Initiation. PMLA, 69(1), 46-56. doi:10.2307/460126

Biber, D. (2011). Corpus linguistics and the study of literature: Back to the future? Scientific Study of Literature 1(1), pp. 15-23. DOI: 10.1075/ssol.1.1.02bib

Biber, D., Conrad, S., Finegan, E., Leech, G. & Johansson, S. (1999). Longman Grammar of Spoken and Written English. Harlow: Longman.

Biber, D., Conrad, S. & Reppen, R. (1998). Corpus Linguistics: Investigating Language Structure and Use. Cambridge: CUP.

Burrows, J. F. (1987). Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Method. Oxford: Clarendon Press.

Conrad, J. [1917] (2013). The Shadow-Line, A Confession. The Cambridge Edition of the Works of Joseph Conrad, Stape, J. H., Simmons, A. S. (eds). Cambridge: Cambridge University Press.

Conrad, J. [1917] (2006). The Shadow-Line. Electronic version [EBook #451]. Retrieved February 21, 2018.

Conrad, J. [1902] (2009). Heart of Darkness. Electronic version [EBook #219]. Retrieved February 21, 2018.

Conrad, J. [1910] (2009). The Secret Sharer. Electronic version [EBook #220]. Retrieved February 21, 2018.

Damerau, F.J. (1975). The Use of Function Word Frequencies as Indicators of Style. Computers and the Humanities, 9, pp. 271-280.

Garside, R. & Smith, N. (1997). A hybrid grammatical tagger: CLAWS4. In Garside, R., Leech, G., McEnery, A. (eds). Corpus Annotation: Linguistic Information from Computer Text Corpora, pp. 102-121. London: Longman.

Gibbons, A. & Macrae, A. (eds) (2018). Pronouns in Literature: Positions and Perspectives in Language. Basingstoke: Palgrave Macmillan.

Hirsch, M. (1979). The Novel of Formation as Genre: Between Great Expectations and Lost Illusions. Genre 12 (3), pp. 293-311.

Hoover, D.L. (2017). The microanalysis of style variation. Digital Scholarship in the Humanities, Vol. 32, Supplement 2, pp. ii17–ii30. Downloaded from

Kestemont, M. (2014). Function Words in Authorship Attribution. From Black Magic to Theory? Proceedings of the Third Computational Linguistics for Literature Workshop, co-located with EACL 2014 - the 14th Conference of the European Chapter of the Association for Computational Linguistics (27 April 2014, Gothenburg, Sweden), pp.59-66.

Leech, G. (1992). Corpora and theories of linguistic performance. In Jan Svartvik (ed.), Directions in corpus linguistics, pp. 105-122. Berlin & New York: Mouton de Gruyter.

Leech, G. (2013). Virginia Woolf meets Wmatrix. Études de stylistique anglaise. Online since 19 February 2019, connection on 02 May 2019. URL: esa/1405; DOI: 10.4000/esa.1405

Mahlberg, M. (2013). Corpus Stylistics and Dickens’s Fiction. London: Routledge.

Mahlberg, M. & Stockwell, P. (2016). Point and CLiC: teaching literature with corpus stylistic tools. In Burke, M., Zyngier, S., Fialho, O. (eds). Scientific Approaches to Literature in Learning Environments, pp. 253-270. Amsterdam: John Benjamins.

McIntyre, D. & Archer, D. (2010). A corpus-based approach to mind-style. Journal of Literary Semantics 39(2), pp. 167-182.

Murphy, S. (2015). I will proclaim myself what I am: Corpus stylistics and the language of Shakespeare’s soliloquies. Language and Literature 24 (4), pp. 338-354.

Propp, V. (1968) (2nd edn). The Morphology of the Folktale. (Trans. by Laurence Scott). Austin: University of Texas Press.

Rayson, P. (2008). From key words to key semantic domains. International Journal of Corpus Linguistics. 13(4), pp. 519-549. DOI: 10.1075/ijcl.13.4.06ray

Rayson, P. (2009). Wmatrix: a web-based corpus processing environment, Computing Department, Lancaster University.

Schreibman, S., Siemens, R. & Unsworth, J. (eds) (2016). A New Companion to Digital Humanities (2nd edn). Oxford: Wiley-Blackwell.

Scott, M. (1997). PC analysis of key words - and key key words. System 25(2), pp. 233-245. DOI:


Stubbs, M. (2005). Conrad in the computer: Examples of quantitative stylistic methods. Language and Literature 14(1), pp. 5-24. DOI: 10.1177/0963947005048873

Tabata, T. (1995). Narrative Style and the Frequencies of Very Common Words: A Corpus-Based Approach to Dickens’s First Person and Third Person Narratives. English Corpus Studies (2), pp. 91-109.




How to Cite

Balossi, G. (2020). Key Pronouns through Wmatrix in a Novel of Formation: Conrad’s The Shadow-Line. Umanistica Digitale, 9, 79-96.