DOI: http://doi.org/10.6092/issn.2532-8816/8615

Abstract

In this paper I discuss the distribution of grammatical monosyllables in the iambic pentameter line. I show that in Milton’s Paradise Lost, the word OF appears with greater than expected frequency at the beginning of the line; 27% of all instances of OF are in the first of the ten metrical positions, and 5% of all the lines begin with OF, making it the second most frequent line-initial word. I suggest that this might reflect the way that Milton uses enjambement in the poem. It also means that OF may function as a clue to the beginning of the line, in the context of other evidence for lineation, essential if the audience is to establish the metrical form of the poem. In contrast, in the eighteenth century blank verse long poems of Thomson and Cowper, the word AND is relatively frequent at the beginning of the line. But Wordsworth uses both OF and AND as frequent line-initial words, merging Milton’s formal practice with the practice of other writers. The paper concludes by reflecting on the relation between statistical characteristics of text and probabilistic aspects of our knowledge of literary form.

Introduction

This paper is an exercise in word-counting, involving John Milton’s long seventeenth-century poem Paradise Lost and a selection of long verse texts. Word-counting is a standard practice in much digital humanities research, but this paper adds a contextual feature not often considered, and specific to verse, defined as text which is divided into lines. I count OF, AND, and some other common non-referential grammatical monosyllables, and I focus specifically on whether these monosyllables are in line-initial position or some other position. This question is particularly interesting for Paradise Lost (PL), a poem whose first two lines begin with the very frequent word OF, and in which 5% of lines overall begin with OF. The statistic on which I focus is that 27% of all instances of OF are line-initial. I look at a selection of other iambic pentameter poems after PL, such as Thomson’s The Seasons, and find in contrast that OF is not relatively frequent at the beginning of the line, but that the highly frequent word AND is (unlike in PL). Wordsworth in the 1805 Prelude uses both OF and AND frequently at the beginning of the line, thus merging Milton’s OF pattern with the AND pattern of predecessors who influenced him, such as Thomson.

It is likely that these distributions are side-effects of another practice: that certain words appear at the beginning of the line because of the syntactic or discourse structure of the text, relative to the poet’s enjambement practice; that is, OF may be particularly frequent in PL because noun phrases are easily split across lines, while AND may be particularly frequent in other poets who favour a more paratactic sequence with less radical phrase-splitting enjambements. However, even if the distributions of these words at line-beginning are side-effects of linguistic and poetic structure, it is also possible for them to function as clues to the line boundary. That is, if a word is relatively frequent at the beginning of the line, then it offers evidence for where the line boundary falls (evidence which is weak on its own, but strengthened in the context of other evidence). A hearer of the text (and a reader) must be able to establish line boundaries in order to parse the metre of the poem, which I suggest is a necessary part of the poem’s reception.

The distribution of grammatical monosyllables in iambic pentameter poetry

The iambic pentameter line, and grammatical monosyllables in books 8-9 of Paradise Lost

All the poems to be considered here are in iambic pentameter. The iambic pentameter line is matched to a metrical template of ten positions, alternating weak (W) and strong (S). The line can also be thought of as a sequence of five iambic feet, each consisting of a weak position followed by a strong one.

W S W S W S W S W S

These ten positions are filled by ten syllables. Sometimes two syllables can fit a single position, and sometimes there is an extra syllable at the end. There is a characteristic rhythm, with stressed syllables tending to fall in the S positions, and unstressed or weakly stressed syllables tending to fall in the W positions. This rhythm is rarely completely periodic (i.e., alternating unstressed and stressed syllables across the line). The precise constraints on what rhythmic variations are possible have been much discussed in metrical theory; Robert Bridges wrote a book on Milton’s prosody, and recent theoretical approaches include Hanson and Kiparsky, Fabb and Halle, and Hayes, Wilson and Shisko.

In this paper, I focus on thirteen grammatical monosyllables. These include the ten words which Ingram and Swaim (p. ix) omitted from their printed concordance of Milton because they were of such high frequency: A, AND, BY, FOR, IN, OF, ON, THE, TO, WITH. To these I have added the preposition AT (the next most common of the prepositions in PL), and OR and BUT (more common in PL than some of the words listed by Ingram and Swaim). I have treated A and AN as a single word. All of these words (prepositions, articles and conjunctions) tend to be unstressed, and they are much more likely to appear in W positions than in S positions. They are thus good candidates for the first position in the line. I have not included grammatical monosyllables which can carry reference, namely the pronouns and demonstratives, both because they are easily stressed and because they carry significant semantic meaning; their distribution is thus likely to be subject to different factors than that of the prepositions, articles and conjunctions under consideration here.
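The counting procedure described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokenization rule and the names `TARGETS`, `tokenize` and `count_initial` are hypothetical, and the sketch treats the first word of a line as the occupant of position 1 rather than scanning the line metrically.

```python
import re
from collections import Counter

# The thirteen grammatical monosyllables listed above (AN is folded into A).
TARGETS = {"A", "AND", "AT", "BUT", "BY", "FOR", "IN",
           "OF", "ON", "OR", "THE", "TO", "WITH"}

def tokenize(line):
    """Upper-case word tokens; apostrophes stay inside words (assumed rule)."""
    return [w.upper() for w in re.findall(r"[A-Za-z'’]+", line)]

def count_initial(lines):
    """Return (total, initial) Counters for the target words."""
    total, initial = Counter(), Counter()
    for line in lines:
        for index, word in enumerate(tokenize(line)):
            word = "A" if word == "AN" else word
            if word in TARGETS:
                total[word] += 1
                if index == 0:  # first word of the line
                    initial[word] += 1
    return total, initial

# The opening lines of Paradise Lost, used only as a tiny sample.
opening = [
    "Of Mans First Disobedience, and the Fruit",
    "Of that Forbidden Tree, whose mortal taste",
    "Brought Death into the World, and all our woe,",
    "With loss of Eden, till one greater Man",
    "Restore us, and regain the blissful Seat,",
]
total, initial = count_initial(opening)
for word in sorted(total):
    print(word, total[word], initial[word])
```

On a full text the same two counters give the "Total" and "Position 1" figures reported for each word; a real edition would first need the preprocessing (removal of headings, speaker labels and so on) that the sample above does not require.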

This paper centres on the distribution of OF in PL. Bruce Hayes has prepared a version of books 8 and 9 of the 1667 edition of PL in which syllables are separated into metrical positions, thus making it easy to discover how the grammatical monosyllables are distributed across the ten positions of the iambic pentameter line, as shown in the table below. At 2,293 lines, these two books constitute just over a fifth of the whole poem.

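| | W | S | W | S | W | S | W | S | W | S | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | total |
| AND | 122 | 6 | 80 | 46 | 114 | 34 | 115 | 53 | 76 | 0 | 646 |
| | 19% | 1% | 12% | 7% | 18% | 5% | 18% | 8% | 12% | 0% | |
| TO | 109 | 36 | 47 | 40 | 56 | 31 | 79 | 75 | 73 | 0 | 546 |
| | 20% | 7% | 9% | 7% | 10% | 6% | 14% | 14% | 13% | 0% | |
| THE | 82 | 11 | 115 | 4 | 75 | 4 | 100 | 4 | 127 | 0 | 522 |
| | 16% | 2% | 22% | 1% | 14% | 1% | 19% | 1% | 24% | 0% | |
| OF | 128 | 19 | 53 | 23 | 80 | 21 | 62 | 23 | 46 | 0 | 455 |
| | 28% | 4% | 12% | 5% | 18% | 5% | 14% | 5% | 10% | 0% | |
| IN | 51 | 29 | 34 | 24 | 22 | 17 | 41 | 32 | 34 | 1 | 285 |
| | 18% | 10% | 12% | 8% | 8% | 6% | 14% | 11% | 12% | 0% | |
| WITH | 49 | 18 | 30 | 15 | 38 | 14 | 36 | 16 | 26 | 0 | 242 |
| | 20% | 7% | 12% | 6% | 16% | 6% | 15% | 7% | 11% | 0% | |
| OR | 36 | 1 | 24 | 17 | 22 | 9 | 28 | 14 | 31 | 0 | 182 |
| | 20% | 1% | 13% | 9% | 12% | 5% | 15% | 8% | 17% | 0% | |
| BUT | 44 | 1 | 10 | 12 | 21 | 8 | 19 | 16 | 20 | 0 | 151 |
| | 29% | 1% | 7% | 8% | 14% | 5% | 13% | 11% | 13% | 0% | |
| ON | 19 | 15 | 8 | 11 | 13 | 15 | 7 | 13 | 4 | 4 | 109 |
| | 17% | 14% | 7% | 10% | 12% | 14% | 6% | 12% | 4% | 4% | |
| FOR | 36 | 4 | 5 | 9 | 20 | 3 | 10 | 4 | 10 | 0 | 101 |
| | 36% | 4% | 5% | 9% | 20% | 3% | 10% | 4% | 10% | 0% | |
| BY | 26 | 8 | 9 | 8 | 12 | 9 | 12 | 6 | 5 | 1 | 96 |
| | 27% | 8% | 9% | 8% | 13% | 9% | 13% | 6% | 5% | 1% | |
| A/AN | 13 | 2 | 20 | 0 | 15 | 0 | 17 | 0 | 18 | 0 | 79 |
| | 15% | 2% | 24% | 0% | 18% | 0% | 20% | 0% | 21% | 0% | |
| AT | 9 | 6 | 6 | 4 | 7 | 3 | 5 | 9 | 17 | 0 | 66 |
| | 14% | 9% | 9% | 6% | 11% | 5% | 8% | 14% | 26% | 0% | |
| total | 724 | 156 | 441 | 213 | 495 | 168 | 531 | 265 | 487 | 6 | 3480 |
| | 21% | 4% | 13% | 6% | 14% | 5% | 15% | 8% | 14% | 0% | |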

The distribution of the thirteen words in books 8 and 9 of PL (1667), ordered by overall frequency

The top two rows of the table show the ten metrical positions of the iambic pentameter line, and whether each position is weak or strong. The left-hand column shows the thirteen grammatical monosyllables under examination. The right-hand column shows the total number of tokens of that word in the text. The cells show the number of tokens in each position, and the percentage relative to the total for that word. In this table, the monosyllables are ordered vertically by frequency in these two books of the poem. For comparative purposes, I will retain the same order throughout this paper (though the frequency order is not the same in all the poems).
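As a check on how the cells are derived, the percentage row for a word can be recomputed from its count row. The sketch below does this for OF; the function name `position_shares` is hypothetical, and rounding to whole percentages is assumed from the way the table is printed.

```python
def position_shares(counts):
    """Return (total, percentages) for one word's counts over positions 1-10."""
    total = sum(counts)
    return total, [round(100 * c / total) for c in counts]

# OF in books 8 and 9 of PL (1667), positions 1 to 10, copied from the table.
of_counts = [128, 19, 53, 23, 80, 21, 62, 23, 46, 0]
total, shares = position_shares(of_counts)
print(total)   # number of OF tokens
print(shares)  # whole-number percentage in each position
```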

Because the grammatical monosyllables tend to be unstressed or weakly stressed, the metrical rules for iambic pentameter mean that they tend to fall in W positions, as can be seen in the table above. The table below illustrates this by showing nine lines, each with OF in a different metrical position; the rightmost column indicates the overall percentage of OF in that position in these two books. OF strongly tends to be in W positions rather than S positions.

| Line | Book:line | Position | S/W | % |
|---|---|---|---|---|
| Of Man, with strength entire, and free will arm’d, | 9: 9 | 1 | W | 28 |
| Shorn of his strength, They destitute and bare | 8: 1062 | 2 | S | 4 |
| From out of Chaos to the out side bare | 9: 317 | 3 | W | 12 |
| O fairest of Creation, last and best | 8: 896 | 4 | S | 5 |
| Deep to the Roots of Hell the gather’d beach | 9: 299 | 5 | W | 18 |
| Satan in likeness of an Angel bright | 9: 327 | 6 | S | 5 |
| Wondrous indeed, if cause of such effects. | 8: 650 | 7 | W | 14 |
| Above all Cattle, each Beast of the Field; | 9: 176 | 8 | S | 5 |
| O Conscience, into what Abyss of fears | 9: 842 | 9 | W | 10 |
| (NO EXAMPLES) | | 10 | S | 0 |

OF in the ten metrical positions in iambic pentameter in Books 8 and 9 of PL 1667 (equivalent to books 9 and 10 of PL 1674)

For comparison, the table below shows ten distinct lines from books 8-9 with ON in each of the ten positions. ON can be unstressed, weakly stressed or stressed, and can appear even in the very strong tenth position. ON appears overall 46% in weak positions and 54% in strong positions.

| Line | Book:line | Position | S/W | % |
|---|---|---|---|---|
| On what thou hast of vertue, summon all, | 8: 374 | 1 | W | 17 |
| Till on a day roaving the field, I chanc’d | 8: 575 | 2 | S | 14 |
| Chiefly on Man, sole Lord of all declar’d, | 9: 401 | 3 | W | 7 |
| The evil on him brought by me, will curse | 9: 734 | 4 | S | 10 |
| To Beasts, whom God on thir Creation-Day | 8: 556 | 5 | W | 12 |
| Bitter ere long back on it self recoiles; | 8: 172 | 6 | S | 14 |
| Shall with a fierce reflux on mee redound, | 9: 739 | 7 | W | 6 |
| There dwell and Reign in bliss, thence on the Earth | 9: 399 | 8 | S | 12 |
| Meanwhile ere thus was sin’d and judg’d on Earth, | 9: 229 | 9 | W | 4 |
| Like a black mist low creeping, he held on | 8: 180 | 10 | S | 4 |

ON in the ten metrical positions in iambic pentameter in Books 8 and 9 of PL 1667

The grammatical words may thus differ in how likely they are to occupy various positions.
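The contrast between OF and ON can be made concrete by summing the counts for the odd-numbered (W) and even-numbered (S) positions in the first table. The sketch below is illustrative only; `weak_strong_share` is a hypothetical helper. Computed from the raw counts, ON comes out at roughly 47% weak and 53% strong, a point away from the 46% and 54% quoted above, which is what one obtains by adding up the rounded per-position percentages instead.

```python
def weak_strong_share(counts):
    """Share of tokens in weak (odd) and strong (even) positions."""
    weak = sum(counts[0::2])    # positions 1, 3, 5, 7, 9
    strong = sum(counts[1::2])  # positions 2, 4, 6, 8, 10
    total = weak + strong
    return weak / total, strong / total

# Counts per position in books 8 and 9 of PL (1667), from the first table.
of_counts = [128, 19, 53, 23, 80, 21, 62, 23, 46, 0]
on_counts = [19, 15, 8, 11, 13, 15, 7, 13, 4, 4]

of_weak, of_strong = weak_strong_share(of_counts)
on_weak, on_strong = weak_strong_share(on_counts)
print(f"OF: {of_weak:.0%} weak, {of_strong:.0%} strong")
print(f"ON: {on_weak:.0%} weak, {on_strong:.0%} strong")
```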

Iambic pentameter lines are sometimes considered to have a ‘caesura’, not a strict rule but an expectation that there is a major pause mid-way through the line, typically after the fourth or sixth syllable. I now consider whether placement of grammatical words is sensitive to this probabilistic mid-line boundary. Consider for example the use of OF after major pauses, as marked in the printed text by a comma or period in Books 8 and 9 of PL. As the table below shows, where OF appears after these major pauses, the pause is most likely to be after position 4 (the position of one of the ‘caesurae’).

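| | W | S | W | S | W | S | W | S | W | S | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | total |
| OF | 128 | 20 | 53 | 24 | 77 | 20 | 62 | 23 | 46 | 0 | 453 |
| | 28% | 4% | 12% | 5% | 17% | 4% | 14% | 5% | 10% | | |
| OF after major pause | 17 | 1 | 4 | 3 | 16 | 1 | 10 | 4 | 3 | 0 | 59 |
| % of total OF in this position | 13% | 5% | 8% | 13% | 21% | 5% | 16% | 17% | 7% | | 13% |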

OF after a strong juncture in 1667 PL books 8 and 9

The interpretation of this finding is difficult, though it weakly supports the relevance of the caesura. It might suggest that where OF does not come at the beginning of a line, it is still drawn to the beginning of a half-line, thus marking out or responding to a metrical boundary, even if not the line boundary.

The bottom two rows of the first table above (the distribution of the thirteen words in books 8 and 9) show that, taken as a group, the grammatical words have a relative distribution of 21% in first position and 79% in other positions. It is also notable that these words are almost never line-final: only 6 instances out of 3,480 are in tenth position. For all the poems discussed in this paper, for this set of grammatical words, a similar distribution is found: in the full Paradise Lost 20% of these grammatical words are line-initial, in Milton’s Paradise Regained 22%, in Shakespeare’s Antony and Cleopatra 20%, in Dryden’s Aeneid 21%, in Thomson’s Seasons 23%, in Cowper’s Task 23% and in Wordsworth’s 1805 Prelude 22%. This is a striking cross-textual correlation, given the multiple and varying causal factors which are probably involved.

In order to test statistical significance, I assume that a word belonging to this set of grammatical words has an expected distribution of 21% at the beginning of the line and 79% in the remaining nine positions, and I check the actual distribution in any poem against this. Other alternatives might have been considered; for example, we might have assumed that these weakly stressed monosyllables appear at chance level in any of the five weak metrical positions, but this is not a realistic expectation because of the various other factors involved in determining the distribution of words of this kind. It remains unclear why these grammatical words tend to appear between 20% and 23% of the time in first position. We might have expected a lower percentage, because grammatical words can appear in any of the first nine positions, and thus we might have expected several percent less than 20% in first position. I assume therefore that multiple factors conspire to make grammatical monosyllables of this kind appear 21% of the time in first position.

For each of the poems I examine in this paper, I produce a table like the one below, here illustrated by the findings for Books 8 and 9 of PL (1667 edition), after preprocessing of available digital editions.

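| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 646 | 122 | 19% | 524 | 81% | 1.741 | 0.187 |
| TO | 546 | 109 | 20% | 437 | 80% | 0.354 | 0.552 |
| THE | 522 | 82 | 16% | 440 | 84% | 8.809 | 0.003 |
| OF | 455 | 128 | 28% | 327 | 72% | 13.950 | 0.000 |
| IN | 285 | 51 | 18% | 234 | 82% | 1.657 | 0.198 |
| WITH | 242 | 49 | 20% | 193 | 80% | 0.083 | 0.774 |
| OR | 182 | 36 | 20% | 146 | 80% | 0.163 | 0.686 |
| BUT | 151 | 44 | 29% | 107 | 71% | 6.030 | 0.014 |
| ON | 109 | 19 | 17% | 90 | 83% | 0.837 | 0.360 |
| FOR | 101 | 36 | 36% | 65 | 64% | 13.055 | 0.000 |
| BY | 96 | 26 | 27% | 70 | 73% | 2.142 | 0.143 |
| A/AN | 79 | 13 | 16% | 66 | 84% | 0.983 | 0.321 |
| AT | 66 | 9 | 14% | 57 | 86% | 2.157 | 0.142 |
| total | 3480 | 724 | 21% | 2756 | 79% | | |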

Books 8 and 9 of Milton: Paradise Lost (1667 edition)

The χ2 test used here examines differences between the observed distribution of each word in first position vs. other positions, compared with the expected 21%-79% distribution (determined as described above). I constructed two categories (similar to a Yes-No scale) to record whether or not an observation occurs in the first position. The table above reports the distributions for two books of PL. It gives the χ2 value for each word, and significance is assessed at p < 0.05, i.e., at a 95% confidence level. A p value below 0.05 indicates a significant difference, either a greater than expected frequency or a lower than expected frequency. Values in the χ2 and p columns are rounded to three decimal places.
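The reported values are consistent with a two-category goodness-of-fit statistic with one degree of freedom, in which the expected counts are 21% and 79% of a word's total. That reading is a reconstruction from the figures rather than a stated formula, and the symbols $N$, $O$ and $E$ are introduced here for illustration. Worked through for AND in books 8 and 9:

```latex
\[
\chi^2 = \frac{(O_{1} - E_{1})^2}{E_{1}}
       + \frac{(O_{\text{other}} - E_{\text{other}})^2}{E_{\text{other}}},
\qquad E_{1} = 0.21\,N, \quad E_{\text{other}} = 0.79\,N
\]
\[
N = 646,\quad O_{1} = 122,\quad O_{\text{other}} = 524
\;\Rightarrow\;
E_{1} = 135.66,\quad E_{\text{other}} = 510.34
\]
\[
\chi^2 = \frac{(-13.66)^2}{135.66} + \frac{(13.66)^2}{510.34}
\approx 1.375 + 0.366 = 1.741
\]
```

This matches the value of 1.741 given for AND in the table, and the same arithmetic reproduces the other rows.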

Paradise Lost

Now consider the table below, which extends the preceding table to show the distribution of initial vs. non-initial words in all 10,565 lines of Paradise Lost (1674 edition).

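| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 3336 | 578 | 17% | 2758 | 83% | 27.141 | 0.000 |
| TO | 2227 | 446 | 20% | 1781 | 80% | 1.271 | 0.260 |
| THE | 2752 | 399 | 14% | 2353 | 86% | 70.117 | 0.000 |
| OF | 2051 | 557 | 27% | 1494 | 73% | 46.873 | 0.000 |
| IN | 1365 | 273 | 20% | 1092 | 80% | 0.823 | 0.364 |
| WITH | 1162 | 277 | 24% | 885 | 76% | 5.642 | 0.018 |
| OR | 714 | 134 | 19% | 580 | 81% | 2.145 | 0.143 |
| BUT | 588 | 171 | 29% | 417 | 71% | 23.149 | 0.000 |
| ON | 536 | 86 | 16% | 450 | 84% | 7.933 | 0.005 |
| FOR | 466 | 120 | 26% | 346 | 74% | 6.341 | 0.012 |
| BY | 515 | 137 | 27% | 378 | 73% | 9.742 | 0.002 |
| A/AN | 563 | 107 | 19% | 456 | 81% | 1.350 | 0.245 |
| AT | 272 | 46 | 17% | 226 | 83% | 2.740 | 0.098 |
| total | 16547 | 3331 | 20% | 13216 | 80% | | |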

Milton Paradise Lost 1674

In PL, four words have a distribution in first position which (at 26-29%) is significantly higher than the expected distribution of 21%: OF, BUT, FOR and BY. Of these four words, only OF is a highly frequent word in the text as a whole, a factor which will be important in my suggestion that OF is a clue to the location of the line boundary: the more frequent the word, the better it functions as a clue. In contrast, the overall highly frequent word THE shows a distribution in first position significantly lower than the expected distribution; I do not focus on such significantly low distributions in this paper, since it is not clear that they serve any function in communicating the line boundaries to the audience.
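The classification of words as significantly above or below the expected 21% can be reproduced from the totals and line-initial counts in the table. The following is a minimal sketch, assuming a one-degree-of-freedom goodness-of-fit test (an inference from the reported values, not a stated method); the function and variable names are hypothetical, and the p value uses the identity that for one degree of freedom the upper tail of χ2 equals erfc(√(χ2/2)). Applied mechanically, the p < 0.05 criterion also flags WITH (24%, p = 0.018), which lies below the 26-29% band singled out in the text.

```python
import math

EXPECTED_INITIAL = 0.21  # assumed expected share of tokens in position 1

def chi_square_initial(total, initial, expected=EXPECTED_INITIAL):
    """Goodness-of-fit chi-square (1 df) for initial vs. other positions."""
    exp_initial = expected * total
    exp_other = (1 - expected) * total
    other = total - initial
    chi2 = ((initial - exp_initial) ** 2 / exp_initial
            + (other - exp_other) ** 2 / exp_other)
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df
    return chi2, p

# (total, line-initial) for Paradise Lost 1674, copied from the table.
paradise_lost = {
    "AND": (3336, 578), "TO": (2227, 446), "THE": (2752, 399),
    "OF": (2051, 557), "IN": (1365, 273), "WITH": (1162, 277),
    "OR": (714, 134), "BUT": (588, 171), "ON": (536, 86),
    "FOR": (466, 120), "BY": (515, 137), "A/AN": (563, 107),
    "AT": (272, 46),
}

high, low = [], []
for word, (total, initial) in paradise_lost.items():
    chi2, p = chi_square_initial(total, initial)
    if p < 0.05:
        (high if initial / total > EXPECTED_INITIAL else low).append(word)
    print(f"{word:5s} {initial / total:5.0%}  chi2={chi2:8.3f}  p={p:.3f}")

print("significantly high:", high)
print("significantly low: ", low)
```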

Of the 2,051 instances of OF in PL, 557 are at the beginning of the line, 27% of the total. This is a statistically significant difference from what I have suggested is the expected 21% for a grammatical non-referential monosyllable in this position. 5% of the 10,565 lines in PL begin with OF, and with 557 instances this makes OF the second most frequent line-initial word in the poem.

I propose that OF provides the reader (or listener) with evidence for line boundaries, evidence which is interpreted in the context of other evidence offered by the text. This enables the audience to establish lineation, and hence process the metre of the text. OF can perform this function because of its combination of high overall frequency and high initial frequency. Furthermore, the word OF is the first word in the first two lines of the poem, which might further contribute to its being taken as evidence of a line boundary. The first six lines of the poem are quoted below:

Of Mans First Disobedience, and the Fruit

Of that Forbidden Tree, whose mortal taste

Brought Death into the World, and all our woe,

With loss of Eden, till one greater Man

Restore us, and regain the blissful Seat,

Sing Heav’nly Muse, …

Here we see what Zwicky and Zwicky refer to as ‘patterns first’, the tendency for regularities to be established at the beginnings of poems. Because PL refers to the Aeneid in its beginning, it is interesting to compare the beginning of Thomas Phaer’s translation of Virgil’s Aeneid in iambic heptameter rhyming couplets. When completed by Thomas Twyne after Phaer’s death, this became the first full translation of the poem into English. Here are the first six lines (with several new lines before the translation proper begins in line 4):

I that my slender Oten Pipe in Verse was wont to sounde

Of woods, and next to that I taught for husbandmen the ground,

How fruite unto their greedy lust they might constraine to bring,

A work of thankes: Lo now of Mars, and dreadfull warres I singe,

Of armes, and of the man of Troy, that first by fatall flight

Did thence arrive to Lauine Land, that now Italia hight.

(Lally, 1987: 7)

Two of the first six lines begin with OF, and the fifth line has three instances of OF. Virgil’s Aeneid is relevant because the beginning of PL alludes to the beginning of the Aeneid in various ways, and specifically to the first three words of that poem: arma virumque cano, literally arms man-the-and sing-I, whose second and third words are picked up in the first six lines of PL (quoted above). ‘Arms’ and ‘man’ are in the accusative case in Latin, and some of the English translations choose to translate this literally, as Arms and the man I sing who first did come (John Ogilby, 1649), or Arms, and the Man I sing, who, forc’d by Fate. Milton is of course not translating the Aeneid in PL, but his use of OF in Of man’s ... echoes Phaer.

Milton’s distinctive placement of OF at the beginning of the line is amplified by John Philips in his 1701 parody of Milton, ‘The Splendid Shilling’, a poem of 141 blank verse lines (p. 112), in which 7 of the 21 uses of OF are line-initial. This is a higher proportion than in PL, but we might expect distortions in a parody. The poem is so small a sample that significance tests are unreliable.

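| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 33 | 4 | 12% | 29 | 88% | 1.568 | 0.211 |
| TO | 15 | 4 | 27% | 11 | 73% | 0.290 | 0.590 |
| THE | 41 | 8 | 20% | 33 | 80% | 0.055 | 0.815 |
| OF | 21 | 7 | 33% | 14 | 67% | 1.926 | 0.165 |
| IN | 19 | 3 | 16% | 16 | 84% | 0.311 | 0.577 |
| WITH | 25 | 9 | 36% | 16 | 64% | 3.391 | 0.066 |
| OR | 23 | 5 | 22% | 18 | 78% | 0.008 | 0.931 |
| BUT | 4 | 3 | 75% | 1 | 25% | 7.031 | 0.008 |
| ON | 4 | 1 | 25% | 3 | 75% | 0.039 | 0.844 |
| FOR | 2 | 0 | 0% | 2 | 100% | 0.532 | 0.466 |
| BY | 5 | 1 | 20% | 4 | 80% | 0.003 | 0.956 |
| A/AN | 20 | 4 | 20% | 16 | 80% | 0.012 | 0.913 |
| AT | 6 | 0 | 0% | 6 | 100% | 1.595 | 0.207 |
| total | 218 | 49 | 22% | 169 | 78% | | |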

Philips: ‘The Splendid Shilling’
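One way to see why a poem of this size gives unreliable tests is to look at the expected line-initial count for each word. The sketch below applies the conventional rule of thumb that a χ2 test needs an expected count of at least 5 in every cell; that threshold is a common convention brought in here as an assumption, not a criterion stated in this paper, and the names used are hypothetical.

```python
EXPECTED_INITIAL = 0.21  # expected share of tokens in position 1
MIN_EXPECTED = 5         # conventional rule of thumb (an assumption here)

# Total tokens of each word in 'The Splendid Shilling', from the table.
philips_totals = {
    "AND": 33, "TO": 15, "THE": 41, "OF": 21, "IN": 19, "WITH": 25,
    "OR": 23, "BUT": 4, "ON": 4, "FOR": 2, "BY": 5, "A/AN": 20, "AT": 6,
}

def expected_initial(total, expected=EXPECTED_INITIAL):
    """Expected number of line-initial tokens under the 21% assumption."""
    return expected * total

adequate = [word for word, total in philips_totals.items()
            if expected_initial(total) >= MIN_EXPECTED]

for word, total in philips_totals.items():
    print(f"{word:5s} total={total:3d}  expected initial={expected_initial(total):5.2f}")
print("words with an expected initial count of at least 5:", adequate)
```

By this check only three of the thirteen words have enough tokens; OF, with an expected 4.41 line-initial tokens, falls just short, and the apparently significant result for BUT rests on four tokens in all.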

If we return now to Paradise Lost, what might we say about other grammatical words? In terms of relative distribution in first position compared with other positions, BUT and BY are both about as relatively frequent as OF, and WITH and FOR are slightly less frequent than OF. These words might also constitute evidence for the line boundary, but because they are much less frequent overall, their status as evidence is weak. It is worth noting the relatively low percentage (17%) of instances of AND at the beginning of the line; this is interesting, because we will see that in non-Miltonic blank verse, AND is in contrast quite frequent at the beginning of the line.

I have suggested that the effect of having a higher than expected distribution of OF is that it allows OF to be evidence for the line boundary, if combined with other evidence (as discussed below). But this effect might be unconnected with its cause: OF may appear with higher than expected distribution at the beginning of lines in PL as a side-effect of some other characteristic of the poem, such as its enjambements. For example, it is entirely plausible that Milton favours a kind of enjambement which splits a noun phrase across the line boundary, after the line-final noun, such that the next line begins with a prepositional phrase headed by OF. The table below shows the first twenty instances of line-initial OF in PL; all but two have the OF-phrase as the complement of a noun, within a noun phrase.

| Context | Note | Line |
|---|---|---|
| / Of Mans First Disobedience | not a complement within an NP | 1 |
| the Fruit / Of that Forbidden | | 2 |
| on the secret top / Of Oreb | | 7 |
| with all his Host / Of Rebel Angels | | 38 |
| now / Of force believe Almighty | not a complement within an NP | 145 |
| from the Precipice / Of Heav’n | | 174 |
| the force / Of subterranean wind | | 231 |
| the shatter’d side / Of thundring Aetna | | 233 |
| the sole / Of unblest feet | | 238 |
| their liveliest pledge / Of hope | | 275 |
| on the perilous edge / Of battel | | 277 |
| the Mast / Of some great Ammiral | | 294 |
| on the Beach / Of that Inflamed Sea | | 300 |
| all the hollow Deep / Of Hell | | 315 |
| the potent Rod / Of Amrams Son | | 339 |
| a pitchy cloud / Of Locusts | | 341 |
| th’ uplifted Spear / Of their great Sultan | | 348 |
| the greatest part / Of Mankind | | 368 |
| with blood / Of human sacrifice | | 393 |
| to the stream / Of utmost Arnon | | 399 |

Phrasal contexts for line-initial OF in PL
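Candidate contexts of this kind can be gathered automatically, although deciding whether the OF-phrase is the complement of a noun remains a matter of reading each case. The sketch below pairs the end of the preceding line with the start of each line that begins with OF; the function name and the three-word windows are hypothetical choices made for illustration.

```python
def initial_of_contexts(lines, tail_words=3, head_words=3):
    """List (line number, context) for every line that begins with OF."""
    contexts = []
    for number, line in enumerate(lines, start=1):
        words = line.split()
        if not words or words[0].lower() != "of":
            continue
        # Last few words of the previous line, if there is one.
        tail = lines[number - 2].split()[-tail_words:] if number > 1 else []
        contexts.append((number, " ".join(tail + ["/"] + words[:head_words])))
    return contexts

# The opening lines of Paradise Lost, as quoted earlier in the paper.
opening = [
    "Of Mans First Disobedience, and the Fruit",
    "Of that Forbidden Tree, whose mortal taste",
    "Brought Death into the World, and all our woe,",
    "With loss of Eden, till one greater Man",
    "Restore us, and regain the blissful Seat,",
]
contexts = initial_of_contexts(opening)
for number, context in contexts:
    print(number, context)
```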

Corns (p. 37) discusses the organization of sentences relative to lineation in PL and other poems by Milton and other writers: “Milton’s practice in organizing the arrangement of sentences within the ten-syllabled line may be distinguished generally from the norms contemporaneously obtaining and his practice in Paradise Lost is singularly unusual.” 20% of the sentences in PL overlap both beginning and end of the line, compared with 11.3% in Paradise Regained, and none at all in the samples Corns takes from three other poems from the 1640s-60s, Cowley’s Civil War, Dryden’s Annus Mirabilis and Fanshawe’s translation of the Lusiads. This difference relates to Milton’s generally freer notion of the relationship between lineation and all syntactic structures down to the intraclausal level, and this in turn may have as one of its consequences the relatively larger number of lines beginning with OF in PL. However, even if, as seems likely, the distribution of line-initial OF is related to the author’s willingness to split a noun phrase across a line, such a cause may be unconnected to the effect. That is, the enjambement practice, or whatever else causes OF to appear at the beginning of the line with such frequency, may have no bearing on the use of OF as evidence of a line boundary.

Poetry which does not show the PL pattern of line-initial OF

In this section I examine three other long blank verse poems, and one poem in rhyming couplets, and show that they do not have the distribution of OF found in PL.

I begin with Milton’s other long poem in blank verse, Paradise Regained (1671), which has 2,070 lines in total.

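| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 695 | 146 | 21% | 549 | 79% | 0.000 | 0.996 |
| TO | 478 | 113 | 24% | 365 | 76% | 2.008 | 0.156 |
| THE | 554 | 92 | 17% | 462 | 83% | 6.446 | 0.011 |
| OF | 435 | 88 | 20% | 347 | 80% | 0.156 | 0.693 |
| IN | 243 | 50 | 21% | 193 | 79% | 0.026 | 0.871 |
| WITH | 179 | 35 | 20% | 144 | 80% | 0.226 | 0.635 |
| OR | 167 | 36 | 22% | 131 | 78% | 0.031 | 0.860 |
| BUT | 120 | 48 | 40% | 72 | 60% | 26.112 | 0.000 |
| ON | 94 | 22 | 23% | 72 | 77% | 0.328 | 0.567 |
| FOR | 92 | 36 | 39% | 56 | 61% | 18.229 | 0.000 |
| BY | 130 | 45 | 35% | 85 | 65% | 14.526 | 0.000 |
| A/AN | 147 | 25 | 17% | 122 | 83% | 1.413 | 0.235 |
| AT | 71 | 16 | 23% | 55 | 77% | 0.101 | 0.751 |
| total | 3405 | 752 | 22% | 2653 | 78% | | |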

Milton: Paradise Regained

Here OF is at the beginning of the line for 20% of its instances, which is roughly the expected frequency for a grammatical monosyllable, and hence OF is not able here to function as evidence for the line boundary. Note that FOR (39%), BY (35%) and BUT (40%) show a significant initial distribution, just as they do in PL, though they are not overall frequent. In this poem there are no grammatical words which combine both a high overall frequency and a high line-initial frequency, and hence there is no good reason to think that grammatical words here function as evidence for the line boundary. Note incidentally that the overall distribution of monosyllables in first position is about the same as in PL (at 22%).

Now I consider one of Milton’s major influences, Shakespeare’s blank verse, an example of what Milton in the second edition of PL called our best English tragedies. Shakespeare’s blank verse in general does not make prominent use of line-initial OF. Antony and Cleopatra is the first play to show a rise in the use of line-initial OF, at 11%, and all of the subsequent plays have between 9% and 11%. The table below shows the results for Antony and Cleopatra (the text used includes some prose and some non-iambic pentameter lines, so it is not exactly comparable to the blank verse poems discussed in this paper).

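| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 600 | 166 | 28% | 434 | 72% | 16.074 | 0.000 |
| TO | 554 | 115 | 21% | 439 | 79% | 0.020 | 0.889 |
| THE | 829 | 168 | 20% | 661 | 80% | 0.270 | 0.604 |
| OF | 433 | 46 | 11% | 387 | 89% | 28.102 | 0.000 |
| IN | 258 | 33 | 13% | 225 | 87% | 10.481 | 0.001 |
| WITH | 182 | 35 | 19% | 147 | 81% | 0.343 | 0.558 |
| OR | 61 | 19 | 31% | 42 | 69% | 3.786 | 0.052 |
| BUT | 181 | 50 | 28% | 131 | 72% | 4.788 | 0.029 |
| ON | 95 | 7 | 7% | 88 | 93% | 10.641 | 0.001 |
| FOR | 190 | 34 | 18% | 156 | 82% | 1.104 | 0.293 |
| BY | 102 | 23 | 23% | 79 | 77% | 0.148 | 0.701 |
| A/AN | 341 | 59 | 17% | 282 | 83% | 2.811 | 0.094 |
| AT | 74 | 16 | 22% | 58 | 78% | 0.017 | 0.896 |
| total | 3900 | 771 | 20% | 3129 | 80% | | |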

Shakespeare: Antony and Cleopatra

The word AND combines high frequency with high initial position (a pattern we will see in most other blank verse other than PL). But in stark contrast to PL, the word OF is the second least likely word to appear at the beginning of a line. Milton’s blank verse PL is thus not imitating a Shakespearean model in this regard. Note that the aggregate frequency of these grammatical words at the beginning of the line is 20%, which is comparable to the frequencies found in blank verse in general.

Next I consider a poem not in blank verse (which is always unrhymed), but instead in rhyming iambic pentameter (in the pattern of heroic couplets). This is Dryden’s 1697 translation of the Aeneid in 13,700 lines. I have chosen this for comparison because it is a long poem within the same century as PL, as well as being a translation of Virgil’s Aeneid, an epic poem which in its original form has links to PL.

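| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 5042 | 1724 | 34% | 3318 | 66% | 528.970 | 0.000 |
| TO | 2194 | 375 | 17% | 1819 | 83% | 20.197 | 0.000 |
| THE | 7657 | 1306 | 17% | 6351 | 83% | 71.783 | 0.000 |
| OF | 2007 | 181 | 9% | 1826 | 91% | 173.670 | 0.000 |
| IN | 1586 | 195 | 12% | 1391 | 88% | 72.441 | 0.000 |
| WITH | 1929 | 413 | 21% | 1516 | 79% | 0.196 | 0.658 |
| OR | 327 | 137 | 42% | 190 | 58% | 86.065 | 0.000 |
| BUT | 526 | 336 | 64% | 190 | 36% | 582.930 | 0.000 |
| ON | 752 | 69 | 9% | 683 | 91% | 63.377 | 0.000 |
| FOR | 531 | 136 | 26% | 395 | 74% | 6.808 | 0.009 |
| BY | 581 | 88 | 15% | 493 | 85% | 12.000 | 0.001 |
| A/AN | 1538 | 177 | 12% | 1361 | 88% | 83.519 | 0.000 |
| AT | 351 | 72 | 21% | 279 | 79% | 0.050 | 0.823 |
| total | 25021 | 5209 | 21% | 19812 | 79% | | |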

Dryden: Aeneid

The overall pattern presented here is very different from that seen in Milton’s poems. Though the overall distribution of grammatical monosyllables in first position is 21%, the same as in books 8 and 9 of PL, there is a great deal of variation away from this, with many of the words showing a statistically significant variation above or below. Here, unlike in PL, OF (9%) is at a very low frequency at the beginning of the line. Again unlike PL (but like Shakespeare), AND (34%) is the word which most combines overall frequency with relative frequency in first position. Furthermore, as in other poems, BUT (64%) and OR (42%) are also relatively frequent at the beginning of the line, and here to a higher degree than in PL. It is likely that the major differences from Milton, including the much lower percentage of line-initial OF, come from the different form of the poem, where the rhyming couplets lead to a different kind of syntax with less enjambement. All the remaining poems to be discussed are in blank verse.

The table below shows the results for James Thomson’s 1746 The Seasons, a blank verse poem in 5,541 lines.

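| Word | Total | Position 1 (n) | Position 1 (%) | Other positions (n) | Other positions (%) | χ2 | p |
|---|---|---|---|---|---|---|---|
| AND | 1598 | 516 | 32% | 1082 | 68% | 122.790 | 0.000 |
| TO | 724 | 149 | 21% | 575 | 79% | 0.077 | 0.782 |
| THE | 3479 | 500 | 14% | 2979 | 86% | 92.126 | 0.000 |
| OF | 1113 | 257 | 23% | 856 | 77% | 2.933 | 0.087 |
| IN | 629 | 157 | 25% | 472 | 75% | 5.946 | 0.015 |
| WITH | 444 | 149 | 34% | 295 | 66% | 42.210 | 0.000 |
| OR | 234 | 104 | 44% | 130 | 56% | 77.526 | 0.000 |
| BUT | 131 | 64 | 49% | 67 | 51% | 61.267 | 0.000 |
| ON | 243 | 31 | 13% | 212 | 87% | 9.952 | 0.002 |
| FOR | 119 | 42 | 35% | 77 | 65% | 14.656 | 0.000 |
| BY | 236 | 54 | 23% | 182 | 77% | 0.504 | 0.478 |
| A/AN | 441 | 110 | 25% | 331 | 75% | 4.134 | 0.042 |
| AT | 138 | 30 | 22% | 108 | 78% | 0.045 | 0.831 |
| total | 9529 | 2163 | 23% | 7366 | 77% | | |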

Thomson: The Seasons

The distribution in initial position of OF (23%) is not significantly different from the expected distribution. However, again we see that AND is both highly frequent overall and also has 32% of its instances at the beginning of the line; 9% of the lines in the poem begin with AND. This makes AND a possible cue to the line boundary. We saw the same in Dryden’s poem, and will see a similar pattern in the next poem as well. We can also see that BUT, OR and FOR, which are relatively infrequent overall, are all common at the beginning of the line; we saw this also in Milton, and it does not seem to vary much between poets.

The following table shows the results for William Cowper’s 1785 The Task, a blank verse poem in 5,184 lines (text from Cowper ).

Word     Total   Position 1      Other positions   χ2        p
                 n       %       n       %
AND      1948    604     31%     1344    69%       117.560   0.000
TO       771     200     26%     571     74%       11.343    0.001
THE      2225    434     20%     1791    80%       2.995     0.084
OF       1121    196     17%     925     83%       8.351     0.004
IN       661     96      15%     565     85%       16.713    0.000
WITH     502     121     24%     381     76%       2.915     0.088
OR       238     68      29%     170     71%       8.224     0.004
BUT      278     125     45%     153     55%       96.232    0.000
ON       136     17      13%     119     88%       5.923     0.015
FOR      227     48      21%     179     79%       0.003     0.957
BY       209     68      33%     141     67%       16.765    0.000
A/AN     796     108     14%     688     86%       26.503    0.000
AT       225     22      10%     203     90%       17.080    0.000
total    9337    2107    23%     7230    77%

Cowper: The Task

Here OF, the third most frequent word overall, has 17% of its instances in line-initial position, a significantly low frequency. This is a pattern more in line with Dryden or Thomson, and not at all like PL. As in Thomson and Dryden, the word AND combines high frequency with a 31% probability of being at the beginning of the line, making it a potential alternative marker of the line boundary. BUT, BY and OR are also frequent at the beginnings of lines (but, as elsewhere, not very frequent overall).

The conclusion to the discussion of Paradise Regained, the three non-Milton iambic pentameter poems, and the Shakespeare play, is that OF is not used with a significantly high frequency at the beginning of the line: in this they differ from PL. However, in the three poems published after Milton, AND is used with unexpected frequency at the beginning of the line, and since it is also a very common word it might function as a line-initial marker. It is possible that the differences in the uses of OF and AND arise from differences in how enjambement works in PL and in the other poems discussed here. OF is often found mid-way through a noun phrase, and if line-initial may involve the more extreme forms of enjambement characteristic of PL, while AND can be clause-initial or phrase-initial rather than splitting a phrase.

Wordsworth

In William Wordsworth’s short 1798 poem in blank verse, usually titled ‘Tintern Abbey’, 38% of the instances of OF are at the beginning of the line (edition from Wordsworth ).

Word     Total   Position 1      Other positions   χ2        p
                 n       %       n       %
AND      60      15      25%     45      75%       0.579     0.447
TO       18      5       28%     13      72%       0.498     0.480
THE      71      12      17%     59      83%       0.719     0.397
OF       61      23      38%     38      62%       10.261    0.001
IN       23      7       30%     16      70%       1.234     0.267
WITH     18      7       39%     11      61%       3.472     0.062
OR       6       1       17%     5       83%       0.068     0.794
BUT      4       1       25%     3       75%       0.039     0.844
ON       6       1       17%     5       83%       0.068     0.794
FOR      11      3       27%     8       73%       0.261     0.610
BY       5       1       20%     4       80%       0.003     0.956
A/AN     27      5       19%     22      81%       0.100     0.752
AT       1       0       0%      1       100%      0.266     0.606
total    311     81      26%     230     74%

Wordsworth: Tintern Abbey

More than half of the 159 lines of the poem begin with one of these grammatical words, while OF is used at the beginning of 23 lines. 14% of all the lines begin with OF, and in a possible echo of PL, it initiates the second line of the poem:

Five years have past; five summers, with the length

Of five long winters! and again I hear

Perhaps in this poem overall, Wordsworth is imitating Milton’s line-initial use of OF. It is also possible that Wordsworth is imitating some aspect of Milton’s enjambement of phrases such that OF appears frequently at the beginning of lines as a consequence. McCully ( : 209) describes the extensive influence of Milton on Wordsworth, including on subtle aspects of his use of iambic pentameter.

If we now consider Wordsworth’s long blank verse poem The Prelude in 8,483 lines (1805 version, text from Wordsworth ), we see not only the same distribution of OF as in PL (27% at line beginning) but, significantly, that this is now combined with the same frequent use of AND (28% at line beginning) which was seen in eighteenth-century poets such as Thomson.

Word     Total   Position 1      Other positions   χ2        p
                 n       %       n       %
AND      2333    643     28%     1690    72%       60.537    0.000
TO       1243    275     22%     968     78%       0.946     0.331
THE      2916    386     13%     2530    87%       105.920   0.000
OF       2130    570     27%     1560    73%       42.605    0.000
IN       1152    260     23%     892     77%       1.710     0.191
WITH     753     208     28%     545     72%       19.908    0.000
OR       415     108     26%     307     74%       6.314     0.012
BUT      304     104     34%     200     66%       31.979    0.000
ON       304     60      20%     244     80%       0.292     0.589
FOR      393     108     27%     285     73%       9.950     0.002
BY       451     107     24%     344     76%       2.019     0.155
A/AN     1424    292     21%     1132    79%       0.210     0.647
AT       256     31      12%     225     88%       12.197    0.000
total    14074   3152    22%     10922   78%

Wordsworth: The Prelude, 1805

Wordsworth appears to be combining the practices of Milton in PL with eighteenth-century predecessors such as Thomson. His frequent line-initial uses of both AND and OF are evidence for the line boundary. This is true even if the appearance of these words in line-initial position is a result of Wordsworth mixing the enjambement practices of Milton and Thomson.

Different authors have different versions of iambic pentameter; they all use the same basic template, but control the matching of syllables to positions in different ways. This is true of the difference between Shakespeare and Milton, for example, as Kiparsky and Hayes, among others, have shown. McCully argues that Wordsworth merges elements of Shakespeare’s practice with elements of Milton’s practice in his iambic pentameter. This is an example of Wordsworth’s poetic hybridity, parallel to his hybrid use of line-initial grammatical words.

The determination of the line boundary

Evidence for the line

In this part of the paper I explore the possibility that in Paradise Lost the word OF functions as a clue to the location of the line boundary, when combined with other evidence. The word may function in this way, even if its appearance at the beginning of the line is caused as a side-effect of enjambement.

The audience needs to know where the line boundaries are in order to establish the metrical form of the text. Metre is dependent on the division of a text into lines: the line is the linguistic material which is matched to the fixed-length and fixed-shape metrical template (a ten-syllable line fitting a ten-position template, in the case of iambic pentameter). So if the metre is to be established by the audience, they must also establish line boundaries. Once the metre is established, experiential effects can arise which depend on the relation of metre to rhythm; these can include the variations in relative regularity or irregularity (in the relation of rhythm to metre) which, for example, produce the closural effects described for poems in general by Smith, or for PL in particular by Corns. Once the text is divided into lines, the mismatch of syntactic structure and lineation can produce particular effects, as argued for Milton’s enjambements by Hollander. Fabb also argues that metrical form is processed in working memory, which requires the text to be processed one line at a time, and in turn this should influence how the poem is experienced. Division into lines may also aid long-term memory for the text, which is a further motivation for identifying the line boundary.

There are various kinds of evidence for the line boundary in a poem (as discussed in detail by Fabb ( : chapter 5)). An audience relates to a poem by reading it, hearing it, or remembering it. The audience can find evidence for the line boundary from either textual or non-textual sources. There are several non-textual kinds of evidence. If read, the visual presentation of the poem shows the line boundaries. If heard, the poem can be performed in a way which indicates the line boundary, for example by pausing distinctively at the line boundary. If recalled from memory, both the visual and aural experience of the poem might be recalled as part of the evidence.

The kinds of textual evidence for the line boundary are either linguistic or poetic. Poetic evidence can come from line-final rhyme, in rhyming poems (but not in the unrhymed blank verse poems discussed here). Poetic evidence can also come from the metre, if the metre can be established by the audience. In iambic pentameter texts, line-final syllables are ten syllables apart and must be word-final, so if the poem is parsed correctly, each ten-syllable sequence forming a line must end at the end of a word. The first position of the line has a distinctive metrical/rhythmic potential: although it is a weak position, it is more likely than any other weak position to be occupied by a stressed syllable (‘trochaic inversion’), so that rhythmic aperiodicities are most likely to arise at the beginning of the line. The last syllable in the line is almost always stressed and, as in many metrical traditions, the line tends to be more rhythmically regular in its later part; Corns ( : 92), for example, observes of a particular coherent passage in PL that the concluding lines have fully periodic rhythm in the last four positions, but a more variable rhythm in the first six (passages I: 730-7 and IV: 935-45). In this paper I also argue that there is linguistic evidence for the line boundary, in the deployment of particular words such as OF with greater than expected frequency at the beginning of the line. These kinds of poetic and linguistic evidence may play a relatively small role if the poem is read on the page, a larger role if heard (particularly in a performance style where there are no line-final pauses), and an even larger role when the poem is held in memory.
The remembering of the poem has a particular relevance for Milton, who was losing or had lost his sight while he was composing PL, and would have needed to develop ways of remembering and keeping track of the lineation of stretches of the text when he was composing it, so that he could later dictate it. The use of OF as partial evidence for the line boundary may have served a function for him during this early period of blindness, different from that of his previous poems, and also different from the later Paradise Regained, which was composed more quickly after he had been blind for a while.

OF, as evidence for the line boundary in PL, combines a 27% probability of being at the beginning of a line with a high frequency of overall occurrence (there are 2,051 instances of OF in the whole poem). It is possible to find a grammatical word with a higher probability of being line-initial, but at much lower frequency. For example, BUT is relatively more frequent at the beginning of the line in PL (29%), but relatively infrequent overall in comparison to OF, with 588 instances of BUT in the poem. If BUT is also sentence-initial, there is a 68% probability of its being line-initial, but it is even less frequent in this instantiation: 73 instances in the poem. Here we balance two issues. On the one hand, the word OF and the word BUT are both probabilistic clues to the line beginning, irrespective of how many lines actually begin with these words. On the other, the clues will be more useful if more lines begin with these words, hence the overall frequency of the word in the text plays a role; of the 10,565 lines in PL, 557 (5%) begin with OF and 171 (2%) begin with BUT. The addition of context to a word, such as its being sentence-initial, also requires greater processing effort. In the case of OF, we can increase the reliability of the word as a line-marker by excluding all cases of OF which are preceded by an unstressed syllable, on the fairly reliable assumption that the preceding line-final syllable will always be stressed; but this contextual information requires more processing effort and hence is less useful.
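The two issues balanced here can be made concrete with a small calculation. The following Python sketch uses only the PL counts given in the paragraph above; the labels `reliability` and `coverage` are illustrative names introduced for this sketch, not terms used elsewhere in the paper.

```python
PL_LINES = 10565  # number of lines in PL, as stated in the text

# word: (total instances, line-initial instances), figures taken from the text
counts = {"OF": (2051, 557), "BUT": (588, 171)}


def line_marker_measures(total, initial, n_lines):
    """Return the two measures discussed above for one word."""
    reliability = initial / total   # share of the word's instances that are line-initial
    coverage = initial / n_lines    # share of the poem's lines that begin with the word
    return reliability, coverage


for word, (total, initial) in counts.items():
    reliability, coverage = line_marker_measures(total, initial, PL_LINES)
    print(f"{word}: {reliability:.0%} of instances are line-initial; "
          f"{coverage:.0%} of lines begin with it")
```

On these figures BUT is slightly more likely than OF to be line-initial, but OF begins more than three times as many lines, which is the sense in which its overall frequency makes it the more useful clue.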

Previous authors have shown that a grammatical word can be used to mark a boundary in a literary text (see summary in Fabb ( : 193-212)). For example, Ghezzi argues that in Ojibwe narratives, the overall narrative is divided into sections (in twos and fours, some as small as a clause), which can be referred to as ‘lines’ (following the approach to North American indigenous oral literatures proposed by Dell Hymes). Here, some but not all of the ‘lines’ begin with the word ninguting (‘now presently’), a word with little contribution to the meaning of the text and which functions primarily as a boundary marker. Rubin, Wallace and Houston ( : 447) discuss the frequent use of AND at the beginning of the line in ballads (almost a fifth of the ballad lines begin with AND in their small corpus); they suggest that AND functions usefully as an unstressed word which fits into a metrically weak initial position, and can be used as a relatively meaningless filler which allows the next word to be a stressed and important content word. Their experimental subjects who had to remember and recall the corpus ballads and compose new ballads also tended to use AND at the beginning of lines: “after learning only five ballads, the subjects did begin their ballad lines with the word AND”. This suggests that the experimental subjects learned that AND had a relatively high probability of being line-initial.

Several kinds of evidence can be combined to give stronger overall evidence. For example, on its own the distribution of OF cannot tell us where the line boundaries fall in PL because only 27% of instances of OF are line-initial, and OF comes at the beginning of only 5% of the lines in the poem. However, in the context of other evidence, whether textual or non-textual, OF may offer some additional evidence to strengthen the judgment of where the line boundary falls. The ‘patterns first’ fact that OF is used at the beginning of the first two lines of the poem may be interpreted also as an indication that OF should be taken as a (weak) clue in this way.

Probabilistic judgments of literary form

Fabb argues that for many aspects of literary form, the literary text has that form by virtue of the form being attributed to it by its reader or hearer, with a certain degree of probability. This includes the attribution of the form ‘line’ to a sequence of words. The balance of evidence, each with its own probability, can lead to the decision that a sequence of words is a line.

A different but compatible approach to probabilistic literary form is offered by current theories of generative metrics such as Hayes et al. or Kiparsky. In the Hayes et al. theory, iambic pentameter involves a template to which a line is matched. Successful matching depends on the prosodic structure of the line, including the pattern of syllables and stress, and the division into words and prosodic phrases. Shakespeare and Milton each use a slightly different metrical grammar for iambic pentameter; the different grammars constrain the matching of line to template in slightly different ways, even though both poets are using the same general metre, iambic pentameter. This means that the distributions of prosodic variants of the iambic pentameter line are different for the different poets. The prosodic variants can be described in statistical terms for the different authors’ corpora. Each specific line of the poem belongs to a set of lines which all have the same prosodic structure, and the probability of that prosodic structure fitting the metrical template is assessed by the specific metrical grammar used by the poet. The probability of the match correlates with the frequency with which lines of each prosodic type will appear in the text. Highly probable matches will describe the prosodic structures of highly frequent lines. In this way, the statistical characteristics of the text correspond to the metrical grammars which guide the psychological processes by which the poems are produced. Hayes et al. focus on the authorial production of these regularities, but it is possible that an audience, too, can learn to distinguish the different forms of iambic pentameter used by Shakespeare or by Milton, and so may internalize the statistical properties of the data as mental metrical grammars. Thus, McCully argues that Wordsworth internalized a metrical grammar by learning from the distribution of metrical variants in Shakespeare’s and Milton’s poems.

There is good reason to think that our interactions with the world in general can be understood in terms of probabilistic psychological processes and states (in ways inaccessible to introspection). This is the basis, for example, of Clark’s account of the predictive brain. It is also seen in statistical learning, where subjects are able to learn statistical characteristics of the data presented to them. Thus there is precedent for the claim that, in the right circumstances, an audience can learn the statistical characteristics of a text. For our purposes, the two key distributions are that we generally expect the grammatical monosyllables to appear 21% of the time at the beginning of the line, and that we learn that in PL, OF appears 27% of the time at the beginning of the line, hence more than expected, given that it is a grammatical monosyllable. The right circumstances might involve some strong clues (e.g., the use of OF at the beginning of the first two lines of PL), and will depend on what the audience is exposed to, and able to learn. It may be that the first time an audience hears PL, it is hard to learn that OF is a statistically frequent line-initial word, but poems like PL can be encountered several times, and on re-reading or re-listening, an increased familiarity with the poem will enable the audience better to learn its statistical characteristics.

These discussions all raise the question of how the audience for poetry acquires these sorts of probabilistic understandings of textual form. This might arise in part from inherent biases or from explicit learning (e.g., learning about the iambic pentameter), or, most interestingly, from exposure to texts. Exposure to texts with particular statistical characteristics may result in the audience learning how to predict the properties of the text, including for example learning that OF has a 27% likelihood of being at the beginning of the line in PL, or (following Hayes et al.), that there is a higher probability of some line types in PL than others. If an author engages consistently in a particular textual practice, then they have internalized a probabilistic system, which can also in part come from learning (e.g., learning from other poets, learning from their own practice). These considerations in turn raise two further interesting questions. Consider first the possibility that these predictions are acquired by statistical learning. Siegelman and Frost show that an individual can vary in their ability to learn statistically, depending on what is learned; and furthermore that there is inter-individual variation in the ability to learn statistically (with some experimental subjects entirely unable to learn statistically, for specific kinds of learning). Individual authors and individual audience members might thus vary in their ability to learn the properties of a metre, or the distributional characteristics of words. If the individual is a poor statistical learner, then they may produce inconsistent texts (and hence have an attenuated stylistic fingerprint), or they may be unable to learn textual regularities. The second question relates to surprise (or more technically, surprisal): one of the basic claims of many literary critical approaches to metre is that the changing relation between the metrical template and the rhythmic line of poetry can be experienced. 
Thus the listener can experience ‘tension’ if the prosody of the line is not exactly matched to the template. And, following Smith, the listener can experience ‘closure’ if relatively irregular rhythms give way to regular rhythms at the end of a poem (or irregular gives way to irregular, as Fabb argues). If, however, the listener has internalized the range of possible variations, with probabilities attached to them, this might mean that no variation is surprising or has any experiential effect, because all the variations are already predicted. This raises some difficult but also very large questions about the relation between variations in literary language, the knowledge of variation, and the ability to be surprised by variation.

Conclusion

Much digital humanities work, for example in stylometrics, focuses on the ‘small’ words, such as the grammatical monosyllables discussed here. This paper has explored the distribution of these words relative to position in the metrical line. The distribution of these words may tell us about aspects of the syntax of the poetry, as regards syntactic structure across line boundaries. The distributions also characterise different works, and different traditions of composition, and Wordsworth combines two different traditions in his poetry. I have also shown how the findings of textual analysis can be understood in terms of probabilistic judgments of literary form, here using the statistical behaviour of a grammatical word as evidence for the line boundary. This is part of a broader project of exploring the statistical properties of texts as they relate to human psychology and the possibility of statistical learning.

Acknowledgments

Thanks to Achim Barsch, Stefan Blohm, Mark Bruhn, Thomas Corns, Elizabeth Finnigan, Bruce Hayes, Arthur Jacobs, Elspeth Jajdelska, Barbara MacMahon, Chamil Rathnayake, Stefano Versace. Thanks in particular to Rocco Coronato and Sara Gesuato for organizing the conference from which this paper emerged, and for comments on the paper. I also thank two anonymous referees. This article draws on my Major Research Fellowship titled Epiphanies in literature: a psychological and literary linguistic account, which was funded by the Leverhulme Trust.

References

  1. Bridges, Robert. 1921. Milton’s Prosody with a Chapter on Accentual Verse and Notes. Oxford: The Clarendon Press.

  2. Clark, Andy. 2013. Whatever Next? Predictive Brains, Situated Agents, and the Future of Cognitive Science. The Behavioral and Brain Sciences 36, no.3: 181-204.

  3. Corns, Thomas N. 1990. Milton’s Language. Oxford: Blackwell.

  4. Cowper, William. 1899. The Task and Other Poems. Edited by Henry Morley. London: Cassell. Digitized by Project Gutenberg.

  5. Dryden, John. 1697. The Works of Virgil. London. Digitized by Project Gutenberg.

  6. Fabb, Nigel. 1997. Linguistics and Literature: Language in the Verbal Arts of the World. Oxford: Blackwell.

  7. Fabb, Nigel. 2002. Language and Literary Structure: The Linguistic Analysis of Form in Verse and Narrative. Cambridge: Cambridge University Press.

  8. Fabb, Nigel. 2015. What is Poetry? Language and Memory in the Poems of the World. Cambridge: Cambridge University Press.

  9. Fabb, Nigel. 2016. Processing Effort and Poetic Closure. International Journal of Literary Linguistics 5, no. 4: 1-22.

  10. Fabb, Nigel and Morris Halle. 2008. Meter in Poetry: A New Theory. Cambridge: Cambridge University Press.

  11. Ghezzi, R. W. 1993. Tradition and Innovation in Ojibwe Storytelling. Mrs Marie Syrette’s ‘The Orphans and Mashos’. In Arnold Krupat (ed.), New Voices in Native American Literary Criticism. 37–76. Washington: Smithsonian Institution Press.

  12. Hanson, Kristin and Paul Kiparsky. 1996. A Parametric Theory of Poetic Meter. Language 72: 287–335.

  13. Hayes, Bruce. 1983. A Grid-based Theory of English Meter. Linguistic Inquiry 14, no.3: 357–94.

  14. Hayes, Bruce, Colin Wilson and Anne Shisko. 2012. Maxent Grammars for the Metrics of Shakespeare and Milton. Language 88, no.4: 691-731.

  15. Hollander, John. 1975. Vision and Resonance. Two Senses of Poetic Form. New York: Oxford University Press.

  16. Ingram, William and Kathleen Swaim. 1972. A Concordance to Milton’s Poetry. Oxford: Clarendon Press.

  17. Kiparsky, Paul. 1977. The Rhythmic Structure of English Verse. Linguistic Inquiry 8: 189–248.

  18. Lally, Steven (ed) 1987. The Aeneid of Thomas Phaer and Thomas Twyne. A Critical Edition Introducing Renaissance Metrical Typography. New York: Garland.

  19. Lancashire, I. 2014. Paradise Lost and Milton’s Associative Memory. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8002: 88-102.

  20. McCully, C. B. 2000. Writing under the Influence: Milton and Wordsworth, Mind and Metre. Language and Literature 9, no.3: 195-214.

  21. Milton, John 1900. The Poetical Works of John Milton. Edited by H C Beeching. Oxford: Clarendon Press. Digitized by Project Gutenberg.

  22. Phaer, Thomas 1558 The Seven First Bookes of the Eneidos of Virgil Converted into English Meter. London.

  23. Philips, John 1701. An Imitation of Milton [The Splendid Shilling] London: Daniel Brown; and Benjamin Tooke. Digitized at http://spenserians.cath.vt.edu/TextRecord.php?textsid=37978

  24. Richardson, Janette. 1962. Virgil and Milton Once Again. Comparative Literature 14, no.4: 321-331.

  25. Rubin, David C., Wanda T. Wallace and Barbara C. Houston. 1993. The Beginnings of Expertise for Ballads. Cognitive Science, 17, no. 3: 435-462.

  26. Siegelman, Noam and Ram Frost 2015. Statistical Learning as an Individual Ability: Theoretical Perspectives and Empirical Evidence. Journal of Memory and Language 81: 105-120.

  27. Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

  28. Smith, Barbara Herrnstein. 1968. Poetic Closure. A Study of How Poems End. Chicago: University of Chicago Press.

  29. Thomson, James. 1908. Poetical Works. Edited with Notes and a Preface by J Logie Robertson. Oxford. Digitized by Project Gutenberg Australia.

  30. Tillmann, Barbara and W. Jay Dowling. 2007. Memory Decreases for Prose, but not for Poetry. Memory & Cognition 35: 628–639.

  31. Wordsworth, William. 1896. Poetical Works. Edited by William Knight. London: Macmillan. Digitized by Project Gutenberg.

  32. Zwicky, Arnold M. and Ann D. Zwicky. 1987. Patterns First, Exceptions Later. In Robert Channon and Linda Shockey (eds.), In Honor of Ilse Lehiste. 525-538. Berlin, Boston: De Gruyter Mouton.

Last URLs access: 17/06/2019

These books correspond to 9 and 10 of the second 1674 edition. In Hayes’s preparation, the text of the two books is laid out on an Excel spreadsheet, where each column contains all the syllables in a particular metrical position. I further distinguished actual monosyllables from syllables which are part of a larger word, to separate e.g., the grammatical monosyllable ‘for’ from the syllable ‘for’ as part of the word ‘forget’. Then I used the COUNTIF function to count instances of a particular word in the column.
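The per-position count described in this note can be sketched in Python. The data layout below, one row per line of verse and one (syllable, is_whole_word) cell per metrical position, is a hypothetical stand-in for the spreadsheet, and the two sample rows are the opening lines of ‘Tintern Abbey’ quoted in the paper, syllabified by hand for illustration rather than taken from Hayes’s annotated text.

```python
from collections import Counter

# Each row is one line of verse; each cell is (syllable, is_whole_word).
# The flag separates a monosyllabic word from a syllable inside a longer word,
# e.g. "a" inside "again". This layout is an assumption made for the sketch.
rows = [
    [("five", True), ("years", True), ("have", True), ("past", True), ("five", True),
     ("sum", False), ("mers", False), ("with", True), ("the", True), ("length", True)],
    [("of", True), ("five", True), ("long", True), ("win", False), ("ters", False),
     ("and", True), ("a", False), ("gain", False), ("i", True), ("hear", True)],
]


def count_by_position(rows, word):
    """Count monosyllabic instances of `word` in each metrical position (1-based)."""
    counts = Counter()
    for row in rows:
        for position, (syllable, is_whole_word) in enumerate(row, start=1):
            if is_whole_word and syllable == word:
                counts[position] += 1
    return counts


print(count_by_position(rows, "of"))
print(count_by_position(rows, "five"))
```

Each key in the returned counter plays the role of one spreadsheet column, so a single call corresponds to running the COUNTIF step once per metrical position.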

This again uses Bruce Hayes’s annotated text of books 8 and 9, marked for five levels of juncture. Here I focus on OF after a level 4 or 5 juncture, typically a comma or period in the printed text.

For the determination of statistical significance I used R, using a version of this short program written by Chamil Rathnayake:

exp <- c(0.21, 0.79) # expected distribution: 21% of a word's instances in initial position and 79% in other positions

obs.of <- c(557, 1494) # observed distribution of a specific word, here OF in all of PL (line-initial, other positions)

lapply(mget(ls(pattern = "^obs.*")), chisq.test, p = exp) # runs a chi-squared goodness-of-fit test on every variable whose name begins with "obs"
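For readers without R, the same goodness-of-fit calculation can be sketched in Python using only the standard library. This is an illustrative re-implementation, not the program used for the paper: the function name `chi_square_gof` is hypothetical, and it assumes the two-category case (line-initial versus other positions), for which the p-value with one degree of freedom can be computed with the complementary error function.

```python
import math


def chi_square_gof(initial, other, expected_initial=0.21):
    """Chi-squared goodness-of-fit test for two categories (1 degree of freedom).

    `initial` and `other` are the observed counts of a word in line-initial
    and in all other positions; `expected_initial` is the expected proportion
    of line-initial instances (21% in the paper).
    """
    total = initial + other
    observed = (initial, other)
    expected = (total * expected_initial, total * (1 - expected_initial))
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


chi2, p = chi_square_gof(557, 1494)  # OF in all of PL
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

Applied to the counts in the tables, for example OF in ‘Tintern Abbey’ (23 line-initial, 38 elsewhere), the function gives values that can be compared directly with the χ2 and p columns.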

In the analysis of the whole of PL and other whole texts, I counted the total number of instances of a word, and the number of instances of that word at the beginning of a line. I found digital versions of the texts and stripped off everything except the lines of poetry. Line boundaries were marked by replacing paragraph marks (showing ends of lines) with an arbitrary symbol, here the percentage sign %. I used AntConc to search either for e.g., ‘of’ to count all instances of OF in the text or ‘% of’ to count the line-initial instances of OF.
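The two counts described in this note can also be obtained with a short script. The Python sketch below is an illustrative alternative to the AntConc searches, not the procedure actually used: `count_word` is a hypothetical helper, and the simple tokenizer assumes plain ASCII letters and straight apostrophes. The sample is the pair of lines from ‘Tintern Abbey’ quoted in the paper.

```python
import re

TOKEN = re.compile(r"[A-Za-z']+")  # assumption: words are runs of letters and apostrophes


def count_word(lines, word):
    """Return (total instances, line-initial instances) of `word` in a list of verse lines."""
    target = word.lower()
    total = initial = 0
    for line in lines:
        tokens = [t.lower() for t in TOKEN.findall(line)]
        total += tokens.count(target)
        if tokens and tokens[0] == target:
            initial += 1
    return total, initial


sample = [
    "Five years have past; five summers, with the length",
    "Of five long winters! and again I hear",
]
for w in ("of", "and", "with"):
    total, initial = count_word(sample, w)
    print(w.upper(), total, initial)
```

Because the input is already a list of lines, no boundary symbol such as % is needed; the first token of each line is taken as the line-initial word.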

Note that this shifts from the 1667 edition used (based on Hayes) for the two books to the 1674 edition (which has a different division of books), the latter being the digital edition provided by Project Gutenberg.

Apart from this striking beginning, OF does not have any special status at the beginning of the line in their translation. In the first hundred lines, there are four lines which begin with OF, out of a total of 25 uses of OF overall; though this is a small sample, it shows a distribution of 16% line-initial OF.

To produce a list of all the lines containing OF, in sequence, I used Alpha (https://sourceforge.net/projects/alphacocoa/) and the ‘Find matching lines’ command.

For this finding, I used LIWC, dividing the complete dramatic texts (including some prose and non-blank verse) into sections, and ran them through a custom dictionary consisting of the grammatical words. The outputs of LIWC are in percentages of words per section, not numbers of words per section as in AntConc, but proportions can still be established reliably. The results for OF are as follows, with plays in date order and the percentage of line-initial OF in parentheses: The Taming of the Shrew (4), The Second Part of King Henry VI (2), The Third Part of King Henry VI (3), The Two Gentlemen of Verona (5), Titus Andronicus (5), The First Part of King Henry VI (4), The Tragedy of King Richard III (4), The Comedy of Errors (6), Love’s Labour’s Lost (6), A Midsummer-Night’s Dream (6), Romeo and Juliet (7), The Tragedy of King Richard II (6), The Life and Death of King John (6), The Merchant of Venice (7), The First Part of King Henry IV (7), The Second Part of King Henry IV (3), Much Ado About Nothing (1), The Life of King Henry V (4), As You Like It (5), Julius Caesar (5), Hamlet, Prince of Denmark (6), The Merry Wives of Windsor (2), Twelfth Night; Or What You Will (3), Troilus And Cressida (5), Othello, The Moor of Venice (6), Measure For Measure (5), All’s Well That Ends Well (6), Timon of Athens (8), King Lear (4), Macbeth (7), Antony and Cleopatra (11), Coriolanus (9), Pericles, Prince of Tyre (9), Cymbeline (10), The Winter’s Tale (11), The Tempest (10), The Famous History of the Life of King Henry VIII (11).

Thanks to Thomas Corns, personal communication, for suggestions relating to Milton’s blindness; see also Lancashire ( : 89).