SONNETCAST

Special Guest: Professor Gabriel Egan – Computational Approaches to the Study of Shakespeare

LISTEN TO THE SONNETCAST SPECIAL
WITH PROFESSOR GABRIEL EGAN
In this special episode, Gabriel Egan, Professor of Shakespeare Studies and Director of the Centre for Textual Studies at De Montfort University in Leicester, UK, talks to Sebastian Michael about computational approaches to the study of Renaissance literature in general and to Shakespeare's works in particular: what are the methodologies employed and what insights can they yield, especially in the context of the Sonnets.

SEBASTIAN MICHAEL:

Today I am exceptionally pleased to be able to welcome to our podcast Professor Gabriel Egan to talk to us about computational approaches to understanding Shakespeare.

Now, Gabriel Egan is Professor of Shakespeare Studies and Director of the Centre for Textual Studies at the De Montfort University in Leicester, here in the UK, and one of the four general editors, together with Gary Taylor, John Jowett and Terri Bourus of the New Oxford Shakespeare of which the Modern Critical Edition appeared in October 2016, and the Critical Reference Edition and Authorship Companion in early 2017. The remaining two volumes are the Complete Alternative Versions, with general editors Taylor, Bourus, and Egan that will appear this year in 2024.

He co-edits the academic journal Theatre Notebook for the Society for Theatre Research, and currently he is editing Shakespeare's The Two Gentlemen of Verona for the New Variorum Shakespeare series, which will be published by the Modern Language Association in 2025, and Sir Thomas More for the New Oxford Shakespeare Complete Alternative Versions edition, which will be published by Oxford University Press in 2024.

Professor Egan was co-investigator on the AHRC-funded research project Transforming Middlemarch that ran from January 2022 to March 2023 to produce an open access online scholarly digital genetic edition of Andrew Davies’s 1994 BBC television adaptation of George Eliot's novel, and he was principal investigator on the also AHRC-funded research project Shakespeare's Early Editions (SEE) that ran from October 2016 to July 2018, to explore the differences between the Quarto and Folio versions of his plays, to see if they can be quantified and explained in terms of textual corruption and authorial and non-authorial revision. AHRC, I should explain, stands for Arts and Humanities Research Council and is a funding body here in the UK.

His books include The Struggle for Shakespeare's Text: Twentieth Century Editorial Theory and Practice, published by Cambridge University Press; the Edinburgh Critical Guide to Shakespeare; Green Shakespeare; Shakespeare and Marx, which also appeared in Turkish; and an edition of Richard Brome and Thomas Heywood's The Witches of Lancashire.

His latest sole-authored book is a monograph called Shakespeare and Ecocritical Theory for the Arden Shakespeare series, which was published by Bloomsbury.

Gabriel, thank you so much for coming to visit me here at my home in Earl’s Court: I am intensely grateful that you agreed to talk to us about computational approaches.

So perhaps to set the framework: your field is digital humanities. What, very broadly, do we mean by digital humanities?


GABRIEL EGAN:

We mean the approach to the study of the humanities – so that's the works of history, culture, literature, arts: the humanities – as seen through the new lenses that the availability of computers and the availability of digital surrogates allows us to adopt.

So when I say ‘digital surrogates’, I mean we have now the texts for a lot of important works in digital form, we have, obviously, high quality images of artworks, we have a lot of historical materials in digital form, and that allows us to do things as humanists that we couldn't do when our resources were purely analogue books, physical objects.

And that's the very broad group, the digital humanities. My bit of it is the textual stuff. So if you ask me about pictures, I'll be out of my depth. But working on text is what I particularly focus on. And there's been a lot of interesting research in that area, particularly in the last twenty years, because of certain technical advances we might get to.


SEBASTIAN MICHAEL:

So working with text, then, what are the computational methods that are used in the analysis of texts? How do these methods evolve with computing? And also how do they work in principle, for somebody who has no idea?


GABRIEL EGAN:

Well, machines can count things for us. And principally it boils down to counting things. What's changed is what we can count. Work in this area started really with linguists in the 1960s who wanted to answer certain questions about the history of language. And they would have particular documents digitised and then start counting things. And it could be quite mundane, what you start with: how long a sentence does this author like? How long a word does this author favour? What order do certain kinds of words come in?

When we started in the Sixties in this field, the machines we were using were quite limited in their abilities. One of the fun essays I give my undergraduate students studying this subject is an article from the 1960s, in which someone was trying to produce a printed concordance to Shakespeare, using a computer that wasn't even powerful enough to hold one Shakespeare play in its memory at a time. So they were describing how they were doing the work piecemeal in the article.

But when we started in the Sixties, first of all, of course there were not many digital texts, so investigators often had to get their text digitised, which usually meant somebody typing it in at a keyboard.

There's also, I mentioned, the problem of the machines being quite small in memory, and the usual thing for investigators to do was to produce a list of what they called ‘stop words’: words we don't want to bother with, the words that are so frequent in the language, like ‘the’ and ‘and’, that we leave them out because they would just swamp our analysis.

And actually, that was true even before computers. If you think about the old printed concordances, Bartlett's Concordance to Shakespeare, in the nineteenth century, was the first really comprehensive study of the words in Shakespeare. So a concordance is a book like a dictionary, so it's organised alphabetically: it gives you all the words in, say, Shakespeare's works. It gives you them in alphabetical order, and then a list of all the places those particular words occur. So you could look up ‘blood’, for example, in the B section, and see every place where Shakespeare uses the word ‘blood’, and it's a very long entry.

But when making that concordance, Bartlett decided to leave out ‘the’ and ‘and’ and ‘on’ and ‘in’, because there would just be thousands of records, and this idea of leaving out the stop words persisted into the digital age to begin with, because our machines were so weak and small, and so people studied the unusual words and phrases in particular, in various authors' works, in Shakespeare's works.
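[Illustration, not part of the conversation.] The concordance described above, with its stop words left out, can be sketched in a few lines of Python. This is a minimal toy under stated assumptions: the stop list is a handful of words chosen for the example, the sample lines are adapted from Macbeth, and `build_concordance` is a hypothetical name, not anyone's actual tool.

```python
import re
from collections import defaultdict

# Illustrative stop list only; a real one would be far longer.
STOP_WORDS = {"the", "and", "on", "in", "a", "of", "to"}

def build_concordance(lines, stop_words=STOP_WORDS):
    """Map each non-stop word to the line numbers where it occurs."""
    concordance = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            if word not in stop_words:
                concordance[word].append(line_no)
    # Alphabetical order, as in a printed concordance.
    return dict(sorted(concordance.items()))

sample = [
    "Who would have thought the old man",
    "to have had so much blood in him",
    "Blood hath been shed ere now",
]
index = build_concordance(sample)
print(index["blood"])    # line numbers where 'blood' occurs
print("the" in index)    # stop words are not indexed
```

Passing an empty set as `stop_words` would index every word, which is roughly the shift from "unusual words only" to "count everything".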

I would say the big change since the 1960s, the things that happened, particularly from the 1980s onwards, is that we got machines that were powerful enough, had large enough memories, that we could count everything. Let’s not just count the unusual words. Let's count every single word. And these machines were able to work on large bodies of writing because more and more things got digitised.

Starting in the Seventies, a university project called Project Gutenberg started getting people to keyboard any literary or historical document they thought was valuable from some out-of-copyright text. And these were then shared on the early internet.

But then there was a big boost to that in the 1980s and 90s, when large bodies of texts got systematically keyboarded by corporations, actually, who wanted to sell the data sets to people like me. So the change that made is that we could study all of the language.

And that's quite a remarkable step change, actually, because it may not seem obvious, but if you put all the words that you, Sebastian, have used in the past week in order of how often you use them, you might not be surprised to find ‘the’ up at the top. About one in every fourteen words you use, you speak or write, is ‘the’. ‘And’ comes next. And then various forms of the verb ‘to be’ will come next.

If you produce that list in order of how often you've used those words, you'll have some very rare words at the bottom. You may have only said ‘concatenate’ once in the last week, but you would have said lots of other words very frequently.

If you take the top one hundred of those most frequent words, that’s half of everything you've said or written in the last week. So half of our language is repetitions of those one hundred most frequent words. And until lately, until very recently, we ignored them. Until the 1980s and 90s, we ignored them because we had to.
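[Illustration, not part of the conversation.] The ranking described above, and the share of a text taken up by its most frequent words, could be computed along these lines. The function name `top_n_coverage` and the toy sentence are assumptions made for the example; the "half of everything" figure in the conversation refers to the top one hundred words in a large body of language, not to a sentence this small.

```python
import re
from collections import Counter

def top_n_coverage(text, n):
    """Return the n most frequent words and the share of all tokens they cover."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return [], 0.0
    top = Counter(words).most_common(n)
    covered = sum(count for _, count in top)
    return top, covered / len(words)

text = "the cat and the dog and the bird saw the concatenated fence"
top, share = top_n_coverage(text, 2)
print(top)    # the two most frequent words with their counts
print(share)  # fraction of all tokens accounted for by those two words
```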

Now we can analyse them, and they turn out to be extraordinarily rich in what they tell us. I mean, as words on their own, they're pretty boring. There's not much you can say about ‘the’ or ‘a’.

Isn't there a Blackadder sketch where they're trying to recreate Johnson's dictionary, I think, and they come to the conclusion – they’re starting with the letter A, I think – and they come to the conclusion that ‘a’ doesn't really mean anything, which is a joke I think linguists would enjoy.

But these words do have significance, because it turns out that although we all use roughly the same one hundred words as our most frequent words, we all use them to slightly different amounts. That is, we all prefer – we use certain words like ‘between’ and ‘amidst’ or some other word that has a similar meaning. We have favourites and we have words we disfavour, we have preferences amongst those hundred that are idiosyncratic, and in fact, pretty distinctive: they’re your favourites.

And if we count your use of language, if we get enough of what you've said and written and count how often you use each one of these hundred words and then do the same for me, we end up with very different profiles, or at least different enough that if we had a substantial body of your writing and mine and then put them to one side and just labeled them, and then we took some unknown bit of text off the table here, which somebody didn't know whether it was yours or mine, by counting the same hundred words and their frequencies in that text and comparing that profile, that probability distribution, we call it, to the profile for you and the profile for me, we can to about 85% reliability tell who wrote it.
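[Illustration, not part of the conversation.] The comparison of profiles described above can be sketched as follows. Several things here are assumptions made for the sake of a small example: the ten-word list stands in for the hundred most frequent words, the sum of absolute differences is one simple way of comparing two profiles and is not claimed to be the measure used in the research discussed, and the texts are invented. Texts this short could not support a real attribution.

```python
import re
from collections import Counter

# Stand-in for the hundred most frequent words.
FUNCTION_WORDS = ["the", "and", "of", "to", "in", "a", "that", "but", "with", "for"]

def profile(text, vocab=FUNCTION_WORDS):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in vocab]

def distance(p, q):
    """Sum of absolute differences between two profiles (smaller = more alike)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def attribute(unknown, candidates):
    """Name of the candidate whose profile is closest to the unknown text."""
    target = profile(unknown)
    return min(candidates, key=lambda name: distance(target, profile(candidates[name])))

candidates = {
    "A": "the king of the realm and the lord of the north rode to the gate of the city",
    "B": "we came and we saw and we went but we stayed and sang and danced with joy",
}
unknown = "the crown of the queen lay in the hall of the tower"
print(attribute(unknown, candidates))
```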

Now that was first discovered, the importance of these very common words – 'function words', we call them – they're words that in English don’t, they really don't have a meaning, actually, they're not semantically loaded: they're words that we use most often for syntactical and grammatical purposes, they join the other words. They're more like the cement between the bricks. And in other languages they don't even exist. So other languages, rather than having various auxiliary verbs, will actually have various inflections of nouns to do the same work. We have these function words. They're short, they’re the glue that holds the rest of the language together, and the first people to notice that they were important for authorship attributions were a couple of investigators called Mosteller and Wallace, who were looking at some American documents.

During the American War of Independence in the eighteenth century, the publications of various founding fathers appeared in essentially a newspaper called The Federalist, and it was about the politics of exactly what kind of commonwealth the colonies were going to build as they split from Great Britain, and some of these articles were signed, they had names, some didn't. They were written by people like Alexander Hamilton and the other founding fathers.

Some were unsigned and there were various disputes about who had written what, and Wallace and Mosteller counted, manually counted, these function words, and were able to find some that they found – because some of the articles were signed, they could tell, okay, this is Hamilton's preference for the word ‘in’, this is Hamilton's dislike... – his words, his likes, dislikes amongst these function words.

And they did the same for the other candidate authors, and they were able to then attribute various texts. This was done manually in the 1950s, using a small number of words, which turn out to be highly distinctive. But what we learned from that is that if you count all the hundred words and do the full profile analysis, any two writers can be distinguished.

And it's something that's pretty much unfakeable, in the sense that you may well know that you like the word ‘proliferate’, or you like the word ‘baroque’, or you like say ‘it's ghastly’. You might have favourite words or even words you think, I don't use that word, I don't like that word.

But you won't, I think, have a preference for ‘above’, or ‘within’ and certainly not for ‘the’. And you wouldn't know how often you do it, but it turns out you consistently use that word a certain number of times in any large body of writing. And I do. And those numbers aren't the same.

And so, as I say, with 85% reliability we can count those words and do various kinds of authorship attribution. It's something I've been involved in.

So the answer to your question fundamentally is, what's changed, what’s evolved, is better machines, more digital text; and, at a slower rate, but I think it is genuinely progress, we’re getting better at it. We're figuring out not just simple things like counting, but some work I've been involved in is about how these words appear next to other words or near other words.

I think it was the linguist Firth who said, “you should know a word by the company it keeps.” And that's the principle we're talking about: which words actually appear near to one another? But the big thing is we now have essentially the full body of published works in English up until the year 1800 in digital form, and that was done by a concerted effort to actually manually keyboard them. And that has been transformative, because we can now – when we find, for example, I’m editing Shakespeare at the moment, when we find an unusual phrase and think, well, is that Shakespeare inventing something or is it a mistake? Is it an error? Is it the wrong word? We can now compare his practice to the whole body of writing at his time, and really put him into the context of other writers.
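[Illustration, not part of the conversation.] The principle of knowing a word "by the company it keeps" can be sketched as a count of the words that fall within a few positions of a target word. The function name `neighbours`, the window size, and the toy text (stitched together from two sonnet phrases) are all assumptions for the example, not a description of the methods used in the research mentioned.

```python
import re
from collections import Counter

def neighbours(text, target, window=2):
    """Count words occurring within `window` positions of each use of `target`."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        if word == target:
            lo = max(0, i - window)
            hi = min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[words[j]] += 1
    return counts

text = "sweet love renew thy force sweet love remembered such wealth brings"
near = neighbours(text, "love", window=1)
print(near.most_common())
```

Run over a large collection, the same counts for one writer can be set against the counts for that writer's contemporaries.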

So yeah, for all sorts of work with Shakespeare, this transforms what we're able to do.


SEBASTIAN MICHAEL:

And you were on the point of mentioning a sort of a time frame, a rough date when this began to change significantly. When does this happen, that we have the body effectively of English literature digitised, and have computers strong enough to be able to say, well, with today's technology, we can really perform operations that yield practically certain results.


GABRIEL EGAN:

It's a fascinating history, actually. It starts, this story, in the 1930s with a new technology emerging just then, which is microfilming, which became cheap enough that people started to think it was a useful way of preserving books that might disappear, and also disseminating them, because you can copy microfilm quite cheaply.

And the real pressure came with the looming Second World War. People started to worry, certainly in London, that if London was bombed, rare books would be lost. And there was an effort to microfilm a lot of rare books in the British Museum Library, as it was then. And a company called, well, they became University Microfilms International, I forget what they were originally called, but various… – I think Bell and Howell were involved – it became a large project that had a commercial and an academic aspect.

And the idea was, let's try and microfilm all the books up to the year 1700. Actually, they started with a less ambitious idea, the Civil War was their first target. They wanted to do all the books in what's called The Short-Title Catalogue. Pollard and Redgrave. So a lot of work was done at the British Library – British Museum as it was – Folger Shakespeare Library, Huntington Library, all the places that had rare books around the world. And this became a commercial product, and you could buy this entire cabinet of microfilm that was essentially all the books eventually up to about the year 1700, was one of the cut offs that was used at one point.

And a later project took that from the year 1700 to the year 1800, when there was a big explosion in publishing. But this first collection, called Early English Books, was selling on microfilm very well through the 1960s and 70s. In fact, when I started as a graduate student, it was still the only way to get these things. If you were lucky, your university had bought the cabinet of microfilms, about thirteen hundred microfilms, and often… – there are multiple books per film, and for big books there are multiple films per book. It was actually a very awkward, very difficult thing to use. But to actually have there where you were, in front of you, the treasures, the rare books of the British Museum, Huntington, Folger, Bodleian Library and others was transformative, and in the late 1990s, the company that owned these microfilms thought, we could digitise them. We've got, now, computers powerful enough that we could digitise each of these microfilms.

And it turns out microfilm is a very binary medium, by which I mean silver halide crystal either goes transparent or it doesn't. It's on/off. And so they decided to digitise them in one-bit-per-pixel, as it were, form, and to sell the images online. And that became Early English Books Online, EEBO, which is still an essential resource for this work.

So that appeared in, I think, 1999, it came to the UK. And then a wonderful project called the Text Creation Partnership, for which Sean Martin, a librarian in the US is an unsung hero, deserves a lot of credit. Sean Martin organised a consortium to say, well, a lot of these books are out of – well, they’re all – out of copyright. But why don't we get people to keyboard them, to actually type in?

Because you can do what's called optical character recognition with an image of an old book. But with the kind of printing we're talking about, the quality is very low. You get a lot of mistaken letters. Famously, of course, the long S looks like an F, but all sorts of errors occur. We really need someone to sit with the images and type them in. And that Text Creation Partnership then made searchable text for the first 25,000 of these roughly 130,000 books.

And the genius of Sean Martin's plan was that the people involved in doing the keyboarding would first of all – they were usually libraries – they paid for the work to be done, but they would have first use of these digital texts, which were then, after a number of years, passed into the public domain. That was the genius key, because now the first tranche of Early English Books Online Text Creation Partnership (EEBO/TCP) is free to anybody to use.

And it was, of course, immediately seized upon by clever linguists who are computer literate, who then produced wonderful transformations of that data set. Where before we could search for strings of characters… – so, for example, if I wanted to know how does Shakespeare’s use of the word ‘row’ – r o w – compare to his contemporaries, does the meaning of that word change? I could search for ‘r o w’, but that's not really a word. That's three letters in a row. It's the same three letters for the verb ‘to propel a boat’ or the verb ‘to argue’. It's also the same three letters for the noun meaning ‘the opposite of a column’. I can probably think of a few more, but we weren't looking at words, we were looking at strings.
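[Illustration, not part of the conversation.] The difference between searching for a string of characters and searching for a word shows up even in a single invented sentence. The sentence and variable names below are assumptions for the example.

```python
import re

text = "Sorrow sat upon his brow as they row the boat past a row of trees"
lowered = text.lower()

# Character-string search: finds the three letters wherever they occur,
# including inside 'sorrow' and 'brow'.
string_hits = [match.start() for match in re.finditer("row", lowered)]

# Whole-word search: finds 'row' only where it stands alone as a token.
word_hits = re.findall(r"\brow\b", lowered)

print(len(string_hits))  # every occurrence of the letters r-o-w
print(len(word_hits))    # occurrences of 'row' as a separate word
```

Even the whole-word search returns the verb ('they row the boat') and the noun ('a row of trees') together. Telling those apart needs information about each word's grammatical role, which plain text does not carry.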

What people have done with the data set since then is use very clever analysis that actually marks up the text to say, that's ‘row’ being used as a verb. That’s… – next word is an adjective. Next word, this is a noun. This is a participle. This is a pronoun. This is a… – It was doing the morphosyntactic analysis, that's the correct term.

And that really does supercharge what you can do, because then you can ask for the combinations of various types of words without specifying the actual letters. So you can say, who likes to put two adjectives before a noun at the start of a paragraph, or at the start of a speech or whatever? And that's something called EarlyPrint.org in the US. Washington University, Saint Louis.
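[Illustration, not part of the conversation.] Once every word carries a grammatical label, a query can ask for a sequence of word types rather than particular letters. The sketch below assumes a very simple data shape, a list of (word, tag) pairs with made-up tag names such as "ADJ" and "NOUN". It does not reproduce the tagset, the data format, or the search interface of EarlyPrint or any other project; the tagging of the line from Sonnet 73 was done by hand for this example.

```python
def find_pattern(tagged, pattern):
    """Return word sequences whose tags match `pattern` exactly, in order."""
    n = len(pattern)
    hits = []
    for i in range(len(tagged) - n + 1):
        window = tagged[i:i + n]
        if [tag for _, tag in window] == list(pattern):
            hits.append(" ".join(word for word, _ in window))
    return hits

# Hand-tagged toy data; the tag names are hypothetical.
tagged = [
    ("bare", "ADJ"), ("ruined", "ADJ"), ("choirs", "NOUN"),
    ("where", "ADV"), ("late", "ADV"), ("the", "DET"),
    ("sweet", "ADJ"), ("birds", "NOUN"), ("sang", "VERB"),
]

hits = find_pattern(tagged, ("ADJ", "ADJ", "NOUN"))
print(hits)  # two adjectives directly before a noun
```

Restricting the search to the start of a paragraph or speech, as in the question posed above, would additionally need position markers in the data.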

So the fact that this EEBO data set, which was just pictures, then became a typed-up data set, and then that it was given away to the world, which meant people didn't have to pay to make transformations of it, they could produce wonderful, linguistically rich data sets, has really transformed what people like me are able to do, not only as editors, but also as investigators of authorship, of chronology and so on.

So that actually, the availability of texts, I would say, is even more important than the increase in the processing power of computers, because actually, we figured out back in the Sixties – I mentioned the example of the scholar trying to produce a concordance using a machine that couldn't even hold the whole of a Shakespeare play – the point is, we figured out you can do it piecemeal. You can look at a line at a time and actually produce a concordance without holding the whole text in memory.
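[Illustration, not part of the conversation.] Working piecemeal, a line at a time, might look something like the sketch below. It reads one line, emits a (word, line number) record for each word, and moves on, so the whole text is never held in memory at once. The function name and the use of `io.StringIO` as a stand-in for a file are assumptions for the example, and this is not a reconstruction of the 1960s program mentioned. For brevity the final sort happens in memory here; on a machine too small for that, the records would have to be written out and sorted in stages.

```python
import io
import re

def concordance_records(stream):
    """Yield (word, line_number) records, reading one line at a time."""
    for line_no, line in enumerate(stream, start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            yield word, line_no

# A file object would be read the same way; StringIO stands in for one.
source = io.StringIO("To be or not to be\nthat is the question\n")

records = sorted(concordance_records(source))
for word, line_no in records[:3]:
    print(word, line_no)
```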

So we figured out things that meant that processing power itself wasn't a limitation. It might take a week for a job, as we called it, a task to complete, but all computers are essentially equivalent. That's what Alan Turing showed us, that they're all equivalent. They just might take a bit longer. So that wasn't the barrier. The barrier was the availability of the raw text.

And now we're in a delightful situation that we have that data set, which isn't the only game in town I should mention: in the 1980s, a company called Chadwyck-Healey had a parallel project, but they called theirs Literature Online, and their aim was, let's get all of English literature typed up. Let's pay graduates, I think actually in South Asia, mostly, to keyboard all of this set of what we call English literature. And it was… – some scholars were put together to form an advisory board on what should they digitise, what texts should they choose; but they produced this wonderful thing which came out originally on multiple CD-ROMs and then was put online. That's still a commercial product, but a lot of universities have access to it. And again, there's a lot you can do with that data set.

You can't talk about the language in general, because you've chosen to only have things that subsequently got classified as English literature. But it's a big data set. There's a lot of things in there, and you can reasonably talk about habits amongst the English writers who became canonical.


SEBASTIAN MICHAEL:

As a little aside, almost, or a question in-between: What software do you use? Are there consumer products or prosumer products that you could buy and then use, or do you make your own? Do you code your own?


GABRIEL EGAN:

People vary in that. My practice is to write my own, because I really want to know exactly what it does, rather than trust what it says on the tin, to use a modern expression. People familiar with software development will know if I say, we often, when writing software, draw upon someone else's library.

You can, in a bit of code, you can say, give me the library that is pi-stacked, which is some statistical software that works under the language Python. And if you don't go and look what's in that library that you're borrowing – and when I say ‘library’, it's a bunch of code that someone else wrote that they promise will do these various things for you, and it saves you a lot of time that you don't have to reinvent the wheel – but you're relying upon them making certain choices and correctly applying their various algorithms.

And I'm quite wary of that. So where I can, I like to write my own code, but other people do use off-the-shelf packages, but it's not quite what you suggested there, Sebastian, that there's a consumer product for all this. I think there's a very small market for this, but there are online tools.

Voyant Tools is one, actually, people use a lot where you can actually simply copy and paste your favourite Charles Dickens novel, drop it in, I mean, all the text; so you get it from, say, Project Gutenberg, drop it in and it will tell you all sorts of interesting things, like longest sentence, shortest sentence, three word phrase most repeated in this text, four word phrase most repeated; obviously, most frequent word; least frequent word; and all sorts of other statistics that people care about, including things like readability.

You can measure the ‘school level’, as it were, the level of education a person would need to make sense of this thing. So you might say, you know, let's compare whether Jane Austen or Charles Dickens is inherently easier to read for the modern American student, say, using various tests of what words they know and don't know.

Of course, you've got the problem that you’re being a bit anachronistic there, because there'll be words that were not unfamiliar to people in the 19th century but are now considered quite arcane.
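For a sense of how a 'school level' score can be computed, here is a minimal Python sketch using the Flesch-Kincaid grade formula. This is a swapped-in example: the conversation does not say which measure any particular tool applies, and the word-familiarity tests mentioned above work differently. The syllable counter is a crude vowel-group heuristic and both sample texts are invented.

```python
import re

def count_syllables(word):
    """Rough heuristic: one syllable per run of vowels, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level from words per sentence and syllables per word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# Invented sample sentences, for illustration only.
plain = "The cat sat. The dog ran."
ornate = ("Notwithstanding considerable institutional disagreement, "
          "interpretation remains extraordinarily complicated.")

print(flesch_kincaid_grade(plain))
print(flesch_kincaid_grade(ornate))
```

A formula like this knows nothing about which words were familiar in the nineteenth century, which is exactly the anachronism problem raised above.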

So, but yes, there are tools, online tools. They vary, they come and go. So I wouldn't recommend any particular one. But for anybody serious about this, I would recommend learning to program for yourself. It's become remarkably easy. Modern languages like the language I mentioned, Python. That’s something I teach, actually, to humanists who aren't digitally literate, and you can teach in a day or two enough of this language that a person can then take a book and produce for themselves, lists of… – produce a concordance, produce lists of words in frequency of use, or not simply one word and frequency of use for each of those words, but taking pairs of words: how often do this pair appear and triplets and four and five; and we call them n-grams, so a 6-gram, 7-gram, and you can find what eight-word phrase occurs three times in A Tale of Two Cities, for example, that you might have noticed reading it, but you may not have.
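The n-gram counting described here fits in a few lines of Python. The sketch below is a minimal illustration with hypothetical function names; the sample is the opening of A Tale of Two Cities, and a whole novel would be passed in as one string in the same way.

```python
import re
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n consecutive words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    # zip over n staggered copies of the word list yields each n-gram in turn
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(grams)

def repeated_ngrams(text, n, min_count):
    """Return the n-word phrases that occur at least min_count times."""
    return {" ".join(gram): count
            for gram, count in ngram_counts(text, n).items()
            if count >= min_count}

sample = "it was the best of times, it was the worst of times"
print(repeated_ngrams(sample, 3, 2))
```

Asking which eight-word phrase occurs three times in a novel is then `repeated_ngrams(novel_text, 8, 3)`, where `novel_text` is assumed to hold the full text.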

So the answer to the basic question, I say do it yourself. Learn, I encourage people to learn programming. Apart from being very pleasurable, it’s very valuable for people who want to do this kind of work, to actually fully understand what the software is doing for them.


SEBASTIAN MICHAEL:

Moving towards poetry, perhaps, because we are, of course, in this podcast particularly interested in the Sonnets: are there any particular challenges or also opportunities that poetry specifically offers to computational approaches?


GABRIEL EGAN:

There are. I mentioned earlier that from the computer’s point of view, unless told otherwise, a word is just a collection of letters. But of course, we as humans care about the differences between the meaning of ‘r o w’, the verb and ‘r o w’, the noun. So there's an extra meaning in the language at that level that's not simply a string of characters.

When you get to poetry, that form of language, of course, has the additional layer of meaning caused by the rhythm of the language. And this is something that we haven't actually done very much work – I mean I say ‘we’ collectively, the field, hasn't done a lot of work – on that. It's quite hard, or it was until recently quite hard to get a machine to appreciate the rhythm, in the sense of to be able to distinguish between, say, iambic pentameter and some other form; iambic pentameter being, of course, the dominant form in Shakespeare's verse writing.

We now have tools that are quite good at that. It's taken a long time to get there, and one of the obstacles has been that in a sense, we don't have a ground truth, or at least, it's not quite as well agreed upon as it is for, say, the dictionary definition of words.

What I mean is, if you find two scholars of prosody, the study of rhythm and language, they'll disagree about the correct meter that a line is in, and they'll even dispute, for example, the notions of, say, stress, emphasis, and rhythm.

The raw terminology by which we try to get a handle on this phenomenon is not agreed upon. That is, they'll look at the same line and disagree about what they're looking at. And to me, it's actually an area… – the fault there really lies with human beings having not really sorted out what they mean by some of these terms. And my experience is that it also now is virtually untaught: my undergraduates, some of them have some sense of how to analyse the rhythm of language. They've learned a few bits of jargon, but they don't really have a strong interiorised sense of rhythm.

And in the absence of that, which I think is completely excusable, what they need then are hard and fast rules. And the trouble is, I don't know two people in this field who have the same hard and fast rules; or rather, they have hard and fast rules, and they actually – and this is quite annoying to me – they feel very strongly that they're correct, and they're quite poor at explaining why this is the way that line is stressed. And in particular to me, because my background in this, I come to this via the plays and via working in a theatre company – I used to work at the Shakespeare's Globe Theatre in London – and I strongly believe in the right – more than that: the authority – of the performer’s choice. And by and large the experts I know, at least, in the area of Shakespeare's poetry, that is the non-dramatic poetry, don't so much believe in the right. They believe that the actor is either getting it right or getting it wrong, rather than, as I would see it: I would see the meter as being an opportunity, something you might conform to, but that the art of the performer is playing the variations upon that.


SEBASTIAN MICHAEL:

I suppose when we speak about Shakespeare, we have to bear in mind that in his days, the pronunciation of his language was in places really quite different to ours, with stresses being either always in a different place to where we have them, or put in a different place to where we would put them simply, again, for prosody, which segues beautifully into Shakespeare.

You, if I understand this correctly, are a Shakespeare specialist within this. And so when you approach Shakespeare particularly, what are in your view, the noteworthy important findings, discoveries that using computing, using stylometrics, using counting and analysing of words in this way has given us?


GABRIEL EGAN:

The big discoveries concern Shakespeare's engagement with other writers. That is, it's become now, I think, undeniable that about a third of all Shakespeare, a third of all the plays have someone else's writing in them as well.

Now, that isn’t... – when I say a third, I'm counting, say 43 plays by Shakespeare. So we've got the 36 plays that are in the 1623 First Folio of Shakespeare. Plus everyone accepts that Pericles is partly by Shakespeare, 1609 with George Wilkins, and The Two Noble Kinsmen, not published until 1634 is co-authored. So we have 36 plays plus those two.

What's changed in the last twenty or thirty years is how many people have come around to the idea that Shakespeare also contributed to the play Sir Thomas More. Most interestingly, that one survives, part of it, in his own handwriting, a manuscript in the British Library; that he also contributed to the play Arden of Faversham. Most recently that he contributed to the play The Spanish Tragedy as it was adapted after Kyd's death: it was added to by Shakespeare amongst other people, and then also a play called Edward III, he contributed; and then you start to shade off into the ones where, well, it depends who you talk to, whether he also contributed, for example, to the play Mucedorus.

So what we've got is a growing canon, but not all the specialists are on board yet. There's a bit of a grey area. So for example, twenty or thirty years ago, the play Pericles was not universally accepted as co-authored. In fact, I think – who was it now – one of the major editions tried to make a claim for Shakespeare's sole authorship of Pericles, rather than it being co-authored with George Wilkins.

So there's been a sort of expansion, and more and more people have got on board. But at the same time, as this canon has expanded to around 43 plays now, it's also in some areas contracted in the sense that – and this is where, this is really what answers your question about the big changes from stylometry – we can now show quite clearly that substantial amounts of what we thought before were wholly Shakespearean plays are actually co-authored. So all three of the Henry VI plays now appear to be co-authored, with Shakespeare contributing really quite, very little, rather, to the First Part of the Henry VI plays.

We can also show very strongly, I think, that Macbeth is co-authored in the sense that, not that he sat with someone and wrote the play, but that after Shakespeare's death, the play as he knew it was adapted by Thomas Middleton and substantially cut, actually. One reason why it's short is not that Shakespeare wrote it short. It's been cut by Middleton, who then also added a few things that were not in the original, and that that's all we have. The Macbeth we have is this adaptation version.

People were suggesting that in the 19th century, on account of, there's a song in the play, the witches’ song, one of them, which was definitely a popular song of 1619, so three years after Shakespeare died. So it would seem that that song was put in because it was popular at the time before the First Folio was published.

So what we're doing, we've got a canon that is expanding in the sense that we keep finding more and more bits of Shakespeare in other people's plays, in the sense that we've always known The Spanish Tragedy as a play by Thomas Kyd. We now think part of that is by Shakespeare. We already knew, we've known for a long time, Arden of Faversham being a play that was published anonymously in 1592, we now think part of that is by Shakespeare; Sir Thomas More is definitely by other people, by Henry Chettle, Thomas Dekker, and others and we now think part of that's by Shakespeare.

But in other areas of the canon, as it were, it's contracted, in that we acknowledge that formerly canonical Shakespeare, things like the Henry VI plays, Macbeth, also Measure for Measure has a substantial amount of Middleton in it, and Timon of Athens, again, a play that's in the Folio, we think now is co-authored. So it's contracting in those areas.

What both of those developments give us is a rather different version, a rather different vision of Shakespeare, the man in his cultural milieu. That is, we get a much stronger sense of a man engaging with his fellow writers. A man being sociable, working with others as a member of a theatre company, much less of the idea of the bald headed genius in the garret with his pen, thinking things up; much more the man in the writers’ room, working with others.

Which of those versions of Shakespeare you prefer is to one's own taste. I happen to prefer the more sociable Shakespeare, but I try to not let that influence my judgement, because one of the great problems in this field is that we shouldn't really care about what you're going to find when you do an investigation. This will, of course, colour, even unconsciously colour your processing of the evidence.

And in any case, we have enough of a sociable Shakespeare, from what we already knew about the co-authorship, we had pretty clear ideas about some co-authorship. Finding more of it tells me that – as anybody in the theatre, I think, would expect – that he engaged with other people. He was an actor. He was a writer in a thriving new financial success. The economic success of theatre in his time is an extraordinary thing. And of course, he was engaged with other people and wasn’t, how could he be, a loner in that business?


SEBASTIAN MICHAEL:

How much agreement or disagreement is there amongst scholars about this? Are you going out on a limb when you, when your research comes up with such findings, or do people broadly agree because the data is really quite compelling?


GABRIEL EGAN:

Well, unfortunately it is a very vexed field. People care very much. One senior scholar, half jokingly, I think, a few years ago, said he was going to form the ‘Society for the Protection of Shakespeare's Text’ because he objected so much to the New Oxford Shakespeare making these claims about the Henry VI plays.

So people feel quite personally engaged with their vision of Shakespeare, and the idea that you might attribute something that they think is by Shakespeare to someone else. So it is rather a vexed field.

But I think if we take the long view, a lot of harm was done about one hundred, almost exactly one hundred years ago, I think it was 1924. There was a very influential public talk by E.K. Chambers, a very senior Shakespeare scholar, called The Disintegration of Shakespeare, in which he was trying to argue against some work done in the late 19th and early 20th century, which was finding co-authorship across the Shakespeare canon. And that essay was extremely influential. It was a British Academy talk here in London, and then got published as a very influential essay arguing against what he called ‘disintegration’.

And the word itself tells us what attitude he had towards ‘the disintegration of Shakespeare’, and that really persisted until the 1980s, when various people... – and in fact, the 1986/87 Oxford Complete Works of Shakespeare was the first to really investigate, or actually to report, because they didn't do any new investigation, it reported, really, what the experts in the field were saying about Shakespeare and his collaborations. And that was something that that edition was heavily criticised for. People were just not ready to hear some of these conclusions about Shakespeare.

People want to think of him – that image of the lone genius in the garret is a very powerful one and people wanted to preserve that. So it's been a slow, I would say, taking the long view, it's been a slow process of people accommodating themselves over the last few decades to the idea that Shakespeare was a co-author.

But of course, we haven't really defined co-author, and I've mentioned a couple of different ways of being a co-author: one is to sit with another person in the same room, as we are now, and collaborate on something. Another way you might do it is to divvy up the work beforehand and say, you go and do Acts Two and Three, I'll go and do Acts One and Four, and we'll come together and talk about Act Five. And that certainly is one way of working we can demonstrate. That seems to be how Pericles was done, with George Wilkins taking Acts One and Two and Shakespeare doing Three, Four and Five.

But another way to be a co-author is to take over a play that exists, perhaps by a dead writer, and to adapt it. And that seems to be what happens in some of Shakespeare's history plays. He takes over an existing play and adapts it. And then, of course, in the end that happens to him as well. When Middleton takes over Macbeth and Measure for Measure and adapts them. So that's another way, sort of beyond the grave, to be a co-author. They're all co-authorship of various kinds, but they're not collaboration, or rather, one is an active collaboration, one is a more passive collaboration.

But I think an important aspect of this from the point of view of English studies is that, what doesn't happen, what we can now prove doesn't happen, is that when two writers decide to work together, even if they work really closely together, side-by-side, they don't merge their styles. Style stays with the person.

Now, it's not the case that I can say this line is Shakespeare and the next one is by Middleton. The next one is by Shakespeare, because we need substantial bodies of writing to do the analysis. But when they did work perhaps by Acts, then you really can see the joints. They didn't manage to blend. They didn't manage to cover up the joints in the way that was… – it was popular to claim they must have done.

Since the 1960s, the predominant theory about what would happen would be that when writers work together, their styles are blended to the point that you can't tell who did what. And that actually emerges out of really French structuralist and post-structuralist theory about the nature of authorship, which is that it is essentially impersonal, that somehow a culture speaks through its great writers. And what you're hearing are the ideas and the idiom of that culture.

Turns out that's completely untrue. Michel Foucault and Jacques Derrida, Roland Barthes were wrong, we now know, about the nature of authorship. It turns out that the key determinant of what ends up in the text is who wrote it. By which I mean, if you take, for example, all the history plays by all the people who wrote history plays in Shakespeare's time, and you do various kinds of analysis, linear discriminant analysis is one of them, where you simply ask, objectively, if you had to group these by various measures, which plays would you put with which plays? You ask the machine to group them by various criteria, and the patterns are authorial.

You get a bunch of Shakespeare, a bunch of Heywood, a bunch of earlier writers, but you get them forming these clumps by author. And that happens even with history plays, by which I mean, the pattern’s stronger still outside of history plays; history plays are the ones with the least authorial signal, that is, history plays are rather like each other. There is a pattern to being a history play.

Once you get to the comedies and tragedies, it's a very stark distinction. Plays separate themselves out by authorship, which is the exact opposite of what I was taught as an undergraduate was the nature of authorship. So the empirical work has shown the theory to be utterly false, which is quite a shock to some of us, brought up on high French theory, that authors really make the most difference to what gets written.
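The idea of asking the machine which plays belong together can be illustrated with a very small Python sketch. It is a simplified stand-in and not linear discriminant analysis: each play is reduced to a hypothetical profile of three function-word rates, and the check is simply whether each play's nearest neighbour is by the same author. All titles, authors and numbers are invented for illustration.

```python
import math

# Hypothetical rates per 1,000 words of three function words
# ("the", "and", "of"). Every figure here is invented.
plays = {
    "Play A1": ("Author A", (31.0, 27.0, 16.0)),
    "Play A2": ("Author A", (30.0, 28.5, 15.0)),
    "Play A3": ("Author A", (32.0, 26.0, 17.0)),
    "Play B1": ("Author B", (38.0, 22.0, 21.0)),
    "Play B2": ("Author B", (39.0, 21.0, 22.0)),
    "Play B3": ("Author B", (37.0, 23.0, 20.0)),
}

def nearest_neighbour(title):
    """Title of the other play whose profile is closest (Euclidean distance)."""
    _, profile = plays[title]
    distances = ((math.dist(profile, other), name)
                 for name, (_, other) in plays.items() if name != title)
    return min(distances)[1]

# Does each play sit closest to another play by the same author?
same_author = {title: plays[nearest_neighbour(title)][0] == plays[title][0]
               for title in plays}
print(same_author)
```

With real data the profiles would run to many features measured across whole plays, and the grouping would have to be tested rather than assumed.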

I still get people saying to me, though, when you present some result of authorship analysis, ‘well, so what? We don't really care about authors, do we?’ And my response to that is, actually even people who say they don't care about them, they say, ‘oh, I don't look at that, I look at the culture of the plays. I look at the historical context, I look at the theory of mind being wielded here, or I look at the status of various marginalised groups, I don't care about authorship'. But in fact, people always do, when you press them, care, because what they do is they compare this writing with this writing and they say, for example, ‘the wonderful thing about The Merchant of Venice and, say, Shakespeare's representation of Christian prejudice against Jewish people, the wonderful thing is that the same tolerance is shown in…' and they’ll mention other works by Shakespeare, and they will essentially be treating it as one of his habits of mind, say, a generosity, tolerance.

And then if you point out that one of those other works that they're pointing to actually isn't wholly by Shakespeare, then that argument is somewhat undermined.

So, in fact, everybody, I would say, everybody does actually care about authorship if they wish to compare one thing with another. If you make no arguments comparing one text with another, then I suppose you could say, I don't care about authorship, it doesn't matter to me.

And you could, I suppose you could say, ‘I care about not who wrote the play, but which playing company put them on. I want to treat all of the Lord Admiral's Men’s plays as one thing. And let's compare them to the Lord Chamberlain's Men. We know what most of those plays are. Let's compare them to the Queen's Men. Is there a company style, for example?'

Turns out, of course, actually, that that has been studied, and we can't find a company style. Turns out that the Chamberlain's Men's plays by Shakespeare are like the few Admiral's Men's plays that Shakespeare seems to have had some hand in.

So that, again, it seems like in all these debates, the trump card is authorship, which some will find reassuring. Some still will not really find this comforting. I still meet people, particularly in America, more than in Britain, with a heavy investment in some of this French theory, who simply can't accept that and have written at length about why it's wrong. The trouble is, it's empirical. It's not a matter of debate.


SEBASTIAN MICHAEL:

To my ears, as somebody who is a writer before I am a podcaster, this is music rather than noise. But it also brings us really quite neatly and beautifully into the Sonnets, because I'm one of these people who am in love with the Sonnets. I'm developing a very personal relationship with Shakespeare through his Sonnets. And I feel – of course, this may yet be proven to be wrong, but my contention is that they are his most personal work and that we learn a lot about Shakespeare, the person, Shakespeare the man through them. And one of the interesting questions when we take the Quarto Edition of 1609, then we have these 154 sonnets that are numbered and that are therefore arranged into a sequence. And we have really quite interesting and long and quite intense debates about, is this the correct sequence or not?

And I'm fascinated by this edition, which I have here on the table, which I've talked about on the podcast with Paul Edmondson and Sir Stanley Wells, who agreed to talk to me about their rearranging of the Sonnets in what they call, what they themselves refer to as a “putative chronological order of composition.” And one of my questions to them, of course, naturally, was, well, where do we get the knowledge, the information from, which order they were composed in? And they then referred me to the research done by MacDonald P Jackson.

One of the questions that is of particular interest to me, and I believe possibly therefore also to our listeners, is what can you say about the work of Professor Jackson, and maybe more generally and principally: in an approach such as his, how precise can that be? How clearly can we say about a body of poems such as the collection that we have, well, they should be re-ordered into this different sequence to the one that we have, because it is numbered.


GABRIEL EGAN:

Before I get to Mac Jackson, I just want to mention when you talk about the reordering, this would be reordering in order to reflect order of composition.


SEBASTIAN MICHAEL:

This is what Paul Edmondson and Sir Stanley Wells certainly do with their edition: a) they bring into the collection other sonnets that were not part of the 1609 collection, they take into it, into their edition, also sonnets that were written for plays; and then they put them into what they believe to be – although they told me in our conversation, they follow the work done by MacDonald P Jackson, and their stated intention with this is a very interesting one, and one which creates something of a challenge to somebody like me, which is they want to get away from what they consider to be a “distracting narrative” of a Fair Youth and a Dark Lady. And they are saying that by taking the Sonnets out of the order in which they were originally published – bracket: whereby they actually both say also that the order in which they were originally published was, according to them, they believe, or they are ready to believe, an order given the Sonnets by Shakespeare himself, which, of course, opens a whole new set of questions about how was he involved, to what extent was he involved in the publication? – but taking the Sonnets out of the order in which they were published and putting them into what could be the order in which they were composed, so as to get a new perspective on them.


GABRIEL EGAN:

And the reason they do that is because it became clear in the late 1970s and 1980s that our editions of Shakespeare in general, our Complete Works in particular, were not really thinking through the order of anything. So a lot of editions, Complete Works of Shakespeare editions, would put the plays roughly in the order that the First Folio of 1623 put them in. So you'd have the section of comedies, a section of histories, a section of tragedies. And there's not really any good reason to do that unless you say, I want to show what the First Folio did. It's not a natural order. It's distinctly unnatural in some ways. Plays don't easily break into exactly three categories. You may say there's a whole other category called the tragicomedies or the romances. You might say the Folio makes some very odd choices about where it puts some plays, and it's also shoehorned some plays into their category.

So what happened in the late Seventies, particularly because of Stanley Wells’s thinking of what he was going to do. Well, it started with the New Penguin Shakespeare, of which he was the associate editor, I think, helping T.J.B. Spencer. But he really, when he got the job of putting together the Oxford Shakespeare of 1986/87, which he started doing in 1977, he decided that what actually mattered was the order of composition.

So what you get really, for the first time, I think, in the Oxford Shakespeare of 86/87, is the plays in the order he wrote them, as best we can tell. And this was really quite transformatory. And certainly, I was an undergraduate when that edition had just come out, and to read Shakespeare in that order by people who thought it mattered was wonderful, because you do start to see something that I don't think you see any other way: you see the development of a writer's professional capacities. You see him getting better at certain things. So you see what he has trouble with in Two Gentlemen of Verona, probably his first play, regarding scenes of more than two people, you see him developing his skills, and you see him also returning to ideas and reworking them and improving upon them. Particularly in the late plays, you see patterns of dramatic interaction where he has another go at it essentially and does better.

So Stanley Wells was a champion of this idea of reading the works in the order they were composed. Now, the Sonnets, of course, are published, as you said, in 1609, in an edition that may or may not – and we don't know, I would say – have had some involvement from Shakespeare, including choosing the order. In putting them into the order of composition, I think Stanley Wells is trying to do the same thing that he did with the Oxford Shakespeare, which was help us see how a writer changes across his lifetime, because the Sonnets, as far as we can tell, some were written, with one exception, which is very early, but they were written from the early 1590s through to possibly even as late as the middle of the 1600s, as in the first decade of the century.

And to see patterns of development across that, that career would be most interesting. The problem, of course, is that putting plays into their order of composition is one matter; and really order of composition and first performance is what you care about, because to a large extent plays stand on their own.

But if you think sonnets form runs, as it were, and this can be demonstrated, there are things about a run of a dozen sonnets that are connected, not merely thematically, but actually also formally, by which I mean the things to do with repeated rhymes that connect them.

Then it may be that Shakespeare intended the order of the 154 that we have, although he'd written some of them a long time ago, and some of them quite recently. But his final intention, as it were, if he was involved in the publication, may have been to put them in exactly that order, using some old stuff of his and some new stuff to create a whole new thing. So I would say the reordering in the chronology will in that case reveal something, or perhaps reveal something quite interesting, but might also be breaking part of the intention of the writer to create this thing using some old and some new material that he had, and which he reworked at that point in time. I would say that's an unsolved matter.

We don't know whether he intended the order of 154, but they do, I think most people think they fall into groups or zones of the sonnets. And those also seem to be datable. That is, it's not simply that there's a group of sonnets about the Young Man, and the sonnets about the Dark Lady, and the sonnets about the Rival Poet and so on. They don't only have those thematic connections, they're also connected to each other in other ways that Mac Jackson was amongst the first to discover. And one of those ways is rhyme, actually.

If you treat that order of 154 as not merely random and look for rhymes, you can find whole runs of rhymes where one sonnet follows the next. What I mean is you get a rhyme of ‘she/me’, and the next one will have a rhyme of ‘me/be’, as well as ‘be/thee’. And they'll have a sort of chain link effect running through a series of them.

And you can demonstrate that it's not – Mac Jackson did this – you can say it's not by chance, because if you simply jumble the 154, you can't find them. These things disappear. This must be intentional. But they don't run through the whole 154, they run through these zones. And various people have said, well, let's treat the Sonnets as four distinct zones. Can we then say anything about each zone instead of 'Dark lady', 'Rival poet', 'Young Man'? And it turns out we can, and rhyme was one of the things: in the late Nineties MacDonald P Jackson did the study of this by comparing the rhymes in the Sonnets to the rhymes in the plays, because we have a fairly good sense of the chronology of the plays. It's been much studied since the eighteenth century. We have actual dates of performance for some plays, so we know this play was performed typically, say, at court.
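The jumbling test mentioned above can be sketched as a simple shuffle test in Python. This is an illustration of the general idea only and not a reconstruction of Jackson's procedure: each sonnet is reduced to a hypothetical set of rhyme words, a 'link' is counted whenever neighbouring sonnets share one, and the toy data is invented.

```python
import random

# Hypothetical toy data: each "sonnet" is reduced to a set of its rhyme words.
sonnets = [
    {"she", "me"}, {"me", "be"}, {"be", "thee"}, {"thee", "see"},
    {"day", "way"}, {"night", "light"}, {"time", "rhyme"}, {"love", "prove"},
]

def count_links(sequence):
    """Count adjacent pairs that share at least one rhyme word."""
    return sum(1 for a, b in zip(sequence, sequence[1:]) if a & b)

def shuffle_test(sequence, trials=2000, seed=0):
    """Share of random orderings with at least as many links as observed."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = count_links(sequence)
    hits = 0
    for _ in range(trials):
        shuffled = sequence[:]
        rng.shuffle(shuffled)
        if count_links(shuffled) >= observed:
            hits += 1
    return observed, hits / trials

observed, share = shuffle_test(sonnets)
print(observed, share)
```

If almost no jumbled ordering produces as many links as the published order, chance becomes a poor explanation for the chain.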

We have dates by which things must have been created. So if we know, for example, the play was performed at court, then that's the [latest] date by which it was written: it must be written to be performed, and dates before which it cannot have been written. So, for example, if a play depends upon some topical event, it can't have been written before that event occurred historically.

So when Henry V, when that play alludes to Essex's expedition to Ireland, that can't have been written before the planning of that expedition in 1598, and so it actually couldn't have been written before he'd gone. You don’t, generally – the allusion looks forward to his return from Ireland – you don't normally look forward to someone's return before they've gone. So there are dates – what we call them, terminus a quo, terminus ad quem – there are dates ‘not which before’ and ‘not which after’ for the plays. And we've actually got a lot of those. And of course, we've got some grey areas, but by various other kinds of analysis, we've got a fairly agreed upon chronology for the plays.
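
The logic of combining a terminus a quo with a terminus ad quem can be sketched in a few lines: the latest "cannot be earlier than" date and the earliest "cannot be later than" date together bound the window of composition. The function name and the years below are placeholders chosen for illustration, not claims about any particular play.

```python
def date_window(not_before, not_after):
    """Combine dating evidence into a (earliest, latest) window.

    not_before: years before which the work cannot have been written
                (for example, the date of an event it alludes to).
    not_after:  years by which the work must already have existed
                (for example, the date of a recorded performance).
    """
    earliest = max(not_before) if not_before else None
    latest = min(not_after) if not_after else None
    if earliest is not None and latest is not None and earliest > latest:
        raise ValueError("the evidence is contradictory")
    return earliest, latest


# Placeholder years, purely illustrative.
print(date_window(not_before=[1596, 1598], not_after=[1600, 1602]))
```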

And if you agree that chronology, you can then say, well, how do the Sonnets compare in their use of things like rhyme? Because a lot of the habits we're tracking here seem to drift in Shakespeare's mind over time. That is, things… – there aren’t so many sharp jumps. He starts to favour something and then disfavours it. There's a sort of flow to what he's doing. And some of them are very datable.

For example, there's a fashion for humours in the late 1590s, plays with ‘humour’ in the title, but actually in which people talk about a humour as in an essential characteristic of personality. So the melancholic humour, the sanguine humour. And there are plays that use ‘humour’ in their title. And if you track simply that word ‘humour’ through Shakespeare's career, he uses it very rarely until the late 1590s and then he spikes. In other words, he follows the fashion.

And he has Nym, isn't it? Yes, Nym: “That is the humour of it,” he says over and over again in Henry IV, Part Two, Henry V, Merry Wives of Windsor. Corporal Nym. So that word sort of comes into Shakespeare's vocabulary and goes out again. If you had a ‘humour sonnet’ – I don't think we do, but if we did – you might say, well, that's going to be much more likely to have occurred during that peak than at the time when Shakespeare hardly ever uses the word ‘humour’, so you can somewhat date the sonnets by the plays.

If you agree the play chronology, you can look for similar things. And what MacDonald P Jackson did was start by looking at rhymes, but then he got particularly into rare words. Words that occur ten times or fewer across all of Shakespeare. Words that Shakespeare does use, but not very often. And those really do come into Shakespeare's idiolect and then go out again. So there are words like ‘humour’ that he uses at a certain time and hardly ever uses outside that time period. And you can track those and get a very good sense of what's in his mind, as a word that he likes using at the moment.

There's a whole variety of these measurable authorial behaviours. Another one I should have mentioned in relation to chronology is the use of rhyme itself. Shakespeare hasn't got a single index of his use of rhyme in all his plays. It's something that he uses a lot more in plays around 1593, 94, 95 than he does other times in his writing. And if you think, well, why would that be? The logical conclusion would be, well, he wrote some plays in the early 1590s, then had to stop because the theatres were closed because of plague. And he turns, it seems, to writing long narrative poems. And they're rhymed. These rhyme a lot, because of the nature of the poetic form he's using.

Thereafter, he uses rhyme a lot more in his plays, so it seems that his poetry influenced his drama. What I’m describing of course is that he wrote these long narrative poems apparently when the theatres were closed, because there was no outlet for plays at that time: why would you write plays when there's no theatre to perform them in? Then he goes back to the drama when the theatres reopen. But that helps us.

So the fact that rhyme is something that he starts to do a lot of in his plays around that time, again, helps us date plays where we are uncertain, where we don't have those external clues like an allusion to an historical event. So what Mac Jackson looked at in the early 2000s was rare words in Shakespeare, words he uses ten times or fewer across all of his 43 plays, which had already been studied by other people and tabulated. And what MacDonald P Jackson is able to do is take other people's tables of data and wring from them new discoveries.

I should have started with some context about his own work. MacDonald P Jackson's first great contribution to our field was in 1975. He published a book on Shakespeare and Middleton, particularly on Middleton's contribution to Timon of Athens, in which he did the process I mentioned earlier of actually counting manually these function words, these small, highly frequent words, ‘the’ and ‘and’ and ‘on’ and ‘in’ in the plays of Shakespeare and Middleton by hand with graph paper, making a little cross every time one of them came up; he was able to show Middleton's hand in some works we thought previously were just purely by Shakespeare.

But that was work done before we had computers to do the counting. And what he specialised in was, first of all, he did the counting, but also the statistical analysis of the results. Because all of these habits are really probabilities. It's not a cast iron certainty, we’re going to find a certain word by Shakespeare. But what you can say is, given the data I've gathered, the evidence I've put together, how likely would we get these results if, say, these two plays weren't by the same author? Or how likely would I find this pattern of rhyme?

Say, for example, with the Sonnets, I mentioned the run of rhymes across, say, ten or twelve sonnets. When we jumble them up and look again for the same phenomenon, we find it's not there. You can say, well, how likely is it that by chance you would get 16 sonnets in a row that have a shared rhyme? And you can do the proper analysis, statistical analysis which says, well, that will happen, but not very often. You'd have to have a thousand playwrights writing for a hundred years before that thing would just happen by chance.

And that's always what we're asking. We're not actually asking is this real, this phenomenon I'm tracking? Or did this happen purely by chance? We’re actually asking a question that sounds the same but isn't, which is how often will chance alone do this? And you get what's called a p-value. And in social sciences, people often say, well p is less than 0.05. Therefore, there's something going on here. That is, if chance alone would do this one time in twenty or fewer, then probably chance didn't do it on this occasion.

And that's actually, logically, that's a very weak construction that doesn't bear much interrogation. But that is what the social sciences and some sciences use, which is, anything that's rarer than one in twenty, there’s something going on here, we'll look more closely at it.

But the maths for doing this is very well established. And MacDonald P Jackson was one of the first to actually, whenever he discovered something, to do the proper statistical analysis to say could that happen by chance, how often would that happen by chance? Or is there something actually going on here?
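
The question "how often will chance alone do this?" can be approximated by brute force: jumble the sequence many times and count how often the jumbled order produces a run at least as long as the one observed. The sketch below does that for the rhyme-chain measure, with invented data and an arbitrary number of trials; it is a generic permutation test, offered as an illustration rather than as the analysis Jackson carried out.

```python
import random


def longest_linked_run(rhyme_sets):
    """Longest run of consecutive items whose neighbours share a rhyme word."""
    if not rhyme_sets:
        return 0
    best = current = 1
    for prev, nxt in zip(rhyme_sets, rhyme_sets[1:]):
        current = current + 1 if prev & nxt else 1
        best = max(best, current)
    return best


def permutation_p_value(rhyme_sets, trials=2000, seed=0):
    """Estimate how often a random ordering matches or beats the observed run."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    observed = longest_linked_run(rhyme_sets)
    shuffled = list(rhyme_sets)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if longest_linked_run(shuffled) >= observed:
            hits += 1
    # Add-one correction: the estimate is never reported as exactly zero.
    return observed, (hits + 1) / (trials + 1)


# Toy data (hypothetical): six chained items followed by six unrelated ones.
chained = [{f"w{i}", f"w{i + 1}"} for i in range(6)]
unrelated = [{f"x{i}", f"y{i}"} for i in range(6)]
observed, p_value = permutation_p_value(chained + unrelated)
print(observed, p_value)
```

A small estimated p-value says only that jumbled orders rarely produce such a run; as noted above, any fixed threshold such as 0.05 is a convention rather than a proof.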

So looking at rare words in the early 2000s and taking a chronology of plays upon which we all pretty much, to a large extent, agree, within one or two years or three years at most in most cases, he said, let's take the zones of the Sonnets and try and find affinities to various times in Shakespeare's playwriting career that we can identify. And what he found in particular was that Sonnets 104 to 126, that group distinctly shared rare words with late Shakespeare plays, certainly after 1600, so that his big conclusion is that that group, Zone 3, 104 to 126, is Jacobean. It's later than the rest.

Now, his argument about the other Zones 1, 2, and 4 are arguments he would less strongly stick to. He makes suggestions about those, but I believe his big, strong claim is that that zone is later than all the rest.
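
To make the rare-word idea concrete, here is a minimal sketch: find the words that occur only a few times across the whole body of text, then count how many of a zone's rare words are shared with early plays and with late plays. The play titles, dates, word lists, and the threshold of two occurrences are all invented for the toy example (the study described above used ten or fewer across the whole canon); the boundary year 1600 echoes the "after 1600" mentioned above.

```python
from collections import Counter


def rare_vocabulary(texts, max_count=10):
    """Words occurring max_count times or fewer across all texts combined."""
    totals = Counter()
    for words in texts.values():
        totals.update(words)
    return {word for word, n in totals.items() if n <= max_count}


def links_by_period(zone_words, plays, play_dates, rare, boundary):
    """Count rare word types the zone shares with plays before/after boundary."""
    zone_rare = set(zone_words) & rare
    tally = {"early": 0, "late": 0}
    for title, words in plays.items():
        period = "late" if play_dates[title] >= boundary else "early"
        tally[period] += len(zone_rare & set(words))
    return tally


# Toy data (hypothetical titles, dates, and texts).
plays = {
    "Play A": "the king doth frown and the night is dark".split(),
    "Play B": "the queen doth smile and the day is bright".split(),
    "Play C": "the storm is wild and the sea is vast and strange".split(),
}
play_dates = {"Play A": 1592, "Play B": 1595, "Play C": 1608}
zone = "the strange sea and the vast sky".split()

rare = rare_vocabulary({**plays, "zone": zone}, max_count=2)
print(links_by_period(zone, plays, play_dates, rare, boundary=1600))
```

A real study would also have to allow for the differing lengths of plays and the number of plays in each period before reading anything into the raw counts.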


SEBASTIAN MICHAEL:

Then the question, of course, is, I could imagine not only on my mind, but also on some of the listeners out there, how dependable is this? Or how – of course, how dependable is it – but also how controversial or not is it? In other words, do people argue with or against this? Or is everybody more or less agreed that the work of MacDonald P Jackson has not really been disproved, we can consider that to be fairly reliable? Where does scholarship stand on this?


GABRIEL EGAN:

Those who even engage in this area tend to do so mainly for the plays, not for the poems. So we're talking of a niche within a niche here. So there isn't an awful lot of debate, to be frank. In defence of what Wells and Edmondson do, and I'm certain that Stanley Wells would use this defence, is that you can't damage this stuff. You're not hurting it. I'll tell you which order I'm putting them in and why. But there are many other editions of Sonnets out there as well, so it's not like I've taken an obscure work.

If you work on a play by one of Shakespeare's contemporaries and make an edition of it for a major publisher, for a big academic publisher, you'll be producing the work that most people are going to read for the next fifty years. If it ever gets re-edited, it won't be for another fifty years. There's a kind of responsibility to be conservative; that is, to present the thing as it first entered the world, and to not muck it about too much. Whereas with something like the Sonnets, where there are many editions available, many views are available, that gives people like Wells the freedom to say, look, I've had this hypothesis, I'm quite convinced that Mac Jackson is right about the dating. Let's see what that does to what we can make of the work itself. And to me, that's a completely justified approach. That is conservativism, as in, don't muck about with it too much, for works that are edited once in a generation; and for works that are absolutely everywhere, it's worthwhile, it’s illuminating to actually explore the interpretative consequences of some empirical scholarship like Mac Jackson's.

There isn't much debate about these things. People debate the chronology of the plays, they debate that, but not very vociferously. They debate stylistic or stylometric questions about the Sonnets hardly at all.

What gets everybody worked up is the authorship discussion. So, as I say, it's a niche within a niche. What's in their favour for what they've done in relying upon Mac Jackson's work is that although he's not God, he's got an extremely high reputation amongst those who understand what he's doing, because although he's changed his mind on certain things over time, no one has ever found any major methodological errors in what he's doing. He's extremely careful, and he supports what he does with proper statistical analysis. And this is extremely rare in our field.

That is, a lot of people will detect a phenomenon and say, well, look at this thing I've found, look at this fact about the text: that is so extraordinary, it must be not by chance, it must be intended. Something artistic going on. What Mac Jackson would always say is, well, if chance were to produce this thing, how often would it produce the thing you've got? He's always rigorous about distinguishing what chance can do from what must in fact be creative intention. And that's important because language will throw up some very weird things.

I'll give you an example. There's a very senior scholar who works on not individual words, but whole phrases that occur across various texts and tries to attribute authorship by the repeated use of various three or four word, or even five word, phrases. And one technique is to take an author's complete works and a text you suspect might be by that author, and look for all the three and four and five word phrases that are in common. And you get that list of maybe, say, one hundred three or four or five word phrases. Then you go looking across everybody else's writing. And because of EEBO/TCP I mentioned earlier, you can look at not merely all other literary authors, but actually all other authors of the period and say, well, how rare are these phrases? And you can say, well, this first phrase, I've got 25 other people using that one. So you throw away all the ones that are common, and you get left with a nugget of rare phrases, that is rare in the sense that they're not much used by anybody else, and they're in common to the text you're trying to attribute and the canon of your candidate author.

And at this point, our scholar throws up his hands and says, well, that is extraordinary. Look at that. That wouldn't happen by chance, would it? And the truth is, yes, it does happen by chance. And I once, to amuse myself, thought, let's see how often this would happen with a text that can't be written by the same person. So I took my PhD thesis, 80,000 words, and that of a friend of mine, Matt Steggle, University of Bristol now. We were students at the same time. His PhD was, I think, the same year, but it was available digitally as well. Took his PhD thesis and did this process: I found all the phrases that were in common. And then, looking across various Google searches for English language and natural language corpuses, so just big textual corpuses of English, I threw away all the phrases that were not rare, and I was left with half a dozen extremely rare phrases that both he and I had used in our PhD theses, and we didn't know each other at the time, I should mention, we became friends later. So we couldn't have influenced each other. But it turns out that any two substantial bodies of writing will have in common a number of extremely rare phrases. And if you happen upon these things, it might strike you intuitively as very meaningful. But it isn't. It's a mere truth of large bodies of writing.

And that's one of the things that careful scholars like Mac Jackson make sure they're not fooled by. They don't seem to rely upon their intuition about how rare things are, or how likely it will be that things this rare would crop up by chance. They do the maths and show it.
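
The phrase-matching procedure described above can be sketched as follows: collect the three-, four- and five-word phrases two texts have in common, then discard any that also turn up in a reference collection. The texts here are a few invented sentences and the reference "corpus" is a single line, standing in for the large collections mentioned in the conversation; the function names are illustrative.

```python
def ngrams(words, n):
    """All runs of n consecutive words, as a set of tuples."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_rare_phrases(text_a, text_b, reference_texts,
                        sizes=(3, 4, 5), max_other_uses=0):
    """Phrases common to both texts and used by at most max_other_uses
    of the reference texts."""
    rare_shared = set()
    for n in sizes:
        reference_grams = [ngrams(ref, n) for ref in reference_texts]
        for phrase in ngrams(text_a, n) & ngrams(text_b, n):
            uses = sum(phrase in grams for grams in reference_grams)
            if uses <= max_other_uses:
                rare_shared.add(phrase)
    return rare_shared


# Toy data (hypothetical sentences).
text_a = "it is a truth of large bodies of writing that odd things recur".split()
text_b = "we found that large bodies of writing hide much".split()
reference = ["there are large bodies of water here".split()]

for phrase in sorted(shared_rare_phrases(text_a, text_b, reference)):
    print(" ".join(phrase))
```

The point of the anecdote is that a non-empty result from a procedure like this is expected for any two long texts, so the list on its own says nothing until it is compared with what chance would produce.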


SEBASTIAN MICHAEL:

Why is it that since MacDonald P Jackson, nobody as far as we know, has done much work on the Sonnets? How is this, and should it change? And if the thought was yes, this really should change, how would it change?


GABRIEL EGAN:

Well, we may be up against a hard limit to our studies, that is, there's not that much writing in 154 sonnets. One of the reasons that we can do reasonable analysis of something like Shakespeare is we've got about a million words, and you need large bodies of text for the local variations to, as it were, cancel each other out. And when we do analysis of authorship, it's always with the caveat that we've only looked amongst these candidate authors because they're the only ones for whom we have enough writing surviving to do it.

So, for example, it's very hard to do any analysis of the works of Thomas Kyd because he left us one play, The Spanish Tragedy. Maybe Soliman and Persida is his. I think it probably is, but that's not as agreed upon as his writing The Spanish Tragedy. A translation of Cornelia, maybe, yes, that seems to be his as well, but that's a translation of someone else's play.

You've got a small corpus to work with. All of these things are heavily dependent upon having a substantial body of writing. I couldn't take the notepad you're sitting with in front of you now away, and analyse it and tell you anything meaningful. I'd have to take, if you’ve got a book of yours, I'd take a book away. Then I could do some analysis. So with the Sonnets, we just don't have much writing.

And we also have the problem that we've done a lot of our refining of our procedures using drama, and we're not sure how applicable lessons learned from drama are for verse. I mentioned earlier that when you allow all the variables to, as it were, cluster, to form clusters in the writing, when you've got a bunch of texts and ask the machine to analyse what's most like what, I’d say that authorship trumps all other considerations, like who it was written for or what genre it's in. That was amongst the drama.

For the things that we have learned a lot about lately, the function words, that does also seem to be true of the other genres. So the verse, the non-dramatic verse seems to share those characteristics, but we've got to be very careful. Plays are like other plays more than they are like poems. I mean, they all divide in the same way, more or less, and there are certain constraints upon what you can do that may matter in ways we don't yet understand. That simply don't apply. We simply can't transfer the lessons. Or at least if we do so, we should be very conscious of doing so, applying methods and lessons we learned from drama to the non-dramatic verse.

There's that, and there's the fact that far fewer people are studying the poems than are studying the plays, both at student level, teacher level, researcher level.


SEBASTIAN MICHAEL:

Is there a known or knowable reason for this?


GABRIEL EGAN:

Why more people study the plays?


SEBASTIAN MICHAEL:

I suppose they are a richer body of work to start with.


GABRIEL EGAN:

They are a richer body of work. The Folio must take some of the blame here. So the first Collected Works of Shakespeare is a play only collection. The narrative poems, the long narrative poems Venus and Adonis and Lucrece sold very well. There are no authorship questions about those, in the sense that these are published with Shakespeare's name on them, so are securely Shakespearean. Why there might be fewer studies, I don't know, they… – things that you can perform are going to attract certain kinds of reuse that some things that you… – well, you can perform, people do perform them; I believe you perform the Sonnets, but they're not quite the same. And if you want to get a – well now, maybe you'll disagree with me – but if you want to get a group of young people interested in Shakespeare, I wouldn't start with the Sonnets. I would start with some dramatic interaction and say, right, you be Portia, you be Shylock. You know, I'd get something going that they can get their teeth into, which is, I would say there's an easier route into that than there is to give someone a poem and say, look, what do you think this poet is thinking about?


SEBASTIAN MICHAEL:

I have really one more question and then a bonus question, if you like. The question that I had always intended to ask you is a slightly obvious one. But if you talk about computational approaches and text, how do you view the role that AI is now playing already or is going to play in Shakespeare and literature studies, text studies in the digital humanities henceforth?


GABRIEL EGAN:

We don't know where AI is going. Nobody saw coming, well, few people saw coming five years ago what's happened in the last year or two, which has been an amazing transformation of capacity for these things to just astound us.

So to predict would be very foolhardy. But I think that the problem we have with AI, the limit we have, is what's known in AI circles as the scrutability problem. That is, a machine running AI software may well give you a very interesting answer, but it can’t tell you, and the people who made it can't tell you – so by machine I mean not only the physical hardware, I mean the software, the algorithm it's running – the computer giving you an answer, an AI giving you an answer, can't be explicated. That is, you can't know why that answer was given.

When those of us who write programs to analyse text write them, we can show each other what the machine is doing. We can absolutely, deterministically say what the machine did was: ... It might be a very long winded explanation. And we do have research students working on this, and I do make them absolutely give me the long winded version: we have talks in front of the blackboard for an hour where they say, here's what my investigation did. And I make absolutely sure that we both understand every part of that investigation. The experiment is fully described.

It is of the nature of a neural network, the underlying architecture of an AI, that even the people who made it don't know how it does it. That is, a piece of AI that can look at photographs and say, that one's a cat, that one's a dog, that's a horse: the people who made the neural network, who trained the neural network to do that, can't tell you why the various weights, as they're called within the neurons within that neural network, produce that outcome. They don't know where the discrimination occurs.

It's the same problem as talking to a person. These AI are getting so good that first of all, they now do make us feel we're talking to a person, they actually sound like another person. They pass the Turing test, as we say. The Imitation Game.

But actually, the problem is worse than that, in that then what they tell you can't be checked by any internal examination of the working of the machine. So it's as if you're asking another person, do you think Shakespeare wrote this? Do you think these sonnets are in the right order? Do you think this play was written in 1604? The AI can't tell you why it has that opinion.

A neural network trained to, for example, produce authorship attributions, to do classification, if it's using a neural network is somewhat of a black box. You can never know, and I don't mean that we're not smart enough to know, what I mean is the people who trained it don't yet know why that neural network, given those weights, produces the discrimination that it does. All they can say is – and this is so important for this kind of work – we've tested it. And when we gave it 10,000 pictures, a random jumble of 5,000 cats and 5,000 dogs – 99.3% of the time it correctly identified the animal. Now that's objective. That's empirical. That's good. We can at least say that's how reliable it is. We don't know why it's doing that, but we can say that ‘in these blind tests…’. And that's about the most hope you can have, is that if you do produce an AI that is able to do, say, authorship analysis, what you have to do is then take hundreds of works that you already know the author of, pretend that you don’t, ask the machine to do the classification and see how well the machine did, given that you know for yourself who wrote what.

And again, that kind of what we call validation run at least tells us how reliable the machine is. It doesn't tell us why it's getting it right, but we at least know that it is right. And then you can say, given this machine on many, many tests, many runs of validation is 99.2% reliable on authorship when distinguishing amongst these candidates, now, if I have a text of unknown authorship and ask it to choose amongst those, say, six candidates, then if it points to a particular one, you can at least say that, of these six, that writer is the most likely.
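
A validation run of this kind can be sketched without any neural network at all: hold back each text whose author is known, ask the classifier to guess from the rest, and count how often it is right. In the sketch below a deliberately simple nearest-centroid classifier stands in for the black box, and the two-number "feature vectors" and author labels are invented; any classifier with the same interface could be swapped in.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest_author(sample, training):
    """Guess the author whose average training vector is closest to sample."""
    by_author = {}
    for author, vector in training:
        by_author.setdefault(author, []).append(vector)
    return min(by_author,
               key=lambda a: math.dist(sample, centroid(by_author[a])))


def leave_one_out_accuracy(labelled, classify):
    """Hide each known text in turn and see whether classify recovers its author."""
    correct = 0
    for i, (true_author, vector) in enumerate(labelled):
        training = labelled[:i] + labelled[i + 1:]   # everything except item i
        if classify(vector, training) == true_author:
            correct += 1
    return correct / len(labelled)


# Toy data (hypothetical authors and features).
labelled = [
    ("A", [0.90, 0.10]), ("A", [0.80, 0.20]), ("A", [0.85, 0.15]),
    ("B", [0.10, 0.90]), ("B", [0.20, 0.80]), ("B", [0.15, 0.85]),
]
print(leave_one_out_accuracy(labelled, nearest_author))
```

The accuracy figure describes performance only among the candidates included in the test set, which is the limitation taken up next.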

Of course, you've said nothing there about the possibility that your correct author isn't among the candidates. There may be someone else who writes like one of these, or unlike one of these. And what's been chosen is the one amongst those six who's most like the person you've never actually trained upon. So there are always problems to do with the range of candidates you've got. And that is severely constrained by the fact that most of the drama of the period doesn't survive. We have about five hundred plays from between, say, 1576, when the first permanent theatre was built, through to 1642, when the theatres were closed. We've got about five hundred plays surviving of, we think, two to two and a half thousand plays written. So we've got a fifth, a quarter or a fifth of all the drama, if we're lucky.

But of those, we've got a lot of writers for whom there's no name. We just have one name, Anonymous. And for most of the writers, or for all of the writers, I should say, we have no more writing than Shakespeare left us. That is, he has the largest surviving canon.

Middleton's somewhere close. Jonson wrote a lot, but not all of it is plays. But we just don't have enough of most writers' writing left to say anything. And that's not going to be solved. It's not like there's more information to be wrung from that little data set. It just isn't enough data to tell you. There’s a – in information science it's called the Shannon limit – there’s a limit to how much actual information is in there.

So some of these questions may always stay unanswerable. And I should add that one of the grave problems that Shakespeare's dominance, by which I mean the fact that he leaves us so many more plays than anybody else, that actually causes a problem for a lot of the methodologies we're using.

If you're looking for a phrase, if you say, I think this phrase in my text I want to attribute is peculiar, is unusual, I haven't heard it before, and you say, well, who else is using it? And you look for that phrase; or just, all other things being equal, the writer who left you 43 plays is more likely to give you a match than the writer who left you one play. So there's a bias towards attributing things to Shakespeare, and we actually haven't yet figured out any reasonable way to compensate for that bias.

People have talked about various kinds of weighting, so it matters more if it turns up in a play by Chettle or Dekker, because they didn't leave so many plays, than if it turns up in Shakespeare. But there’s… – no one has a well worked out mathematical reason for giving a heavier, or for giving the right number. We don't know how much it matters. It certainly isn't a matter of dividing one canon size by the other canon size. It's got to be more complicated than that. So we have, as I say, a bias towards Shakespeare. But more fundamentally, we have the limitation of how much text is left us.
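
The canon-size bias can be illustrated with a deliberately crude model. Assume, purely for the sake of the sketch, that any single play has the same small, independent chance of containing a given phrase; the chance that a canon of n plays yields at least one match is then 1 - (1 - p)^n. The per-play probability used below is an arbitrary illustrative number, and real texts will not behave this tidily.

```python
def chance_of_match(per_play_probability, number_of_plays):
    """Chance of at least one match in a canon, assuming every play has the
    same independent probability of containing the phrase (a simplification)."""
    return 1 - (1 - per_play_probability) ** number_of_plays


p = 0.02                                  # hypothetical per-play chance
large_canon = chance_of_match(p, 43)      # a writer who left 43 plays
small_canon = chance_of_match(p, 1)       # a writer who left one play
print(round(large_canon, 3), round(small_canon, 3),
      round(large_canon / small_canon, 1))
```

Even in this toy model the advantage of the larger canon is not simply the ratio of the canon sizes, which is consistent with the remark that the right weighting is more complicated than dividing one canon size by the other.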

What is surprising is that just counting things is a good start. So I mentioned that if we count the frequencies of these hundred most frequent words in your writing and say mine, we can detect, we can distinguish between you and me with 85% reliability, but it turns out we can do more than just count those words. As I mentioned, we can look at their proximities one to another, and that can actually get us up to about 90-95% accuracy. So there is still some more information to be wrung from that limited data set, but even then, it only works for writers you've got enough writing with, and I don't think there's any hope of AI breaching that, because it isn't about being cleverer, it's about the intrinsic limitation of the information you've been left by history.
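
Counting function words is simple enough to show in full. The sketch below builds a relative-frequency profile for a handful of very common words and compares an unattributed sample with two known ones by summing the absolute differences. The seven-word list and the sample sentences are invented and far too small to be meaningful; the work described above uses the hundred most frequent words and whole bodies of writing.

```python
from collections import Counter

# Short illustrative list; a real study would use many more words.
FUNCTION_WORDS = ["the", "and", "of", "in", "on", "to", "a"]


def profile(text, vocabulary=FUNCTION_WORDS):
    """Relative frequency of each vocabulary word in the text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1               # avoid dividing by zero
    return [counts[word] / total for word in vocabulary]


def manhattan(p, q):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(p, q))


# Toy samples (hypothetical).
sample_x = "the cat sat on the mat and the dog sat in the hall"
sample_y = "a man of honour came to a house of stone to speak of a debt"
unknown = "the bird flew in the wood and on the hill"

print(manhattan(profile(unknown), profile(sample_x)))
print(manhattan(profile(unknown), profile(sample_y)))
```

The smaller distance marks the more similar profile; how far that similarity can be trusted depends on having enough text, which is the limitation discussed above.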


SEBASTIAN MICHAEL:

To wind this down, the bonus question, so to speak, which I've asked of all my guests so far, is – and you mentioned just now that Shakespeare is of his era the one playwright, the one author, the one poet, of whom we have most – where do you stand on the ‘Well, is it Shakespeare? Who is Shakespeare? Was Shakespeare Shakespeare? Or was he somebody else?’ debate? Do you have an opinion on that? Do you find it worthwhile pursuing, or what would your answer be to the question, was it Shakespeare? Did he exist? Is the Shakespeare we think of as Shakespeare an entity that we are entitled to think of as an individual in the way that we do?


GABRIEL EGAN:

I think we are. We're entitled to think of him as undoubtedly an actual man. He's a human being who lived. We are certain about when he was born to within a few days, and when he died to within a few days. He is the man from Stratford-upon-Avon, and we're entitled to be certain about that, to the same extent that we can be certain about that for anybody. That is, we'd have to give up on speaking about any writer, because for that not to be the case, the conspiracy would have to be of such vast proportions that actually everything else I think about the nature of reality would start to be questionable. That is, I would have to start questioning, you know, well, did we go to the Moon or was that faked as well? There's such a mountain of evidence.

And I hear people sometimes say, well, there's actually not much evidence about Shakespeare, the writer, but the books themselves are the evidence. It takes us on a circular logic to suggest that the books, all those title pages saying ‘by William Shakespeare’ are some kind of forgery. That's not impossible. It's of an extremely low order of likelihood that so many people would manage to pull off such a vast conspiracy for no good reason.

Years ago, when I worked at the Globe Theatre, the Artistic Director of The Globe, Mark Rylance, used to pin on the wall of the Green Room one of the newsletters of one of the anti-Stratfordian societies. And it often had paragraphs that were really just rhetorical questions: If the man from Stratford was this great playwright, why did he leave us no books? Where's his library? And I once thought, I can't let this go, and I annotated, I wrote the answer, well, actually, lots of great writers of the period didn't leave us any library. That's not an unusual fact. I named a few writers who we don't dispute they're writers, but they didn't leave us a library. And then I started doing more and more. And then one day he caught me. Rylance caught me. I said, I hope you don't mind. Oh, no. No, please: I've been fascinated by the answers. So he's willing to engage with the discussion. But for me it’s: the size of the conspiracy would have to be so extraordinary to fake all those title pages for no good reason, so that I don't get particularly short-tempered with it. But maybe if I had listened to it as long as Stanley Wells has, I would share… – because sometimes, I've seen him get short-tempered about having to engage, or rather, the futility of his engagement, because I've heard him debate this topic with those who don't agree, he’s been spectacularly convincing, but they still come back the next week with more objections.

But I'm not short-tempered, I'm quite happy to discuss it with people because, and I think for me, you can't evade this by saying authorship doesn't matter, authorship does matter, who wrote what really matters. And for me, there's actually beyond just the irritation about that you'd have to ignore a mountain of evidence, there's also, I think, the immorality of denying a writer the credit for their achievement.

I think actually writing is a labour, intellectual labour is work, and we must credit people with their works. There's a dignity to the achievement that must be respected. And as an editor and as a critic, I feel an obligation to actually uphold that to the best of my understanding of what it is. So it's not a veneration. It's according due respect to the achievement of someone who did this work.


SEBASTIAN MICHAEL:

Which really serves as a fantastic way of closing the conversation. I want to thank you very much for taking the time and for talking to us in such great depth about computational approaches. This has been a truly, truly remarkable afternoon for me. Thank you very much, Professor Gabriel Egan.

©2022-25  |   SONNETCAST – WILLIAM SHAKESPEARE'S SONNETS RECITED, REVEALED, RELIVED
​
    • Sonnet 22: My Glass Shall Not Persuade Me I Am Old
    • Sonnet 23: As an Unperfect Actor on the Stage
    • Sonnet 24: Mine Eye Hath Played the Painter and Hath Stelled
    • Sonnet 25: Let Those Who Are in Favour With Their Stars
    • Sonnet 26: Lord of My Love to Whom in Vassalage
    • Sonnet 27: Weary With Toil, I Haste Me to My Bed
    • Sonnet 28: How Can I Then Return in Happy Plight
    • Sonnet 29: When in Disgrace With Fortune and Men's Eyes
    • Sonnet 30: When to the Sessions of Sweet Silent Thought
    • Sonnet 31: Thy Bosom Is Endeared With All Hearts
    • Sonnet 32: If Thou Survive My Well-Contented Day
    • Sonnet 33: Full Many a Glorious Morning Have I Seen
    • Sonnet 34: Why Didst Thou Promise Such a Beauteous Day
    • Sonnet 35: No More Be Grieved at That Which Thou Hast Done
    • Sonnet 36: Let Me Confess That We Two Must Be Twain
    • Sonnet 37: As a Decrepit Father Takes Delight
    • Sonnet 38: How Can My Muse Want Subject to Invent
    • Sonnet 39: O How Thy Worth With Manners May I Sing
    • Sonnet 40: Take All My Loves, My Love, Yea Take Them All
    • Sonnet 41: Those Pretty Wrongs That Liberty Commits
    • Sonnet 42: That Thou Hast Her, it Is Not All My Grief
    • Sonnet 43: When Most I Wink, Then Do Mine Eyes Best See
    • Sonnet 44: If the Dull Substance of My Flesh Were Thought
    • Sonnet 45: The Other Two, Slight Air and Purging Fire
    • Sonnet 46: Mine Eye and Heart Are at a Mortal War
    • Sonnet 47: Betwixt Mine Eye and Heart a League Is Took
    • Sonnet 48: How Careful Was I When I Took My Way
    • Sonnet 49: Against That Time, if Ever That Time Come
    • Sonnet 50: How Heavy Do I Journey on the Way
    • Sonnet 51: Thus Can My Love Excuse the Slow Offence
    • Sonnet 52: So Am I as the Rich, Whose Blessed Key
    • Sonnet 53: What Is Your Substance, Whereof Are You Made
    • Sonnet 54: O How Much More Doth Beauty Beauteous Seem
    • Sonnet 55: Not Marble, Nor the Gilded Monuments
    • Sonnet 56: Sweet Love, Renew Thy Force, Be it Not Said
    • Sonnet 57: Being Your Slave, What Should I Do But Tend
    • Sonnet 58: That God Forbid That Made Me First Your Slave
    • Sonnet 59: If There Be Nothing New, But That Which Is
    • Sonnet 60: Like as the Waves Make Towards the Pebbled Shore
    • Sonnet 61: Is it Thy Will Thy Image Should Keep Open
    • Sonnet 62: Sin of Self-Love Possesseth All Mine Eye
    • Sonnet 63: Against My Love Shall Be as I Am Now
    • Sonnet 64: When I have Seen by Time's Fell Hand Defaced
    • Sonnet 65: Since Brass, Nor Stone, Nor Earth, Nor Boundless Sea
    • Sonnet 66: Tired With All These, for Restful Death I Cry
    • Sonnet 67: Ah, Wherefore With Infection Should He Live
    • Sonnet 68: Thus Is His Cheek the Map of Days Outworn
    • Sonnet 69: Those Parts of Thee That The World's Eye Doth View
    • Sonnet 70: That Thou Are Blamed Shall Not Be Thy Defect
    • Sonnet 71: No Longer Mourn for Me When I Am Dead
    • Sonnet 72: O Lest the World Should Task You to Recite
    • Sonnet 73: That Time of Year Thou Mayst in Me Behold
    • Sonnet 74: But Be Contented When That Fell Arrest
    • Sonnet 75: So Are You to My Thoughts as Food to Life
    • Sonnet 76: Why Is My Verse so Barren of New Pride
    • Sonnet 77: Thy Glass Will Show Thee How Thy Beauties Wear
    • Sonnet 78: So Oft Have I Invoked Thee for My Muse
    • Sonnet 79: Whilst I Alone Did Call Upon Thy Aid
    • Sonnet 80: O How I Faint When I of You Do Write
    • Sonnet 81: Or I Shall Live Your Epitaph to Make
    • Sonnet 82: I Grant Thou Wert Not Married to My Muse
    • Sonnet 83: I Never Saw That You Did Painting Need
    • Sonnet 84: Who Is it That Says Most, Which Can Say More
    • Sonnet 85: My Tongue-Tied Muse in Manners Holds Her Still
    • Sonnet 86: Was it the Proud Full Sail of His Great Verse
    • Sonnet 87: Farewell, Thou Art Too Dear for My Posessing
    • Sonnet 88: When Thou Shalt Be Disposed to Set Me Light
    • Sonnet 89: Say That Thou Didst Forsake Me for Some Fault
    • Sonnet 90: Then Hate Me When Thou Wilt, if Ever, Now
    • Sonnet 91: Some Glory in Their Birth, Some in Their Skill
    • Sonnet 92: But Do Thy Worst to Steal Thyself Away
    • Sonnet 93: So Shall I Live, Supposing Thou Art True
    • Sonnet 94: They That Have Power to Hurt and Will Do None
    • Sonnet 95: How Sweet and Lovely Dost Thou Make the Shame
    • Sonnet 96: Some Say Thy Fault Is Youth, Some Wantonness
    • Sonnet 97: How Like a Winter Hath my Absence Been
    • Sonnet 98: From You Have I Been Absent in the Spring
    • Sonnet 99: The Forward Violet Thus Did I Chide
    • Sonnet 100: Where Art Thou, Muse, That Thou Forgetst so Long
    • Sonnet 101: O Truant Muse, What Shall Be Thy Amends
    • Sonnet 102: My Love Is Strengthened Though More Weak in Seeming
    • Sonnet 103: Alack, What Poverty My Muse Brings Forth
    • Sonnet 104: To Me, Fair Friend, You Never Can Be Old
    • Sonnet 105: Let Not My Love Be Called Idolatry
    • Sonnet 106: When in the Chronicle of Wasted Time
    • Sonnet 107: Not Mine Own Fears Nor the Prophetic Soul
    • Sonnet 108: What's in the Brain That Ink May Character
    • Sonnet 109: O Never Say That I Was False of Heart
    • Sonnet 110: Alas, 'Tis True I Have Gone Here and There
    • Sonnet 111: O For My Sake Do You With Fortune Chide
    • Sonnet 112: Your Love and Pity Doth Th'Impression Fill
    • Sonnet 113: Since I Left You, Mine Eye Is in My Mind
    • Sonnet 114: Or Whether Doth My Mind, Being Crowned With You
    • Sonnet 115: Those Lines That I Before Have Writ Do Lie
    • Sonnet 116: Let Me Not to the Marriage of True Minds
    • Sonnet 117: Accuse Me Thus, That I Have Scanted All
    • Sonnet 118: Like as to Make Our Appetites More Keen
    • Sonnet 119: What Potions Have I Drunk of Siren Tears
    • Sonnet 120: That You Were Once Unkind Befriends Me Now
    • Sonnet 121: Tis Better to Be Vile Than Vile Esteemed
    • Sonnet 122: Thy Gift, Thy Tables, Are Within My Brain
    • Sonnet 123: No! Time, Thou Shalt Not Boast That I Do Change
    • Sonnet 124: If My Dear Love Were But the Child of State
    • Sonnet 125: Were't Aught to Me I Bore the Canopy
    • Sonnet 126: O Thou, My Lovely Boy, Who in Thy Power
    • Sonnet 127: In the Old Age Black Was Not Counted Fair
    • Sonnet 128: How Oft When Thou, My Music, Music Playst
    • Sonnet 129: Th'Expense of Spirit in a Waste of Shame
    • Sonnet 130: My Mistress' Eyes Are Nothing Like the Sun
    • Sonnet 131: Thou Art as Tyrannous, so as Thou Art
    • Sonnet 132: Thine Eyes I love, and They, as Pitying Me
    • Sonnet 133: Beshrew That Heart That Makes My Heart to Groan
    • Sonnet 134: So Now I Have Confessed That He Is Thine
    • Sonnet 135: Whoever Hath Her Wish, Thou Hast Thy Will
    • Sonnet 136: If Thy Soul Check Thee That I Come so Near
    • Sonnet 137: Thou Blind Fool Love, What Dost Thou to Mine Eyes
    • Sonnet 138: When My Love Swears That She Is Made of Truth
    • Sonnet 139: O Call Not Me to Justify the Wrong
    • Sonnet 140: Be Wise as Thou Art Cruel, Do Not Press
    • Sonnet 141: In Faith, I Do Not Love Thee With Mine Eyes
    • Sonnet 142: Love Is My Sin, and Thy Dear Virtue Hate
    • Sonnet 143: Lo! As a Careful Housewife Runs to Catch
    • Sonnet 144: Two Loves I Have of Comfort and Despair
    • Sonnet 145: Those Lips That Love's Own Hand Did Make
    • Sonnet 146: Poor Soul, the Centre of My Sinful Earth
    • Sonnet 147: My Love Is as a Fever, Longing Still
    • Sonnet 148: O Me! What Eyes Hath Love Put in My Head
    • Sonnet 149: Canst Thou, O Cruel, Say I Love Thee Not
    • Sonnet 150: O From What Power Hast Thou This Powerful Might
    • Sonnet 151: Love Is too Young to Know What Conscience Is
    • Sonnet 152: In Loving Thee Thou Knowst I Am Forsworn
    • Sonnet 153: Cupid Laid by His Brand and Fell Asleep
    • Sonnet 154: The Little Love-God, Lying Once Asleep
  • THE SONNETEER
  • EVENTS
  • TEXT NOTE
  • CONTACT
    • SUBSCRIBE