Group Abstract

Message Boards

WOLFRAM COMMUNITY

21.8K Views

3 Replies

10 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Staff Picks Recreation Dynamic Interactivity Graphs and Networks Wolfram Language Computational Linguistics Natural Language Processing Computational Humanities

Shakespearean Sonnets' rhymes analysis

Thales Fernandes

Thales Fernandes, Centre de Biochimie Structurale (CBS) de Montpellier

Posted 7 years ago

Shakespearean sonnets are composed with the rhyme scheme ABAB CDCD EFEF GG , which means that each verse with the same label needs to rhyme. An example is the famous sonnet 18: A: Shall I compare thee to a summers day? B: Thou art more lovely and more temperate. A: Rough winds do shake the darling buds of May, B: And summers lease hath all too short a date. C: Sometime too hot the eye of heaven shines, D: And often is his gold complexion dimmed; C: And every fair from fair sometime declines, D: By chance or natures changing course untrimmed. E: But thy eternal summer shall not fade, F: Nor lose possession of that fair thou owst, E: Nor shall Death brag thou wandrest in his shade, F: When in eternal lines to time thou growst. G: So long as men can breathe or eyes can see, G: So long lives this, and this gives life to thee. To analyse the rhymes we first import it into Mathematica: sonnets = Select[StringTrim@StringSplit[#, "\n"] & /@ StringSplit[ToLowerCase@Import@"Shakespeare's Sonnets.txt", "\n\n"], Length@# == 14 &]; The file is attached below and we select only sonnets with 14 verses (there are 2 outside this pattern). The next step is the select the last word of each verse and remove punctuation: lastWords = Map[Last@StringSplit, sonnets, {2}] //. s_String :> StringReplace[s, { RegularExpression@"[,.;:!?-]$" -> "", RegularExpression@"'(.+)'" -> "$1", RegularExpression@"(.+)[,.;:!?]'" -> "$1"} ]; The code is a bit awkward, but it works... Using the rhyme scheme, we pair the words that rhyme to form a graph: data = Union@Flatten@Table[Thread[lastWords[[All, i]] \[DirectedEdge] lastWords[[All, i+If[i==13,1,2]]]], {i, {1,2,5,6,9,10,13}}]; We are now ready to plot a graph of the word rhymes and segment each graph since most of them are disjoint. (g=Graph[data]) // WeaklyConnectedComponents; Select[Union /@ %, Length@# > 6 &] // Reverse@SortByLength; Manipulate[Subgraph[g, %[[i]], VertexLabels -> "Name", ImageSize -> 600], {i, 1, Length@%, 1}] In the graph bellow you can see that key ([kee]) rhymes with survey (ser-vey) in Shakespeare time, hence it was probably pronouced [key]. A more in deep analysis of this subject can be found in the NativLang video What Shakespeare's English Sounded Like - and how we know. Attachments: Shakespeare's Sonnets.txt

POSTED BY: Thales Fernandes

3 Replies

Sort By:

EDITORIAL BOARD

EDITORIAL BOARD, WOLFRAM

Posted 7 years ago

- Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming!

POSTED BY: EDITORIAL BOARD

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 7 years ago

Thales, this is quite nice! BTW I think you did not define `SortByLength`. I thought I should recreate this in a bit different manner. Note various places, such as using `TextCases` and `TextWords` to simplify regular expressions processing. I started from resource data - we can search for availability: ResourceSearch["Shakespeares Sonnets"] Then download the data: raw = ResourceData["Shakespeare's Sonnets"]; This will split into separate sonnets and delete noise such as roman numeral for sonnet number: sonnets = Select[TextCases[raw, "Paragraph"], StringLength[#] > 10 &]; So we have `Length@sonnets` 154 sonnets. Some do not have 14 lines In[]:= Position[Length/@StringSplit[sonnets,"\n"],x_/;x!=14] Out[]= {{39},{99},{126}} so we delete them: sonnets14 = DeleteCases[sonnets, Alternatives @@ Extract[sonnets, {{39}, {99}, {126}}]]; Find last words in every line: lastWords = Map[Last[TextWords[#]] &, StringSplit[sonnets14, "\n"], {2}] and following the rhyming pattern you suggested: rhymePatt = {1, 3, 2, 4, 5, 7, 6, 8, 9, 11, 10, 12, 13, 14}; we can build the graph (I chose undirected graph, for alternative sake): g = Graph[UndirectedEdge @@@ Flatten[Partition[#[[rhymePatt]], 2] & /@ lastWords, 1]] And finally computing subgraphs subGra = Subgraph[g, #, VertexLabels -> "Name", PlotTheme -> "Minimal", BaseStyle -> Opacity[.4], ImageSize -> 500 {1, 1}] & /@ ConnectedComponents[g]; and building the app - I kept multiple edges to see what is more frequently used:

POSTED BY: Vitaliy Kaurov

Thales Fernandes

Thales Fernandes, Centre de Biochimie Structurale (CBS) de Montpellier

Posted 7 years ago

Thank you, Vitaliy. It's always so nice to see our code being rewritten in another form (in a far more understandable way, I might add). I might read the doc about `ResourceSearch` again and try to use it more in the future. `PlotTheme -> "Minimal"` gives a really pretty result! BTW I think you did not define SortByLength. It's part of a collection of functions I have defined. I have been using these so much that I forgot they weren't part of Mathematica. I might create a post sometime with some handy custom functions.

POSTED BY: Thales Fernandes

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback