Despite the plethora of historical resources available, the study of mathematics remains divorced from a discussion of the stories behind its most creative minds. However, tracing the history of any academic area, especially from curated third-person narratives, raises some basic questions. What should be the scope of the discipline? Is there an overlap with related areas that has shaped its trajectory? Are we truly analyzing the details of the mathematicians' inner lives, or are we limited to an interpretation of their lives that is suited to our times? Can a straightforward chronological account sufficiently describe the development of a living discipline?
In an effort to illuminate the lineage of discovery in mathematics, I decided to focus on the landmarks where significant intellectual progress was made, as recorded in the detailed biographical accounts that have survived. The 2,7775 mathematicians included in my study were collected from MacTutor, a web-based archive of the history of mathematics maintained by the University of St. Andrews, and biography data from both MacTutor and Wikipedia were analyzed separately. My project uses computational methods such as text mining and social network analysis to describe the biographies of mathematicians from 1680 BC to the present day. In some sense, this is a simplified, but nonetheless rewarding, quest to understand mathematics using mathematics. I want to thank my mentor @Jofre Espigule for being wholly supportive throughout the summer school, and for keeping sight of the vision for this project from the outset.
Constructing the dataset
The full chronology of mathematicians was downloaded from MacTutor.
dataset =
Import["http://www-groups.dcs.st-and.ac.uk/history/Indexes/Full_\
Chron.html", {"Hyperlinks"}];
dataset1 =
Flatten[StringCases[dataset,
"http://www-groups.dcs.st-and.ac.uk/history/Biographies/" ~~ __]];
dataset2 = Import[#, "Text"] & /@ dataset1
Then, the HTML text from each biography was parsed.
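TemplateApply; (* referenced only so that the Templating package is loaded before its private entity table is used *)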
$mapping =
Reverse[Normal[
Templating`HTML`PackagePrivate`$SafeHTMLEntities], {2}];
htmlToText[txt_String] := StringReplace[txt, $mapping]
htmlBioList = Map[htmlToText, dataset2]
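To see what this entity substitution does, here is a minimal sketch that uses a small hand-written rule list in place of the internal entity table. The names toyMapping and toyHtmlToText, the three rules and the sample string are made up for illustration.

```wolfram
(* Hypothetical, hand-written subset of an HTML entity table *)
toyMapping = {"&ouml;" -> "ö", "&eacute;" -> "é", "&amp;" -> "&"};

(* Same idea as htmlToText: replace every entity by its character *)
toyHtmlToText[txt_String] := StringReplace[txt, toyMapping]

toyHtmlToText["Erd&ouml;s &amp; R&eacute;nyi"]
(* "Erdös & Rényi" *)
```

Only entities are replaced in this step, so the HTML tags stay in the text, which is what the later string patterns rely on.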
Afterwards, the birth date and birthplace of each mathematician were extracted using string patterns, converted into date objects and geographic locations respectively, and exported as MX files.
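(* Pull the text between the "Born:" label and the next <br> in each biography *)
birthdate = Flatten /@ StringCases[htmlBioList, "Born:<font color=\"green\"> " ~~ Shortest[y__] ~~ "<br>" -> y]
inChecker[{string_, ___}] := If[StringContainsQ[string, " in"], StringReplace[string, Shortest[y__] ~~ " in" ~~ ___ :> y], string]
inChecker[{}] := Missing[]
(* Assumed missing step: completebirthDates is not defined anywhere in the post *)
completebirthDates = inChecker /@ birthdate
birthDateObjects = Interpreter["Date"][completebirthDates]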
countryList =
StringSplit[
DeleteStopwords[
Take[StringSplit[Flatten@birthdate, " in "][[All, -1]], -1]]][[
All, -1]]
geoPositions = Interpreter["Location" | "Country"][countryList]
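As a rough illustration of the string patterns above, the following sketch runs the same "Born:" pattern on a single made-up line and splits the result into a date part and a place part. The sample string only imitates the page layout and is not copied from MacTutor. The variable names are hypothetical.

```wolfram
(* Made-up snippet imitating the "Born:" line of a biography page *)
sample = "Born:<font color=\"green\"> 23 March 1882 in Erlangen, Bavaria, Germany<br>Died: ...";

(* Same pattern as above: everything up to the first <br> *)
born = First@StringCases[sample,
   "Born:<font color=\"green\"> " ~~ Shortest[y__] ~~ "<br>" -> y]
(* "23 March 1882 in Erlangen, Bavaria, Germany" *)

(* Split once at " in " to separate the date from the place *)
{datePart, placePart} = StringSplit[born, " in ", 2]
(* {"23 March 1882", "Erlangen, Bavaria, Germany"} *)
```

The two parts could then be passed to Interpreter["Date"] and Interpreter["Location" | "Country"] as in the code above.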
Then, the full text of the biographies was analyzed to create a list of rules called vertexList, in which each rule points from a mathematician to another mathematician who is referenced in their biography.
bioReferences =
StringCases[htmlBioList,
"../Mathematicians/" ~~ Shortest[a__] ~~ ".html" -> a];
uniqueReference = Map[Union, bioReferences]
allNames =
Flatten@StringCases[
Import["http://www-groups.dcs.st-and.ac.uk/history/Indexes/Full_\
Chron.html", {"Hyperlinks"}],
"/Biographies/" ~~ Shortest[a__] ~~ ".html" -> a];
vertexList =
Flatten@ MapThread[
Thread[Rule[#1 , #2]] &, {allNames, uniqueReference}]
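The edge construction can be checked on a toy example. This is a minimal sketch with invented names and reference lists, using the same MapThread and Thread idiom as vertexList.

```wolfram
(* Hypothetical names and, for each name, the people their biography links to *)
toyNames = {"A", "B", "C"};
toyRefs = {{"B", "C"}, {"A"}, {}};

(* Thread turns "A" -> {"B", "C"} into one rule per referenced person *)
toyEdges = Flatten@MapThread[Thread[Rule[#1, #2]] &, {toyNames, toyRefs}]
(* {"A" -> "B", "A" -> "C", "B" -> "A"} *)
```

A biography with no references, like "C" here, contributes no edges, so such mathematicians only appear in the graph if someone else refers to them.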
The biography data of female mathematicians was collected for a separate analysis.
femaleDataset =
Import["http://www-history.mcs.st-and.ac.uk/Indexes/Women.html", \
{"Hyperlinks"}]
femaleDataset1 =
Flatten[StringCases[femaleDataset,
"http://www-history.mcs.st-and.ac.uk/Biographies/" ~~ __]];
femaleDataset2 = Import[#, "Text"] & /@ femaleDataset1
femHtmls = Map[htmlToText, femaleDataset2]
allfemNames =
Flatten@StringCases[
Import["http://www-history.mcs.st-and.ac.uk/Indexes/Women.html", \
{"Hyperlinks"}], "/Biographies/" ~~ Shortest[a__] ~~ ".html" -> a]
fembioReferences =
StringCases[femHtmls,
"../Mathematicians/" ~~ Shortest[a__] ~~ ".html" -> a];
femuniqueReference = Map[Union, fembioReferences]
femvertexList =
Flatten@ MapThread[
Thread[Rule[#1 , #2]] &, {allfemNames, femuniqueReference}]
A corresponding list of all mathematicians' biographies was collected from Wikipedia using SPARQL queries of IDs collected from MacTutor.
Thanks to @Aaron Enright for sharing the code to complete this task!
enWikiGet[mgpid_String] :=
With[{res =
Import["https://query.wikidata.org/sparql?query=" <>
URLEncode["SELECT ?wikipedia_article
WHERE
{
?person wdt:P1563 \"" <> mgpid <> "\" .
?wikipedia_article schema:about ?person .
?wikipedia_article schema:isPartOf \
<https://en.wikipedia.org/> .
}"] <> "&format=json", "RawJSON"]},
Replace[res["results", "bindings"][[All, "wikipedia_article",
"value"]], {s_String} :>
URLDecode@
StringReplace[s, "https://en.wikipedia.org/wiki/" -> ""]]]
wikiNames = Map[enWikiGet, allNames];
allWikiHtmls =
A second list of rules was extracted from mathematicians who are cited in each other's Wikipedia biographies.
wikiHtmlsasoc = AssociationThread[wikiNames, allWikiHtmls];
allWikiHtmlsFiltered = Select[wikiHtmlsasoc, StringQ];
wikiNameAsoc =
AssociationThread[Keys[allWikiHtmlsFiltered],
Map[Intersection[#, wikiNames] &,
StringCases[Values[allWikiHtmlsFiltered],
"/wiki/" ~~ Shortest[n__] ~~ "\"" -> n]]];
vertexList2 =
Flatten@ MapThread[
Thread[Rule[#1 , #2]] &, {Keys[wikiNameAsoc],
Values[wikiNameAsoc]}];
Finally, a list of mental health disorders (called mentalDisorders) was cross-referenced against the MacTutor biographies. The code was also run separately on the Wikipedia biography dataset.
listOfDisorders =
ToLowerCase[
StringCases[Values[allWikiHtmlsFiltered], mentalDisorders,
IgnoreCase -> True]];
wikiAsocNames = Keys[allWikiHtmlsFiltered]
wikiMentalDiseases =
Map[# -> wikiAsocNames[[
Flatten@Position[
listOfDisorders, {___, ToLowerCase[#], ___}] /. {{x_}} ->
x]] &, mentalDisorders];
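The keyword lookup can be sketched on a toy scale as follows. The biographies and terms here are invented, and this version uses Select on an association rather than the Position-based code above, so it shows the idea rather than the exact method used in the project.

```wolfram
(* Invented mini-biographies keyed by (hypothetical) name *)
toyBios = <|
   "A" -> "He suffered from Depression in later life.",
   "B" -> "She enjoyed hiking.",
   "C" -> "Bouts of insomnia and depression followed."|>;
toyTerms = {"depression", "insomnia"};

(* For each term, list the names whose text contains it, ignoring case *)
toyIndex = Map[
  Function[term,
   term -> Keys[Select[toyBios, StringContainsQ[#, term, IgnoreCase -> True] &]]],
  toyTerms]
(* {"depression" -> {"A", "C"}, "insomnia" -> {"C"}} *)
```

Plain substring matching like this cannot tell whether a term describes the mathematician or someone else mentioned in the biography, so the resulting lists are best read as candidates for manual checking.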
Frequency of Mathematicians Over Time
A date histogram was plotted showing the number of mathematicians against their year of birth. A log scale was used so that a large range could be displayed without small values being compressed into the bottom of the graph.
The first known mathematician in the dataset is Ahmes, an ancient Egyptian scribe born in 1680 BC who wrote the famous Rhind Papyrus. The next written record of mathematical work appears nearly 800 years later, in the form of geometrical calculations for fire-altar construction contained in the Indian Sulbasutras. Since then, the number of mathematicians has shown a steady increase, with sudden drops in the 2nd and 8th centuries. A plausible reason could be the rise of intense religious conflicts: first the banishment of Jews from Palestinian land by the Romans in the 2nd century, and later frequent battles between religious groups and the expansion of Islam in the 8th century, leaving little time and few resources to support a societal ecosystem for the growth of mathematics. However, we shouldn't be too quick to draw conclusions about the impact of war on science. From a previous analysis of the Math Genealogy Project, we know that after Sputnik's launch in the 1950s the US Congress drastically increased funding for education, which resulted in a significant increase in the number of mathematics PhDs.
Word Clouds
All Biographies
Female Biographies
The above images show the word frequencies in the biographies of all mathematicians and of female mathematicians only. It is interesting to observe words such as 'daughter', 'husband' and 'family' in the latter, suggesting that the lives of female mathematicians, or at least the way they were recorded, were shaped by patriarchal gender roles. By contrast, the only family-related word that appears often in the word cloud of all biographies is 'father'.
MacTutor Social Network
The list of associations in vertexList is visualized using the built-in Graph function. There are 24385 references in total.
g = Graph[vertexList, VertexLabels -> Placed["Name", Tooltip]]
It is quite fascinating that the network of all mathematicians resembles the quintessential symbol of a heart, a testament to the very human intellectual relationships that drive formal inquiry.
Using the BetweennessCentrality function, we can compute the most critical node, i.e. the mathematician who most often lies on the shortest paths between other mathematicians across all biographies. Interestingly, in the MacTutor biographies, our investigation yields Euclid.
critical = BetweennessCentrality[g];
criticalNode = Pick[VertexList[g], critical, Max[critical]]
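Betweenness centrality ranks a vertex by how many shortest paths run through it, which need not coincide with being cited most often. The following sketch, on an invented toy graph, applies the same Pick idiom as above and compares it with the vertex that has the most incoming references.

```wolfram
(* Invented graph: "X" is cited by three people, "Q" sits between "P" and "R" *)
toy = Graph[{"A" -> "X", "B" -> "X", "C" -> "X", "P" -> "Q", "Q" -> "R"}];

(* Vertex with the highest betweenness: only "Q" lies on a shortest path *)
bc = BetweennessCentrality[toy];
Pick[VertexList[toy], bc, Max[bc]]
(* {"Q"} *)

(* Vertex with the most incoming references *)
deg = VertexInDegree[toy];
Pick[VertexList[toy], deg, Max[deg]]
(* {"X"} *)
```

On the real network the two measures may well agree, but comparing them is a cheap sanity check on how the critical node is interpreted.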
This illustrates an important insight: mathematics is a study of abstract systems that is contingent on the abstract systems studied before. In other words, the trajectory of mathematical discovery is largely shaped by early works in mathematics. One can conjecture how the discipline would have progressed were it not for Euclid's Elements, but of course one doesn't have a basis for comparison. Our exploration of the mathematical universe is ultimately tied to very circumstantial elements such as time and place, a point we will revisit later on.
MacTutor Female Mathematicians
Let's first visualize the frequency of female mathematicians over time.
In total, there are far fewer female mathematicians: 5 for every 100 mathematicians. Promisingly, the numbers have seen a slow but steady increase over time.
The first recorded female mathematician was Hypatia of Alexandria, a prominent astronomer and polymath who lived in late antiquity, around the turn of the 5th century AD. She was often consulted by her contemporaries regarding the construction of astronomical devices such as the astrolabe and hydroscope. In addition, Hypatia helped her father, who was also a mathematician, revise numerous texts including Ptolemy's Almagest and Euclid's Elements. At a time of considerable tension between Hellenistic ideals and Christianity, Hypatia was a beacon of tolerance and taught many Christian students.
The following network visualizes female mathematicians marked in red.
f = Graph[femvertexList, VertexLabels -> Placed["Name", Tooltip]]
HighlightGraph[f, Subgraph[f, allfemNames ]]
Nearly all female nodes are connected with male nodes, with very few connections between female mathematicians. In fact, the only complete subgraph, computed using the built-in function FindClique, includes Mary Boole and her daughter, Alicia Boole Stott. Both women were self-taught mathematicians who made significant contributions to children's education and four-dimensional geometry respectively. Both women were also closely related to George Boole.
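For readers unfamiliar with FindClique, here is a minimal sketch on an invented undirected graph. It returns a largest set of vertices in which every pair is connected, which is the sense of "complete subgraph" used above.

```wolfram
(* Invented graph: a, b, c all know each other; d only knows c *)
toyNet = Graph[{
    UndirectedEdge["a", "b"], UndirectedEdge["b", "c"],
    UndirectedEdge["a", "c"], UndirectedEdge["c", "d"]}];

(* A largest clique: the triangle formed by a, b and c *)
clique = FindClique[toyNet]

(* Size of that clique *)
Length[First[clique]]
(* 3 *)
```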
Finally, the most connected female mathematician was Emmy Noether (23 March 1882 – 14 April 1935), a German mathematician who is known for her work in abstract algebra and theoretical physics.
Wikipedia Social Network
Here, we visualize the social network of mathematicians who are referenced in each other's Wikipedia biographies. Compared to the 24385 edges in the MacTutor network, the Wikipedia network contains 30541 edges. It appears that Wikipedia biographies are more richly cross-referenced.
The highlighted section is computed by FindClique and contains a subgraph comprising all the ancient Greco-Roman mathematicians. I found this quite fascinating because the clique reflects something fundamental about how we have accumulated knowledge about that period of history: since there are relatively few written records, all that we know about these ancient mathematicians is gleaned from a few surviving texts.
In contrast to the MacTutor network, the critical node in the Wikipedia network is Isaac Newton.
Prevalence of Mental Health Disorders
This portion of the project is mainly inspired by Logicomix, a graphic novel depicting Bertrand Russell's quest for absolute truth which features legendary thinkers like Gottlob Frege, David Hilbert, Kurt Gödel and Ludwig Wittgenstein. Logicomix explores the philosophical struggles of these mathematicians who inexplicably descended towards madness in their personal intellectual journeys.
The most common mental health disorder among mathematicians across the Wikipedia dataset is depression. The finding is hardly surprising given the general prevalence of depression.
Drug Use
As a fun exercise, I also ran a list of drugs against the Wikipedia biographies and found some interesting results.
If you are curious about the names of these mathematicians, you can find the lists attached below.
Geographical Distribution of Mathematicians Across History
Finally, I present a GIF of the development of mathematics around the world. Please note that the color function is scaled cumulatively.
I am convinced that without the stories of great mathematicians, their history and their creative process, our understanding of mathematics is incomplete. After all, the study of mathematics has historically been tied to real geographic locations on our planet, not exclusively to an esoteric world of ideas.
Mathematics is a wonderful intersection of the two, the spirit of which is best embodied in a quote by science writer and polymath Mary Somerville (26 December 1780 – 29 November 1872) that I came across during my research:
"Sometimes I find [mathematical problems] difficult, but my old obstinacy remains, for if I do not succeed today, I attack them again on the morrow."
Attachments: