Analytics of Republican Debate and network percolation

Posted 10 years ago
POSTED BY: Vitaliy Kaurov
15 Replies
POSTED BY: Alan Joyce

Hi,

there are two more little things to add. Vitaliy has this fantastic post on measuring interest in the conflicts in Syria and Ukraine. We can of course use the same technique to study people's interest in the presidential candidates:

fullnames = {{"Donald", "TRUMP"}, {"Jeb", "BUSH"}, {"Scott", "WALKER"}, {"Marco", "RUBIO"}, {"Chris", "CHRISTIE"}, {"Ben", 
    "CARSON"}, {"Rand", "PAUL"}, {"Ted", "CRUZ"}, {"Mike", "HUCKABEE"}, {"John", "KASICH"}, {"Carly", "FIORINA"}};
data = WolframAlpha[#[[1]] <> " " <> #[[2]], {{"PopularityPod:WikipediaStatsData", 1}, "ComputableData"}] & /@ fullnames;

DateListPlot[data, PlotRange -> All, PlotTheme -> "Detailed", AspectRatio -> 1/4, ImageSize -> 800, PlotLegends -> fullnames[[All, 2]]]

enter image description here

It would now be interesting to identify what the peaks mean. Some are more obvious than others, but I do not have a neat, automated way to identify the events that cause these peaks. Vitaliy, I think that in your post you identified the peaks "manually". There are websites, such as Wikipedia, that list important events for most days, but that data does not appear to suffice to identify peaks at this level of detail automatically. Do you have any idea how to automate that?
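
As a partial step, locating the peaks themselves (not the events behind them) can probably be automated with FindPeaks. The following is only a sketch under assumptions: the hit counts, the threshold of 300, and the start date are made up for illustration and are not taken from the Wikipedia data above.

```wolfram
(* Hypothetical daily hit counts; made-up numbers for illustration only *)
hits = {120, 130, 125, 900, 140, 135, 128, 131, 610, 138, 126};

(* FindPeaks[list, sigma, s, t]: no smoothing (sigma = 0), no sharpness
   constraint (s = 0), keep only peaks whose value exceeds t = 300 *)
peaks = FindPeaks[hits, 0, 0, 300];

(* Convert peak positions to calendar dates, assuming the series starts on 1 Aug 2015 *)
startDate = {2015, 8, 1};
peakDates = DatePlus[startDate, # - 1] & /@ peaks[[All, 1]]
```

The resulting dates could then be checked against a list of events; that lookup is the part that remains manual here.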

Another thing is that we could draw an angle path from the sentiment list. This looks like so:

candidates = {"TRUMP", "BUSH", "WALKER", "RUBIO", "CHRISTIE", "CARSON", "PAUL", "CRUZ", "HUCKABEE", "KASICH", "FIORINA"};
sentimentlist = Table[-"Negative" + "Positive" /. ((Classify["Sentiment", #, "Probabilities"] & /@ #) &@TextSentences@Part[debateBySpeaker[#] & /@ candidates, k]), {k, 1, Length[candidates]}];
ListLinePlot[AnglePath[#] & /@ sentimentlist, PlotLegends -> candidates, ImageSize -> Large]

enter image description here

This plot is (relatively) easy to interpret: if the sentences are positive the curve bends left, otherwise right.
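
To see the bending rule in isolation, here is a minimal sketch with made-up turning angles (not the debate sentiment values). AnglePath treats each number as a turn in radians followed by a unit step, so positive values turn counterclockwise, which is left.

```wolfram
(* Made-up turning angles, standing in for per-sentence sentiment scores *)
leftPath = AnglePath[ConstantArray[0.2, 10]];    (* all positive: bends left *)
rightPath = AnglePath[ConstantArray[-0.2, 10]];  (* all negative: bends right *)

(* The left-bending path ends above the x axis, the right-bending one below it *)
{Last[leftPath], Last[rightPath]}

ListLinePlot[{leftPath, rightPath}, PlotLegends -> {"positive", "negative"},
 AspectRatio -> Automatic]
```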

Cheers,

Marco

POSTED BY: Marco Thiel

Very interesting, Marco! The Wiki data can actually be used to reflect on what candidates people view as related. Again your data:

fullnames = {{"Donald", "TRUMP"}, {"Jeb", "BUSH"}, {"Scott", "WALKER"}, {"Marco", "RUBIO"}, 
                     {"Chris", "CHRISTIE"}, {"Ben", "CARSON"}, {"Rand", "PAUL"}, {"Ted", "CRUZ"}, 
                     {"Mike", "HUCKABEE"}, {"John", "KASICH"}, {"Carly", "FIORINA"}};

data = ParallelMap[WolframAlpha[#[[1]] <> " " <> #[[2]], 
{{"PopularityPod:WikipediaStatsData", 1}, "ComputableData"}] &, fullnames];

But I'll restrict the data to the period since January 2014, to be fair to the recent campaign, and use a log plot to see the details better:

recent = TimeSeriesWindow[#, {{2014, 1, 1}, Now}] &@ TemporalData[data];
DateListLogPlot[recent, PlotRange -> All, PlotTheme -> "Detailed", AspectRatio -> 1/4, 
 ImageSize -> 800, PlotLegends -> fullnames[[All, 2]]]

enter image description here

Let's compute the mutual correlation matrix. Note the trick of setting the diagonal to Infinity, which removes the self-edges in WeightedAdjacencyGraph.

m = Outer[Correlation, #, #, 1] &@ 
QuantityMagnitude[Normal[recent][[All, All, 2]]] (1 - IdentityMatrix[Length[fullnames]]) /. 0. -> Infinity;
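
The same construction on a toy input may make the trick easier to follow. This is only a sketch: the three short series are invented, and the names toySeries, toyCorr, toyM, and toyG are not part of the analysis above.

```wolfram
(* Three invented series with positive mutual correlations *)
toySeries = {{1., 2., 3., 4., 5.}, {2., 4., 5., 8., 11.}, {2., 1., 4., 3., 6.}};

(* Pairwise Pearson correlations between the series (level 1 of the list) *)
toyCorr = Outer[Correlation, toySeries, toySeries, 1];

(* Multiply elementwise by a zero-diagonal mask, then turn the resulting 0. entries into Infinity *)
toyM = toyCorr (1 - IdentityMatrix[Length[toySeries]]) /. 0. -> Infinity;

(* WeightedAdjacencyGraph reads Infinity as "no edge", so no vertex gets a self-loop *)
toyG = WeightedAdjacencyGraph[toyM];
LoopFreeGraphQ[toyG]
```

One caveat of this approach: an off-diagonal correlation that happened to be exactly 0. would also be replaced by Infinity, which is unlikely with real data but worth keeping in mind.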

Significant negative correlations are hard to get in such data, but positive values can be quite high:

m // Flatten // Sort

enter image description here

MatrixPlot[m, FrameTicks -> {Transpose[{Range[11], #}], Transpose[{Range[11], Rotate[#, Pi/2] & /@ #}]}, 
   ColorFunction -> "Rainbow"] &@fullnames[[All, 2]]

enter image description here

So we are getting a complete weighted graph:

g = WeightedAdjacencyGraph[m, VertexLabels -> Thread[Range[11] -> fullnames[[All, 2]]], 
   VertexSize -> "ClosenessCentrality", VertexStyle -> Opacity[.5]];

FindGraphCommunities still takes EdgeWeight into account:

comm = FindGraphCommunities[g]

{{1, 5, 6, 9, 10, 11}, {2, 3, 4, 7, 8}}

So I wonder if anyone with actual knowledge of politics can see some truth in this clustering:

CommunityGraphPlot[g, comm]

enter image description here

POSTED BY: Vitaliy Kaurov
POSTED BY: Marco Thiel

Right, we should fix that. In the meantime, it's kind of interesting to look more closely at the words I threw out of the earlier clouds, and the context in which they appear. For example, "we need" is such a common phrase in these debates, but what is it that each candidate thinks "we" need?

enter image description here

POSTED BY: Alan Joyce

Not so interesting. Check the earlier notebooks: I made a point of removing "people" and a handful of other words that were exceptionally common (across all candidates) in the context of the debates. The Democratic clouds would showcase more significant differences between the candidates if they did the same thing.

POSTED BY: Alan Joyce

This is just so much fun and informative; a timely use of current analytics.

POSTED BY: Drew Lesso

Another post of yours has been selected for the Staff Picks group, congratulations!

We are happy to see you at the top of the "Featured Contributor" board. Thank you for your wonderful contributions, and please keep them coming!

POSTED BY: EDITORIAL BOARD

"connecting them by the times their daily hits overlap"

Sort of, yes, but "overlap" is too broad a term. The measure of that is "Correlation", and it is exactly the name of the function used in the main block of code:

m = Outer[Correlation, #, #, 1] &@ 
QuantityMagnitude[Normal[recent][[All, All, 2]]] (1 - IdentityMatrix[Length[fullnames]]) 
/. 0. -> Infinity;
POSTED BY: Vitaliy Kaurov

So it's essentially visualizing spikes of traffic to each candidate's wiki page, connecting them by the times their daily hits overlap?

POSTED BY: Jonathan Wallace

Communities just group those candidates whose pages are viewed by the public more synchronously. The wiki data are hits per day, based on weekly averages of daily hits to the English-language page. That explanation can be seen on any W|A page under the wiki-data plot, for example: Donald Trump

enter image description here

POSTED BY: Vitaliy Kaurov

Is the wiki data hits per page or what exactly is that data? I'm unsure what the "communities" are...wiki page queries?

POSTED BY: Jonathan Wallace

What if instead of showing what the candidates said, we show what people heard? I wonder if there's a way to pull Twitter data by #demdebate or #gopdebate for a word cloud of reactions?

POSTED BY: Jonathan Wallace

Very interesting that “people” is prominent in 4/5 of the Democratic word clouds in this post

http://blog.wolfram.com/2015/10/14/democratic-presidential-debate-word-clouds/

and none of the Republican ones in this post

http://blog.wolfram.com/2015/08/13/the-winner-of-the-gop-presidential-debate/
