[WSS17] Wikipedia Articles on Computational Fields

Posted 7 years ago

Collecting Data

To analyze Wikipedia articles, we first need to define our set. Since the computational field is relatively small, we can do this by searching for "Computational" on Wikipedia and filtering out all results that don't contain "computational" or "journal".


Link Analysis

We can use the WikipediaData[] function in the Wolfram Language to collect the information we need. Treating each article in our set as a node and each link from one article to another as a directed edge gives a directed graph showing how articles within the set are linked.

Computational article links
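As a rough illustration of the graph construction described above, here is a minimal Python sketch. It is not the code used for this post (which relied on WikipediaData[]); the `outgoing_links` dictionary is hypothetical toy data standing in for the links collected from each article, and `build_link_edges` is a name made up for this example.

```python
# Hypothetical toy data: article title -> titles that article links to.
outgoing_links = {
    "Computational biology": ["Computational chemistry", "Biology", "Computational science"],
    "Computational chemistry": ["Computational science", "Chemistry"],
    "Computational science": ["Computational biology"],
}


def build_link_edges(links):
    """Return sorted directed edges (source, target) whose endpoints are both in the set."""
    articles = set(links)
    edges = set()
    for source, targets in links.items():
        for target in targets:
            # Keep only links that stay inside the article set, and skip self-links.
            if target in articles and target != source:
                edges.add((source, target))
    return sorted(edges)


edges = build_link_edges(outgoing_links)
for source, target in edges:
    print(f"{source} -> {target}")
```

Links to pages outside the set (such as "Biology" here) are dropped, so the resulting graph only describes how articles within the set reference one another.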

If we run CommunityGraphPlot[] on this graph, we can see how articles are sub-grouped based on their links.

Computational article communities

We can then coalesce these groups into individual points to see more clearly how the groups interact with each other.

Group communities
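One way to picture the coalescing step is to replace every article with the community it belongs to and count the links that cross between communities. The Python sketch below assumes the community assignment is already known (here a hypothetical `community_of` mapping); the function name `coalesce` and the toy edges are made up for illustration.

```python
from collections import Counter


def coalesce(edges, community_of):
    """Collapse article-level edges into weighted community-level edges."""
    weights = Counter()
    for source, target in edges:
        a, b = community_of[source], community_of[target]
        # Links inside a single community disappear when the group becomes one point.
        if a != b:
            weights[(a, b)] += 1
    return dict(weights)


# Hypothetical toy graph: four articles split into two communities.
toy_edges = [("A", "B"), ("B", "A"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "A")]
toy_communities = {"A": 0, "B": 0, "C": 1, "D": 1}

print(coalesce(toy_edges, toy_communities))
```

The weights give a sense of how strongly one group links to another, which is harder to read off the full article-level graph.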

If we automatically label the groups by word frequency we end up with this:

Word cloud communities
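A simple version of word-frequency labeling can be sketched as follows. This is only an assumption about how such labels might be produced: it counts words in the article titles of each group and ignores a small hypothetical stopword list (including "computational", which every title shares). The actual labels in the image above may have been built from different text.

```python
import re
from collections import Counter

# Hypothetical stopword list; "computational" appears in nearly every title.
STOPWORDS = {"computational", "of", "and", "the", "in"}


def label_group(titles, top_n=2):
    """Return the top_n most frequent non-stopword words across a group's titles."""
    counts = Counter(
        word
        for title in titles
        for word in re.findall(r"[a-z]+", title.lower())
        if word not in STOPWORDS
    )
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


group = ["Computational biology", "Computational neuroscience", "Computational systems biology"]
print(label_group(group))
```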


Edit Histories

Using https://tools.wmflabs.org/xtools-articleinfo/ we can easily collect data on the edits over time from any Wikipedia article.

We can look at various statistics, such as the number of unique editors versus the number of total revisions. Three of the one hundred articles have been removed as outliers so that the majority is easier to view.

Editor counts vs. revision counts

We can also see how each of our articles has been edited over time.

Edits over time example

If we look at the top 30 editors of an article, we can see each editor's share of the edits made by that group. For computational biology, it looks like this:

Computational biology pie chart
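The percentages behind a pie chart like this are each editor's edits divided by the combined edits of the listed editors. A minimal Python sketch, using made-up editor names and counts rather than the real computational biology data:

```python
def top_editor_shares(edits_by_editor):
    """Return each editor's percentage of the edits made by the listed editors."""
    total = sum(edits_by_editor.values())
    if total == 0:
        return {editor: 0.0 for editor in edits_by_editor}
    return {editor: 100 * edits / total for editor, edits in edits_by_editor.items()}


# Hypothetical edit counts for three editors.
example_counts = {"EditorA": 50, "EditorB": 30, "EditorC": 20}
print(top_editor_shares(example_counts))
```

Note that the denominator is the edits of the listed editors only, not the article's total edits, so the shares always sum to 100%.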

Another interesting graph shows how many editors it takes to account for over 50% of an article's total edits. Individualized data is not provided past the top 30 editors, which leads to the spike at 30.

Fifty percent
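The count for a single article can be computed by adding up editors from most to least active until the running total passes half of all edits. The sketch below is an assumption about how the spike at 30 arises: when the listed editors never reach the halfway mark, it simply returns the number of listed editors. The function name and sample numbers are hypothetical.

```python
def editors_to_majority(top_editor_edits, total_edits):
    """Smallest number of top editors whose combined edits exceed half of total_edits.

    If the listed editors never exceed half, return how many editors were listed
    (assumed here to be the cause of the pile-up at the cap).
    """
    running = 0
    ranked = sorted(top_editor_edits, reverse=True)
    for count, edits in enumerate(ranked, start=1):
        running += edits
        if running * 2 > total_edits:
            return count
    return len(ranked)


print(editors_to_majority([40, 25, 10, 5], total_edits=100))
print(editors_to_majority([10, 10, 10], total_edits=100))
```

In the second call the three listed editors account for only 30 of 100 edits, so the function falls back to the number of listed editors.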

Lastly, for articles that have more than 30 editors, we can look at what percentage of total edits the top 30 editors contributed.

Top thirty editors contributions
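For this last comparison, the quantity per article is the top editors' combined edits divided by the article's total edits, restricted to articles with more than 30 editors. A small Python sketch with hypothetical records (the field names and numbers are invented for illustration):

```python
# Hypothetical per-article records.
articles = [
    {"title": "Article A", "editor_count": 45, "top_editor_edits": 150, "total_edits": 200},
    {"title": "Article B", "editor_count": 12, "top_editor_edits": 80, "total_edits": 80},
]


def top_share_for_large_articles(records, min_editors=30):
    """Percentage of total edits made by the top editors, for articles above min_editors."""
    return {
        record["title"]: 100 * record["top_editor_edits"] / record["total_edits"]
        for record in records
        if record["editor_count"] > min_editors
    }


print(top_share_for_large_articles(articles))
```

Articles with 30 or fewer editors are left out because their top-30 list already covers every editor, so the share would be 100% by construction.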

POSTED BY: Ethan Truelove