
Automatic Generation of Academic Citation Graph

Project Title: Automatic Generation of Academic Citation Graph

Also on: WordPress

Source Code and Data: GitHub repository

In certain fields of academic study (e.g. Deep Learning), papers are released much faster than people in the field can read them (although this is arguably true of all fields). As researchers, we want to know how each paper fits into the wider academic conversation, so it would be nice to generate a citation graph automatically and tell at a glance which paper cites which.

I created a tool for you homo academicus to automatically create such a citation graph for any collection of papers. It should help researchers catch up with the trends in a rapidly changing field.

First, if you are using Mendeley (or any other reference management software), export your papers as a .bib file, which should include each paper's arXiv ID and publication year. Then run the code in Mathematica. It looks up the list of references for each paper in the SAO/NASA Astrophysics Data System (ADS), hosted at Harvard. Finally, a citation graph is drawn with the help of the Wolfram Language.
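For readers who want to see what the first step amounts to, here is a minimal Python sketch of pulling the arXiv ID and year out of a .bib export. It is not the code from the Mathematica notebooks. The field names `arxivId` and `eprint`, the sample entries, and the function name `extract_ids_and_years` are assumptions made for illustration, so check them against your own export.

```python
import re

# Hypothetical sample in the style of a reference-manager export. The field
# names `arxivId` and `eprint` are assumptions and may differ in your file.
SAMPLE_BIB = """
@article{Goodfellow2014,
  arxivId = {1412.6572},
  title = {{Explaining and Harnessing Adversarial Examples}},
  year = {2014}
}
@article{Papernot2016,
  eprint = {1605.07277},
  title = {{Transferability in Machine Learning}},
  year = {2016}
}
"""

# Start of a BibTeX entry, e.g. "@article{".
ENTRY_START = re.compile(r"^@\w+\s*\{", re.MULTILINE)
# Value of an arXiv ID field, wrapped in braces or double quotes.
ARXIV_FIELD = re.compile(
    r"\b(?:arxivId|eprint)\s*=\s*[{\"]\s*([^}\"\s]+)", re.IGNORECASE
)
# Four-digit year, with or without surrounding braces or quotes.
YEAR_FIELD = re.compile(r"\byear\s*=\s*[{\"]?\s*(\d{4})", re.IGNORECASE)


def extract_ids_and_years(bib_text):
    """Return (arxiv_id, year) pairs for every entry that has both fields."""
    starts = [m.start() for m in ENTRY_START.finditer(bib_text)]
    results = []
    for i, start in enumerate(starts):
        # An entry runs from its "@type{" to the start of the next entry.
        end = starts[i + 1] if i + 1 < len(starts) else len(bib_text)
        entry = bib_text[start:end]
        arxiv = ARXIV_FIELD.search(entry)
        year = YEAR_FIELD.search(entry)
        if arxiv and year:
            results.append((arxiv.group(1), int(year.group(1))))
    return results


pairs = extract_ids_and_years(SAMPLE_BIB)
print(pairs)
```

Entries without an arXiv ID or a year are skipped silently in this sketch; a real .bib parser would be more robust than regular expressions for messy files.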

See the example below. Here, I’ve selected a list of papers in Mendeley about adversarial examples (published in the past five years), and I want to know how they are related to each other (“citationally”).

Image 1 - Mendeley

Click File->Export and then save the papers’ metadata as My Collection.bib.

Run the code in 1 – arxivID extraction from bib.nb in Mathematica; it saves a new file, arXivIDandYearLists.mx, which stores the arXiv ID and publication year of each paper. Then run 2 – Citation Graph.nb, also in Mathematica. This is the end product:

Image 2 - Citation Tree

You can see that Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples (Papernot et al. 2016) and Explaining and harnessing adversarial examples (Goodfellow et al. 2014) are the most influential nodes among the selected papers (i.e. the most cited within the collection).
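"Most cited" here corresponds to the in-degree of a node in the citation graph. As a rough Python sketch of that ranking (again, not the notebook code), assuming the citation data has already been reduced to (citing, cited) pairs; the edge list and its single-letter labels below are hypothetical placeholders, not results from my collection:

```python
from collections import Counter

# Hypothetical edge list: each pair is (citing paper, cited paper).
# The labels are placeholders, not results from the post's data set.
edges = [
    ("C", "A"),
    ("C", "B"),
    ("D", "A"),
    ("D", "B"),
    ("E", "A"),
    ("B", "A"),
]


def rank_by_citations(edge_list):
    """Return (paper, in-degree) pairs sorted from most to least cited."""
    in_degree = Counter(cited for _citing, cited in edge_list)
    # Include papers that cite others but are never cited themselves.
    for citing, _cited in edge_list:
        in_degree.setdefault(citing, 0)
    # Sort by descending count, breaking ties alphabetically.
    return sorted(in_degree.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_citations(edges)
for paper, count in ranking:
    print(paper, count)
```

Only citations between papers inside the collection are counted, so the ranking reflects influence within the selected set rather than overall citation counts.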

If you want to add more information (say, the author) to the vertex labels, you can modify 1 – arxivID extraction from bib.nb or 2 – Citation Graph.nb to do that. You only need to change a few lines, so I won't go into detail here.

Enjoy.

Amazing!!

A useful add-on would be a web crawler for the citation web, in order to discover influential papers (as you said, in deep learning it is easy to get lost in the literature).

Ideally one would like to separate interesting papers from uninteresting ones, but the notion of "interesting" is not trivial to express in computational terms. The most-cited approach is certainly valuable, but it has the drawback that you only discover good papers once they are old.

I wonder whether some heuristic could be computed from the graph topology to suggest interesting recent papers...

POSTED BY: Ettore Mariotti