
[WSS16] Predicting & Recommending Connectivity in the WL Documentation

Posted 8 years ago

Introduction

The Wolfram Language & System Documentation Center provides descriptions, parameters, and methods for each function in Wolfram Mathematica. Each function's page also includes a "See Also" section containing hyperlinks to other Mathematica functions.

Sample of Wolfram function documentation page

Motivation

The hyperlinks in each function's "See Also" section are meant to guide users to other functions likely to interest them, given that they are viewing the documentation of the current function.

However, it is not always clear what criteria should determine when one function links to another. Generally, the connections are made at the developers' discretion.

Questions

Can we predict where connections between functions will be present?
Can we recommend where connections should be present?

Graph Approach

A natural first step is a qualitative graph analysis: take the set of connections, visualize how they are organized, and perhaps describe some characteristic behavior. However, simply plotting the graph shows that this approach is unwieldy.

Connectivity Graph of ~5,000 Mathematica Functions

Graph Metrics as Features

Instead, we computed graph metrics to feed into Classify. Each function was treated as a node and each connection as a directed edge. For each connection (and a balanced set of "potential" connections, i.e. pairs with no existing link), we computed over a dozen features, including:

  • ratio of degrees in and out for a pair of functions
  • shortest path between the functions
  • number of connections in common
  • ratio of PageRank Centrality
  • string overlap in the function names

Performance

Using Classify alone, we are able to predict whether a directed connection exists between two functions with over 97% accuracy.

ConfusionMatrix
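For readers unfamiliar with how an accuracy figure relates to a confusion matrix, the sketch below shows the arithmetic in plain Python. The labels are hypothetical and chosen only to keep the example small; they are not data from this project and do not reproduce its result.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def accuracy(actual, predicted):
    """Fraction of pairs whose predicted label matches the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical labels for eight directed pairs (not data from the post):
# True = link present, False = no link. The classes are balanced, 4 and 4.
actual    = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, True]

matrix = confusion_matrix(actual, predicted)
true_positives = matrix[(True, True)]
false_negatives = matrix[(True, False)]
false_positives = matrix[(False, True)]
true_negatives = matrix[(False, False)]

print(true_positives, false_negatives, false_positives, true_negatives)
print(accuracy(actual, predicted))
```

Because the set of "potential" connections is balanced against the existing ones, always guessing one class would score 50%, which is the baseline an accuracy figure should be read against.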

Tool in Development

With this model, we can build a tool to help developers inquire about specific functions. For any function, we consider directed pairs with all other functions and predict whether each directed connection is present. Where a connection is not currently present but the model predicts it with high probability, we recommend it, and we display how adding the recommended connections would change that function's local community.
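The recommendation step described above can be sketched as follows. This is an illustration under stated assumptions, not the tool itself: `recommend_links`, the 0.9 threshold, the toy graph, and the stand-in scoring function with its hard-coded probabilities are all hypothetical. In the actual project the probabilities would come from the trained classifier.

```python
def recommend_links(graph, source, link_probability, threshold=0.9):
    """Return (target, probability) for links that are missing from `source`
    but scored at or above `threshold`, most probable first."""
    existing = set(graph[source])
    scored = [
        (target, link_probability(source, target))
        for target in graph
        if target != source and target not in existing
    ]
    keep = [(t, p) for t, p in scored if p >= threshold]
    return sorted(keep, key=lambda item: item[1], reverse=True)

# Toy graph and made-up scores (hypothetical, for illustration only).
see_also = {
    "Plot": ["ListPlot"],
    "ListPlot": ["Plot"],
    "Plot3D": ["Plot"],
    "Sin": [],
}
toy_scores = {("Plot", "Plot3D"): 0.95, ("Plot", "Sin"): 0.10}

def toy_probability(source, target):
    """Stand-in for the trained model's predicted link probability."""
    return toy_scores.get((source, target), 0.0)

recommendations = recommend_links(see_also, "Plot", toy_probability)
print(recommendations)
```

What counts as "high probability" is a tuning choice: a lower threshold surfaces more candidate links for a developer to review, at the cost of more weak suggestions.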


POSTED BY: Laura Buchanan