This is a proposal for adding functions that can transform or rename arbitrary graph properties, such as EdgeWeight
s.
Why?
When importing a graph that has properties, the properties will often be in an undesirable format. One may need to deal with missing values, string/number conversions, etc. It may also be necessary to re-map one property to another. Say, the graph may have a "ws"
property that represents weights, and we may need to map it to Mathematica's standard EdgeWeight
.
(Of course, a prerequisite to working with imported graphs is to improve the Import
functionality a bit for several formats ...)
Another common problem is doing numerical transformations on edge weights. In some applications, a high weight value represents a strong connection. In others, it represents a long distance (e.g. EdgeBetweennessCentrality
). Right now it is not easy to transform weights to have the appropriate properties.
Similarly, we may also want to transfer EdgeWeight
to EdgeCost
or EdgeCapacity
to we can use them in with the appropriate functions.
What interface?
We could have a simple Map
-type function for transforming property values:
GraphPropertyMap[fun, graph, prop]
This would create a new graph, with fun
applies to every property value of the property prop
. For example, this would invert all weights:
GraphPropertyMap[1/#&, graph, EdgeWeight]
But I showed this only to introduce the idea. I would like to propose something more general instead.
An immediate useful generalization is a function comparable to MapThread
that can combine values from multiple properties, and store them into a new one (or overwrite an old one). At this point, it will make sense to distinguish between edge and vertex properties. So I propose the following syntaxes:
EdgePropertyMap[fun, graph, prop]
would do the same as GraphPropertMap
above, assuming prop
is an edge property.
EdgePropertyMap[fun, graph, {p1, p2} -> p3]
would combine values from properties p1
and p2
to create p3
. For example, let the EdgeWeight
be the sum of properties p1
and p2
:
EdgePropertyMap[Plus, graph, {p1, p2} -> EdgeWeight]
(When a property value is not present for one particular edge, but it is for the other, Missing[]
could be passed to the combiner function.)
This could also be used for renaming a property. E.g., transfer EdgeWeight
to EdgeCost
:
EdgePropertyMap[Identity, graph, EdgeWeight -> EdgeCost]
At this point we may allow a simplification, and just drop the function:
EdgePropertyMap[graph, EdgeWeight -> EdgeCost]
There should be no ambiguity since only one argument is GraphQ
(first or second). But the exact syntax and order of argument is really just a detail ...
We may make it even nicer and let is use associations, so instead of EdgePropertyMap[(#1 + #2) #3&, graph, {p1, p2, p3} -> p4]
we can write EdgePropertyMap[(#p1 + #p2) #p3, graph, {p1, p2, p3} -> p4]
. (I am assuming here that property names must be symbols or strings, and the same name in symbol or string form is equivalentjust like with options) Though in my personal opinion, this is going a bit too far without giving a true benefit. Unnamed arguments should be fine too.
For full generality, I suggest that these functions should treat the following as "properties" too: VertexList
, EdgeList
, VertexIndex
, EdgeIndex
. These special properties would be readable, but not writeable (i.e. they can only appear on the LHS of the ->
in the third argument). This would allow taking into account the present edge or vertex name when transforming a property. E.g. we could have an exceptional case for certain vertices, or vertex indices larger than a threshold.
We could also do things like
VertexPropertyMap[Style[#, Bold, FontSize->14]&, graph, VertexList -> VertexLabels]
to easily create labels from names for each vertex. Or we could re-style existing labels:
VertexPropertyMap[Style[#, Italic]&, graph, VertexLabels]
Even more generally, instead of treating EdgeList
, EdgeIndex
, etc. as "properties", they could be functions that act on the graph and return a list of the same length as the EdgeCount
or VertexCount
(for EdgeProprtyMap
and VertexPropertyMap
, respectively). Then we could do:
VertexPropertyMap[2.5 Log[1 + #]&, graph, VertexDegree -> VertexSize]
to size vertices according to their degree.
There are lots of functions which could be used in this context, such as VertexDegree
, BetweennessCentrality
, EdgeBetweennessCentrality
, etc.
I hope I convinced you that these would be extremely useful and versatile functions. They also fit well with Mathematica's philosophy of operating with general, high-level concepts. One simple function allows doing many different things without introducing ambiguity.
Summary
I propose two functions, called EdgePropertyMap
and VertexPropertyMap
which are capable of:
- transforming a single graph property using a function
- combining multiple properties into new ones
- renaming a property, or transferring values from one property to another one
Notice that this concept works well with any type of graph, including mixed graphs (which I would rather see deprecatedbut they wouldn't be a problem for these functions) and multigraphs (which are currently unusable with most properties, except for a few built-in ones).
Going further
We could go further and think about:
- How to use vertex properties to create edge properties, e.g. EdgeWeight based on VertexCoordinates (which I realize is now a graph-property, but this function could treat it as a vertex property).
- How to compute a vertex property based on the properties of neighbouring vertices
- What set of lower level functions makes all of this easier
This is too much for this post though, and brings up too many difficult questions that there isn't room for here. For example, if properties were usually handled as associations in the form <| vertex1 -> value1, ..., vertexn -> valuen |>
, that would seemingly simplify lots of things. But we would again bump into the fundamental problem of properties and parallel edges. That would have to be solved first by tweaking the design (e.g. refer to edges not solely by name, but by a combination of name and index? or index only?) So I do not want to go into this.
Any comments on this design, or proof-of-concept implementations are most welcome! I am hoping that by actually demonstrating the usefulness of the concept, we can convince the decision makers at Wolfram to include these in the next version.