Message Boards Message Boards

GROUPS:

Fast way to obtain index-based edge list of graph/network

Posted 5 years ago
5633 Views
|
3 Replies
|
10 Total Likes
|

I am looking for the fastest possible way to obtain the edge list of a graph as a packed array, represented using vertex indices (not vertex names). This is to construct a representation of the graph that is easy to handle in C code (through LibraryLink).

Example:

For Graph[{a,b,c}, {b <-> c, c <-> a}], I expect the output {{2,3}, {3,1}} as a packed array. Anything that can be reshaped into this (ArrayReshape), e.g. {2,3, 3,1}, is equally acceptable.

Requirements:

  • The edge ordering must be preserved
  • Must work on multigraphs
  • Must be as fast as possible. I expect that the fastest solution would use different approaches based on what kind of graph is given to it as input.
  • Must work on both undirected and directed graphs, but I do not need it to work on "mixed graphs" (MixedGraphQ).

Motivation:

One would usually need to use this representation of the graph when trying to work with a graph in C through LibraryLink. Preserving the edge ordering is important for two reasons: (1) we need to match up the EdgeWeight vector (2) we may want to do processing which returns edge properties (e.g. EdgeBetweennessCentrality) and need to match up the edge vector with the result.

In particular, I need this for IGraph/M, and I was not able to find a fully satisfactory fast method during the nearly two years I have been working on this package.

While I am primarily looking for what I described above, I also welcome solutions that do not preserve the edge ordering but can still match up edge weights with the edge vector. Currently, IGraph/M uses one of three functions based on whether the graph is weighted and whether edge ordering is important. Each of these three have multiple branches for different types of input. This is of course terribly complicated and error-prone, hence my question. (I should note that this case can be done using AdjacencyMatrix / WeightedAdjacencyMatrix, and I am generally satisfied with the performance.)

I have asked this question on StackExchange before, and there are very useful tips from user @halmir (be sure to check the comments too), but I still do not have a fully satisfactory solution. https://mathematica.stackexchange.com/q/95742/12

My current best implementation

This is what I use at the moment. It returns the result in flattened form (i.e. reshapeable into what I asked for).

igEdgeList[graph_] :=
    Developer`ToPackedArray@If[GraphComputation`GraphRepresentation[graph] === "Simple",
      Flatten[EdgeList@IndexGraph[graph, 0], 1, If[DirectedGraphQ[graph], DirectedEdge, UndirectedEdge]]
      ,
      Lookup[
        AssociationThread[VertexList[graph], Range@VertexCount[graph] - 1],
        Flatten[EdgeList[graph], 1, If[DirectedGraphQ[graph], DirectedEdge, UndirectedEdge]]
      ]
    ]

The key observation that this is based on is that IndexGraph is very fast on some graph, generally fast on graphs which use the "Simple" internal representation, but generally slow on graphs that use the "Incidence" internal representation (note that there are other representations as well and I do not have the full list).

Benchmark dataset

Here is a list of graphs to use for benchmarking. I am looking for a solution that does no worse than my own on all graphs in this dataset.

SeedRandom[137];

g1 = GridGraph[{250, 250, 2}];
g2 = Graph@EdgeList[g1];
g3 = GraphComputation`ToGraphRepresentation[g2, "Simple"];
g4 = GraphComputation`ToGraphRepresentation[g2, "Sparse"];
wg1 = Graph[g1, EdgeWeight -> RandomReal[1, EdgeCount[g1]]];
wg2 = Graph[g2, EdgeWeight -> RandomReal[1, EdgeCount[g2]]];

h1 = RandomGraph[{10000, 400000}];
h2 = Graph@EdgeList[h1];
h3 = GraphComputation`ToGraphRepresentation[h2, "Simple"];
h4 = GraphComputation`ToGraphRepresentation[h2, "Sparse"];
wh1 = Graph[h1, EdgeWeight -> RandomReal[1, EdgeCount[h1]]];
wh2 = Graph[h2, EdgeWeight -> RandomReal[1, EdgeCount[h2]]];

e1 = ExampleData[{"NetworkGraph", "CondensedMatterCollaborations"}];
e2 = ExampleData[{"NetworkGraph", "HighEnergyTheoryCollaborations"}];

m1 = Graph[UndirectedEdge @@@ RandomInteger[{1, 2000}, {80000, 2}]];
m2 = GraphComputation`ToGraphRepresentation[m1, "Sparse"];

graphs = AssociationThread[
   {"g1", "g2", "g3", "g4", "wg1", "wg2", "h1", "h2", "h3", "h4", 
    "wh1", "wh2", "e1", "e2", "m1", "m2"},
   {g1, g2, g3, g4, wg1, wg2, h1, h2, h3, h4, wh1, wh2, e1, e2, m1, m2}
   ];

Here's my solution benchmarked using Mathematica 11.2:

timings = Table[First@AbsoluteTiming@igEdgeList[#], {5}] & /@ graphs; // AbsoluteTiming
(* {14.944, Null} *)

timings // Map[Min] // Normal // Map@Apply[List] // TableForm

g1  0.277564
g2  0.281124
g3  0.280772
g4  0.240914
wg1 0.279735
wg2 0.274157
h1  0.131919
h2  0.241129
h3  0.126432
h4  0.221863
wh1 0.124445
wh2 0.24105
e1  0.039344
e2  0.010532
m1  0.048079
m2  0.03865
POSTED BY: Szabolcs Horvát
3 Replies

The following won't work for multigraphs, but it is faster for the other graphs:

elist[g_Graph] := With[
    {sa = UpperTriangularize @ WeightedAdjacencyMatrix[g, EdgeWeight->Range@EdgeCount@g]},

    sa["NonzeroPositions"][[InversePermutation @ sa["NonzeroValues"]]]
]

Check:

Grid @ Table[
    With[
       {
       r1=AbsoluteTiming[Flatten[elist[Symbol@g]]-1],
       r2=AbsoluteTiming[igEdgeList[Symbol@g]]
       },
       {g,r1[[1]], r2[[1]], r1[[2]]===r2[[2]]}
    ],
    {g, {"g1","g2","g3","g4","wg1","wg2","h1","h2","h3","h4","wh1","wh2"}}
]

g1  0.052843 0.276878    True
g2  0.051412 0.283879    True
g3  0.048165 0.222559    True
g4  0.050206 0.239958    True
wg1 0.049296    0.228189   True
wg2 0.048674    0.263723   True
h1  0.081907 0.11832     True
h2  0.084709 0.241035    False
h3  0.077764 0.124032    True
h4  0.082046 0.21298     True
wh1 0.080524    0.118146   True
wh2 0.077027    0.232665   False

The two discrepancies occur because some of the edges are reversed. For example:

Sort/@elist[h2] - 1 === Sort /@ Partition[igEdgeList[h2],2]
Sort/@elist[wh2] - 1 === Sort /@ Partition[igEdgeList[wh2],2]

(* True *)

(* True *)
POSTED BY: Carl Woll

Great out-of-the-box thinking, as always, Carl! :-) Edge reversals in an undirected graph are no problem at all.

POSTED BY: Szabolcs Horvát

Another suggestion was given on StackExchange: use IncidenceMatrix and derive the edge list for the positions and values in the (sparse) incidence matrix. I used a LibraryLink function to do this fast. This solution is now used by IGraph/M 0.3.95, and also exposed by it as the IGIndexEdgeList function. This function is even faster than EdgeList, especially for large graphs, thus it is useful for implementing fast edge-list based operations.

POSTED BY: Szabolcs Horvát
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract