# Fast way to obtain index-based edge list of graph/network

Posted 5 years ago
5633 Views
|
3 Replies
|
10 Total Likes
|
 I am looking for the fastest possible way to obtain the edge list of a graph as a packed array, represented using vertex indices (not vertex names). This is to construct a representation of the graph that is easy to handle in C code (through LibraryLink).Example:For Graph[{a,b,c}, {b <-> c, c <-> a}], I expect the output {{2,3}, {3,1}} as a packed array. Anything that can be reshaped into this (ArrayReshape), e.g. {2,3, 3,1}, is equally acceptable.Requirements: The edge ordering must be preserved Must work on multigraphs Must be as fast as possible. I expect that the fastest solution would use different approaches based on what kind of graph is given to it as input. Must work on both undirected and directed graphs, but I do not need it to work on "mixed graphs" (MixedGraphQ). Motivation:One would usually need to use this representation of the graph when trying to work with a graph in C through LibraryLink. Preserving the edge ordering is important for two reasons: (1) we need to match up the EdgeWeight vector (2) we may want to do processing which returns edge properties (e.g. EdgeBetweennessCentrality) and need to match up the edge vector with the result.In particular, I need this for IGraph/M, and I was not able to find a fully satisfactory fast method during the nearly two years I have been working on this package.While I am primarily looking for what I described above, I also welcome solutions that do not preserve the edge ordering but can still match up edge weights with the edge vector. Currently, IGraph/M uses one of three functions based on whether the graph is weighted and whether edge ordering is important. Each of these three have multiple branches for different types of input. This is of course terribly complicated and error-prone, hence my question. (I should note that this case can be done using AdjacencyMatrix / WeightedAdjacencyMatrix, and I am generally satisfied with the performance.)I have asked this question on StackExchange before, and there are very useful tips from user @halmir (be sure to check the comments too), but I still do not have a fully satisfactory solution. https://mathematica.stackexchange.com/q/95742/12My current best implementationThis is what I use at the moment. It returns the result in flattened form (i.e. reshapeable into what I asked for). igEdgeList[graph_] := DeveloperToPackedArray@If[GraphComputationGraphRepresentation[graph] === "Simple", Flatten[EdgeList@IndexGraph[graph, 0], 1, If[DirectedGraphQ[graph], DirectedEdge, UndirectedEdge]] , Lookup[ AssociationThread[VertexList[graph], Range@VertexCount[graph] - 1], Flatten[EdgeList[graph], 1, If[DirectedGraphQ[graph], DirectedEdge, UndirectedEdge]] ] ] The key observation that this is based on is that IndexGraph is very fast on some graph, generally fast on graphs which use the "Simple" internal representation, but generally slow on graphs that use the "Incidence" internal representation (note that there are other representations as well and I do not have the full list).Benchmark datasetHere is a list of graphs to use for benchmarking. I am looking for a solution that does no worse than my own on all graphs in this dataset. SeedRandom[137]; g1 = GridGraph[{250, 250, 2}]; g2 = Graph@EdgeList[g1]; g3 = GraphComputationToGraphRepresentation[g2, "Simple"]; g4 = GraphComputationToGraphRepresentation[g2, "Sparse"]; wg1 = Graph[g1, EdgeWeight -> RandomReal[1, EdgeCount[g1]]]; wg2 = Graph[g2, EdgeWeight -> RandomReal[1, EdgeCount[g2]]]; h1 = RandomGraph[{10000, 400000}]; h2 = Graph@EdgeList[h1]; h3 = GraphComputationToGraphRepresentation[h2, "Simple"]; h4 = GraphComputationToGraphRepresentation[h2, "Sparse"]; wh1 = Graph[h1, EdgeWeight -> RandomReal[1, EdgeCount[h1]]]; wh2 = Graph[h2, EdgeWeight -> RandomReal[1, EdgeCount[h2]]]; e1 = ExampleData[{"NetworkGraph", "CondensedMatterCollaborations"}]; e2 = ExampleData[{"NetworkGraph", "HighEnergyTheoryCollaborations"}]; m1 = Graph[UndirectedEdge @@@ RandomInteger[{1, 2000}, {80000, 2}]]; m2 = GraphComputationToGraphRepresentation[m1, "Sparse"]; graphs = AssociationThread[ {"g1", "g2", "g3", "g4", "wg1", "wg2", "h1", "h2", "h3", "h4", "wh1", "wh2", "e1", "e2", "m1", "m2"}, {g1, g2, g3, g4, wg1, wg2, h1, h2, h3, h4, wh1, wh2, e1, e2, m1, m2} ]; Here's my solution benchmarked using Mathematica 11.2: timings = Table[First@AbsoluteTiming@igEdgeList[#], {5}] & /@ graphs; // AbsoluteTiming (* {14.944, Null} *) timings // Map[Min] // Normal // Map@Apply[List] // TableForm g1 0.277564 g2 0.281124 g3 0.280772 g4 0.240914 wg1 0.279735 wg2 0.274157 h1 0.131919 h2 0.241129 h3 0.126432 h4 0.221863 wh1 0.124445 wh2 0.24105 e1 0.039344 e2 0.010532 m1 0.048079 m2 0.03865 
3 Replies
Sort By:
Posted 5 years ago
 The following won't work for multigraphs, but it is faster for the other graphs: elist[g_Graph] := With[ {sa = UpperTriangularize @ WeightedAdjacencyMatrix[g, EdgeWeight->Range@EdgeCount@g]}, sa["NonzeroPositions"][[InversePermutation @ sa["NonzeroValues"]]] ] Check: Grid @ Table[ With[ { r1=AbsoluteTiming[Flatten[elist[Symbol@g]]-1], r2=AbsoluteTiming[igEdgeList[Symbol@g]] }, {g,r1[[1]], r2[[1]], r1[[2]]===r2[[2]]} ], {g, {"g1","g2","g3","g4","wg1","wg2","h1","h2","h3","h4","wh1","wh2"}} ] g1 0.052843 0.276878 True g2 0.051412 0.283879 True g3 0.048165 0.222559 True g4 0.050206 0.239958 True wg1 0.049296 0.228189 True wg2 0.048674 0.263723 True h1 0.081907 0.11832 True h2 0.084709 0.241035 False h3 0.077764 0.124032 True h4 0.082046 0.21298 True wh1 0.080524 0.118146 True wh2 0.077027 0.232665 False The two discrepancies occur because some of the edges are reversed. For example: Sort/@elist[h2] - 1 === Sort /@ Partition[igEdgeList[h2],2] Sort/@elist[wh2] - 1 === Sort /@ Partition[igEdgeList[wh2],2] (* True *) (* True *) 
Posted 5 years ago
 Great out-of-the-box thinking, as always, Carl! :-) Edge reversals in an undirected graph are no problem at all.
Posted 5 years ago
 Another suggestion was given on StackExchange: use IncidenceMatrix and derive the edge list for the positions and values in the (sparse) incidence matrix. I used a LibraryLink function to do this fast. This solution is now used by IGraph/M 0.3.95, and also exposed by it as the IGIndexEdgeList function. This function is even faster than EdgeList`, especially for large graphs, thus it is useful for implementing fast edge-list based operations.