Group Abstract

Message Boards

WOLFRAM COMMUNITY

9.9K Views

11 Replies

8 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

How can I improve the speed of Graph(GraphPlot) for large Rules?

Xiang Li

Posted 13 years ago

Hello, Example: 18000 Rules in a Graph. Question 1: How do I code it to make it faster? Question 2: Same resource set, aber the run time is different. Thanks, Xiang Li

POSTED BY: Xiang Li

11 Replies

Sort By:

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 13 years ago

This is an interesting observation, I will pass this on to our developers. Thanks!

POSTED BY: Vitaliy Kaurov

Szabolcs Horvát

Posted 13 years ago

Vitaliy, It does seem a bit paradoxical though that GraphEmbedding would need to recompute this every time ... consider the following very hackish approach: In[1]:= g = RandomGraph[{10000, 20000}]; In[2]:= ToBoxes[g]; In[3]:= v1 = Cases[ToBoxes[g][[1, 2, 1, 3, 2, 1, 2]], TagBox[DiskBox[pt_, _], ___] :> pt]; // AbsoluteTiming Out[3]= {2.381779, Null} In[4]:= v2 = GraphEmbedding[g]; // AbsoluteTiming Out[4]= {8.640521, Null} In[5]:= v1 == v2 Out[5]= True In[3] seems to be a much faster alternative for GraphEmbedding (starting with the second call), but of course it only works for graphs formatted using the default style. Of course I wouldn't use it because I don't feel comfortable with this kind of misuse of functions

POSTED BY: Szabolcs Horvát

Szabolcs Horvát

Posted 13 years ago

Thanks for asking about it!

POSTED BY: Szabolcs Horvát

Szabolcs Horvát

Posted 13 years ago

Hello Vitaliy, I'd just like to point out that this kind of measurement: ti = AbsoluteTime[]; GraphPlot[EdgeRules[g]] AbsoluteTime[] - ti for plotting things is not reliable. There are several stages to rendering a graphic on screen, and not all of them happen in the kernel. A better way to estimate the times is this: In[1]:= g = RandomGraph[{10000, 20000}, GraphLayout -> None]; // AbsoluteTiming Out[1]= {0.007185, Null} In[2]:= g = SetProperty[g, GraphLayout -> Automatic]; In[3]:= GraphEmbedding[g]; // AbsoluteTiming Out[3]= {4.454666, Null} In[4]:= boxes = ToBoxes[g]; // AbsoluteTiming Out[4]= {5.641420, Null} In[5]:= Rasterize[Notebook[{Cell@BoxData@boxes}], "Image"]; // AbsoluteTiming Out[5]= {3.383325, Null} In[6]:= boxes = ToBoxes[g]; // AbsoluteTiming Out[6]= {1.314225, Null} This tells us that: generating the graph is instantaneous laying out the graph takes about 4.4 seconds generating the box form of the graph takes about 1.2 seconds. We get 5.6 the first time because this includes layout out the graph too, but notice that the second ToBoxes runs in only 1.3 seconds again. The layout was probably cached. Compressing the boxes, sending them to the Front End, and rendering them takes about 3.3 seconds. All this adds up to 9 seconds in total. Measuring with a stopwatch gives me about 8 seconds, confirming that this way of benchmarking is correct (at least for estimation). Note that this must be measured in a fersh kernel, or with a newly generated graph, to avoid caching the layout! It's easy to forget about wall time while only looking at AbsoluteTiming's output. Now let's try GraphEmbedding again: In[7]:= GraphEmbedding[g]; // AbsoluteTiming Out[7]= {4.169074, Null} Well, this function doesn't use the supposedly cached graph layout. Why? Is this a bug?

POSTED BY: Szabolcs Horvát

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 13 years ago

I confirmed, this is definitely not a bug. The internal mechanics is more sophisticated but we cannot go into that ;)

POSTED BY: Vitaliy Kaurov

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 13 years ago

I went ahead and did some benchmarking. Here is a random graph with 10000 vertexes and 18000 edges: In[1]:= g = RandomGraph[{10000, 18000}]; In[2]:= EdgeRules[g] // Length Out[2]= 18000 And this are the times that it takes to produce the graphs (I am not sure if the rendering is reflected adequately): ti = AbsoluteTime[]; g AbsoluteTime[] - ti ti = AbsoluteTime[]; GraphPlot[EdgeRules[g]] AbsoluteTime[] - ti Now, if you are interested only in generating graphics you can take an alternative root. Extract vertex coordinates and from pairs of indexes that comprise edges: ge = GraphEmbedding[g]; ll = EdgeList[g] /. x_ \[UndirectedEdge] y_ -> {x, y}; Now use just lines for edges and GraphicsComplex - it takes less than a second! ti = AbsoluteTime[]; Graphics[{Opacity[.1], GraphicsComplex[ge, Line[ll]]}] AbsoluteTime[] - ti

POSTED BY: Vitaliy Kaurov

Xiang Li

Posted 13 years ago

Hello, Kaurov, Thanks for the comprehensive answer. The last method (GraphicsComplex) is exactly what I want!!! Xiang Li

POSTED BY: Xiang Li

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 13 years ago

Xiang, it is not clear which you need to speed up: Graph or GraphPlot. Graph, according to your data, plots pretty fast - 0.0468 seconds. Is it not satisfactory? A bit more clarification is needed and also a minimal example of working code. Could you please provide that?

POSTED BY: Vitaliy Kaurov

Vitaliy Kaurov

Vitaliy Kaurov, WOLFRAM Research

Posted 13 years ago

FYI, Timing includes only CPU time spent in the Mathematica kernel. It does not include time spent in external processes connected via MathLink or otherwise. Nor does it include time spent in the Mathematica front end. I would recommend using AbsoluteTiming. But be aware that AbsoluteTiming measures only the time involved in actually evaluating expr, not time involved in formatting the result.