How fast should Mathematica work on an HPC cluster?

Cuneyt Eroglu

Posted 11 years ago

I am writing an academic paper for which I need to optimize a function 6,000 times with different parameter values. Each optimization run (using NMaximize) can take several minutes, so my desktop cannot handle this task, and I turned to my university's high-performance computing cluster. The staff were very helpful and gave me a compute node with 40 cores. Using parallel computation in Mathematica, I could complete the task in a "reasonable" amount of time, but to be honest I was not impressed: the HPC cluster was only about 5-6 times faster than my desktop. (I compared 300 optimization runs on the cluster and on my desktop.)

My questions are: Is this normal? Isn't an HPC cluster supposed to provide much higher speeds, especially since my program is completely parallelizable? I am running the same function (NMaximize) many times, and there are no interactions among these runs.

I did some research on the compute node I was allocated, which was supposed to have 40 cores. Apparently, these are 40 LOGICAL cores but only 20 physical cores. Basically, they gave me 10 dual-core Intel Xeon processors running at 2.8 GHz. Again, I'm no expert, but I think you could buy a workstation with comparable computing power, for example a Windows workstation with two 10-core CPUs. Aren't HPC clusters supposed to be several orders of magnitude more powerful than a commercially available desktop or workstation?

So far, I have used the HPC cluster for medium-sized problems, which took hours. If I tried to solve large problems, I am afraid the run time would reach days instead of hours. So I am looking for a solution. Should I ask for more computing power from the HPC cluster: more physical processors and/or more memory? Should I seek a workstation-based solution instead? Would cloud-based solutions provide the speed I am looking for?

Any help or suggestions would be greatly appreciated.

Sincerely,
Cuneyt

POSTED BY: Cuneyt Eroglu
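
For concreteness, the setup described in the question (many independent NMaximize runs over a list of parameter values) might look like the following minimal sketch. The objective function, the parameter grid, and the names objective, params and solveOne are hypothetical, since the post does not give them. The Method setting is one possible choice, not necessarily what the poster used.

```mathematica
(* Hypothetical objective: the real function is not given in the post *)
objective[x_, y_, a_] := -(x - a)^2 - (y + a)^2 + Sin[a x];

(* Hypothetical parameter grid: one independent optimization per value *)
params = Range[0., 1., 0.01];

(* One optimization run for a single parameter value *)
solveOne[a_] := NMaximize[objective[x, y, a], {x, y}];

(* Make the definitions known to all parallel kernels *)
DistributeDefinitions[objective, solveOne];

(* Hand out one run at a time, which may balance the load better
   when individual runs take very different amounts of time *)
results = ParallelMap[solveOne, params, Method -> "FinestGrained"];
```

Because the runs do not interact, each kernel only needs the definitions and its own parameter value, so communication overhead should be small compared with runs that take minutes each.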


Marco Thiel, University of Aberdeen - Dept. of Physics/Mathematics

Posted 11 years ago


Hi there,

My experience with HPC and Mathematica has been quite good. Here is a small benchmark program:

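(* True when 2^n - 1 is prime *)
MersennePrimeQ[n_Integer] := PrimeQ[2^n - 1];
(* Serial timings in seconds for n = 100, 200, ..., 4000 *)
serial = Table[AbsoluteTiming[Select[Range[n], MersennePrimeQ];][[1]], {n, 100, 4000, 100}];
(* The same timings with Select distributed over the parallel kernels *)
parallel = Table[AbsoluteTiming[Parallelize[Select[Range[n], MersennePrimeQ]];][[1]], {n, 100, 4000, 100}];
(* Plot both timing lists together *)
ListPlot[{serial, parallel}]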

[Figure: ListPlot of the serial and parallel timings]

I used 12 cores, and for the larger values of n I get a speed-up of around 8:

{0.0430, 0.1570, 0.3470, 0.4840, 0.7752, 1.1520, 1.6039, 2.0522, \
2.2048, 2.7491, 2.8243, 3.2581, 4.0768, 4.7226, 4.4919, 4.9101, \
4.8510, 5.5316, 6.3384, 5.7927, 5.8100, 6.9138, 6.4262, 6.7049, \
7.4091, 6.8417, 6.5115, 7.2273, 7.5031, 7.8422, 8.0007, 7.2548, \
7.5676, 7.9047, 7.9098, 7.7667, 8.4086, 8.3989, 8.4812, 7.26858}
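
Assuming the list above is the element-wise ratio of the serial to the parallel timings (the post does not say so explicitly), it could be computed as in this minimal sketch. It reuses the lists serial and parallel from the program above; the names speedup and efficiency are introduced here for illustration only.

```mathematica
(* Assumes serial and parallel are the timing lists from the program above *)
speedup = serial/parallel;          (* element-wise ratio, one value per n *)

(* Fraction of ideal linear scaling; $KernelCount is the number of
   running parallel kernels (12 in the post) *)
efficiency = speedup/$KernelCount;

ListLinePlot[speedup, AxesLabel -> {"index of n", "speed-up"}]
```

With a speed-up of about 8 on 12 kernels, the parallel efficiency is roughly 8/12, or about 0.67.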

As expected, for small numbers the single-core calculation is faster, because the parallel overhead dominates. For other calculations I have seen better ratios than this. With a slightly smarter way of distributing the work across the kernels, you can get better ratios.
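
One possible way to distribute the work differently is the Method option of the parallel functions. The sketch below is an illustration under assumptions, not the approach used for the timings above: it swaps Parallelize[Select[...]] for ParallelMap followed by Pick, and the names flags and exponents are introduced here. Since PrimeQ[2^n - 1] gets more expensive as n grows, equal-sized chunks of Range[n] are unevenly loaded, and finer-grained scheduling may help. Whether it does depends on the workload.

```mathematica
MersennePrimeQ[n_Integer] := PrimeQ[2^n - 1];
DistributeDefinitions[MersennePrimeQ];

(* "FinestGrained" hands out one item at a time, which favors load
   balancing at the cost of more communication between kernels;
   "CoarsestGrained" does the opposite *)
flags = ParallelMap[MersennePrimeQ, Range[4000], Method -> "FinestGrained"];

(* Keep the exponents n for which 2^n - 1 is prime *)
exponents = Pick[Range[4000], flags];
```

The result should agree with Select[Range[4000], MersennePrimeQ]; only the scheduling of the individual tests differs.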

We also use the grid manager to access classroom PCs all across the campus, and it works quite well. If larger data volumes are produced, the network might become the limiting factor in that case.

Cheers,
M.

POSTED BY: Marco Thiel
