Some tips about using gridMathematica over a Mixed Network

Posted 11 years ago
This post demonstrates some basic techniques to help gridMathematica users run their own clusters across different networks. A typical example is a mix of a home LAN and a VPN connection to a remote location.

1. Installation

Detailed information about installation is covered here. Just make sure that you install gridMathematica under a low-privilege user account, so that it cannot modify important locations such as /usr/bin or Windows\System32.

2. Test 

Once the services are started on each node, you can check them by opening localhost:3737 on each slave machine. You need the admin password to activate the licenses. The gridMathematica server will not work properly until its license is activated.

3. Work on the distributed system

To have full control of the network in terms of Mathematica processes, I recommend the built-in package and its utility functions.
In this case, the master machine is my MacBook Pro, and it works without problems with two Windows nodes. All data is transferred via web packets, so the setup is OS independent.

 I load the "LightweightGridClient" package, which comes with Mathematica by default.
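A minimal sketch of the loading step, assuming the package context is LightweightGridClient` (the same context that appears in the Names call in this post):

```wolfram
(* load the Lightweight Grid client package into the current session *)
Needs["LightweightGridClient`"]
```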

The package is fairly rich. You can list its utility functions with the following command:
Names["LightweightGridClient`*"] // TableForm

If DNS is set up correctly on your network, you should be able to launch subkernels directly using the host names of the nodes. If you run into issues, you can generally edit the hosts file on your master machine to include the IP address together with the host name of each node. Search online for the location of this file on your OS.

It is always a good idea to check connectivity if launching kernels produces a pile of error messages. I use the following function to check the configuration of the distributed system. "shenghui-pc" is a machine in my home network, while "shenghuiywin" is the machine in the VPN domain. To make sure both machines are recognized by the master node, I used the full domain name for the VPN machine to indicate that it is on a different network from "shenghui-pc".
agent1 = RemoteServicesAgentInformation["shenghui-pc"]
agent2 = RemoteServicesAgentInformation[""]

Although the output looks simple, the two results are objects containing useful setup information for each grid server. The most important fields are ContactURL and DefaultKernelCommand: the first gives the network location of the machine, and the second tells me where the kernel executable is on that machine.

You can launch kernels with the following command. The option names are straightforward and are passed through the LightweightGrid function.

If you also want to launch kernels on the master machine, simply put a number in the list.
kernels = LaunchKernels[{
    LightweightGrid[{"Agent" -> "shenghui-pc", "KernelCount" -> 4}],
    LightweightGrid[{"Agent" -> "", "KernelCount" -> 4}],
    4 (* number of subkernels on the local master machine *)
    }]

Here, 4 simply tells the master kernel to launch 4 subkernels on the local machine.
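One way to confirm that subkernels really started on every machine is to ask each one for its host name. This is only a sketch: it assumes the LaunchKernels call above succeeded, and it relies on the standard $KernelID and $MachineName variables.

```wolfram
(* one row per running subkernel: its ID and the machine it runs on *)
ParallelEvaluate[{$KernelID, $MachineName}] // TableForm

(* number of subkernels per machine *)
Tally[ParallelEvaluate[$MachineName]]
```

If a machine is missing from the tally, its agent probably did not start any kernels, and the connectivity checks from the previous step are the place to look.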

4. Use MathLink to assign parallel jobs

This technique is useful if you have a computationally heavy task for the subkernels and still want to keep working in the same session. Otherwise, you may have to wait until a function like ParallelEvaluate returns its result.

Pick a subkernel from the list of link objects, using a pattern like this one:

LinkObject[x_,_,_]/;StringMatchQ[x,RegularExpression["\\d{4,5}"]~~__](*pick subkernels' link only*)
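The pattern above can be applied to the list returned by Links[] to collect the subkernel links. This is a sketch that assumes subkernel link names start with a 4- or 5-digit port number, as in the LinkObject shown in this post, and that at least one subkernel is running; subkernelLinks is just an illustrative name.

```wolfram
(* keep only links whose name starts with a 4- or 5-digit port number *)
subkernelLinks = Cases[Links[],
   LinkObject[x_, _, _] /;
    StringMatchQ[x, RegularExpression["\\d{4,5}"] ~~ __]];

(* pick one of them for manual LinkWrite and LinkRead *)
First[subkernelLinks]
```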

Do manual LinkWrite and LinkRead on LAN: 
link = LinkObject["53802@,53803@", 211, 7];
LinkWrite[link, Unevaluated[Pause[5]; FactorInteger[2^32 - 3]]]

Compared to ParallelEvaluate, you can keep working in the same notebook during the evaluation. You can pick up the result at any time afterwards (in my case, after about 5 seconds of computation) with LinkRead on the same link.
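A sketch of the pick-up step, assuming link is the subkernel link used in the LinkWrite call above. LinkReadyQ tests whether data is waiting without blocking, and LinkRead then fetches it. The exact form of the returned expression depends on what the subkernel sends back, so no output is shown here.

```wolfram
(* returns immediately: True once the subkernel has written something back *)
LinkReadyQ[link]

(* read the result only when it is ready, so the notebook never blocks *)
If[LinkReadyQ[link], LinkRead[link], "not ready yet"]
```

Calling LinkRead directly also works, but it blocks the notebook until the subkernel has finished.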

You can verify this observation yourself (the cell bracket will be highlighted for about 5 seconds):
ParallelEvaluate[Pause[5]; FactorInteger[2^32 - 3], 1]

MathLink is very flexible, and you can also work with functions that require the FrontEnd:
Hold[Image[ {{0, 63, 127, 191, 255}, {0, 50, 100, 150, 200}},

Finally, do not forget to close the kernels at the end of the evaluation:
CloseKernels[] // TableForm

Note: This type of network is excellent for parallelization with large chunks of computation. The limits are latency and bandwidth (the LAN is much faster than the VPN, and the latter determines the overall time). Therefore, ParallelSubmit is not recommended when the tasks are rather small, because the many submissions over the network slow everything down.
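As an illustration of keeping the chunks large, the standard Method -> "CoarsestGrained" option splits the work into as many pieces as there are kernels, so each subkernel receives one big batch rather than many small tasks. The workload below is made up purely for illustration, and the names inputs and results are hypothetical.

```wolfram
(* hypothetical workload: factor a range of integers *)
inputs = Range[10^12, 10^12 + 2000];

(* one large batch per subkernel keeps the number of network round trips small *)
results = ParallelMap[FactorInteger, inputs, Method -> "CoarsestGrained"];
```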
POSTED BY: Shenghui Yang