CUDALink: How can I control the grid dimensions and block numbers?

Posted 10 years ago
Hi all, first time posting to this forum. I have a question regarding the use of GPUs with Mathematica. I need to control the grid dimensions and blockIdx (.x, .y and .z). I know how to control the number of threads in CUDAFunctionLoad. However, I can run no more than 1024 threads per block and I need to run 8192. So I need eight blocks in sequential order (i.e. 1, 2, 3, etc.). My code runs fine with up to 1024 threads, which fill one block, but I do not know how to tell Mathematica that I need eight blocks, in order. Can you help? Thanks.

POSTED BY: Gustavo Carri
The grid size is determined by the input size when a CUDAFunction is called: if you create the CUDAFunction with a block size of 1024 and then call it on a list of length 8192, it will create 8 blocks. Inside the CUDA kernel, memory should be addressed something like:
foo = data[threadIdx.x + blockIdx.x * blockDim.x];

This is exemplified by the very first basic example in the CUDAFunctionLoad documentation page.
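
To make the indexing concrete, here is a minimal host-side sketch in plain C that mimics a 1D launch of 8 blocks of 1024 threads. It is not GPU code and not taken from the documentation: the function name `emulate_launch` and its parameters are made up for illustration. The point is that each thread derives its own element from `threadIdx.x + blockIdx.x * blockDim.x`, so the blocks cover the data in index order even though a real GPU does not guarantee the order in which blocks execute.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side emulation of a 1D CUDA launch (hypothetical helper, not GPU code).
   The two loops stand in for blockIdx.x and threadIdx.x. On a real GPU the
   blocks may run in any order; here they simply run one after another. */
void emulate_launch(int *data, size_t n, unsigned grid_dim, unsigned block_dim)
{
    assert(data != NULL);
    for (unsigned block_idx = 0; block_idx < grid_dim; ++block_idx) {
        for (unsigned thread_idx = 0; thread_idx < block_dim; ++thread_idx) {
            /* Same formula as threadIdx.x + blockIdx.x * blockDim.x */
            size_t index = (size_t)thread_idx + (size_t)block_idx * block_dim;
            /* Guard against threads that fall past the end of the data */
            if (index < n) {
                data[index] += 1;  /* count how often each element is visited */
            }
        }
    }
}
```

Calling `emulate_launch(data, 8192, 8, 1024)` on a zeroed array leaves every element equal to 1, i.e. each of the 8192 elements is handled by exactly one (block, thread) pair.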

You can also explicitly set the number of threads. I'm not on a CUDA-capable machine at the moment, but I believe the syntax for 1D is:
fun = CUDAFunctionLoad[src, name, argtypes, 1024];
fun[args, 8192]

(for higher dimensions, the block/grid specification would be a list of dimensions).
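
The block count in the 1D case follows from simple arithmetic. The sketch below, in plain C, assumes the usual round-up rule (enough blocks to cover every requested thread); the thread above only confirms the exact case of 8192 threads with a block size of 1024 giving 8 blocks, so treat the rounding behaviour as an assumption. `blocks_needed` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Number of blocks so that blocks * block_dim >= n (ceiling division).
   Assumption: the launcher rounds up when n is not a multiple of block_dim. */
size_t blocks_needed(size_t n, size_t block_dim)
{
    assert(block_dim > 0);
    return (n + block_dim - 1) / block_dim;
}
```

With a block size of 1024, `blocks_needed(8192, 1024)` is 8. If the thread count is not a multiple of the block size, the last block contains surplus threads, which is why a kernel normally checks its computed index against the data length before reading or writing.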

Also maybe worth noting: the 1024 limit on the first dimension of block size is a CUDA/hardware limitation, not a Mathematica limitation.
POSTED BY: Dylan Roeh