
Load a function from cubin, ptx or library file using CUDAFunctionLoad?


According to the documentation of CUDAFunctionLoad, it should be easy to specify a compiled file (cubin, ptx, and dll should all work) as the source for loading a CUDAFunction. Unfortunately, this does not work for me. Compiling from source works fine, but as soon as I try to load the CUDAFunction from a compiled file (I tried cubin, ptx, and a dll), it fails.

Here is a very simple example that does not work for me, no matter what combination I try:

Let's create a cubin file first from a very simple CUDA kernel:

Needs["CUDALink`"]         (* CUDAFunctionLoad, NVCCCompiler *)
Needs["CCompilerDriver`"]  (* CreateExecutable *)

code = "
  __global__ void addTwo(int * in, int * out, int length) {
    int index = threadIdx.x + blockIdx.x*blockDim.x;
    if (index < length)
        out[index] = in[index] + 2;
  }";
cubinFile = CreateExecutable[code, "test", "Compiler" -> NVCCCompiler, 
   "CreateCUBIN" -> True];

This successfully creates test.cubin.

Unfortunately loading the function addTwo fails:

cudaFun = CUDAFunctionLoad[File[cubinFile], 
   "addTwo", {{_Integer, _, "Input"}, {_Integer, _, 
     "Output"}, _Integer}, 256, "ShellCommandFunction" :> Print, 
   "ShellOutputFunction" -> Print];

CUDAFunctionLoad::invsrc: CUDALink encountered invalid source input. The source input must be either a string containing the program, or a list of one element indicating the file containing the program.
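Read literally, the message asks for the file to be given as a one-element list rather than wrapped in File. A minimal, untested sketch of that variant follows; it assumes cubinFile holds the path string returned by CreateExecutable above, and cudaFunList is only an illustrative name. Whether this form behaves any differently on this setup is not established here.

```mathematica
(* Untested sketch: pass the compiled file as a one-element list,
   the form the CUDAFunctionLoad::invsrc message describes. *)

(* Assumption: cubinFile is the path string returned by CreateExecutable. *)
FileExistsQ[cubinFile]  (* check that the compiled file is actually on disk *)

(* cudaFunList is a hypothetical name used only for this illustration. *)
cudaFunList = CUDAFunctionLoad[{cubinFile},
   "addTwo", {{_Integer, _, "Input"}, {_Integer, _, "Output"}, _Integer}, 256];
```

If FileExistsQ returns False, the problem lies with the path rather than with the way the source is specified.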

The input file should be valid, but maybe I am missing something obvious here. Interestingly enough, going the same route with a .ptx file yields a different error:

ptxFile = 
  CreateExecutable[code, "test", "Compiler" -> NVCCCompiler, 
   "CreatePTX" -> True];
cudaFun = CUDAFunctionLoad[File[ptxFile], 
   "addTwo", {{_Integer, _, "Input"}, {_Integer, _, 
     "Output"}, _Integer}, 256, "ShellCommandFunction" :> Print, 
   "ShellOutputFunction" -> Print];

CUDAFunctionLoad::notfnd: CUDALink resource not found.

In addition to cubin and ptx files, I tried compiling a library file using CreateLibrary. It was created without errors, but it also could not be loaded using CUDAFunctionLoad.

Any ideas on what is going wrong here and how I can actually load a CUDAFunction from a compiled file?

You can copy and paste the code above into Mathematica and run it, as long as you have CUDA set up properly. Can you reproduce the behavior?

Additional Information:
I am running Mathematica 11.2 on Windows 10.
CUDA is set up properly, and I can run all other CUDA computations in Mathematica.

1 month ago
