
Why is my CUDAImageMultiply inconsistent with ImageMultiply?

Posted 8 years ago

Hello, I am trying to understand how to get CUDAImageMultiply to use the same precision as ImageMultiply.

Here is a prototype example which illustrates the question:

m1 = Table[Sin[4. Pi x/500] Sin[4. Pi y/500], {x, 500}, {y, 500}];
m2 = RandomReal[{0.5, 1}, {500, 500}];

im1 = Image[m1, ColorSpace -> "Grayscale"]
im2 = Image[m2, ColorSpace -> "Grayscale"]
ImageMultiply[im1, im2] (*as expected*)
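To make "as expected" something that can be checked numerically, one option is to compare the image product against the plain element-wise product of the two matrices. This is only a minimal sketch, assuming m1, m2, im1 and im2 are defined as above; the name reference is introduced here for illustration.

```mathematica
(* storage type of the input image, e.g. a byte or real type *)
ImageType[im1]

(* element-wise product of the raw matrices, used as the reference *)
reference = m1 m2;

(* largest absolute deviation of the image product from the reference *)
Max[Abs[ImageData[ImageMultiply[im1, im2]] - reference]]
```

The same comparison can be applied to any of the CUDA results below once they are converted back to an array.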

Using CUDAImageMultiply but without explicitly allocating memory on the GPU:

Needs["CUDALink`"]
CUDAImageMultiply[im1, im2] (*as expected*)

The above doesn't really give any speedup, probably because of the memory transfer between the CPU and the GPU. So, it is natural to try:

cimg1 = CUDAMemoryLoad[im1]
cimg2 = CUDAMemoryLoad[im2]

Allocate GPU memory for the product (I suspect the problem lies in the next step):

cimg3 = CUDAMemoryLoad[im2]

One gets a nice speedup without the memory transfer:

RepeatedTiming[CUDAImageMultiply[cimg1, cimg2, "OutputMemory" -> cimg3];]

But the result appears to have only, ummm, 256 shades of gray:

CUDAMemoryGet[cimg3] (*not as expected*)
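A quick way to quantify the loss of gray levels is to count the distinct pixel values and to look at how the data is stored on the GPU. This is only a diagnostic sketch: it assumes the session above (im1, im2, cimg1 and cimg3 already defined), and the helper name pixelData is made up for illustration.

```mathematica
(* hypothetical helper: CUDAMemoryGet may return an Image or a plain array *)
pixelData[x_] := If[ImageQ[x], ImageData[x], x]

(* number of distinct values in the CPU result and in the GPU result *)
CountDistinct[Flatten[pixelData[ImageMultiply[im1, im2]]]]
CountDistinct[Flatten[pixelData[CUDAMemoryGet[cimg3]]]]

(* storage type of the original image and of the GPU buffers *)
ImageType[im1]
CUDAMemoryInformation[cimg1]
CUDAMemoryInformation[cimg3]
```

If the GPU buffers report a byte type while the original image reports a real type, that would suggest the quantization happens when the images are loaded onto the GPU rather than in the multiplication itself.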

Does anyone have a fix for this? Or, even better, some CUDA or OpenCL code for the element-wise matrix product m1*m2 (not Dot)?
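For reference, a hand-written element-wise kernel could look roughly like the following. This is a sketch under stated assumptions, not a confirmed fix: it works on the raw matrices m1 and m2 in single precision instead of on the images, and the names src, elementwiseMultiply, multiply, gA, gB and gOut are all invented for this example.

```mathematica
Needs["CUDALink`"]

(* CUDA source: out[i] = a[i] * b[i], one thread per element *)
src = "
__global__ void elementwiseMultiply(float * a, float * b, float * out, mint n) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i < n) {
        out[i] = a[i] * b[i];
    }
}";

(* compile and load the kernel; 256 is the block size (an arbitrary choice) *)
multiply = CUDAFunctionLoad[src, "elementwiseMultiply",
   {{"Float", _, "Input"}, {"Float", _, "Input"},
    {"Float", _, "Output"}, _Integer}, 256];

(* copy the inputs to the GPU once, explicitly as single-precision floats *)
gA = CUDAMemoryLoad[m1, "Float"];
gB = CUDAMemoryLoad[m2, "Float"];

(* allocate the output buffer directly with the desired type and shape *)
gOut = CUDAMemoryAllocate["Float", {500, 500}];

(* run the kernel on GPU-resident memory, then copy the result back *)
multiply[gA, gB, gOut, 500*500];
result = CUDAMemoryGet[gOut];

(* deviation from the CPU product; limited by single-precision rounding *)
Max[Abs[result - m1 m2]]
```

Allocating the output with CUDAMemoryAllocate and an explicit "Float" type, rather than reusing a buffer loaded from an image, keeps the element type of all three buffers under direct control.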

Thanks

POSTED BY: W. Craig Carter