GROUPS:

Image Processing External Programs and Systems Wolfram Language Optimization Packages

W. Craig Carter, MIT

Posted 9 years ago

Hello,

I am trying to understand how to get CUDAImageMultiply to use the same precision as ImageMultiply. Here is a prototype example that illustrates the question:

    m1 = Table[Sin[4. Pi x/500] Sin[4. Pi y/500], {x, 500}, {y, 500}];
    m2 = RandomReal[{0.5, 1}, {500, 500}];
    im1 = Image[m1, ColorSpace -> "Grayscale"]
    im2 = Image[m2, ColorSpace -> "Grayscale"]
    ImageMultiply[im1, im2]

(The result is as expected.)

Using CUDAImageMultiply, but without explicitly allocating memory on the GPU:

    Needs["CUDALink`"]
    CUDAImageMultiply[im1, im2]

(Also as expected.)

The above doesn't really give any speedup, probably because of the memory transfer. So it is natural to try loading the images onto the GPU first:

    cimg1 = CUDAMemoryLoad[im1]
    cimg2 = CUDAMemoryLoad[im2]

Allocate GPU memory for the product (I suspect the problem lies in this step):

    cimg3 = CUDAMemoryLoad[im2]

One gets a nice speedup without the memory transfer:

    RepeatedTiming[CUDAImageMultiply[cimg1, cimg2, "OutputMemory" -> cimg3];]

But the result appears to have only, ummm, 256 shades of gray:

    CUDAMemoryGet[cimg3]

(Not as expected.)

Does anyone have a fix for this? Or, even better, some CUDA or OpenCL code for the elementwise matrix product m1*m2 (not Dot)?

Thanks
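For the elementwise product m1*m2 requested above, one possible starting point is a small hand-written kernel loaded through CUDAFunctionLoad. This is an untested sketch, not a confirmed fix for the 256-shades behavior. It assumes single precision ("Float") is acceptable and that the matrices are flattened to one-dimensional arrays before loading. The names elementwiseMultiply, mulFun, g1, g2, g3, and n are made up for illustration and are not part of CUDALink.

```wolfram
Needs["CUDALink`"]

(* CUDA source: one thread per element; out[i] = a[i]*b[i]. *)
(* mint is CUDALink's machine-integer type for _Integer arguments. *)
src = "
__global__ void elementwiseMultiply(float *a, float *b, float *out, mint n) {
    int i = threadIdx.x + blockIdx.x * blockDim.x;
    if (i < n) out[i] = a[i] * b[i];
}";

(* Compile and load the kernel with a block size of 256 threads. *)
mulFun = CUDAFunctionLoad[src, "elementwiseMultiply",
   {{"Float", _, "Input"}, {"Float", _, "Input"},
    {"Float", _, "Output"}, _Integer}, 256];

(* Assumption: data is flattened so that a 1D index covers every element. *)
n = Times @@ Dimensions[m1];
g1 = CUDAMemoryLoad[Flatten[m1], "Float"];
g2 = CUDAMemoryLoad[Flatten[m2], "Float"];
g3 = CUDAMemoryAllocate["Float", n];

(* The trailing n is meant as the number of threads to launch; *)
(* check this against the CUDAFunction documentation for your version. *)
mulFun[g1, g2, g3, n, n];

(* Copy the result back and compare with the CPU product. *)
result = Partition[CUDAMemoryGet[g3], Last[Dimensions[m1]]];
Max[Abs[result - m1 m2]]
```

Because the buffers are "Float", small differences from the double-precision product m1 m2 are to be expected. Keeping g1, g2, and g3 on the GPU between calls avoids the host-to-device transfer that seemed to dominate the first timing.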

