Automatic preprocessing of image data using UMAP in DimensionReduce

Posted 2 years ago

I have been getting really nice clustering results using the UMAP method in DimensionReduce (Mathematica 13) where the data is a list of images, for example:

rep = DimensionReduce[imagelist, 3, Method -> "UMAP", TargetDevice -> "GPU"];

Without specifying any other options (like FeatureExtractor), does the above command implement any kind of automatic preprocessing on the image data before applying UMAP? The reason I ask is that when I compare the Python implementation of UMAP on the same set of images (where the RGB values are converted to numpy arrays, with no other preprocessing) I get results that are consistently much worse. So it seems like there is something useful that the Mathematica algorithm is doing under the hood to the images. Would it be possible to find out the details?

Thanks, Mike

2 Replies

Thank you, that's helpful!

Hi Michael, you can explore the internals of the DimensionReducerFunction to check the preprocessor. Ideally you should be able to do

Information[_DimensionReducerFunction, "FeatureExtractor"]

but we have not hooked it up there yet. In the meantime you can check what the internal processor is doing (hover over each processor to see more info)

reducer[[1, "Processor"]]

[Image: reducer_processor]

and apply it to an image

reducer[[1, "Processor"]] @* reducer[[1, "Preprocessor"]] @ RandomImage[]

[Image: reduced_dataset]

Remember to use DimensionReduction instead of DimensionReduce in order to get the function and not the reduced data directly.
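
For concreteness, here is a minimal sketch of how the `reducer` used in the snippets above could be built. The random `imagelist` is only a hypothetical stand-in for the images from the question, and `pixelReducer` is a name made up for illustration. The last line assumes that the documented `"PixelVector"` feature extractor is a reasonable proxy for the raw-pixel input used on the Python side; it may still resize or normalize the images, so treat it as a rough comparison rather than an exact match.

```
(* hypothetical stand-in for the image list from the question *)
imagelist = Table[RandomImage[1, {32, 32}, ColorSpace -> "RGB"], 50];

(* DimensionReduction returns a DimensionReducerFunction, not the reduced data *)
reducer = DimensionReduction[imagelist, 3, Method -> "UMAP"];

(* applying the function gives the same kind of result as DimensionReduce *)
rep = reducer[imagelist];

(* inspect the internal steps, as described in the reply above *)
reducer[[1, "Preprocessor"]]
reducer[[1, "Processor"]]

(* assumption: forcing raw pixel values approximates the Python setup *)
pixelReducer = DimensionReduction[imagelist, 3, Method -> "UMAP",
   FeatureExtractor -> "PixelVector"];
```

If the embedding from `pixelReducer` looks much closer to the Python result than the one from `reducer`, that would suggest the automatic feature extraction accounts for most of the difference.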
