GROUPS:

Data Science Image Processing Wolfram Language Machine Learning Neural Networks

Posted 6 years ago

The Wolfram Neural Net Repository offers ResNet-50, a very successful net for identifying the main object in an image. After downloading the net, I wanted to learn about the structure of its NetEncoder:

    NetExtract[pred, "Input"]

Mathematica tells us that the type is "Image" and the "Image Size" is 224x224. I have two questions related to this:

1. How is it possible that the net can be successfully applied to larger images, e.g. 250x250, without any warning? Does this mean that the net automatically resizes the original image (e.g. from 250x250 pixels) to 224x224?

2. If one knows in advance that the images have a larger size, e.g. 500x500, is it "easily" possible to change the hyperparameters of ResNet-50 for larger images?

Thank you for your time.

POSTED BY: Wolfgang Hitzl
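One way to investigate the first question is to apply the extracted encoder on its own to a test image and look at the dimensions of the array it produces. The following is only a sketch: it assumes that `pred` in the post refers to the repository model named "ResNet-50 Trained on ImageNet Competition Data", and it uses a random 250x250 test image, which is not part of the original post.

```
(* Assumption: this repository name is the model the post calls pred *)
pred = NetModel["ResNet-50 Trained on ImageNet Competition Data"];

(* Extract the input encoder, as in the post *)
enc = NetExtract[pred, "Input"]

(* Hypothetical test image, 250x250 pixels, RGB *)
img = RandomImage[1, {250, 250}, ColorSpace -> "RGB"];

(* Apply the encoder alone and inspect the shape of the resulting array.
   If the spatial dimensions come out as 224 x 224, the encoder has
   conformed the image to its own "Image Size" before the net sees it. *)
Dimensions[enc[img]]
```

Comparing this result with `ImageDimensions[img]` shows directly whether the image was changed on the way into the net.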


Posted 6 years ago

An input size of 224 by 224 has been common practice since AlexNet: cropping 224x224 patches out of one 256x256 input yields roughly 32x32 = 1024 shifted samples per image. I am afraid the size is hard-coded in the input layer, so you need to crop the images yourself. You may want to try ImageAugmentationLayer, mentioned in this blog. Prepending this layer to your network will do the image augmentation for you.

POSTED BY: Shenghui Yang
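The manual cropping described in the reply can be sketched with standard image functions. This is an illustration under assumptions, not the reply author's code: the 256x256 test image and the variable names are hypothetical, and the random offsets simply use the 32-pixel margin that remains when a 224x224 patch is taken from a 256x256 image.

```
(* Hypothetical 256x256 RGB test image *)
img = RandomImage[1, {256, 256}, ColorSpace -> "RGB"];

(* Center crop: ImageCrop with a size specification crops symmetrically *)
center = ImageCrop[img, {224, 224}];
ImageDimensions[center]

(* One random crop: shift the 224x224 window by an offset of 0 to 32 pixels
   in each direction, so the window always stays inside the image *)
{dx, dy} = RandomInteger[{0, 32}, 2];
crop = ImageTake[img, {dy + 1, dy + 224}, {dx + 1, dx + 224}];
ImageDimensions[crop]
```

Drawing new offsets for every training example gives the kind of shifted samples the reply refers to; ImageAugmentationLayer is meant to do this inside the network instead of in preprocessing.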
