
Upload the SVHN dataset into the Wolfram Data Repository?


SVHN is a real-world image dataset for developing machine learning and object recognition algorithms with minimal requirements for data preprocessing and formatting. It is similar in flavor to MNIST (e.g., the images are of small cropped digits), but incorporates an order of magnitude more labeled data (over 600,000 digit images) and comes from a significantly harder, unsolved, real-world problem (recognizing digits and numbers in natural scene images). SVHN is obtained from house numbers in Google Street View images. Official Dataset website

Description of the Dataset:

 - 10 classes, 1 for each digit. Digit '1' has label 1, '9' has label 9
   and '0' has label 10.

 - 73257 digits for training, 26032 digits for testing, and 531131
   additional, somewhat less difficult samples, to use as extra training
   data.


 - Comes in two formats:

    1. Original images with character-level bounding boxes.
    2. MNIST-like 32-by-32 images centered around a single character (many of the images do contain some distractors at the sides).
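
The labeling convention above is easy to trip over, since digit '0' is stored as label 10 rather than 0. As a minimal sketch of how the labels could be normalized before training, assuming they have already been read into a plain list of integers (the helper name `svhn_label_to_digit` and the sample labels are made up for illustration and are not part of the dataset or of any library):

```python
# Sketch only: the helper name and the sample labels are hypothetical.

def svhn_label_to_digit(label: int) -> int:
    """Map an SVHN class label (1..10) to the digit it represents (0..9)."""
    if not 1 <= label <= 10:
        raise ValueError(f"SVHN labels run from 1 to 10, got {label}")
    # Label 10 stands for digit '0'; labels 1..9 are already the digit.
    return label % 10


# Split sizes as quoted in the dataset description.
SPLIT_SIZES = {"train": 73257, "test": 26032, "extra": 531131}
TOTAL_DIGITS = sum(SPLIT_SIZES.values())

sample_labels = [1, 9, 10, 5]  # hypothetical labels, not real data
digits = [svhn_label_to_digit(label) for label in sample_labels]

print(digits)        # digits with label 10 mapped to 0
print(TOTAL_DIGITS)  # combined size of the three splits
```

Summing the three splits gives 630,420 digits, which matches the "over 600,000" figure in the description.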

I want to upload this dataset to the Wolfram Data Repository and work with it in the same way the Wolfram team has done with the MNIST dataset. I don't know how to do it. Could you please help me?

POSTED BY: Megri Youcef
5 months ago

You can submit data for inclusion in the Data Repository by going to File > New > Data Resource.

Alternatively, you can see this Stack Exchange article, which covers how to do this:

POSTED BY: Sean Clarke
4 months ago

Hi Sean, is there any difference between the method suggested on MSE and the one described by Stephen Wolfram in his blog:

POSTED BY: Kapio Letto
4 months ago

I wouldn't worry about it. The two methods don't really differ much, except in how you bring up the notebook that you fill out and submit.

If you have any questions about any of the fields, please let us know.

POSTED BY: Sean Clarke
4 months ago
