
Tuning YOLOv2 object detection neural networks on custom datasets

Posted 3 years ago

Hello community, please enjoy my latest project! I hope you can find it useful, and feel free to provide some feedback. I am also quite new to organizing libraries for the Wolfram Language, so any feedback there will be greatly appreciated!

Keywords: YOLO, YOLOv2, fine-tuning, pre-trained, object detection, neural networks, machine learning

POSTED BY: Alec Graves
4 Replies

Dear Alec, I am currently working on a wood-defect detector for my bachelor's graduation thesis, and I am trying to implement it using YOLOv2.
I went through your packages and your code; you did a great job writing them, and I would like to mention that they are the only helpful resources I found on the internet for training an object detector on custom data using Mathematica.
However, each time I try your packages and run the BuildYoloLoss package, I get the following error: FunctionLayer::compilerr: Cannot interpret ThreadingLayer[<>][#1, #2] & as a network. I am running Version 13.0.1, and since you wrote your code on earlier versions, I think there is an issue related to the syntax of the FunctionLayer specification in the GIoU loss package. I would appreciate your help with that, as well as any other resources or insights on how to build similar loss functions and perform transfer learning with recent YOLO versions in Mathematica, as I can't find any. Thanks in advance.

POSTED BY: Bassel Harby
Posted 1 year ago

Thanks for letting me know that it is not working. I have not tested this on 13, but I just downloaded 13.2 and will take a look at it today.

POSTED BY: Alec Graves
Posted 1 year ago

It looks like some things changed in 13 about what can be compiled in a FunctionLayer.

In 12.3, I had to write some ugly code to get the anchor-box-to-box-output conversion network to compile in a FunctionLayer, so it used to look like this:

FunctionLayer[Apply[Function[{anchorsIn, grid, conv}, Block[
(* This is a really bad function because in 12.2, FunctionLayer was not compiling nice functions. *)
(* ... *)
(* We assume the net has been reshaped to dimensions n x Anchors x dim x dim.  *)
{
  boxes = conv[[1 ;; 4]],
  classPredictions = conv[[6 ;;]],
  confidences = LogisticSigmoid[conv[[5]]]
},
(* We first need to construct our boxes to find the best fits. *)

Block[{
  boxesScaled =
      Join[((1.0/(inputSize/32.0)*Tanh[boxes[[1 ;; 2]]] // TransposeLayer[{2 <-> 3, 3 <-> 4}]) +
          (grid // TransposeLayer[{1 <-> 3, 3 <-> 2}]) // TransposeLayer[{4 <-> 2, 3 <-> 4}]),
        (Transpose @ (5*LogisticSigmoid[ boxes[[2 ;; 3]]]) * anchorsIn // Transpose)]},
  <|
    "boxes" -> boxesScaled,
    "confidences" -> confidences,
    "classes" -> classPredictions
  |>
]]]]]

In 13, it seems that compiling nested Blocks/Modules no longer works, but we can now simplify this expression to a single Module without getting compilation errors.

FunctionLayer[Apply[Function[{anchorsIn, grid, conv}, Module[
    (* We assume the net has been reshaped to dimensions n x Anchors x dim x dim.  *)
    {
      boxes = conv[[1 ;; 4]],
      classPredictions = conv[[6 ;;]],
      confidences = LogisticSigmoid[conv[[5]]]
    },
    (* We first need to construct our boxes to find the best fits. *)
      <|
        "boxes" -> Join[((1.0/(inputSize/32.0)*Tanh[boxes[[1 ;; 2]]] // TransposeLayer[{2 <-> 3, 3 <-> 4}]) +
              (grid // TransposeLayer[{1 <-> 3, 3 <-> 2}]) // TransposeLayer[{4 <-> 2, 3 <-> 4}]),
            (Transpose @ (5*LogisticSigmoid[ boxes[[2 ;; 3]]]) * anchorsIn // Transpose)],
        "confidences" -> confidences,
        "classes" -> classPredictions
      |>
    ]]]]
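Reading the decoding off the block above (this is just my summary of what the code computes, with $\sigma$ the logistic sigmoid and $\odot$ an elementwise product against the anchor dimensions):

$$b_{xy} = \frac{\tanh(t_{xy})}{\text{inputSize}/32} + \text{grid}, \qquad b_{wh} = 5\,\sigma(t_{wh}) \odot \text{anchors}, \qquad \text{confidence} = \sigma(t_{\text{conf}})$$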

Lastly, it seems the convention for calling ThreadingLayer and FunctionLayer with multiple arguments changed:

Before, you could write

ThreadingLayer[...][x, y]

but in 13 you need to do:

ThreadingLayer[...][{x, y}]
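To make that concrete, here is a tiny self-contained sketch in the 13 style (the Plus body and the layer itself are placeholders for illustration, not code from the packages):

addTwo = FunctionLayer[Apply[Function[{x, y},
    (* in 13, the two inputs are passed to ThreadingLayer as a single list *)
    ThreadingLayer[Plus][{x, y}]]]];

(* under 12.x, the body would instead have been ThreadingLayer[Plus][x, y] *)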

And now it works again:

[Image: training-progress indicator]

[Image: apples labelled with predicted bounding boxes]

I have pushed these changes to the Wolf Detector GitHub repo, so older versions are probably broken now, but it works again in 13.2. It is also worth noting that 13 added the built-in function TrainImageContentDetector, though I don't know what algorithm it uses. Maybe someone from WRI can chime in on that?

POSTED BY: Alec Graves

Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and your profile is now distinguished by a Featured Contributor Badge, displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD
