Message Boards Message Boards

6
|
2461 Views
|
11 Replies
|
39 Total Likes
View groups...
Share
Share this post:

Can Wolfram's new compiler technology speed up this AABB tree code?

Posted 9 months ago
POSTED BY: Greg Hurst
11 Replies
Posted 9 months ago
POSTED BY: Greg Hurst
Posted 9 months ago

I was able to refactor the code to get OverlappingBBoxes mapped over the 100000 testbboxes to go from 44.85 seconds down to 1.3 seconds (a 34.5x speedup). I suspect there still room for improvement.

Code attached this time rather than being pasted in the post.

Improvements:

  • I changed the flow of my code a bit:
    • Place bboxes that overlap with the splitting plane into the left node (as opposed to keeping separate).
    • For bboxes in a node, sort them by their splittingdim+1 dimension. So now in OverlappingBBoxes we can binary search and quickly eliminate ~50% bboxes to manually check.

In lieu of proper documentation I found what I believe to be the source code at

FileNameJoin[{$InstallationDirectory, "SystemFiles", "Components", "Compile"}]

Based on what I saw:

  • I think Native`UncheckedBlock is similar to "RuntimeOptions" -> "Speed" in Compile.
    • With help from a 2019 WTC presentation I was able to inspect the IR with Compile`CompileToIR and noticed tons of check were omitted when this was used.
    • Not all checks are omitted with this though. NullNode exceptions in binary trees are still checked for example.
    • Is there any other way to turn off more checks?
  • Based off the MSE answer here we have some control over what gets inlined.
    • Based on FunctionInlineInformation.m in the source code, we can use the option "InlinePolicy" -> "MaximumCallCount" -> 1 to inline a function that is only called in one spot.
    • I don't know if functions are inlined at a later step, so perhaps this is superfluous.
  • list[[i]] seems to avoid bounds checks with Native`UncheckedBlock, but multi arg Part does not.
    • Looks like with UncheckedBlock, unary Part gets converted to

      Native`Unchecked[Compile`GetArrayElement]
      

but multi arg Part does not. * I refactored my AABB node to store a flat list rather than a matrix.

Attachments:
POSTED BY: Greg Hurst
Posted 9 months ago

I tried to create a 'more natural' AABB tree node using TypeDeclaration. This ended up giving a 2x slowdown during construction -- not sure if I used TypeDeclaration properly.

Basically it makes more sense to tag the data in the node, e.g.

<|"splitloc" -> "Real64", "splitdim" -> "MachineInteger", "min" -> "Real64", 
  "max" -> "Real64", "indices" -> "PackedArray"::["MachineInteger", 1]|>

The return value is also invalid (I think):

tree = AABBTree[bboxes]; // AbsoluteTiming
(* {0.237855, Null} *)

tree // FullForm
(* DataStructure["BinaryTree`AABBNode",$Failed] *)

Code

nodedec = TypeDeclaration["Product", "AABBNode", <|"splitloc" -> "Real64", "splitdim" -> "MachineInteger", "min" -> "Real64", "max" -> "Real64", "indices" -> "PackedArray"::["MachineInteger", 1]|>];

$btype = "BinaryTree"::["AABBNode"];

With[{$btype = $btype, $max = $MaxMachineNumber, $min = -$MaxMachineNumber},
decl2 = FunctionDeclaration[
    centerNode3D,
    Typed[{"MachineInteger", "PackedArray"::["Real64", 3], "PackedArray"::["MachineInteger", 1], "Real64"} -> $btype] @ Function[{d, bboxes, inds, x},
       Module[{min, max, data},
         min = If[Length[inds] > 0, Min[bboxes[[inds, d, 1]]], $max];
         max = If[Length[inds] > 0, Max[bboxes[[inds, d, 2]]], $min];
         data = CreateTypeInstance["AABBNode", <|"splitloc" -> x, "splitdim" -> d, "min" -> min, "max" -> max, "indices" -> inds|>];
         CreateDataStructure["BinaryTree", data]
       ]
    ]
]];

With[{$btype = $btype, $intvec = "PackedArray"::["MachineInteger", 1]},
pureAABBTree = Function[{Typed[bboxes, "PackedArray"::["Real64", 3]]},
    Module[{iAABBTree, inds},
       iAABBTree = Function[{Typed[d, "MachineInteger"], Typed[inds, $intvec]},
         Module[{len, n, x, loc, r=0, l=0, c=0, j, Sleft, Scent, Sright, node},
          len = Length[inds];

          n = Ordering[bboxes[[inds, d, 1]]][[Quotient[len+1, 2]]];
          x = bboxes[[inds[[n]], d, 1]];

          If[len <= 100, Return[TypeHint[centerNode3D[d, bboxes, inds, x], $btype]]];

          loc = ConstantArray[0, len];

          Do[
              j = inds[[i]];
              If[bboxes[[j, d, 1]] < x,
                 l++;
                 loc[[i]] = 1,
                 If[bboxes[[j, d, 2]] > x,
                   r++;,
                   c++;
                   loc[[i]] = 2
                 ]
              ],
              {i, len}
          ];

          Sleft = ConstantArray[0, l];
          Scent = ConstantArray[0, c];
          Sright = ConstantArray[0, r];

          r = l = c = 1;
          Do[
              j = inds[[i]];
              If[loc[[i]] === 0,
                 Sright[[r++]] = j;,
                 If[loc[[i]] === 1,
                   Sleft[[l++]] = j,
                   Scent[[c++]] = j;
                 ]
              ],
              {i, len}
          ];

          node = TypeHint[centerNode3D[d, bboxes, Scent, x], $btype];

          If[l > 1, BinaryTree`SetLeft[node, iAABBTree[Mod[d+1, 3, 1], Sleft]]];
          If[r > 1, BinaryTree`SetRight[node, iAABBTree[Mod[d+1, 3, 1], Sright]]];

          TypeHint[node, $btype]
         ]
       ];

       inds = Typed[KernelFunction[InversePermutation], {$intvec} -> $intvec][Ordering[bboxes[[All, 1, 1]]]];

       TypeHint[iAABBTree[1, inds], $btype]
    ]
]];

AABBTree = 
 FunctionCompile[{nodedec, decl2}, pureAABBTree, 
  CompilerOptions -> {"AbortHandling" -> False, 
    "LLVMOptimization" -> "ClangOptimization"[3], 
    "OptimizationLevel" -> 3}];
POSTED BY: Greg Hurst

That seems like a valid use of TypeDeclaration. However, I've myself not worked yet with the "Product" type in any significant examples so I can't comment if it is a suitable use for that type.

I'm not sure why it's that bit slower either, but the returned value you get I think is valid. What's failed I think is the attempt to get its full form. This is the same behavior you commented separately with the other questions (point no. 2). I tried to answer it there too

Posted 9 months ago

Here are some questions I gathered while working through compiling this code. If these warrant a new post, please let me know!

  • Are DataStructures in compiled code documented? I saw zero mention / examples of it other than a mention of Typed on each data structure ref page: Typed[x,"BinaryTree"] give x the type "BinaryTree".
  • Is it possible to return a binary tree that is not "BinaryTree"::["InertExpression"]? For example I tried returning "BinaryTree"::["PackedArray"["Real64", 2]] and the return value was of the form DataStructure[some internal symbol, $Failed]. Code compiled fine though.
  • Are not all data structure methods supported in the compiler? For the binary tree I tried node["SetLeft", expr] and the compiler told me "SetLeft" is an unknown method. Luckily in this case BinaryTree`SetLeft[node, expr] works just fine.
  • When would I use ExtensibleVector v.s. DynamicArray? The functionality of DynamicArray seems to be a proper subset of ExtensibleVector, so why would I ever use DynamicArray? Performance perhaps?
  • If ExtensibleVector / DynamicArray are storing integers only, will the footprint be as efficient as having a packed array? Seems I can treat the "Elements" as "ListVector" or "PackedArray" seamlessly, but I don't know if I'm copying data when doing so.
  • Is it possible to return a list of elements of different types? "ListVector" can return a ragged list as long as all elements are the same type.
  • Documentation of the compiler seems quite sparse.
    • Took me awhile to discover "Rule" is a compilable type, I didn't see it mentioned anywhere in the docs, other than a couple examples.
    • CompilerOptions ref page has little to no information in it. I found an MSE post where people found some info on it though.
    • All examples seem to be toy examples illustrating a certain feature. Those are great, but some 'real world' examples would be super helpful
POSTED BY: Greg Hurst

At the moment I only have an answer for two of these questions

  • Point no. 2: You can definitely return data structures of any compiled type, from compiled code (as the evaluator doesn't know about compiled types, only about expressions). The front end renders them differently on screen, and nothing seems to work with them outside compiled code. I think the expression you mention with a $Failed is just a failure when attempting to get the full form of the object, but it's perfectly valid for its purpose. I wonder if the failure to show the full form is meaningful and on purpose. Even if it's compiled stuff, you can for example successfully show the full form of a compiled function. I'll try to find out more

  • Point no. 6: I think the way to achieve that is with the type "InertExpression". It allows you to put anything inside. On the counterpart, some functions won't work if you apply them to elements of the type "InertExpression", so I understand this is one of the reasons why you wouldn't just make "ListVector"::["InertExpression"] all the time if you know you have uniform types. (There may be more reasons that I'd also like to know)

Maybe someone else could give an answer for the rest of points or I can try to find out more and ask

Posted 9 months ago

Thanks for the info.

For point 4 I think the answer is use DynamicArray when you are only appending, for performance reasons.

I found the source code for data structures in

FileNameJoin[{$InstallationDirectory, "SystemFiles", "Components", 
  "Compile", "TypeSystem", "Declarations", "SpecificTypes", 
  "DataStructures"}]

and by the looks of it, it just is performance reasons.

And for point 5, I think the data will be treated like packed, based on looking through the code.

POSTED BY: Greg Hurst
Posted 9 months ago
POSTED BY: Greg Hurst
Posted 9 months ago
POSTED BY: Greg Hurst

Thank you too for posting the results!

You are completely right that you can directly give the tree the appropriate specific type, here "BinaryTree"::["PackedArray"["Real64", 2]]. But there is a subtlety. You are passing to the function “OverlappingBBoxes” a tree that was created in top-level code. Data structures that are created in top-level code are made to work with expressions, so that they can work outside of compiled code (because that’s what the evaluator knows about).

This means that you’re passing this tree that has type "BinaryTree"::["InertExpression”]. If, at the same time, you declare the function to receive a tree of type "BinaryTree"::["PackedArray"["Real64", 2]], the system will not find a matching declaration for the argument you passed and will raise an inconsistency error.

One solution can be to construct your tree with compiled code, and directly use the specific compiled type "BinaryTree"::["PackedArray"["Real64", 2]] at construction time. (With compiled code you have the option to use the compiled types, in addition to the more general “InertExpression" that works in compiled and uncompiled code.) So for example, you can declare the function "centerNode3D" with type:

Typed[{"MachineInteger", "PackedArray"::["Real64", 2], "Real64"} -> "BinaryTree"::["PackedArray"["Real64", 2]]]

and type the "tree" argument in "pureOverlappingBBoxes" as

Typed[tree, "BinaryTree"::["PackedArray"["Real64", 2]]]

and this will allow you to get rid of the casts.

As an alternative, this second type annotation can be replaced by typing the value given to "data" in "pureOverlappingBBoxes":

data = TypeHint[node["Data"], "PackedArray"::["Real64", 2]];

This is a good example of one of the points I commented about annotating values in the body of the function to guide the inferencer. You have the option to type somewhere and/or somewhere else, as long as it's sufficient and consistent. (I precisely used this alternative to have the inferencer work out and inform me of the type of the tree, because I didn't know its format!)

That said, in my computer the code with and without the casts is running pretty much with the same timing. Let me know if you see any differences

Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract