Group Abstract Group Abstract

Message Boards Message Boards

Skip-lists: A missing ingredient to WL's Multicomputational paradigm?

Posted 2 years ago
Attachments:
POSTED BY: Brad Klee
5 Replies
Posted 2 years ago

Compare with new code, where optimization would begin to become noticeable on lists with more than 5000 elements (100+ pairwise actors).

Testing on updated hard square code reveals this prediction to be quite accurate:

timing test

Left uses a naive algorithm with non-optimal complexity due to Sort, while the right should achieve optimal complexity using Skip12.

With only 16 masses, the left notebook runs noticeably faster. With 100 masses, the right notebook runs barely faster. With five hundred masses, the optimal complexity algorithm using Skip12 runs noticeably faster (at least 2x). Another method using AVLTree DataStructure was not implemented, due to time constraints and lack of KeyDrop method.

Eventually the code will be distributed through WFR.

POSTED BY: Brad Klee
Posted 2 years ago

Admittedly, the following sort of works:

size = 2^10;
randData = k /@ RandomInteger[size, size];

AbsoluteTiming[Fold[
  ResourceFunction["IndexedQueue"][
    heap, #2, "Remove"] &,
  Fold[
   ResourceFunction["IndexedQueue"][
     heap, #2, "Enqueue"] &,
   Clear[heap];
   heap = ResourceFunction["IndexedQueue"][{
      "InitialSize" -> Length[randData],
      "NullElement" -> Missing[],
      "ComparisonFunction" -> Greater}, {},
     "Initialize"],
   randData],
  RandomSample[Range[size]][[1 ;; size]]
  ]];

ResourceFunction["IndexedQueue"][heap, "EmptyQ"]
Out[]= True

Even though, obviously:

DeleteCases[heap[[1]], Alternatives[Missing[], -1]]
Out[]  ={k[954], k[615]}

Possibly this is just "multicomputational junk", meaning path-dependent artifacts, which are left over from the insert / delete process, have no inherent meaning, and shouldn't be there for optimization reasons.

POSTED BY: Brad Klee
Posted 2 years ago

Additionally we should compare with AVL trees, which can do something similar:

SeedRandom[23434];
randKeyedData = 
  MapIndexed[Rule @@ {k[#2[[1]]], #1} &, RandomInteger[2^15, 2^15]];
ds = CreateDataStructure["AVLTree", Null, Order[Last[#1], Last[#2]] &];
times = AbsoluteTiming[ds["Insert", #]][[1]] & /@ randKeyedData;
times2 =  AbsoluteTiming[ds["Remove", #]][[1]] & /@ 
   RandomSample[randKeyedData];
ds["Elements"]

ListLogPlot[{times, times2,
  skiplistInsertionTimes,
  skiplistDeletionTimes},
 PlotStyle -> {Green, Brown, Red, Orange},
 PlotRange -> All]

log comparison

Compiled data structure times are in brown / green, while top-level SkipList times are now red / orange. It seems very probable that compile time advantage could provide the 10-50x speedup SkipList would need to compete against AVL trees.

POSTED BY: Brad Klee

enter image description here -- you have earned Featured Contributor Badge enter image description here Your exceptional post has been selected for our editorial column Staff Picks http://wolfr.am/StaffPicks and Your Profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: EDITORIAL BOARD

Re "[ This could be my fault If I've misunderstood how to use IndexedQueue, someone please let me know if they can explain this 6893 number!! ]": You could ask the author but I doubt said author will have a chance to investigate any time soon.

POSTED BY: Daniel Lichtblau