
[LiVE] Functional Dataflow Coding Session #4: Tries and identicons for data

Posted 5 years ago

Please join Functional Dataflow Coding Session #4, live on Tuesday at 5 ET, on https://www.twitch.tv/wolfram

Direct link: https://wolfr.am/FtPdPfKZ

If you missed the previous sessions, you can watch them through the Wolfram YouTube Channel: https://wolfr.am/FCPMQHCM

This channel is related to my forthcoming book "Functional Dataflow", which is based largely on functional approaches to data wrangling and structuring in WL, rather than on specific statistical methods, using a real-world dataset.

We'll discuss two self-contained topics and a wrap-up that combines them.

  • A new implementation of trie operators that can take a function parameter to gather statistics as the trie is built from a list of lists.

    • Such tries have a variety of applications, such as preprocessing documents into suffix tries for efficient string matching: algorithms leveraging tries can match in O(s+t) time, versus O(s × t) for the naive method.

    • Efficient indexing with statistics of general sequence data, such as mobile app user flows.

  • Experimental Flag Identicons: visual representations to tag and distinguish data layers based on filesystem paths, so that data layers sharing the same path prefix will have common elements (see screenshot)

[Screenshot 1: flag identicons]
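To make the idea of a trie operator with a function parameter concrete, here is a minimal sketch in WL. It is not the session's `funcTrie`: the name `trieSketch`, the `"stat"` and `"children"` keys, and the sample `flows` data are all assumptions made for illustration. It builds the trie as nested Associations from a list of lists and applies a caller-supplied function `f` to the sequences that reach each node.

```wolfram
(* Hypothetical sketch, not the session's funcTrie. *)
ClearAll[trieSketch];

(* Each node stores f applied to the remaining suffixes of all sequences
   that reach it, plus child nodes keyed by the next element. *)
trieSketch[f_][seqs_List] := <|
  "stat" -> f[seqs],
  "children" -> Map[
    trieSketch[f],
    (* drop exhausted sequences, then group by first element, keeping the rest *)
    GroupBy[DeleteCases[seqs, {}], First -> Rest]
  ]
|>;

(* Made-up user flows; Length counts how many flows pass through each node. *)
flows = {
  {"home", "search", "item"},
  {"home", "search", "cart"},
  {"home", "profile"}
};
trie = trieSketch[Length][flows];

(* Number of flows that pass through home -> search *)
trie["children"]["home"]["children"]["search"]["stat"]  (* 2 *)
```

This version builds the trie top-down with `GroupBy` rather than inserting one sequence at a time, so the session's implementation may well differ. Swapping `Length` for another function changes the statistic gathered at each node without changing the traversal.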

Wrap-up: showing the trie operator's use in associating the graphics strips of the identicons with each path component (screenshot 2).

fileTokens[funcTrie[AssociationMap[flagIdenticon /* smallStripGr /* Framed]]] // Normal

[Screenshot 2: trie of path components with identicon strips]
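The prefix-sharing property of the identicons can be sketched in a similar spirit. This is not the session's `flagIdenticon` or `smallStripGr`: `prefixColors` and `stripSketch` are hypothetical names, the example paths are made up, and hashing each cumulative path prefix to a hue is an assumed design. It is chosen only to show how paths with a common prefix end up with common leading stripes.

```wolfram
(* Hypothetical sketch, not the session's flagIdenticon. *)
ClearAll[prefixColors, stripSketch];

(* One color per cumulative prefix:
   "a/b/c" -> colors for "a", "a/b", "a/b/c". *)
prefixColors[path_String] := Module[{parts, prefixes},
  parts = StringSplit[path, "/"];
  prefixes = StringRiffle[#, "/"] & /@ Rest[FoldList[Append, {}, parts]];
  Hue[Mod[Hash[#], 360]/360.] & /@ prefixes
];

(* Draw the colors as a horizontal strip of unit squares. *)
stripSketch[path_String] := Graphics[
  MapIndexed[
    {#1, Rectangle[{First[#2] - 1, 0}, {First[#2], 1}]} &,
    prefixColors[path]
  ],
  ImageSize -> Small
];

(* Two made-up paths sharing the prefix "data/raw" get the same first two stripes. *)
Take[prefixColors["data/raw/a.csv"], 2] ===
  Take[prefixColors["data/raw/b.csv"], 2]  (* True *)
```

With only 360 possible hues, unrelated prefixes can occasionally collide on the same color. A real identicon would presumably draw on more of the hash than a single hue.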

POSTED BY: Alan Calvitti
Posted 5 years ago

Dear Alan,

Thank you for your Livecoding sessions.

I am catching up on them, but found it difficult to extract the links to your notebooks and prerequisites from the videos, i.e. the full URLs to Google Drive.

Would it be possible for you to share the links per session in this community?

Alternatively, Wolfram Research could share all file links in the YouTube show notes as well.

Cheers,

Dave

POSTED BY: Dave Middleton