
Can DeepMind's Learned Model Be Recreated in Wolfram?

Posted 3 years ago

I am fascinated by this DeepMind paper on MuZero, whose "learned models" have produced better-than-AlphaZero results in logic games, an approach that applies to autonomous robots among many other applications.

My question is this: can this be built in the Wolfram Language using NetGraph or some other means?

Paper: https://arxiv.org/abs/1911.08265

I am willing to put up a bounty of 200 dollars, in case anyone who can do this wants some minor compensation for the time spent showing the community how this can be reproduced in Wolfram. I am not sure how to arrange that, but if you need payment, please email me at djtelicloud (at) gmail (dot) com, because I can only give the bounty to one person. I will also be attempting this myself and will post my results here for free.

POSTED BY: David Johnston
4 Replies
Posted 3 years ago

Hi. Did you ever find an implementation of the MuZero paper with Wolfram Language/Mathematica?

POSTED BY: Lenny Johnson

Sometimes when a press release uses "Neural Net" or "AI", it is sweeping a lot under the rug. A feed-forward neural network is a static filter whose coefficients are determined empirically by iterative optimization, referred to as "training" in the popular literature. Mathematically, such a network can be viewed as a probability density function. Although "neural nets" were popularized decades ago by simple models of synaptic communication, present-day biological neural theories are well beyond static feed-forward models.
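To make the "static filter" point concrete, here is a minimal sketch in plain Python, not taken from the paper. It assumes a hypothetical one-weight "network" y = w * x and fits w by gradient descent on made-up samples; the names `train` and `predict` are illustrative only.

```python
# Toy illustration (assumed example): a one-weight "network" y = w * x.
def train(samples, lr=0.1, steps=200):
    """Fit w by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(steps):
        # Gradient of mean((w*x - y)^2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= lr * grad
    return w


def predict(w, x):
    # Once trained, the filter is static: no state carries over between calls.
    return w * x


# Made-up data generated by y = 2 * x
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = train(samples)
```

The iteration happens only inside `train`; `predict` is a fixed function of its input, which is what "static" means here.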

Today's advanced systems (e.g., effective autonomous systems) are dynamic. A dynamic neural network is composed of elements wired with feedback, so the system has an initial state and a steady state. In continuous time, you can think of this as a set of transistors wired with assorted feedback to produce a certain kHz frequency that depends on the input. We represent these wirings either as a diagram or as a system of differential equations. For discrete input, we use step-dependent stochastic difference equations:

f[t]=g[f[t-1],h[t-1],...] with f[0]=initial input

Here, feedback occurs during runtime evaluation of the function, not just during the determination of the function coefficients, where a different sort of feedback might be used.
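As a rough sketch of that recurrence in plain Python: the update rule `leaky` below is a hypothetical choice of g (a deterministic leaky integrator, with no stochastic term), picked only because its steady state is easy to see.

```python
def run_recurrence(g, f0, h_inputs):
    """Iterate f[t] = g(f[t-1], h[t-1]) starting from f[0] = f0."""
    states = [f0]
    for h_prev in h_inputs:
        # Feedback at evaluation time: the previous output is fed back in.
        states.append(g(states[-1], h_prev))
    return states


def leaky(f_prev, h_prev, decay=0.5):
    # Hypothetical g: keep half of the previous state and add the new input.
    return decay * f_prev + h_prev


# Constant input of 1.0; the fixed point of f = 0.5 * f + 1 is f = 2.
states = run_recurrence(leaky, 0.0, [1.0, 1.0, 1.0, 1.0])
print(states)  # [0.0, 1.0, 1.5, 1.75, 1.875]
```

The sequence starts at the initial state 0 and approaches the steady state 2, which is the "initial state and steady state" behavior described above.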

POSTED BY: Richard Frost

Regarding "Can DeepMind's Learned Model Be Recreated in Wolfram?"

There are two technical requirements: [1] A description (e.g. diagram) of their dynamic system of probability density functions, and [2] their training algorithm and data set.
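On requirement [1], the linked paper describes the model as three learned functions: a representation function that maps observations to a hidden state, a dynamics function that maps a hidden state and an action to a reward and a next hidden state, and a prediction function that maps a hidden state to a policy and a value. The sketch below only shows how those pieces connect when unrolled. The three stand-in functions are made-up arithmetic, not trained networks, and all names are assumptions.

```python
def unroll(representation, dynamics, prediction, observation, actions):
    """Unroll a learned model along a sequence of hypothetical actions."""
    state = representation(observation)
    policy, value = prediction(state)
    # Placeholder reward of 0.0 at the root, since no action has been taken yet.
    trajectory = [(0.0, policy, value)]
    for action in actions:
        reward, state = dynamics(state, action)
        policy, value = prediction(state)
        trajectory.append((reward, policy, value))
    return trajectory


# Stand-in functions (assumptions for illustration, not trained networks)
def representation(observation):
    return float(sum(observation))


def dynamics(state, action):
    next_state = state + action
    reward = 1.0 if next_state > 3 else 0.0
    return reward, next_state


def prediction(state):
    policy = [0.5, 0.5]  # uniform over two hypothetical actions
    value = state / 10.0
    return policy, value


trajectory = unroll(representation, dynamics, prediction, [1, 1], [1, 1])
```

In a Wolfram version, each stand-in would presumably be its own network, with the unrolling loop written around them; the training algorithm and data in requirement [2] are not covered by this sketch.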

There is also one financial requirement: A "bounty" or contract offer for an amount commensurate with the level of effort.

POSTED BY: Richard Frost
POSTED BY: David Johnston