6148 Views

4 Replies

2 Total Likes

GROUPS:

Data Science Game Development Wolfram Language Machine Learning Neural Networks

Can DeepMind's Learned Model Be Recreated In Wolfram?

David Johnston, Nvidia

Posted 3 years ago

I am fascinated by this paper from DeepMind on MuZero, whose "learned models" have produced better-than-AlphaZero results for logic gameplay. The approach applies to autonomous robots, among many other applications.

My question is this: can it be built in Wolfram using NetGraph or some other construct?

Paper: https://arxiv.org/abs/1911.08265

I am willing to put up a bounty of 200 dollars, in case anyone who can do this would like some minor compensation for the time spent showing the community how it can be reproduced in Wolfram. I am not sure how best to arrange that, but if you want the payment, please email me at djtelicloud (at) gmail (dot) com, because I can only give one bounty, to one person who accomplishes this. I will also be attempting this myself and will post my results here for free.

POSTED BY: David Johnston

Posted 3 years ago

Hi. Did you ever find an implementation of the MuZero paper with Wolfram Language/Mathematica?

POSTED BY: Lenny Johnson

Posted 3 years ago

Sometimes when a press release uses "Neural Net" or "AI", it is sweeping a lot under the rug. A feed-forward neural network is a static filter whose coefficients are determined empirically by iterative optimization, referred to as "training" in the popular literature. Mathematically, such a network can be viewed as an estimate of a probability density function. Although "neural nets" were popularized decades ago by simple models of synaptic communication, present-day biological neural theories are well beyond static feed-forward models.

Today's advanced systems (e.g., effective autonomous systems) are dynamic. A dynamic neural network is composed of elements that are wired with feedback, so the system has an initial state and a steady state. In continuous time, you can think of this as a set of transistors wired with assorted feedback to produce an output frequency (say, in the kHz range) that depends on the input. We represent these wirings with either a diagram or a system of differential equations. For discrete input we use step-dependent stochastic difference equations

f[t] = g[f[t-1], h[t-1], ...] with f[0] = initial input

Here, feedback occurs during runtime evaluation of the function, not just during the determination of the function coefficients, where a different sort of feedback might be used.
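As a minimal sketch of the discrete-time case, the recurrence f[t] = g[f[t-1], h[t-1]] can be iterated directly. Everything specific here is an assumption made for illustration: g is taken to be a leaky linear update, and the names simulate, a, b and noise_std are hypothetical. Python is used only to keep the example runnable; nothing in it is specific to Python.

```python
import random

def simulate(inputs, f0, a=0.5, b=1.0, noise_std=0.0, seed=0):
    """Iterate f[t] = a*f[t-1] + b*h[t-1] + noise, starting from f[0] = f0.

    The linear form of g is an assumption for illustration only.
    """
    rng = random.Random(seed)
    f = [f0]  # f[0] is the initial state
    for h_prev in inputs:
        # Optional stochastic term; zero by default so the run is deterministic.
        eps = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        # Feedback: the previous output f[-1] is fed back into the update.
        f.append(a * f[-1] + b * h_prev + eps)
    return f

# Constant input h = 1.0: the state moves from the initial state 0.0
# toward the steady state b*h/(1 - a) = 2.0.
trajectory = simulate(inputs=[1.0] * 30, f0=0.0)
print(trajectory[0], trajectory[-1])
```

With |a| < 1 and a constant input, the feedback loop settles at a fixed point, which is the "initial state and steady state" behaviour described above. Setting noise_std above zero gives the stochastic variant.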

POSTED BY: Richard Frost

Posted 3 years ago

Regarding "Can DeepMind's Learned Model Be Recreated In Wolfram?": There are two technical requirements: [1] a description (e.g., a diagram) of their dynamic system of probability density functions, and [2] their training algorithm and data set. There is also one financial requirement: a "bounty" or contract offer for an amount commensurate with the level of effort.
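On requirement [1], the MuZero paper describes its model as three learned functions: a representation function (observation to hidden state), a dynamics function (hidden state and action to reward and next hidden state), and a prediction function (hidden state to policy and value). The sketch below only shows how those three pieces are wired together when the model is unrolled over a sequence of actions. It is not DeepMind's implementation: the weights are random and untrained, the layer sizes are arbitrary, each function is a single small layer, and all names are hypothetical. The tree search and the training loop are left out.

```python
import math
import random

OBS_DIM, STATE_DIM, NUM_ACTIONS = 6, 4, 3  # arbitrary sizes, for illustration
rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random, untrained placeholder weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

W_repr = rand_matrix(STATE_DIM, OBS_DIM)
W_dyn = rand_matrix(STATE_DIM, STATE_DIM + NUM_ACTIONS)
w_reward = rand_matrix(1, STATE_DIM + NUM_ACTIONS)[0]
W_policy = rand_matrix(NUM_ACTIONS, STATE_DIM)
w_value = rand_matrix(1, STATE_DIM)[0]

def representation(observation):
    """Observation -> initial hidden state."""
    return [math.tanh(dot(row, observation)) for row in W_repr]

def dynamics(state, action):
    """(hidden state, action) -> (predicted reward, next hidden state)."""
    one_hot = [1.0 if i == action else 0.0 for i in range(NUM_ACTIONS)]
    x = state + one_hot
    next_state = [math.tanh(dot(row, x)) for row in W_dyn]
    return dot(w_reward, x), next_state

def prediction(state):
    """Hidden state -> (policy over actions, scalar value)."""
    logits = [dot(row, state) for row in W_policy]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps], dot(w_value, state)

def unroll(observation, actions):
    """Apply the model recurrently: each hidden state feeds the next step."""
    state = representation(observation)
    policy, value = prediction(state)
    outputs = [(None, policy, value)]  # no reward before the first action
    for action in actions:
        reward, state = dynamics(state, action)
        policy, value = prediction(state)
        outputs.append((reward, policy, value))
    return outputs

steps = unroll([0.1, -0.2, 0.3, 0.0, 0.5, -0.4], actions=[0, 2, 1])
for reward, policy, value in steps:
    print(reward, policy, value)
```

The point of the sketch is the data flow: the hidden state produced at one step is the input to the next, so the unrolled model is a dynamic system in the sense discussed earlier in the thread, and each of the three functions could in principle be an ordinary feed-forward network.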

POSTED BY: Richard Frost

Posted 3 years ago

Thank you for your response. I appreciate you taking the time. :)

All we have access to is the PDF available for download at that link, so I don't believe we have that diagram.

As for the financial part, I am just a community member like everyone else here, working on cool stuff in my personal time. If tree search and deep RL models can be combined in Wolfram the way this paper describes, the result could be generalized to many problems (robots, vision, natural language, etc.) and would benefit the whole community, as the Wolfram Neural Net Repository and other community initiatives like the Function Repository do.

It's not really a contract thing. It's just me pitching in a bit for someone who is already passionate about things like this and would probably have done it anyway, but who might not have shared it with the community as a whole without someone else pitching in "something" at least. Just an idea, even if it's not up to the right standards. Again, thank you for responding. :)

POSTED BY: David Johnston
