Message Boards Message Boards

GROUPS:

Metaprogramming: the Future of the Wolfram Language

Posted 1 month ago
512 Views
|
6 Replies
|
8 Total Likes
|

With all the marvelous new functionality that we have come to expect with each release, it is sometimes challenging to maintain a grasp on what the Wolfram language encompasses currently, let alone imagine what it might look like in another ten years. Indeed, the pace of development appears to be accelerating, rather than slowing down. However, I predict that the "problem" is soon about to get much, much worse. What I foresee is a step change in the pace of development of the Wolfram Language that will produce in days and weeks, or perhaps even hours and minutes, functionality might currently take months or years to develop. So obvious and clear cut is this development that I have hesitated to write about it, concerned that I am simply stating something that is blindingly obvious to everyone. But I have yet to see it even hinted at by others, including Wolfram. I find this surprising, because it will revolutionize the way in which not only the Wolfram language is developed in future, but in all likelihood programming and language development in general. The key to this paradigm shift lies in the following unremarkable-looking WL function WolframLanguageData[], which gives a list of all Wolfram Language symbols and their properties. So, for example, we have:

WolframLanguageData["SampleEntities"]

enter image description here

This means we can treat WL language constructs as objects, query their properties and apply functions to them, such as, for example:

WolframLanguageData["Cos", "RelationshipCommunityGraph"] enter image description here

In other words, the WL gives us the ability to traverse the entirety of the WL itself, combining WL objects into expressions, or programs. This process is one definition of the term “Metaprogramming”.

What I am suggesting is that in future much of the heavy lifting will be carried out, not by developers, but by WL programs designed to produce code by metaprogramming. If successful, such an approach could streamline and accelerate the development process, speeding it up many times and, eventually, opening up areas of development that are currently beyond our imagination (and, possibly, our comprehension).

So how does one build a metaprogramming system? This is where I should hand off to a computer scientist (and will happily do so as soon as one steps forward to take up the discussion). But here is a simple outline of one approach.

The principal tool one might use for such a task is genetic programming:

WikipediaData["Genetic Programming"]

In artificial intelligence, genetic programming (GP) is a technique whereby computer programs are encoded as a set of genes that are then modified (evolved) using an evolutionary algorithm (often a genetic algorithm, "GA") – it is an application of (for example) genetic algorithms where the space of solutions consists of computer programs. The results are computer programs that are able to perform well in a predefined task. The methods used to encode a computer program in an artificial chromosome and to evaluate its fitness with respect to the predefined task are central in the GP technique and still the subject of active research.

One can take issue with this explanation on several fronts, in particular the suggestion that GP is used primarily as a means of generating a computer program for performing a predefined task. That may certainly be the case, but need not be.

Leaving that aside, the idea in simple terms is that we write a program that traverses the WL structure in some way, splicing together language objects to create a WL program that “does something”. That “something” may be a predefined task and indeed this would be a great place to start: to write a GP metaprogramming system that creates WL programs that replicate the functionality of existing WL functions. Most of the generated programs would likely be uninteresting, slower versions of existing functions; but it is conceivable, I suppose, that some of the results might be of academic interest, or indicate a potentially faster computation method, perhaps. However, the point of the exercise is to get started on the metaprogramming project, with a simple(ish) task with very clear, pre-defined goals and producing results that are easily tested. In this case the “objective function” is a comparison of results produced by the inbuilt WL functions vs the GP-generated functions, across some selected domain for the inputs.

I glossed over the question of exactly how one “traverses the WL structure” for good reason: I feel sure that there must have been tremendous advances in the theory of how to do this in the last 50 years. But just to get the ball rolling, one could, for instance, operate a dual search, with a local search evaluating all of the functions closely connected to the (randomly chosen) starting function (WL object), while a second “long distance” search jumps randomly to a group of functions some specified number of steps away from the starting function.
[At this point I envisage the computer scientists rolling their eyes and muttering “doesn’t this idiot know about the {fill in the bank} theorem about efficient domain search algorithms?”].

Anyway, to continue. The initial exercise is about the mechanics of the process rather that the outcome. The second stage is much more challenging, as the goal is to develop new functionality, rather than simply to replicate what already exists. It would entail defining a much more complex objective function, as well as perhaps some constraints on program size, the number and types of WL objects used, etc.

An interesting exercise, for example, would be to try to develop a metaprogramming system capable of winning the Wolfram One-Liner contest. Here, one might characterize the objective function as “something interesting and surprising”, and we would impose a tight constraint on the length of programs generated by the metaprogramming system to a single line of code. What is “interesting and surprising”? To be defined – that’s a central part of the challenge. But, in principle, I suppose one might try to train a neural network to classify whether or not a result is “interesting” based on the results of prior one-liner competitions.

From there, it’s on to the hard stuff: designing metaprogramming systems to produce WL programs of arbitrary length and complexity to do “interesting stuff” in a specific domain. That “interesting stuff” could be, for instance, a more efficient approximation for a certain type of computation, a new algorithm for detecting certain patterns, or coming up with some completely novel formula or computational concept.

Obviously one faces huge challenges in this undertaking; but the potential rewards are also enormous in terms of accelerating the pace of language development and discovery. It is a fascinating area for R&D, one that the WL is ideally situated to exploit. Indeed, I would be mightily surprised to learn that there is not already a team engaged on just such research at Wolfram. If so, perhaps one of them could comment here?

6 Replies

Would any community member be interested in attempting to build a Wolfram One-Liner generator?

Posted 1 month ago

Very cool ideas, Jonathan. Like Amazon's "Alexa", it would be nice if we could say "hey 'Wolf', show me what i would look like if i were a horse." And it would take your photos and DNA data, a horse's DNA data, and automatically interpolate between them in an efficient and meaningful way, yielding a 3D image.

Auto-parallelization would be wonderful too, if Mathematica somehow knew when to compute in serial, and when to compute in parallel.

In regards to a "Wolfram One-Liner generator"-- That's a great idea, not just for the competition, but for discovery in general. And not just for one line. I think the best way to do that would be to train neural nets on existing code, like you said, rather than exhaustively trying all permutations of functions. One could imagine a mining program, which runs nonstop, looking for novel relationships and correlations between shapes, rule-types, dimensions, etc. A "wisdom miner," if you will.

One tool which might come in handy is the "fundamental structure of relationship", which i define in this post, since a meta-programming virtual assistant would need to be able to reason, at least somewhat. I'm not sure if a 'virtual assistant' is what you meant, but that's the direction I would take it.

I would love to try to build one, but sadly it is outside my skill-set, and it would be an all-consuming project. But I think you're correct that high-level meta-algorithms will be responsible for the most interesting results in the next few years.

Bryan,

I was sitting here puzzling why this post had received no responses. I has reached the conclusion that either community members saw the idea as too obvious and routine to be worth commenting on, or alternatively that my description was so inept that no-one had the slightest idea what I was talking about.

So it was refreshing, to say the least, you read your reply. Some great ideas in there, too, like auto-parallelization. I imagine Wolfram would claim that they already have the precursor to your "Hey Wolf" concept in Siri (although you may disagree).

Your description is exactly on the money:

a mining program, which runs nonstop, looking for novel relationships and correlations between shapes, rule-types, dimensions, etc. A "wisdom miner," if you will.

Great phrase, "wisdom miner", and very appropriate. I see it working a little like bitcoin mining: once you have developed a sensible meta-programming algorithm you leave it running 24-7, occasionally pinging you when it produces something interesting.

The approach you suggest is much more structured than the one I proposed (Genetic Programming) and might be a better way to go: but GA is simpler to implement and, with the enormous computer power at our disposal these days, can often produce interesting results.

The point about trying to create a Wolfram "One-Liner" generator is that it is a kind of Turing test. Or, rather, a kind of Turing++ test: meaning, that on seeing a successful result an expert human programmer would realize that (a) this program was NOT produced by a human and (b) was far superior to anything s/he could imagine a human programmer creating.

Thanks again for your contribution.

Posted 1 month ago

Jonathan,

I think this is a very interesting discussion. As you know, there is a pretty famous and successful GA program by Mark Kotanchek written in Mathematica: DataModeler:

http://www.evolved-analytics.com

I think it evolves Genetic Algorithms only, not Genetic Programs (the difference being whether the resulting code includes programmatic constructs like Do/For/If, as opposed to just functions like Plus/Times/Sin; GA are also called Symbolic Regression in this context), so I don't know whether it could evolve a 'one-liner generator'.

There is an interesting discussion of Meta-programming by Shifrin (primarily) on StackExchange

https://mathematica.stackexchange.com/questions/2335/metaprogramming-in-mathematica/2352#2352

where he discusses some of the shortcomings of MMA in this area, as well as the strengths of course.

Of course Metaprogramming is an old topic in MMA: Michael Trott has a short program that randomly generates bits of MMA code which are then tested for syntaxicity, and which seems to be close to what you describe. Curiously it isn't in his 'Meta-Mathematica' chapter of his Guidebook for Programming, but is in his Guidebook for Graphics instead, page 273:

https://books.google.com/books?id=9DGhukeFNLIC&pg=PA275&lpg=PA275&dq=trott+syntaxq&source=bl&ots=oDKTjVxIvH&sig=XPWRv3YDMts5GjVPJXQ82l4q41E&hl=en&sa=X&ved=2ahUKEwj4qriajcLdAhUFmeAKHfb3Cs0Q6AEwAXoECAkQAQ#v=onepage&q=trott%20syntaxq&f=false

I myself have also been working on an evolutionary algorithm in MMA for a number of years and presented a bit of it a long time ago at the annual conference

http://library.wolfram.com/infocenter/Conferences/5850

My goal has always been to implement a full Genetic Programming framework. But just the functional, non-programming has been tough enough, and I am still a ways away from evolving MMA programs.

I would think that the biggest difficulty anyone will have with such an endeavor is the implementation of a 'typing' system. MMA doesn't have explicit types, so as soon as one gets away from the operations over basic lists (as one would do with symbolic regression where all of the input and output arguments of functions are numeric vectors of the same dimensions) then one has to control what functions can get plugged into one another.

Hi Bernard, Thank you for this very informative response. Yes, I am fully aware of Mark Kotanchek's excellent DataModeler system, having used it on several occasions and I can highly recommend it. I was aware, too, of the Shifrin post on meta-programming. His exposition is much more thorough and detailed than my own outline and covers far more ground. The brief set out in my post was limited to the creation of a function generator. The distinction between GA and GP is a nice one, but I am not sure I buy into it, especially in the case of a functional language like WL, in which the sort of programming constructs you refer to are typically replaced by listable operators. In a language like WL, programs == functions, largely.

I was unaware of the interest of the great Michael Trott in the subject of meta-programming, but I should hardly be surprised - is there any area of MMA that the man has not explored? Nor was I aware of your own efforts in this field, which are also very interesting.

In any event, it was not my intention to create the impression that I was suggesting something radically new. On the contrary, I expressed my puzzlement as to why the well-known concept of meta-programming had not been taken up with the sort of enthusiasm I might have expected. I was also seeking to draw attention to the fact that, given the scope of development of the Wolfram Language in the last few iterations, meta-programming looks much more feasible than it was when Michael was writing his magnum opus.

There is an issue here that I would like to draw out further. I believe that the Wolfram Language is in many ways far better suited to meta-programming than many other traditional languages. Firstly, as I have already pointed out, WL provides ready access to the constructs of the language itself, enabling them to be conveniently manipulated like objects.

Another observation, perhaps a more important one, is this: that for functional programming languages like WL the key to success must surely be related to, if not determined by, the scope of the functionality they encompass. To take an example, in my own field, econometrics, there was almost no reason to prefer MMA before time series objects were introduced; now, however, I believe that MMA is positioned to be come a leader (if not the leader) in financial analysis and econometrics thanks to the enormous array of new functionality that has been added in recent releases (FinancialData, TimeSeries, FinancialDerivative, Random Process, etc, etc).

If that is true, then one of the keys to gaining a competitive advantage as a functional programming language is the speed with which new functionality can be developed. In that context, the ability to generate such functionality in an automated way takes on a strategic significance.

Posted 1 month ago

Yes, I agree 'why hasn't metaprogramming been taken up' is a very interesting question. It is certainly something which WL would seem to have some sort of comparative advantage at. And the various shortcomings that Shifrin identifies - particularly in the ability to create macros - don't seem to have been addressed by WRI in the interim. Probably there doesn't appear to be enough demand for such capabilities, so they aren't putting the resources to it, right?

My distinction for GA vs GP is that in GA there are no dependencies across the expression tree - so long as the expression evaluates in a bottom-up fashion, it doesn't matter whether the first arg or the last arg of an expression is evaluated first. In GP, where there are variables, then the left-right evaluation order matters. So, for example, an 'algorithm' can evaluate Plus[x,x,1] /. x->2 but a 'program' cannot evaluate Plus[x=x+1,x=2,1]. Not without an error, anyway.

Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract