Thoughts on a Python interface, and why ExternalEvaluate is just not enough

Posted 8 years ago
POSTED BY: Szabolcs Horvát
25 Replies

Szabolcs, we have made massive improvements to the ExternalEvaluate framework over the last few years, especially with respect to virtual environments. I'm wondering what your current thoughts are on this topic?

POSTED BY: Arnoud Buzing
Posted 6 years ago

Thoughts on additions in 12.0?

POSTED BY: Max Coplan

@Max Coplan

Thoughts on additions in 12.0?

I don't have time to write a detailed response so I'll just say that the improvements in 12.0 are large, and it's heading in the right direction. But there is still some way to go.

I am already making use of it:

https://mathematica.stackexchange.com/questions/195380/how-can-i-use-the-python-library-networkx-from-mathematica

The data transfer from Python -> Mathematica is now structured, fast and customizable through the Wolfram Client for Python. My biggest wish is that this be implemented for the Mathematica -> Python direction as well for M12.1.

POSTED BY: Szabolcs Horvát
Posted 7 years ago

Here's a teaser for something I've been working on for a bit. I've now gotten things working so I can run Python and Mathematica concurrently in their respective notebook interfaces:

enter image description here

POSTED BY: b3m2a1
Posted 7 years ago

This looks really exciting! Do you have documentation on this? Or is there a related public project people can join?

POSTED BY: Ting Sun
Posted 7 years ago

Give this a look: http://community.wolfram.com/groups/-/m/t/1468475

If you want to contribute to the repo be my guest.

There are also some nice ideas from @Szabolcs Horvát here that I think really should be pursued in terms of making this more extensible.

POSTED BY: b3m2a1

One nice thing about WXF is that it supports associations. Another (from a user perspective) is that it is simple and well documented. Consider e.g. the simple problem of detecting a packed array on a MathLink link. It took a bit of experimentation and guesswork with MathLink. With the open WXF specification, it is immediately clear what is possible, what is not possible, and what is the best way to do something.

I wonder if it is meant as a replacement for MathLink when developing links to other systems. I also wonder if it performs better. At least the decoding could probably be made faster as we do not depend on a closed library now, but can implement our own decoder. Of course, it only handles encoding/decoding, and does not give us a ready-to-use means of data transfer like MathLink does.

Even with LibraryLink, now it might be easier to transfer complex expressions encoded as WXF and transferred as a byte-type RawArray.
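To illustrate how little is needed to implement the encoding side yourself, here is a minimal sketch that serializes a list of small integers to WXF using only the Python standard library. It assumes the token bytes as I read them in the WXF format description (header `8:`, `f` for a function, `s` for a symbol, `C` for an 8-bit integer, lengths as varints); please check these against the specification before relying on them. The function names are hypothetical and are not part of any Wolfram library.

```python
import struct


def encode_varint(n):
    """Encode a non-negative integer as a varint (7 bits per byte, low bits first)."""
    if n < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_small_int_list(values):
    """Serialize List[v1, v2, ...] for integers in the range -128..127.

    Assumed layout: header "8:", then "f" + argument count,
    then the head as symbol "List", then one "C" token per integer.
    """
    head = b"List"
    data = bytearray(b"8:")
    data += b"f" + encode_varint(len(values))
    data += b"s" + encode_varint(len(head)) + head
    for v in values:
        if not -128 <= v <= 127:
            raise ValueError("this sketch only handles 8-bit integers")
        data += b"C" + struct.pack("<b", v)
    return bytes(data)


payload = encode_small_int_list([1, 2, 3])
print(list(payload))
```

If the assumptions hold, passing these bytes to BinaryDeserialize as a ByteArray on the Mathematica side should give back the list, which is an easy way to check the sketch.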

POSTED BY: Szabolcs Horvát

Hi Szabolcs, Where can I find the WXF specification? I can only find the Import Export possibilities but isn't that only disk based? Or can it also be used in streams? Happy to learn more thx

POSTED BY: l van Veen

Hi! It's here: http://reference.wolfram.com/language/tutorial/WXFFormatDescription.html As far as I can tell, it's a complete specification.

This was not really advertised when it came out and I completely missed its significance. In particular, I missed the fact that it's fully documented, which is precisely what makes it useful.

POSTED BY: Szabolcs Horvát

We are more than happy to listen to customers about this functionality, which is why we are going to release this library as open source code when it is ready.

As I mentioned, this library allows you to import / export arbitrary Mathematica expressions using WXF. This format is also optimized for transferring PackedArray and NumericArray, and for most types the conversion is automatic.

You can create a dump in Mathematica and export it, or you can programmatically start a kernel from Python and retrieve the result of an arbitrary computation.

Automatic data conversion from WL to Python is not done yet because the WXF importer in Python was implemented very recently, so it will take time to add this functionality (you might expect automatic conversion of DateObject to a Python datetime out of the box, which right now works only in the opposite direction).

Take a look at the source code in wolframclient.serializers and wolframclient.deserializers. Keep in mind that everything is subject to change right now, but at least you might get an idea of what can be done.

print(Range[10]) – do you mean a mix of Python and Mathematica syntax?

What really matters to me is to be able to use this interface to solve practical problems that come up day-to-day in my work. The current ExternalEvaluate is not capable of this. I'd like to suggest letting real user needs drive development, and also setting priorities based on such needs.

Let me give examples:

  • Call some minor scientifically oriented function that exists in some Python library but not Mathematica. E.g. compute a spherical Voronoi tessellation. This should be a no-fuss at most three-lines-of-code task.

  • A concrete application I have in mind is using the networkx library. Example task: compute a minimum weight cycle basis (not available in Mma). This won't be a three-line task because the involved data structures are more complex. But it should be not too hard to set up a framework for transferring graphs back and forth between the two systems, and once that is done, it should be easy to call any functions (like the cycle basis computation).
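As a rough sketch of what the Python half of such a graph-transfer framework could look like, the following uses only the standard library. It assumes the graph arrives as a list of (u, v, weight) triples, which is just one plausible wire format, and both function names are hypothetical. In a real setup the edge list would be handed to networkx rather than to a plain dictionary.

```python
from collections import defaultdict


def edges_to_adjacency(edges):
    """Build an undirected weighted adjacency dict from (u, v, weight) triples."""
    adjacency = defaultdict(dict)
    for u, v, weight in edges:
        adjacency[u][v] = weight
        adjacency[v][u] = weight
    return dict(adjacency)


def adjacency_to_edges(adjacency):
    """Flatten the adjacency dict back into a list with one triple per edge."""
    seen = set()
    edges = []
    for u, neighbours in adjacency.items():
        for v, weight in neighbours.items():
            key = frozenset((u, v))  # same key for (u, v) and (v, u)
            if key not in seen:
                seen.add(key)
                edges.append((u, v, weight))
    return edges


# A weighted triangle, as it might be sent over from the other system.
triangle = [(1, 2, 0.5), (2, 3, 1.5), (3, 1, 2.0)]
graph = edges_to_adjacency(triangle)
print(adjacency_to_edges(graph))
```

Once the two conversions exist, calling any graph function reduces to converting, calling, and converting back.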

Python can open up a gateway to a huge number of useful libraries many of which are completely unavailable to Mathematica at this moment. Thus a Python interface should be taken very seriously, and preferably optimized specifically for Python (instead of making it generic and work with any language, like JavaScript).

Example: One task I had to solve recently was to call certain ITK functions from Mathematica (image processing). Like any high profile scientific library, ITK has a Python interface. In fact, it has two, one of them being specifically optimized for scripting language. Right now I had to use LibraryLink to make it work with good performance. It took more than a day to set up a framework for it, and even after that was done, each new function I need to access takes a 5-10 minute setup process. The kind of Python interface I wish for would make this task easy and seamless: directly transfer image data as a NumericArray/RawArray with negligible overhead, call the function, retrieve the result.

To sum up:

Please let real-world applications drive the development of this functionality. Ask users what they imagine doing with a Python interface. Ask those users who use Mathematica daily to get real work done.

POSTED BY: Szabolcs Horvát

As far as I know, we are planning to have a way to convert Mathematica expressions automatically in Python cells. I think the future syntax might look something like print(Range[10]).

We are working on a Python client library, which we plan to release on GitHub in the future, that allows you to import/export WXF data files from Python and to interact with a kernel using MathLink.

You can take a look at our code by evaluating "import wolframclient; str(wolframclient)" in a Python cell. The library is still under active development, and we are in the process of writing documentation for it, which is why the code is not released yet.

Unfortunately I'm not in a position to reply to your other questions about MathLink or why we chose ZMQ over MathLink, but we might change the implementation.

Sincerely. Riccardo.

Thank you for showing this @Riccardo ! I do have the M12 prerelease, but I did not know about these improvements.

Performance was one of the deal-breakers for me. The other one was that there was no way to send structured data to Python. Converting expressions to Python code, which is then parsed and run by Python, is not a good way to do this. Was this fixed in M12? If yes, could you give an example please? If no, are there plans to fix it?

The simple use case I've been suggesting: take a matrix m = RandomReal[1, {100,100}]. How do I compute its eigenvalues (and perhaps eigenvectors) in Python? This includes all the basics one might expect from a language interface: send structured data, call functions, receive structured data.
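As a rough illustration of the Python half of this round trip, here is a sketch that takes a matrix as nested lists (structured data in), computes something, and returns a dictionary (structured data out). To stay within the standard library it uses power iteration, which only finds the dominant eigenvalue and its eigenvector. That is a stand-in technique chosen for self-containment: with NumPy installed one would call numpy.linalg.eig to get all of them. The function names are hypothetical, and convergence is assumed, which is reasonable for a matrix with all positive entries such as the RandomReal example.

```python
import math


def mat_vec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def dominant_eigenpair(matrix, max_iterations=10000, tolerance=1e-12):
    """Estimate the dominant eigenvalue and eigenvector by power iteration."""
    vec = [1.0] * len(matrix)
    value = 0.0
    for _ in range(max_iterations):
        product = mat_vec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("the iterate was mapped to the zero vector")
        vec = [x / norm for x in product]
        # Rayleigh quotient of the normalized vector estimates the eigenvalue.
        new_value = sum(x * y for x, y in zip(vec, mat_vec(matrix, vec)))
        if abs(new_value - value) < tolerance:
            return {"eigenvalue": new_value, "eigenvector": vec}
        value = new_value
    return {"eigenvalue": value, "eigenvector": vec}


result = dominant_eigenpair([[2.0, 1.0], [1.0, 2.0]])
print(result)
```

The interface question is then only how the nested list gets into Python and how the dictionary gets back out as an association, without either passing through generated source code.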

About WXF:

What your post implies, but you did not spell it out, is that WXF is an openly documented format for storing and transferring Mathematica expressions. I was not aware of this, but from your post it sounded like you must have a decoder for it on the Python side. Will you make this decoder available to the rest of us, perhaps even open source? What about decoders for other languages?

I am also wondering about how this relates to MathLink. I always imagined MathLink to be based on a very similar binary format.

IMO a reasonable way to implement an interface to another language would be to first expose the MathLink API to it. Then we would have a means to transfer expressions back and forth. Why did you choose ZeroMQ and WXF instead of MathLink? Is it simply because Python already speaks ZMQ, or is the performance better with WXF? I remember once I found that transferring certain expressions as JSON, which isn't even binary, was faster than using MathLink, quite shocking!

Finally, I always thought that open sourcing MathLink would be beneficial because it would work around interfacing with GPL'd (or other copyleft) libraries. Is the semi-open WXF a step in this direction?

POSTED BY: Szabolcs Horvát

I'm Riccardo Di Virgilio, currently working at WRI and one of the contributors to the Python implementation for ExternalEvaluate. Data transfer in the first implementation was not efficient, but we have been working to improve it, and with M12 we will ship a much more efficient data transfer thanks to the WXF binary format. We are able to serialize a lot of built-in data types including integer, float, decimals, datetime, time, complex, fractions, list, tuples, associations, etc.

My current setup using a 2015 2.5 GHz Intel Core i7 macbook pro provides a 500% performance boost over the example that was posted at the beginning of this thread.

In[2]:= ExternalEvaluate[session, "range(10000)"]; // AbsoluteTiming
Out[2]= {0.093617, Null}

We also developed an efficient conversion from numpy arrays to NumericArray.

In[7]:= ExternalEvaluate[session, "import numpy; numpy.ndarray(10000).reshape(4, 2500)"]; // AbsoluteTiming
Out[7]= {0.001233, Null}

Another very nice feature is an inspectable traceback, which provides a very nice interface for debugging your Python code that is just not possible from the command line.

Attaching a screenshot: Python traceback

Posted 7 years ago

Just tried out the updated External-related functions in v11.3 this afternoon, and I feel really excited about writing Python code and getting results directly within the notebook! NumPy-related objects now get much better support from Mathematica and can be read in directly without any conversion. In general, I really like this update and am very happy to see the power of both systems, Mathematica and Python, combined in a synergistic way.

POSTED BY: Ting Sun
Posted 7 years ago

I would also like to add my vote and support for Szabolcs' request to WRI to implement a low-level and fast interface to Python. My motivation is to get access to Python's superior machine learning libraries.

B

POSTED BY: Bernard Gress
Posted 8 years ago

Hi Szabolcs and thanks for the reply. Actually, I was a bit embarrassed about what happened, since I don't make it a habit of posting unrelated thoughts in other people's threads :-)

Back to the topic of this discussion, I would like to emphasize I agree about the need for a greater integration of Mathematica with Python. Thinking about it, I've decided that calling Python from Mathematica would be just as helpful as calling Mathematica from Python. Totally agree that the speed of data transfer is critical for things like computational geometry, as you mentioned, or other stuff like machine learning, and also about the fundamental importance of numpy arrays. And about how much of an obstacle context switching is...although you can hack together anything, once you start dealing with multiple frameworks the focus becomes more on the programming and less on the problem at hand. And then there is just the number of libraries out there in Python that focus on scientific computing, data analysis and machine learning that would be great to just plug into Mathematica.

I didn't quite follow your argument on why the MathLink/WSTP API is a better model than J/Link for a Python interface.

For myself, I find it much easier to put together simple looping constructs...procedural programming...in Python than with Mathematica. Although I like functional programming and see its power, I find that it can get kind of dense. Also, I like having classes around and doing object oriented programming. For example, I really like doing a list of items like so:

class Item:
    def __init__(self, i, parent):
        self.number = i
        self.parent = parent

class List:
    def __init__(self, n):
        self.list = []
        for i in range(n):
            self.list.append(Item(i, self))

Using OO programming (sparingly, of course, mostly just to hold data) can really help to reason through what you are doing and I find Mathematica lacks such facilities, or, at least, I have not learned enough of Mathematica to not miss classes. Using the above structure, it is easy for me to put metadata type functions in the List class and item-specific functions in the Item class and loop through the list to examine the items, etc. But, what's missing is calling Mathematica straight from Python (or vice versa)!
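To make that concrete, here is a small sketch that extends the two classes above with one metadata-style method on List and one item-specific method on Item, then loops over the items. The method names size and describe are made up for illustration.

```python
class Item:
    def __init__(self, i, parent):
        self.number = i
        self.parent = parent

    def describe(self):
        """Item-specific function that also reaches back into the parent list."""
        return f"item {self.number} of {self.parent.size()}"


class List:
    def __init__(self, n):
        self.list = []
        for i in range(n):
            self.list.append(Item(i, self))

    def size(self):
        """Metadata-style function that lives on the container."""
        return len(self.list)


# Loop through the list to examine the items.
descriptions = [item.describe() for item in List(3).list]
print(descriptions)  # ['item 0 of 3', 'item 1 of 3', 'item 2 of 3']
```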

Thanks again

POSTED BY: Stephan Foley
Posted 8 years ago

Hi Szabolcs, I put a note on my original post...I made a separate thread requesting that an interface be developed allowing Python to access Mathematica and I think the moderators decided to move my post here. So, now I'm here. Although we are talking about different things...you want to access Python through Mathematica, I want to access Mathematica through Python...it is similar.

I agree with you that Mathematica's great strength and Python's great weakness is plotting. Also, Jupyter notebooks just can't compare with Mathematica notebooks. And, although a lot of the same functionality of Mathematica can be found in libraries such as SymPy, SciPy, and friends, it makes a big difference to have everything integrated and documented under one roof.

J/Link is a two way thing and you can use J/Link to call Mathematica from Java. That was the purpose of my original post, which I thought was to be a separate thread...to ask that a similar interface be developed to allow Python to call Mathematica functionality more transparently.

I am in total agreement with what you said here:

... Python is increasingly popular for scientific work. There are Python libraries implementing functionality that Mathematica does not have at this moment. I would find it extremely useful to be able to access some of this functionality while staying in the same system, using the same familiar plotting and data wrangling functions, etc. Currently, if I need a specific library that only has a Python interface, I am forced to use not only this library but everything else from Python as well (e.g. plotting) because there is no efficient communication between the systems. I am confined within a single system, and can't pick the best tool for the task.

and would just add that I would like to access Mathematica from Python for much the same purposes.

POSTED BY: Stephan Foley

Hi Stephan,

I made a separate thread requesting that an interface be developed allowing Python to access Mathematica and I think the moderators decided to move my post here. So, now I'm here.

I completely missed that, and I think I misunderstood what you meant. With that context now it makes sense.

POSTED BY: Szabolcs Horvát

@Szabolcs

Very clear!

Hopefully, LLVM compilation will allow for much faster Wolfram Language execution (I am eagerly waiting to hear the latest news from the WTC: how much does it cover? Still 99.99% of the language? How automatic is it getting? Can we imagine full automation, transparent to the user? What is the current optimization level / % of pure C?). Put that together with an eventual evolution of the parallelization technology, a project whose existence I'm still waiting to discover, and we will need to link with other languages on far fewer occasions, spending less time on optimization and more time on the main purpose of the algorithms.

POSTED BY: Pedro Fonseca
Posted 8 years ago

Hello, I would like to request the Wolfram developers consider doing a good Python interface to Mathematica. The C interface seems to be discouraged and I looked over the J/Link interface and just thought..."man, do I really want to program in Java".

Personally, I feel having a Python interface would be fun. Also, academia is currently bursting at the seams with Python programmers, so it would make business sense for Wolfram to tie Mathematica into Python.

Wolframscript is OK, but Python has classes which allow for easier storage and organization. Python is much easier to read and I would venture to say too much Wolframscript becomes "write once, read never," or, at least, it becomes pretty dense and obscure.

I think it would be great (and much easier) to be able to call the immense power of Mathematica from Python when writing command line scripts.

Thanks for the consideration!

Clarification: I had posted this as a separate topic, but I guess the moderators of the forum decided it fit better with this topic and moved my post here. Because of this, I haven't yet read the rest of the thread, but will do so now.

POSTED BY: Stephan Foley
POSTED BY: Szabolcs Horvát

I cannot tell you more at the moment, but would like to assure you that this is the beginning of the story, not the end. There are development initiatives in that general direction and I encourage you to stay patient and look forward to more exciting things coming.

POSTED BY: Vitaliy Kaurov
Anonymous User
Posted 8 years ago
POSTED BY: Anonymous User

Dear Szabolcs,

as usual your post is very helpful and instructive. Thank you for your posts both here and on StackExchange. They make my life much easier.

Thanks,

Marco

POSTED BY: Marco Thiel