
LTemplate: a package for faster LibraryLink development


I am posting to share the LTemplate package, which intends to speed up LibraryLink development by automating the generation of boilerplate code.

Please see the presentation of the package on StackExchange (or below):

I have mentioned this package before on Wolfram Community. At that time I was asking for feedback on the design. The package has come a long way since then, and I think that now it is ready for more general use.

Any comments or feedback are most welcome! Just post a response to this thread.


LibraryLink is an API for extending Mathematica through C or C++. It is very fast because it gives direct access to Mathematica's packed array data structure, without even needing to make a copy of the arrays.

Unfortunately, working with LibraryLink involves writing a lot of tedious boilerplate code.

Mathematica 10 introduced managed library expressions, a way to have C-side data structures automatically destroyed when Mathematica no longer keeps a reference to them. I find this feature useful for almost all non-trivial LibraryLink projects I make, but unfortunately it requires even more boilerplate.

For me, writing all this repetitive code (and looking up how to do it in the documentation) has been the single biggest barrier to starting any LibraryLink project. It just takes too much time. Is there an easier way?


I wrote a package to automatically generate all the boilerplate needed for LibraryLink and for managed library expressions based on a template that describes a class interface.

Here's how it works:

  • Write a template (a Mathematica expression) that describes a C++ class interface
  • This template is used to generate LibraryLink-compatible C functions that call the class's member functions, to compile them, and to finally load them
  • Instances of the class are created as managed library expressions; all the boilerplate code for this is auto-generated
  • The package is designed to be embedded into Mathematica applications. This avoids conflicts between different applications using different LTemplate versions. LTemplate can also be used standalone for experimentation and initial development.

Examples

Note: The package comes with a tutorial and many more examples.

While LTemplate is designed to be embedded into other applications, for this small example I'll demonstrate interactive standalone use.

<< LTemplate`

Let us use the system's temporary directory as our working directory for this small example, so we don't leave a lot of mess behind. Skip this to use the current directory instead.

SetDirectory[$TemporaryDirectory]

To be able to demonstrate managed library expressions later, we turn off input/output history tracking:

$HistoryLength = 0;

Let's write a small C++ class for calculating the mean and variance of an array:

code = "
  #include <LTemplate.h>

  class MeanVariance {
    double m, v;

  public:
    MeanVariance() { mma::print(\"constructor called\"); }
    ~MeanVariance() { mma::print(\"destructor called\"); }

    void compute(mma::RealTensorRef vec) {
     double sum = 0.0, sum2 = 0.0;
     for (mint i=0; i < vec.length(); ++i) {
         sum  += vec[i];
         sum2 += vec[i]*vec[i];
     }
     m = sum / vec.length();
     v = sum2 / vec.length() - m*m;
    }

    double mean() { return m; }
    double variance() { return v; }
  };
  ";

Export["MeanVariance.h", code, "String"]

LTemplate functions live in the mma C++ namespace. mma::RealTensorRef represents a reference to a Mathematica packed array of Reals. It is just a wrapper for the MTensor type. There may be more than one reference to the same array, so array creation and destruction must still be managed manually. A RealTensorRef allows convenient linear indexing into the array using the [ ] operator. RealMatrixRef and RealCubeRef are subclasses of RealTensorRef, which additionally allow indexing into 2D or 3D arrays using the ( ) operator.
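To make the two indexing conventions concrete, here is a minimal, self-contained sketch of a non-owning array reference with linear [ ] indexing and a matrix subclass with ( ) indexing. TensorView and MatrixView are hypothetical stand-ins written for this illustration, not LTemplate's actual classes, and row-major storage (as in packed arrays) is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a non-owning reference to a flat array of doubles.
// It neither allocates nor frees the data; the owner must outlive the view.
class TensorView {
protected:
    double *data_;
    std::size_t len_;
public:
    TensorView(double *data, std::size_t len) : data_(data), len_(len) {}
    std::size_t length() const { return len_; }
    double &operator[](std::size_t i) { return data_[i]; }              // linear indexing
    const double &operator[](std::size_t i) const { return data_[i]; }
};

// Hypothetical 2D view: adds (row, column) indexing, assuming row-major storage.
class MatrixView : public TensorView {
    std::size_t rows_, cols_;
public:
    MatrixView(double *data, std::size_t rows, std::size_t cols)
        : TensorView(data, rows * cols), rows_(rows), cols_(cols) {}
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    double &operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
};

// Sum all elements through the linear interface, whatever the rank.
double total(const TensorView &t) {
    double s = 0.0;
    for (std::size_t i = 0; i < t.length(); ++i) s += t[i];
    return s;
}
```

Because the view only wraps a pointer, writing through m(i, j) changes the underlying buffer, and any function that takes the base class can still walk the data linearly.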

We have three member functions: one for computing both the mean and variance in one go, and two others for retrieving each result.
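The compute function relies on the single-pass identity variance = mean(x²) − mean(x)², which gives the population variance (division by n). The standalone sketch below repeats that loop on a std::vector and checks it against a two-pass reference; the function names are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Single-pass mean and population variance, mirroring the loop in
// MeanVariance::compute. Returns {mean, variance}; the input must be non-empty.
std::pair<double, double> mean_variance(const std::vector<double> &x) {
    double sum = 0.0, sum2 = 0.0;
    for (double v : x) {
        sum += v;
        sum2 += v * v;
    }
    const double n = static_cast<double>(x.size());
    const double m = sum / n;
    return {m, sum2 / n - m * m};
}

// Two-pass reference: average of squared deviations from the mean.
double variance_two_pass(const std::vector<double> &x) {
    double sum = 0.0;
    for (double v : x) sum += v;
    const double n = static_cast<double>(x.size());
    const double m = sum / n;
    double acc = 0.0;
    for (double v : x) acc += (v - m) * (v - m);
    return acc / n;
}
```

For {1, 2, 3, 4} both functions give a variance of 1.25. Mathematica's built-in Variance uses the n − 1 denominator, so it would differ from this result by a factor of n/(n − 1).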

The class must go in a header file with the same name, MeanVariance.h in this case.

Let's write the corresponding template now:

template =
  LClass["MeanVariance",
   {
    LFun["compute", {{Real, _, "Constant"}}, "Void"],
    LFun["mean", {}, Real],
    LFun["variance", {}, Real]
   }
  ];

We can optionally check that the template has no errors using ValidTemplateQ[template].

Compiling and loading is now as simple as

In[]:= CompileTemplate[template]

Current directory is: /private/var/folders/31/l_62jfs110lf0dh7k5n_y2th0000gn/T

Unloading library MeanVariance ...

Generating library code ...

Compiling library code ...

Out[]= "/Users/szhorvat/Library/Mathematica/SystemFiles/LibraryResources/MacOSX-x86-64/MeanVariance.dylib"

and then

LoadTemplate[template]

The compilation step created a file called LTemplate-MeanVariance.cpp in the same directory, the source code of which I attached at the end, for illustration.

We can now create a managed library expression corresponding to this class:

obj = Make["MeanVariance"]

During evaluation of In[]: constructor called

(* MeanVariance[1] *)

arr = RandomReal[1, 10];

obj@"compute"[arr]

{obj@"mean"[], obj@"variance"[]}
(* {0.482564, 0.104029} *)

We can check the list of live expressions of type MeanVariance using

LExpressionList["MeanVariance"]
(* {MeanVariance[1]} *)

As soon as Mathematica has no more references to this expression, it gets automatically destroyed:

obj =.

During evaluation of In[]: destructor called

LExpressionList["MeanVariance"]
(* {} *)

The reason why we had to use $HistoryLength = 0 above is to prevent Out keeping a reference to the expression.
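On the C++ side, this lifetime comes down to a map from integer IDs to instances, plus a manager function that is called to create or destroy an entry. The sketch below mimics that pattern in plain C++ without LibraryLink; Registry and Counted are hypothetical names for this illustration, and the generated source attached at the end of the post shows what LTemplate actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical example class that counts how many instances are alive.
struct Counted {
    static int alive;
    Counted() { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

// Hypothetical registry mapping integer IDs to heap-allocated instances.
template <typename T>
class Registry {
    std::map<long, T *> items_;
public:
    Registry() = default;
    Registry(const Registry &) = delete;             // owning raw pointers: no copies
    Registry &operator=(const Registry &) = delete;
    ~Registry() { for (auto &kv : items_) delete kv.second; }

    // destroy == false: create instance `id`; destroy == true: delete it.
    // Returns false when creating an existing ID or destroying a missing one.
    bool manage(bool destroy, long id) {
        auto it = items_.find(id);
        if (!destroy) {
            if (it != items_.end()) return false;
            items_[id] = new T();
            return true;
        }
        if (it == items_.end()) return false;
        delete it->second;
        items_.erase(it);
        return true;
    }

    // Look up an instance; returns nullptr if the ID is unknown.
    T *find(long id) const {
        auto it = items_.find(id);
        return it == items_.end() ? nullptr : it->second;
    }

    std::size_t size() const { return items_.size(); }
};
```

Every member function wrapper first looks up the ID and reports an error if it is missing, which is the role find plays here.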

One practical way to write a Mathematica function that exposes this functionality is

meanVariance[arr_] :=
 Block[{obj = Make[MeanVariance]},
  obj@"compute"[arr];
  {obj@"mean"[], obj@"variance"[]}
 ]

As soon as the Block finishes, obj gets automatically destroyed:

meanVariance[arr]

During evaluation of In[]: constructor called

During evaluation of In[]: destructor called

(* {0.618649, 0.033828} *)

This is one of those special cases when using Block over Module may be worth it for performance reasons. (The usual caveats about Block apply though.)

Notice that the expression has the form MeanVariance[1]. The integer index 1 is the ManagedLibraryExpressionID. The symbol MeanVariance is created in the context

LClassContext[]
(* "LTemplate`Classes`" *)

This context is added to the $ContextPath when using LTemplate interactively as a standalone package, but not when it's loaded privately by another application. We can check the usage message of the symbol:

?MeanVariance

class MeanVariance:
    Void compute({Real, _, Constant})
    Real mean()
    Real variance()

The package is too large to present fully in a StackExchange post, so if you are interested, download it and read the tutorial, which has several more examples!

LTemplate is continually under development, and breaking changes are possible (though unlikely). However, since the recommended way to deploy it is to embed it fully in the Mathematica application that uses it, this should not be a problem. I am using LTemplate in the IGraph/M package, which proves its feasibility for use in large projects.

There are additional features such as:

  • Multiple related classes in the same template
  • Pass another managed library expression to a function, and receive it as an object reference on the C++ side (LExpressionID)
  • Format templates to be human-readable (FormatTemplate)
  • User-friendly error messages
  • Error handling through exceptions (mma::LibraryError)
  • Calling Print for debugging, massert macro to replace C's standard assert and avoid killing the Mathematica kernel.
  • Calling Message, setting a symbol to associate standard messages with
  • Argument passing and return using MathLink (LinkObject passing)
  • mlstream.h auxiliary header for easier handling of function arguments and return values with MathLink-based passing
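The exception-based error handling in the list above follows a common pattern: user code throws, and a wrapper at the C boundary catches the exception, reports it, and returns an integer error code. The sketch below shows that pattern in plain C++. SketchError, the error code constants and last_message are hypothetical stand-ins, not the definitions of mma::LibraryError or of LibraryLink's error codes.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical error codes standing in for LibraryLink's integer return codes.
enum SketchCode { kNoError = 0, kFunctionError = 1 };

// Hypothetical exception type carrying a message and an error code.
class SketchError : public std::runtime_error {
    int code_;
public:
    explicit SketchError(const std::string &msg, int code = kFunctionError)
        : std::runtime_error(msg), code_(code) {}
    int error_code() const { return code_; }
};

// Stand-in for reporting a message back to the notebook.
std::string last_message;

// User code: free to throw instead of threading error codes through every call.
double checked_reciprocal(double x) {
    if (x == 0.0) throw SketchError("division by zero");
    return 1.0 / x;
}

// Wrapper at the C boundary: no exception may escape, so it is converted
// into a reported message plus an integer return code.
int reciprocal_wrapper(double arg, double *res) {
    try {
        *res = checked_reciprocal(arg);
    } catch (const SketchError &e) {
        last_message = e.what();
        return e.error_code();
    }
    return kNoError;
}
```

The generated wrappers at the end of the post have the same shape: a try block around the member function call, and a catch block that calls report() and returns error_code().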

The documentation isn't complete, but if you have questions, feel free to comment here or email me.


Questions and limitations

Can I use plain C instead of C++? No, LTemplate requires the use of C++. However, the only C++ feature the end-user programmer must use is creating a basic class.

Why do I have to create a class? I only want a few functions. I didn't implement free functions due to lack of time and need. There's no reason why this shouldn't be added. However, you can always create a single instance of a class, and keep calling functions on it. The overhead will be minimal according to my measurements.

Why can't I use underscores in class or function names? LTemplate currently only supports names that are valid both in Mathematica and C++. This excludes underscores and $ signs (even though some C++ compilers support $ in identifiers). This also helps avoid name conflicts with auxiliary functions LTemplate generates (which always have underscores).
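As a rough illustration of that naming rule, the check below accepts only names made of ASCII letters and digits that start with a letter. This is an assumed approximation written for this example, not LTemplate's actual validation code, and it does not reject C++ keywords.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Assumed rule: first character is an ASCII letter, the rest are ASCII
// letters or digits. Underscores and $ are rejected because each is
// valid in only one of the two languages.
bool valid_in_both(const std::string &name) {
    if (name.empty() || !std::isalpha(static_cast<unsigned char>(name[0])))
        return false;
    for (unsigned char c : name)
        if (!std::isalnum(c)) return false;
    return true;
}
```

With this rule, "MeanVariance" passes, while "mean_variance", "x$y" and "2fast" are rejected.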

How do I write library initialization and cleanup code? Currently LTemplate doesn't support injecting code into WolframLibrary_initialize and WolframLibrary_uninitialize. Initialization code can be called manually from Mathematica. Create a single instance of a special class, put the initialization code in one of its member functions, and call it from Mathematica right after loading the template. The uninitialization code can go in the destructor of the class. All objects are destroyed when the library is unloaded (e.g. when Mathematica quits). Warning: when using this method, there's no guarantee about which expression will be destroyed last! To fix this, initialization/uninitialization support is planned for later.
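A minimal sketch of the workaround described above: a dedicated class whose member function does the initialization and whose destructor does the cleanup. LibraryInit and event_log are hypothetical names for this illustration; in a real project the class would be listed in the template like any other and instantiated once from Mathematica.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of events so the behaviour can be checked.
std::vector<std::string> event_log;

// Hypothetical class used only for library setup and teardown.
class LibraryInit {
    bool initialized_ = false;
public:
    // Call once from Mathematica right after loading the template.
    // Repeated calls are harmless.
    void init() {
        if (!initialized_) {
            initialized_ = true;
            event_log.push_back("init");
        }
    }

    bool ready() const { return initialized_; }

    // Cleanup runs when the instance is destroyed, for example when the
    // library is unloaded. Nothing is cleaned up if init() never ran.
    ~LibraryInit() {
        if (initialized_) event_log.push_back("cleanup");
    }
};
```

As noted above, the destruction order of different objects is not guaranteed, so the destructor should not rely on other managed instances still being alive.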

Can I create a function that takes an MTensor of unspecified type? No, LTemplate requires specifying the data type (but not the rank) of the MTensor. {Real, 2} is a valid type specifier and so is {Real, _}. {_, _} is not allowed in LTemplate, even though it's valid in standard LibraryLink. The same applies to MSparseArrays (mma::SparseArrayRef).

Which LibraryLink features are not supported?

  • The numerical type underlying tensors must be explicitly specified. Tensors without explicitly specified types are not supported.

  • LibraryDataType[Image, ...] is not yet supported (MImage), but it is planned because I need it.

  • The support for LibraryDataType[SparseArray, ...] is still limited, but the basics are there.

  • There's no explicit support for library callback functions yet, but they can be used by accessing the standard LibraryLink API (function pointers in mma::libData).

Contributions and ideas for improvements are most welcome!


Source of LTemplate-MeanVariance.cpp:

#include "LTemplate.h"
#include "LTemplateHelpers.h"
#include "MeanVariance.h"


namespace mma {

WolframLibraryData libData;

#define LTEMPLATE_MESSAGE_SYMBOL  "LTemplate`LTemplate"

#include "LTemplate.inc"

} // namespace mma


std::map<mint, MeanVariance *> MeanVariance_collection;

DLLEXPORT void MeanVariance_manager_fun(WolframLibraryData libData, mbool mode, mint id)
{
    if (mode == 0) { // create
      MeanVariance_collection[id] = new MeanVariance();
    } else {  // destroy
      if (MeanVariance_collection.find(id) == MeanVariance_collection.end()) {
        libData->Message("noinst");
        return;
      }
      delete MeanVariance_collection[id];
      MeanVariance_collection.erase(id);
    }
}

extern "C" DLLEXPORT int MeanVariance_get_collection(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    mma::IntTensorRef res = mma::detail::get_collection(MeanVariance_collection);
    mma::detail::setTensor<mint>(Res, res);
    return LIBRARY_NO_ERROR;
}


extern "C" DLLEXPORT mint WolframLibrary_getVersion()
{
    return WolframLibraryVersion;
}

extern "C" DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData)
{
    mma::libData = libData;
    {
       int err;
       err = (*libData->registerLibraryExpressionManager)("MeanVariance", MeanVariance_manager_fun);
       if (err != LIBRARY_NO_ERROR) return err;
    }
    return LIBRARY_NO_ERROR;
}

extern "C" DLLEXPORT void WolframLibrary_uninitialize(WolframLibraryData libData)
{
    (*libData->unregisterLibraryExpressionManager)("MeanVariance");
    return;
}


extern "C" DLLEXPORT int MeanVariance_compute(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    const mint id = MArgument_getInteger(Args[0]);
    if (MeanVariance_collection.find(id) == MeanVariance_collection.end()) { libData->Message("noinst"); return LIBRARY_FUNCTION_ERROR; }

    try {
       mma::RealTensorRef var1 = mma::detail::getTensor<double>(Args[1]);

       (MeanVariance_collection[id])->compute(var1);
    }
    catch (const mma::LibraryError & libErr)
    {
       libErr.report();
       return libErr.error_code();
    }

    return LIBRARY_NO_ERROR;
}


extern "C" DLLEXPORT int MeanVariance_mean(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    const mint id = MArgument_getInteger(Args[0]);
    if (MeanVariance_collection.find(id) == MeanVariance_collection.end()) { libData->Message("noinst"); return LIBRARY_FUNCTION_ERROR; }

    try {
       double res = (MeanVariance_collection[id])->mean();
       MArgument_setReal(Res, res);
    }
    catch (const mma::LibraryError & libErr)
    {
       libErr.report();
       return libErr.error_code();
    }

    return LIBRARY_NO_ERROR;
}


extern "C" DLLEXPORT int MeanVariance_variance(WolframLibraryData libData, mint Argc, MArgument * Args, MArgument Res)
{
    const mint id = MArgument_getInteger(Args[0]);
    if (MeanVariance_collection.find(id) == MeanVariance_collection.end()) { libData->Message("noinst"); return LIBRARY_FUNCTION_ERROR; }

    try {
       double res = (MeanVariance_collection[id])->variance();
       MArgument_setReal(Res, res);
    }
    catch (const mma::LibraryError & libErr)
    {
       libErr.report();
       return libErr.error_code();
    }

    return LIBRARY_NO_ERROR;
}
POSTED BY: Szabolcs Horvát
2 years ago

Version 0.3 of LTemplate is available now: https://github.com/szhorvat/LTemplate

Major changes since 0.2:

  • mlstream.h auxiliary header that makes it easier to handle function arguments and return values with MathLink-based passing
  • preliminary sparse array support
  • expanded documentation
  • a skeleton project is now included to make it quick and easy to set up a complex multiplatform LTemplate-based application
  • many fixes
POSTED BY: Szabolcs Horvát
1 year ago

This is pretty cool, Szabolcs, thanks for sharing this with the community.

One thing you might consider adding in future is something we've been taking advantage of for the neural network implementation we're working on for 11, which is to use the new, fast RawJSON import/export facility to handle things like multiple return values, associations, lists of strings, lists of booleans, tuples of mixed types, and other such things that don't have a native LibraryLink representation.

You can use Developer`ReadRawJSONString and Developer`WriteRawJSONString on the Mathematica side to efficiently serialize/deserialize all but large numeric tensors to/from UTF8 JSON strings, which can then be sent/received to the C++ side. On the C++ side you can use a header-only JSON parsing library to parse these and with a few simple functions turn them into things like std::vectors or hashmaps of strings and so on. It's possible to use templates to even make JSON (de)serializers that handle arbitrary kinds of nested vectors, tuples, hashmaps and so on without having to write any boilerplate.
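To illustrate the template idea on the C++ side, here is a small sketch of a serializer that handles arbitrarily nested vectors and string-keyed maps through overloading, with no per-type boilerplate. It is written from scratch for this example (only the writing direction, integers, booleans and strings), so it is not the code described above and not a replacement for a real JSON library.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Leaf types.
std::string to_json(bool v) { return v ? "true" : "false"; }
std::string to_json(int v) { return std::to_string(v); }
std::string to_json(long long v) { return std::to_string(v); }

std::string to_json(const std::string &s) {
    std::string out = "\"";
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n"; break;
            case '\t': out += "\\t"; break;
            default:
                if (c < 0x20) {  // remaining control characters
                    char buf[8];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out + "\"";
}

// Without this overload a string literal would silently pick the bool version.
std::string to_json(const char *s) { return to_json(std::string(s)); }

// Declare the container overloads first so that nested containers find each other.
template <typename T> std::string to_json(const std::vector<T> &v);
template <typename T> std::string to_json(const std::map<std::string, T> &m);

template <typename T>
std::string to_json(const std::vector<T> &v) {
    std::string out = "[";
    bool first = true;
    for (const auto &x : v) {
        if (!first) out += ",";
        first = false;
        out += to_json(x);
    }
    return out + "]";
}

template <typename T>
std::string to_json(const std::map<std::string, T> &m) {
    std::string out = "{";
    bool first = true;
    for (const auto &kv : m) {
        if (!first) out += ",";
        first = false;
        out += to_json(kv.first) + ":" + to_json(kv.second);
    }
    return out + "}";
}
```

A std::vector<std::vector<int>> holding {{1, 2}, {}, {3}} serializes to [[1,2],[],[3]], and the resulting string could then be returned to Mathematica and read with Developer`ReadRawJSONString.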

As long as the bottleneck isn't serialization, it makes it much easier to work with weird data types using this technique. Large tensors of course are better suited to going through the normal LibraryLink mechanism, and returning multiple such tensors is better suited to your current approach.

Alternatively, you can use the ExpressionJSON variants of the above Developer functions to send arbitrary symbolic expressions over, so that your C++ program can process and emit things that aren't just lists and associations of the basic types, but could be e.g. polynomials or Entities or Quantities or what-have-you.

LibraryLink has picked up support for RawArrays, though that's not documented since RawArrays are themselves not documented. Let me know if you are interested in that and I'll share how to do that.

I also have a snippet of code that lets you write debugf("fmtstring", arg1, arg2...) from the C/C++ side and have that print immediately to your notebook on the Mathematica side, which is invaluable when debugging. This can be turned on and off via an EnableDebugPrint[] function, so you can leave it in production code and turn it on when you need to.
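The general shape of such a switchable debug printer can be sketched as follows. This is an independent illustration, not the snippet mentioned above: enable_debug_print and debug_sink are assumed names, and the sink is an in-memory list standing in for output sent to the notebook.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// Global switch, off by default so production code stays silent.
static bool debug_enabled = false;

// Stand-in for the notebook: messages are collected here.
std::vector<std::string> debug_sink;

void enable_debug_print(bool on) { debug_enabled = on; }

// printf-style debug output that does nothing unless enabled.
void debugf(const char *fmt, ...) {
    if (!debug_enabled) return;
    char buf[512];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buf, sizeof buf, fmt, args);  // long messages are truncated
    va_end(args);
    debug_sink.push_back(buf);
}
```

In an LTemplate project the last line of debugf could presumably forward the formatted string to mma::print instead of storing it.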

POSTED BY: Taliesin Beynon
1 year ago

"LibraryLink has picked up support for RawArrays, though that's not documented since RawArrays are themselves not documented. Let me know if you are interested in that and I'll share how to do that."

Is it possible to specify the type or rank of the RawArray in LibraryFunctionLoad?

LibraryDataType[RawArray] or simply RawArray works as a type specification. But LibraryDataType[RawArray, "Integer8"] doesn't.

EDIT: Sorry, I didn't realize this post actually went through. There are problems with the Community website sometimes. I grepped the files in the $InstallationDirectory for uses of RawArrays in LibraryLink, and there isn't a single use which specifies type or rank. I think that's a good enough reason not to try to do that myself either. I can verify on the C side that I got the expected type. If it's not the expected one, I either throw an error or convert it. We have MRawArray_convertType.

POSTED BY: Szabolcs Horvát
11 months ago

Hi Taliesin,

Thank you for the comments and tips! When I read your post, what immediately struck me was: why are you not using MathLink for this instead of JSON? MathLink must surely be faster than serializing to a textual representation. Or is it?

So I tried it out:

  • I generate a list of integer arrays of random lengths between 0..10 (I keep them short to eliminate any possible packed array advantage MathLink might or might not have). The integers can be as large as 1000000000.
  • Then I send this to Mathematica using either MathLink or JSON (using RapidJSON, which claims to be very fast).

And indeed, the JSON version is faster ...

This generates a list of $2^{21}$ tiny integer lists:

In[36]:= obj@"generate"[2^21]

Transfer using MathLink:

In[38]:= expr = obj@"getML"[]; // AbsoluteTiming    
Out[38]= {1.94122, Null}

Transfer using JSON:

In[40]:= AbsoluteTiming[
 expr2 = Developer`ReadRawJSONString[obj@"getJSON"[]];
 obj@"releaseJSONBuffer"[];
 ]

Out[40]= {1.33406, Null}

In[41]:= expr == expr2    
Out[41]= True

The JSON version is indeed faster.

But how is that possible? Doesn't MathLink use a binary representation for this, and shouldn't that take up less space and be faster?

Effectively this is how I transferred the data using MathLink

std::vector<std::vector<int>> list;
...
MLPutFunction(link, "List", list.size());
for (const auto &vec: list)
    MLPutInteger32List(link, vec.data(), vec.size());

I did notice that the result does take up a lot more space in Mathematica than in JSON serialization:

In[31]:= Developer`WriteRawJSONString[expr] // ByteCount
Out[31]= 147593352

In[32]:= ByteCount[expr]
Out[32]= 352419368

That is understandable: in JSON a 32-bit integer is at most 10 digits, i.e. at most 10 bytes. In Mathematica each (non-packed-array-member) integer is 8 bytes plus some meta information, totalling 16 bytes according to ByteCount.

But MathLink should be more efficient than that: given that I use MLPutInteger32List and I am not putting each integer one by one, it should in principle be able to transfer them in some "packed" format. Furthermore, it should only use 32 bits (not 64) for each, until they are read by the kernel.

Does this mean that MathLink is due for an update? Or does it have some inherent limitation which prevents it from being more efficient than it already is? Or perhaps we see the function call overhead compared to a header-only (thus fully inlineable) JSON library? It should definitely be possible to make a binary format faster than a text-based JSON (maybe Cap'n Proto which you mentioned before, or similar).

If I generate random-length lists in the length range 0..100 instead of 0..10, then the performance advantage of JSON goes away.

In[43]:= obj@"generate"[2^18]

In[44]:= expr = obj@"getML"[]; // AbsoluteTiming    
Out[44]= {1.47904, Null}

In[45]:= AbsoluteTiming[
 expr2 = Developer`ReadRawJSONString[obj@"getJSON"[]];
 obj@"releaseJSONBuffer"[];
 ]
Out[45]= {1.78779, Null}

Another question: if you use JSON transfer for a machine learning application, isn't it a problem that converting from binary to decimal and back may not leave floating point numbers intact? There may be a very small rounding error.
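Whether a decimal round trip is lossless depends on how many significant digits are written. The standalone sketch below checks this for IEEE doubles using only the C++ standard library: 17 significant digits (std::numeric_limits<double>::max_digits10) are enough to recover the exact value, while 15 are not always. How many digits a particular JSON writer emits is not checked here and would need to be verified for the library in use.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Format a double with the given number of significant decimal digits.
std::string to_decimal(double x, int digits) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*g", digits, x);
    return buf;
}

// Parse a decimal string back into a double.
double from_decimal(const std::string &s) {
    return std::strtod(s.c_str(), nullptr);
}

// True if writing x with `digits` significant digits and reading it back
// yields exactly the same double.
bool round_trips(double x, int digits) {
    return from_decimal(to_decimal(x, digits)) == x;
}
```

For example, 0.1 + 0.2 survives the trip with 17 digits but comes back as a slightly different number (the double nearest to 0.3) with 15.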

POSTED BY: Szabolcs Horvát
1 year ago

Thanks for doing the benchmark, that's useful to know.

I wasn't trying to suggest that JSON makes sense for numeric data, or as a replacement for MathLink. I think the criteria for when to use JSON as a protocol are: 1) developer time, rather than performance, is the bottleneck; 2) the data is not numeric, i.e. it is not one or many floating point values that need to round-trip accurately; 3) the data is structurally fairly complicated, e.g. it involves associations in some natural way, or multiple fields.

And if you are in a situation where JSON is naturally emitted by a third-party library, parsing it directly on the Mathematica side is obviously preferable to translating the JSON to MathLink calls on the C++ side.

In our particular application, we use JSON here and there for transmitting 'metadata' back to Mathematica. Actual numeric tensors are communicated using the RawArray interface.

POSTED BY: Taliesin Beynon
1 year ago

"LibraryLink has picked up support for RawArrays, though that's not documented since RawArrays are themselves not documented. Let me know if you are interested in that and I'll share how to do that."

Yes, I would be quite interested in this if you are willing to share, and debugf as well :-) It is a good idea to make it possible to turn on/off such output.

POSTED BY: Szabolcs Horvát
1 year ago

I haven't tested this, it's a distilled version of my actual code, but it shouldn't be far off:

https://gist.github.com/taliesinb/a3385002601421b3e8e2

For RawArrays, I actually found a nice notebook from Piotr about it, but I don't want to share it without his permission. I've drawn his attention to this thread.

POSTED BY: Taliesin Beynon
1 year ago

Thank you! I'm looking forward to it as I am quite curious about what RawArrays can be used for.

I have never used RawArrays before, partly because they are undocumented and partly because they did not seem all that useful for pure-Mathematica programming. Today I spelunked a bit, looked at the RawArray functions in Developer`, and noticed that Normal and equality comparison (==, ===) work on them.

I am hoping to be able to use them to represent special data structures directly, but also memory-efficiently, as (immutable) Mathematica expressions, and thus integrate them much better into Mathematica.

Currently LTemplate is very much focused on working with opaque and mutable objects. The memory is allocated and managed completely on the C side, and Mathematica only has a reference to C-side objects (as an integer, using managed library expressions). These mutable objects are not a good fit for pure-Mathematica programming, so when I used LTemplate in a published package, I completely hid them from the end user.

What can we do with RawArrays in Mathematica other than create them, convert them to a list, or compare them?

I imagine applications such as representing the state of a fast random number generator, and integrating it into Mathematica's RNG framework. I once wanted to do this with LTemplate and used the managed library expression ID as the Mathematica-side "state". But it turned out that Mathematica's RNG framework requires that states be comparable with == (this doesn't seem to be documented, but if they are not equality-comparable, things break). So it won't work this way. It would be necessary to represent the state as a Mathematica list. But that is messy and more work than I'd want to do, because Mathematica integers (mint) may not map to what a particular RNG implementation might use internally, and also the mint size differs between platforms. I don't think 32-bit platforms will go away just yet: the Raspberry Pi is 32-bit. With a RawArray whose internal type is known and fixed this would be easier. I guess in principle a RawArray["Byte",...] could store an arbitrary C++ POD type. (Of course there are also applications like storing images, sounds, compressed byte streams, etc.)
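The idea of storing a POD type in a byte array can be sketched in plain C++ as below. A std::vector of bytes stands in for the data of a byte RawArray, and RngState is a hypothetical generator state invented for this example; nothing here uses the RawArray API itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical generator state with fixed-width members, so its layout does
// not depend on the size of mint.
struct RngState {
    std::uint64_t s[2];
    std::uint32_t counter;
};

// Copy any trivially copyable value into a byte buffer.
template <typename T>
std::vector<std::uint8_t> to_bytes(const T &value) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    std::vector<std::uint8_t> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Restore a value from a byte buffer; fails if the size does not match.
template <typename T>
bool from_bytes(const std::vector<std::uint8_t> &bytes, T &out) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    if (bytes.size() != sizeof(T)) return false;
    std::memcpy(&out, bytes.data(), sizeof(T));
    return true;
}
```

Two caveats apply if the bytes are to be compared with ==: padding bytes inside the struct are not guaranteed to have a fixed value, and the byte layout depends on the platform's endianness and alignment, so such a buffer is only meaningful on the machine that produced it.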

I will be away for ~10 starting today, so I'll only be able to check responses afterwards.

POSTED BY: Szabolcs Horvát
1 year ago

Congratulations! This post is now a Staff Pick as distinguished by a badge on your profile! Thank you, keep it coming!

POSTED BY: Moderation Team
25 days ago
