Preamble
This is an answer to a question HERE on how one can organize multi-file projects in Mathematica / WL. There are two main axes in this problem:
- Loading procedure / loading sequence
- Namespace / context management
To a first approximation, one can treat these independently. However, any complete solution has to address both, and the two choices may be interdependent, so I will consider them together.
Simplest case: multi-fragment (multi-file) project, single namespace
This is more or less what is suggested in the tutorial you mentioned. In this scheme, there is a single public context MyProject` and (usually) a single private context MyProject`Private`, just like in a single-file package. However, the code is split into several files / fragments.
While there is a single public and a single private namespace, splitting the code into several fragments requires one fragment that declares all public / exported symbols, even if their implementation lives in a different fragment, and that fragment has to be loaded before any others.
Since an init.m file is present in the Kernel sub-folder, it is the entry point, and you are free to customize everything in that init.m. There are two main variations one can employ in this scheme.
Namespace-aware fragments
You load all fragments within init.m using Get, one after another.
This method has the following advantages:
- Project's loading sequence is explicit and declarative.
- It will be easier to statically analyze, if such a need arises
- One doesn't need to use the $InputFileName variable and absolute paths
But every fragment would need to separately wrap its code in BeginPackage - EndPackage, as well as Begin["`Private`"] - End[] for the private section, which one may consider a disadvantage.
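A minimal sketch of this variant, assuming the MyProject folder sits in a directory on $Path (for example under $UserBaseDirectory/Applications), so that context-style names resolve to files. The fragment name Declarations.m and the symbols myFunction and normalize are hypothetical, used only for illustration. The separator comments mark file boundaries.

```mathematica
(* ---- Kernel/init.m ---- *)
(* Explicit loading sequence: declarations first, then implementations. *)
Get["MyProject`Declarations`"]
Get["MyProject`Utilities`"]
Get["MyProject`Main`"]

(* ---- Declarations.m (hypothetical fragment) ---- *)
(* Creates every exported symbol in MyProject` before any implementation loads. *)
BeginPackage["MyProject`"]
myFunction::usage = "myFunction[x] returns the square of the normalized x.";
EndPackage[]

(* ---- Utilities.m ---- *)
BeginPackage["MyProject`"]
Begin["`Private`"]
(* private helper, created as MyProject`Private`normalize *)
normalize[x_] := x/Max[Abs[x], 1]
End[]
EndPackage[]

(* ---- Main.m ---- *)
BeginPackage["MyProject`"]
Begin["`Private`"]
(* myFunction resolves to the public symbol, normalize to the private one *)
myFunction[x_] := normalize[x]^2
End[]
EndPackage[]
```

Because the whole loading sequence is spelled out in init.m, the order can be changed there without touching any fragment.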
One master fragment and N slave fragments
You only load your main fragment within init.m with Get, and inside the main fragment's Private` section you load all other fragments.
This method couples the project's loading with the execution of implementation code, but then you don't need to assign a namespace for each fragment - if you load it inside the first / main fragment's Private` section, other fragments may contain just the code, and will automatically be parsed into the Private` context - so that in this approach only the main fragment needs the structure BeginPackage - EndPackage, as well as Begin["`Private`"] - End[].
It is however less declarative, and one would need to use $InputFileName and absolute paths to load other fragments.
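A minimal sketch of this variant, under the same assumption that the MyProject folder is on $Path. The symbols myFunction, normalize and $fragmentDir are hypothetical, and ExtraTools.m is assumed to exist next to Main.m. The separator comments mark file boundaries.

```mathematica
(* ---- Kernel/init.m ---- *)
Get["MyProject`Main`"]

(* ---- Main.m (the main fragment) ---- *)
BeginPackage["MyProject`"]

myFunction::usage = "myFunction[x] returns the square of the normalized x.";

Begin["`Private`"]

(* $InputFileName is the absolute name of the file currently being read, *)
(* so take this file's directory before any nested Get call. *)
$fragmentDir = DirectoryName[$InputFileName];

(* Each fragment is read while $Context is MyProject`Private`. *)
Get[FileNameJoin[{$fragmentDir, "Utilities.m"}]]
Get[FileNameJoin[{$fragmentDir, "ExtraTools.m"}]]

myFunction[x_] := normalize[x]^2

End[]
EndPackage[]

(* ---- Utilities.m (bare code, no BeginPackage / Begin wrappers) ---- *)
normalize[x_] := x/Max[Abs[x], 1]
```

Here normalize ends up as MyProject`Private`normalize only because Utilities.m is read from inside the Private` section. Loading that file on its own would create the symbol in whatever context happens to be current.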
Problems solved and not solved in this approach
The main problems that this approach solves:
- It is much more convenient to work with smaller source files
- One can split code according to the different functionality, making it easier to reason about
The problems this approach does not solve
- Name collisions between private symbols: since there is still a single public and a single private namespace, one may forget that a given private symbol has been defined in one fragment and define it again for a completely different purpose.
- No real modularization: even though the code is split into fragments, the developer is not encouraged to minimize the fragments' interdependencies, avoid circular dependencies, etc. Moreover, even if the code has pieces which are completely independent from each other, a single namespace does not allow one to really separate them well enough.
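The collision problem can be illustrated with a small sketch. For brevity the code of two fragments is shown inline in one private section, and all symbol names (areaOfSquare, halve, factor) are hypothetical.

```mathematica
BeginPackage["MyProject`"]
areaOfSquare::usage = "areaOfSquare[side] gives the area of a square.";
halve::usage = "halve[x] gives half of x.";
Begin["`Private`"]

(* Code from the first fragment *)
factor = 2;
areaOfSquare[side_] := side^factor

(* Code from a second fragment, written later for an unrelated purpose *)
factor = 1/2;
halve[x_] := factor x

End[]
EndPackage[]

(* Both definitions share MyProject`Private`factor, *)
(* so this gives Sqrt[3] rather than 9 *)
areaOfSquare[3]
```

Nothing warns about the second assignment, which is why such bugs tend to surface far from the place where they were introduced.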
New-style packages: undocumented modern way of structuring multi-fragment packages
With the usual disclaimer that one can't rely on undocumented functionality, and can use it only at one's own risk, let me briefly describe the new package format (AFAIK, available since V10).
Project's structure
In this format, the project typically has a flat structure, similar to the structure you mentioned:
MyProject
    Kernel
        init.m
    Main.m
    ExtraTools.m
    Utilities.m
The file Kernel/init.m is optional if one of the fragments is named the same as the project's folder, which in the above case would be MyProject.m. One can, however, still use Kernel/init.m if one wants some other fragment to run first.
Note that one doesn't need to manually load all the fragments: for a new-style package this is done automatically by Get, so it is enough to indicate only the fragment that you want to run first.
Syntax and scoping
In this format, you split your package into several fragments, each of which has to start with a declaration:
Package["MyPackage`"]
In this format, you don't need to wrap the code in BeginPackage - EndPackage or Begin - End. Instead, if you want to make your symbol public (i.e. having the context MyPackage`), you write
PackageExport["MySymbol"]
anywhere in the file / fragment (in fact, it does not matter in which fragment even, although it is more logical usually to do that in the one where you implement that symbol).
There is one additional level of scope in this format, which is package scope. Package-scoped symbols are available to all fragments in the package, but not exported to the end user. You declare them as
PackageScope["MyInnerSharedSymbol"]
Package-scoped symbols live in the MyPackage`PackageScope` context.
If you want to import other packages, you use PackageImport, which is an analog of Needs:
PackageImport["DatabaseLink`"]
Typically one does that at the start of the fragment, after the Package declaration. You shouldn't use Needs in place of PackageImport in this format.
All symbols that are not exported, not package-scoped, and not in any of the imported packages, are considered private. The difference with the standard format, however, is that each fragment has its own private context, so that private contexts of different fragments are different and don't collide. For example, for a fragment named ExtraTools.m, the private context will be MyPackage`ExtraTools`PackagePrivate`, while for the fragment Utilities.m, it will be MyPackage`Utilities`PackagePrivate`. This separation of private sub-contexts is also why package-scoped symbols become necessary, as a way to privately communicate between different fragments.
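Putting the three scopes together, a fragment named ExtraTools.m might look like the sketch below. It only follows the description above (the format is undocumented, so treat it as an illustration rather than a reference), and localHelper is a hypothetical name.

```mathematica
Package["MyPackage`"]

PackageExport["MySymbol"]
PackageScope["MyInnerSharedSymbol"]

(* MyPackage`MySymbol: visible to the end user *)
MySymbol[x_] := MyInnerSharedSymbol[x] + localHelper[x]

(* MyPackage`PackageScope`MyInnerSharedSymbol: usable from any fragment *)
MyInnerSharedSymbol[x_] := 2 x

(* MyPackage`ExtraTools`PackagePrivate`localHelper: this fragment only *)
localHelper[x_] := x^2
```

A different fragment could define its own localHelper without any clash, since that symbol would live in that fragment's own PackagePrivate` context.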
How it works, and a few gotchas
How it works (brief summary)
The way this format works is different from the standard one, where all contexts of all symbols were determined at run-time. Here, all fragments are first analyzed statically, to determine which symbols should be created in which contexts. During that analysis, dependencies (packages imported using PackageImport) are loaded dynamically, so one probably can't call it fully static. After all symbols get resolved, the fragments are actually run.
Why Needs cannot be used
The above makes it clearer why Needs should not be used in place of PackageImport: the code using Needs executes too late, after all static analysis has already been performed, so the imported symbols won't have a chance to resolve in time and will instead be treated as private symbols in the package's code.
Loading order
Unfortunately, one can't easily control the fragments' execution order, which is alphabetical in the fragments' names. One can, however, point Get in the init.m to a specific fragment to ensure that it is executed first, followed by all the rest in the alphabetical order of their names.
If you must enforce a specific loading order, you can mangle the fragments' names by adding letters A, B, etc. in front, so that their alphabetical order corresponds to the order in which you actually want them to be loaded. However, one might argue that dependence on a very specific loading order for several fragments is evil, and a sign of bad design.
A pitfall with declaration statements
It is important to point out that the statements PackageExport, PackageScope, PackageImport are actually tokens for a static analyzer, rather than real Mathematica code. In particular, wrapping them into anything won't work:
PackageImport["MyOtherPackage`"];
or
If[var === val, PackageExport["MySymbol"]]
won't work (the semicolon at the end is a very common error).
Per-fragment nature of PackageImport
Regarding PackageImport, one non-obvious thing is that a package imported in one fragment is not considered imported in others. This means that, if you use some package in several fragments of your new-style package, you need to PackageImport it in all of them, even though they are parts of the same big package you are developing. This behavior has both pros and cons, although it may often seem an inconvenience.
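As a sketch of what this means in practice, here are two fragments that both use DatabaseLink. The function names runQuery and openNamed are hypothetical. The separator comments only mark file boundaries in this listing and are not part of the files themselves.

```mathematica
(* ---- ExtraTools.m ---- *)
Package["MyPackage`"]

PackageImport["DatabaseLink`"]

PackageScope["runQuery"]

(* SQLExecute resolves to the DatabaseLink symbol thanks to the import above *)
runQuery[conn_, sql_String] := SQLExecute[conn, sql]

(* ---- Utilities.m ---- *)
Package["MyPackage`"]

PackageImport["DatabaseLink`"]

PackageScope["openNamed"]

(* Without the import repeated in this fragment, OpenSQLConnection would be *)
(* created as a private symbol of Utilities.m instead *)
openNamed[name_String] := OpenSQLConnection[name]
```

Note that none of the declaration lines ends with a semicolon, in line with the pitfall described earlier.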
Problems solved and not solved in this approach
The main problems that this approach solves:
- All of the problems that the simpler approach solves
- Better encapsulation: no collisions between private symbols of different fragments
- Somewhat better modularization: the fact that one can't use private symbols of one fragment in the other fragment, forces one to more carefully separate the interface from implementation. One is given the tool of package-scoped symbols to use for such communication, but you will be better off reducing their number to a necessary minimum
Problems still not solved in this approach
- A still better level of modularization can only be achieved when symbols belonging to different structural parts live in different namespaces
- The new-style package format as such does not scale well when the amount and complexity of code grows. This is so both because it does not provide a true multi-level modularization, and because individual modules are not separately reloadable or otherwise self-contained (for example, can't be tested in isolation).
Still, this method is a huge step in the right direction, and for most small to medium or even large projects can be enough.
Multi-package projects
If your code base really becomes large and complex, you may want to upgrade to a more flexible and powerful structure. The basic idea is to make the different modules of your project independent, full-fledged packages. For each individual module, one can use any of the methods I described earlier. One would also need a few additional pieces, such as a custom loader and probably an interface section.
I will describe this method in more detail, if there is enough interest.
Conclusions
Depending on one's needs and also stylistic preferences, one can use a number of different schemes to scale up the code and create multi-file projects. I have described a few methods that should do it for most user-level projects.