How do I set up multiple .m files in a package?

I have read through the documentation and watched the videos. I even poked around to find older material.

I think I understand how to set up a package in Workbench.

However, all the examples I have seen have a single .m file for the code.

I want to make a package/application that has multiple source files.

How do I do this?

Does each .m file have to have its own BeginPackage[]/EndPackage[] context, or can there be just one .m file that sets the context and the other source files are included as optional arguments to the only BeginPackage[] function call?

In the past, packages were set up with a 'master' package that, when called with Get[] or Needs[], would load all the sub-packages -- each of which could also be loaded individually. How would I go about setting this up?

Thanks.

13 Replies

As the original poster, I would like to thank everyone for this excellent discussion. I have a lot of ideas to test out.

Clearly, this topic is complicated enough to have some 'official' tutorial from Wolfram Research. I am hopeful that with the release of Workbench 3, there will be a revision in the classes to go along with it. At one time, I was promised a second level course on Workbench and advanced programming topics, but this topic seems to have been lost in the ever lengthening list of things to do.

I can understand this, because this topic is of little interest or use to the vast majority of Wolfram language users -- those who use the tool as a big graphical calculator or rapid prototyping tool. However, the strength of any language's culture is the degree of support it provides to those people who want to push the envelope.

This is an example of a roll-up of packages into a "Main`" package. Note: some backticks in the text are deleted because they interfered with the formatting. They worked fine in the code examples.

First the two packages "P3subP1m" and "P4subP1m" roll up into "P1m". Then "P1m" and "P2m" roll up into the "Main" package. Since I am using global symbols in my packages, and I need them to be recognized because I don't want private copies to be created in all the places they appear, I add Needs["Appglobals`"] to every package. All packages must be .m or .wl files. You cannot simulate this in a notebook by executing every single command. Start with Needs["Main`"] in a notebook. When you ask for ff in the global context, the following is asked for: {qq,pp,zz,hh,GCriskpath,GVdataobject}. qq and pp are created as "Main Private qq" and "Main Private pp" because they are in usage statements more than 1 level lower and are therefore not recognized. zz and hh are in usage messages 1 level lower and are picked up. The function zz:=pp+qq is now evaluated and picks up the values in the packages 1 level lower.

BeginPackage["Main`"];
ff::usage = "ff does something";
Needs["Appglobals`"];
Needs["P1m`"];
Needs["P2m`"];
Begin["`Private`"];
ff:=(Print["This is the Main func:"];{qq,pp, zz,hh, GCriskpath,GVdataobject} );
End[];
EndPackage[];

BeginPackage["P1m`"];
zz::usage="zz ...";
Needs["Appglobals`"];
Needs["P3subP1m`"];
Needs["P4subP1m`"];
Begin["`Private`"];
zz := pp + qq
End[];
EndPackage[];

BeginPackage["P2m`"];
hh::usage="hh ...";
Needs["Appglobals`"];
Begin["`Private`"];
hh := 13
End[];
EndPackage[];


BeginPackage["P3subP1m`"];
pp::usage="pp ...";
Needs["Appglobals`"];
Begin["`Private`"];
pp := 11
End[];
EndPackage[];

BeginPackage["P4subP1m`"];
qq::usage="qq ...";
Needs["Appglobals`"];
Begin["`Private`"];
qq :=22
End[];
EndPackage[];

BeginPackage["Appglobals`"];
GCriskpath = "the path to a local file folder"
GVdataobject = {{"a","b"},{"c","d"}}
EndPackage[];

After evaluating: Needs["Main`"] only the Main context is on the context path.

{$Context, $ContextPath} shows only Main on the context path:
{"Global`", {"Main`", "TestPackage`", "TemplatingLoader`", "PacletManager`", "System`", "Global`"}}

$Packages shows all sub-packages have been read.

{"P2m`", "P4subP1m`", "P3subP1m`", "P1m`", "Main`", "QuantityUnits`", "GetFEKernelInit`", "TemplatingLoader`", "ResourceLocator`", "PacletManager`", "System`", "Global`"}

Find all symbols present with one of the contexts in it. This is as expected.

Map[Names[# <> "*" <> "`*"] &, {"P1m", "P2m", "Main", "P3subP1m",    "P4subP1m"}] // Column
{
 {{"P1m`zz"}},
 {{"P2m`hh"}},
 {{"ff", "Main`Private`pp", "Main`Private`qq"}},
 {{"P3subP1m`pp"}},
 {{"P4subP1m`qq"}}
}

Would a user of this package "Main" normally only have ff as an available routine? Users would not normally dig into the tree structure using contexts or even know what the sublevel contexts are.

So the development of the application is like this: There is a master programmer. The master programmer has A level programmers who work for him and speak only to him. Each A level programmer may have B level programmers who work for them and speak only to them. In turn the B level programmers may have C level programmers, etc. The master programmer is the only one who speaks to the outside world.

This is an incredibly structured and organized system! Very authoritarian. Agreed that some projects may require it.

I would like to suggest that this is not the way that scientists or mathematicians work or the way research or study projects are organized. To begin with they don't know what the tree structure is in advance. If they knew they wouldn't be doing research or carrying out a study program. If they commit to a tree structure they will be limiting their flexibility and maybe biasing their thinking. Otherwise they must be continually revising the tree structure, which can get to be a messy chore.

These might be projects that involve just one or a small number of collaborators. Still, over time, they may assemble a very worthwhile application with multiple packages, documentation and accompanying notebooks. I believe these kinds of users can be important to WRI, more so than very large scale industrial projects.

For these types of projects a looser parallel structure is needed. The folder structure for such an application might look like the following:

`$UserBaseDirectory`/Applications/
  ApplicationName/
   Documentation Folder,
   FrontEnd Folder
   Kernel/init.m
   PackageA.m
   PackageB.m
   PackageC.m

The three packages will all export routines to final users. In addition they may use routines exported by each other. Why might that happen? Because an organization by topic, or by individuals, may not correspond to a pure tree structure.

Here, I believe, is a way to fix it. Use a special form of init.m file for the application.

Declare the packages in the application.

ApplicationNamePackages = {"PackageA`", "PackageB`", "PackageC`"};

Build the full context names that each package will add to its $ContextPath.

System`$ApplicationNameContextPaths = 
  "ApplicationName`" <> # & /@ ApplicationNamePackages;

Each package will then use the following statement after the BeginPackage statement. (As with Pieter only the first argument is used in the BeginPackage statement.)

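(* Append without re-sorting: Union would sort $ContextPath and so change the symbol lookup order. *)
$ContextPath = DeleteDuplicates[Join[$ContextPath, System`$ApplicationNameContextPaths]];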

Going back to the init.m file, the following routine will read the package Public symbols for each package, to establish contexts, but clear all the definitions including all definitions in the Private context. (It would be sort of nice if we had a command that would read only the Public part of a package and never touch the definitions in the Private section.)

ReadApplicationNamePackageNames[packageList_] :=
 Module[{fullPackageName, packageSymbolNames},
  Do[
    fullPackageName = "ApplicationName`" <> pname;
    Get[fullPackageName];
    packageSymbolNames = Names[fullPackageName <> "*"];
    Unprotect /@ packageSymbolNames;
    Clear /@ packageSymbolNames;
    Clear @@ {fullPackageName <> "Private`*"},
    {pname, packageList}];
  ]

The following does this for the packages.

ReadApplicationNamePackageNames[ApplicationNamePackages]; 

Finally, we reload all of the packages to obtain the definitions, now all placed in the correct context.

Get["ApplicationName`PackageA`"]
Get["ApplicationName`PackageB`"]
Get["ApplicationName`PackageC`"]

That is the end of the init.m file. The key requirement here is that all contexts for the application are known before any Private section code is established. It's possible that duplicate names might be used from different packages, which will produce shadowed warnings. But this can also occur in using routines from different applications or conflicting with WRI symbols. This is easily fixed and only a minor problem.

Perhaps this approach can be combined with the method described by Pieter. We would have more than one "Main" package, and these might or might not contain sub-trees. So in the above init.m code PackageA, PackageB and PackageC, would be three "Main" packages.

(same post, something went wrong the first time...) This is an excellent question. First, my own remarks on programming in the Wolfram Language. I start with a general idea of what to create (the algorithm is my invention and the computer only executes it).

I create a few snippets of code for the crucial areas to check how complicated or easy the problem is (storyboard coding). Then I create a flowchart of input and output data formats and function names. This exercise gives more insight into the problem, and most of the time it simplifies the solution approach (do I need global data objects, or will I go for the non-global symbols approach?).

Then I create two notebooks: one to program the functions and one to test them. After programming the next function in the flowchart I test it in the test notebook (the programming notebook can be kept clean by doing all experiments in the test notebook). When a few new functions have been added and tested, it is time for an integration test by running the full test notebook. So before I look at the Workbench, all programming is done in 2 notebooks.

Every set of functions in my flowcharts is grouped into more aggregated blocks of code. The idea is to put every aggregated block of code into a subpackage and then merge it all together into 1 Main[] function. So the subpackages need to roll up into 1 main package, and my global variables need to be global to only the main package context. Let's load subpackages P1 and P2 and the globalvars package.

BeginPackage["Main`"]
main::usage = "main[]..."
Needs["Globalvars`"]
Needs["P1`"]
Needs["P2`"]
Begin["`Private`"]
 main := ....
End[]
EndPackage[]

The most important thing is not to include P1 and P2 in the second argument of BeginPackage. The reason is that contexts listed there are loaded with Needs before the BeginPackage command itself takes effect. This puts P1 and P2 on the context path before Main is added. The problem arises when EndPackage is executed: it switches back to the context path as it was before BeginPackage, so P1 and P2 stay on the context path when leaving the package. The way Needs is used here, after BeginPackage, is called a hidden import.
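The difference between the two kinds of import can be checked directly in a fresh kernel. The following is a minimal sketch, not code from the application described here: the contexts DemoSub`, DemoPublic` and DemoHidden` and all symbol names are hypothetical, and the sub-package is defined inline so that no files are needed.

```wolfram
(* Hypothetical sub-package, defined inline so that no file on disk is needed. *)
BeginPackage["DemoSub`"];
subValue::usage = "subValue is exported by DemoSub`.";
Begin["`Private`"];
subValue = 11;
End[];
EndPackage[];

(* Take DemoSub` off the path again to imitate a session that has not imported it. *)
$ContextPath = DeleteCases[$ContextPath, "DemoSub`"];

(* Public import: DemoSub` is named in the second argument of BeginPackage. *)
BeginPackage["DemoPublic`", {"DemoSub`"}];
EndPackage[];
publicImportVisible = MemberQ[$ContextPath, "DemoSub`"];

(* Reset the path before trying the other variant. *)
$ContextPath = DeleteCases[$ContextPath, "DemoSub`" | "DemoPublic`"];

(* Hidden import: Needs is called after BeginPackage. *)
BeginPackage["DemoHidden`"];
hiddenValue::usage = "hiddenValue uses subValue internally.";
Needs["DemoSub`"];
Begin["`Private`"];
hiddenValue := subValue + 1;
End[];
EndPackage[];
hiddenImportVisible = MemberQ[$ContextPath, "DemoSub`"];

{publicImportVisible, hiddenImportVisible, hiddenValue}
(* {True, False, 12} *)
```

With the public import the sub-package context is still on $ContextPath after EndPackage; with the hidden import it is not, although hiddenValue can still use subValue internally.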

Now there is a new trick in M10. A usage message exports a function from the private context to the package context. But this now also works for a package in a package in a package. Every function with a usage message is now visible only to the package 1 level higher. With this approach all your symbols are recognized in all your packages and you never have to use the full context to refer to something. So you end up with only 1 context added to the context path when you arrive back in the global context. Your global symbols are now global to the context Main and lower, so there is no issue if the user accidentally creates the same names in the Global` context.

This is the moment when I start to use the Workbench. I import my programming and testing notebooks and create .wl files for the package structure I need. This structure, by the way, depends very much on the number of programmers involved. Every programmer gets a part of the overall function flowchart. He/She then subdivides his module in files using Get as a stub to a placeholder section. The Workbench has support for Git, and via GitHub you get version management control for the whole team. When you start to do some builds it is good to know you have all documents, notebooks, packages etc. in one place. On some occasions programmers go off track while developing a solution and feel the need to roll their files back a few versions. This is easy when the package files are organized by programmer.

The new nested packages approach is a great addition for developing large programs in the Wolfram Language, since you organize by programmer and by aggregated package and apply version management to each of these files.

I find this rather difficult to understand Pieter. Can any package refer to an exported symbol from any other package, regardless of their relative positions in the tree structure? Can P1 make a hidden import of P2 and P2 make a hidden import of P1? Are the multiple packages no longer loaded by the init.m file?

If exported names are only known at the next higher level in a tree does that mean that the ultimate user can't use those lower level routines?

What does "He/She then subdivides his module in files using Get as a stub to a placeholder section." mean? Is a "module" a set of package files, each of which may have hidden imports? And what is GIT and GIThub?

This sounds like a method that is heavily dependent on the tree structure and not at all flexible.

A system that established the contexts of all exported symbols (to the ultimate user) first and only then established the definitions in the Private sections would be much more robust - and simpler.

David (Park), First my answer to your reply.

"Can any package refer to an exported symbol from any other package, regardless of their relative positions in the tree structure?" With BeginPackage["pack`", {"needs1`", "needs2`"}] all the contexts will be left on the context path after the EndPackage statement. So yes: when the user is in Global, all contexts will be checked in sequence for a particular symbol. This is not what I intend to create. I want to create a roll-up of packages and make only 1 symbol available to the global context.

"What does "He/She then subdivides his module in files using Get as a stub to a placeholder section." mean?" This is a package with two placeholders in it. ProgA and ProgB are .wl files that are not packages, but they do export symbols and each has its own private section:

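(* PPackA.wl: the package; each Get is a stub for one programmer's file *)
BeginPackage["PPackA`"]
Get["ProgA`"]; 
Get["ProgB`"];
EndPackage[]

(* ProgA.wl: no BeginPackage of its own, but it exports Aff *)
Aff::usage="Aff[] is exported outside the next begin end statement"
Begin["`Private1`"]
Aff[x_]:=x^2
End[]

(* ProgB.wl: no BeginPackage of its own, but it exports Agg *)
Agg::usage="Agg[] is exported outside the next begin end statement"
Begin["`Private2`"]
Agg[x_]:=x^2
End[]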

"And what is GIT and GIThub?" Git provides local version management in a folder on your disk (the Workbench workspace). Git takes snapshots of all folders and files in that directory. You can switch back to earlier versions per folder or per file, and you can compare files across versions. This is very important functionality when you have an existing application and want to change some things to improve it: if a change does not work, you need to be able to go back to the "production" code. GitHub is more of a sharing mechanism. The main "build" officer needs to be able to see all the "official" work of the programmers on the team (in their Workbench folders) in order to put it together.

I do not understand the remark about "a tree structure and not at all flexible". I am after a bulletproof piece of software that a user cannot accidentally change in a way that disables its functionality.

Posted 10 years ago

POSTED BY: Updating Name

George, I have one reply and one posting on Wolfram Community that may be useful to you.

The first basically shows how to lay out an application with multiple packages. You essentially load all the packages through the init.m file. This reply is in:

http://community.wolfram.com/groups/-/m/t/393033?ppauth=L19VHYIc

The second is my posting on:

Large Multipackage Applications

This deals with the problem of not maintaining a strict tree order, which may be difficult or inconvenient to do. I have two small zip files, ContextProblem.zip and ContextFix.zip, that illustrate the problem and a fix for it. The problem is that the BeginPackage statements read all the packages listed in the second argument and establish definitions. If the context of some symbol is not yet known, the symbol gets pushed into the Private context. A correct definition may later be established when the package itself is read in, but it's hit or miss which one will be used. The solution is to use the init.m file to first read in all the packages to establish contexts, but to Clear all the exported-symbol and Private definitions after reading each package. Then, with all the contexts established, read them again. That appears to work, but I haven't done large scale testing on it.
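The effect can be reproduced without the zip files. The following is a minimal sketch under stated assumptions: the contexts DemoA` and DemoB` and the symbols aFunc and bHelper are hypothetical, and the two packages are defined inline in the order in which they would be read, so it is an illustration of the problem and not the contents of ContextProblem.zip.

```wolfram
(* DemoA` is read first and refers to bHelper before DemoB` exists. *)
BeginPackage["DemoA`"];
aFunc::usage = "aFunc[x] is meant to call bHelper from DemoB`.";
Begin["`Private`"];
(* bHelper is unknown at this point, so it is created as DemoA`Private`bHelper. *)
aFunc[x_] := bHelper[x] + 1;
End[];
EndPackage[];

(* DemoB` is read afterwards and exports the real bHelper. *)
BeginPackage["DemoB`"];
bHelper::usage = "bHelper[x] doubles x.";
Begin["`Private`"];
bHelper[x_] := 2 x;
End[];
EndPackage[];

(* A stray private symbol exists alongside the exported one. *)
strayExists = NameQ["DemoA`Private`bHelper"];
exportedExists = NameQ["DemoB`bHelper"];

(* aFunc still points at the stray symbol, so this stays unevaluated instead of giving 7. *)
result = aFunc[3];

{strayExists, exportedExists, result === 7}
(* {True, True, False} *)
```

Reading DemoB` before DemoA`, or establishing all exported names before any Private code is read, avoids the stray symbol.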

It really would be nice if a Workbench for documentation were provided that was simple to use and was itself clearly documented. I view it as essential for serious Mathematica usage in science and mathematical applications.

David, I have investigated your "Context Problem example". One of the issues (I think) is the use of Needs in the BeginPackage statement. Let me know what you think about my reply to George.

Pieter, yes, let me read your comments more carefully later on and experiment a bit. I didn't know about the version 10 behavior of usage messages that you describe (is it documented somewhere that you can point me to?). The only immediate issue with that of course is if the package is to be used for an earlier version of Mathematica. But let me look over your comments later on when I have some time and I will respond.

POSTED BY: David Reiss

Yes, do some simple experiments to see if you have the general idea scoped out and let me know if you get to a point of confusion or contradiction...

POSTED BY: David Reiss

Thanks for the reply.

For my immediate project, I suppose I could put all the code in one .m file. There might be 10 or 20 pages of code when I am done.

However, my experience working with C (etc.) is that breaking up the code over multiple files is a good thing.

One idea that I had was to rewrite my C code (about 4000 pages) in Mathematica -- a significantly larger project than my current one, so my question really has to do with this larger project.

If I understand correctly, I could set up a master file that has BeginPackage["myBigProject`", {"subProj1`", "subProj2`", etc.}] and then each sub-project would start with BeginPackage["subProj1`"], etc.

If one (or more) of the projects would have utilities used by other sub projects, could I then use the second argument to Need[] the lower level project into the sub project? It seems logical.

My main reason for doing this type of factoring is to keep the model code separate from the view. Even for a modest project, that seems to be desirable. The view would only need to know about the Public stuff from the model, and vice versa.

I am planning to use Workbench so that I can create the documentation. I know that there is (probably) a way to make the Symbol, Guide, and Tutorial pages without using Workbench, but there may be advantages in using the same tools that Wolfram developers use internally.

I guess I will have to do some experiments....

One question that I'd ask first is why you want to do this. One reason may simply be organizational--that you want to create several different packages and have the code-base for each separate. An immediate question arises as to whether there are dependencies amongst the packages. Of course if there are dependencies they must be in a descending tree structure: no closed loops. Another question is whether you want all of the packages to be loaded at the same time. In the "old days" sometimes people wanted control over which packages in a group were loaded so as to conserve memory resources--that's rarely necessary these days though, unless a given package loads significant data resources, insanely large expressions, or needs to do some time-consuming calculations in the course of its loading.

I tend just to put everything into one package... and I generally program in a notebook rather than Workbench. But this is a matter of preference of course--there are advantages to each approach. If programming in a notebook, I just put the code for distinct areas in different section organizations. The size of the code is immaterial (subject to the caveats above): I have packages with on the order of 60k lines of code and they load quickly.

As for your question--there are several approaches. One general approach is to have a top-level context and to put all the subsidiary packages in their own contexts. Then, in the BeginPackage call for the top-level package, also include the other packages in BeginPackage's second argument. The packages included in that second argument need not be all of the other packages' contexts--the contexts in that BeginPackage command, as well as in the other packages' BeginPackage commands, can reflect the package dependency. Also, the advantage of using BeginPackage for each sub-package is that each will have a private sub-context, and so one can have the public symbols properly separated from the Private stuff in each package's sub-context.
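As a rough illustration of that layout, here is a minimal sketch with a model package, a view package that depends on it, and a top-level package that pulls in both. All names (DemoApp`, modelArea, showArea) are hypothetical. In a real application each block would live in its own .m file on $Path; here the blocks are defined inline, in dependency order, so the sketch runs without files.

```wolfram
(* Would be DemoApp/Model.m: no dependencies. *)
BeginPackage["DemoApp`Model`"];
modelArea::usage = "modelArea[r] gives the area of a circle of radius r.";
Begin["`Private`"];
modelArea[r_] := Pi r^2;
End[];
EndPackage[];

(* Would be DemoApp/View.m: depends only on the public part of the model. *)
BeginPackage["DemoApp`View`", {"DemoApp`Model`"}];
showArea::usage = "showArea[r] formats the area as a string.";
Begin["`Private`"];
showArea[r_] := "Area: " <> ToString[modelArea[r]];
End[];
EndPackage[];

(* Would be the master file: its second argument lists the sub-packages to load. *)
BeginPackage["DemoApp`", {"DemoApp`Model`", "DemoApp`View`"}];
EndPackage[];

areaText = showArea[1]
(* "Area: Pi" *)
```

Each sub-package keeps its own Private sub-context, and the second arguments record which package depends on which.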

Another possibility that one may think of is to write the subsidiary packages' code in a way that does not declare contexts and just pull it in using Get inside the main package's code as needed. This presents a problem, since this either has to be done before the Private sub-context or after it. Then it is a mess to get the subsidiary packages' public and Private symbols placed correctly: explicit full context names would need to be used. So this is a non-starter, I think.

So, those are some thoughts...

POSTED BY: David Reiss