Custom Import/Export via File Extension in 12.0.0

Posted 4 years ago

I am having an issue during the first block of code that executes during kernel initialization.

I ask because the first line of my notebooks often loads a large library of custom code. I have many custom file formats that rely on registering with the Import/Export mechanisms.

There has been a bug in Mathematica where FileFormatDump`$FILEFORMATS is not properly defined until the kernel returns after initializing. For example, in the front end, kill your associated kernel, and evaluate the above symbol. Then try evaluating again.
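The check described above can be sketched as a small test that is evaluated once right after the kernel starts and then again. The helper name `formatListReadyQ` is hypothetical, and `FileFormatDump`$FILEFORMATS` is an undocumented internal symbol, so this only illustrates the symptom under the assumption that the symbol holds a list once initialization has finished.

```wolfram
(* formatListReadyQ is a hypothetical helper name, not a built-in function. *)
(* It gives True only if the undocumented internal format list currently holds a list. *)
formatListReadyQ[] := ListQ[FileFormatDump`$FILEFORMATS]

(* Evaluate this as the very first input after killing the kernel, then evaluate it again. *)
formatListReadyQ[]
```

If the two evaluations disagree, the format list was not yet defined when the first input ran.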

In 11.3, you could resolve this problem by calling Import[] on a fake file. This would forcibly load the necessary packages (FileFormatDump`, ImportExport`, etc.)

In 12.0, the above solution no longer works.

1.) IMO, this is a bug. The kernel should not execute user code until it is fully initialized.

2.) How do I register a custom file extension during fresh kernel initialization?

The following code works EXCEPT when evaluated during kernel initialization:

ImportExport`RegisterExport["ABC",...];
ImportExport`RegisterImport["ABC",...];
FileFormatDump`$FILEFORMATMATRIX["ABC"]={"ABC",...};

7 Replies

Thank you, this is very valuable!

POSTED BY: Szabolcs Horvát
Posted 4 years ago

@Szabolcs Horvát @Andrew Raffensperger you may find this interesting: I've just submitted a Function Repository function that correctly registers a format's extension and MIME type. It can be found here.

POSTED BY: Sean Cheren
Posted 4 years ago

Change

Quiet@Import["__FAKE FILE__",""];

to

ImportString["", "Text"]; StringFormat[""];

Remove "Extensions" from RegisterImport and RegisterExport. Also do not AppendTo FileFormatDump`$FILEFORMATS, instead insert at position -2. There's also this other symbol you'll want to add to, like in my example from above:

AppendTo[System`ConvertersDump`ExtensionMappings, "*.myext" -> "myFmt"];
FileFormatDump`$FILEFORMATS = Insert[FileFormatDump`$FILEFORMATS, "myFmt", -2];
POSTED BY: Sean Cheren

Hey Sean, thanks for taking the time to respond.

Here is working code that can be called after Quit[] for:

  1. 11.3.0 for Microsoft Windows (64-bit) (March 7, 2018)
  2. 10.1.0 for Microsoft Windows (64-bit) (March 24, 2015)
  3. 11.3.0 for macOS (I'm running 12.0.0 ATM so I can't get the old version string.)
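
Module[{name = "ABC", exts = {"*.abc"}},
  (* Importing a nonexistent file forces the Import/Export packages to load *)
  Quiet@Import["__FAKE FILE__", ""];
  (* Register converters that print their arguments and read/write raw bytes *)
  ImportExport`RegisterExport[name, Function[Print[{"export args:", ##}]; Export[#1, #2, "Byte"]], "Extensions" -> exts];
  ImportExport`RegisterImport[name, Function[Print[{"import args:", ##}]; Import[#1, "Byte"]], "Extensions" -> exts];
  (* Add the format to the internal format list and format matrix *)
  If[! MemberQ[FileFormatDump`$FILEFORMATS, name], AppendTo[FileFormatDump`$FILEFORMATS, name]];
  FileFormatDump`$FILEFORMATMATRIX[name] = {name, True, False, False, False, False, exts, {}, None, {}};
  (* Round-trip test: export a byte list to a temporary .abc file and read it back *)
  file = CreateFile[] <> ".abc";
  Print[file];
  out = {1, 2, 3};
  Export[file, out];
  Import[file] === out]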

This code errors in 12.0.0.

I understand this is undocumented code, so I kept my post kinda vague. I was more concerned that code executed in the front end behaves differently depending on whether the associated kernel is already running. This was just the first example I found where I got bitten by this behavior.

Posted 4 years ago

I can't reproduce the behavior of this working:

ImportExport`RegisterExport["ABC",...];
ImportExport`RegisterImport["ABC",...];
FileFormatDump`$FILEFORMATMATRIX["ABC"]={"ABC",...};

I tried this to be specific:

In[10]:= ImportString["", "Text"]; StringFormat[""];
myFormatImport[filename_String, opts : OptionsPattern[]] := 
  StringDelete[Import[filename, "String"], "myFormat!\n"];
ImportExport`RegisterImport["myFmt", myFormatImport];
myFormatExport[filename_String, expr_, opts : OptionsPattern[]] := 
  Export[filename, "myFormat!\n" <> expr, "String"];
ImportExport`RegisterExport["myFmt", myFormatExport];
FileFormatDump`$FILEFORMATMATRIX["myFmt"] = {"myFmt", True, False, 
   False, False, False, {"*.myext"}, {}, None, {}};
Export["test.myext", "foo"]
FileFormat["test.myext"]
Import["test.myext"]

During evaluation of In[10]:= Export::infer: Cannot infer format of file test.myext.

Out[16]= $Failed

Out[17]= "Table"

Out[18]= {{"myFormat!"}, {"foo"}}

This didn't work in either 12.0 or 11.3, whether or not the kernel was initialized.

More to the point, these are undocumented internal symbols that you are attempting to modify. There is no guarantee that modifying them will work, or that something that works in one version will keep working in the next. These are not the only modifications necessary, and how to add a format's extension to the system is not obvious from spelunking. I will submit a function to WFR as soon as I am able that correctly adds extensions to WL for custom formats, and I will post here when it is complete.

In the meantime, here is a complete example that works. I highly recommend switching to the new WFR function once it is approved, since it will cover more cases and remain maintained.

In[19]:= Quit

In[1]:= ImportString["", "Text"]; StringFormat[""];

myFormatImport[filename_String, opts : OptionsPattern[]] := 
  StringDelete[Import[filename, "String"], "myFormat!\n"];
ImportExport`RegisterImport["myFmt", myFormatImport];
myFormatExport[filename_String, expr_, opts : OptionsPattern[]] := 
  Export[filename, "myFormat!\n" <> expr, "String"];
ImportExport`RegisterExport["myFmt", myFormatExport];
AppendTo[System`ConvertersDump`ExtensionMappings, 
  "*.myext" -> "myFmt"];
FileFormatDump`$FILEFORMATS = 
  Insert[FileFormatDump`$FILEFORMATS, "myFmt", -2];
FileFormatDump`$FILEFORMATMATRIX["myFmt"] = {"myFmt", True, False, 
   False, False, False, {"*.myext"}, {}, None, {}};
Export["test.myext", "foo"]
FileFormat["test.myext"]
Import["test.myext"]

Out[9]= "test.myext"

Out[10]= "myFmt"

Out[11]= "foo"
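
The steps in the example above could be bundled into one helper so they are applied in the same order each time. This is only a sketch: `registerCustomFormat` is a hypothetical name, the `MemberQ` guard is an added assumption to avoid inserting the format name twice, and every internal symbol it touches is undocumented and may change between versions.

```wolfram
(* registerCustomFormat is a hypothetical helper; it only repeats the steps shown above. *)
(* All ImportExport`, System`ConvertersDump` and FileFormatDump` symbols are undocumented internals. *)
registerCustomFormat[fmt_String, ext_String, importer_, exporter_] := (
  (* Force the Import/Export framework to load before touching its internals *)
  ImportString["", "Text"]; StringFormat[""];
  (* Register the converters without the "Extensions" option *)
  ImportExport`RegisterImport[fmt, importer];
  ImportExport`RegisterExport[fmt, exporter];
  (* Map the file extension pattern, e.g. "*.myext", to the format name *)
  AppendTo[System`ConvertersDump`ExtensionMappings, ext -> fmt];
  (* Insert the name at position -2 of the format list unless it is already there *)
  If[! MemberQ[FileFormatDump`$FILEFORMATS, fmt],
    FileFormatDump`$FILEFORMATS = Insert[FileFormatDump`$FILEFORMATS, fmt, -2]];
  FileFormatDump`$FILEFORMATMATRIX[fmt] =
    {fmt, True, False, False, False, False, {ext}, {}, None, {}};
  fmt
)

(* Usage with the converters defined above *)
registerCustomFormat["myFmt", "*.myext", myFormatImport, myFormatExport]
```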
POSTED BY: Sean Cheren

Thanks for the response.

Maybe I wasn't clear enough with my example. My code is NOT running as part of init.m or any related initialization. I am simply talking about the first cell in a notebook.

And the bug is: depending on whether the kernel has been started or not, evaluating code in the first cell produces different results. This is insane!

Why do I have to type something into a blank cell, evaluate it, wait for the kernel to return, and then evaluate the rest of my notebook?

You will find some related information in this QA and other QAs linked from it:

A commonly recommended solution is to run your startup code in a scheduled task that fires very soon after kernel initialization. Personally, I do not like this method. Suppose that kernel startup is triggered by evaluating a cell. Which will evaluate first then: the startup code (scheduled task) or the first cell?

It may be better in practice to give up on running this code on initialization. Just put it in a package and load the package manually when needed. Of course, I can see why doing this is not the most convenient, but I cannot think of anything better ... Maybe someone else will ...
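A minimal sketch of that package approach follows. The package name `MyFormats` and the function `registerMyFormats` are hypothetical placeholders, and the file is assumed to be saved somewhere on `$Path`.

```wolfram
(* Contents of a hypothetical file MyFormats.wl located on $Path *)
BeginPackage["MyFormats`"];

registerMyFormats::usage =
  "registerMyFormats[] registers the custom Import/Export formats.";

Begin["`Private`"];

registerMyFormats[] := (
  (* the format registration code goes here *)
  Null
);

End[];
EndPackage[];
```

In a notebook, the package would then be loaded explicitly with ``Needs["MyFormats`"]`` and the registration run with `registerMyFormats[]` once the kernel is up, instead of relying on code that runs during initialization.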

One more tip: You may want to investigate whether DeclarePackage can be of some help. I have not thought this through, and it may not work. I do not personally use this function (when I tried, it turned out not to be suitable for the problem I was trying to solve, but that doesn't mean it won't be useful for you).

POSTED BY: Szabolcs Horvát