
Atom types created by Mathematica not read by other programs

Posted 8 years ago

This problem has caused me nightmares and I really need to know how to solve it: I am using several DFT programs and other data processing codes to simulate molecules. I create the geometries of these molecules in Mathematica and save them as .xyz files. The thing is, when I supply these .xyz files to the DFT codes, they cannot read the atom types created by Mathematica! Another problem is that Mathematica writes the xyz components in scientific notation, which many codes cannot process. Any ideas why this happens and how to fix it?

POSTED BY: eft rsd
12 Replies

That's great, glad I could help. The problem was this: When you write a file, you need a character that tells programs where a line ends, and that's a newline character. The trouble is that different operating systems use different characters to do this (you can read more about this in the wiki article I linked). Most programs like text editors know this and they can display all these varieties correctly, they may even give you the option to change from one to another. Other programs can't do this, and it seems we're both working with a few that don't. So our programs don't read the newlines quite right, and in your case that led the program to complain about the first thing on each new line (which happens to be the atom types). My export function just replaces newline characters that Windows and Apple systems use with the ones used by Linux and other Unix systems. You can still work with these files on Windows, I guess most of the tools you're using are clever enough to understand this. (Notably, Microsoft's own Editor will not display the files correctly, but I'm pretty sure all other text editors will.)
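As a minimal sketch of the idea (not the export function from this thread), the same normalization can be done on a string before it is written. The helper name toUnixNewlines is made up for illustration, and unlike the export function it does not try to handle the rare LFCR order:

```mathematica
(* Hypothetical helper (not from the thread): turn CRLF and lone CR into LF. *)
(* The CRLF rule is listed first so that a pair is replaced by a single LF. *)
toUnixNewlines[s_String] := StringReplace[s, {"\r\n" -> "\n", "\r" -> "\n"}]

(* Example: a two-atom fragment with Windows line endings *)
windowsText = "H 0.0 0.0 0.0\r\nH 0.0 0.0 0.74\r\n";
unixText = toUnixNewlines[windowsText];

(* The character codes now contain 10 (LF) but no 13 (CR) *)
FreeQ[ToCharacterCode[unixText], 13]
```

The last line should evaluate to True; the character codes can then be written with Export[..., "Binary"] so that nothing converts the line endings back.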

POSTED BY: Bianca Eifert
Posted 8 years ago

Thanks a lot for the clear explanation (^_^)

POSTED BY: eft rsd

This is a complete shot in the dark, but do you by any chance use different operating systems for exporting the file and then for running the DFT codes? Because there are different conventions for linebreaks. For instance, I export files on my Windows PC and then run them on a Linux cluster, and that does not work out of the box. I use VASP, and the error message I get doesn't refer to formatting issues either. (In the case of VASP, the parser complains that a certain line doesn't have the expected number of entries, which is of course not the problem at all.) Let me know if that might be the problem and I'll post a solution.
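If you want to check this before changing anything, here is a rough sketch that looks at the raw bytes of a file and reports which linebreak convention it uses. The function name lineEndingStyle and the file name in the comment are just placeholders, and SequencePosition needs a reasonably recent version (10.1 or later):

```mathematica
(* Hypothetical helper: classify line endings from a list of byte values *)
lineEndingStyle[bytes_List] := Module[{cr, lf, crlf},
  cr = Count[bytes, 13];                             (* carriage returns *)
  lf = Count[bytes, 10];                             (* line feeds *)
  crlf = Length[SequencePosition[bytes, {13, 10}]];  (* CR directly followed by LF *)
  Which[
   cr == 0 && lf == 0, "none",
   cr == 0, "LF (Unix)",
   crlf == cr && crlf == lf, "CRLF (Windows)",
   lf == 0, "CR (classic Mac)",
   True, "mixed"]]

(* On a real file ("benzene.xyz" is only an example name): *)
(* lineEndingStyle[BinaryReadList["benzene.xyz"]] *)

(* Self-contained example with Windows line endings *)
lineEndingStyle[ToCharacterCode["C 0 0 0\r\nH 0 0 1\r\n"]]
```

If the exported file reports CRLF and the cluster tools expect LF, that would be consistent with the linebreak explanation.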

Daniel - that's such an easy mistake to make. I feel like all the ...Form[] functions go against the spirit of the "everything is an expression" paradigm, because unlike Style, they don't actually return a particularly usable object.

POSTED BY: Bianca Eifert
Posted 8 years ago

Sorry for not being clear. I actually do what you do: I export the files on Windows and run them on Linux. The error I get is exactly what I reported first, namely that none of the programs I mentioned can read the Mathematica-created atom types. However, I can visualize the files in xcrysden and xmakemol.

If I create an .xyz file manually, copy into it all lines from the Mathematica-exported file except the atom types, and then write the atom types myself, everything works fine.

POSTED BY: eft rsd

That kinda sounds like a weird formatting problem... Can you try if this export function works, just so that we know whether or not it was a linebreak problem?

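xyzExport[filename_, pos_, types_] :=
  Module[{content},
   (*row 1: number of atoms; row 2: title; then one row per atom.
     pos is divided by 100, assuming picometers in and angstroms out;
     drop the /100 if your positions are already in angstroms*)
   content = 
    Join[{{Length[types]}, 
      StringSplit["Some name or title can go here"]}, 
     Join[{types[[#]]}, pos[[#]]/100] & /@ Range[Length[types]]];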
   (*convert to UNIX linebreaks (LF):*)
   content = 
    ToCharacterCode[
     ExportString[content, "Table", "FieldSeparators" -> " "]];
   content = 
    If[MemberQ[content, 10],(*from LFCR or CRLF:*)
     DeleteCases[content, 13],(*from CR:*)content /. {13 -> 10}];
   (*export:*)
   Export[filename, content, "Binary"]
   ];

You can try something like this, or use your own molecular data instead:

xyzExport["benzene.xyz",
QuantityMagnitude[ChemicalData["Benzene", "AtomPositions"]],
ChemicalData["Benzene", "VertexTypes"]
]

POSTED BY: Bianca Eifert
Posted 8 years ago

Dear Bianca, The export function worked fantastically! I can't thank you enough. Could you please explain what the error was and how this export function fixed it? Thanks, eftrsd

POSTED BY: eft rsd
Posted 8 years ago

Thanks all for the suggestions.

I am using SIESTA and CP2K for DFT calculations and the "Lev00&Tetr" package for pre- and post-processing. Each simply gives the error that the atom type is not known, and the calculations do not run! So far I have worked around this by copying and pasting the coordinates into a pre-prepared file, which is OK for small systems but not practical for systems of 1000 atoms!

POSTED BY: eft rsd

Thanks for the suggestion, Daniel! How exactly would that work, though? Here's the situation so far:

Get a molecule:

pos = QuantityMagnitude[ChemicalData["Benzene", "AtomPositions"]]
types = ChemicalData["Benzene", "VertexTypes"]

The output is:

{{-119.07, -1.5997, -75.474}, {-13.956, 
  84.677, -110.68}, {-105.11, -86.277, 35.210}, {105.11, 
  86.277, -35.210}, {13.956, -84.677, 110.68}, {119.07, 1.5997, 
  75.474}, {-209.33, -2.6239, -132.92}, {-24.311, 
  149.49, -195.19}, {-184.85, -152.21, 62.495}, {184.85, 
  152.21, -62.495}, {24.311, -149.49, 195.19}, {209.33, 2.6239, 
  132.92}}
{"C", "C", "C", "C", "C", "C", "H", "H", "H", "H", "H", "H"}

Here's the original problem:

ExportString[{types,pos}, {"XYZ", {"VertexTypes", "VertexCoordinates"}}]

Output:

"12
Created with the Wolfram Language : www.wolfram.com
C   -1.19070  -1.59970E-2  -7.54740E-1
C   -1.39560E-1    8.46770E-1   -1.10680
C   -1.05110  -8.62770E-1   3.52100E-1
C    1.05110   8.62770E-1  -3.52100E-1
C    1.39560E-1   -8.46770E-1    1.10680
C    1.19070   1.59970E-2   7.54740E-1
H   -2.09330  -2.62390E-2  -1.32920
H   -2.43110E-1    1.49490  -1.95190
H   -1.84850  -1.52210  6.24950E-1
H    1.84850   1.52210 -6.24950E-1
H    2.43110E-1   -1.49490   1.95190
H    2.09330   2.62390E-2   1.32920"

Using InputForm like this:

ExportString[{types,InputForm[pos]}, {"XYZ", {"VertexTypes", "VertexCoordinates"}}]

... results in this error:

Export::uneqlen: Elements VertexCoordinates and VertexTypes must have the same length. >>
$Failed

Applying InputForm to each number separately like this:

ExportString[{types,Map[InputForm, pos, {2}]}, {"XYZ", {"VertexTypes","VertexCoordinates"}}]

... results in the following error:

Export::errelem: The Export element VertexCoordinates contains a malformed data structure and could not be exported to XYZ format. >>
$Failed

This behaviour makes sense to me since InputForm and other functions like it are just display functions that don't actually return their input in an altered form, or as a formatted string or anything like that. (Speaking of strings, ToString doesn't help either, Export still considers that a malformed data structure because "VertexCoordinates" are supposed to be numbers.)
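A quick way to see this for yourself; this is only an illustrative check with a made-up value, not part of the export:

```mathematica
val = 0.015997;

Head[val]                       (* Real: an ordinary machine number *)
Head[InputForm[val]]            (* InputForm: the wrapper is still there *)
NumberQ[InputForm[val]]         (* False: no longer usable as a coordinate *)
Head[ToString[val, InputForm]]  (* String: also not a number *)

(* A wrapped list of positions has length 1, whatever is inside *)
Length[InputForm[{{1., 2., 3.}, {4., 5., 6.}}]]  (* 1 *)
```

The last line is probably why wrapping all of pos triggers Export::uneqlen: the wrapped expression has length 1, while types has 12 entries.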

All in all, I still basically vote for writing a new customized export based on generic table format, but I'm also still curious about the atom type problem.

Anyway, here's a suggestion for the export that eliminates the exponential notation:

Export["benzene.xyz",
 Join[
  {{Length[types]}, StringSplit["This is a benzene molecule!"]},
  Join[{types[[#]]}, pos[[#]]/100] & /@ Range[Length[types]]
  ],
 "Table", "FieldSeparators" -> " "]

POSTED BY: Bianca Eifert

Right you are, sorry. I should have realized this formatting would not be so easily handled.

I'll file this as a bug report and look into getting it improved.

POSTED BY: Daniel Lichtblau

Try wrapping the thing you export with InputForm. So it would be something like this: ExportString[InputForm[ ii], {"XYZ",...}].

POSTED BY: Daniel Lichtblau

I just did a quick test; the scientific form is indeed annoying, and I have no idea what to do about that.

The atom types look fine to me... What's wrong with them according to the programs you're using? (Also, which programs are you using?) Do your other programs perhaps require atoms to have unique types, i.e. not 4 rows of "C", but "C1", "C2", "C3", "C4"?

If all else fails, XYZ is an incredibly simple format; have you considered writing your own export function? (I know that's an ugly solution, sorry...)

Edit: About the number of digits... I suspect that might also be easier to remedy if you export as a plain table yourself...
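Here is a rough sketch of what such a hand-written export could look like if you want to control the number of decimals yourself. All function names (fixedString, xyzLine, xyzText) and the file name are made up for illustration, the positions are assumed to be in angstroms already, and machine numbers only carry about 16 significant digits no matter how many decimals you ask for:

```mathematica
(* Hypothetical helpers, not part of any built-in XYZ exporter *)

(* Fixed-point string with a chosen number of decimals, never scientific notation *)
fixedString[x_?NumericQ, decimals_Integer] :=
  ToString[NumberForm[N[x], {decimals + 8, decimals}, ExponentFunction -> (Null &)]]

(* One atom row: element symbol followed by the formatted coordinates *)
xyzLine[type_String, xyz_List, decimals_Integer] :=
  StringRiffle[Join[{type}, fixedString[#, decimals] & /@ xyz], " "]

(* Whole file as one string with LF line endings; pos is assumed to be in angstroms *)
xyzText[types_List, pos_List, title_String, decimals_Integer] :=
  StringRiffle[
    Join[{ToString[Length[types]], title},
     MapThread[xyzLine[#1, #2, decimals] &, {types, pos}]],
    "\n"] <> "\n"

(* Write raw bytes so the operating system cannot change the line endings *)
text = xyzText[{"H", "H"}, {{0., 0., 0.}, {0., 0., 0.74}}, "H2 test", 8];
Export["h2.xyz", ToCharacterCode[text], "Binary"]
```

Building the text as a string first keeps the formatting and the linebreaks under your control, at the cost of bypassing the built-in "XYZ" exporter entirely.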

POSTED BY: Bianca Eifert
Posted 8 years ago

I would like to add another question: When Mathematica creates the .xyz file, it writes the numbers with 6 significant digits regardless of their precision before export. Is there a way to increase the precision of the exported data?

POSTED BY: eft rsd