Version 12 introduces Molecule, a computable data structure for representing a chemical species. The function is new and labeled "Experimental", so its behavior is not set in stone. In this post I'm looking for feedback from users on how Molecule should behave.
Much of the functionality comes from the field of cheminformatics. In fact, much of the implementation uses the RDKit cheminformatics library, in the same way that string patterns are implemented using the PCRE regular expression library.
From a computational point of view, keeping hydrogens as explicit vertices in the underlying chemical graph is inefficient. For organic molecules, hydrogens can account for over half the atoms present, yet their presence can be inferred from common valence rules. Even without explicit hydrogens, a whole range of properties can still be computed.
It's only when you want to view the molecule as a 3D object that they become essential.
As pointed out in this Stack Exchange post, there is currently some inconsistency in how Molecule and MoleculePlot treat hydrogens. When creating a Molecule, you can specify whether to have explicit hydrogens via the IncludeHydrogens option. By default, when creating a Molecule from an explicit list of atoms and bonds, explicit hydrogens are not added.
Molecule[{"C", "C", "C"}, Bond /@ {{1, 2}, {2, 3}}]

This molecule expression has implicit hydrogens, indicated visually by the fact that the atom count shows two numbers: the first is the explicit atom count and the second, in gray, is the full atom count. To make the implicit hydrogens explicit, we can pass in an option:
Molecule[{"C", "C", "C"}, Bond /@ {{1, 2}, {2, 3}}, IncludeHydrogens -> True]
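To see the difference programmatically rather than from the summary box, the two forms can be compared by counting atoms. This is a minimal sketch that assumes AtomList reports only the atoms stored explicitly in the expression; if that assumption holds, the second count should be larger than the first, since propane is C3H8.

```wolfram
(* Same skeleton, built with and without explicit hydrogens *)
implicit = Molecule[{"C", "C", "C"}, Bond /@ {{1, 2}, {2, 3}}];
explicit = Molecule[{"C", "C", "C"}, Bond /@ {{1, 2}, {2, 3}},
   IncludeHydrogens -> True];

(* Assumption: AtomList returns only the explicitly stored atoms *)
Length /@ {AtomList[implicit], AtomList[explicit]}
```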

The default value of IncludeHydrogens is Automatic, meaning that hydrogens present in the input are kept, and otherwise they are treated as implicit. If you evaluate SetOptions[Molecule, IncludeHydrogens -> True], then every molecule you create will have explicit hydrogens.
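Because SetOptions changes the default for the rest of the session, it can be worth checking the current setting first and restoring it afterwards. A short sketch using only the standard Options and SetOptions functions:

```wolfram
(* Inspect the current default for the option *)
Options[Molecule, IncludeHydrogens]

(* Make explicit hydrogens the default for this session *)
SetOptions[Molecule, IncludeHydrogens -> True];

(* Restore the shipped default when done *)
SetOptions[Molecule, IncludeHydrogens -> Automatic];
```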
But now we have a question: how should we treat the following input?
Molecule["propane"]
With the value Automatic, should the hydrogens be explicit or implicit? This is a point I would like user feedback on.
Another question is how MoleculePlot and MoleculePlot3D should behave based on the input. In the current implementation, MoleculePlot by default shows some hydrogens but not all, and this is independent of whether the input has implicit or explicit hydrogens.

Likewise, MoleculePlot3D always shows the hydrogens, even if they are implicit.
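Whatever the automatic behavior ends up being, the display can be requested explicitly. The sketch below assumes that both plotting functions accept the same IncludeHydrogens option as Molecule, with True and False as settings; treat the exact option values as an assumption to check against the documentation.

```wolfram
mol = Molecule["propane"];

(* Assumption: the plotting functions take IncludeHydrogens directly *)
MoleculePlot[mol, IncludeHydrogens -> True]    (* draw every hydrogen *)
MoleculePlot[mol, IncludeHydrogens -> False]   (* draw heavy atoms only *)
MoleculePlot3D[mol, IncludeHydrogens -> False] (* 3D view without hydrogens *)
```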

This seems inconsistent, and it has confused at least two users. So my next question for users is: should the automatic behavior of the plotting functions change depending on whether the Molecule in question has implicit or explicit hydrogens?