[WSS20] Neural Networks for Software-Interface Molecular Vibration Data
Sathvik Ajay Iyengar, Rice University
Posted
5 months ago
Deep learning is becoming commonplace in many fields of science and technology. The ability to identify and predict patterns in large data sets is a powerful tool and can, in fact, replace many brute-force, calculation- and simulation-intensive methods. Traditionally, identifying molecular vibrational modes, and hence the corresponding IR spectra of organic molecules, is a non-trivial process. Theoretically, it is performed by specialized ab initio quantum chemistry software and is time-consuming. Experimentally, it is determined with specialized spectrometers. This project aims to predict such inherent patterns between parametrized molecular physical/chemical properties and their vibrational modes.
Here, we focus on two major aspects. First, we introduce a method to write, execute and animate input and data files for a third-party quantum chemistry software (Orca) through the Wolfram Language (WL). Next, we use this freshly generated vibrational spectra data to try to predict patterns using a variety of machine learning functionalities.
Introduction
An infrared or vibrational spectrometer is an instrument that experimental chemists, materials scientists and other researchers use to identify or “characterize” chemical substances. When the molecules of an unknown chemical substance are made to interact with infrared radiation, the molecules absorb the radiation, shift to an excited state, and dissipate energy in the form of molecular vibrations. These vibrations are either symmetric or antisymmetric, giving rise to many sub-varieties (stretching, scissoring, rocking, wagging, twisting, out-of-plane). While the mid-IR is responsible for fundamental molecular vibrations, the far-IR corresponds to more rotation-dominated behavior.
In[]:= Style /@ Text /@ TextSentences[WikipediaData["Infrared spectroscopy"]][[;; 3]] // Column
Out[]=
Infrared spectroscopy (IR spectroscopy or vibrational spectroscopy) involves the interaction of infrared radiation with matter.
It covers a range of techniques, mostly based on absorption spectroscopy.
As with all spectroscopic techniques, it can be used to identify and study chemical substances.
Now, the question is: given enough descriptor parameters for a molecule, commonly known as a "molecular fingerprint" (discussed further below), can we predict the vibrational modes that it would exhibit? Vibrations are the result of chemical bonds executing harmonic oscillations, and are highly specific to the type of atoms involved in the bond and to the overall structural and energy landscape. However, they are not trivial to obtain, and are commonly calculated using quantum chemistry software that relies on Density Functional Theory (DFT) and on approximations such as the Born-Oppenheimer (BO) approximation, among others. Hence, this is a perfect example of hard-to-compute data with a rich set of identifiable patterns. This project deals with the generation of such data, and the construction of neural network models to best predict vibrational modes, given a unique molecule.
This is what an IR spectrum plot for the organic molecule “aniline” looks like. Each frequency, commonly measured in wavenumbers (1/cm), has a corresponding transmittance value.
In[]:= WebImageSearch["Infrared spectroscopy"];
In[]:= %[[3]] // Magnify[#, 0.75] &
Out[]= [image: IR spectrum plot]
The lower the transmittance, the greater the absorption and hence the stronger the vibrational mode. Thus, we can characterize the “most significant” frequencies as the ones with the lowest transmittance values.
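As a small illustration of this ranking, the sketch below picks the most significant peaks from a list of {wavenumber, transmittance} pairs. The numbers and the names `spectrum` and `topPeaks` are made up for illustration; they are not read from the aniline spectrum.

```wolfram
(* Hypothetical {wavenumber (cm^-1), relative transmittance} pairs *)
spectrum = {{3400., 0.55}, {3000., 0.70}, {1500., 0.30}, {1300., 0.45}};

(* The two peaks with the lowest transmittance, i.e. the strongest absorption *)
topPeaks = TakeSmallestBy[spectrum, Last, 2]
(* {{1500., 0.3}, {1300., 0.45}} *)
```

`TakeSmallestBy` returns the selected peaks ordered from lowest to highest transmittance, so the first element is the single most significant peak.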
Highlighted in yellow are the bonds that exhibit the various stretching modes as shown above for aniline, in decreasing order of frequency:
In[]:= Function[bond,
   {Image@MoleculePlot3D[Molecule@"Aniline", {Bond[First@bond, "Single" | "Aromatic"]}],
    Style[Text@bond /. {{a_, b_}, c_} :> a <> "-" <> b <> " " <> c, FontSize -> 20]}] /@
   {{{"N", "H"}, "~3400 cm-1"}, {{"C", "H"}, "~3000 cm-1"}, {{"C", "C"}, "~1500 cm-1"}, {{"C", "N"}, "~1300 cm-1"}} //
  Transpose // Grid[#, Frame -> All] &
Out[]= [grid of 3D aniline plots with highlighted bonds, labeled: N-H ~3400 cm-1 | C-H ~3000 cm-1 | C-C ~1500 cm-1 | C-N ~1300 cm-1]
Even molecules have fingerprints~
We can identify each molecule with a unique fingerprint in the form of the “radial distribution function” (RDF), a model that describes the variation of atom density as a function of distance from a reference atom:
In[]:= WebImageSearch[SearchQueryString@"Radial distribution function dr"][[{4, 7}]] // ImageCollage[#, Automatic, 400] &
Out[]= [image: collage of two radial distribution function illustrations]
RDF values in the WL are a set of discrete values in a {210} numeric array, which is based on a list of molecular descriptors calculated by Dragon [1, 2].
In[]:= Molecule[#]["RDF"] & /@ {"ethanol", "acetic acid", "acetone"}
Out[]= {NumericArray[Type: Real64, Dimensions: {210}], NumericArray[Type: Real64, Dimensions: {210}], NumericArray[Type: Real64, Dimensions: {210}]}
Although the differences seem subtle, they are based on highly sensitive parameters.
In[]:= MatrixPlot[Partition[Normal@#, 9], Frame -> False] & /@ % // Grid[{#, {"ethanol", "acetic acid", "acetone"}}, Frame -> All] &
Out[]= [grid of three matrix plots, labeled: ethanol | acetic acid | acetone]
The RDF in the WL has 7 crucial parts (1 unweighted and 6 weighted) that make it ideal for fingerprinting and “deploy-worthy” in NNs:
1. Pure radial distribution parameter
2. Molecular mass
3. Van der Waals volume
4. Sanderson electronegativity
5. Polarizability
6. Ionization potential
7. I-state (intrinsic-state from local vertex variants)
Data sets- Let’s automate things!
Now, we need to find suitable software that will generate our dataset for us. Among the wide range of software that can do this, I found Orca [3] to be the most suitable. It has a very simple installation procedure and is Windows-compatible. Easy-to-follow documentation is a huge plus.
In[]:= Style /@ Text /@ TextSentences[WikipediaData["ORCA (quantum chemistry program)"]][[;; 2]] // Column
Out[]=
"ORCA is an ab initio quantum chemistry program package that contains modern electronic structure methods including density functional theory, many-body perturbation, coupled cluster, multireference methods, and semi-empirical quantum chemistry methods."
"Its main field of application is larger molecules, transition metal complexes, and their spectroscopic properties."
An exciting bonus here is that not only can Orca generate data sets for IR spectra (vibrational; absorption) analysis, but it can also be extended to study and predict patterns in Raman (vibrational; scattering), UV-vis (electro-optical), and NMR (magnetic) spectra.
General notes:
1. Computation time: ~15 min for a 20-atom system [geometry optimization + spectra calculations]
2. Command-line run software
From molecule name to spectra, all on our dear Mathematica
We will perform execution and acquisition of data from Orca all through the WL. This will help internalize our entire process of data set generation. Orca requires 2 types of files to calculate the Hessian values for various organic molecules: the input file and the data file. The data file is fairly straightforward; using the built-in function Molecule[], we can export .xyz data. The input file is very short and describes the method of energy minimization etc., which can be written as a string expression in the WL. First we download the software.
From here on, the code can be run only if Orca and its executable files (from the link above) are installed at "C:\\Orca".
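The two files Orca needs can be previewed as plain strings before anything is written to disk. This is only a sketch: the method line mirrors the geometry-optimization keywords used in this post, `baseName` is a hypothetical file stem, and the molecule is built from the SMILES string "CCO" (ethanol) on the assumption that this avoids a name lookup.

```wolfram
baseName = "ethanol"; (* hypothetical file stem *)

(* Data file: 3D coordinates in XYZ format *)
xyzString = ExportString[Molecule["CCO"], "XYZ"];

(* Input file: method line, then a pointer to the XYZ file
   with charge 0 and spin multiplicity 1 *)
inputString = StringRiffle[
    {"! B3LYP DEF2-SVP OPT D4",
     "* xyzfile 0 1 " <> baseName <> "_guess.txt"}, "\n"] <> "\n";
```

Inspecting `xyzString` and `inputString` first makes it easier to catch a malformed keyword line before committing to a run that takes several minutes.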
(* CreateDirectory[FileNameJoin[{NotebookDirectory[], "orca data"}]] *)
We can acquire ordered molecule name lists from the “GDB-9” database. We will leave it compressed until necessary, to save time.
In[]:= dataGDB = CloudGet@CloudObject["https://www.wolframcloud.com/obj/jasonb/Compressed_GDB-9_molecules"];
Next, we shall consider smaller molecules (<=20 atom systems) so that the computation time does not get out of hand, while still ensuring a diverse representation of functional groups (and hence a diverse representation of vibrational frequencies). Additionally, some uncommon molecules do not have an IUPAC name in the PubChem service, and hence those are removed.
newMoldata[n_, m_] := {#, Molecule@#} & /@ Last /@ First /@
   (Select[
     (ServiceExecute["PubChem", "CompoundProperties", {"Molecule" -> #, "Property" -> {"IUPACName"}}] & /@
       Select[Uncompress /@ dataGDB[[n ;; m]], BondCount@# <= 20 &]),
     ! StringContainsQ[ToString@FullForm@#[[1]], "Missing"] &])
Each time we wish to add data for new molecules, we take a part of the GDB database in serial order (serial 121 through 140 in this example, meaning the first 120 have been completed and saved in the “orca data” folder in the NotebookDirectory):
In[]:= new = newMoldata[121, 140];
As the quantum chemistry simulations are time-consuming, we should perform a sanity check to see if some molecules have already been computed, then define a list of fresh molecules to compute:
In[]:= currentMoldata = {#, Molecule@#} & /@
    StringReplace[
     (ToLowerCase /@ StringCases[#, "data" ~~ _ ~~ x__ ~~ "_out_" -> x] & /@
        FileNames[___ ~~ "_out_IR.txt", FileNameJoin[{NotebookDirectory[], "orca data"}]] // Flatten),
     StartOfString ~~ "n-" -> "N-"];
A visual check of whether we have diversity in the types of organic bonds:
In[]:= Image /@ (MoleculePlot /@ RandomSample[%[[All, 2]], 40]) // ImageCollage[#, Automatic, {1000, 1000}, Background -> White] & // Magnify[#, 0.25] &
Out[]= [image: collage of 40 molecule plots]
List of molecules to compute (example run):
In[]:= toDo = Select[new, ! IntersectingQ[List[#[[2]]], currentMoldata[[All, 2]]] &][[All, 1]];
In[]:= % // Partition[#, 5] & // Grid
Out[]=
pent-1-yne | butanenitrile | 2-(methylamino)acetonitrile | 3-methoxyprop-1-yne | 2-methoxyacetonitrile
but-3-yn-1-ol | 3-hydroxypropanenitrile | butanal | N-ethylformamide | ethyl formate
2-methoxyacetaldehyde | 3-hydroxypropanal | pentane | butan-1-ol | 1-methoxypropane
And here we automate the process:
In[]:= molToIR[mol_String] := Module[{name = StringReplace[mol, " " -> ""]},
   (* Writing geometry optimization input file and xyz data file *)
   MapThread[
    Export[FileNameJoin[{NotebookDirectory[], "orca data", name <> #1}], #2] &,
    {{"_in_geo.txt", "_guess.txt"},
     {"! B3LYP DEF2-SVP OPT D4" <> "\n" <> "* xyzfile 0 1 " <> name <> "_guess.txt" <> "\n",
      ExportString[Molecule[mol], "XYZ"]}}];
   (* Running geometry optimization; for Caffeine - TOTAL RUN TIME: 0 days 0 hours 19 minutes 44 seconds 333 msec *)
   RunProcess[{"C:\\Orca\\orca.exe", name <> "_in_geo.txt"}, "StandardOutput",
    ProcessDirectory -> FileNameJoin[{NotebookDirectory[], "orca data"}]];
   (* Geometry optimized file is now the new data file *)
   Export[FileNameJoin[{NotebookDirectory[], "orca data", name <> "_in_IR.txt"}],
    "! BP86 DEF2-SVP FREQ" <> "\n" <> "* xyzfile 0 1 " <> name <> "_in_geo.xyz" <> "\n"];
   (* Running and exporting IR spectra prediction; for Caffeine - TOTAL RUN TIME: 0 days 0 hours 14 minutes 7 seconds 203 msec *)
   MapThread[
    Export[FileNameJoin[{NotebookDirectory[], "orca data", name <> #1}], #2] &,
    {{"_out_full.txt", "_out_IR.txt"},
     Function[s,
       {s, StringTake[s, First /@ (First /@ StringPosition[s, #] & /@ {"IR SPECTRUM", "The first"}) - 1]}]@
      RunProcess[{"C:\\Orca\\orca.exe", name <> "_in_IR.txt"}, "StandardOutput",
       ProcessDirectory -> FileNameJoin[{NotebookDirectory[], "orca data"}]]}]]
In[]:= molToIR["caffeine"]
Out[]= {C:\Users\SathvikIyengar\Desktop\WSS\Sathvik\orca data\caffeine_out_full.txt, C:\Users\SathvikIyengar\Desktop\WSS\Sathvik\orca data\caffeine_out_IR.txt}
Our Data are on Cloud (nine)
We will need to access this data often. It will be saved in the form:
{{{molecule name 1, RDF1}, {{frequency(1,1), relative transmittance(1,1)}, {...}, ...}}, {{molecule name 2, RDF2}, ...}, ...}
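To make this nesting concrete, here is a toy data set in the same shape together with the `Part` specifications that pull out each piece. The two entries, their short "RDF" vectors, and the name `toy` are invented for illustration only.

```wolfram
(* Toy data in the documented shape; the values are not real spectra *)
toy = {
   {{"ethanol", {0.1, 0.2}}, {{1050., 0.35}, {2980., 0.60}}},
   {{"acetone", {0.3, 0.4}}, {{1715., 0.20}}}
   };

names = toy[[All, 1, 1]];            (* molecule names *)
firstFreqs = toy[[1, 2, All, 1]];    (* frequencies of the first molecule *)
allTrans = toy[[All, 2, All, 2]];    (* transmittance values, one list per molecule *)
```

The same `[[All, 2, All, 1]]` and `[[All, 2, All, 2]]` patterns appear in the code that builds the training data later in the post.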
In[]:= irData = Function[name,
      {(* molecule name *)
       {#, Molecule[First@#]["RDF"]} &@
        StringReplace[
         ToLowerCase@StringCases[name, "data" ~~ _ ~~ x__ ~~ "_out_" -> x],
         StartOfString ~~ "n-" -> "N-"],
       (* {Frequency, Relative transmittance} *)
       ToExpression /@ StringSplit[#, WhitespaceCharacter ..] & /@
          StringCases[#,
           _ ~~ _ ~~ ":" ~~ WhitespaceCharacter .. ~~
             (x : (NumberString ~~ WhitespaceCharacter .. ~~ NumberString)) :> x] &@Import[name]}] /@
     FileNames[___ ~~ "_out_IR.txt", FileNameJoin[{NotebookDirectory[], "orca data"}]] // Quiet;
However, this data is not ready yet. We need to do some cleaning, and we make the cleaning adjustable by introducing parameters:
1. Some peaks are very insignificant and can be discarded (very high relative transmittance values).
2. Some values fall outside the usual IR frequency range of 400-4000 cm-1; we can specify a new range to restrict our search.
3. Finally, some peaks are extremely close, separated by <10 cm-1. A parameter is used to set that frequency width and return the most prominent peak within it.
4. The molecules are also in name order, so it is better to RandomSample them to be safe.
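The third cleaning step can be sketched in isolation. The version below is a simplified stand-in, not the filter used in this post: it sorts the peaks, groups neighbours whose spacing is below the width with `Split`, and keeps the lowest-transmittance peak of each group. The peak values and the names `peaks`, `width`, `groups`, and `merged` are hypothetical.

```wolfram
(* Hypothetical {wavenumber, relative transmittance} peaks for one molecule *)
peaks = {{1320., 0.2}, {3000., 0.5}, {1310., 0.4}};
width = 30; (* frequency width in cm^-1 *)

(* Group neighbouring peaks whose spacing is below the width... *)
groups = Split[SortBy[peaks, First], #2[[1]] - #1[[1]] < width &];

(* ...and keep the most prominent (lowest-transmittance) peak of each group *)
merged = First@MinimalBy[#, Last] & /@ groups
(* {{1320., 0.2}, {3000., 0.5}} *)
```

Because `Split` only compares adjacent peaks, a long chain of closely spaced peaks ends up in one group even if its two ends are further apart than `width`; that is a deliberate simplification here.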
In[]:= irDataFilter[transSpec_, freqSpec_ : 30, freqRange_List : {400, 4000}] := Module[{newData = irData},
   (* setting a transmittance value threshold *)
   newData[[All, 2]] = Function[data, Select[data[[2]], #[[2]] < transSpec &]] /@ irData;
   (* test to prevent setting parameters too extreme, at least 1 frequency must remain *)
   If[MemberQ[Length /@ newData[[All, 2]], 0],
    Return@Print["Transmittance threshold too low as some molecules have no peaks at this value"],
    Continue];
   (* setting a frequency range *)
   newData[[All, 2]] = Function[data, Select[data[[2]], Between[#[[1]], freqRange] &]] /@ newData;
   (* setting a parameter to exclude overlaps within the frequency width,
      eg: freqSpec = 30 will give the pair with lower transmittance between
      1310 cm-1 and 1320 cm-1 as they are 10 cm-1 apart *)
   newData[[All, 2]] = Table[
     First /@ ResourceFunction["MaximalSubsets"][
       SortBy[#, Last] & /@
        (Function[list, Nearest[list, #, {Infinity, freqSpec}] & /@ list]@newData[[All, 2]][[n]])],
     {n, Length@newData[[All, 2]]}];
   (* sort the remaining peaks of each molecule by frequency *)
   newData[[All, 2]] = Function[i, SortBy[i[[2]], #[[1]] &]] /@ newData;
   (* add a small random offset to every value of the filtered peaks *)
   newData[[All, 2]] = (# + RandomReal[{0.0001, 0.001}, Dimensions@#]) & /@ newData[[All, 2]];
   (* test to prevent setting parameters too extreme, at least 1 frequency must remain *)
   If[MemberQ[Length /@ newData[[All, 2]], 0],
    Return@Print["Either the frequency exclusion width is too high or the frequency range is too narrow as some molecules have no peaks at this value"],
    Continue];
   RandomSample@newData]
In[]:= filteredData = irDataFilter[30];
In[]:= CloudPut[filteredData, "IR_data_function", Permissions -> "Public"]
Out[]= CloudObject[https://www.wolframcloud.com/obj/sathvik/IR_data_function]
Alternatively, it can be exported as a “.mx”:
In[]:= Export[FileNameJoin[{NotebookDirectory[], "irDataSet.mx"}], irData]
Out[]= C:\Users\SathvikIyengar\Desktop\WSS\Sathvik\irDataSet.mx
So you have the numbers, but can you animate it?
Numbers are hard to understand, so let us try to animate these molecular vibrations.
In[]:= irData = CloudGet["https://www.wolframcloud.com/obj/sathvik/IR_data_function"];
And here it goes:
In[]:= molToGIF[mol_String /; MemberQ[currentMoldata[[All, 1]], mol] == True] :=
  Module[{name = StringReplace[mol, " " -> ""], atomNum, freqN, freq, files, vibData},
   (* Input file to grab Hessians *)
   RunProcess[{"C:\\Orca\\orca_pltvib.exe", name <> "_in_IR.hess", "all"}, "StandardOutput",
    ProcessDirectory -> FileNameJoin[{NotebookDirectory[], "orca data"}]];
   (* cleaning up the data *)
   freqN = Length@irData[[First @@ Position[irData, name], 2]];
   freq = irData[[First @@ Position[irData, name], 2, All, 1]];
   files = Import[#, "Text"] & /@
     FileNames[name ~~ "_in_IR.hess.v" ~~ ___,
       FileNameJoin[{NotebookDirectory[], "orca data"}]][[-freqN ;;]];
   atomNum = First@StringCases[First@files, StartOfString ~~ x : NumberString :> x];
   (* Table to make GIFs for ALL vibrational modes *)
   Table[
    vibData = (StringRiffle[#, "\n"] & /@
       (Prepend[#, atomNum] & /@
         Partition[
          Rest@StringCases[files[[i]], LetterCharacter ~~ Repeated[Whitespace ~~ NumberString, 3]],
          ToExpression@atomNum]));
    (* Table to put together various xyz trajectories for a single vibration animation *)
    Table[
     Export[
      FileNameJoin[{NotebookDirectory[], "animations",
        name <> "vib_" <> ToString@freq[[i]] <> "cm-1_" <> ToString@n <> ".xyz"}],
      vibData[[n]], "Text"],
     {n, Length@vibData}];
    (* Plot using MoleculePlot3D and export it as a list of images *)
    Export[
     FileNameJoin[{NotebookDirectory[], "animations",
       name <> "vib_" <> ToString@freq[[i]] <> "cm-1_" <> ".gif"}],
     Image /@ (MoleculePlot3D[#] & /@
        (Import[First@#, "XYZ"] & /@
          (FileNames[
              name ~~ "vib_" ~~ ToString@freq[[i]] ~~ "cm-1_" ~~ ToString@# ~~ ".xyz",
              FileNameJoin[{NotebookDirectory[], "animations"}]] & /@ Range[20]))),
     "AnimationRepetitions" -> Infinity],
    {i, Length@files}];
   ]
Let's try this for a random molecule:
molToGIF[random = First@RandomChoice@currentMoldata]
Now we have the data and coordinate points! We need to specify the frame rates and adjust the animation so that it runs smoothly [5].
In[]:= makeAnimation[list_, delayList_ : {.03}] :=
  DynamicModule[{l = Length[list], delays = Abs@Flatten[{delayList}], times, totalTime, delta = .03, frames},
   times = Round[.5 + PadRight[delays, l, delays]/delta];
   frames = Flatten@Table[Table[list[[i]], {times[[i]]}], {i, Length[times]}];
   totalTime = Length[frames];
   EventHandler[
    Dynamic[frames[[Clock[{1, totalTime, 1}, totalTime delta]]], TrackedSymbols :> {}],
    {"MouseUp", 2} :> Null]]
There are quite a few vibrational modes for this molecule (~20), so we will just animate 3 random ones. These vibrations are highly accurate, as they retain all the trajectory data that were calculated:
In[]:= Column[makeAnimation[Import[#]] & /@ RandomSample[FileNames[random ~~ __ ~~ ".gif", FileNameJoin[{NotebookDirectory[], "animations"}]], 3]]
Out[]= [three animated vibrational modes]
Predicting frequencies using self normalizing neural nets (SNNs)
NNs are nit-picky about data format
First, we work on this problem statement: given a trained molecule (and hence its RDF) and an arbitrary transmission intensity, can we assign the intensity value to a suitable vibrational frequency?
In[]:= irData = CloudGet["https://www.wolframcloud.com/obj/sathvik/IR_data_function"];
We include copies of random variate data for each molecule to account for variability in intensity:
In[]:= irDataRandom = Table[
    Table[
     Function[sd,
       (Flatten@(Append[Quiet@Normal[irData[[All, 1, 2]][[k]]],
              RandomVariate[NormalDistribution[#, sd]]] &@irData[[All, 2, All, 2]][[k, j]]) ->
         irData[[All, 2, All, 1]][[k, j]])] /@
      Range[0.01 (* variate start range *), 10 (* stop *), 0.2 (* step size *)],
     {j, Length@irData[[All, 2, All, 2]][[k]]}],
    {k, Length@irData}];
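The augmentation idea can be seen on a single toy peak: the fingerprint is extended with one noisy copy of the intensity per standard deviation, and every copy maps to the same frequency. The three-element `rdf`, the `intensity`, and the `frequency` below are made-up values, and `examples` is a hypothetical name.

```wolfram
(* Hypothetical fingerprint, peak intensity and frequency for one peak *)
rdf = {0.1, 0.2, 0.3};
intensity = 12.5;
frequency = 1710.;

(* One training example per standard deviation:
   {fingerprint..., noisy intensity} -> frequency *)
examples = Function[sd,
     Append[rdf, RandomVariate[NormalDistribution[intensity, sd]]] -> frequency] /@
   Range[0.01, 10, 0.2];

First[examples] (* a rule whose left side has Length[rdf] + 1 entries *)
```

With the real 210-element RDF the left side has 211 entries, which is the input size a network trained on this data has to accept.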
We split the dataset twice: “forTrain” is divided further into a training set and a validation set, while “forTest” is kept exclusively for testing new molecules.
In[]:= {forTrain, forTest} = ResourceFunction["TrainTestSplit"][irDataRandom];
In[]:= Length@forTrain
Out[]= 161
In[]:= Length@forTest
Out[]= 40
In[]:= {trainingset, validationset} = ResourceFunction["TrainTestSplit"][forTrain];
  {trainingset, validationset} = {RandomSample@Flatten@trainingset, RandomSample@Flatten@validationset};
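The same two-stage split can be sketched with built-in functions only, which may help when the resource function is unavailable. This is an illustrative alternative under assumptions, not the code used above: the 80/20 fractions are assumed (the 161/40 split is roughly that ratio), and all names ending in `Toy` are hypothetical.

```wolfram
(* Hypothetical stand-in for the per-molecule data: 10 labelled items *)
items = Range[10];

(* Shuffle once, then hold out the last 20% for testing *)
shuffled = RandomSample[items];
{forTrainToy, forTestToy} = TakeDrop[shuffled, Floor[0.8 Length[shuffled]]];

(* Split the training part again into training and validation sets *)
{trainToy, validationToy} = TakeDrop[forTrainToy, Floor[0.8 Length[forTrainToy]]];

Length /@ {trainToy, validationToy, forTestToy}
(* {6, 2, 2} *)
```

Splitting at the molecule level before flattening, as the post does, keeps every noisy copy of a given molecule on the same side of the split.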
SNN time!
We will use a self-normalizing NN:
In[]:= SNNnet[dropoutrate_, nhidden_, nlayers_] := NetChain[
   Join[
    Table[
     NetChain[{LinearLayer[nhidden], ElementwiseLayer["SELU"],
       DropoutLayer[dropoutrate, "Method" -> "AlphaDropout"]}],
     nlayers - 1],
    {LinearLayer[1]}]]
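For reference, the SELU activation used in each hidden block is commonly defined as below. The constants are the standard published values, quoted here to four decimal places; that the WL layer uses exactly these values is an assumption.

```latex
\mathrm{SELU}(x) = \lambda
\begin{cases}
x, & x > 0 \\
\alpha \left( e^{x} - 1 \right), & x \le 0
\end{cases}
\qquad \alpha \approx 1.6733, \quad \lambda \approx 1.0507
```

With these constants, activations tend to keep zero mean and unit variance from layer to layer, which is why the inputs are standardized and why the dropout method is set to "AlphaDropout" rather than the default.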
In[]:= test = SNNnet[0.001, 200, 2]
Out[]= NetChain[uninitialized, Input port: array, Output port: vector (size: 1)]
Standardize our data:
In[]:= extractor = FeatureExtraction[N@Keys[trainingset], "StandardizedVector"];
  trainStandardized = extractor[Keys[trainingset]] -> Values[trainingset];
  testStandardized = extractor[Keys[validationset]] -> Values[validationset];
In[]:= experiment = NetTrain[test, trainingset, All, TargetDevice -> "CPU", TimeGoal -> 60, ValidationSet -> validationset];
  experiment["EvolutionPlots"]
Out[]= [plot: Loss; loss vs. rounds, with validation and training curves]
Clearly, the minimization of loss is not very efficient. Let us test this:
In[]:= trainedNet = experiment["TrainedNet"]
Out[]= NetChain[Input port: vector (size: 211