Extracting data from images of spectral plots generated by SDBS

I sometimes obtain plots from the Spectral Database for Organic Compounds (SDBS). They are just images or PDF files, like the one in this post. Notice the unusual reversed, non-uniform time axis. How could I extract data from it? Is it possible at all? Even a general recommendation would be helpful. I would also like to re-plot this with a normal uniform axis. Thank you!

POSTED BY: Darya Aleinikava
3 Replies
There are a number of questions about this on Mathematica.SE.  Here are two you may be interested in:
Take a look at @halirutan's answer in particular.
POSTED BY: Szabolcs Horvát
This is great, exactly what I was hoping for. Nice trick with ImageData and Position ;-) My colleagues and I applied this to a few other plots and it works perfectly, thank you!
POSTED BY: Darya Aleinikava
Yes, but it's not totally automatic.

Here is how I would do it.

First, import the image (above):

image = Import["C:\\Users\\arnoudb.WRI\\Downloads\\WqZ6swa.png"]

Next, using the image editing tools, extract just the content area:

image2 = ImageTake[image, {18, 683}, {74, 1498}]

Then extract the "black" points:
pos = Position[ImageData[image2], {0., 0., 0.}];
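Matching exactly {0., 0., 0.} only finds pure black pixels. If a plot is anti-aliased or JPEG-compressed, the curve may be dark gray instead, and a threshold can be more forgiving. The following is a minimal sketch of that idea, not part of the original answer; the 0.5 threshold and the tiny stand-in image `sampleImage` are assumptions chosen only so the snippet runs on its own.

```mathematica
(* Hypothetical 3x3 grayscale stand-in for the cropped plot: 1 = white, 0 = black *)
sampleImage = Image[{{1, 1, 1}, {1, 0.1, 1}, {0, 1, 1}}];

(* Binarize sends pixels above the threshold to 1 and the rest to 0;
   ColorNegate flips that, so dark pixels become 1 *)
darkMask = ColorNegate[Binarize[sampleImage, 0.5]];

(* {row, column} positions of every dark pixel, in the same form as pos *)
hits = Position[ImageData[darkMask], 1]
```

With a real plot, `image2` would take the place of `sampleImage`, and the threshold may need adjusting so that grid lines or labels are not picked up.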

Then get the data ranges (in the "image coordinate system"):
In[33]:= Max[pos[[All, 1]]]
Out[33]= 635
In[34]:= Min[pos[[All, 1]]]
Out[34]= 32
In[35]:= Max[pos[[All, 2]]]
Out[35]= 1425
In[36]:= Min[pos[[All, 2]]]
Out[36]= 1
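The same four numbers can be obtained with two calls to MinMax, which returns {min, max} as a pair. A small sketch, using a made-up three-point list `samplePos` in place of the real `pos` so that it runs standalone:

```mathematica
(* Hypothetical stand-in for pos: a few {row, column} pairs *)
samplePos = {{32, 1}, {300, 700}, {635, 1425}};

(* Row (vertical) and column (horizontal) extents in image coordinates *)
{rowMin, rowMax} = MinMax[samplePos[[All, 1]]]
{colMin, colMax} = MinMax[samplePos[[All, 2]]]
```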

Then rescale the data points and make a list plot. (Edit: the original plot appears to use piecewise-linear or logarithmic scaling on the x-axis, so the simple linear rescaling used here is only approximate, and the correct transformation will be a bit harder.)

ListPlot[{Rescale[#[[2]], {1, 1425}, {4000, 400}], Rescale[#[[1]], {635, 32}, {0, 100}]} & /@ pos]
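Because the curve is several pixels thick, the rescaling above plots every black pixel rather than one value per x position. If a single y value per column is wanted, one option is to average the row indices within each column. This is only a sketch of that idea under the assumption that the black pixels belong to the curve alone (axis frames, grid lines, or labels would bias the mean); `samplePos` is a hypothetical stand-in for `pos`.

```mathematica
(* Hypothetical stand-in for pos: column 1 has two black pixels, column 2 has one *)
samplePos = {{10, 1}, {12, 1}, {20, 2}};

(* Group rows by column (Last is the column, First is the row) and average them *)
meanRowByColumn = KeySort[GroupBy[samplePos, Last -> First, Mean]];

(* Back to a list of {column, mean row} pairs, ready for rescaling *)
curve = KeyValueMap[List, meanRowByColumn]
```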

Note how the data is "reversed" relative to the original image, because the original image has an unusual x-axis: it starts at 4000 and goes down to 400.
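If the x-axis is piecewise linear, as the edit above suspects, a linear interpolation through a few calibration points can replace the single Rescale. The sketch below assumes one scale change at 2000 and places it at pixel column 570; both the breakpoint and its column are hypothetical and would have to be read off the tick marks of the actual image. `samplePos` again stands in for `pos`.

```mathematica
(* Hypothetical calibration: {pixel column, axis value} pairs.
   The end points come from the ranges above; the middle pair is an assumed breakpoint. *)
anchors = {{1, 4000}, {570, 2000}, {1425, 400}};

(* First-order interpolation gives a piecewise-linear column -> axis value map *)
toWavenumber = Interpolation[anchors, InterpolationOrder -> 1];

(* Vertical axis: the same linear map as before (row 635 -> 0, row 32 -> 100) *)
toY[row_] := Rescale[row, {635, 32}, {0, 100}];

(* Hypothetical stand-in for pos so the sketch runs on its own *)
samplePos = {{32, 1}, {300, 570}, {635, 1425}};

data = {toWavenumber[#[[2]]], toY[#[[1]]]} & /@ samplePos;

(* ListPlot draws an ordinary uniform, increasing x-axis *)
ListPlot[data]
```

More tick marks can be added to `anchors` to tighten the calibration. A logarithmic axis would need a different mapping.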
POSTED BY: Arnoud Buzing