
Find clusters in an energy spectrum?

Posted 6 years ago

Hi! I have recently been faced with a spectral analysis problem that might be solved with data clustering techniques. I have an energy spectrum with around 200 energy levels. Many of them are separated by less than, or by an amount comparable to, the error on the energy, so I would like to cluster them and treat levels that are very close in energy (relative to their errors) as a single level. The data are organised in a table of four columns: the first is the energy, the second and third are the level intensity and its error (irrelevant for the clustering), and the fourth is the error on the energy.

level52 = {{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13}, {6485, 2.03, 0.0848, 15},
   {6531, 0.78, 0.0579, 18}, {6584, 0.56, 0.0488, 21}, {6659, 0.83, 0.0483, 25}}

etc...

I therefore tried:

FindClusters[Drop[level52, None, {2, 3, 4}] -> level52, 40]

but this does not cluster the levels properly. Does anyone know what the appropriate distance method is in this case? Is there a way to define a distance function that uses the error on the energy in the fourth column of the table? For example, a distance equal to the difference in energy between two levels divided by the sum of their errors. Many thanks for the help.

POSTED BY: Andrea G.
5 Replies

I devised a new distance function df1, which is 0 when the energies are equal, 1 when the uncertainty intervals overlap, and the absolute value of the energy difference otherwise.

df1[x_, y_] := Module[{j1, j2, ds},
  (* uncertainty intervals: energy +/- error on the energy *)
  j1 = Interval[{x[[1]] - x[[4]], x[[1]] + x[[4]]}];
  j2 = Interval[{y[[1]] - y[[4]], y[[1]] + y[[4]]}];
  ds = IntervalIntersection[j1, j2];
  Which[
   x[[1]] == y[[1]], 0,
   ds === Interval[], Abs[x[[1]] - y[[1]]],
   True, 1]
  ]

Applying this distance function to the 2-subsets of level52 shows that exactly one pair of energies is "similar", i.e. has distance 1, while all the other pairs are not:

In[17]:= df1 @@@ (Subsets[level52, {2}])

Out[17]= {1, 55, 101, 154, 229, 33, 79, 132, 207, 46, 99, 174, 53, 128, 75}

So level52 should be split into clusters, which surprisingly is not the case:

 In[15]:= FindClusters[level52, DistanceFunction -> (df1[#1, #2] &)]
% // Length

Out[15]= {{{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13}, {6485, 
   2.03, 0.0848, 15}, {6531, 0.78, 0.0579, 18}, {6584, 0.56, 0.0488, 
   21}, {6659, 0.83, 0.0483, 25}}}

Out[16]= 1

So I decided to do it on my own and, as expected, found two clusters in level52: one containing the two "similar" energy levels, and one with all the others:

In[29]:= len = Length[level52];
pairs = Flatten[Table[{level52[[i]], level52[[j]]}, {i, len - 1}, {j, i + 1, len}],1];
Select[pairs, df1[#[[1]], #[[2]]] == 1 &]

Out[31]= {{{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13}}}
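
One way to turn such pairwise matches into actual clusters without FindClusters is to treat every overlapping pair as an edge of a graph and read off the connected components. The following is only a minimal sketch under that assumption; overlapQ is a hypothetical helper name (it is True exactly when df1 would return 0 or 1), and the data are repeated so that the block is self-contained.

```wolfram
(* Sample data from the thread: {energy, intensity, intensity error, energy error} *)
level52 = {{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13},
   {6485, 2.03, 0.0848, 15}, {6531, 0.78, 0.0579, 18},
   {6584, 0.56, 0.0488, 21}, {6659, 0.83, 0.0483, 25}};

(* overlapQ (hypothetical helper): True when the two error intervals overlap *)
overlapQ[x_, y_] := Abs[x[[1]] - y[[1]]] <= x[[4]] + y[[4]]

(* Connect the indices of every overlapping pair of levels *)
n = Length[level52];
edges = UndirectedEdge @@@ Select[Subsets[Range[n], {2}],
    overlapQ[level52[[#[[1]]]], level52[[#[[2]]]]] &];

(* Each connected component of the graph is one cluster of indices *)
components = ConnectedComponents[Graph[Range[n], edges]];

(* Map the index groups back to the data, ordered by their lowest energy *)
clusters = SortBy[level52[[#]] & /@ (Sort /@ components), #[[1, 1]] &];

Length /@ clusters  (* {2, 1, 1, 1, 1} for the six levels above *)
```

Because components are transitive, a level with a large error can pull otherwise separate levels into one cluster; whether that is desirable depends on the physics.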
POSTED BY: Hans Dolhaine

I defined a distance function according to your proposal (although I don't see why you divide by the sum of the errors: large errors make large differences in energy look small):

For example a distance which is the difference in energy between the levels divided by the sum of their errors

df[x_, y_] := Abs[(x[[1]] - y[[1]])/(x[[4]] + y[[4]])]

Unfortunately, this does not produce any clustering of level52:

In[68]:= FindClusters[level52, DistanceFunction -> (df[#1, #2] &)]
% // Length

Out[68]= {{{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13}, {6485, 
   2.03, 0.0848, 15}, {6531, 0.78, 0.0579, 18}, {6584, 0.56, 0.0488, 
   21}, {6659, 0.83, 0.0483, 25}}}

Out[69]= 1

Interestingly enough, applying this distance function to each pair of elements in level52 (produced by the Subsets command) gives a set of numbers (distances) which FindClusters does split into two clusters:

In[83]:= Apply[df, Subsets[level52, {2}], {1}] // N
FindClusters[%]

Out[83]= {0.88, 2.03704, 3.36667, 4.66667, 6.18919, 1.17857, 2.54839, \
3.88235, 5.44737, 1.39394, 2.75, 4.35, 1.35897, 2.97674, 1.63043}

Out[84]= {{0.88, 2.03704, 1.17857, 2.54839, 1.39394, 2.75, 1.35897, 
  2.97674, 1.63043}, {3.36667, 4.66667, 6.18919, 3.88235, 5.44737, 
  4.35}}

But does df work at all inside FindClusters?

If I build a new data set named level52a by appending five copies of existing levels with the energy shifted up by 450 units, the distance function does work: level52a is clustered into two sets of 6 (the original data) and 5 (the new data) elements.

In[114]:= 
level52a = Join[level52, # + {450, 0, 0, 0} & /@ Take[level52, 5]];
FindClusters[level52a, DistanceFunction -> (df[#1, #2] &)]
Length /@ %

Out[115]= {{{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13}, {6485, 
   2.03, 0.0848, 15}, {6531, 0.78, 0.0579, 18}, {6584, 0.56, 0.0488, 
   21}, {6659, 0.83, 0.0483, 25}}, {{6880, 0.93, 0.0808, 12}, {6902, 
   0.56, 0.112, 13}, {6935, 2.03, 0.0848, 15}, {6981, 0.78, 0.0579, 
   18}, {7034, 0.56, 0.0488, 21}}}

Out[116]= {6, 5}

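So df does work inside FindClusters, but it is evidently not appropriate for the clustering you want: in level52a the six original levels still end up in a single cluster.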

POSTED BY: Hans Dolhaine

Sorry, but it is by no means clear what you actually want to achieve.

What should your

Drop[level52, None, {2, 3, 4}] -> level52

do? In my system this just makes a Rule. Look at

FullForm [ Drop[level52, None, {2, 3, 4}] -> level52 ]

And then? What should be done with this rule?

Or do you want to eliminate some entries of level52? And if yes, which?

Is level52 one of your 200 or so energy levels, or do you want to cluster within level52?

It would be helpful if you gave more details, or a written example (in a notebook) of what your intentions are, e.g. ten or fifteen energy levels in a list, together with the result you want to get.

POSTED BY: Hans Dolhaine

What do you expect? Perhaps this?

FindClusters[#[[1]] & /@ level52]
POSTED BY: Hans Dolhaine
Posted 6 years ago

Well, I would expect FindClusters to return something like this:

{
 {{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.1120, 13}, {6485, 2.03, 0.0848, 15}},  (* first cluster: energies compatible within errors *)
 {{6531, 0.78, 0.0579, 18}, {6584, 0.56, 0.0488, 21}},  (* second cluster: energies compatible within errors *)
 {{6659, 0.83, 0.0483, 25}}  (* third cluster: energy not compatible within errors with the other energies *)
}
POSTED BY: Andrea G.
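
For reference, the expected grouping above can be compared with a simple neighbour-chaining rule based on the proposed distance (energy difference divided by the sum of the errors). This is a sketch under stated assumptions rather than a recommended method: groupLevels and the tolerance k are hypothetical names introduced here, and only adjacent levels in the energy-sorted list are compared.

```wolfram
(* Sample data from the thread: {energy, intensity, intensity error, energy error} *)
level52 = {{6430, 0.93, 0.0808, 12}, {6452, 0.56, 0.112, 13},
   {6485, 2.03, 0.0848, 15}, {6531, 0.78, 0.0579, 18},
   {6584, 0.56, 0.0488, 21}, {6659, 0.83, 0.0483, 25}};

(* df as defined in the thread: energy difference in units of the summed errors *)
df[x_, y_] := Abs[(x[[1]] - y[[1]])/(x[[4]] + y[[4]])]

(* groupLevels (hypothetical helper): sort by energy, then keep neighbouring
   levels in the same group as long as their normalised distance is <= k *)
groupLevels[levels_, k_] := Split[SortBy[levels, First], df[#1, #2] <= k &]

(* k = 1: neighbours must agree within the sum of their errors *)
Length /@ groupLevels[level52, 1]     (* {2, 1, 1, 1, 1} *)

(* a looser tolerance merges more neighbours *)
Length /@ groupLevels[level52, 1.37]  (* {3, 2, 1} *)
```

For these six levels the neighbour distances are roughly 0.88, 1.18, 1.39, 1.36 and 1.63, so this rule reproduces the three expected clusters only for k between about 1.36 and 1.39. With k = 1 only the first two levels are merged, which matches the single overlapping pair found with df1.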