Finding Peaks with Signal Data

Posted 8 years ago

Hi All,

I have a question about using FindPeaks with signal data in ListLinePlot. I have two columns of data and am trying to find the peaks of the second column (the first column holds the x-values in seconds). FindPeaks reports 32 peaks, but only one peak shows up on the plot, at 1 second. What is wrong? Can you please help?

See file attached. Thanks.

data = test[[All, 2]];
peaks = FindPeaks[data];
First /@ peaks;
Length[peaks];
ListLinePlot[test, Epilog -> {Red, PointSize[0.03], Point[peaks]}, PlotStyle -> Directive[Green, Thin]]
POSTED BY: Jos Klaps
10 Replies

Hi Jos,

The way you calculate the peaks, you get a result of the form {{index (!), value}, ...}, so the x-coordinates are list positions rather than times in seconds. One possibility is to define your data as a time series and do it like so (compare with the documentation on FindPeaks, under "Details and Options"):

ts = Transpose[test];
peaks = Normal@FindPeaks[TimeSeries[ts[[2]], {ts[[1]]}]]; 
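For comparison, here is a minimal sketch of the same idea without TimeSeries: look up the time that belongs to each returned index. It assumes test is an n x 2 list of {time, value} rows as in the question; the names idxPeaks and timePeaks are only illustrative.

```wolfram
(* Assumption: test is an n x 2 list of {time, value} rows *)
idxPeaks = FindPeaks[test[[All, 2]]];            (* {{index, value}, ...} *)
timePeaks = {test[[#1, 1]], #2} & @@@ idxPeaks;  (* replace each index by its time *)

ListLinePlot[test,
 Epilog -> {Red, PointSize[0.03], Point[timePeaks]},
 PlotStyle -> Directive[Green, Thin]]
```

Either way, the points passed to Epilog are then in the same coordinates as the plotted curve.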

Regards -- Henrik

POSTED BY: Henrik Schachner

Hi Henrik,

I'm very pleased with your support and tips. This is what I was looking for. Thank you.

Regards, Jos

POSTED BY: Jos Klaps

I have a 1-D list of data with about 3 million elements. The data looks like noise to the naked eye, but it is not noise. If I apply FindPeaks to a 60k-element sublist taken from somewhere in the middle of the original data, FindPeaks[sublist,315,0] does a pretty decent job of finding the peaks at the scale I need, finding 22 peaks. If I do FindPeaks[originallist,315,0], I get 25 peaks and none of them are in that sublist. The found peaks are clustered at the front end of the original list. Any clues as to what is going on? I wish I could share the notebook, but I'm not sure how to do that with 3M points.

POSTED BY: Richard Klopp

If you decimate the data (literally), and maybe use the first third of that, does the issue persist? If so, it would be with 100K elements, which might now be small enough to put into an attached notebook.
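A minimal sketch of that reduction, assuming the full data is stored in originallist as in the post above (decimated and firstThird are illustrative names):

```wolfram
(* originallist: the full 1-D data list, about 3 million elements *)
decimated = originallist[[1 ;; -1 ;; 10]];                   (* keep every 10th element *)
firstThird = Take[decimated, Ceiling[Length[decimated]/3]];  (* roughly 100K elements *)
Length[firstThird]
```

Since the scale argument of FindPeaks is measured in samples, a scale of 315 on the full list would correspond to roughly 31.5 on the decimated one.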

POSTED BY: Daniel Lichtblau

See attached notebook with 1/12 the data. I am sure there is an explanation, but I don't understand the behavior difference between FindPeaks[ , 400], 500, and 600. Perhaps I am simply asking too much of FindPeaks. FindPeaks[ ,500] gets closest to the result I am seeking, in terms of scale.

POSTED BY: Richard Klopp

Dear Richard,

I suppose that the main problem is that there are so many repeated values in your time series. You might want to take that into account somehow. If we define a maximum as a value that is larger than the left and right neighbour then we might run into trouble if there are too many repeated values. So we might first want to identify the repeated values:

testData = Flatten[Import["~/Desktop/temptest.txt", "Data"]];
summarydata = {#[[1]], Length[#]} & /@ Split[testData]

The Split function splits the data into runs of identical consecutive values. The pure function then takes the value and the length of each run, so we get each value together with its number of repetitions. The output looks like this:

enter image description here
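On a small made-up list (toy is not part of the original data), the two steps behave like this:

```wolfram
toy = {1, 1, 2, 2, 2, 1, 3, 3};

Split[toy]
(* {{1, 1}, {2, 2, 2}, {1}, {3, 3}} *)

{#[[1]], Length[#]} & /@ Split[toy]
(* {{1, 2}, {2, 3}, {1, 1}, {3, 2}} *)
```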

Now we can ignore the repetitions and just look at consecutively different values, i.e. just use values of consecutive plateaus:

summarypeaks = FindPeaks[summarydata[[All, 1]], 1]

Again, this seems to perform rather well:

enter image description here

We could now decide to define a "peak" as the first value of a plateau, if the plateau before and after have lower values:

finalpeaks = {Total[summarydata[[1 ;; #[[1]] - 1, 2]]] + 1, #[[2]]} & /@ summarypeaks

enter image description here

Show[ListPlot[testData], ListPlot[finalpeaks, PlotStyle -> Red]]

enter image description here
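The plateau bookkeeping above can also be sketched with Accumulate, which computes the start position of every plateau once instead of calling Total for each peak. This is only an alternative formulation, assuming summarydata and summarypeaks are defined as above and the peak positions are integer plateau indices; starts and finalpeaksAlt are illustrative names.

```wolfram
(* Toy check: plateau lengths {2, 3, 1, 2} start at positions 1, 3, 6, 7 *)
Most[Accumulate[Prepend[{2, 3, 1, 2}, 1]]]
(* {1, 3, 6, 7} *)

(* Same idea on the real data *)
starts = Most[Accumulate[Prepend[summarydata[[All, 2]], 1]]];
finalpeaksAlt = {starts[[#[[1]]]], #[[2]]} & /@ summarypeaks
```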

There is, however, something I find quite weird about how FindPeaks reacts to the following idea. If the problem is caused by multiple repeated values, adding noise should fix some of that. It would create too many peaks, but they should be distributed quite uniformly over the entire dataset. But if I use:

Show[ListPlot[testData], ListPlot[FindPeaks[testData + RandomVariate[NormalDistribution[0, 0.1], 259839]], PlotStyle -> Red]]

I get:

enter image description here

which is slightly unexpected. Perhaps @Daniel Lichtblau can help me out?

Cheers,

Marco

PS: Alternatively, I can of course use MovingAverage:

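smoothed = N[MovingAverage[testData, 4000]];
(* assumed: the post does not show how peaks was computed for this plot *)
peaks = FindPeaks[smoothed];
Show[ListPlot[smoothed], ListPlot[peaks, PlotStyle -> Red]]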

enter image description here

POSTED BY: Marco Thiel

Another thing is that the time series appears to be hopelessly oversampled. Everything becomes much faster if you first resample:

Show[ListLinePlot[MovingAverage[ArrayResample[testData, 2500], 50]], 
 ListPlot[FindPeaks[MovingAverage[ArrayResample[testData, 2500], 50], 25], PlotStyle -> Red]]

enter image description here

One can put a bit more effort (calculate window sizes in the resampled data) into this, but basically one can then also determine where the peaks are for the original sampling:

peaksresample = N /@ FindPeaks[MovingAverage[ArrayResample[testData, 2500], 50], 25];
{Length[testData]/2500.*#[[1]], #[[2]]} & /@ peaksresample
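As a rough sketch of the extra effort mentioned above, the position can also be shifted to the centre of the moving-average window before rescaling. Element i of MovingAverage[list, r] averages list[[i ;; i + r - 1]], so its centre sits at i + (r - 1)/2 in the resampled data. The scale factor n/m is the same approximation used above and ignores how ArrayResample aligns the end points; n, m, r, smoothedResampled and peaksOriginal are illustrative names.

```wolfram
n = Length[testData];  (* number of original samples *)
m = 2500;              (* resampled length, as above *)
r = 50;                (* moving-average window, as above *)

smoothedResampled = MovingAverage[ArrayResample[testData, m], r];
peaksResampled = N /@ FindPeaks[smoothedResampled, 25];

(* shift each position to the window centre, then rescale to the original sampling *)
peaksOriginal = {n/m*(#[[1]] + (r - 1)/2.), #[[2]]} & /@ peaksResampled
```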

Cheers,

Marco

POSTED BY: Marco Thiel

Marco, thank you, this is extremely helpful. Funny that you mention oversampling. You should see the original data, which is sampled 12 times more frequently still. This is telemetry data from a large machine, and high-frequency sampling is required to resolve certain mechanical responses, but certainly not thermal responses. The data I supplied here is an example of a thermal response.

Kindest regards,

Rich

POSTED BY: Richard Klopp

Dear Richard,

That is indeed interesting. And the mechanical responses might be on a much faster time scale than the thermal ones. A long time ago I looked at data from milling and had rather high-resolution measurements. I believe that Mathematica should be able to deal with this kind of data. (You might use alternatives to the Import function though...)

Do you need the results in (near) real-time?

Best wishes,

Marco

POSTED BY: Marco Thiel

No need for real-time output. All of this analysis is after-the-fact regarding a plant upset.

POSTED BY: Richard Klopp