Forecast timeseries with Zeros / Gaps in between?

Posted 3 years ago

Good afternoon,

I wonder how to forecast or predict values for a time series when the existing data from the past has gaps or zeros.

- Should I use an "Event Series"?
- Or Predict?
- Or change the prediction method of the time series?

If we have data consisting of pairs of a date and an amount (e.g. sales, or kilometers run on that date), how can we create a forecast if that data is not continuous? (Example data below.)

I tried to create an EventSeries, but it seems as if TimeSeriesModelFit cannot be used on an EventSeries (?). But when I create a TimeSeries instead, the values for every date that is not given are interpolated (= not zero).
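To make the interpolation behaviour concrete, here is a minimal sketch using three values from the example data around the gap of Jan 9 and 10. The `ResamplingMethod -> {"Constant", 0}` setting is an assumption based on my reading of the ResamplingMethod documentation and should be checked there; the names `pairs`, `ts`, `tsZero` and `gapTime` are only illustrative.

```wolfram
(* Three {date, value} pairs from the example data; Jan 9 and Jan 10 are missing *)
pairs = {{{2010, 1, 7}, 530.}, {{2010, 1, 8}, 449.}, {{2010, 1, 11}, 484.}};

(* Time inside the gap, expressed as an absolute time *)
gapTime = AbsoluteTime[{2010, 1, 9}];

(* Default TimeSeries: values between the given dates are interpolated *)
ts = TimeSeries[pairs];
interpolated = ts[gapTime]  (* lies between the Jan 8 and Jan 11 values *)

(* Assumed setting: return the constant 0 between the given dates instead *)
tsZero = TimeSeries[pairs, ResamplingMethod -> {"Constant", 0}];
tsZero[gapTime]
```

This only changes what the series reports between observations; it does not by itself decide which model TimeSeriesModelFit selects.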

To avoid that, I filled the dates where "nothing happened" with zeros. Not only does this make the plot look horrible, it also makes TimeSeriesModelFit pick a moving-average model by default, which results in a forecast that is a horizontal line.

Should I just override that? Or what else can I do to build a forecast with data where, on some days, the value is zero? (Actually, where most dates don't even exist in the list.)

(Sorry, all this TimeSeries stuff is still new to me.)

Thanks a lot for your help already! Oliver

Example Data, from attached Notebook:

{{DateObject[{2010, 1, 4}, "Day", "Gregorian", 2.], 453.}, {DateObject[{2010, 1, 5}, "Day", "Gregorian", 2.], 511.}, 
 {DateObject[{2010, 1, 6}, "Day", "Gregorian", 2.], 493.}, {DateObject[{2010, 1, 7}, "Day", "Gregorian", 2.], 530.}, 
 {DateObject[{2010, 1, 8}, "Day", "Gregorian", 2.], 449.}, {DateObject[{2010, 1, 11}, "Day", "Gregorian", 2.], 484.}, 
 {DateObject[{2010, 1, 12}, "Day", "Gregorian", 2.], 518.}, {DateObject[{2010, 1, 14}, "Day", "Gregorian", 2.], 533.}, 
 {DateObject[{2010, 1, 15}, "Day", "Gregorian", 2.], 465.}, {DateObject[{2010, 1, 18}, "Day", "Gregorian", 2.], 456.}, 
 {DateObject[{2010, 1, 20}, "Day", "Gregorian", 2.], 455.}, {DateObject[{2010, 1, 21}, "Day", "Gregorian", 2.], 473.}, 
 {DateObject[{2010, 1, 22}, "Day", "Gregorian", 2.], 501.}, {DateObject[{2010, 1, 27}, "Day", "Gregorian", 2.], 454.}, 
 {DateObject[{2010, 1, 29}, "Day", "Gregorian", 2.], 497.}, {DateObject[{2010, 2, 1}, "Day", "Gregorian", 2.], 508.}, 
 {DateObject[{2010, 2, 2}, "Day", "Gregorian", 2.], 497.}, {DateObject[{2010, 2, 3}, "Day", "Gregorian", 2.], 520.}, 
 {DateObject[{2010, 2, 4}, "Day", "Gregorian", 2.], 460.}, {DateObject[{2010, 2, 5}, "Day", "Gregorian", 2.], 464.}, 
 {DateObject[{2010, 2, 12}, "Day", "Gregorian", 2.], 536.}, {DateObject[{2010, 2, 15}, "Day", "Gregorian", 2.], 563.}, 
 {DateObject[{2010, 2, 17}, "Day", "Gregorian", 2.], 495.}, {DateObject[{2010, 2, 22}, "Day", "Gregorian", 2.], 539.}, 
 {DateObject[{2010, 2, 23}, "Day", "Gregorian", 2.], 560.}, {DateObject[{2010, 2, 26}, "Day", "Gregorian", 2.], 547.}, 
 {DateObject[{2010, 3, 2}, "Day", "Gregorian", 2.], 482.}, {DateObject[{2010, 3, 9}, "Day", "Gregorian", 2.], 537.}, 
 {DateObject[{2010, 3, 10}, "Day", "Gregorian", 2.], 501.}, {DateObject[{2010, 3, 12}, "Day", "Gregorian", 2.], 538.}, 
 {DateObject[{2010, 3, 18}, "Day", "Gregorian", 2.], 582.}, {DateObject[{2010, 3, 22}, "Day", "Gregorian", 2.], 578.}, 
 {DateObject[{2010, 3, 24}, "Day", "Gregorian", 2.], 590.}, {DateObject[{2010, 3, 25}, "Day", "Gregorian", 2.], 516.}, 
 {DateObject[{2010, 3, 29}, "Day", "Gregorian", 2.], 506.}, {DateObject[{2010, 3, 31}, "Day", "Gregorian", 2.], 554.}, 
 {DateObject[{2010, 4, 1}, "Day", "Gregorian", 2.], 533.}, {DateObject[{2010, 4, 5}, "Day", "Gregorian", 2.], 505.}, 
 {DateObject[{2010, 4, 6}, "Day", "Gregorian", 2.], 586.}, {DateObject[{2010, 4, 8}, "Day", "Gregorian", 2.], 505.}, 
 {DateObject[{2010, 4, 9}, "Day", "Gregorian", 2.], 527.}}
POSTED BY: Oliver Ruessing
2 Replies
Posted 3 years ago

Hi Oliver,

If there is no data available for a date then that date should not be present in the input. Using a value of zero is not a good idea. Let TimeSeriesModelFit resample the data or resample it yourself and interpolate between the missing values.

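(* notZero is not defined in this post; presumably the {date, value} data without the zero-filled dates *)
tsm = TimeSeriesModelFit[notZero, {"ARMA", {25, 5}}];
(* Plot the data used by the model together with a 15-step forecast *)
DateListPlot[{tsm["TemporalData"], TimeSeriesForecast[tsm, {15}]}]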

There are quite a few gaps so interpolating would help. Depending on the nature of the data you can pick suitable values for the model family and parameters.
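The "resample it yourself" option could look roughly like the following sketch. It uses only the January values from the example data (written as date lists for brevity), resamples them onto a regular daily grid, and lets TimeSeriesModelFit choose the model family. The names `data`, `daily` and `forecast` are illustrative, and the automatic model choice is an assumption, not a recommendation from the reply above.

```wolfram
(* January values from the example data; use the full list in practice *)
data = {{{2010, 1, 4}, 453.}, {{2010, 1, 5}, 511.}, {{2010, 1, 6}, 493.},
   {{2010, 1, 7}, 530.}, {{2010, 1, 8}, 449.}, {{2010, 1, 11}, 484.},
   {{2010, 1, 12}, 518.}, {{2010, 1, 14}, 533.}, {{2010, 1, 15}, 465.},
   {{2010, 1, 18}, 456.}, {{2010, 1, 20}, 455.}, {{2010, 1, 21}, 473.},
   {{2010, 1, 22}, 501.}, {{2010, 1, 27}, 454.}, {{2010, 1, 29}, 497.}};

ts = TimeSeries[data];

(* Regular daily grid; values on missing dates are filled by interpolation *)
daily = TimeSeriesResample[ts, Quantity[1, "Days"]];

(* Let TimeSeriesModelFit select a model family and orders automatically *)
tsm = TimeSeriesModelFit[daily];

(* Forecast the next 15 steps (days) and plot them after the resampled data *)
forecast = TimeSeriesForecast[tsm, {15}];
DateListPlot[{daily, forecast}]
```

Resampling first makes the interpolation step explicit, so the filled-in values can be inspected before a model is fitted.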

POSTED BY: Rohit Namjoshi

Hello Rohit,

thanks a lot for your reply, much appreciated.

Actually, now I see that my approach was not perfect: what's really needed is a time series model of the moving average (or some other mean), not of the gappy data itself.

The idea is to predict values for the future, e.g. sales in units, or kilometers run on a day. If we interpolate data that has a lot of gaps (e.g. because I go for a run only twice a week), the resulting interpolated prediction will have a total of kilometers (or sold units) for that week that is much too high, because the interpolation makes it appear as if I were running every day.

So instead, some kind of mean or average has to be computed from the gappy data, which is naturally much lower than the interpolated data. Then a TimeSeriesModel or predictor function has to be created for this (moving) average.
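One way to sketch this idea is to sum the values per 7-day window, counting windows without any observation as 0, and to model those weekly totals instead of the daily values. This is only an illustration under assumptions: the windows are counted from the first date rather than aligned to calendar weeks, the names `weekIndex`, `sums` and `weeklyTotals` are hypothetical, and the threshold of 10 totals before fitting is arbitrary.

```wolfram
(* January values from the example data; use the full list in practice *)
data = {{{2010, 1, 4}, 453.}, {{2010, 1, 5}, 511.}, {{2010, 1, 6}, 493.},
   {{2010, 1, 7}, 530.}, {{2010, 1, 8}, 449.}, {{2010, 1, 11}, 484.},
   {{2010, 1, 12}, 518.}, {{2010, 1, 14}, 533.}, {{2010, 1, 15}, 465.},
   {{2010, 1, 18}, 456.}, {{2010, 1, 20}, 455.}, {{2010, 1, 21}, 473.},
   {{2010, 1, 22}, 501.}, {{2010, 1, 27}, 454.}, {{2010, 1, 29}, 497.}};

(* Index of the 7-day window a date falls in, counted from the first date *)
t0 = AbsoluteTime[data[[1, 1]]];
weekIndex[date_] := Floor[(AbsoluteTime[date] - t0)/(7*86400)];

(* Total per window; windows without any observation count as 0 *)
sums = GroupBy[data, (weekIndex[First[#]] &) -> Last, Total];
weeklyTotals = Table[Lookup[sums, k, 0.], {k, 0, Max[Keys[sums]]}];

(* With the full data set (about 14 weekly totals) a model can be fitted
   on the weekly scale; the threshold of 10 is an arbitrary assumption *)
If[Length[weeklyTotals] >= 10,
 tsm = TimeSeriesModelFit[weeklyTotals];
 TimeSeriesForecast[tsm, {4}]]
```

Dividing a weekly total by 7 gives a daily average that stays consistent with the actual weekly sum, which is the property the interpolated series lacks.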

That's a rough description; I hope it conveys the meaning. I'll try to add the result later, once I've finished the code and a few graphs.

POSTED BY: Oliver Ruessing