Filtering Missing from Data

Posted 9 years ago

What am I doing wrong? I am still getting Missing data in my result.

Select[FinancialData["GE", {{2012, 5, 1}, {2012, 9, 30}}][[All]], FreeQ[#, Missing] &]
POSTED BY: Clinton Osborn
10 Replies

Hi,

I cannot reproduce that. If I download the data you specify, I do not get any missing data at all. If, however, I download this:

WeatherData["EEKA",  "WindSpeed", {{2004, 7, 12}, {2004, 7, 30}}] // Normal

I do get missing data. In that case your function:

Select[WeatherData["EEKA", "WindSpeed", {{2004, 5, 1}, {2004, 9, 30}}] // Normal, FreeQ[#, Missing] &]

appears to work as expected. If you want you can export the time series that you get and attach it to a post in this thread. I'll try with that then.

Cheers,

M.

POSTED BY: Marco Thiel

Clinton, why do you have [[All]]? Is it a typo?

Anyway, these functions can generally be useful for you, in addition to Marco's answer:

?*Missing*

enter image description here

POSTED BY: Sam Carrettie

In my stumbling around to see what I was doing wrong, I changed the code from [[All,2]] to [[All]].

Thank you.

POSTED BY: Clinton Osborn

You are good at this. I was trying to use some weather data examples to do finance.

Thank you.

POSTED BY: Clinton Osborn

Well, as I said, I could not reproduce your problem using Financial data, because the function behaved differently from what you described. When I downloaded the data, there was not a single missing data point so I could not test your function to delete the missing points.

I then tried to download the financial data for different time periods, but no luck. I always got all data with not a single point missing.

As the question appeared to be mainly about deleting missing data points, I thought trying it on weather data might work, since Missing data points are easy to find there. So I downloaded one of those data sets and found that your function to clean the data does appear to work. The structure of the two kinds of data sets is quite similar, so applying your function to delete the missing data should behave the same way on both.

That function appeared to work just fine, so I believe that it should work for other data sets of the same form, too. It looked as if your question was not about financial data but rather about deleting entries with the head Missing[].
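
As a minimal sketch of that point, assuming a small hand-made list of date-value pairs (the values are invented for illustration and are not taken from either download), the same Select call drops every pair that contains an expression with head Missing. DeleteCases with an explicit pattern is shown as one possible alternative:

```mathematica
(* Hypothetical date-value pairs; the Missing entry mimics a gap in curated data *)
sample = {
   {{2004, 7, 12}, 5.5},
   {{2004, 7, 13}, Missing["NotAvailable"]},
   {{2004, 7, 14}, 7.4}
   };

(* FreeQ also inspects heads by default, so the bare symbol Missing
   matches any Missing[...] expression inside a pair *)
cleaned = Select[sample, FreeQ[#, Missing] &]

(* Alternative: delete pairs whose second element has head Missing *)
cleaned2 = DeleteCases[sample, {_, _Missing}]
```

Both calls keep only the first and third pair here. Note that the test is applied to each whole {date, value} pair, so the pair is removed as a unit and the dates stay aligned with the values.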

Cheers,

M.

POSTED BY: Marco Thiel

I get a lot of complaints when I try to do something like this:

dkf = FinancialData["GE", {{2012, 5, 1}, {2012, 9, 30}}];
data = TemporalData[
   dkf
   ]["Part", All, {{2012, 5, 1}, {2012, 9, 30}}]
epoc = EstimatedProcess[data, ARIMAProcess[{40, 2, 0}]]
forecasr = 
 TimeSeriesForecast[epoc, 
  FinancialData["GE", {{2012, 5, 1}, {2012, 9, 30}}], {20}]
DateListPlot[{data["Path"], forecasr["Path"]}, Joined -> True, 
 Filling -> Axis, PlotLabel -> "GE", ImageSize -> 350]
POSTED BY: Clinton Osborn

Hi,

I am not sure whether that question is still related to the Missing data issue. There are some problems in that code; for example, it switches between scalar data and data with dates, and DateListPlot needs the dates.

The following might work:

(*Download the data*)

dkf = FinancialData["GE", {{2012, 5, 1}, {2012, 9, 30}}];

(*Make data scalar and fit time series*)

data = dkf[[All, 2]];
epoc = EstimatedProcess[data, ARIMAProcess[{a, b, c}, d, {e, f}, g]];

(*Forecast based on the model*)

forecasr = TimeSeriesForecast[epoc, data, {20}];

(*Join data and forecast*)

datapred = Join[data, Flatten[forecasr["Paths"], 1][[All, 2]]];

(*Generate dates for the DateListPlot; the BusinessDay option requires MMA10.0.2*)

dates = DateRange[dkf[[1, 1]], DayPlus[dkf[[-1, 1]], 31], "BusinessDay"];

(*Then Plot*)

DateListPlot[{Transpose[{dates, datapred}][[1 ;; Length[dkf]]], Transpose[{dates, datapred}][[Length[dkf] ;;]]}, Joined -> True, Filling -> Axis, PlotLabel -> "GE", ImageSize -> 350]

The result looks like this:

enter image description here

There is no problem with MissingData, and I am not sure whether this relates to the original post.

Cheers,

M.

POSTED BY: Marco Thiel

OK, we did vary from the original subject. But I did learn something. Getting "curated" data is not easy for a beginner. You provided some code that I have not seen in the demonstrations and documentation.

Thank You

POSTED BY: Clinton Osborn

I see your point. On the other hand, the problem that you encountered arises mostly because the time series is not equidistantly sampled: there are no data points on the weekends. So getting the data is rather straightforward, but the sampling needs to be taken into consideration. I think that is more of a data analysis problem than a curated data problem.
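
A small sketch of the sampling issue, assuming three invented prices on consecutive business days (Friday 3 August, Monday 6 August and Tuesday 7 August 2012; the numbers are hypothetical, not actual GE quotes). The spacing between the dates is uneven because of the weekend:

```mathematica
(* Hypothetical business-day prices; values are made up for illustration *)
sample = {
   {{2012, 8, 3}, 20.9},
   {{2012, 8, 6}, 21.1},
   {{2012, 8, 7}, 21.0}
   };

(* Gaps between consecutive dates, converted from seconds to days *)
dates = sample[[All, 1]];
gapsInDays = Differences[AbsoluteTime /@ dates]/86400
(* {3, 1} : the weekend makes the first step three days long *)

(* The same property can be tested on a TimeSeries object *)
RegularlySampledQ[TimeSeries[sample]]
```

The unequal gaps are what a model that assumes a fixed time step has to be protected from, either by dropping the dates or by resampling.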

To fix the sampling issue I separated the data from the dates, did the forecast, and then added the dates back in. There are several alternative strategies. For example, the function TimeSeriesResample also fixes the problem, and then I only have to work with TimeSeries data. The idea is like this:

dkfresamp = TimeSeriesResample[dkf];

I then fit a model (which in this case is not a good one!):

model = EstimatedProcess[dkfresamp, ARIMAProcess[{a, b, c}, d, {e, f}, g]]

I then forecast

forecast2 = TimeSeriesForecast[model, dkfresamp, {20}, Method -> "Kalman"]

and plot:

DateListPlot[{dkfresamp, forecast2}]

enter image description here

Of course the fit is quite unsatisfactory in this case, but the program runs through without any error messages and I do not need to split the dates from the values.

Best wishes,

M.

PS: If you want the time series in a standard date, value format you can use:

dkfresamp2 = {DateList[#[[1]]], #[[2]]} & /@ dkfresamp;
POSTED BY: Marco Thiel

The information that you have provided is excellent and something that I have had trouble getting out of the tutorials. Even the last post, though flawed, will be educational to me once I figure out what it means.

You have been very kind and more help than I deserve. I am just an old man enjoying the latest technical developments. I consider Mathematica to be revolutionary. I am trying to convince everyone else.

Thank you.

POSTED BY: Clinton Osborn