Function to turn Dataset columns into time series by year?

Posted 2 years ago

I'm dabbling with a workbook to watch snowpack in Utah, where I live. I'm able to get the data and can get the visualization that I want, but expanding it to include past years would get tedious quickly.

I have a function that takes the year as a variable, then uses it both to select the column and to set the date range. The dates start Oct. 10 each year, as that is when the water year starts. So water year 2022 started Oct. 10, 2021.

Here is what I have so far:

url = "https://www.nrcs.usda.gov/Internet/WCIS/AWS_PLOTS/basinCharts/POR/WTEQ/assocHUCut3/state_of_utah.csv";

dataset = Import[url, "Dataset", HeaderLines -> 1]

ts0 = TimeSeries[Normal@Values@dataset[All, {ToString[ToExpression[#] - 1]}], {"Oct 10 " <> #}] & /@ {"2020", "2021", "2022"}

DateListPlot[ts0, PlotRange -> All]

I tried a few different things, but can't figure out a good way to capture the output in discrete variables. The dates line up right, but I would like to capture each as a separate variable but do it programmatically so I can change the date range without having to set all the variables. Something like this, but without having to write out the output list by hand. I would like to be able to process each one as an event series so I can overlap them and see how one year tracks compared to the next. That part is still a work in progress. Any insights are welcome.

{ts0,ts1,ts2} = TimeSeries[Normal@Values@dataset[All, {ToString[ToExpression[#] - 1]}], {"Oct 10 " <> #}] & /@ {"2020", "2021", "2022"}

DateListPlot[{ts0, ts1, ts2}, PlotRange -> All]
POSTED BY: Michael Madsen
4 Replies
Posted 2 years ago

Another alternative, which creates an Association but could easily be extended to a Dataset:

yKeys = Keys[dataset[1]] // Normal // 
   StringCases[StringExpression @@ Table[DigitCharacter, 4]] // Flatten

assn1 = With[{y = #, yTS = ToString[ToExpression@# - 1]},
      y -> dataset[TimeSeries[#, {"Oct 1 " <> yTS}] &, y]
      ] & /@ yKeys // Association;

DateListPlot[Lookup[assn1, {"1981", "2020", "2021", "2022"}], 
 ImageSize -> Large]

Custom Years

DateListPlot[Lookup[assn1, yKeys], ImageSize -> Large]

All Years

POSTED BY: David G
Posted 2 years ago

Thank you. Both of those responses have been helpful. Yes, the date should start on October 1st.
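
With that correction, the start date for a given water-year label could be computed along these lines. This is only a minimal sketch: waterYearStart is a hypothetical helper name, not something from the replies, and it assumes the column labels are four-digit year strings such as "2022".

```wolfram
(* Hypothetical helper: water year "2022" begins on Oct 1, 2021. *)
(* Assumes wy is a string of digits, e.g. "2022".                *)
waterYearStart[wy_String] := DateObject[{FromDigits[wy] - 1, 10, 1}]

waterYearStart["2022"]  (* a DateObject for Oct 1, 2021 *)
```

The result can stand in for the date string in the time specification, e.g. TimeSeries[values, {waterYearStart["2022"]}].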

POSTED BY: Michael Madsen
Posted 2 years ago

As an alternative to Eric's answer, you could generate an Association to store the time series for each year, e.g.

years = dataset // First // Keys // Normal // Take[#, {2, -10}] &
timeSeries = 
 AssociationMap[
  TimeSeries[
    Normal@Values@dataset[All, {#}], {"Oct 10 " <> ToString[ToExpression[#] - 1]}] &, years]

Then pick the years you want to plot

KeyTake[timeSeries, {"2010", "2020"}] // Values // DateListPlot

Or as individual plots

KeyTake[timeSeries, {"2010", "2015", "2020"}] // 
 KeyValueMap[DateListPlot[#2, PlotLabel -> #1] &]
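
If the aim is to overlay the years on one axis rather than see them one after another along the date axis, one option might be to plot the raw values against the day of the water year. The sketch below uses made-up stand-in data so it runs on its own; valuesByYear is a hypothetical name, and with the real data each list would presumably come from something like Normal[dataset[All, "2021"]], assuming every column is numeric.

```wolfram
(* Stand-in data (an assumption, not the NRCS values): one list of daily *)
(* values per water year, with position 1 corresponding to Oct 1.        *)
SeedRandom[1];
valuesByYear = AssociationMap[
   Accumulate[RandomReal[{0, 0.2}, 200]] &,
   {"2020", "2021", "2022"}];

(* Plot each list against its index, so all years share one x-axis. *)
ListLinePlot[Values[valuesByYear],
 PlotLegends -> Keys[valuesByYear],
 AxesLabel -> {"Day of water year", "Value"}]
```

Plotting by index sidesteps the calendar dates entirely, which is why the curves overlap instead of lining up end to end.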

Also instead of "Oct 10", shouldn't it be "Oct 1" since the date column starts at 10-01?

POSTED BY: Rohit Namjoshi
Posted 2 years ago

I'm not quite understanding your desire. You said, "I would like to capture each as a separate variable but do it programmatically so I can change the date range without having to set all the variables". So, I'm confused about whether you need individual variables or not.

Using the code you've already built, you can just do:

DateListPlot[
 TimeSeries[
    Normal@Values@dataset[All, {ToString[ToExpression[#] - 1]}],
    {"Oct 10 " <> #}] & /@ {"2019", "2020", "2021", "2022"}]

If you don't want to keep adding years one-by-one, you could build a helper function for that. Something like this:

YearRange[start_, stop_] := ToString /@ Range[start, stop]
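
As a quick check of what that helper returns (the definition is repeated here only so the snippet runs on its own):

```wolfram
YearRange[start_, stop_] := ToString /@ Range[start, stop]

YearRange[2019, 2022]
(* {"2019", "2020", "2021", "2022"} *)
```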

If you need the TimeSeries data stored in a variable, just assign that expression to a variable. The variable will contain a List, which can then be accessed by position (if you need the time series data for an individual year). Or you can just use the variable as a way to simplify the DateListPlot expression. Something like this:

plotData =
  TimeSeries[
     Normal@Values@dataset[All, {ToString[ToExpression[#] - 1]}],
     {"Oct 10 " <> #}] & /@ YearRange[2019, 2022];

DateListPlot[plotData]
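
To illustrate the access-by-position idea, here is a rough sketch using stand-in random data in place of the real dataset, so it can be run without the download. The names years, firstYear and byYear are just for illustration. It also shows one way to label the series by year without creating a separate variable for each.

```wolfram
(* Stand-in for the list of TimeSeries built above (random values, *)
(* not the real snow data).                                        *)
years = {"2019", "2020", "2021", "2022"};
plotData = TimeSeries[RandomReal[{0, 20}, 30], {"Oct 1 " <> #}] & /@ years;

(* Access by position: the first element corresponds to "2019". *)
firstYear = plotData[[1]];

(* Pair each series with its year label, then look one up by name. *)
byYear = AssociationThread[years, plotData];
byYear["2021"]

DateListPlot[Values[byYear], PlotLegends -> Keys[byYear]]
```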
POSTED BY: Eric Rimbey