Group Abstract

Message Boards

WOLFRAM COMMUNITY

12.7K Views

7 Replies

12 Total Likes

View groups...

Follow this post

Share this post:

GROUPS:

Staff Picks Data Science Operations Research Graphics and Visualization Import and Export Wolfram Language Modeling Statistics and Probability Events & Media

Investigation - power outages in the UK

Henry Jaspars

Posted 6 years ago

Introduction On Friday 9th August 2019, Britain suffered one of the worst power outages in recent years. Over 1 million people in England and Wales were cut off from power for nearly 9 hours, preventing numerous rail services from running during the rush hour. Was this a statistical misfortune, or a systemic failure which caused the recent power outages? Getting the data To obtain the information required to do analysis of the events leading up to the catastrophic power outage, we will we will use a fantastic website, GridWatch Templar, which has realtime data as to the different power supplies, resources and frequencies. From the download page, I downloaded all the data from 27th May 2011 to the current time. Be patient; it will take a considerable amount of time to download the entire dataset (and as it is too large, I have not attached this csv file to this page). Also, as we will be carrying out analysis on such a large dataset, some cells will take a considerable amount of time to run. Analysis of the frequencies First, we import the dataset: In[1]: data = Import["D:\\Programming\\GridWatch\\gridwatch.csv"]; The data which it contains: In[2]: data[[1]] Out[2]: {"id", "timestamp", "demand", "frequency", "coal", "nuclear", "ccgt", \ "wind", "pumped", "hydro", "biomass", "oil", "solar", "ocgt", \ "french_ict", "dutch_ict", "irish_ict", "ew_ict", "nemo", "other", \ "north_south", "scotland_england"} To look at the frequencies: In[3]: frequencies = If[#[[4]] == 0, Nothing, {DateObject[#[[2]]], #[[4]]}] & /@ data[[2 ;;]]; To show the minimum value: In[4]: MinimalBy[frequencies, Last] Out[4]: {{DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.], 48.889}} In[5]: DateListPlot[frequencies, PlotRange -> All, AxesLabel -> {"Time", "Hz"}] This is not particularly enlightening; although it does show how dramatic the recent drop of frequency is. In fact, it is known that if the frequency drops below 49.5 Hz, a blackout will occur - and this is the first time it has happened (at least since 2011). Let's dive deeper - we'll use a normal distribution in approximation to the distribution: In[6]: distribution = NormalDistribution[Mean[Last /@ frequencies], StandardDeviation[Last /@ frequencies]] Out[6]: NormalDistribution[50.0012, 0.0584802] Now plotting it gives: In[7]: Show[Histogram[Last /@ newfreqs, Automatic, "ProbabilityDensity"], Plot[PDF[distribution, x], {x, 49.7, 50.3}, PlotStyle -> Thick]] This returns: Observe how quickly this tapers off. Assuming the frequencies follow that normal distribution, the probability of this occurring is: In[8]: Probability[freq <= 49.5, freq \[Distributed] distribution] Out[8]: 5.119110^-18 So we can be very confident that this was not just a statistical mishap - there was a genuine, systemic cause for this event to occur (see the comment below for explanation of this). Now let's take a closer look at what happened that day. Focusing in further A National Grid spokesperson said: The root cause of yesterdays issue was not with our system but was a rare and unusual event, the almost simultaneous loss of two large generators, one gas and one offshore wind, at 4.54pm. We are still working with the generators to understand what caused the generation to be lost. (courtesy of the the Guardian for this report). We will verify this report below. First, we will zoom in on the interval between one day before and one day after the event occurred. In[9]: timeinterval = Select[data[[2;;]], DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.`] - Quantity[1, "Days"] <= DateObject[#[[2]]] <= DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.`] + Quantity[1, "Days"] &] (This specific time interval data is attached to this post). To visualise it the frequencies: In[11]: timeintervalfreqs = {DateObject[#[[2]]], #[[4]]} & /@ timeinterval; In[12]: DateListPlot[Out[51], PlotRange -> All, Epilog -> {Red, Text[Style["\[SixPointedStar]", 20], #] & /@ {{DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.`], 48.889`}}}] which shows: Now let's look at the specific types of energy generation: In[13]: DateListPlot[ Table[{DateObject[#[[2]]], #[[k]]} & /@ timeinterval, {k, 5, 21}], PlotLegends -> data[[1]][[5 ;;]], PlotRange -> All] which returns Now let's reduce down to the two energy sources which were mentioned in the report: namely, gas (or CCGT, OCGT), and offshore wind. In[14]: indices = {3, 7, 8, 14}; DateListPlot[ Table[{DateObject[#[[2]]], #[[k]]} & /@ timeinterval, {k, indices}], PlotLegends -> data[[1]][[indices]], PlotRange -> All, Epilog -> {Directive[{Thick, Red, Dashed}], Line[{{DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.`], 0}, {DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.`], 40000}}]}] This clearly shows a drop in CCGT and wind power (and also the demand) subsequent to the incident. Finding peaks for the CCGT data to find when the drop occurs: In[15]: ccgtdata = {AbsoluteTime[#[[2]]], -#[[8]]} & /@ timeinterval; In[16]: inter = Interpolation[ccgtdata[[All, 1]]]; In[17]: peaks = FindPeaks[ccgtdata[[All, 2]]]; In[18]: peak = Select[{inter[#1], #2} & @@@ peaks, #[[1]] >= AbsoluteTime[ DateObject[{2019, 8, 9, 15, 55, 37}, "Instant", "Gregorian", 1.`]] &][[1, 1]] (This calculates the first trough since the event took place) Out[18]: 3774356135 This means that, 20 minutes after the frequency collapses, the CCGT (and the Wind power) supply decrease dramatically. Conclusion At 3:55 PM GMT +1 (so really 4:55 PM, confirming the report - this is on account of BST) on Friday the 9th of August, for the first time since 2011 (and probably many years before), frequency reached levels below 49.5 Hz, at which point blackouts would take place, affecting nearly 1 million people and disrupting multiple rail services across England and Wales. According to the National Grid, this had almost happened 3 times before this one took place; our data can confirm this (see the graph below input 5). We can be certain that this was not a statistical oddity; in fact, around 20 minutes later, we begin to see that the CCGT and offshore wind supplies decrease, which confirms the report stating that both had cut out simultaneously. Hopefully this will never happen again. With this dataset at hand, it seems likely that these outages could be more foreseen in the future. Attachments:* data.csv

POSTED BY: Henry Jaspars

7 Replies

Sort By:

EDITORIAL BOARD

EDITORIAL BOARD, WOLFRAM

Posted 6 years ago

- Congratulations! This post is now featured in our Staff Pick column as distinguished by a badge on your profile of a Featured Contributor! Thank you, keep it coming, and consider contributing your work to the The Notebook Archive!

POSTED BY: EDITORIAL BOARD

Henry Jaspars

Posted 6 years ago

Good question - this is from a statistical hypothesis test (here, a one-tailed z-test, with p-value). Assume the frequencies, `F`, are similar to the distribution `NormalDistribution[50.0012, 0.0584802]`. Then we have two hypotheses: H0: This was caused by a statistical misfortune (the null hypothesis) H1: This was caused by a process problem (the alternative hypothesis) We will test this at the 99.99% significance level (so for us to be certain H1 was the case, the probability this would occur if the process was in statistical control must be below `? = 0.0001`). The probability this was an accident, assuming the process was in control, was `p = Prob[F < 49.5] = 5.1191*10^-18 < 0.0001 = ?`. So we have sufficient evidence, at the 99.99% significance level, to reject H_0. Therefore, we can be almost (but never completely; such is the way of statistics) certain that this could not be a statistical misfortune. So the process was not in control at the time of the blackouts.

POSTED BY: Henry Jaspars

Ahmed Elbanna

Ahmed Elbanna, Wolfram Research

Posted 6 years ago

Thank you for your detailed answer. Some parts still not sufficiently explained, for example you are using a variable called `newfreqs` when you are comparing the distribution of the frequencies with Normal distribution instead of the usual variable `frequencies` but you didn't show what this `newfreqs` is. I am asking about that because the frequencies are not normally distributed h = DistributionFitTest[Last /@ frequencies, Automatic, "HypothesisTestData"] h["TestDataTable", All] Long story short, I think it is too early to judge that the blackout is not by chance specially that it happened only twice in 8 years, so using the word systematic is harsh. You still can be write but it needs more investigation I think.

POSTED BY: Ahmed Elbanna

Henry Jaspars

Posted 6 years ago

With regards to the variable names, I have just amended that. Thank you for pointing that out! Although I agree that a normal distribution is not perfect here, observe how far the 49.5 Hz reading lies from the mean: In[1]: (Mean[Last/@frequencies] - 49.5)/StandardDeviation[Last/@frequencies] Out[1]: 8.57123 Regardless of choice of distribution, 8.57 standard deviations from the mean is a remarkable outlier. Which distribution would you suggest as opposed to the normal (I considered Student's T, but this was not successful)?

POSTED BY: Henry Jaspars

Kapio Letto

Kapio Letto, Consalting

Posted 6 years ago

Have you tried using FindDistribution? http://reference.wolfram.com/language/ref/FindDistribution.html

POSTED BY: Kapio Letto

Henry Jaspars

Posted 6 years ago

I had already tried this, but was not pleased with the results. In[1]: dist = FindDistrubution[Last/@frequencies] Out[1]: StudentTDistribution[50.0054, 0.0739269, 8.64764] In[2]: Show[Histogram[Last /@ frequencies, Automatic, "ProbabilityDensity"], Plot[PDF[dist, x], {x, 49.7, 50.3}, PlotStyle -> Thick]] which gave: which doesn't seem like a particularly good fit. Others from the best solutions do not seem particularly accurate either.

I had already tried this, but was not pleased with the results.

In[1]: dist = FindDistrubution[Last/@frequencies]

Out[1]: StudentTDistribution[50.0054, 0.0739269, 8.64764]

In[2]: Show[Histogram[Last /@ frequencies, Automatic, "ProbabilityDensity"], 
 Plot[PDF[dist, x], {x, 49.7, 50.3}, 
  PlotStyle -> Thick]]

which gave:

graph

which doesn't seem like a particularly good fit. Others from the best solutions do not seem particularly accurate either.

POSTED BY: Henry Jaspars

Ahmed Elbanna

Ahmed Elbanna, Wolfram Research

Posted 6 years ago

Nice post. I still don't see from where did you get the conclusion that the blackout is not just a statistical mishap. Can you explain it more to me please?

POSTED BY: Ahmed Elbanna

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback

Investigation - power outages in the UK

Introduction

Getting the data

Analysis of the frequencies

Focusing in further

Conclusion