# [WSG23] Daily Study Group: Introduction to Probability

Posted 1 month ago
3624 Views | 180 Replies | 60 Total Likes
A Wolfram U Daily Study Group on Introduction to Probability begins on February 27th 2023. Join me and a group of fellow learners to learn about the world of probability and statistics using the Wolfram Language. Our topics for the study group include the characterisation of randomness, random variable design and analysis, important random distributions and their applications, probability-based data science and advanced probability distributions.

The idea behind this study group is to rapidly develop an intuitive understanding of probability for a college student, professional or interested hobbyist. A basic working knowledge of the Wolfram Language is recommended but not necessary. We are happy to help beginners get up to speed with the Wolfram Language using resources already available on Wolfram U.

Please feel free to use this thread to collaborate and share ideas, materials and links to other resources with fellow learners.

REGISTER HERE
Posted 3 days ago
 In slide 8 for Lesson 16 it says that "From Newtonian physics, the horizontal distance function is v^2Sin[alpha]Cos[alpha]/g" . There should be a factor of 2 in the numerator. The rest of the slide correctly includes the factor of 2 in the calculations.
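For reference, the corrected formula follows from the double-angle identity:

```latex
R \;=\; \frac{v^2 \sin(2\alpha)}{g} \;=\; \frac{2\, v^2 \sin\alpha \cos\alpha}{g}
```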
Posted 2 hours ago
Hello Joseph,
Thank you for noticing; this will be corrected shortly.
Posted 3 days ago
Is there any significant reason Lesson 25 Exercise 1 is computed differently than the Buffalo snowfall example given in the lesson? The exercise estimates a distribution, while the lesson example takes the mean and standard deviation of the data. This results in slightly different standard deviations:

{Mean@data, StandardDeviation@data} // N
EstimatedDistribution[data, NormalDistribution[\[Mu], \[Sigma]]]

{33442.1, 4893.54}
NormalDistribution[33442.1, 4844.36]

Is there any meaningful difference?
Posted 1 hour ago
Hello,
Well, there is definitely a difference, and it's technical and significant. In the lesson, the emphasis is on showing the student the division of the standard deviation by the square root of the number of samples. You are asked for the probability of a mean, so you "created" the sampling distribution, and your probability applies only to means; your initial data was not means. In the exercise you mention, your data is already a set of means. Therefore, you can just estimate the distribution from the data and you have a sampling distribution. In that case you don't use the factor of the square root of the number of samples, and you get the right answer. It all comes down to what your initial data is: is it a set of means, or raw data from individual elements? I hope I made this clear.
Posted 3 days ago
Howdy Marc,
One of the questions on my version of the final exam appears to be missing the referenced "sample" code:

"What is the estimated variance of the savings ratio in the “Sample Data: Life Cycle Savings” dataset? The dataset is normally distributed. Use the following code to obtain the data."

Thanks, John
Posted 3 days ago
Hi John,
This is now fixed, thank you for noticing.
Posted 3 days ago
I found what I think is a simpler solution to Lesson 24 Exercise 4:

Nest[TransformedDistribution[x + y, {x \[Distributed] #, y \[Distributed] #}] &, BinomialDistribution[2, p], 9]
Posted 3 days ago
Also for Exercise 5 in the same lesson:

Nest[TransformedDistribution[x + y, {x \[Distributed] #, y \[Distributed] #}] &, NormalDistribution[2, Sqrt@31/2], 2]
Posted 3 days ago
Hi Parker,
You seem to be missing the point here. This is also a valid solution, but a much more obscure one. How would you explain that the standard deviation has to be divided by 2 for the addition of nested distributions? The result follows from the variance, but explaining it is difficult. The given solution restricts all standard deviations to integer values, which is not strictly necessary, but it facilitates the explanation and the calculation.
Posted 3 days ago
Hi,
Indeed, that solution is much more efficient. However, the average student may not be familiar with recursion. RecurrenceTable was used to show the recursive steps the function goes through, to allow solving with complete understanding. But yes, purely for computation, Nest is better here.
Posted 5 days ago
Copy and paste doesn't seem to work in the final exam. I tried to copy a line that starts out "Select[ExampleData" etc. and got the following:

StyleBox[Cell[Cell[RowBox[{RowBox[{"Select", "[", RowBox[{RowBox[{"ExampleData", "[", RowBox[{"{", RowBox[{"\"Statistics\"", ",", " ", "\"USEarthquakes\""}], "}"}], "]"}], ",", " ", RowBox[{RowBox[{RowBox[{"#", "[", RowBox[{"[", "1", "]"}], "]"}], " ", "<", " ", "1900"}], " ", "&"}]}], "]"}], "[", RowBox[{"[", RowBox[{"All", ",", " ", "7"}], "]"}], "]"}], "InlineCode", ExpressionUUID -> "a4b6a146-104b-452e-829d-1f5fde573d7b"], ExpressionUUID -> "304430c9-d846-45ec-9492-5c906e03147f"], "ProblemCaption", StripOnInput -> False]
Posted 6 days ago
On my exam, the following appeared:

LESSON 11: Which distribution best describes the following data? Hint: use FindDistribution and EstimatedDistribution.

But there was no data.
Posted 5 days ago
Hello,
We will correct this shortly; I'll post again to confirm once it has been corrected.
Posted 4 days ago
 The issue has been addressed and corrected.
Posted 8 days ago
In question 35 of the Practice exam, we consider the permutations of a group of letters to determine which are words. It seems that Mathematica finds an extra word if the letters are capitalized:

In[1]:= Tally[DictionaryWordQ /@ StringJoin /@ Permutations[{"r", "e", "s", "e", "t"}]]
Out[1]= {{True, 6}, {False, 54}}

In[2]:= Tally[DictionaryWordQ /@ StringJoin /@ Permutations[{"R", "E", "S", "E", "T"}]]
Out[2]= {{True, 7}, {False, 53}}

The word "STERE" is missing from the uncapitalized words...
Posted 7 days ago
Hello Byron,
Indeed, the answer would then be 7/60. This will be corrected. Thank you for noticing!
Posted 9 days ago
On Exercise 1 of Exercises-23.nb, I believe the solution shows the upper bound for the probability that x<=13 or x>=37. Then 1 minus that probability gives a lower bound for P(13<=x<=37). The same thing happens in Problem 2 of Quiz 6.
Also, for Problem 5 in Quiz 6, one answer shows the probability of x>260 using a normal distribution to approximate discrete data, but no answer shows for x>260+0.5.
On Exercises 1 and 2 from Exercises-24.nb, the solutions use the variance instead of the standard deviation as the second parameter for NormalDistribution[].
On Exercise 2 of Exercises-25, the solution uses different attributes.
Posted 7 days ago
Hello Juan,
This is new and will be corrected; indeed, it should be a lower bound, 1 minus the probability.
Why would this be +0.5? "At least" implies inclusion, so -0.5 is more appropriate.
Right on! We will correct this. Personally, this is one of my most recurring mistakes; I find it very counterintuitive to use the standard deviation as a parameter instead of the variance.
Thank you for catching that. Definitely a small but significant mistake. We will correct this.
Thank you a lot for all those corrections!
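To illustrate the -0.5 continuity correction numerically (in Python for convenience; the quiz's actual n and p are not shown in this thread, so Binomial(500, 0.5) below is purely hypothetical): approximating P(X >= 260) with a normal distribution works best when the threshold is shifted to 259.5:

```python
from math import comb, erf, sqrt

# Hypothetical example: X ~ Binomial(n, p); the quiz's actual parameters differ.
n, p = 500, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))

def norm_cdf(x):
    # Standard normal CDF via erf, shifted/scaled to N(mu, sigma).
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Exact binomial tail P(X >= 260).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(260, n + 1))

approx_corrected = 1 - norm_cdf(259.5)  # "at least 260" -> threshold 259.5
approx_raw = 1 - norm_cdf(260)          # no continuity correction

print(round(exact, 4), round(approx_corrected, 4), round(approx_raw, 4))
```

The corrected approximation lands much closer to the exact tail probability than the uncorrected one.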
Posted 9 days ago
For Exercises 10 Problem 5, I don't see how the condition x>1.5 in the calculation of the variance represents the problem statement of "greater than 15, given that it is at least 10".
Posted 8 days ago
Hello Joseph,
There seem to be two problems mixed together here, so this will be corrected to: "The distribution of values of the retirement package offered by a company to new employees is modeled by the probability density function 1/5 e^(-(1/5)(x-5)) for x>5. Calculate the variance of the retirement package value, given that the value is at least 10."
The code would be:

Expectation[x^2 \[Conditioned] x > 10, x \[Distributed] retirementDist] - Expectation[x \[Conditioned] x > 10, x \[Distributed] retirementDist]^2

This results in the same variance, however, since the variance here is condition-independent.
Thank you for noticing.
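A quick numerical check of that claim (sketched in Python for convenience): because the exponential distribution is memoryless, the variance conditioned on x > 10 equals the unconditional variance, 1/(1/5)^2 = 25:

```python
from math import exp

# PDF of the retirement-package distribution: (1/5) e^(-(1/5)(x-5)) for x > 5.
rate = 0.2
pdf = lambda x: rate * exp(-rate * (x - 5))

def integrate(g, a=10.0, b=300.0, steps=200_000):
    # Midpoint-rule integration; the tail beyond 300 is negligible here.
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

z = integrate(pdf)                           # P(x > 10) = e^(-1)
m1 = integrate(lambda x: x * pdf(x)) / z     # conditional mean -> 15
m2 = integrate(lambda x: x * x * pdf(x)) / z # conditional second moment

print(round(m1, 3), round(m2 - m1**2, 3))  # -> 15.0 25.0
```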
Posted 9 days ago
Hi,
Is the following set logic correct? The notes were not really clear:

P(A \[Union] B) = P(A) + P(B) - P(A \[Intersection] B)
P(A \[Intersection] B) = P(A') + P(B') - P(A' \[Intersection] B')

When creating a distribution using data from the repository, can the more recent data be weighted? I used the following to extract data; however, the FindDistribution function had a problem with the extracted data. Can you tell me what I am doing wrong?

data = QuantityMagnitude@Normal@ResourceData["Sample Data: Fisher's Irises"][All, {"SepalLength", "SepalWidth", "PetalLength", "PetalWidth"}];
FindDistribution[data]

I finished the quizzes last Friday and the final yesterday. Some of the questions were quite challenging. All in all, I really enjoyed your presentation and learned a lot about using Mathematica in calculating probability. However, I still have a few remaining questions.
Thanks again,
Mitch Sandlin
Posted 8 days ago
Hello Mitchell,
The first is true; the second should be

P(A' \[Union] B') = P(A') + P(B') - P(A' \[Intersection] B')

or

1 - P(A \[Intersection] B) = P(A') + P(B') - P(A' \[Intersection] B')

Could you tell me where you found this so that I may correct it?
Yes, you can assign weights to a dataset with WeightedData, for example:

data = RandomReal[{-5, 5}, 10^4];
weightedData = WeightedData[data, PDF[NormalDistribution[], #] &]
EstimatedDistribution[weightedData, NormalDistribution[\[Mu], \[Sigma]]]

Some functionalities, like FindDistribution, may not work with WeightedData, but most do. As mentioned when introducing FindDistribution, that function is only for univariate data. You may want to apply FindDistribution to each dimension (independence assumption) or, for example, use a general multinormal distribution with EstimatedDistribution (normality assumption, but dependence is allowed).
I'm glad you enjoyed the course. I wish you the best!
Posted 7 days ago
Hi Marc,
Thanks so much for the reply. To answer your question, I am not sure where I copied the code from, since I was using my notes. I am still not sure how the LHS of the statement is switched from a union to an intersection in the 2nd equation. For example, when you switch (A union B) to their complements (A', B'), does that switch the LHS to (A' intersection B')? I am confused as to how we get from a union to an intersection, i.e. from an "or" relationship to an "and".

P(A \[Union] B) = P(A) + P(B) - P(A \[Intersection] B)
P(A' \[Union] B') = P(A') + P(B') - P(A' \[Intersection] B')
1 - P(A \[Intersection] B) = P(A') + P(B') - P(A' \[Intersection] B')

Thanks again,
Mitch Sandlin
Posted 6 days ago
Hello Mitch,
In two words: De Morgan's!

P(A' \[Union] B') = P((A'' \[Intersection] B'')') (De Morgan's law of set theory)
P(A' \[Union] B') = P((A \[Intersection] B)') (double negation of set theory)
P(A' \[Union] B') = 1 - P(A \[Intersection] B) (complement law of probability theory)

These laws come in useful every once in a while. I hope that helps.
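The same identities can be spot-checked on a small finite sample space (a quick Python sketch; sets stand in for events, and probabilities are just relative sizes):

```python
from fractions import Fraction

U = set(range(12))            # a small sample space
A = {0, 1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
comp = lambda S: U - S        # complement within U
P = lambda S: Fraction(len(S), len(U))

# De Morgan: A' union B' = (A intersect B)'
assert comp(A) | comp(B) == comp(A & B)

# Hence P(A' ∪ B') = 1 - P(A ∩ B) = P(A') + P(B') - P(A' ∩ B')
print(P(comp(A) | comp(B)), 1 - P(A & B))
print(P(comp(A)) + P(comp(B)) - P(comp(A) & comp(B)))
```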
Posted 10 days ago
Dear All,
Exam Q11 may have missing data.
G
Posted 8 days ago
Hello,
If you are referring to the mock exam, I can't see anything missing. If you are referring to the actual Final Exam, the numbering is random, so I will need more information to be able to pinpoint the question.
Posted 10 days ago
Dear Marc,
Will the course materials page contain all corrections to the Study Group notebooks, e.g. the Exercises? I returned from Spring Break. There are too many discussions on this Community page to keep track of, so I prefer to download the latest notebooks to work from.
Cheers,
Dave
Posted 10 days ago
 The course materials have been updated with the corrections :)
Posted 3 days ago
Thanks for taking care of this! You guys are the best! I also started my own course, and with your help I overcame many problems.
Posted 8 days ago
Hello,
The latest notebooks are on the course framework, or will soon be. If you find an error, just Ctrl-F this page to see if it has been reported, or suggest a change if you can't find it. Most mistakes have been corrected in the framework by now. Hopefully that helps.
Posted 10 days ago
 How do we calculate the Kurtosis of a distribution coming from data, as problem 6 from quiz 3 asks?
Posted 10 days ago
Hello Juan,
As repeated throughout the course, it all comes down to your capacity to recognize where to apply each distribution. If you can recognize that the situation fits a specific distribution, use EstimatedDistribution with that abstract distribution and the data, then extract your measures from the estimated distribution. If you only have data and no other information, use FindDistribution and take your measures from the distribution it finds.
Posted 10 days ago
Marc:
That is what I did on the problem and got PoissonDistribution[3.71429], which has a Kurtosis of 3.26923, which does not appear as a valid answer on Problem 6 from Quiz 3.
Posted 8 days ago
Hello Juan,
I'm not sure how you are getting that result. The output of

EstimatedDistribution[{1, 3, 6, 0, 3, 4, 5, 7, 4, 2, 11, 1}, PoissonDistribution[\[Lambda]]]

is PoissonDistribution[3.91667].
Posted 6 days ago
That's interesting. The output of

FindDistribution[{1, 3, 6, 0, 3, 4, 5, 7, 4, 2, 11, 1}, TargetFunctions -> "Discrete"]

is PoissonDistribution[3.71429].
Posted 5 days ago
Hello Michael,
Indeed! This may seem weird at first, but it comes down to the approach. If you know what shape of PDF you're facing, then finding the parameters corresponding to that PDF is a problem of classical probability theory. FindDistribution, however, is much more complex and based on heuristics, which makes it much more uncertain: it tries to imagine the data that is not there. For example, with minimal data it will prefer the uniform and normal distributions, as they are common, even when the shape is different. This is why the course tends to emphasize EstimatedDistribution whenever you have even a little more information than raw data.
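For the kurtosis numbers upthread: the (non-excess) kurtosis of a Poisson distribution is 3 + 1/λ, which is why the two estimated parameters give slightly different answers. A quick cross-check of the arithmetic (in Python here; Kurtosis@PoissonDistribution[λ] gives the same values):

```python
# Non-excess kurtosis of PoissonDistribution[lambda] is 3 + 1/lambda.
def poisson_kurtosis(lam):
    return 3 + 1 / lam

print(round(poisson_kurtosis(3.71429), 5))  # FindDistribution estimate  -> 3.26923
print(round(poisson_kurtosis(3.91667), 5))  # EstimatedDistribution MLE -> 3.25532
```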
Posted 10 days ago
How do we calculate the variance for a normal distribution when it is not indicated? Problems 1 and 2 from Exercises-18.nb do it differently.
Posted 10 days ago
Hello Juan,
The normal distribution is the distribution of approximations; it is used to approximate two major distributions: the binomial and the Poisson. In both approximations, just use the mean and variance of the exact distribution you're trying to approximate. If it's the binomial, take the binomial's mean and variance for the normal; if it's the Poisson, take the Poisson's mean and variance.
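In symbols, the two standard approximations (note that NormalDistribution takes the standard deviation, not the variance, as its second argument):

```latex
\mathrm{Binomial}(n,p) \approx \mathcal{N}\!\left(np,\; \sqrt{np(1-p)}\right),
\qquad
\mathrm{Poisson}(\lambda) \approx \mathcal{N}\!\left(\lambda,\; \sqrt{\lambda}\right)
```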
Posted 10 days ago
 Thank you.
Posted 11 days ago
For exercises08 exercise 2, given as "What is its expectation of the function Binomial[10,i]/2^10 for 0<=x<=10?", what is the "its" referred to in the problem statement? The solution seems to be calculating the expectation of x for the distribution Binomial[10,i]/2^10.
Posted 11 days ago
Hello Joseph,
Indeed, it should be a "the", not an "its". A simple mistake; it will be corrected. Thank you.
Posted 13 days ago
 I also get a different output from Lesson 14 Exercise 4 with the same input as the exercise gives: EstimatedDistribution[{3, 3, 10, 6, 6, 4, Sequence[ 5, 9, 3, 4, 7, 4, 7, 10, 8, 5, 6, 7, 11, 10, 5, 9, 7, 8, 6, 5, 6, 7, 6, 8, 12, 9, 6, 3, 9, 5, 7, 5, 2, 9, 3, 5, 9, 9, 3, 5, 3, 8, 5, 6, 5, 4, 7, 10, 6, 7, 8, 8, 11, 9, 8, 8, 9, 3, 11, 8, 7, 10, 5, 4, 5, 10, 4, 8, 7, 7, 4, 3, 5, 10, 5, 4, 11, 5, 6, 10, 5, 7, 10, 11, 7, 5, 4, 7, 9, 5, 4, 5, 7, 5, 10, 11, 10, 5, 5, 7, 4, 7, 5, 4, 3, 4, 7, 10, 4, 8, 2, 7, 4, 4, 8, 4, 8, 8, 3, 9, 7, 7, 7, 7, 10, 5, 9, 8, 11, 6, 8, 7, 7, 8, 3, 6, 7, 6, 7, 8, 8, 7, 2, 3, 4, 9, 7, 7, 6, 4, 10, 6, 4, 8, 10, 7, 3, 10, 6, 6, 6, 5, 9, 7, 11, 6, 7, 1, 4, 8, 8, 5, 5, 2, 8, 6, 7, 7, 5, 5, 6, 5, 6, 2, 12, 7, 6, 5, 7, 5, 9, 6, 4, 8, 3, 8, 3, 7, 6, 3, 10, 6, 3, 6, 7, 8, 7, 3, 7, 4, 5, 4, 10, 8, 7, 10, 10, 7, 5, 9, 5, 4, 6, 4, 6, 11, 7, 9, 9, 6, 7, 4, 6, 7, 5, 5, 5, 5, 6, 4, 8, 4, 8, 7, 6, 4, 4, 5, 7, 8, 4, 2, 1, 5, 9, 2, 6, 11, 5, 4, 5, 12, 7, 7, 7, 0, 3, 7, 4, 6, 11, 5, 3, 5, 8, 4, 5, 2, 3, 8, 8, 6, 6, 1, 9, 4, 3, 8, 5, 4, 4, 5, 4, 5, 6, 6, 5, 7, 6, 1, 7, 3, 9, 8, 4, 8, 2, 9, 7, 13, 5, 5, 2, 8, 12, 8, 5, 5, 2, 3, 4, 9, 11, 5, 6, 10, 5, 5, 5, 6, 5, 4, 3, 8, 8, 12, 7, 7, 8, 11, 2, 9, 10, 5, 4, 2, 8, 9, 6, 8, 7, 6, 1, 8, 7, 9, 10, 5, 10]}, PoissonDistribution@\[Mu]] Variance@% PoissonDistribution[6.31507] 6.31507 
Posted 12 days ago
Hello Parker,
Indeed, I'm not sure what gave such a weird output. It will be corrected. Thank you for noticing.
Posted 13 days ago
The values used in Lesson 14 Exercise 2 do not match the values given in the question. For that question I get the following input and output:

Probability[x >= 25, x \[Distributed] PoissonDistribution[0.06 500]] // N
Probability[x >= 25, x \[Distributed] BinomialDistribution[500, 0.06]]

0.842758
0.850619

Is this correct?
Posted 13 days ago
On Exercise 5 of Exercises-7 the solution gives P(17=17).
On Exercise 4 of Exercises-8 the solution gives P(A)^3, which I believe is the probability of being absent two days for three months in a row. I calculated the answer as P(x+y+z>=2), where each is distributed by 1/(e x!).
On Exercises 13, I believe that more than one solution excludes equality when it should include it.
Am I wrong in these?
Posted 12 days ago
Hello Juan,
All correct! Thank you for noticing.
Indeed, your answer makes more sense given the formulation. It will be corrected.
Indeed, the formulation is too ambiguous; it will be corrected to make clearer what is needed.
Yes, for Exercises 2 and 4, some equalities are missing from the final solution.
Thank you!
Posted 12 days ago
Problem 5 is stated as: "Consider a zoo visitor who arrives less than three hours before closing. How likely is that person to be able to stay for more than an hour?" What we have calculated with Probability[17 < x < 19, x \[Distributed] zooDist] is the probability that a visitor, among all the visitors of the day, will arrive between 17 and 19. Based on the way the problem is stated, wouldn't it be more correct to calculate the probability that a visitor arriving between 17 and 20 (less than three hours before closing) actually arrives between 17 and 19? Based on this reasoning, shouldn't the correct approach be to calculate

Probability[17 < x < 19, x \[Distributed] zooDist]/Probability[17 < x < 20, x \[Distributed] zooDist] ?
Posted 11 days ago
Hello Joseph,
This is indeed also correct, and is an equivalent formulation of:

Probability[x < 19 \[Conditioned] x >= 17, x \[Distributed] dist]

as mentioned previously in this post.
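To see why the two formulations agree, here is a sketch in Python with a hypothetical zooDist (arrivals uniform on [9, 20], closing at 20; the lesson's actual distribution may differ, but the equivalence is the same whenever the support ends at closing time):

```python
from fractions import Fraction

def p(a, b, lo=9, hi=20):
    """P(a < X < b) for X uniform on [lo, hi] (hypothetical zooDist)."""
    return Fraction(max(0, min(b, hi) - max(a, lo)), hi - lo)

ratio = p(17, 19) / p(17, 20)              # P(17<x<19) / P(17<x<20)
conditional = p(17, 19) / (1 - p(9, 17))   # P(x<19 | x>=17); support ends at 20

print(ratio, conditional)  # -> 2/3 2/3
```

Since no visitor arrives after 20, conditioning on x >= 17 is the same as conditioning on 17 <= x < 20, which is exactly the ratio form.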
Posted 13 days ago
I agree with you on both, Robb.
Posted 12 days ago
Hello Parker,
Indeed, it seems the question and solution numbers don't correspond. This will be corrected. Thank you for noticing.
Posted 4 days ago
 Same question here. I obtained the same solution as shown in this post.
Posted 13 days ago
Marc:
I don't quite understand your statement "as soon as I have three balls the game is over". I did assume that once "you" draw a red ball the game is over. I reproduced the terms in your sum, but if you carry the tree to the end you get some cases where there are 3 balls left and "you" have not yet drawn a red ball. I work this out in the attached notebook.
Posted 13 days ago
Hello Joseph,
It seems the solution for this problem is wrong and too complicated. To avoid this, let's use the power of the Wolfram Language. Here is the new solution.
Let's compute all arrangements of balls between you and your friend, where red balls are negative and black balls are positive. We are only interested in the balls received by the player, so let's take the odd columns (odd rounds):

possibilities = Permutations[{-1, -2, 1, 2, 3, 4}][[All, {1, 3, 5}]];

Now that we have the balls received in all possibilities, just measure the number of possibilities with at least one negative number against the total number of possibilities:

Length@Select[possibilities, AnyTrue[#, Negative] &]/Length@possibilities

And we get 4/5, as mentioned by William Weller.
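The same enumeration in Python, mirroring the Wolfram code above (red balls are negative; the player receives the balls drawn in rounds 1, 3, 5):

```python
from fractions import Fraction
from itertools import permutations

# Red balls are -1 and -2; black balls are 1..4. The player gets rounds 1, 3, 5.
balls = [-1, -2, 1, 2, 3, 4]
player_hands = [p[0::2] for p in permutations(balls)]  # positions 1, 3, 5

wins = sum(1 for hand in player_hands if any(b < 0 for b in hand))
print(Fraction(wins, len(player_hands)))  # -> 4/5
```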
Posted 13 days ago
 Thanks for your response. I could not help thinking that there must be an easier way to solve this problem than by drawing a complex tree!
Posted 13 days ago
 WOW! I worked through your solution. It truly was an elegant tour de force!
Posted 13 days ago
Hi Marc,
I am reviewing the "Mock Exam" notebook and have found that the Question 31 solution is not correct. Given U, the transformed distribution with U-0.3 must be stated as:

Plot3D[{PDF[DirichletDistribution[{5, 3, 2}], {x, y}], PDF[TransformedDistribution[{a - 0.3, b - 0.3}, {a, b} \[Distributed] DirichletDistribution[{5, 3, 2}]], {x, y}]}, {x, 0, 1}, {y, 0, 1}, Filling -> Bottom, PlotRange -> All, PlotLegends -> {"U", "U-0.3"}]
Posted 13 days ago
Hello Hee-Young,
Yes, it seems obvious that I forgot to adjust the question phrasing to U+0.3. Thank you for noticing; it will be corrected.
Posted 14 days ago
 Hi Marc, It seems there are problems in both Quiz 6 #1 and #6, both of which are calculating the probability by normal approximation to binomial distribution. Please check these two questions in the "framework". Attachments:
Posted 14 days ago
Hello Hee-Young,
For the first question, you seem to be assuming Probability and NProbability are entirely equivalent, but they are not. NProbability is and will always be an approximation, sometimes a bad one. For this course, I suggest sticking to Probability, or N@Probability or Probability[...] // N, as this is more likely to give a good approximation. Question 1, with the binomial distribution, is one of those examples.
As for Question 6, I'm not sure what the issue is. You got the right answer with your normal approximation.
Posted 14 days ago
Thanks for this explanation, Marc. I have tried again using N@Probability. The 1st question still indicates that my response is wrong. As for the 6th question, it seems there is no correct answer among the multiple choices. I suspect there is either a technical glitch or wrong syntax in the "Framework". Please double-check.
Best wishes,
Posted 12 days ago
 I found the issue! Thank you for noticing, it will be corrected.
Posted 15 days ago
Hi Marc,
Can you quickly check whether there is any technical error in your courseware ("framework")? I am sorry, I figured out my mistake. The correct syntax is:

NProbability[x > 40, x \[Distributed] BinomialDistribution[150, 0.2633]]
Posted 17 days ago
The Daily Study Group "Introduction to Probability" quizzes and Level 1 certification have deadlines within the next two weeks. Next week is Spring Break; personally, I have been and will be very time-constrained for the next few weeks. Alternatively, I suppose we can also complete the course on the Wolfram U page and request the Level 1 exam upon completion?
Posted 17 days ago
 Thanks for making the point about spring break, @Dave Middleton. We will extend both deadlines by a week, so quizzes should be done by March 24, and the exam by March 31. Yes, you can earn both completion and Level 1 independently in the interactive course, but course completion will require that you watch the video lessons within the framework. We run custom data pulls in order to verify completion with the Study Group.
Posted 16 days ago
 Thank you Jamie. I plan to use the Study Group materials if time permits.
Posted 17 days ago
Hi,
When you obtain a probability, it is easy to understand its exact meaning. However, Expectation and RandomVariate are not so easy to understand. When would you use these two, and what exactly are they telling me?
Posted 16 days ago
Hello Mitchell,
Let's go back to basics. A probability distribution is a set of values associated with a set of probabilities. For a die, the values are {1,2,3,4,5,6} and the probabilities are {1/6,1/6,1/6,1/6,1/6,1/6}. The probability is the likelihood of getting a certain value.
RandomVariate is essentially a simulation: it randomly gives you back a value based on its probability in the context of the distribution. You use it to simulate the distribution, to get an intuition of what the distribution is all about, or to create artificial data if needed.
Expectation implements the following formula: the sum or integral over all values of the probability times a mathematical expression of the value. This formula turns out to be really useful, as demonstrated in the lesson on expectation. It expresses the mean or average of a given expression, where the expression has the values of the distribution as its variable. The expectation of the value itself is the mean, or average, of the distribution. But as you can see in the lesson, the Expectation formula can be used for many other measures; the interpretation depends on the expression used. You use it to find the expected, predictable value that will come, or at least close to it.
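To make the die example concrete, here is the same idea sketched in Python (the Wolfram equivalents are Expectation and RandomVariate; the helper names below are just for illustration):

```python
from fractions import Fraction
import random

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

# Expectation: sum of probability * expression-of-value over all values.
def expectation(expr):
    return sum(p * expr(v) for v, p in zip(values, probs))

mean = expectation(lambda v: v)                      # E[X] = 7/2
variance = expectation(lambda v: v * v) - mean ** 2  # E[X^2] - E[X]^2 = 35/12
print(mean, variance)

# RandomVariate: simulate draws from the distribution.
sample = random.choices(values, weights=[1] * 6, k=10)
print(sample)  # random each run
```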
Posted 17 days ago
I have been puzzled about conditional probability: is the situation usually as clear-cut as "the probability of an event occurring, given that another event has already occurred"? It seems to me that in real-world situations there can be all kinds of complex correlations between the events and the external world that are not captured by the basic conditional probability formula.
Posted 17 days ago
Hello Anders,
Indeed, a lot can go into conditional probability. While the formula is correct, it's often difficult to pinpoint exactly the conditions of an event. Outside of quantum physics, most events can be predicted based on a variety of factors. In data science and statistics, we often talk about measured and unmeasured attributes to explain randomness. Overall, it boils down to this: measure everything that you think may have an impact. Based on this data, test the predictability of your next point. If it is predictable enough, you measure enough; if not, you might be missing too much information, and it's back to the drawing board.
Posted 17 days ago
 Interesting, thanks.
Posted 17 days ago
In yesterday's session about joint distributions, the example of the Dirichlet distribution (about cutting a rope, at about 4:30 in the framework video, https://www.wolframcloud.com/obj/online-courses/introduction-to-probability/joint-distributions.html) is completely different from the example in the notebook ("Lesson 22 - Join Distributions.nb", slide 8, as well as in the framework lesson notebook), which is about the mileage of a car. Isn't the rope example more typical of the Dirichlet distribution than car mileages?
Posted 17 days ago
Hello Hakan,
Indeed, the version on the framework seems to be the wrong one! Thank you for telling us; this will be corrected shortly.
Posted 17 days ago
 I would like to suggest another clarification to section 3.
Posted 17 days ago
Hello Joseph,
First, since we aim to fit every phrase on a single line, we tend to shorten definitions like this one in the interest of simplicity, while maintaining correctness. Second, I think I actually disagree with your definition. Set theory is more basic than probability theory, that much should be clear. Thus, set operation definitions should not depend on the probability-theoretic definitions of event and sample space. So the vocabulary is, dare I say, anachronistic. In set theory, your confusion is actually correct: due to the trivial existence of the theoretical universal set, the complement of a given set can be used despite not referring to any exterior set. It's just another variable that may or may not contain every possible element. The generality of the statement doesn't have to be lost. So in that sense, the definition we gave is valid, and yours is too restrictive. However, as we discussed in the study group, computers are rarely so theoretical. There are no purely mathematical sets in the Wolfram Language, only lists. In the same way, there is no purely theoretical complement implemented, only a computational approximation of that concept.
Posted 17 days ago
 Suggestion for Clarifying Section 3 Slide 6
Posted 17 days ago
Hello Joseph,
Your suggestion is actually what was initially intended, but here is the issue: for this demonstration, no interactivity is possible due to the clickable interactions and the framework of Wolfram U. This initially led to some confusion in the early stages. Our solution was to give a more print-friendly graph, keep the interactivity in the video, and give the link to the demonstration for those who wanted to experiment with it (as you did?). Seeing this also led you to more confusion, I'll consider rebuilding that demonstration myself with the Wolfram U framework in mind.
Posted 18 days ago
For Section 2 Exercise 3, how do all the paths to (4,2) add up to 7 segments? It looks like 6 to me.
Posted 18 days ago
Hello Joseph,
Indeed, this is an error, as noted in the post by Juan Ortiz Navarro; here's my answer: "As for problem 3, this is also an error, but the answer should be Binomial[6,2] or Multinomial[4,2], yes. Thank you for informing us!" For the solution to make sense, just change the question to (0,0) to (4,3).
Posted 19 days ago
Hi! In the combinatorics at the start of the course, we saw how to count [No Replacement + Order Not Relevant] --> Binomial. What would be the approach to count [Replacement + Order Not Relevant]? Maybe we saw that, but I can't find it. Thanks!
Posted 18 days ago
Hello J,
For combinatorics, it's easier to think of it as a tree of decisions rather than a matrix. Some cases may not correspond to any real problem. The binomial is [Order Not Relevant + 1 group], and we usually assume without replacement. Why? Because a set with multiple copies of the same element is still the same set! Consider this:
A set is a collection of non-repeated elements without order.
A multiset is a collection of elements without order.
Therefore, the approach to count [Replacement + Order Not Relevant] requires the definition of a multiset. These multisets can be counted using Binomial[n+k-1, n]. See this for more explanations.
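A quick sanity check of the multiset count (in Python; here k is the number of types and n the number of elements chosen, so the stars-and-bars count is Binomial[n+k-1, n]):

```python
from itertools import combinations_with_replacement
from math import comb

k, n = 5, 3  # choose n elements from k types, with replacement, order not relevant

# Enumerate the multisets directly...
multisets = list(combinations_with_replacement(range(k), n))

# ...and compare with the stars-and-bars formula Binomial[n+k-1, n].
print(len(multisets), comb(n + k - 1, n))  # -> 35 35
```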
Posted 18 days ago
Thanks Marc. I do believe there may be some impossible (in the sense of not meaningful) cases. I am still trying to figure out what makes sense to ask and what doesn't. But OK, here I go with a clarification of my question. I still believe this is applicable in real life, but I may be wrong. I tried to depict the situation in an "entry level skills" notebook, so my question is easier to assess.
Posted 18 days ago
Hello J,
Here's one way to reformulate your situation: out of a group of 5, you want to choose 1 to 3 elements. Thus,

Binomial[5, 3] + Binomial[5, 2] + Binomial[5, 1]

or equivalently:

Sum[Binomial[5, i], {i, 1, 3}]

This gives you the 25 you got. Choosing from a range of group sizes may indeed be required in some situations.
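The same sum in Python, for a quick check of the arithmetic:

```python
from math import comb

# Choosing 1 to 3 elements out of a group of 5, mirroring the Wolfram sum above.
total = sum(comb(5, i) for i in range(1, 4))  # C(5,1) + C(5,2) + C(5,3)
print(total)  # -> 25
```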
Posted 18 days ago
 Thanks Marc, thanks a lot!
Posted 19 days ago
The solution for Exercise 1 from Section 2 seems disconnected. Q: "Twenty-five runners compete in the 200m event. How many top 10 arrangements are possible?" A: "The arrangements are ordered, but you were only asked to consider 10 elements out of 25. Thus, this corresponds to a permutation: y1 = Integrate[(-s + 0), {s, 0, x}]". What does this statement have to do with the problem? Shouldn't the answer be 25!/15!?
Posted 18 days ago
 Hello Joseph, Indeed, on problems 1 and 3 there seems to be text from another part of the course. This is an error, as mentioned by Juan Ortiz Navarro (posted 2 days ago): "Regarding exercises-02.nb: The solution to problem 1 (...)" So yes, the answer is 25!/15!.
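For anyone who wants to verify the count numerically, here is a quick Python cross-check of 25!/15!, i.e. the number of ordered top-10 selections from 25:

```python
from math import factorial, perm

# Top-10 arrangements of 25 runners: ordered selection of 10 from 25.
answer = factorial(25) // factorial(15)
assert answer == perm(25, 10)  # the same falling factorial
print(answer)
```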
Posted 18 days ago
 Thanks!
Posted 19 days ago
 On Exercise 4 of exercises-06.nb, "A smoker is twice as likely to have an ectopic pregnancy as a non-smoking pregnant woman." is denoted as P(E|S)=2P(E). I thought it should be denoted as P(E|S)=2P(E|S'), giving P(S|E) = P(E|S)P(S)/(P(E|S)P(S)+P(E|S')P(S')) = 0.461538. How can P(E|S)=2P(E|S') be said in words?
Posted 19 days ago
 Hello Juan,I agree. This is a mistake, it should say: A smoker is twice as likely to have an ectopic pregnancy as any pregnant woman. That way the statement would make sense. Your calculation makes sense considering the formulation. Thank you for noticing, it will be corrected.
Posted 14 days ago
 Thanks @Juan Ortiz Navarro, I had just calculated the same solution using Bayes' theorem. As the solution of "Exercise 4" in the "exercises-06" notebook is different from mine, I consulted this community page. P(S|E) = 2 / (1 + 1/0.3) = 0.46. Personally, I find the original wording and this solution more interesting.
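A small Python sketch of this Bayes computation (assuming, as in the exercise data, a smoking prevalence of P(S) = 0.3, and the reading P(E|S) = 2·P(E|S'); the unknown base rate P(E|S') cancels out):

```python
from fractions import Fraction

p_s = Fraction(3, 10)  # P(S), the assumed smoking prevalence from the exercise
# P(S|E) = 2p*P(S) / (2p*P(S) + p*(1 - P(S))); the factor p = P(E|S') cancels.
p_s_given_e = 2 * p_s / (2 * p_s + (1 - p_s))
print(float(p_s_given_e))  # ≈ 0.4615
```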
Posted 19 days ago
 Hello all, For a bit of context on the exponential family of distributions, here are some resources: a well-made short video series introducing the subject, by Mutual Information; an introductory article to get familiar with it, from Berkeley EECS; a short textbook of Statistical Theory focused on the exponential family, from the University of Oxford; and a paper on the link with machine learning, from Princeton University. The exponential family is usually covered in any course on Statistical Theory. Go satisfy your curiosity!
Posted 19 days ago
 Mind-blowing! (For me, at least; surely there are more advanced users for whom this may already be natural.) But now I can appreciate the flexibility (and why not, the beauty) of the exponential when "connected" to other "devices" to cast a spectrum of other functions. I never expected that to be possible.
Posted 20 days ago
 Lecture 1: How many different teams are possible, given that a team must include 3 Swiss (the original group is 4) and 2 Ethiopians (the original group is 5)? You give the answer Binomial[4,2]*Binomial[5,2]; is this the correct answer? How did you derive it? Is my answer Binomial[4,3]*Binomial[5,2] not correct?
Posted 20 days ago
 Hello Alex,As noted in the post by John Burke, there is a mistake and the answer is 40, with Binomial[4,3]*Binomial[5,2].
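For completeness, the corrected count checks out numerically (trivial, but here it is as a quick Python check):

```python
from math import comb

# 3 Swiss chosen from 4, times 2 Ethiopians chosen from 5.
teams = comb(4, 3) * comb(5, 2)
print(teams)  # 40
```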
Posted 20 days ago
Posted 20 days ago
 Hi Zbigniew, The first is also our mistake, now that your post is edited; refer to the original post by John Burke. The second is a mistake and will be corrected. Thank you for telling us!
Posted 20 days ago
 It's funny that, commenting on typos, I made a typo myself by inexplicably replacing a multiplication sign with a plus sign. This notwithstanding, the first is also a mistake. In your documentation, the calculation reads Binomial[4, 2] * Binomial[5, 3], which equals 60, whereas it should read Binomial[4, 3] * Binomial[5, 2], which equals 40. NOTE: I edited my original post and fixed my typo and its consequence.
Posted 20 days ago
 Hello. While reading the second exercise, which states "While surfing the web, you encounter ad A 7 times and ads B and C 3 times each. How many arrangements are possible?", I interpreted it as having a sequence like AAAAAAABBBCCC (or any other with seven As, three Bs and three Cs, as the precise sequence is not established). So I understood it as n=3 (A, B, C), with value reuse allowed, and k (trials) = 13, since the sequence provided, whatever the order, is 13 positions long. So my solution before reading the answer was n^k -> 3^13 possible arrangements. But in the answer I found "multinomial". What part of the question suggests that the multinomial approach is the right one? Thanks
Posted 20 days ago
 Hello J, That's a pretty fun question actually. Let's go through it. You encounter A 7 times, B 3 times and C 3 times, 13 in total. In how many orders could this have happened? You have a limited resource of 7 As, 3 Bs and 3 Cs, so the count of orderings starts from 13!, not 3^13, because 3^13 would include sequences that don't match the given number of each ad. But can you distinguish one A from another A? No, the repeated elements are indistinguishable. So you need to divide by 7! for all the orders of the As, 3! for the Bs, and 3! for the Cs. You get 13!/(7!3!3!), which is Multinomial[7,3,3].
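The multinomial count is easy to cross-check (a Python sketch; 13!/(7!·3!·3!) can also be built by choosing positions for each ad type):

```python
from math import factorial, comb

# 13 ad encounters: 7 A's, 3 B's, 3 C's -> Multinomial[7, 3, 3].
multinomial = factorial(13) // (factorial(7) * factorial(3) * factorial(3))
# Equivalent: choose the 7 positions for A, then 3 of the remaining 6 for B.
assert multinomial == comb(13, 7) * comb(6, 3)
print(multinomial)  # 34320
```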
Posted 20 days ago
Posted 20 days ago
 Hello J, To be clear, I think your situation could also be a valid one. When you read the problem, you are given a specific instance of what happened: 7, 3, 3. Given that specific instance, I ask about the unknown information: the ordering. But I might instead have asked given only 13 ads and 3 ad types, or given only 3 ad types (even more general). You need to be careful about what the specific situation is and not generalize too fast.
Posted 20 days ago
 Thanks, Marc. I'll keep that advice in mind!
Posted 20 days ago
 Regarding exercises-04.nb: On problem 4, I think the solution is not complete. I believe the answer displayed is for P(R' intersection B'), to which one needs to add 1-P(R) and 1-P(B). Or: P(R' union B') = 1 - P(R intersection B) = 1 - 0.1 = 0.9. What mistake am I making?
Posted 20 days ago
 Hello Juan, This was addressed earlier in this thread, but the 0.1 is a mistake. Here is the solution: P(A'[Intersection]B') = P((A[Union]B)') = 1 - P(A[Union]B) = 1 - 0.4 = 0.6, basically using De Morgan's law of sets. Thus, the probability that neither A nor B occurs is 0.6. Note that 1-P(R) = P(R') and 1-P(B) = P(B'). The probability that neither happens means both are false at the same time, thus P(A'[Intersection]B'), a conjunction. Hopefully that answers your question, although I'm not entirely sure. Let me know if you're still confused.
Posted 20 days ago
 Thanks Marc. I was confused on "neither red nor blue". I understand now that it is a conjunction.
Posted 21 days ago
 Regarding exercises-02.nb: The solution to problem 1 seems to take a turn I do not understand. Should it be factorial(25)/factorial(10)? And on problem 3, going from (0,0) to (4,2), are the paths of length 7 or 6? So Binomial[6,2] or Multinomial[4,2]?
Posted 20 days ago
 Hello Juan, exercises-02, Problem 1: honestly, this issue is a great surprise to me. It definitely took a turn I was not expecting either. Thank you for noticing. This will be corrected, but here is the answer: 25!/15!, or equivalently: FactorialPower[25, 10] As for problem 3, this is also an error; the answer should indeed be Binomial[6,2], or equivalently Multinomial[4,2]. Thank you for informing us!
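A brute-force Python check of the lattice-path count (4 right-steps and 2 up-steps, so paths of length 6):

```python
from math import comb
from itertools import permutations

# Paths from (0,0) to (4,2): 6 unit steps, 4 of them R and 2 of them U.
paths = set(permutations("RRRRUU"))
assert len(paths) == comb(6, 2)  # Binomial[6,2] == Multinomial[4,2]
print(len(paths))  # 15
```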
Posted 20 days ago
 Thanks. I meant to write 25!/15! indeed.
Posted 21 days ago
 While trying to confirm concepts from the course against a third-party reference, I found the attached. It states that the "sample space" for the times to failure of a certain machine is a sequence (T1...Tn). In my opinion, a sample space must contain all possible times to failure, as opposed to a certain specific sample. So I'd say the "space" would be all the real numbers, or perhaps all the real numbers below the maximum allowable age of the equipment being analyzed. Am I right that the reference is confusing a "sample space" with a specific sample? I'd appreciate your comments. Jorgea
Posted 20 days ago
 Hi J, This is a bit complicated, so let's discuss it one thing at a time. This textbook expresses a sample space where each data point is itself a sequence of multiple numbers. In that sense, this is a multivariate random variable. Within a single sample, or data point, there are multiple times, which are the periods of time between breakdowns. Say there are n such periods in each sample. Thus, the domain of any single outcome is R(>0)^n, that is, the positive reals in n dimensions (periods of time are always positive). In this context, the sample space is the set of all possible sequences of periods between breakdowns, possibly R(>0)^n itself. So overall, I believe you are right about your sample space, but also that you are misinterpreting their explanation of it. Hopefully that helps.
Posted 20 days ago
 Oh!!!! Aweeeeeesomee!!! Thanks...thanks a lot... I see my mistake!
Posted 21 days ago
 Hi Marc, In Lesson 7, Slide 10 and 11, you showed two different examples of how to load and aggregate. I have difficulty in understanding how to aggregate :height[h_?(60 <= # <= 76 &)] := aggdata[[2, h - 59]]/Length[data] and roll[n_Integer?(2 <= # <= 12 &)] := dice[[n - 1, 2]]/36Could you elaborate a bit, or use simpler and understandable codes?Thanks, Lewis
Posted 21 days ago
 Hi Laising, Lesson 7 is infamously trying to use data functionality that we don't have access to at that point in the course. That is, you could accomplish the same thing using EmpiricalDistribution, SmoothKernelDistribution or HistogramDistribution. But here's an explanation of what I'm doing to avoid those functions: (1) aggregate the data by value and frequency using the Tally function; (2) normalize the frequencies by dividing the list of frequencies by the total number of occurrences, which becomes your list of probabilities; (3) map the probabilities to the correct values. You now have your PDF. I also bound the values of the PDF to make clear what the domain of my values is. Again, this is not necessary for you, as it will be better addressed in later lessons; this was merely a first jab at the subject of data-driven distributions.
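The tally-and-normalize recipe above, sketched in Python with made-up sample data (the variable names and values here are illustrative, not the course dataset):

```python
from collections import Counter
from fractions import Fraction

# Hypothetical observed values standing in for the course data.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]

# 1. Aggregate by value and frequency (like Tally).
tally = Counter(data)
# 2. Normalize by the total number of occurrences.
# 3. Map each value to its probability: this is the empirical PDF.
pdf = {value: Fraction(count, len(data)) for value, count in tally.items()}

assert sum(pdf.values()) == 1  # a valid PDF sums to 1
print(pdf[4])  # 3/10
```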
Posted 22 days ago
 Marc,You asked for mistakes in the documents. So, please have a look at “Lesson 2, Slide 9”: Your equations are wrong, which means LHS is not equal to RHS. Right?
Posted 22 days ago
 Indeed, the LHS exponent should be 5, not 3. This will be corrected. Thank you for noticing and informing us.
Posted 22 days ago
 When is the new study group for Introduction to Probability starting? I missed last week and would rather start from the beginning without worry and rushing.
Posted 22 days ago
 Does anybody have the link to the course materials that they could post on the community thread? I want to get started on the exercises but I missed the last meeting and the recording does not show the chat pane with the links. Thanks
Posted 22 days ago
Posted 22 days ago
 thanks!
Posted 23 days ago
 Marc, I am enjoying the course very much. Thank you for all the effort you are putting into it. When you get time, would you please check Exercise 3 from exercises-04.nb. I did the problem in two different ways, and I still get an answer of 0.6 which does not agree with 0.1 which is your answer. Thanks
Posted 22 days ago
 Indeed! You caught my second mistake!P(A'[Intersection]B')=P((A[Union]B)')=1-P(A[Union]B)=1-0.4=0.6Thus, the probability that neither A nor B occurs is 0.6.Thank you for noticing, it will be corrected.
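De Morgan's law behind this solution can be sanity-checked on a small finite sample space (an illustrative Python sketch with made-up events; the probabilities here are not those of the exercise):

```python
from fractions import Fraction

# Finite, equally likely sample space with two made-up events A and B.
omega = set(range(10))
A = {0, 1, 2}
B = {2, 3, 4, 5}

def prob(event):
    return Fraction(len(event), len(omega))

# De Morgan: (A ∪ B)' = A' ∩ B', so P(A' ∩ B') = 1 - P(A ∪ B).
assert (omega - A) & (omega - B) == omega - (A | B)
assert prob((omega - A) & (omega - B)) == 1 - prob(A | B)
print(1 - prob(A | B))  # 2/5 for these sets
```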
Posted 22 days ago
 Thanks Marc for your prompt consideration. I have another for you! In Exercise 1 of exercises-05.nb, the event you describe out of the 36 possible equally likely outcomes is E={{1, 1}, {2, 2}, {3, 3}, {4, 4}, {5, 5}, {6, 6}, {1, 6}, {6, 1}, {2, 6}, {6, 2}, {3, 6}, {6, 3}, {4, 6}, {6, 4}, {5, 6}, {6, 5}}. Hence, P(E) =16/36=4/9. Thus, the odds are 4 to 5. Would you please look into this also. Thanks.
Posted 22 days ago
 Indeed, that is correct.There are 6*6 combinations, 6 with doubles, 6 starting with the number 6 and 6 ending with the number 6. The three events have the occurrence (6,6) in common, so one occurrence is counted three times, thus: (6 + 6 + 6 - 2)/(6*6) is indeed 4/9, thus odds of 4 to 5.Thank you for informing us, it will be corrected.
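The 16/36 count can be confirmed by direct enumeration (a Python sketch of the same event):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes; event E: doubles, or at least one die shows a 6.
outcomes = list(product(range(1, 7), repeat=2))
favorable = [(a, b) for a, b in outcomes if a == b or 6 in (a, b)]

p = Fraction(len(favorable), len(outcomes))
assert len(favorable) == 16
print(p)  # 4/9, i.e. odds of 4 to 5
```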
Posted 21 days ago
 Marc, I have another one for you. In Exercise 3 of exercises-05.nb, there should be an additional branch in your tree, namely b,b,b,b,r,r. This will give you an additional 1/15, for an answer of 12/15. I found that a simpler way to attack the problem was to consider the probability of failing to draw at least one red ball. This means you only have a sub-tree with 3 branches to keep track of. You then get 3/15 as the probability of failure, and hence 12/15 as the probability of success. Thanks for staying on top of this.
Posted 20 days ago
 William: I agree with you, but I see it as asking which parts of the tree give at least one red, taking into account that as soon as I have three balls the game is over. So in the first round I have a 2/6 probability of getting a red one. If I do not succeed in that, then on the second round I have a 4/6*2/5 probability of getting a red one. And on the third round, I have 4/6*3/5*1/2 of getting a red one, and the game is over since the other player took the other three balls: 2/6 + 4/6*2/5 + 4/6*3/5*2/4 = 4/5, or the complement of not getting red balls, which would be all black balls: 1 - 4/6*3/5*1/2. Does that make sense? EDIT: I seem to have answered that half asleep; I can't even read my answer. Please disregard it.
Posted 18 days ago
 The question never states that the game is over as soon as a red ball is taken. Without that information, I interpreted the question as there being three outcomes at the end after all balls are taken, one set for "you" and another for the "friend": {{r,r,b},{b,b,b}} {{r,b,b},{r,b,b}} {{b,b,b},{r,r,b}} This gives a ⅔ probability of having a red ball in "your" set at the end.Is that line of thinking correct given my assumptions?
Posted 13 days ago
 Hello Parker, This is incorrect, because the draws are ordered and the probabilities change along the way, so those three outcomes are not equally likely. But naming all the possibilities may not be a bad idea; this will become my new solution. Thank you for the idea.
Posted 13 days ago
 Marc: I don't quite understand your statement "as soon as I have three balls the game is over". I did assume that once "you" draw a red ball the game is over. I reproduced the terms in your sum, but if you carry the tree to the end, you get some cases where there are 3 balls left and "you" have not yet drawn a red ball. I work this out in the attached notebook. Attachments:
Posted 13 days ago
 The solution to this problem is too complicated for its own good. Refer to your later post on this, where I give a better solution.
Posted 13 days ago
 Marc, Would you check Your answer to Question 8 of the Mock Exam? I get 20*Binomial[19,7] =1,007,760. Thanks.
Posted 12 days ago
 Hello William, Indeed, the written answer is right but the calculation is wrong. Thank you for noticing; it will be corrected.
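For the record, the corrected calculation (a trivial Python check):

```python
from math import comb

# Mock Exam Q8: 20 * Binomial[19, 7]
answer = 20 * comb(19, 7)
print(answer)  # 1007760
```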
Posted 13 days ago
 Hello William, It seems the solution for this problem is wrong and too complicated. To avoid this, let's use the power of the Wolfram Language. Here is the new solution. Let's compute all arrangements of balls with your friend, where red balls are negative and black balls are positive numbers. We are only interested in the balls received by the player, so let's take the odd columns (odd rounds): possibilities = Permutations[{-1, -2, 1, 2, 3, 4}][[All, {1, 3, 5}]]; Now that we have the balls received in all possibilities, just measure the number of possibilities with at least one negative number against the total number of possibilities: Length@Select[possibilities, AnyTrue[#, Negative] &]/ Length@possibilities And we get 4/5, as you found with the three branches. Surely this way we're less likely to get lost.
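The same enumeration is easy to reproduce outside the Wolfram Language as a cross-check (a Python sketch mirroring the code above):

```python
from fractions import Fraction
from itertools import permutations

# Red balls negative, black balls positive; the player receives rounds 1, 3, 5.
balls = [-1, -2, 1, 2, 3, 4]
received = [p[0::2] for p in permutations(balls)]  # indices 0, 2, 4

hits = sum(1 for hand in received if any(b < 0 for b in hand))
print(Fraction(hits, len(received)))  # 4/5
```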
Posted 12 days ago
 Hi Marc, Very nice approach to this problem! Well done. I like it because, as you point out, it demonstrates the elegance and power of the Wolfram Language. There is no need to give up on your "tree" approach, though. What your Wolfram solution shows is that in a problem of this type, counterintuitively, every COMPLETE branch is equally likely. Thus, you don't have to laboriously track ever-changing probabilities down each branch! Hence, our solution is: the number of favorable branches divided by the total number of branches, which is Binomial[6,4]. Of course, the numbers 6 and 4 could be replaced by any even integers n and k, with n > k > 0.
Posted 24 days ago
 Hello all, We had a more difficult question today, given as: Given a vector with m elements (real numbers), I want to compute the PDF of the sum of the m elements, considering that each element will have a positive or negative sign with equal probability. First, consider that each element must have its own probability distribution. Since it is real, let me for example assume any element is uniformly distributed between -10 and 10: UniformDistribution[{-10, 10}] If all elements are distributed in the same way, you might try scaling a single element: PDF[TransformedDistribution[ m*x, x \[Distributed] UniformDistribution[{-10, 10}]], x] (Note that m*x is the distribution of m copies of one draw; for the sum of m independent draws, you sum m separate variables instead, as in the example below.) If you assume a normal distribution centered at 0 for your elements, then you could write: PDF[TransformedDistribution[ m*x, x \[Distributed] NormalDistribution[0, 1]], x] Finally, your elements may be distributed in different ways, which could look something like this, assuming m is 4: PDF[TransformedDistribution[ a + b + c + d, {a \[Distributed] NormalDistribution[0, 1], b \[Distributed] NormalDistribution[0, 4], c \[Distributed] NormalDistribution[0, 19], d \[Distributed] UniformDistribution[{-10, 10}]}], x] Another way to interpret your question is that each element is given a binary choice between -1 and 1, but the values themselves are constant. In that case, you may want to apply many distributions to a single sum, which results in this, assuming for example this vector: v = {1.77, 5.65, 10.14, 195.14} PDF[TransformedDistribution[ v[[1]]*(-1)^a + v[[2]]*(-1)^b + v[[3]]*(-1)^c + v[[4]]*(-1)^d, {a, b, c, d} \[Distributed] Table[BernoulliDistribution[0.5], Length[v]]], x] Hopefully that answers the question. Transformed distributions will be seen in Lesson 20.
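For the binary-sign interpretation, the exact (discrete) distribution can also be obtained by enumerating all 2^m sign patterns; here is a Python sketch using the same example vector:

```python
from itertools import product
from collections import Counter

v = [1.77, 5.65, 10.14, 195.14]

# Each element independently takes sign -1 or +1 with probability 1/2,
# so each of the 2^4 = 16 sign patterns has probability 1/16.
sums = Counter(
    round(sum(s * x for s, x in zip(signs, v)), 2)
    for signs in product([-1, 1], repeat=len(v))
)
assert sum(sums.values()) == 16
print(max(sums))  # the all-positive case: 212.7
```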
Posted 24 days ago
 Hi, A couple of questions regarding the decision tree for deciding how to count specific outcomes: Under Count, Without Order, One Group, (n over i): what does (n over i) mean, i.e. what operation are we performing here? Under Count, With Order, Remplacing: what is "Remplacing"? Thanks, Mitch Sandlin
Posted 24 days ago
 Hi Mitch, The (n i) written vertically is the mathematical notation for Binomial[]; this is a binomial coefficient. The same goes for the multinomial coefficient. For the replacing part, I'm sorry for the confusion: it is meant to say "replacing", not "remplacing"; that is just a typo. As for what replacing means: when you "use" an element, you replace it with another element of the same value. For example, take a password of 7 digits. You may "use" the digit 5 as your first character, but it is still available for your next choice. That means you replaced the used digit 5 with another digit 5, so the set keeps the same size. This vocabulary comes from the historical example of taking balls out of an urn. You may put an equivalent ball back after you've taken one out (with replacement), or you may not, in which case the group shrinks after each step (without replacement).
Posted 24 days ago
 Will this course be repeated? I am having struggles with it right now, and would like to work on Computational X-plorations or something else. Thanks.
Posted 24 days ago
 Hi, It may be best for you to prioritize your other engagements, as this course is designed to be taken at any time. The course will soon be fully released, with all materials and lessons freely accessible. Moreover, the daily study group sessions are all recorded and also freely accessible. There is no second study group planned for this course at the moment, but one may happen if there is interest. Feel free to keep asking questions in this thread even after the study group has ended.
Posted 24 days ago
 Would you please provide details of the proof for the inequality on Slide 12 of Lesson 10? Thanks very much.
Posted 24 days ago
 Hi Bob, I found this proof, from the University of Arizona, which stays within the bounds of the course; you don't need much external knowledge of other branches of mathematics beyond multivariable calculus. It is not too difficult a proof, but it remains fairly theoretical. The actual proof itself is short and at the end; the rest of the paper gives great context.
Posted 24 days ago
 Thanks Marc, this is exactly what I was looking for. Sometimes a simple equation has an easy proof, and I am relieved to see that this is not such a case! Can you provide a reference to the book mentioned in the article? That seems to be very compatible with this course.
Posted 24 days ago
 Hi Bob,The textbook of the proof is Probability - An Introduction by Geoffrey Grimmett and Dominic Welsh. It is concise but a bit short on examples. A great book for examples would be Introduction to Probability by Charles Grinstead and Laurie Snell. Less mathematical, more examples, more discussion. It's also freely available. Hopefully that helps.
Posted 25 days ago
 In lesson 8, "Discrete Random Variables" we use the statements children[n_] := 1/((n + 1)^3*Zeta[3]); distChildren = ProbabilityDistribution[children[n], {n, 0, Infinity, 1}] Does the "1" in the ProbabilityDistribution function indicate that n is a discrete variable?
Posted 25 days ago
 Hello Joseph, Basically, yes. It indicates you're making jumps of 1, and the fact that you're jumping over values means you are taking discrete steps; thus, you are using a discrete random variable. Notice you could also theoretically jump by 2, or by 0.5, and it would still be a discrete distribution. The point is that you are jumping between values instead of using a continuous range.
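As a sanity check that children[n] really is a valid PDF, the probabilities 1/((n+1)^3 ζ(3)) should sum to 1 over n = 0, 1, 2, …; here is a quick Python partial-sum check (using the known value ζ(3) ≈ 1.2020569031595943):

```python
ZETA_3 = 1.2020569031595943  # Apéry's constant, ζ(3)

# children(n) = 1 / ((n+1)^3 * ζ(3)) for n = 0, 1, 2, ...
partial = sum(1 / (n + 1) ** 3 for n in range(100_000)) / ZETA_3
assert abs(partial - 1) < 1e-6  # the series tail is ~ 1/(2 * 100000^2)
print(round(partial, 6))  # 1.0
```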
Posted 25 days ago
 Thank you for the wonderful and edifying course. However, some of the questions you ask seem to be ambiguous. For example, let's take a look at one of the questions posed today: The second option was announced to be correct, but one could easily argue that the last one is also correct. Say you want to see how close the head frequency in tossing a coin 100 times would get to the expected value; you would use the RandomVariate function in MODELING the problem. Having potentially more than one correct answer, with only one deemed correct by an automated exam, is not a problem for quizzes, which you can take more than once with the same questions. However, it would be a problem for a certification exam, which you can take only once with the same questions. In summary, the fuzziness/ambiguity of some of the questions presented in the course worries me in the context of the future certification exam.
Posted 25 days ago
 Hello Zbigniew, First, let's be clear: I am taking a few liberties with the poll questions, such as having general comprehension questions and multiple true answers. The same fuzziness will not be found in quizzes or certification exams, where exactly one answer will always be true. Second, the 4th choice remains a bad choice. You can use RandomVariate to model, but should you? A mathematical model is an abstract description of a concrete system using mathematical concepts and language (Wikipedia). RandomVariate will give you an approximation of the base model, but never exactly the base model; you lose information when you model based only on RandomVariate data. "Say, if you want to see how close the head frequency in tossing a coin 100 times would get to the expected value, you would use the RandomVariate function in MODELING the problem." In this case, you want the expectation of the difference between the theoretical expectation and the random variable. Check the reduction of the sample-mean error, or even CentralMoment. The problem with using RandomVariate is that you might be able to see how close the head frequency should be, or you might not: RandomVariate is inexact, and your sample may be heavily biased. Moreover, seeing how close a head frequency is does not constitute a model. I fail to see how your problem fits the definition of modeling.
Posted 25 days ago
 Thanks, Marc, for clarifying. I'll get used to your precise definitions, including that of modeling. Again, thanks 10^6 for a so-far wonderful course.
Posted 24 days ago
 Adding to this, I believe there's always a catch when trying to rely on a single approach to assess all possible analytical situations. Of course, having an equation beforehand as a model is the ideal situation, but that's not always doable. As an example, when forecasting the production profile of an industrial plant where hundreds or thousands of assets interact (with random weather and random times to failure, sometimes with clear patterns and sometimes not), forcing yourself to have an equation as the model may make the analysis non-viable. In modeling, there's regularly an "elucidation" process where one plays with the data and may or may not end up with an equation that can be fitted. In those cases, if one wants to approach them cost-effectively, I'd say it is perfectly valid to play with random-number generation as part of the modeling process, because we're just trying to figure out what's going on. That's why approaches like Monte Carlo (loved or hated) give speed to the analysis process.
Posted 25 days ago
 Hi everybody! I'm having some issues while trying to plot a PDF. I see some unexpected changes at the start of the PDF plot. Am I doing something wrong in the PDF definition or in how I invoke the Plot function? Thanks in advance!
Posted 25 days ago
 Some weird behavior when trying to use a PDF Attachments:
Posted 25 days ago
 Oh!! Awesome! Great advice Jürgen!!..Thanks a lot!
Posted 25 days ago
 Actually, Mathematica does a good job, even though it confuses you by selecting different PlotRange options. But look carefully, and note that the value at which your distribution starts is 1/(1+3Pi/2) = 0.175058...The best way to avoid this illusion is to specify the same PlotRange option for all your plots, say, PlotRange ->{0,1}, or PlotRange -> All, or PlotRange -> Full.
Posted 25 days ago
 Hi, I noticed that a squared-off D (esc cond esc) was being used to indicate a condition. Mathematica also uses /; to indicate a condition. Are these two interchangeable? Thanks, Mitch Sandlin
Posted 25 days ago
 Hello Mitchell, No, these are not equivalent; they are used in very different contexts. Conditioned is only used in the context of probability, to symbolize the classical P(A|B), the probability of A given B, basically replacing the word "given". Condition (/;) is used in pattern matching, always together with a pattern, giving a test that a match must satisfy; it is a much more core mechanic of the language.
Posted 26 days ago
 Hi, In the material of Lesson 4, when referring to the probability of the event E and the need for Kolmogorov's axioms, this appears: "However, k/M must be constant to be relevant, which is problematic." Every time I "test" the 1/6 of a perfect die, the number of k ones / M trials is not constant, certainly because random events are not constant. So I don't believe I get the core idea of why k/M must be constant or why this is "problematic". From a statistical perspective, that k/M is how we (or at least I) estimate (guess) probability. So I'm confused.
Posted 26 days ago
 Hello J,A valid question for sure. Let me try to be as explicit as possible.First, why must a probability be a constant measure? Well, a quick Google search gives us that a measure is "A reference standard or sample used for the quantitative comparison of properties." In other words, for something to be a measure, it needs to be reliable, a reference. An uncertain number that has some convergence is not evidently reliable, thus not a good measure. For us to define an entire branch of mathematics, we need more reliable things.Second, why is the assumption of convergence of any frequency problematic? Simply, because it is too much of a strong statement. If I start by assuming a big and complicated statement, then any possible exception to that statement may render worthless all the theorems I’ve built on that statement. It's really a question of what is considered conventionally true. Look at it this way. We want the most basic statements for axioms. You can prove the convergence of frequency through Kolmogorov's axioms. However, you cannot prove Kolmogorov's axioms from the assumption of the convergence of frequency. This implies Kolmogorov's axioms are more "basic".
Posted 26 days ago
 Awesome.. Got it! Thanks!
Posted 26 days ago
 The example in lesson 3 for the probability that the sum of two dice will be even comes up with a probability of 6/11. This seemed odd and a little bit of extra thought suggests that you should not delete duplicate outcomes as they are part of the sample space of outcomes. When you consider duplicate outcomes in both numerator and denominator, you get the more intuitively satisfying answer of a probability of 1/2. Attachments:
Posted 26 days ago
 This doesn't claim to be an "answer", just my interpretation, as I had the same feeling when I reviewed that slide. Since the explicit statement in the slide is {2,4,6,8,10,12}, and those were examples on the topic of "Sample Spaces and Events", I saw that list as the event definition one would use to calculate the mentioned probability: the list against which the EvenQ function tests the full range of outputs. And I'd humbly agree the probability of an even sum is 1/2.
Posted 26 days ago
 Hello Joseph,Let's be careful with the terms here. In lesson 3, we defined the sample space, the set of all possible outcomes. For the sum of two dice, this is {2,3,4,5,6,7,8,9,10,11,12}, as a set does not need to repeat instances. The probability of each of those events is not given at that point.Now, if we discuss probability, you can obviously see that the event of sum 2 will only happen for {1,1} dice, but the event of sum 7 will happen for {1,6}, {2,5}, {3,4}, {4,3}, {5,2}, {6,1} dice, in other words, many combinations. From this you can affirm the events {2,3,4,5,6,7,8,9,10,11,12} most definitely don't have the same probability. Thus, your equally likely assumption leading to 6/11 is wrong. You need to assign to each event its correct probability.
Posted 26 days ago
 Marc, Thanks for your response. I agree that the set of all possible outcomes should not include repeats. But is the probability of an even sum 1/2? I'm enjoying these study group sessions! Joe
Posted 25 days ago
 Yes! It is 1/2. Let's see how you could do it considering all the notions seen so far. First, get the sample space and frequencies: {sampleSpace, probabilities} = Transpose@Tally@Flatten@Table[x + y, {x, 1, 6}, {y, 1, 6}] Then, normalize the probabilities so they sum to 1: probabilities = probabilities/Total[probabilities] Finally, sum the probabilities of all even sums: Sum[If[EvenQ@sampleSpace[[i]], probabilities[[i]], 0], {i, Length[sampleSpace]}] Not the simplest way, but definitely a visual way to see it: ListPlot[Transpose@{sampleSpace, probabilities}, Filling -> Axis] Which should give you:
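The same computation can be reproduced outside the Wolfram Language as a cross-check (a Python sketch enumerating all 36 equally likely pairs directly):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two dice.
pairs = list(product(range(1, 7), repeat=2))
p_even = Fraction(sum(1 for a, b in pairs if (a + b) % 2 == 0), len(pairs))
print(p_even)  # 1/2
```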
Posted 26 days ago
 Hi, two quick questions: 1. In these last two laws (please see the image), and in general, should it be assumed that intersection always goes (is processed) first, before union? 2. Does it make sense to generalize that what is applicable to + and * in regular arithmetic (e.g. (A+B)=(B+A)) is valid when replacing + by union and * by intersection? E.g. (A u B) = (B u A); and, as A*(B+C) is A*B + A*C, then A n (B u C) --> (A n B) u (A n C) should be OK. Thanks!
Posted 26 days ago
 Hello J, Yes! This is the priority of operations, as noted throughout the lesson: first parentheses, then negation, then intersection, then union. Like the infamous PEMDAS, but for set operations. And yes, the generalization applies. Think of them as equivalent operations acting on different objects. The logical Or ||, the addition +, the set union: all represent the natural first common operation. The logical And &&, the multiplication *, the set intersection: all represent the natural second common operation in a ring. This is a common theme in mathematics, which means a lot of the basic rules you see, often in linear algebra, apply in many, many contexts!
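A concrete spot-check of the analogy (a Python sketch using sets; the particular sets are made up):

```python
# Union behaves like +, intersection like *.
A, B, C = {1, 2, 3}, {2, 4}, {3, 4, 5}

assert A | B == B | A                    # commutativity, like a + b = b + a
assert A & (B | C) == (A & B) | (A & C)  # distributivity, like a*(b+c) = a*b + a*c
print(sorted(A & (B | C)))  # [2, 3]
```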
Posted 26 days ago
 Hi Marc: I did not understand what you are calculating in the UCILetter example in the Bayes section. Can you point me in the right direction? Thanks.
Posted 26 days ago
 Hi Juan, To be fair, this is extra knowledge that is only there for your information. That said, let's get into it. The Naive Bayes assumption is that every event is considered independent. This implies there are n probabilities to estimate for any set of n events. From there, a Naive Bayes classifier, a machine learning algorithm, assumes every attribute of your dataset is independent. Now, with a lot of data points, by noticing where data points differ, you can infer those n probabilities, which afterwards allows you to predict the class from your estimated probabilities, simply by multiplying the independent events. Feel free to check external resources, like this, to get a good understanding of the algorithm. But this is beyond the scope of this course, of course; it is something you would see in an Introduction to AI.
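For the curious, the Wolfram Language exposes this algorithm directly through Classify. Here is a minimal sketch with a toy dataset invented purely for illustration (the feature values and class labels are assumptions, not from the course):

```mathematica
(* Toy dataset: {height, weight} -> class, invented for illustration *)
training = {{180, 80} -> "A", {175, 78} -> "A",
   {160, 55} -> "B", {158, 52} -> "B"};

(* Train a classifier using the Naive Bayes method,
   which treats the features as independent *)
classifier = Classify[training, Method -> "NaiveBayes"];

(* Predict the class of a new point *)
classifier[{177, 79}]
```

The point is only to show the mechanism: the classifier estimates per-feature probabilities from the data and multiplies them under the independence assumption.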
Posted 27 days ago
 Hi Marc, I am a high school math teacher. The lesson materials are wonderful for use in our classroom. There are some simulations in the notebook of day 1. I wonder if there is any place where I can download the source code, just like the "normal convergence" you shared.
Posted 27 days ago
 Hi Tianyi, Most code for the visualisations is available just by downloading the notebook and expanding the cell of interest. For Lesson 1, most of the visualisations are taken from the Wolfram Demonstrations Project; feel free to explore it and use it for your own classes. Here is the code for the normal convergence:

    Manipulate[
     Show[
      Histogram[Table[Mean[RandomReal[{0, 100}, n]], {200}], {35, 65, 1}],
      Plot[200*PDF[NormalDistribution[50, Sqrt[9999/(12 n)]], x], {x, 35, 65}],
      ImageSize -> {400, 200}
     ],
     {{n, 20, "Sample size"}, 10, 200, 1, Appearance -> "Labeled"}]

I switched Manipulate to Animate to make it an animation.
Posted 26 days ago
 This is really great for learning the complete possibilities of the Wolfram Language. Thanks for doing it so thoroughly! You could have just pasted a picture, but you didn't! Thanks a lot!
Posted 27 days ago
 Marc, I downloaded your class notebooks from the site today. Slide 7 of Lesson 2 says: "Consider that the 4 Swiss and 5 Ethiopian athletes want to form a team of 5 to represent them in competition. How many different teams are possible given that it must include 3 Swiss and 2 Ethiopians?" Shouldn't the answer be Binomial[4,3]*Binomial[5,2]? John
Posted 27 days ago
 Same question here, lol. Should the answer be 40?
Posted 27 days ago
 Hi John, You spotted the first error! Indeed, this should be Binomial[4,3]*Binomial[5,2], and the answer is 40. You are totally right, thank you for noticing! Sorry I can't give you a physical medal, but this will have to do. The binomial distribution thanks you for noticing binomial mistakes:
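For completeness, the corrected computation: choose 3 of the 4 Swiss athletes and, independently, 2 of the 5 Ethiopian athletes, then multiply by the rule of product:

```mathematica
(* 3 of 4 Swiss times 2 of 5 Ethiopians *)
Binomial[4, 3]*Binomial[5, 2]
(* 4 * 10 = 40 *)
```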
Posted 27 days ago
 I was told by one of the Q&A moderators that the session recording will be posted. How can we access it for review?
Posted 27 days ago
 Emails have just now been sent to everyone who is registered for the Daily Study Group. Recordings can be accessed from the webinar series landing page.
Posted 1 month ago
 I am wondering if this course will touch on any aspects of probabilistic statements in quantum mechanics, such as "the probability the electron is here is 50%". Another application would be radioactive decay, where the probability that a radioisotope decays over a given interval is based on its half-life. There are some examples of using Mathematica's probability functions in quantum physics at https://resources.wolframcloud.com/PacletRepository/resources/Wolfram/QuantumFramework/tutorial/ExploringFundamentalsOfQuantumTheory.html.

This course will fit well with my university courses Applied Probability and Statistics (STA 345), Probability and Statistics I (STA 445) and Probability and Statistics II (STA 446).

I think applying probability theory to Catan is interesting. For example, Catan uses two six-sided dice, and the most common roll is 7, which also equals the expected sum, since the expectation of a single die is 3.5 and 3.5 + 3.5 = 7. The 7 causes the robber to make you discard half of your resource and commodity cards, rounded down, if you have more than the discard limit.

I would also like to mention that you can count the ordered outcomes with FactorialPower. For example, with the example from the second presentation, "Imagine that 4 Swiss and 5 Ethiopian athletes compete for the 200m sprint. How many different top-three winner rankings are possible?", FactorialPower[9, 3] returns 504, which is the same as 9!/(9-3)!.
Posted 27 days ago
 Thank you, Peter, for your question! The probabilistic statements of quantum mechanics are a useful application of probability theory. They are not approached in this course not because of their difficulty, but because of the background knowledge of quantum physics needed to use them. However, the probability of decay is mentioned in Lesson 17 and its exercises, as it is a classical example of the exponential distribution.

Yes, as said in the beginning, this covers the material for probability courses. However, statistics is not covered; you will need your courses for that.

As for the probabilities of throwing two dice, there is far more knowledge needed to formulate this than meets the eye. As said in the introduction, the probabilities of dice are explored throughout the lessons. If you seek the exact probabilistic formulation of this problem, you can look at Lesson 22.

Indeed, FactorialPower can be used, and you're welcome to do so. However, the goal of Lesson 2 was to build a good understanding of combinatorics, so it seemed more intuitive to do it explicitly.
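The equivalence Peter points out is easy to verify, since the falling factorial 9 · 8 · 7 is exactly the number of ordered top-three rankings among 9 athletes:

```mathematica
(* Three equivalent counts of ordered top-three rankings among 9 athletes *)
{FactorialPower[9, 3], 9!/(9 - 3)!, 9*8*7}
(* {504, 504, 504} *)
```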
Posted 1 month ago
 Reminder that our upcoming Daily Study Group provides a preview of the new interactive course, Introduction to Probability. The Study Group meets daily over two weeks, Monday through Friday, for an hour online each day, starting Monday. Take advantage of this opportunity to prepare for probability and statistics related coursework and research in natural science, engineering, finance, medicine, data science and other fields! You can sign up here.
Posted 1 month ago
 The study group starts next week, on February 27th.If you want to take full advantage of this course's material and get a practical and deep understanding of probability, don't forget to click on the REGISTER HERE link to get registered in this course!I'm looking forward to your participation and feedback!
Posted 1 month ago
 This study group will be based on the upcoming Introduction to Probability course on Wolfram U. Marc Vicuna is the instructor for the study group as well as the Wolfram U course, and he is an outstanding young teacher and data scientist. I strongly recommend that you join the study group and immerse yourself in probabilistic thinking for two weeks!