[WSG23] Daily Study Group: Introduction to Probability

A Wolfram U Daily Study Group on Introduction to Probability begins on February 27th 2023.

Join me and a group of fellow learners to learn about the world of probability and statistics using the Wolfram Language. Our topics for the study group include the characterisation of randomness, random variable design and analysis, important random distributions and their applications, probability-based data science and advanced probability distributions.

The idea behind this study group is to rapidly develop an intuitive understanding of probability for a college student, professional or interested hobbyist. A basic working knowledge of the Wolfram Language is recommended but not necessary. We are happy to help beginners get up to speed with Wolfram Language using resources already available on Wolfram U.

Please feel free to use this thread to collaborate and share ideas, materials and links to other resources with fellow learners.

POSTED BY: Marc Vicuna
201 Replies

My solution to Mock Exam Question 3 ("Choose a uniformly random point in the unit square with corners at (0,0) and (1,1). What is the point's expected distance from the origin?") is

Norm[Expectation[{x, y} - {0, 0}, {x, y} \[Distributed] 
   UniformDistribution[2]]]

I don't understand why this doesn't match the answer of approximately 0.76. What did I do wrong?

POSTED BY: Peter Burbery
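
A possible explanation for the mismatch, offered as a sketch rather than the official solution: the code above takes the norm of the expected point, which is the distance from the origin to the centre of the square (about 0.707), whereas the question asks for the expected value of the distance. Moving the norm inside the expectation, written here as an explicit square root, gives the quantity asked for. The closed form in the last line is the standard result for the mean distance from a corner of the unit square.

```wolfram
(* Expected distance from the origin: the norm goes inside the expectation *)
NExpectation[Sqrt[x^2 + y^2],
 {x, y} \[Distributed] UniformDistribution[{{0, 1}, {0, 1}}]]
(* approximately 0.765 *)

(* Closed form for comparison *)
N[(Sqrt[2] + ArcSinh[1])/3]
(* approximately 0.765 *)
```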
Posted 2 years ago
POSTED BY: Dave Middleton
POSTED BY: Marc Vicuna
POSTED BY: Peter Burbery
Posted 2 years ago

Peter, I think all updated course notebooks can be found in the course framework: https://www.wolframcloud.com/obj/online-courses/introduction-to-probability/what-is-probability.html

POSTED BY: Dave Middleton
POSTED BY: Marc Vicuna
Posted 2 years ago

Dear Marc, While catching up this month, I came across a number of suggested errata; see the text below. Thank you for putting together this course. It was a lot of fun as a fast track probability review. Cheers, Dave


MODERATOR NOTE: notebook Suggested Errata was moved to the attachment below and also can be viewed at https://www.wolframcloud.com/obj/5f715089-9e6c-4e90-9733-f3a69ddec8e9 Wolfram cloud notebook.

Attachments:
POSTED BY: Dave Middleton

Hello Dave,

Thank you for this exhaustive review. It is currently being addressed, and the changes will appear on the framework soon. Let me address these concerns one by one:

Lesson 2: correct, unique objects are used.

Exercise 7: correct, the function in the question will be modified.

Exercise 10: correct, the + signs are now added.

Exercise 15: correct, the coefficients will be added.

Exercise 18: 4: correct, the word will be changed. 5: incorrect, this is not a root, the only root taken is because of the transfer from variance to standard deviation.

Exercise 19: this will not be changed, since this is a question of interpretation of the dataset, a skill we want to develop in this course.

Exercise 21: slight changes were made to make it clearer. However, consider that the only requirement here is to try to interpret the clusters. Moreover, the answers only ever show one possible way to answer. Giving steps would go against the many possible ways to answer the question.

Exercise 23: This was already corrected following another community post. Moreover, the -1 is not necessary; it only needs to be a negative number. There is no issue if your method is slightly different. Question 4: both are true. The questionnaire goes from 1 to 10, but the distribution goes from 3 to 10. In other words, the probability of 1 or 2 is 0, as no one gave those answers. Question 5: this is found in the term "average" of distributions. A combination of distributions does not imply equal weights, but an "average" does.

Lesson 24: this seems to be a framework problem, this will be updated soon.

Exercise 24: No, there is no mathematical difference. You are adding 0.5; is 0.5 included or excluded? Not only does this make no difference to the calculation, it is also impossible to argue for one or the other semantically, as the approximation does not depend on a single point of an interval. So in the continuous case, it is preferable to avoid equality symbols, since they complicate the formulation without changing anything, even if we are aiming for the strictest mathematical correctness.

Quiz 6: this was corrected.

Practice Exam: correct, this will be corrected soon.

Thank you again for this huge help in the review of this course. Good luck on your probabilistic endeavors!

POSTED BY: Marc Vicuna

Hi Marc;

Do you have an updated Mock Exam that includes all the corrections and that I can download? Also, were any changes or corrections made to the downloaded questions and other notebooks?

Thanks,

Mitch Sandlin

POSTED BY: Mitchell Sandlin

Hello Mitchell,

For any error or bug, it takes us about 1 or 2 days to correct it and publish the fix on the course framework. So to access the updated version, always refer to the notebooks found on the course framework. The practice exam and several exercise notebooks have been corrected since the launch of this course, so please download them again from the course framework if you were using the original notebooks.

POSTED BY: Marc Vicuna

Lesson 24 Exercise 1 : " Use the normal approximation. Note 300 is excluded and 330 is included." seems like it should read "Use the normal approximation. Note 100 is excluded and 300 is included."

POSTED BY: Joseph Smith

Hello Joseph,

Indeed, this seems to be a mistake, it will be corrected. Thank you for noticing.

POSTED BY: Marc Vicuna

For Lesson 23 Exercise 5, what is the point of averaging data from 3 distributions?

samples = Table[
   Sum[RandomVariate[d],
    {d, {NormalDistribution[0, RandomReal[]],
      CauchyDistribution[0, RandomReal[]],
      StudentTDistribution[0, RandomReal[], 2]}}], 1000];

Why are we setting the standard deviations to a RandomReal[] number?

POSTED BY: Joseph Smith

Hello Joseph,

As said in the question, there are many approaches to the problem. You may set your standard deviations in whatever way you want. Maybe the only restriction is that it doesn't surpass the standard deviation of exercise 4:

N@StandardDeviation@DiscreteUniformDistribution[{3, 10}]

which is about 2.29. The key here is to experimentally explore the difference in convergence between the weak and strong law of large numbers.

POSTED BY: Marc Vicuna

For Lesson 23 Exercise 1, we are asked to calculate the upper bound on the probability that a cell has a size between 13 and 37 µm, with mean = 25. By computing P(|X - mu| >= k sigma) <= 1/k^2, aren't we computing the upper limit on the probability that the cell size will be OUTSIDE the range of 13 - 37? Think about the answer: we are saying the upper limit on the probability that the value will be within 2 standard deviations of the mean is 1/4. Does that make sense?

POSTED BY: Joseph Smith

Hello Joseph,

Indeed, this doesn't make sense. This mistake was caught and corrected, please refer back to the framework exercise notebooks.

POSTED BY: Marc Vicuna
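
To make the direction of the bound concrete, here is a small sketch. It assumes sigma = 6, which is inferred from the question's "2 standard deviations" covering the distance 12 from the mean 25 to either end of the range; the normal model in the last line is purely a hypothetical check and is not part of the exercise.

```wolfram
(* Chebyshev: P(|X - mu| >= k sigma) <= 1/k^2,
   hence P(|X - mu| < k sigma) >= 1 - 1/k^2 *)
mu = 25; sigma = 6; k = 2;   (* sigma = 6 is an inferred assumption *)
outsideUpperBound = 1/k^2     (* 1/4: upper bound on being OUTSIDE 13..37 *)
insideLowerBound = 1 - 1/k^2  (* 3/4: lower bound on being INSIDE 13..37 *)

(* Hypothetical check with a normal model, which must respect the bound *)
N@Probability[13 < x < 37, x \[Distributed] NormalDistribution[mu, sigma]]
(* about 0.954, comfortably above 3/4 *)
```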

Hello,

Well, there is definitely a difference, and it's technical and significant.

Basically, in the lesson, the normal distribution is shown so that the student sees the standard deviation being divided by the square root of the number of samples. You are asked for the probability of a mean, so you "created" the sampling distribution and your probability applies only to means, but your initial data were not means.

In the exercise you mention, your data is already a set of means. Therefore, you can just estimate a distribution from the data and you have a sampling distribution. In that case you don't use the factor of the square root of the number of samples, and you get the right answer.

This all has to do with what your initial data is. Is it a mean, or raw data from a single element, individual or the like?

I hope I made this clear.

POSTED BY: Marc Vicuna
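
The distinction between raw data and means can be explored with a short simulation. This is only a sketch with made-up parameters (mean 100, sigma = 4, samples of size 25), not data from the course: the spread of the simulated means should come out close to sigma/Sqrt[n], while the raw draws keep the full sigma.

```wolfram
sigma = 4; n = 25;                         (* hypothetical values *)
raw = NormalDistribution[100, sigma];      (* distribution of individual observations *)

(* 2000 sample means, each computed from n raw observations *)
means = Table[Mean[RandomVariate[raw, n]], 2000];

(* Empirical spread of the means next to the theoretical sigma/Sqrt[n] = 0.8 *)
{StandardDeviation[means], N[sigma/Sqrt[n]]}
```

If the data you are given already look like the list called means here, fitting a distribution to them directly already yields the sampling distribution, so no further division by Sqrt[n] is needed.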
Posted 2 years ago

That makes a lot more sense now, thanks!

POSTED BY: Parker Robb

In slide 8 for Lesson 16 it says that "From Newtonian physics, the horizontal distance function is v^2Sin[alpha]Cos[alpha]/g" . There should be a factor of 2 in the numerator. The rest of the slide correctly includes the factor of 2 in the calculations.

POSTED BY: Joseph Smith

Hello Joseph,

Thank you for noticing, this will be corrected shortly.

POSTED BY: Marc Vicuna
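
For reference, the factor of 2 can be recovered in a few lines. The sketch assumes the usual idealised setting (launch and landing at the same height, no air resistance); the symbol names are chosen here for illustration.

```wolfram
(* Height is v Sin[alpha] t - g t^2/2; it returns to zero at the flight time *)
tFlight = 2 v Sin[alpha]/g;

(* Horizontal distance = horizontal speed times flight time *)
range = v Cos[alpha] tFlight   (* 2 v^2 Cos[alpha] Sin[alpha]/g *)

(* Same expression written with the double angle *)
TrigReduce[range]              (* v^2 Sin[2 alpha]/g *)
```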
Posted 2 years ago

Howdy Marc,

One of the questions on the version of the final exam appears to be missing the referenced "sample" code:

"What is the estimated variance of the savings ratio in the “Sample Data: Life Cycle Savings” dataset? The dataset is normally distributed. Use the following code to obtain the data."

Thanks, John

POSTED BY: John Davidson

Hi John,

This is now fixed, thank you for noticing.

POSTED BY: Marc Vicuna
Posted 2 years ago

I found what I think is a simpler solution to Lesson 24 Exercise 4:

Nest[TransformedDistribution[
   x + y, {x \[Distributed] #, y \[Distributed] #}] &, 
 BinomialDistribution[2, p], 9]
POSTED BY: Updating Name
Posted 2 years ago

Also for Exercise 5 in the same lesson:

Nest[TransformedDistribution[
   x + y, {x \[Distributed] #, y \[Distributed] #}] &, 
 NormalDistribution[2, Sqrt@31/2], 2]
POSTED BY: Parker Robb

Hi Parker,

You seem to be missing the point here. This is also a valid solution, but a much more obscure one. How would you explain that the standard deviation has to be divided by 2 for the addition of nested distributions? This result comes from the variance, but explaining it is difficult.

The solution given restricts all standard deviations to integers, which is wildly unnecessary, but it makes the explanation and the calculation easier.

POSTED BY: Marc Vicuna
Posted 2 years ago

I see what you mean.

When I did the problem I proceeded without the assumption that σ of each distribution had to be an integer. Since σ in the starting distribution can be whatever, I back-calculated a single factor for σ, instead of factoring by several different integers as the exercise solution does. I see after reading the solution that that factor comes about through the "distance formula" rule:

In[]:= Nest[Sqrt[#^2 + #^2] &, Sqrt@31/2, 2]
Out[]= Sqrt[31]
POSTED BY: Parker Robb
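
One way to explain the division by 2, sketched here with symbolic parameters m and s that are introduced only for illustration: variances of independent variables add, so each doubling step multiplies the standard deviation by Sqrt[2], and two such steps multiply it by 2.

```wolfram
(* Sum of two independent copies of a normal with standard deviation s *)
sumDist = TransformedDistribution[x + y,
   {x \[Distributed] NormalDistribution[m, s],
    y \[Distributed] NormalDistribution[m, s]}];

Simplify[Variance[sumDist], s > 0]            (* 2 s^2: the variances add *)
Simplify[StandardDeviation[sumDist], s > 0]   (* Sqrt[2] s *)

(* Two doublings scale the standard deviation by 2,
   which is why the starting value is Sqrt[31]/2 *)
Nest[Sqrt[2] # &, Sqrt[31]/2, 2]              (* Sqrt[31] *)
```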

Hi,

Indeed, that solution is much more efficient. However, the average student may not be familiar with recursion. RecurrenceTable was used to show the recursive steps the function goes through, so that the problem can be solved with complete understanding. But yes, purely for computation, the Nest function is better here.

POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: Updating Name
Posted 2 years ago

On my exam, the following appeared:

LESSON11

Which distribution best describes the following data? Hint: use FindDistribution and EstimatedDistribution.

But there was no data.

POSTED BY: Updating Name

Hello,

We will correct this shortly, I'll post again to confirm it has been corrected.

POSTED BY: Marc Vicuna

The issue has been addressed and corrected.

POSTED BY: Marc Vicuna
Posted 2 years ago

In question 35 of the Practice exam, we consider the permutations of a group of letters to determine which are words. It seems that Mathematica finds an extra word if the letters are capitalized….

In[1]:= Tally[DictionaryWordQ/@StringJoin/@Permutations[{"r","e","s","e","t"}]]

Out[1]= {{True,6},{False,54}}

In[2]:= Tally[DictionaryWordQ/@StringJoin/@Permutations[{"R","E","S","E","T"}]]

Out[2]= {{True,7},{False,53}}

The word “STERE” is missing from the uncapitalized words...

POSTED BY: Byron Zollars
POSTED BY: Marc Vicuna

Hello Juan,

  1. This is new and will be corrected. Indeed, it should be a lower bound: 1 minus the probability.
  2. Why would this be +0.5? "At least" implies inclusion, so -0.5 is more appropriate.
  3. Right on! We will correct this! Personally, this is one of my most frequent mistakes; I find it very counterintuitive to use the standard deviation as a parameter instead of the variance. Thank you for catching that.
  4. Definitely a small but significant mistake. We will correct this.

Thank you very much for all those corrections!

POSTED BY: Marc Vicuna

For exercises 10 problem 5,

I don't see how the condition x>1.5 in the calculation of variance represents the problem statement of "greater than 15 given that it is at least 10".


*MODERATOR NOTE: notebook Exercises10 Problem 5 was moved to the attachment below and also can be viewed at https://www.wolframcloud.com/obj/befe889f-2b79-4fd3-b80a-d22f6da3f9c8 Wolfram cloud notebook.*

Attachments:
POSTED BY: Joseph Smith
Posted 2 years ago
POSTED BY: Updating Name

Hi;

  1. Is the following set logic correct? The notes were not very clear.

    P(A \[Union] B)=P(A) + P(B) - P(A \[Intersection] B)
    P(A \[Intersection] B) = P(A') + P(B') - P(A' \[Intersection] B')
    
  2. When creating a distribution using data from the repository, can the more recent data be weighted?

  3. I used the following to extract data; however, the FindDistribution function had a problem with the extracted data. Can you tell me what I am doing wrong?

    data = 
      QuantityMagnitude@
       Normal@ResourceData["Sample Data: Fisher's Irises"][
         All, {"SepalLength", "SepalWidth", "PetalLength", "PetalWidth"}];
    
    FindDistribution[data]
    

I finished the quizzes last Friday and the final yesterday. Some of the questions were quite challenging.

All in all, I really enjoyed your presentation and learned a lot about using Mathematica in calculating probability. However, I still have a few remaining questions.

Thanks again,

Mitch Sandlin

POSTED BY: Mitchell Sandlin
POSTED BY: Marc Vicuna
POSTED BY: Mitchell Sandlin

Hello Mitch,

In two words: De Morgan's!

P(A' \[Union] B') = P((A'' \[Intersection] B'')') (De Morgan's Law of Set Theory)
P(A' \[Union] B') = P((A \[Intersection] B)') (Double Negation of Set Theory)
P(A' \[Union] B') = 1 - P(A \[Intersection] B) (Complement Law of Probability Theory)

Those laws are always useful, once in a while. I hope that helps.

POSTED BY: Marc Vicuna
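
A quick numerical check of these identities on a fair die may help. The two events are hypothetical, picked only so that they overlap: A is "x <= 4" and B is "x >= 3".

```wolfram
die = DiscreteUniformDistribution[{1, 6}];

(* P(A and B): outcomes 3 and 4 *)
pAandB = Probability[x <= 4 && x >= 3, x \[Distributed] die]      (* 1/3 *)

(* P(A' or B'): outcomes 5, 6 (not A) together with 1, 2 (not B) *)
pNotAorNotB = Probability[x > 4 || x < 3, x \[Distributed] die]   (* 2/3 *)

(* De Morgan plus the complement rule: P(A' or B') == 1 - P(A and B) *)
pNotAorNotB == 1 - pAandB                                          (* True *)

(* Inclusion-exclusion: P(A or B) == P(A) + P(B) - P(A and B) *)
Probability[x <= 4 || x >= 3, x \[Distributed] die] ==
 Probability[x <= 4, x \[Distributed] die] +
  Probability[x >= 3, x \[Distributed] die] - pAandB              (* True *)
```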
Posted 2 years ago

Dear All, Exam , Q11 may have missing data. G

POSTED BY: Gabor Tarkanyi

Hello,

If you are referring to the mock exam, I can't see anything missing. If you are referring to the actual Final Exam, the numbering is random, so I will need more information to be able to pinpoint the question.

POSTED BY: Marc Vicuna
Posted 2 years ago

Dear Marc,

Will the course materials page contain all corrections to the Study Group notebooks e.g. Exercises?

I returned from Spring Break. In this Community Page there are too many discussions to keep track of, so I prefer to download the latest notebooks to work from.

Cheers,

Dave

POSTED BY: Updating Name
Posted 2 years ago

The course materials have been updated with the corrections :)

POSTED BY: Dave Middleton

Hello,

The latest notebooks are on the course framework, or will be soon. If you find an error, search for it on this page (Ctrl+F), or suggest a change if you can't find it. Most mistakes have been corrected in the framework by now. Hopefully that helps.

POSTED BY: Marc Vicuna

How do we calculate the Kurtosis of a distribution coming from data, as problem 6 from quiz 3 asks?

Hello Juan,

As repeated throughout the course, it all comes down to your capacity to recognize where to apply each distribution. If you can recognize the situation is appropriate for a specific distribution, use EstimatedDistribution on that abstract distribution together with the data. Then extract your measures from that distribution. Now, if you only have data and no information otherwise, use FindDistribution and actually use the found distribution to take your measures.

POSTED BY: Marc Vicuna

Marc:

That is what I did on the problem and got PoissonDistribution[3.71429] which has a Kurtosis of 3.26923, which does not appear as a valid answer on Problem 6 from Quiz 3.

Hello Juan,

I'm not sure how you are getting that result. The output for:

EstimatedDistribution[{1,3,6,0,3,4,5,7,4,2,11,1},PoissonDistribution[\[Lambda]]]

is

PoissonDistribution[3.91667]
POSTED BY: Marc Vicuna
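
Putting the two steps together for the data quoted above, as a sketch of the "estimate, then measure" approach described earlier in the thread (the Poisson family is the assumption made in the reply, not something derived here):

```wolfram
data = {1, 3, 6, 0, 3, 4, 5, 7, 4, 2, 11, 1};

(* Step 1: estimate the parameter of the assumed Poisson family *)
fit = EstimatedDistribution[data, PoissonDistribution[\[Lambda]]]
(* PoissonDistribution[3.91667] *)

(* Step 2: take the measure from the fitted distribution;
   for a Poisson distribution the kurtosis is 3 + 1/lambda *)
Kurtosis[fit]
(* about 3.255 *)
```

Note that Kurtosis[data] would instead give the sample kurtosis of the raw numbers, which is a different quantity from the kurtosis of the fitted distribution.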
POSTED BY: Michael Gierhake
POSTED BY: Marc Vicuna

"If you can recognize the situation is appropriate for a specific distribution, use EstimatedDistribution on that abstract distribution together with the data."

I would like to add that knowing the appropriate number of parameters makes a big difference. Take StudentTDistribution, for example: it has either one or three parameters. You can easily get a bad fit using EstimatedDistribution on your data with StudentTDistribution[\[Nu]], while getting a much better one using StudentTDistribution[\[Mu], \[Sigma], \[Nu]].

POSTED BY: Ahmed Elbanna

Hello Ahmed,

Indeed, you also need to be careful about that. When a parameter is not used in the function signature, it's usually because that parameter is assumed to be some standard value, like 0 or 1. If you cannot make such an assumption in your situation, it is in your interest to use the most general form of the distribution.

POSTED BY: Marc Vicuna

How do we calculate the variance for a Normal Distribution when it is not indicated? Problems 1 and 2 from Excercises-18.nb do it differently.

Posted 2 years ago

Hello Juan,

The normal distribution is the distribution of approximations. It is used to approximate two major distributions: the binomial and the Poisson. In both approximations, just use the mean and variance of the exact distribution you're trying to approximate. If it's the binomial, take the binomial mean and variance for the normal. If it's Poisson, take the Poisson mean and variance.

POSTED BY: Updating Name
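
In code, this moment matching can be sketched as follows. The parameters (500 trials with probability 0.06, and a Poisson mean of 30) are illustrative choices, not the values from Exercises 18; remember that NormalDistribution takes the standard deviation, not the variance, as its second argument.

```wolfram
(* Exact distributions to be approximated (hypothetical parameters) *)
binom = BinomialDistribution[500, 0.06];
pois = PoissonDistribution[30];

(* Normal approximations built from the exact mean and standard deviation *)
normalForBinomial = NormalDistribution[Mean[binom], StandardDeviation[binom]]
(* mean 30, standard deviation about 5.31, i.e. Sqrt[n p (1 - p)] *)

normalForPoisson = NormalDistribution[Mean[pois], StandardDeviation[pois]]
(* mean 30, standard deviation Sqrt[30], about 5.48 *)
```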

Thank you.

For exercises08 exercise 2 , given as "What is its expectation of the function Binomial[10,i]/2^10 for 0<=x<=10?" What is the "its" referred to in the problem statement ? The solution seems to be calculating the expectation of x for the distribution Binomial[10,i]/2^10 .

POSTED BY: Joseph Smith

Hello Joseph,

Indeed, it should be a "the", not an "its". Simple mistake, it will be corrected.

Thank you.

POSTED BY: Marc Vicuna
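
For readers following along, the calculation the solution appears to perform can be written out directly. The helper name pmf is introduced here for illustration only.

```wolfram
(* Probability function from the exercise, for i = 0, ..., 10 *)
pmf[i_] := Binomial[10, i]/2^10;

Sum[pmf[i], {i, 0, 10}]               (* 1, so it is a valid probability function *)
Sum[i pmf[i], {i, 0, 10}]             (* 5, the expectation of the value *)

(* Same result from the built-in distribution with n = 10, p = 1/2 *)
Mean[BinomialDistribution[10, 1/2]]   (* 5 *)
```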
Posted 2 years ago

I also get a different output from Lesson 14 Exercise 4 with the same input as the exercise gives:

EstimatedDistribution[{3, 3, 10, 6, 6, 4, Sequence[
  5, 9, 3, 4, 7, 4, 7, 10, 8, 5, 6, 7, 11, 10, 5, 9, 7, 8, 6, 5, 6, 7,
    6, 8, 12, 9, 6, 3, 9, 5, 7, 5, 2, 9, 3, 5, 9, 9, 3, 5, 3, 8, 5, 6,
    5, 4, 7, 10, 6, 7, 8, 8, 11, 9, 8, 8, 9, 3, 11, 8, 7, 10, 5, 4, 5,
    10, 4, 8, 7, 7, 4, 3, 5, 10, 5, 4, 11, 5, 6, 10, 5, 7, 10, 11, 7, 
   5, 4, 7, 9, 5, 4, 5, 7, 5, 10, 11, 10, 5, 5, 7, 4, 7, 5, 4, 3, 4, 
   7, 10, 4, 8, 2, 7, 4, 4, 8, 4, 8, 8, 3, 9, 7, 7, 7, 7, 10, 5, 9, 8,
    11, 6, 8, 7, 7, 8, 3, 6, 7, 6, 7, 8, 8, 7, 2, 3, 4, 9, 7, 7, 6, 4,
    10, 6, 4, 8, 10, 7, 3, 10, 6, 6, 6, 5, 9, 7, 11, 6, 7, 1, 4, 8, 8,
    5, 5, 2, 8, 6, 7, 7, 5, 5, 6, 5, 6, 2, 12, 7, 6, 5, 7, 5, 9, 6, 4,
    8, 3, 8, 3, 7, 6, 3, 10, 6, 3, 6, 7, 8, 7, 3, 7, 4, 5, 4, 10, 8, 
   7, 10, 10, 7, 5, 9, 5, 4, 6, 4, 6, 11, 7, 9, 9, 6, 7, 4, 6, 7, 5, 
   5, 5, 5, 6, 4, 8, 4, 8, 7, 6, 4, 4, 5, 7, 8, 4, 2, 1, 5, 9, 2, 6, 
   11, 5, 4, 5, 12, 7, 7, 7, 0, 3, 7, 4, 6, 11, 5, 3, 5, 8, 4, 5, 2, 
   3, 8, 8, 6, 6, 1, 9, 4, 3, 8, 5, 4, 4, 5, 4, 5, 6, 6, 5, 7, 6, 1, 
   7, 3, 9, 8, 4, 8, 2, 9, 7, 13, 5, 5, 2, 8, 12, 8, 5, 5, 2, 3, 4, 9,
    11, 5, 6, 10, 5, 5, 5, 6, 5, 4, 3, 8, 8, 12, 7, 7, 8, 11, 2, 9, 
   10, 5, 4, 2, 8, 9, 6, 8, 7, 6, 1, 8, 7, 9, 10, 5, 10]}, 
 PoissonDistribution@\[Mu]]
Variance@%

PoissonDistribution[6.31507]
6.31507
POSTED BY: Parker Robb

Hello Parker,

Indeed, I'm not sure what gave such a weird output. It will be corrected.

Thank you for noticing.

POSTED BY: Marc Vicuna
Posted 2 years ago

The values used in Lesson 14 Exercise 2 do not match the values given in the question. For that question I get the following input and output:

Probability[x >= 25, 
  x \[Distributed] PoissonDistribution[0.06 500]] // N
Probability[x >= 25, x \[Distributed] BinomialDistribution[500, 0.06]]

0.842758
0.850619

Is this correct?

POSTED BY: Parker Robb

On exercise 5 of Exercises7 the solution gives P(17<x<19), which I believe is the probability of coming in between the 17 and 19 hour. I calculated the answer as P(x<19|x>=17).

On exercise 4 of Exercises 8 the solution gives P(A)^3 which I believe is the probability of being absent two days for three months in a row. I calculated the answer as P(x+y+z>=2) where each is distributed by 1/(ex!).

On exercises 13, I believe that more than one solution excludes equality when it should include it.

Am I wrong in these?

Posted 2 years ago

Hello Juan,

All correct! Thank you for noticing.

  1. Indeed, your answer makes more sense given the formulation. It will be corrected.
  2. Indeed, the formulation is too ambiguous. It will be corrected to make clearer what is needed.
  3. Yes, for exercises 2 and 4, some equalities are missing from the final solution.

Thank you!

POSTED BY: Updating Name

Problem 5 is stated as: "Consider a zoo visitor who arrives less than three hours before closing. How likely is that person to be able to stay for more than an hour?" What we have calculated by

Probability[17 < x < 19, x \[Distributed] zooDist] 

is the probability that a visitor, among all the visitors of the day, will arrive between 17 and 19. Based on the way the problem is stated, wouldn't it be more correct to calculate the probability that a visitor arriving between 17 and 20 (less than three hours before closing) actually arrives between 17 and 19? Based on this reasoning, shouldn't the correct approach be to calculate

Probability[17 < x < 19, x \[Distributed] zooDist]/Probability[17 < x < 20, x \[Distributed] zooDist]

?

POSTED BY: Joseph Smith

Hello Joseph,

This is indeed also correct, and is an equivalent formulation of:

Probability[x < 19 \[Conditioned] x >= 17, x \[Distributed] dist]

As mentioned previously in this post.

POSTED BY: Marc Vicuna
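
The equivalence of the two formulations can be checked on a stand-in distribution. Since zooDist is not reproduced in this thread, the sketch below assumes, hypothetically, arrivals spread uniformly between 9 and 20; the two forms agree because no one arrives after closing at 20.

```wolfram
(* Hypothetical stand-in for zooDist *)
dist = UniformDistribution[{9, 20}];

(* Ratio form: arrival in (17, 19) among arrivals in (17, 20) *)
ratio = Probability[17 < x < 19, x \[Distributed] dist]/
   Probability[17 < x < 20, x \[Distributed] dist]
(* 2/3 *)

(* Conditional form *)
conditioned = Probability[x < 19 \[Conditioned] x >= 17, x \[Distributed] dist]
(* 2/3 *)
```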

I agree with you on both Robb.

Hello Parker,

Indeed, it seems the question and solution numbers don't correspond. This will be corrected.

Thank you for noticing.

POSTED BY: Marc Vicuna

Same question here. I obtained the same solution as shown in this post.

POSTED BY: Joseph Smith

Marc: I don't quite understand your statement "as soon as I have three balls the game is over". I did assume that once "you" draws a red ball the game is over. I reproduced the terms in your sum but if you carry the tree to the end you get some cases where there are 3 balls left and "you" have not yet drawn a red ball. I work this out in the attached notebook

POSTED BY: Joseph Smith

Hello Joseph,

It seems the solution for this problem is wrong and too complicated. To avoid this, let's use the power of the Wolfram Language.

Here is the new solution.

Let's compute all arrangements of balls with your friend, where red balls are negative and black balls are positive. We are only interested in the balls that are received by the player, so let's take the odd columns (odd rounds).

possibilities = Permutations[{-1, -2, 1, 2, 3, 4}][[All, {1, 3, 5}]];

Now that we have the balls received in all possibilities, just count the possibilities with at least one negative number and divide by the total number of possibilities.

Length@Select[possibilities, AnyTrue[#, Negative] &]/
     Length@possibilities

And we get 4/5, as mentioned by William Weller.

POSTED BY: Marc Vicuna
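
As a cross-check of the 4/5, the complement can be counted by hand: the player receives no red ball exactly when the three balls in the odd positions are all chosen from the four black ones. This is a sketch of that counting argument, under the same setup as the permutation code above (2 red, 4 black, player receives positions 1, 3 and 5).

```wolfram
(* P(at least one red) = 1 - P(all three of the player's balls are black) *)
1 - Binomial[4, 3]/Binomial[6, 3]   (* 4/5 *)

(* Same thing as a product of successive draws without replacement *)
1 - (4/6) (3/5) (2/4)               (* 4/5 *)
```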

Thanks for your response. I could not help thinking that there must be an easier way to solve this problem than by drawing a complex tree!

POSTED BY: Joseph Smith

WOW! I worked through your solution. It truly was an elegant tour de force!

POSTED BY: Joseph Smith

Hi Marc,

I am reviewing "Mock Exam" notebook and have found that the Question 31 solution is not correct. Given U, the transformed distribution with U-0.3 must be stated as:

Plot3D[{PDF[DirichletDistribution[{5, 3, 2}], {x, y}], 
  PDF[TransformedDistribution[{a - 0.3, 
     b - 0.3}, {a, b} \[Distributed] 
     DirichletDistribution[{5, 3, 2}]], {x, y}]}, {x, 0, 1}, {y, 0, 
  1}, Filling -> Bottom, PlotRange -> All, 
 PlotLegends -> {"U", "U-0.3"}]
POSTED BY: Hee-Young Shin

Hello Hee-Young,

Yes, it seems obvious that I forgot to adjust the question phrasing to U+0.3. Thank you for noticing, it will be corrected.

POSTED BY: Marc Vicuna

Hi Marc, It seems there are problems in both Quiz 6 #1 and #6, both of which are calculating the probability by normal approximation to binomial distribution. Please check these two questions in the "framework".

POSTED BY: Hee-Young Shin

Hello Hee-Young,

For the first question, you seem to be thinking Probability and NProbability are entirely equivalent, but they are not. NProbability is and will forever be an approximation, and sometimes a bad approximation. For this course, I suggest sticking to Probability, or even N@Probability or Probability[...] //N, as this is more likely to be a better approximation. Question 1 is one of those examples, with the binomial distribution.

As for question 6, I'm not sure what the issue is. You got the right answer with your normal approximation.

POSTED BY: Marc Vicuna

Thanks for this explanation, Marc. I have tried again using N@Probability. The 1st question still indicates that my response is wrong. As for the 6th question, it seems that there is no correct answer among the multiple choices. I suspect that there is either a technical glitch or wrong syntax in the "Framework". Please double-check. Best wishes,

POSTED BY: Hee-Young Shin

I found the issue! Thank you for noticing, it will be corrected.

POSTED BY: Marc Vicuna

Can you clarify why you chose x>=440-0.5 as the argument of the Probability statement in your analysis of Exercise 6? I certainly see why 440 is chosen, but I don't get the -0.5.

Response

The question says to use the normal approximation if appropriate. Read or listen to Lesson 24 again to see how to apply that approximation. Here 440 is included, so we include it in the probability.

I will look this over.

POSTED BY: Joseph Smith

Hello Joseph,

The question says to use the normal approximation if appropriate. Read or listen to Lesson 24 again to see how to apply that approximation. Here 440 is included, so we include it in the probability.

POSTED BY: Marc Vicuna
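
The role of the -0.5 is easier to see side by side. The sketch below uses a hypothetical binomial (1000 trials, p = 0.42), not the numbers from Exercise 6: a discrete count of "at least 440" occupies the interval starting at 439.5 on the continuous scale, so the corrected normal probability should land closer to the exact value than the uncorrected one.

```wolfram
(* Hypothetical binomial, chosen so that 440 is a plausible count *)
exact = BinomialDistribution[1000, 0.42];
approx = NormalDistribution[Mean[exact], StandardDeviation[exact]];

N@Probability[x >= 440, x \[Distributed] exact]        (* exact discrete probability *)
Probability[x >= 440 - 0.5, x \[Distributed] approx]   (* normal, with continuity correction *)
Probability[x >= 440, x \[Distributed] approx]         (* normal, without it, for comparison *)
```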

Hi Marc, Can you quickly check whether there is any technical error in your courseware ("framework")?

I am sorry. I figured out my mistake. The correct syntax is:

NProbability[x > 40, 
 x \[Distributed] BinomialDistribution[150, 0.2633]]
Attachments:
POSTED BY: Hee-Young Shin
Posted 2 years ago

The Daily Study Group "Introduction to Probability" Quizzes and Level 1 Certifications have deadlines within the next two weeks. Next week will be Spring Break; personally, I have been and will be very short of time for the next few weeks.

Alternatively, I suppose we can also complete the course on the Wolfram-U page and request the Level 1 exam upon completion?

POSTED BY: Dave Middleton

Thanks for making the point about spring break, @Dave Middleton. We will extend both deadlines by a week, so quizzes should be done by March 24, and the exam by March 31. Yes, you can earn both completion and Level 1 independently in the interactive course, but course completion will require that you watch the video lessons within the framework. We run custom data pulls in order to verify completion with the Study Group.

POSTED BY: Jamie Peterson
Posted 2 years ago

Thank you Jamie. I plan to use the Study Group materials if time permits.

POSTED BY: Dave Middleton

Hi;

When you obtain the probability, it is easy to understand its exact meaning. However, the Expectation and RandomVariate are not so easy to understand. When would you use these two and what exactly are they telling me?

POSTED BY: Mitchell Sandlin

Hello Mitchell,

Let's go back to basics.

A probability distribution is a set of values associated with a set of probabilities. For a die, the values are {1,2,3,4,5,6} and the probabilities are {1/6,1/6,1/6,1/6,1/6,1/6}.

What is your probability? The likelihood of getting a certain value. Now, RandomVariate is essentially a simulation. It randomly gives you back a value according to its probability under the distribution. You use it to simulate the distribution and to get an intuition of what that distribution is all about, or to create artificial data if needed.

Expectation is the following formula: the sum or integral, over all values, of the probability times a mathematical expression of the value. It turns out this formula is really useful, as demonstrated by the lesson on expectation. It basically expresses the mean or average of a given mathematical expression, where the expression has the values of the distribution as its variable. The expectation of the value itself is the mean, or average, of the distribution. But as you can see in the lesson, the Expectation formula can be used for many other measures. The interpretation depends on the expression used. You use it for many measures of a distribution, to find the expected, predictable value that will come, or at least something close to it.

POSTED BY: Marc Vicuna
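
The die example above translates into a few lines, shown here as a minimal sketch of how the three functions relate:

```wolfram
die = DiscreteUniformDistribution[{1, 6}];

(* Probability: how likely is one particular value? *)
Probability[x == 3, x \[Distributed] die]         (* 1/6 *)

(* RandomVariate: simulate ten rolls; the result differs on every run *)
RandomVariate[die, 10]

(* Expectation of the value itself: the mean *)
Expectation[x, x \[Distributed] die]              (* 7/2 *)

(* Expectation of another expression of the value: here the variance *)
Expectation[(x - 7/2)^2, x \[Distributed] die]    (* 35/12 *)
```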
Posted 2 years ago

I have been puzzled about conditional probability, because is the situation usually as clear as that? The probability of an event occurring, given that another event has already occurred. It seems to me that in real world situations there can be all kinds of complex correlations between the events and the external world, not captured by the basic conditional probability formula.

POSTED BY: Anders Lindman
POSTED BY: Marc Vicuna
Posted 2 years ago

Interesting, thanks.

POSTED BY: Anders Lindman

In yesterday's session about JointDistributions, the example of the Dirichlet distribution (about cutting a rope, at about 4:30 in the framework video, https://www.wolframcloud.com/obj/online-courses/introduction-to-probability/joint-distributions.html) is completely different from the example in the notebook ("Lesson 22 - Join Distributions.nb", slide 8, as well as in the framework Lesson notebook) which is about mileage of a car.

Isn't the rope example more typical of the Dirichlet distribution than car mileages?

Hello Hakan,

Indeed, the version on the framework seems to be the wrong one! Thank you for telling us, this will be corrected shortly.

POSTED BY: Marc Vicuna

I would like to suggest another clarification to section 3.

POSTED BY: Joseph Smith
POSTED BY: Marc Vicuna

Thanks for your response!

POSTED BY: Joseph Smith
POSTED BY: Joseph Smith
POSTED BY: Marc Vicuna

For Section 2, Exercise 3, how do all the paths to (4,2) add up to 7 segments? It looks like 6 to me.

Attachments:
POSTED BY: Joseph Smith

Hello Joseph,

Indeed, this is an error, as noted in the post by Juan Ortiz Navarro. Here's my answer:

As for problem 3, this is also an error, but the answer should be Binomial[6,2] or Multinomial[4,2], yes. Thank you for informing us!

For the solution to make sense, just change the question to (0,0) to (4,3).
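
A quick check of both versions of the question, assuming (as is usual for this kind of exercise, though the thread does not restate it) that a path only takes unit steps to the right or upward:

```wolfram
(* Paths from (0,0) to (4,2): 4 right steps and 2 up steps, 6 segments *)
pathsTo42 = {Binomial[6, 2], Multinomial[4, 2]}

(* Brute-force check: distinct orderings of the 6 steps *)
bruteForce42 = Length[Permutations[{"R", "R", "R", "R", "U", "U"}]]

(* Paths from (0,0) to (4,3): 4 right and 3 up steps, 7 segments *)
pathsTo43 = {Binomial[7, 3], Multinomial[4, 3]}
```

The first two results are 15 and the last is 35, so a solution that mentions 7 segments matches the endpoint (4,3) rather than (4,2).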

POSTED BY: Marc Vicuna
Posted 2 years ago

Hi! In the combinatorics at the start of the course, we saw how to count [No Replacement + Order Not Relevant] --> Binomial. What would be the approach to count [Replacement + Order Not Relevant]? Maybe we saw that, but I can't find it. Thanks!

POSTED BY: J.Edi Gran

Hello J,

For combinatorics, it's easier to think of it as a tree of decisions rather than a matrix. Some cases may not correspond to any real problem. The binomial is [Order not relevant + 1 group], and we usually assume without replacement. Why? Because a set with multiple copies of the same element is still the same set! Consider this:

A set is a collection of non-repeated elements without order.

A multiset is a collection of elements without order.

Therefore, the approach to count (Replacement + Order Not Relevant) requires the definition of a multiset. Multisets can be counted using Binomial[n+k-1, n], where n is the number of elements chosen and k is the number of distinct elements to choose from. See this for more explanations.
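
A small brute-force check of the multiset count. The three element types and the choice of two elements are illustrative assumptions, and the names nChosen and kTypes are introduced only for this sketch:

```wolfram
(* Illustrative assumption: choose 2 elements from 3 types, with replacement *)
types = {"a", "b", "c"};
nChosen = 2;
kTypes = Length[types];

(* Enumerate ordered picks, then forget the order by sorting and removing duplicates *)
multisets = Union[Sort /@ Tuples[types, nChosen]]

(* The brute-force count next to the formula *)
counts = {Length[multisets], Binomial[nChosen + kTypes - 1, nChosen]}
```

Both counts are 6: the multisets are aa, ab, ac, bb, bc and cc.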

POSTED BY: Marc Vicuna
Posted 2 years ago

Thanks Marc. I do believe there may be some impossible (in the sense of not meaningful) cases. I am still trying to figure out what makes sense to ask and what doesn't. But OK, here I go with a clarification of my question. I still believe this is applicable in real life, but I may be wrong. I tried to depict the situation with an "entry level skills" notebook, so that my question is easier to assess.

POSTED BY: J.Edi Gran
POSTED BY: Marc Vicuna
Posted 2 years ago

Thanks Marc, thanks a lot!

POSTED BY: J.Edi Gran
POSTED BY: Joseph Smith

Hello Joseph,

Indeed, on problems 1 and 3, there seems to be text from another part of the course. This is an error, as mentioned by Juan Ortiz Navarro, posted 2 days ago: "Regarding exercises-02.nb The solution to problem 1 (...)"

So yes the answer is 25!/15!.
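
The exercise itself is not quoted in this thread. Assuming it asks for ordered selections of 10 items out of 25 without replacement, 25!/15! is the product 25*24*...*16, which can be checked as follows (the 5-element case is an added brute-force illustration):

```wolfram
(* 25!/15! written as a falling factorial: 25*24*...*16 *)
sameValue = 25!/15! == FactorialPower[25, 10]

(* Brute-force check of the same pattern on a small case:
   ordered selections of 2 items out of 5 *)
smallCase = {Length[Permutations[Range[5], {2}]], 5!/3!}
```

The first line returns True, and both entries of the small case are 20.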

POSTED BY: Marc Vicuna

Thanks!

POSTED BY: Joseph Smith
POSTED BY: Marc Vicuna
Posted 2 years ago

Thanks @Juan Ortiz Navarro, I had just calculated the same solution using Bayes theorem. As the solution of "Exercise 4" in the "exercises-06" notebook is different from mine, I consulted this community page.

enter image description here

P( S | E) =2 / (1+1/0.3) = 0.46

Personally, I find the original wording and this solution more interesting..

POSTED BY: Dave Middleton

Hello all,

For a bit of context on the exponential family of distributions, here are some resources:

A well made short video series introducing the subject, by Mutual Information.

An introduction article to get familiar with this, from Berkeley EECS.

A short textbook of Statistical Theory focused on the exponential family, from the University of Oxford.

A paper on the link with machine learning, from Princeton University.

The exponential family is usually covered in any course of Statistical Theory. Go satisfy your curiosity!

POSTED BY: Marc Vicuna
Posted 2 years ago

Mind blowing! (For me, at least; for sure there are more advanced users for whom this may already be natural.) But now I can appreciate the flexibility (and why not beauty) of the exponential when "connected" to other "devices" to cast a spectrum of other functions. I never expected that to be possible.

POSTED BY: J.Edi Gran
Posted 2 years ago

Lecture 1: How many different teams are possible given that the team must include 3 Swiss (original group is 4) and 2 Ethiopians (original group is 5)? You give the answer Binomial[4,2]*Binomial[5,2]. Is this the correct answer? How did you derive it? Is my answer Binomial[4,3]*Binomial[5,2] not correct?

POSTED BY: Alex Kuznetsov
POSTED BY: Marc Vicuna

POSTED BY: Zbigniew Kabala
POSTED BY: Marc Vicuna

It's funny that commenting on typos I made a typo myself by inexplicably replacing a multiplication sign with a plus sign. This notwithstanding, the first is also a mistake. In your documentation, the calculation reads:

Binomial[4, 2] * Binomial[5, 3], which is equal to 60,

whereas it should read

Binomial[4, 3] * Binomial[5, 2], which is equal to 40

NOTE: I edited my original post and fixed my typo and its consequence.
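
The corrected count can be confirmed by listing the teams explicitly. This is a minimal sketch in which the athletes are simply labeled by integers (an illustrative choice):

```wolfram
(* Label the athletes: 4 Swiss and 5 Ethiopian (labels are illustrative) *)
swiss = Range[4];
ethiopian = Range[5];

(* Every team pairs a 3-subset of the Swiss with a 2-subset of the Ethiopians *)
teams = Tuples[{Subsets[swiss, {3}], Subsets[ethiopian, {2}]}];

teamCounts = {Length[teams], Binomial[4, 3]*Binomial[5, 2], Binomial[4, 2]*Binomial[5, 3]}
```

The enumeration gives 40 teams, matching Binomial[4, 3]*Binomial[5, 2], while the expression in the documentation gives 60.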

POSTED BY: Zbigniew Kabala
Posted 2 years ago

Hello. While reading the second exercise, which states "While surfing the web, you encounter ad A 7 times and ads B and C 3 times each. How many arrangements are possible?", I interpreted it as having a sequence such as AAAAAAABBBCCC (or any other with seven As, three Bs and three Cs, as the precise sequence is not established). So I understood it as n = 3 possible values (A, B, C), with reuse of values allowed, and k = 13 trials, as the sequence, no matter the order, is 13 positions long.

So my solution before reading the answer was n^k -> 3^13 possible arrangements. But when reading the answer I found "multinomial". Which part of the question suggests that the multinomial approach is the right one? Thanks

POSTED BY: J.Edi Gran

Hello J,

That's a pretty fun question actually. Let's go through it.

You encounter A 7 times, B 3 times and C 3 times, 13 in total. How could this have happened? You have your limited resource of 7 As, 3 Bs and 3 Cs, so the number of orders is going to be 13!, not 3^13, because 3^13 would also count sequences in which you did not encounter the given number of each ad.

But, can you distinguish the difference between A and A? No, the elements are not distinguishable. So you need to divide by 7! for all the orders of A, 3! for B, and 3! for C. You get 13!/(7!3!3!) which is the Multinomial[7,3,3].
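
The two counts can be compared directly. The last line is an added brute-force check on a smaller, assumed case with 2 As and 1 B:

```wolfram
(* Arrangements of 7 As, 3 Bs and 3 Cs *)
arrangements = {Multinomial[7, 3, 3], 13!/(7! 3! 3!)}

(* Sequences of length 13 over 3 ad types, with no constraint on the counts *)
unconstrained = 3^13

(* Small brute-force check of the same idea: 2 As and 1 B *)
smallCheck = {Length[Permutations[{"A", "A", "B"}]], Multinomial[2, 1]}
```

The multinomial count is 34320, far below the 1594323 unconstrained sequences, because most of those sequences do not contain exactly 7 As, 3 Bs and 3 Cs.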

POSTED BY: Marc Vicuna
Posted 2 years ago

Oh, that's interesting, as the assumption for the multinomial is that all possible ads are "exhausted" at the last observation. My intuition was that, while surfing the internet, I used a finite amount of time, and during that undetermined time window I saw only 13 ads; if I kept browsing I might well see more. Say, if I doubled the browsing time, I might end up seeing 26 or 30 ads in total. So I get your point if all possible ad instances are contained in the limited-length sample (of 13 total ads). Otherwise, if no limiting length is stated, I guess it may be fair to assume that the As, Bs or Cs are never exhausted, and browsing more means watching more ads (like the "YouTube Premium" thing that seems to never end ;) ). Thanks, Marc, for the clarification.

POSTED BY: J.Edi Gran

Hello J,

To be clear, I think your situation could be also a valid situation.

When you read the situation, you are given a specific instance of what happened: 7, 3, 3. Given that specific instance, I ask about the unknown information: the ordering. But I could have asked the question given only 13 ads and 3 ad types, or given only 3 ad types (even more general). You need to be careful about what the specific situation is and not generalize too fast.

POSTED BY: Marc Vicuna
Posted 2 years ago

Thanks, Marc. I'll keep that advice in mind!

POSTED BY: J.Edi Gran
POSTED BY: Marc Vicuna

Thanks Marc. I was confused on "neither red nor blue". I understand now that it is a conjunction.

Regarding exercises-02.nb: the solution to problem 1 seems to have taken a turn I do not understand. Should it be factorial(25)/factorial(10)?

And on problem 3, going from (0,0) to (4,2), are the paths of length 7 or 6?

So Binomial[6,2] or Multinomial[4,2]?

POSTED BY: Marc Vicuna

Thanks. I meant to write 25!/15! indeed.

Posted 2 years ago

While trying to confirm concepts in the course with a third-party reference, I found the attached. It states that the "Sample Space" for the times to failure for a certain machine is a sequence (T1...Tn). In my opinion, the sample space must contain all possible times to failure, as opposed to a certain specific sample. So I'd say the "space" would be all the real numbers, or perhaps all the real numbers below the maximum allowable age for the equipment being analyzed. Am I right that the reference is confusing a "Sample space" with a specific sample?

I'd appreciate your comments. Jorge

enter image description here

POSTED BY: J.Edi Gran

Hi J,

This is a bit complicated, so let's discuss this one thing at a time. This textbook expresses a sample space where each data point is in itself a sequence of multiple numbers. In that sense, this is a multivariate random variable. Within a single sample, or data point, there are multiple times, which are the periods of time between breakdowns. Let's say there are n such periods in each sample. Thus, the domain of any single outcome is R(>0)^n, that is, the positive reals in n dimensions (periods of time are always positive).

In this context, the sample space is the set of all possible sequences of periods between breakdowns, possibly R(>0)^n itself.

So overall, I believe you are right about your sample space, but also that you are misinterpreting their explanation of the sample space. Hopefully this explanation helped.

POSTED BY: Marc Vicuna
Posted 2 years ago

Oh!!!! Aweeeeeesomee!!! Thanks...thanks a lot... I see my mistake!

POSTED BY: J.Edi Gran
Posted 2 years ago
POSTED BY: Laising Yen
POSTED BY: Marc Vicuna

Marc,

You asked for mistakes in the documents. So, please have a look at “Lesson 2, Slide 9”: enter image description here Your equations are wrong, which means LHS is not equal to RHS. Right?

POSTED BY: Jürgen Kanz

Indeed, the LHS exponent should be 5, not 3. This will be corrected.

Thank you for noticing and informing us.

POSTED BY: Marc Vicuna
Posted 2 years ago

When is the new study group for Introduction to Probability starting? I missed last week and would rather start from the beginning without worry and rushing.

POSTED BY: Updating Name

Does anybody have the link to the course materials that they could post on the community thread?

I want to get started on the exercises but I missed the last meeting and the recording does not show the chat pane with the links.

Thanks

POSTED BY: Joseph Smith
Posted 2 years ago

thanks!

POSTED BY: Joseph Smith
Posted 2 years ago
POSTED BY: William Weller
POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: William Weller

Indeed, that is correct.

There are 6*6 combinations, 6 with doubles, 6 starting with the number 6 and 6 ending with the number 6. The three events have the occurrence (6,6) in common, so one occurrence is counted three times, thus:

(6 + 6 + 6 - 2)/(6*6)

is indeed 4/9, thus odds of 4 to 5.
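
The same count can be reproduced by enumeration, reading the event from the description above as "doubles, or at least one die showing 6":

```wolfram
(* All 36 ordered outcomes of two dice *)
outcomes = Tuples[Range[6], 2];

(* Keep doubles, or outcomes where either die shows 6 *)
favorable = Select[outcomes, #[[1]] == #[[2]] || #[[1]] == 6 || #[[2]] == 6 &];

p = Length[favorable]/Length[outcomes]

(* Odds in favor, as favorable : unfavorable *)
odds = {Numerator[p], Denominator[p] - Numerator[p]}
```

This selects 16 outcomes, giving p = 4/9 and odds of 4 to 5.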

Thank you for informing us, it will be corrected.

POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: William Weller

William:
I agree with you but I see it as which parts of the tree will give at least one red, taking into account that as soon as I have three balls the game is over. So in the first round I have a 2/6 probability of getting a red one. If I do not succeed on that, then on the second round I have a 4/6 * 2/5 probability of getting a red one. And on the third round, I have 4/6 * 3/5 * 1/2 of getting a red one and the game is over since the other player took the other three balls:
2/6 + 4/6 * 2/5 + 4/6 * 3/5 * 2/4 = 4/5
or the complement of not getting red balls, which will be all black balls: 1 - 4/6 * 3/5 * 1/2.
Does that make sense?

EDIT: I seem to have answered that half asleep, I can't even read my answer. Please disregard it.

POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: Parker Robb

Hello Parker,

This is incorrect due to the fact the probabilities change as this situation is ordered. But naming all the possibilities may not be a bad idea, this will become my new solution. Thank you for the idea.

POSTED BY: Marc Vicuna
Attachments:
POSTED BY: Joseph Smith
POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: William Weller

Hello William,

Indeed, the written answer is right but the calculation is wrong.

Thank you, it will be corrected.

POSTED BY: Marc Vicuna
POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: Updating Name
POSTED BY: Marc Vicuna

Hi;

A couple of questions regarding the Decision Tree to decide how to count specific outcomes:

Under the Count, Without Order, One Group, (n over i ) - what does (n over i) mean or what operation are we performing here?

Under Count, With Order, Remplacing - what is Remplacing?

Thanks,

Mitch Sandlin

POSTED BY: Mitchell Sandlin
POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: Updating Name
POSTED BY: Marc Vicuna
Posted 2 years ago

Would you please provide details of the proof for the inequality on Slide 12 of Lesson 10? Thanks very much.

POSTED BY: Bob Renninger
POSTED BY: Marc Vicuna
Posted 2 years ago

Thanks Marc, this is exactly what I was looking for. Sometimes a simple equation has an easy proof, and I am relieved to see that this is not such a case! Can you provide a reference to the book mentioned in the article? That seems to be very compatible with this course.

POSTED BY: Bob Renninger
POSTED BY: Marc Vicuna

In lesson 8, "Discrete Random Variables" we use the statements

children[n_] := 1/((n + 1)^3*Zeta[3]);
distChildren = 
 ProbabilityDistribution[children[n], {n, 0, Infinity, 1}]

Does the "1" in the ProbabilityDistribution function indicate that n is a discrete variable?
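
For reference, in the documented forms of ProbabilityDistribution, a range written as {n, 0, Infinity, 1} declares a discrete variable stepping by 1, whereas {n, 0, Infinity} declares a continuous one. A minimal sketch to check that the lesson's definition behaves as a discrete distribution:

```wolfram
children[n_] := 1/((n + 1)^3*Zeta[3]);
distChildren = ProbabilityDistribution[children[n], {n, 0, Infinity, 1}];

(* The probabilities are summed over the integers, not integrated, and total 1 *)
totalProbability = Sum[children[n], {n, 0, Infinity}]

(* Probability of at most one child, which should equal children[0] + children[1] *)
atMostOne = Probability[n <= 1, n \[Distributed] distChildren]
```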

POSTED BY: Joseph Smith
POSTED BY: Marc Vicuna
POSTED BY: Zbigniew Kabala
POSTED BY: Marc Vicuna
POSTED BY: Zbigniew Kabala
Posted 2 years ago
POSTED BY: J.Edi Gran
Posted 2 years ago
Attachments:
POSTED BY: J.Edi Gran

Some weird behavior when trying to use a PDF

Attachments:
POSTED BY: Jürgen Kanz
Posted 2 years ago

Oh!! Awesome! Great advice Jürgen!!..Thanks a lot!

POSTED BY: J.Edi Gran
POSTED BY: Zbigniew Kabala
POSTED BY: Mitchell Sandlin

Hello Mitchell,

No, these are not equivalent. Those are for very different contexts.

Conditioned is only used in the context of Probability to symbolize the classical P(A|B), the probability of A given B; it basically replaces the word "given".

Condition is used in pattern matching, always jointly with a pattern: it attaches a test that an expression must pass for the pattern to match (for example, when filtering the elements of a list). It is a much more core mechanism of the language.
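
A minimal side-by-side sketch of the two (the fair die and the list of integers are illustrative assumptions):

```wolfram
(* Conditioned inside Probability: P(x > 4 | x > 2) for a fair die *)
die = DiscreteUniformDistribution[{1, 6}];
givenProbability = Probability[x > 4 \[Conditioned] x > 2, x \[Distributed] die]

(* Condition (/;) inside a pattern: keep only the elements passing the test *)
evens = Cases[Range[10], n_ /; EvenQ[n]]
```

The first expression gives 1/2 (two favorable values out of the four that satisfy x > 2), and the second gives {2, 4, 6, 8, 10}.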

POSTED BY: Marc Vicuna
Posted 2 years ago

Hi,

In the material of Lesson 4, when referring to the probability of the event "E" and the need for the Kolmogorov axioms, this appears:

However, k/M must be constant to be relevant, which is problematic

Every time I "test" the 1/6 of a perfect die, the number of k ones / M trials is not constant, certainly because random events are not constant. So I don't believe I get the core idea of why k/M must be constant or why this is "problematic". From a statistical perspective, k/M is how we (or at least I) estimate (guess) a probability. So, I'm confused.
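
The fluctuation described here can be seen in a short simulation. This is only a sketch under assumed settings (10000 rolls of a fair die, an arbitrary seed, and a helper named relativeFrequency introduced for illustration); it does not reproduce the lesson's argument:

```wolfram
(* Assumed experiment: 10000 rolls of a fair die, counting how often a 1 appears *)
SeedRandom[2023];
rolls = RandomInteger[{1, 6}, 10000];

(* Relative frequency k/M after the first M rolls *)
relativeFrequency[m_] := Count[Take[rolls, m], 1]/m;

(* k/M for growing M, to compare with the theoretical value 1/6 *)
frequencies = {#, N[relativeFrequency[#]]} & /@ {10, 100, 1000, 10000}
N[1/6]
```

The ratio k/M changes with M and with the seed, which is the sense in which it is not constant, even though it typically moves closer to 1/6 as M grows.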

POSTED BY: J.Edi Gran
POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: J.Edi Gran
Attachments:
POSTED BY: Joseph Smith
Posted 2 years ago

This doesn't dare to be an "answer", but my interpretation, as I had the same "feeling" when reviewed that slide.

As the explicit statement in the slide is {2,4,6,8,10,12}, and those were examples on the topic of "Sample spaces and Events" I saw that list as the "Event Definition" that one would use to calculate the mentioned probability. That list is actually the Event Definition against which the EvenQ function tests the full range of outputs. And I'd humbly agree the probability of an Even Sum is 1/2

enter image description here

POSTED BY: J.Edi Gran
POSTED BY: Marc Vicuna

Marc

Thanks for your response. I agree that the set of all possible outcomes should not include repeats. But is the probability of an even sum 1/2?

I'm enjoying these study group sessions!

Joe

POSTED BY: Joseph Smith

Yes! It is 1/2. Let's see how you could do it considering all the notions seen.

First, get the sample space and probabilities.

{sampleSpace, probabilities} = 
 Transpose@Tally@Flatten@Table[x + y, {x, 1, 6}, {y, 1, 6}]

Then, normalize the probabilities for a sum of 1:

probabilities = probabilities/Total[probabilities]

Finally, sum all even sums:

Sum[If[EvenQ@sampleSpace[[i]], probabilities[[i]], 0], {i, 
  Length[sampleSpace]}]

Not the simplest way, but definitely a visual way to see it:

ListPlot[Transpose@{sampleSpace, probabilities}, Filling -> Axis]

Which should give you: enter image description here
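
As a cross-check by direct enumeration (an added sketch, separate from the steps above), the 36 equally likely ordered outcomes can be counted without building the probabilities first:

```wolfram
(* All 36 ordered outcomes of two dice, reduced to their sums *)
sums = Total /@ Tuples[Range[6], 2];

(* Fraction of outcomes with an even sum *)
evenProbability = Count[sums, _?EvenQ]/Length[sums]

(* The distinct even sums form the event listed on the slide *)
evenEvent = Select[Union[sums], EvenQ]
```

This gives 1/2 for the probability and {2, 4, 6, 8, 10, 12} for the event.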

POSTED BY: Marc Vicuna
Posted 2 years ago
POSTED BY: J.Edi Gran
POSTED BY: Marc Vicuna

Hi Marc:

I did not understand what you are calculating in the UCILetter example in the Bayes section.

Can you point me in the right direction?

Thanks.

POSTED BY: Marc Vicuna
Posted 2 years ago

Hi Marc,

I am a high school math teacher. The lesson materials are wonderful to use in our classroom. There are some simulations in the notebook of day 1. I wonder whether there is any place where I can download the source code, just like the "normal convergence" you shared.

POSTED BY: Tianyi Hu
POSTED BY: Marc Vicuna
Posted 2 years ago

This is really great for learning the complete possibilities of the Wolfram Language.. Thanks for doing it so thoroughly!. You could paste a picture!.. but you didn't!. Thanks a lot!

POSTED BY: J.Edi Gran
Posted 2 years ago

Marc,

I downloaded your class notebooks from the site today.

Slide 7 of lesson 2 "Consider that the 4 Swiss and 5 Ethiopian athletes want to form a team of 5 to represent them in competition. How many different teams are possible given that it must include 3 Swiss and 2 Ethiopians?" Shouldn't the answer be Binomial[4,3]*Binomial[5,2]?

John

POSTED BY: John Burke
Posted 2 years ago

Same question here, lol. Should the answer be 40?

POSTED BY: Mr. Khushu
POSTED BY: Marc Vicuna

I was told by one of the Q&A moderators that the session recording will be posted. How can we access it to review it?

POSTED BY: Amin Cheikhi
POSTED BY: Jamie Peterson
POSTED BY: Peter Burbery
POSTED BY: Marc Vicuna

Reminder that our upcoming Daily Study Group provides a preview of the new interactive course, Introduction to Probability. The Study Group meets daily over two weeks, Monday through Friday, for an hour online each day, starting Monday. Take advantage of this opportunity to prepare for probability and statistics related coursework and research in natural science, engineering, finance, medicine, data science and other fields! You can sign up here.

POSTED BY: Jamie Peterson
POSTED BY: Marc Vicuna

This study group will be based on the upcoming Introduction to Probability course on Wolfram U.

Marc Vicuna is the instructor for the study group as well as the Wolfram U course and is an outstanding young teacher and data scientist.

I strongly recommend you to join the study group and immerse yourself in probabilistic thinking for two weeks!

POSTED BY: Devendra Kapadia