[WSG23] Daily Study Group: Introduction to Statistics

A Wolfram U Daily Study Group on "Introduction to Statistics" begins on April 3, 2023.

Join a cohort of fellow statistics enthusiasts to learn about collecting, describing, analyzing and interpreting data and trends in science, industry and society. Learn about techniques for data visualization and descriptive statistics, methods for calculating confidence intervals and tools for hypothesis testing from video lessons created by veteran instructor and developer David Withoff. Participate in live Q&A and review your understanding through interactive in-session polls. Complete quizzes at the end of the study group to get your certificate of program completion.

April 3-14, 2023, 11am-12pm CT (4-5pm GMT)

REGISTER HERE

Please feel free to use this thread to collaborate and share ideas, additional resources and questions with the instructors as well as with other members of the study group.

47 Replies
POSTED BY: Mitchell Sandlin
Posted 2 years ago

The 0.975 number is the probability needed for looking up z-star in tables, or for using the InverseCDF function to get z-star.

The end points of a 95% confidence interval are the ends of an interval that contains the central 95% of the probability in the sampling distribution of the underlying test statistic. Tables of the sampling distribution do not give those points directly. Instead, they give points such that some fraction of the probability lies to the left of the point, so to use those tables (or the InverseCDF function) it is necessary to work out those left-tail probabilities. For a 95% confidence interval, the end points of the central 95% leave 2.5% to the left of the left end point and 2.5% to the right of the right end point, which leaves 97.5% of the probability to the left of the right end point. So the points needed from the table (or from InverseCDF) are the point with 2.5%, or a fraction 0.025, of the probability to its left, and the point with 97.5%, or a fraction 0.975, of the probability to its left.

So the need for the 0.975 number is simply a consequence of how probability tables and the InverseCDF function work. Tables (and functions like InverseCDF) are based on probability to the left of any point, rather than points that include some probability around the center of the distribution.

The origin of the 0.975 number is also described, with illustrations, in lesson 16 "Computing Confidence Intervals".
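As a small sketch of that lookup (the variable names are only illustrative, and the standard normal distribution is assumed as the sampling distribution), the two left-tail probabilities can be passed to InverseCDF directly:

```wolfram
(* A 95% interval leaves alpha/2 = 0.025 of the probability in each tail *)
confidenceLevel = 0.95;
alpha = 1 - confidenceLevel;

(* InverseCDF takes the probability to the LEFT of the point it returns *)
zLower = InverseCDF[NormalDistribution[0, 1], alpha/2]      (* left end point, 0.025 to its left *)
zUpper = InverseCDF[NormalDistribution[0, 1], 1 - alpha/2]  (* right end point, 0.975 to its left; this is z-star *)

(* Check: the probability between the two end points is the central 95% *)
CDF[NormalDistribution[0, 1], zUpper] - CDF[NormalDistribution[0, 1], zLower]
```

The second call is the one that uses 0.975, and its result is the familiar z-star of about 1.96.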

POSTED BY: Dave Withoff

David -

RE: Resource for Conditional Probability Problems

Your remark during the Statistics DSG that the internet is a great source of problem solving help was right on target. The following link goes to some discussion and a collection of solved problems in conditional probability. Working through these problems helps develop the ability to understand what is being asked in conditional probability problems and what techniques can be used to solve them.

https://www.studocu.com/ph/document/central-luzon-state-university/differential-equation-l/prob3160ch4-math/33698873

POSTED BY: Joseph Smith

The Wolfram U course Introduction to Statistics is now available with updated content (based on the feedback we received from DSG attendees) and also features a Final Exam. We would like to encourage all our DSG attendees to take the exam and claim their Level 1 certification. Good luck!

POSTED BY: Joseph Smith

Hi everyone! A message to the Statistics Study Group was sent in error at 11:00 CT today. The message reminds you to take the course quizzes for your certificate of completion by Friday, April 28, which is accurate. It also says the exam is available in the framework, which is not accurate. We continue our testing, and the exam will be deployed in the next day or two. We have to ask for your continued patience regarding the exam. We will post here, on Community, when the exam is available.

Keep in mind there is no deadline for taking the exam. Once that is part of the framework, the Level 1 certificate will be available.

POSTED BY: Jamie Peterson
Posted 3 years ago

Hi Jamie,

I notice that the final exam is now active in the course framework. Since an official announcement has not been made, I am thinking that the exam is not quite ready. In any case, I would like to privately report an error that I found, without disclosing anything about the final to the group. Please let me know where I can send an email about that.

Regards,

Bob

POSTED BY: Bob Renninger

Hi @Bob Renninger. Yes, the exam has been added to the framework! Please send bug reports to wolfram-u@wolfram.com. Thank you!

POSTED BY: Jamie Peterson
Posted 3 years ago

Thanks, I have reported the issues.

POSTED BY: Bob Renninger

Dear Jose and Arben,

I suspect that there are two errors in Quiz 6.

Problem 8: There is no correct answer. Choice D) could be correct, but only if it is modified to read "Approximately normal and with a standard error of 80g."

Problem 10: There is no correct answer. Even if we change the order of the two samples in MeanDifferenceCI, none of the choices is correct.

Please see the attached notebook.

Attachments:
POSTED BY: Hee-Young Shin

Hi Hee-Young—thanks for your comments and notebook.

Re: Question 10, I've looked at your provided notebook, and the answer you calculate is listed among the choices I see on the deployed quiz (granted, rounded to the nearest .01, but we'll call that good enough!). Do you see something different from the following for your answer choices?

As for Question 8, I believe you are correct and will get that updated. Thanks again for pointing it out!

POSTED BY: Arben Kalziqi
Posted 3 years ago

Thank you Arben for your time.

POSTED BY: José Dordá
POSTED BY: Arben Kalziqi
Posted 3 years ago

About quizzes:

While going through the quizzes, it seems that there is no correct option (answer) for Quiz 6, Question 10. I tried with Mathematica and also with specialized statistics software, and none of the options corresponds to the correct answer for the data supplied.

POSTED BY: José Dordá

Hi José,

I've just checked with the quiz author, and we are able to get the correct answer as listed in the choices. While we're not sure of what answer you calculated, they suggest that it's possible to get the reverse of the correct answer if you swap the two provided samples for one another in the argument of the relevant function.
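To illustrate the effect of swapping the samples, here is a minimal sketch with made-up numbers (these are not the quiz data). It assumes MeanDifferenceCI is loaded from the HypothesisTesting` package and is called with its default options:

```wolfram
(* MeanDifferenceCI lives in the HypothesisTesting` package *)
Needs["HypothesisTesting`"]

(* Hypothetical samples, for illustration only *)
sampleA = {5.1, 4.9, 5.6, 5.8, 6.0, 5.3};
sampleB = {4.2, 4.8, 4.5, 5.0, 4.1, 4.7};

(* Interval for mean(A) - mean(B) *)
ciAB = MeanDifferenceCI[sampleA, sampleB]

(* Interval for mean(B) - mean(A): same magnitudes, signs flipped, end points exchanged *)
ciBA = MeanDifferenceCI[sampleB, sampleA]
```

If the interval you computed looks like the listed answer with the signs reversed, the argument order is the likely cause.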

Please let us know whether this helps, and feel free to provide more information if it doesn't.

Arben

POSTED BY: Arben Kalziqi
Posted 3 years ago

Quiz 15, Problem 4: What is the expected value of the mean from a sampling distribution of sample size n, drawn from a normally distributed population with standard deviation σ? Going through the possible answers provided, the answer that is shown as correct depends on the sample problem in the video for nb 44. The central limit theorem governs the situation in this problem, and it says the expected value of the mean from a sampling distribution equals the mean (µ) of the population. What you offer is the numerical value of µ from the specific situation in the sampling problem.
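A quick way to see that the general answer is µ rather than a particular number is sketched below. The symbols mu, sigma and n are assumed to have no assigned values, and the numbers in the simulation are arbitrary choices, not the values from the lesson:

```wolfram
(* For a normal population, the mean of n draws is normally distributed
   with mean mu and standard deviation sigma/Sqrt[n] *)
samplingDist = NormalDistribution[mu, sigma/Sqrt[n]];
Mean[samplingDist]  (* evaluates to the symbol mu, whatever n and sigma are *)

(* Simulation check with hypothetical values: average many sample means *)
SeedRandom[42];
With[{mu0 = 50, sigma0 = 8, n0 = 25, trials = 10000},
 Mean[Table[
   Mean[RandomVariate[NormalDistribution[mu0, sigma0], n0]],
   {trials}]]]
```

The simulated average should land close to the chosen population mean, and changing n0 or sigma0 changes only how tightly the sample means cluster around it.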

POSTED BY: Tom Ogilvy

Hi Tom,

Thanks for bringing this to our attention. We're rolling out a fix; you may have to clear your cookies/cache to see it, but the answer will be updated.

POSTED BY: Arben Kalziqi

Hi;

When I try to request my final exam from the course framework, nothing happens. I get a wait indicator that never produces a final exam.

POSTED BY: Mitchell Sandlin

Hi @Mitchell Sandlin, the exam has not yet been added to the course framework. We will send an email notification to the Study Group participants when this is available next week.

POSTED BY: Jamie Peterson

Hi;

I am attempting to weight temperature data to use in creating a normal distribution with the EstimatedDistribution function (see the attached notebook). However, all I get is a series of messages, and the evaluation fails to produce a normal distribution. Please tell me what I am doing incorrectly.

Thanks,

Mitch Sandlin

Attachments:
POSTED BY: Mitchell Sandlin
Posted 3 years ago

Take a look at

PDF[NormalDistribution[], #] & /@ wdAllFlat

The temperature values are far from 0 which is the mean used by NormalDistribution[], so the probability is tiny.
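A minimal sketch of the difference, using made-up temperatures in place of wdAllFlat (the real data are in the attached notebook) and assuming the weights were meant to come from a normal density:

```wolfram
(* Hypothetical temperature values, for illustration only *)
temps = {68.2, 71.5, 74.1, 69.8, 72.3};

(* NormalDistribution[] has mean 0 and standard deviation 1,
   so the density this far from 0 is vanishingly small *)
PDF[NormalDistribution[], #] & /@ temps

(* A normal distribution centered on the data gives usable densities *)
dist = NormalDistribution[Mean[temps], StandardDeviation[temps]];
PDF[dist, #] & /@ temps
```

Comparing the two lists shows why weights taken from the standard normal density are effectively zero for data on this scale.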

POSTED BY: Rohit Namjoshi
Attachments:
POSTED BY: Mitchell Sandlin
Posted 3 years ago
POSTED BY: Dave Withoff

I have been experimenting with TTest and I am observing puzzling behavior.

As the attached notebook shows, the SignificanceLevel option does not seem to change the p-value for the sample dataset used in Lesson 27.

The attached notebook also shows experiments where the p-value from TTest does not seem to decrease as the standard deviation of the normally distributed data decreases.

Perhaps I am making a mistake in how I am using the TTest function.

Attachments:
POSTED BY: Joseph Smith
Posted 3 years ago
POSTED BY: Dave Withoff

David,

Thanks for your response. I will need to digest this carefully but I really appreciate your help.

Joe Smith

POSTED BY: Joseph Smith

Thanks again. Some additional experiments with data where the mean moves away from 1000 illustrate the interaction between PValue, the significance level, and the conclusion of the TTest with respect to rejection of the null hypothesis.
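For anyone following along, here is a minimal sketch of that interaction. It uses simulated data rather than the Lesson 27 dataset, and the mean of 1010, standard deviation of 20 and sample size of 30 are arbitrary choices:

```wolfram
(* Hypothetical sample whose true mean is a little above the hypothesized 1000 *)
SeedRandom[1];
data = RandomVariate[NormalDistribution[1010, 20], 30];

(* The p-value is computed from the data and the hypothesized mean only *)
TTest[data, 1000, "PValue"]
TTest[data, 1000, "PValue", SignificanceLevel -> 0.01]

(* SignificanceLevel is the cutoff the p-value is compared against,
   so it can change the reported conclusion *)
TTest[data, 1000, "ShortTestConclusion", SignificanceLevel -> 0.05]
TTest[data, 1000, "ShortTestConclusion", SignificanceLevel -> 0.01]
```

The two p-values should be identical. The conclusions differ only when the p-value falls between the two cutoffs.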

Attachments:
POSTED BY: Joseph Smith

In the lecture today on the Multiple Testing Problem, a website was mentioned of someone who made it his hobby to collect false positives, or "spurious correlations", e.g., a time series of people drowning in swimming pools over successive years that appears to be very similar to the time series of movies coming out starring Nicolas Cage. The website shows many more funny coincidences.

This is the website: https://www.tylervigen.com/spurious-correlations

There's a book as well by the same author: https://www.amazon.com/Spurious-Correlations-Tyler-Vigen/dp/0316339431

Please post the link to the course framework on this community site. The chat pane does not appear in the recording of the course so if I miss that lecture I won't see the link. Thanks!

POSTED BY: Joseph Smith

Unfortunately the course framework is not ready for release yet. We are only sharing a beta version with our study group attendees. We can include the link in our reminder emails, so you can get to it, even if you miss the live session.

Thanks. Understood.

POSTED BY: Joseph Smith
Attachments:
POSTED BY: Mitchell Sandlin
Posted 3 years ago
POSTED BY: lara wag
Posted 3 years ago

In the notebook "11.NumericalSummariesOfData.nb" there are images of this kind: [image: picture taken from notebook]. You animated these diagrams in the accompanying video, and I wondered how you did it. Unfortunately, there are only pictures in the notebook (which makes sense, since you have explanations in the notebook that are "on the soundtrack" in the video). Even if it is a little off-topic, would you be so kind as to share the code for this stunning animation with us?

POSTED BY: lara wag
Posted 3 years ago
POSTED BY: Dave Withoff
Posted 3 years ago

Dear Mr. Withoff,

Thank you very much for your answer.

You are absolutely right, the code does distract a lot from the actual topic. There is also a separate course on the topic of "animations" here on WolframU.

Still, thanks for sharing. I found your method of moving the five points or the single point via the Join function (pts = ...) very enlightening. I also would not have expected that you could assemble the final image using only Epilog.
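For other readers, here is a rough sketch of the general idea. This is not the code shared in this thread: the data values, the choice of mean and median markers, and all styling are my own assumptions. It only shows how Join and Epilog can be combined to animate a point moving among fixed points:

```wolfram
(* Hypothetical fixed data points on a number line *)
fixed = {1, 2, 4, 7, 9};

Animate[
 (* pts combines the fixed points with the one moving point x *)
 With[{pts = Join[fixed, {x}]},
  Plot[0, {t, 0, 12},
   PlotRange -> {{0, 12}, {-1, 1}}, Axes -> {True, False},
   (* everything visible apart from the base line is drawn through Epilog *)
   Epilog -> {
     PointSize[Large], Point[{#, 0} & /@ fixed],
     Red, Point[{x, 0}],
     Blue, Line[{{Mean[pts], -0.5}, {Mean[pts], 0.5}}],
     Darker[Green], Line[{{Median[pts], -0.5}, {Median[pts], 0.5}}]}]],
 {x, 0, 12}]
```

In this sketch the blue marker (mean) follows the red point continuously, while the green marker (median) moves only while the red point lies between the middle fixed values.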

POSTED BY: lara wag

Hi;

I am interested in finding the value of "x" in a probability when I know the probability in which I am interested. Intuitively one would assume that the Solve function should produce the answer by simply solving for x - see below. However, Solve, SolveValue or Reduce does not return the desired results. In some situations, I know the probability in which I am interested but I do not know the value of x that gives that probability without a lot of trial and error, so I hope someone could point me in the right direction.

Thanks, Mitch Sandlin

Solve[0.6 == Probability[x, x \[Distributed] NormalDistribution[998, 202]], x]

POSTED BY: Mitchell Sandlin

Dear Mr. Sandlin:

I am NO expert on Wolfram or Probability but the following may help you.

Attachments:

Hi Juan;

Thanks so much. It was actually the InverseCDF function that performed the correct calculation.
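For anyone with the same question, a minimal sketch of that calculation follows. It assumes the probability of interest is the left-tail probability P(X <= x) = 0.6, and the names dist and xq are only illustrative:

```wolfram
dist = NormalDistribution[998, 202];

(* InverseCDF returns the x with the given probability to its left *)
xq = InverseCDF[dist, 0.6]

(* Check: the probability of falling at or below xq should come back as 0.6 *)
Probability[x <= xq, x \[Distributed] dist]
```

For a right-tail probability p, the same approach works with 1 - p as the second argument of InverseCDF.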

Mitch Sandlin

POSTED BY: Mitchell Sandlin

How do we take the quizzes and the online course exam?

POSTED BY: Md Mohsin

We will share links to quizzes and final exam during the DSG.

Show does not work properly in Mathematica 13.2.x. The overlay in the downloaded notebook 8.UsingHistogramData.nb fails to display!

POSTED BY: Marvin Schaefer
Posted 3 years ago
POSTED BY: Rohit Namjoshi

There was an update to how temperature units are handled in 13.2. This should provide an explanation: https://reference.wolfram.com/language/tutorial/TemperatureUnits.html

Thank you, Abrita. Seeing the documentation is what I badly needed!

-Marv

POSTED BY: Marvin Schaefer

Thank you very much for diagnosing the problem, Rohit! I got a response from support about the units on the x-axis not corresponding between the ListLinePlot and the graphic, and that was also not making sense to me at the time. Your response made it all very clear and, as you indicate, I do not think that this modification in implicit conversion was documented in the prerelease documents I received for 13.2 or 13.3. It certainly is not compatible with the documentation and is possibly an easily corrected bug.

I truly appreciate your persisting in analysis of this conundrum! I’ll re-report it to Support.

-Marv Schaefer

POSTED BY: Marvin Schaefer

Reminder that the statistics group starts Monday! Author @David Withoff will join us as we kick off our latest Daily Study Group. A pre-release version of the interactive course framework will be shared with participants. Sign up here.

POSTED BY: Jamie Peterson