[WSG21] Daily study group: multiparadigm data science

111 Replies

Hello, Sorry for adding to a fairly long thread, but I am having trouble with the exercises in Section 4.3: Cluster Analysis. I cannot get the FindClusters function to produce the results expected in Exercise 4.3.2. I have even used an answer from this thread, but no matter what I try, the clusters do not match. Could something have changed in the Wolfram Language implementation that changes the expected output? Here is what I tried:

FindClusters[{"colloquialism", "clubroom", "narcissism", "ecclesiastically", "autumnal", "cornice", "mislead", "desperation", "pare", "gainer", "decorum", "embroil", "incoming", "postmark", "bolus", "strobe", "tectonic", "passive", "amused", "inured", "blowup", "meaty", "extensible", "hike", "psychosomatic", "coatroom", "adventitious", "protector", "punster", "putrescence", "under", "nipper", "slate", "antebellum", "sympathizer", "piping", "condensation", "sloping", "fundamentally", "stakeholder", "weekend", "circumferential", "slow", "invalidating", "formulated", "reheat", "authenticated", "pungency", "orderly", "cationic"}, DistanceFunction -> (Abs[StringLength@#1 - StringLength@#2] &)]

For this I get a clustering that is a bit strange: one cluster contains words of lengths 16, 9, and 15, which seem quite far apart:

{{"ecclesiastically", "protector", "circumferential"}, {"colloquialism", "psychosomatic", "fundamentally", "authenticated"}, {"clubroom", "autumnal", "incoming", "postmark", "tectonic", "coatroom", "pungency", "cationic"}, {"narcissism", "extensible", "antebellum", "formulated"}, {"cornice", "mislead", "decorum", "embroil", "passive", "punster", "sloping", "weekend", "orderly"}, {"desperation", "putrescence", "sympathizer", "stakeholder"}, {"pare", "hike", "slow"}, {"gainer", "strobe", "amused", "inured", "blowup", "nipper", "piping", "reheat"}, {"bolus", "meaty", "under", "slate"}, {"adventitious", "condensation", "invalidating"}}
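One quick way to inspect a clustering like this is to replace each word with its length. A minimal sketch, using two of the clusters from the output above (the names `clusters` and `lengths` are only for illustration):

```wolfram
(* Two of the clusters copied from the output above *)
clusters = {{"ecclesiastically", "protector", "circumferential"},
   {"pare", "hike", "slow"}};

(* Map StringLength at level 2 to see the word lengths inside each cluster *)
lengths = Map[StringLength, clusters, {2}]
(* {{16, 9, 15}, {4, 4, 4}} *)
```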

@Abrita Chakravarty Can you have a quick look and let me know if I'm doing something wrong? Thank you very much. This is one of the last questions left for me to complete the exercises for the course.

Posted 2 years ago

Hi,

I hope you could tell me what is wrong with my answer. This has bothered me so much that I joined this community.

Exercise 4.1.3

c = Classify[trainingSamplesMNIST, Method -> "NearestNeighbors"];
ClassifierMeasurements[c, testSamplesMNIST, "Accuracy"]

Output= 0.74 ---> this is also the expected output.

Exercise 3.3.2

Why is this not accepted as the correct answer:

BioSequence["DNA", 
   EntityValue[
    Entity["Gene", {"BRCA1", {"Species" -> "HomoSapiens"}}], 
    "ReferenceSequence"]] // LetterCounts[#, 3] & // KeySort
POSTED BY: M St
Posted 4 years ago

I apologize if someone has already discussed this, but I suspect that the option ColorFunctionBinning for GeoRegionValuePlot has changed its default behavior (or else my installation is screwy). Almost all the examples in the documentation show discrete color bins in the legend (as does the Expected Output in the exercises), but the documentation for my installation says the default is None. When I rerun the examples in the documentation with no explicit settings for ColorFunctionBinning or ColorFunction, I always get a continuous gradient of colors in the legend, not the output shown originally with discrete bins. In other words, it looks like the examples were not rerun when the function was updated in 2020 (or, again, my installation is screwy).

As for the corresponding exercises, I can reproduce the graphics only by adding explicit settings for ColorFunctionBinning, which the autograder seems to reject (of course it does).

I'm running version "12.2.0 for Microsoft Windows (64-bit) (December 12, 2020)" on Windows 10.

Thanks, Ron Goetz

POSTED BY: Updating Name
Posted 4 years ago

Thanks Abrita,

That clarified the intent. Autograder confirms.

Roger

POSTED BY: Roger Shifrin

Hi Roger, If you use FindClusters to cluster the group of 100 words given in the exercise, it will create clusters based on the default distance metric for strings - to show the similarity/dissimilarity between the strings themselves. Instead this exercise wants you to use a numeric feature for each word (StringLength) to group them into clusters.

FindClusters will create the clusters automatically by grouping the numbers (representing the lengths of the words) into appropriate clusters. There need not be exactly one cluster for each number. We are trusting FindClusters to come up with a reasonable number of clusters for this data, so four-, five-, or six-letter words can end up in the same cluster.
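As a rough sketch of that idea (not the exercise solution), FindClusters accepts the form {e1, e2, ...} -> {v1, v2, ...}, which clusters the values vi according to the features ei. The short word list below is made up for illustration, and the clusters actually returned may differ between versions:

```wolfram
(* Hypothetical short word list, not the exercise data *)
words = {"pare", "hike", "slow", "bolus", "meaty", "narcissism",
   "antebellum", "ecclesiastically", "circumferential"};

(* Cluster the words using their lengths as the numeric feature *)
clusters = FindClusters[StringLength[words] -> words]
```

Every word ends up in exactly one cluster; how many clusters there are is left to FindClusters.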

Hope this helps.

Posted 4 years ago

Hello group,

Could someone please clarify the instructions for Exercise 4.3.2? The instructions seem straightforward, and I interpret them to mean that all words of equal length within the given 100-word list should be clustered into a separate sub-collection for each word length. In other words, all strings containing 13 characters are collected into one sublist, all strings containing 12 characters are collected into a different sublist, and so on. The expected output is not represented this way. Perhaps I am not understanding the instructions correctly. Please advise.

Thanks,

Roger

POSTED BY: Roger Shifrin

I submitted my exercise solutions and sent my submission email to the course team just minutes ago. I am aiming for the Level I certificate. Thank you very much.

POSTED BY: vincent feng
Posted 4 years ago

The built-in classifier "ProgrammingLanguage" of Ex. 4.1.2

runs in the cloud but not in my version 12.2.0 desktop on Windows 10. It is not in the Wolfram documentation for my desktop version.

I am assuming it's experimental. If it is, can you please check that anything used in a question is in the latest distributions of Mathematica? I spent an age trying to get this to run on my desktop before, in desperation, running it in the cloud.

POSTED BY: Doug Beveridge
Posted 4 years ago
POSTED BY: lara wag

Hi @doug Beveridge, The "ProgrammingLanguage" classifier is definitely available in the latest version of the Wolfram Language. It has been available since its introduction in 2018 with version 11.3. If you are unable to access it or use it in your version of the Wolfram Language, could you please contact the technical support team at https://www.wolfram.com/support/contact/email/?topic=technical. I am worried that it might be indicative of some other issue with your installation.

Thanks

Posted 4 years ago

Hi Abrita

I think it has something to do with the way the string "/^.?$|^(..+?)\1+$/" is being parsed.

I could not get my code to work on the desktop, so I cut and pasted it straight into the cloud, where it worked. After my post here, I cut and pasted the same code from the cloud into my desktop version; it first contacted the Wolfram server, did a download, and then worked. All rather strange.

(It did not help that a search for "FacialAge" comes up as a (Built-in Classifier), while a search for ProgrammingLanguage comes up as an Entity.)

POSTED BY: Doug Beveridge

Which documentation page are you taken to when you paste the following in the search box at the top of the documentation window? (or do you find the page does not exist on your system?)

ref/classifier/ProgrammingLanguage

Also, are you able to evaluate the following and see its output?

Information[Classify["ProgrammingLanguage"], "Classes"]
Posted 4 years ago
POSTED BY: Doug Beveridge
Posted 4 years ago

If you search for all pages containing "ProgrammingLanguage", you see that the name is overloaded.

POSTED BY: Doug Beveridge

Vincent Feng, you're brilliant. I never knew these were included with the distribution of Mathematica! NotebookOpen[FindFile["ExampleData/StocksTemplate.nb"]] worked! Have a great day! Thank you, John

POSTED BY: John Burgers
POSTED BY: Jürgen Kanz

Hello, Would it be possible to obtain the file File["ExampleData/StocksTemplate.nb"] referred to in session 14 Automated Report generation under the heading Automated report generation? I'd like to see it to help my understanding of the driver linkages to the template. Thank-you, John Burgers

POSTED BY: John Burgers
POSTED BY: vincent feng

Hi Everyone,

Thanks for making the MPDS study group a success with your active participation. We appreciate all the great questions during the sessions, the posts on this community thread, and the reports emailed to wolfram-u@wolfram.com about the issues related to the exercises. Just wanted to share some links as we wrap up the study group series:

Certificate of Completion

Requirements: Attendance at study group sessions (or watching recordings) and completing the quizzes by May 7

Level I in Multiparadigm Data Science

Requirements: Attendance at study group sessions (or watching recordings), completing the quizzes and exercises by May 14.

We understand some of you are running into issues with the autograder not accepting a solution that produces the same output as shown in "Expected Output". In those cases, we request that you email us at wolfram-u@wolfram.com with a description of the issue and your exercises notebook (available for download both from the course page and from the download link shared during the study group).

Level II in Multiparadigm Data Science

Detailed instructions here: https://www.wolfram.com/wolfram-u/certification/level2/multiparadigm-data-science/

Finally, if you are interested, more details about the Wolfram Data Science Boot Camp 2021 can be found here: https://www.wolfram.com/wolfram-u/special-event/data-science-boot-camp/

Best, Abrita

Hi Abrita,

One month before the Study group, I passed the Level 1 certification using the online course “Multiparadigm Data Science”. In order to do the mentioned Level II later this year using the coupon, will I have to repeat the Level I of the Study Group, or is the existing certificate sufficient?

BR, Andreas

POSTED BY: Andreas Rudolph

Hi Abrita, today I received an email from Jan Fisher about the course completion certificate. However, I am aiming for the Level I certificate. I sent my submission email before May 7th, with 12 MB of exercise notebooks in a zip file posted to my own authenticated website for username/password access. Is there any word on that submission?

Thanks,

POSTED BY: vincent feng

Hi Vincent,

Thank you for checking in with us. We are currently in the process of manually grading the exercises submissions we have received so far (both via the online submissions on the course and via email). Thank you for your patience, as we finish up the grading. We will reach out to folks over email if we need to clarify any issues with their submissions.

best, Abrita

POSTED BY: vincent feng
Posted 5 years ago
POSTED BY: Ohoe Kim
Posted 5 years ago

I just completed Quiz 3 and click "GET RESULT". But nothing happens. I tried clicking a few more times, it still does not yield anything.

POSTED BY: Ohoe Kim

Hi Ohoe, Could you make sure you are still logged in to the cloud (at the top right corner of the page) and your session in the cloud somehow did not get disconnected?

How long will we be able to re-watch the web presentations?

POSTED BY: George Wolfe

Hi George! These recordings will be available in perpetuity :)

POSTED BY: Arben Kalziqi

That is really great! I expected them to self-destruct, like they did in Mission Impossible. That's what happened after the Neural Network Bootcamp, which was unfortunate. There was much more information, which also seemed more complicated.

POSTED BY: George Wolfe
Posted 5 years ago
POSTED BY: Marc Widdowson

I share the opinions above. I learned a lot from this course, and now that I have completed all the quizzes and exercises, it is something I can build on in the future. Thanks for a course with tons of material.

POSTED BY: vincent feng
Posted 4 years ago

I'll just add "Hear! hear!" to that sentiment. I really enjoyed the course -- my only regret is that the course occurred at a time of year when I really would rather be outdoors, but that's on me.

Ron Goetz

POSTED BY: Updating Name

Exercise 4.4.4

You are asking for 26 new words! The expected output only shows 26 Characters. What has to be delivered?

POSTED BY: Jürgen Kanz

Hi Jürgen, that is the result I got: roughly 26 new words, where a space also counts as one word. (Screenshot not preserved.)

POSTED BY: vincent feng

We will rephrase the question text. Your solution should provide "26 characters" (individually most likely next 26 elements).

Hi Abrita, Arben,

I think something is wrong with your expected output for Exercise 4.5.5. When I take the output and decode it, the input sequence is as follows "GGTCTCCCAG". The original input sequence "GGCTCTTTAG" creates a different unit vector list compared to your expected output.

Am I right?

POSTED BY: Jürgen Kanz

Thanks for the report @Jürgen Kanz . The expected output from a "Characters" encoder for the alphabet "ACTG" working on the input string "GGCTCTTTAG" should be:

{{0, 0, 0, 1},
 {0, 0, 0, 1},
 {0, 1, 0, 0},
 {0, 0, 1, 0},
 {0, 1, 0, 0},
 {0, 0, 1, 0},
 {0, 0, 1, 0},
 {0, 0, 1, 0},
 {1, 0, 0, 0},
 {0, 0, 0, 1}}
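This kind of expected output can be checked by hand without a net encoder. The sketch below is only an illustration built from UnitVector and FirstPosition; the helper names `alphabet`, `encode`, and `decode` are made up here and are not part of the exercise:

```wolfram
(* Alphabet order "ACTG": A -> 1, C -> 2, T -> 3, G -> 4 *)
alphabet = {"A", "C", "T", "G"};

(* Encode: each character becomes a unit vector at its alphabet position *)
encode[s_String] :=
  UnitVector[Length[alphabet], First@FirstPosition[alphabet, #]] & /@
   Characters[s];

(* Decode: find the position of the 1 in each vector and look up the letter *)
decode[vectors_List] :=
  StringJoin[alphabet[[First@FirstPosition[#, 1]]] & /@ vectors];

encoded = encode["GGCTCTTTAG"];
decode[encoded]
(* "GGCTCTTTAG" *)
```

Here `encoded` reproduces the ten unit vectors listed above, and decoding them returns the original sequence.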

Hi Abrita, Arben,

Exercise 5.1.1 requires recasting the Dataset Association formats ...

Could either of you explain to me why this pattern matching statement doesn't operate on each element as it maps through all the elements of the association ?

easternEurope /.  KeyValuePattern[a_ -> b_] -> {QuantityMagnitude[b] -> a}

yields ...

{{6.30805*10^10, 9452409} -> Entity["Country", "Belarus"]}

Thanks, John Burgers.

POSTED BY: John Burgers

Hi John,

That is a little confusing to me... in the meantime, [EDIT: this doesn't work, nevermind. Investigating...] I'll note that switching out KeyValuePattern[a_->b_] for Rule[a_,b_] works. I'm trying to think of why KeyValuePattern wouldn't, however. I'll update if I figure something out!

POSTED BY: Arben Kalziqi

Hi again John,

It seems like this is a known issue—coincidentally, somebody has commented on it internally on this very day! In the meantime, here is an alternative construction that I think should generate the output that you want:

Association@KeyValueMap[QuantityMagnitude[#2] -> #1 &, easternEurope]
POSTED BY: Arben Kalziqi

Thank you Arben, for the research, and suggesting a solution. John

POSTED BY: John Burgers
POSTED BY: Arben Kalziqi
Posted 5 years ago

Hi John,

Another way to do it

easternEurope // QuantityMagnitude // AssociationThread[Values@#, Keys@#] &
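Both constructions suggested in this thread can be tried on a small stand-in association. In the sketch below `toy` is a made-up example with plain numbers, so QuantityMagnitude is not needed:

```wolfram
(* Made-up stand-in for easternEurope: string keys, numeric values *)
toy = <|"x" -> 1, "y" -> 2|>;

(* Swap keys and values with KeyValueMap, then rebuild the association *)
swapped1 = Association@KeyValueMap[#2 -> #1 &, toy]
(* <|1 -> "x", 2 -> "y"|> *)

(* The same swap with AssociationThread *)
swapped2 = AssociationThread[Values@toy, Keys@toy];

swapped1 === swapped2
(* True *)
```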
POSTED BY: Rohit Namjoshi

Thank-you Rohit, that's an easily readable way of doing it. John

POSTED BY: John Burgers
POSTED BY: vincent feng

You are right. There is an issue where ListPlot is unable to handle multiple datasets in the EntityAssociation format. However, the following should work:

ListPlot[{Values@easternEurope, Values@westernEurope}]

What remains is to set up the labels for each data point. For example, for the first dataset the labels are available as Keys@easternEurope.

You will see on the documentation for ListPlot that one way to provide labels for data points is

ListPlot[{data1, data2, ...}]

where datai can have the form:

{y1,y2, ...}->{"lbl1","lbl2",...}

as well as

 {{x1,y1},{x2,y2}, ...}->{"lbl1","lbl2",...}
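A minimal sketch of the second form, with made-up points and labels rather than the exercise datasets (all four variable names are hypothetical):

```wolfram
(* Made-up data and labels, not the exercise datasets *)
data1 = {{1, 2}, {2, 3}, {3, 5}};
labels1 = {"a", "b", "c"};
data2 = {{1, 4}, {2, 1}, {3, 2}};
labels2 = {"d", "e", "f"};

(* Each dataset is given as points -> labels *)
plot = ListPlot[{data1 -> labels1, data2 -> labels2}]
```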

Daily Study Group Session Cancelled (Apr 27, 2021)

Our sincere apologies. It seems like BigMarker, our webinar platform provider, is facing some issues today. We are therefore unable to host the study group session.

We were planning to talk about Neural Networks in today's session. Instead we are asking you to look at the video at https://www.wolfram.com/wolfram-u/multiparadigm-data-science/neural-networks.html today. We'll discuss the code from the notebook at tomorrow's session.

POSTED BY: vincent feng
Posted 5 years ago
POSTED BY: Rohit Namjoshi

You are right. After adding the (), it now works. Thanks.

POSTED BY: vincent feng
POSTED BY: Arben Kalziqi
Posted 5 years ago
POSTED BY: lara wag
POSTED BY: Arben Kalziqi
Posted 5 years ago

I still don't get why we have to reverse the order in the third function. Is there a logical reason for that?

POSTED BY: Doug Beveridge

This, I don't have a particularly good justification for. I suspect it might be mirroring RandomChoice, RandomSample, and other such functions, where weights are specified first when given in that format (List of Rules), but I'm not sure...
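For reference, the weights-first convention in RandomChoice looks like this. It is only a small sketch with made-up weights and elements, and it says nothing about the quiz question itself:

```wolfram
SeedRandom[1];

(* Weights come first, elements second: "b" is three times as likely as "a" *)
picks = RandomChoice[{1, 3} -> {"a", "b"}, 10]
```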

POSTED BY: Arben Kalziqi
Posted 5 years ago

Thank you Arben, for your quick reply (and sorry for getting back to you only now...).

POSTED BY: lara wag
Posted 5 years ago

Quiz 2 Problem 5 - You will find the answer on the documentation page for WordCloud http://reference.wolfram.com/language/ref/WordCloud.html

I did and that is why the query ...!

POSTED BY: Doug Beveridge

Hi Doug—I don't think that there's anything contradictory here, though it is a bit counterintuitive. The documentation can be easy to misread here, for whatever reason, so I'd highly recommend giving it another look!

POSTED BY: Arben Kalziqi
Posted 5 years ago

Hi Arben, can you give me an example where the syntax shown in the question (screenshot not preserved) will actually run? In my view the function is syntactically incorrect.

POSTED BY: Doug Beveridge

Because of the context of this being a quiz, I cannot give you a specific example. I can confirm, however, that that is indeed a valid syntax and that there is no error with the question as posed.

POSTED BY: Arben Kalziqi
Posted 5 years ago

Solved. The variables W and S are swapped around (confusingly) in these examples. Thanks.

POSTED BY: Doug Beveridge

I caught a bug in Ex. 3.3.5 today. (Screenshots not preserved.)

The "first column" should be changed to the "first row" here: 0.338xxx is the min of the first row.

POSTED BY: vincent feng

Abrita,

In the meantime I have participated in a number of Wolfram-U courses. I think it is fair to say that in each course we faced issues with the autograder, which looks for certain code elements. Wouldn't it be better to compare the results with the expected output instead of the code? As you perhaps know, I apply an unorthodox programming style, which often leads to autograder issues even though the results are okay. In the case of plots or images, this idea may not work, but a lot of the issues discussed here would not occur in the future.

POSTED BY: Jürgen Kanz

Hi @Jürgen Kanz, We do appreciate your participation and feedback as we try to resolve the various issues with the autograding of exercises. The grader has been set up to accept the variety of computations possible with the Wolfram Language and attempts to check the solutions in multiple different ways, based on the type of problem. We do use the approach you have suggested in certain cases. We feel good about what the grader is doing right now but also realize we need to resolve a few more issues. All the feedback we are receiving from our study-group participants is definitely helping us out. Thank you.

Quiz 3 - Problem 8

You ask for a shorter list, but the requested answer leads to an empty list. Is this what you really want?

POSTED BY: Jürgen Kanz
POSTED BY: John Burgers

Don't worry. I have two slightly different solutions, but the grader does not accept them either.

POSTED BY: Jürgen Kanz

Exercise 3.3.5

We are asked to calculate some statistical figures "for the first column of the following dataset". The expected solution provides the results for the first row of the dataset. What do you really want?

POSTED BY: Jürgen Kanz

The week 1 recap recording doesn't seem to be working for replay: https://msp4.bigmarker.com/links/wud3D6N60up/RE0CC9pK4/Ef7uvgDURRH/hKXtFDrYfl?redirect_to=https%3A%2F%2Fwww.bigmarker.com%2Fwolfram-u%2Fdsg-mpds-week-1-recap%3Fbmid%3Dea77011a35dd

I would like to watch it again if possible. Thank you.

POSTED BY: Michael Lyda
Posted 5 years ago
Posted 5 years ago

Q2 Problem 5

I think there could be an "error" in the answer to this question.

POSTED BY: Doug Beveridge

Why does this code generate an error? (Attachment not preserved.)
POSTED BY: LORIS LORI

Hi there, I am working on Ex. 2.3.1. I ran the first command and got an output (screenshot not preserved).

Now I don't understand the question, "To be useful as a discriminative feature, each column should span a wide range of values across the samples. Find the number of unique feature values in each column of the dataset."

POSTED BY: vincent feng
Posted 5 years ago
POSTED BY: lara wag
Posted 5 years ago
POSTED BY: Updating Name

Since a number of study group participants are working on the exercises, we are requesting that you do not post the solutions here on community. You are of course welcome to discuss various functions and provide examples of code that do not give away the entire solution.

If you are running into issues, where the autograder is not accepting your solution (although the output is identical to what is shown as the "Expected Output"), please feel free to email us at wolfram-u@wolfram.com.

The notebook from today's review session has been uploaded to the study group materials folder.

Posted 5 years ago

Thanks Abrita

that works perfectly and the AutoGrader accepts it .

POSTED BY: Doug Beveridge
Posted 5 years ago
POSTED BY: Doug Beveridge
Posted 5 years ago

Hi Doug, there is something wrong with the formatting of the task. After simply selecting DarkBands as ChartStyle (as suggested by the task), I got a correct solution. See here http://reference.wolfram.com/language/guide/ColorSchemes.html

The incorrect formatting looks like "internal Mathematica code" - at least that's what you can see when you open a notebook with a text editor. Maybe a bracket or a quotation mark is missing?

larawag

POSTED BY: lara wag

Hello, Let me try to address a number of questions here.

  1. Quizzes: We are addressing some issues we found with the quiz. @JuergenKanz The issue with Quiz 1 Problem 9 will be resolved in the fix we are deploying.
  2. Autograder not accepting solutions to exercise problems: We will discuss this in more details during Friday's session. @UlfSchmidt and @DougBeveridge We are also in the process of setting up a way for you to download the exercise notebooks, work on them on your desktop and submit for manual grading (especially when the autograder does not accept your solution). There are so many different ways to implement a solution in the Wolfram Language... although we have set up the autograder to accept multiple solutions, we do run into issues sometimes.
  3. "Download notebook" links: We are looking into the issue of the "Download notebook" not working. @DougBeveridge For now please download the zipped notebooks from the link shared during the webinar sessions.

Thank you for your patience as we try to resolve the issues. We appreciate your interest and participation in the study group sessions.

Quiz 1 - Problem 9

You are asking for a specific number regarding the "dimensions of the output returned by DimensionReduce". I think this contradicts the Documentation Center, where you can read: "DimensionReduce[examples] automatically chooses an appropriate dimension for the approximating manifold." So the number requested for the quiz problem can be right, but it need not be correct in general, right?
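For comparison, the target dimension can also be requested explicitly with the two-argument form DimensionReduce[examples, n]. A small sketch on made-up random data (not the quiz data):

```wolfram
SeedRandom[1];

(* Made-up numeric data: 50 examples with 5 features each *)
data = RandomReal[1, {50, 5}];

(* Ask for a 2-dimensional reduction explicitly *)
reduced = DimensionReduce[data, 2];
Dimensions[reduced]
(* {50, 2} *)
```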

POSTED BY: Jürgen Kanz
Posted 5 years ago
POSTED BY: Doug Beveridge

You are right, but suppose you put images or other data into the DimensionReduce function: then the dimensions change.

POSTED BY: Jürgen Kanz
Posted 5 years ago

Hi Abrita

Firstly, the notebook files would not download for me. I am using Windows 10 and a range of browsers (Chrome, Mozilla, and Edge). I suspect it is because the website wants to open a window and the browser or antivirus software will not allow it.

I am now attempting the exercises and it is just not happening. The small window on the right will not show all the info, and if you enter info, it will not display correctly. (I do get the right answer eventually.)

POSTED BY: Doug Beveridge
POSTED BY: Jürgen Kanz
Posted 5 years ago

Jürgen are you using Windows or Mac ?

POSTED BY: Doug Beveridge

Windows 10

POSTED BY: Jürgen Kanz
POSTED BY: Jürgen Kanz
Posted 5 years ago

Thanks , running the exercises in the cloud now

POSTED BY: Doug Beveridge

Hi,
Please check my registration, because I do not receive your emails, and the Q&A display shows my surname first, followed by my first name. Perhaps I made a mistake during the registration process?

POSTED BY: Jürgen Kanz
Posted 5 years ago

Exercise 2.3.1 / 2.3.3 / 2.3.4: I get exactly the same solution as the expected solution, but the solution check always answers "Try again". What am I doing wrong?

POSTED BY: Ulf Schmidt

This is in reply to a question posed at today's study group session about listing wars from a particular period. I have used the "MilitaryConflict" entity type to create a FilteredEntityClass that satisfies a specific condition for its EntityProperties "StartDate" and "EndDate" with the help of an EntityFunction.

EntityFunction[x, body] behaves exactly like an EntityProperty and can be used to create very specific properties that help us denote a list of very specific entities:

EntityList[
 FilteredEntityClass["MilitaryConflict", 
  EntityFunction[c, 
   c["StartDate"] >= DateObject[{1975, 1, 1, 0, 0, 0}] && 
    c["EndDate"] <= DateObject[{2000, 12, 31, 0, 0, 0}]]]]
Posted 5 years ago

Abrita, I can not find the "Exercises" in The MPDS notebook. I really appreciate it if you would help.

POSTED BY: Ohoe Kim

Once you go to a new subsection of the course, the Exercise-File is copied into your private Wolfram Cloud. It is much easier to work with the copy. You can work with the files either on the cloud or locally.

POSTED BY: Jürgen Kanz
Posted 5 years ago
POSTED BY: Glen Deering

No problem. We can add the "Build a Project Workflow" notebooks to the same download location.

Posted 5 years ago

Thanks for the downloads.

I see that the first workflow notebooks were left out ("1. Build a Project Workflow"). It seems that the last three notebooks of that section have a non-working download link at the bottom of each notebook, because I've tried three different browsers with the same result. I cannot download them, though I can probably reconstruct them via copy, paste, and execute.

POSTED BY: Glen Deering

Yes, we will share a link to download all the notebooks at once, in the study group session today. We'll also look into the issue of the "Download Notebook" links not working as expected. Thanks.

Posted 5 years ago

It seems like a hit-and-miss affair: sometimes it downloads and sometimes it does not, even when I try a number of browsers. I think that is why we are getting a separate link in today's lecture, so we can download all the files in one go.

POSTED BY: Doug Beveridge

Clicking the Download Notebook link did not result in a download for the sessions {Explore, Analyze, Handling different types of data, restructuring Data}.

Is there something else required to get the downloads?

Thank-you.

POSTED BY: John Burgers