Monday morning quiz: Do you know your Wolfram L vocabulary?

The Wolfram Language draws distinctions where John Doe wouldn't see the need for one. The result is a wealth of similar(?) vocab items, which can be confusing to beginners.

Quiz question: What do the following vocab items have in common, and why do 2 of them not belong in that list?

The quiz question may be meh, but I hope you find that list as intriguing as I do. Did you know all the items (aka passive vocabulary)? And which ones have you ever used in your own code (aka active vocabulary)? I am new to the language, so I am still figuring out which ones to use, and when, in my future programs.

POSTED BY: Raspi Rascal
2 Replies

Seems like nobody knows the answer to the quiz question?

Anyway, here is a practical example of when you could make use of the above listing. Say you imported experimental data in the form of a matrix of numeric values, and unfortunately some of the matrix elements are either missing, corrupt, or replaced by strings ("err", "N/A"). You decide not to bin the entire matrix but to replace those annoying :D elements in a sensible way, so that you can still use the data for further automatic mathematical processing. This step is called data cleaning.

Example: The calculation of the column mean, i.e. the mean value of a column of a numeric matrix.

In[1]:=  col = {2, 4, 8, 6, 5}; (* 5 valid elements, clean data *)
         Mean[col] (* (2+4+8+6+5)/5 = 5 *)
Out[2]=  5

Now let's assume that your column has corrupt or missing entries, produced by the electronic device which recorded/logged the data.

In[3]:=  col = {2, 4, "N/A", 6, "N/A"}; (* 3 of 5 elements are valid, 2 are corrupt *)

Then you must decide how you want to proceed with the data. If you're interested in the column mean, then replacing the "N/A" with the value zero would not be a good idea, because it would warp the data (and the column mean) too much: (2+4+0+6+0)/5 = 2.4. Instead, a common procedure is to replace all corrupt values with the (same) mean of the remaining values, which is (2+4+6)/3 = 4.

In[4]:=  col = {2, 4, 4, 6, 4};
         Mean[col]
Out[5]=  4
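The replacement above was typed in by hand. As a minimal sketch of how the same step could be automated, assuming the corrupt entries are always strings, the following computes the fill value from the numeric entries and substitutes it. The names valid, fill and cleaned are hypothetical helpers, not part of the original post.

```wolfram
col = {2, 4, "N/A", 6, "N/A"};

(* keep only the numeric entries: {2, 4, 6} *)
valid = Select[col, NumericQ];

(* mean of the valid entries: (2+4+6)/3 = 4 *)
fill = Mean[valid];

(* replace every string entry with the fill value: {2, 4, 4, 6, 4} *)
cleaned = col /. _String -> fill

Mean[cleaned]  (* 4 *)
```

If corrupt entries can take other forms (for example Null or an empty list), the pattern _String would need to be widened accordingly.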

However, this procedure might not be desirable for some reason. Instead, you're quizzing yourself whether the "N/A" can be replaced by a symbol which Wolfram L interprets correctly as a missing numeric value, so that it calculates the correct column mean directly as (2+4+6)/3 = 4.
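One hedged way to get that result without inventing fill values is to mark the bad entries explicitly and drop them before averaging. This is a sketch under the assumption that corrupt entries are strings; it is not necessarily the answer the quiz is fishing for, and the name marked is hypothetical.

```wolfram
col = {2, 4, "N/A", 6, "N/A"};

(* tag every string entry as missing *)
marked = col /. _String -> Missing["NotAvailable"];

(* drop the missing entries, then average the rest: (2+4+6)/3 *)
Mean[DeleteMissing[marked]]  (* 4 *)
```

Keeping the Missing markers in the stored data preserves the information that two readings were lost, while DeleteMissing removes them only at the point of calculation.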

If you're unsure which vocab item could work here, the full list from the OP gives you a systematic way of testing. As a beginner, you would think that good candidates to start with are Missing, None, Invisible, Undefined, Null, Nothing, Empty, Removed, Gone, Exclusions, Absent, Blank, Placeholder. What do you think? ;-)

POSTED BY: Raspi Rascal

Here is another real-life example. Every day our local authority publishes an updated pair of numbers (coronavirus infections vs deaths) on its webpage, even on weekends. Plotting the data with ListPlot[] visualizes the course of the numbers, with the x-axis representing the day number.

mytown = {
(*Mon,day1*){864,430},{921,464},{981,496},{1082,556},{1156,603},{1214,634},{1260,652}
};
ListPlot[{ mytown[[All,1]], mytown[[All,2]] }] 

Day after day you collect the data for week 2, but you notice that the authority didn't publish any data for Friday and Saturday. You want ListPlot[] to plot an invisible data point for day12 and day13, so that the overall imaginary curve stays intact and undistorted. You don't want to substitute {0,0} for either day, because that would be amateurish. Again, as a beginner, you would think that good candidates to start with are Missing, None, Invisible, Undefined, Null, Nothing, Empty, Removed, Gone, Exclusions, Absent, Blank, Placeholder. However, the correct vocab item to substitute here in order to get the desired effect is Indeterminate.

mytown = {
(*Mon,day1*){864,430},{921,464},{981,496},{1082,556},{1156,603},{1214,634},{1260,652}
,(*Mon,day8*){1273,659},{1296,675},{1368,707},{1427,738},(*day12*){Indeterminate,Indeterminate},(*Sat*){Indeterminate,Indeterminate},(*Sun*){1532,777}
};
ListPlot[{ mytown[[All,1]], mytown[[All,2]] }]     
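As an alternative sketch (not the approach used above), the day number can be attached to each value explicitly, so that the unpublished days can simply be dropped while the remaining points keep their x positions. The short series and the names infections, pairs and known are hypothetical, chosen only for illustration.

```wolfram
(* hypothetical short series: day 3 was not published *)
infections = {864, 921, Indeterminate, 1082};

(* pair each value with its day number *)
pairs = Transpose[{Range[Length[infections]], infections}];
(* {{1, 864}, {2, 921}, {3, Indeterminate}, {4, 1082}} *)

(* drop the days without a published value *)
known = DeleteCases[pairs, {_, Indeterminate}]
(* {{1, 864}, {2, 921}, {4, 1082}} *)

ListPlot[known]
```

The Indeterminate placeholder keeps the data set in one piece with one entry per day, whereas explicit {day, value} pairs make the x-axis independent of list position. Which is more convenient depends on how the data is collected.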

I've been observing the #flattenthecurve of our town area, manually collecting the daily numbers every day. When the publisher skipped some days, I was reminded of this thread and wanted to pick the appropriate Wolfram L vocab item to insert in my data set, instead of inserting {0,0}. I wasn't sure which one to pick and had to try various ones myself!

POSTED BY: Raspi Rascal