[WSG20] Wolfram Language Basics from EIWL (Days 6, 7, 8, 9)

During week two of the Wolfram Study Group Apr 2020 (WSG20) we are looking at the following topics from Stephen Wolfram's book Elementary Introduction to the Wolfram Language:

Day 6: Chapter 16 Real-World Data, Chapter 35: Natural Language Understanding, Chapter 44: Importing and Exporting

Day 7: Chapter 5: Operations on Lists, Chapter 13: Arrays, or Lists of Lists, Chapter 30: Rearranging Lists, Chapter 31: Parts of Lists

Day 8: Chapter 34: Associations, Chapter 45: Datasets

Day 9: Chapter 27: Applying Functions Repeatedly, Chapter 29: More about Pure Functions, Chapter 46: Writing Good Code, Chapter 47: Debugging Your Code

We are looking at videos from the Wolfram U interactive course on EIWL and also working on simple exercises and mini projects.

Feel free to post questions on the material we covered in these sessions here.

Daily challenge (Day 6):

List the EntityProperties for the "Country" entity whose names begin with the letter "w". How many did you find?

Posted 5 years ago

Entity["Country", "WestBank"]["Properties"]
Entity["Country", "WesternSahara"]["Properties"]

POSTED BY: Moses Paul
Posted 5 years ago

The following 9:

{EntityProperty["Country", "WageAndSalariedWorkers"], 
 EntityProperty["Country", "WageAndSalariedWorkersFraction"], 
 EntityProperty["Country", "WagesCostIndex"], 
 EntityProperty["Country", "WaterArea"], 
 EntityProperty["Country", "WaterArrivals"], 
 EntityProperty["Country", "WaterProductivity"], 
 EntityProperty["Country", "WaterwayLength"], 
 EntityProperty["Country", "Workforce"], 
 EntityProperty["Country", "WPI"]}
POSTED BY: Muhammad Ali

Thank you @Abrita Chakravarty. I have made the suggested modifications.

I found 7 properties beginning with a lowercase "w" when searching by Common Names:

In[1]:= Select[CommonName /@ EntityProperties["Country"], StringMatchQ[#, "w" ~~ __] &]

Out[1]= {"wage and salaried workers",
          "wage and salaried workers fraction",
          "wage cost index",
          "water area",
          "water productivity",
          "waterway length",
          "wholesale price index"}

and 9, by Canonical Names:

In[2]:= Select[CanonicalName /@ EntityProperties["Country"], StringMatchQ[#, "W" ~~ __] &]

Out[2]= {"WageAndSalariedWorkers",
         "WageAndSalariedWorkersFraction",
         "WagesCostIndex",
         "WaterArea",
         "WaterArrivals",
         "WaterProductivity",
         "WaterwayLength",
         "Workforce",
         "WPI"}

2 of the Canonical Names have Common Names that do not start with the letter "w".

So the correct result appears to be 7, since the question asks for names starting with a lowercase "w".

Posted 5 years ago

There are nine

 Entity["Country"]["Properties"] // 
   Select[ToLowerCase[StringTake[CanonicalName@#, 1]] == "w" &]

(*
{EntityProperty["Country", "WageAndSalariedWorkers"], 
 EntityProperty["Country", "WageAndSalariedWorkersFraction"], 
 EntityProperty["Country", "WagesCostIndex"], 
 EntityProperty["Country", "WaterArea"], 
 EntityProperty["Country", "WaterArrivals"], 
 EntityProperty["Country", "WaterProductivity"], 
 EntityProperty["Country", "WaterwayLength"], 
 EntityProperty["Country", "Workforce"], 
 EntityProperty["Country", "WPI"]}
*)
POSTED BY: Rohit Namjoshi
Posted 5 years ago

Hi Valeriu,

The question asks for 'beginning with the letter "w"'. You need to anchor the pattern to the start of the string or it will match "w" anywhere in the name.

StartOfString ~~ "w" | "W" ~~ ___
POSTED BY: Rohit Namjoshi
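
To make the anchoring point concrete, here is a minimal sketch on a small hand-made list of strings. The list `names` is hypothetical sample data chosen for illustration, not the real property list. It contrasts a test for "w" anywhere in the name with a test anchored to the start of the name.

```wolfram
(* Hypothetical sample names, chosen only for illustration *)
names = {"WaterArea", "Workforce", "LowIncome", "GrowthRate"};

(* "w" anywhere in the string, ignoring case: matches all four names *)
anywhere = Select[names, StringContainsQ[#, "w", IgnoreCase -> True] &]

(* Anchored to the start of the string: only the first two names *)
atStart = Select[names, StringMatchQ[#, StartOfString ~~ "w" | "W" ~~ ___] &]
(* {"WaterArea", "Workforce"} *)
```

Note that StringMatchQ always tests the whole string, so the explicit StartOfString mainly matters with functions such as StringContainsQ or StringCases, which look for matches anywhere in the string.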

@Moses Paul Thanks for your reply. We are looking for EntityProperties that start with "w", not Country Entities. Try:

EntityProperties["Country"]

@Muhammad Ali Thanks for your reply. Can you also post the code you used to get to this answer?

@Valeriu Ungureanu Thanks for your reply. You want to modify the pattern a tad bit - so only property names starting with "w" are caught.

@Rohit Namjoshi Nice one.

Posted 5 years ago

@Abrita Chakravarty

Cases[
    EntityProperties["Country"],
    x_/;Or@@StringMatchQ[
       {CanonicalName[x],CommonName[x]},
       "W*",
       IgnoreCase->True
    ]
]
POSTED BY: Muhammad Ali

Hi Muhammad,

There is a problem with your code. I ran it:

In[61]:= Cases[EntityProperties["Country"], 
 x_ /; Or @@ 
   StringMatchQ[{CanonicalName[x], CommonName[x]}, "W*", 
    IgnoreCase -> True]]

Out[61]= {EntityProperty["Country", "WageAndSalariedWorkers"], 
 EntityProperty["Country", "WageAndSalariedWorkersFraction"], 
 EntityProperty["Country", "WagesCostIndex"], 
 EntityProperty["Country", "WaterArea"], 
 EntityProperty["Country", "WaterArrivals"], 
 EntityProperty["Country", "WaterProductivity"], 
 EntityProperty["Country", "WaterwayLength"], 
 EntityProperty["Country", "Workforce"], 
 EntityProperty["Country", "WPI"]}

and found that, for 2 of your results, the Common Names do not start with a lowercase "w". So, for

"WaterArrivals" the Common Name is "arrivals by sea"

and for

"Workforce" the Common Name is "employment"

It looks as though the code must be modified because, according to the wording of the question, property names must begin with a lowercase "w". At least, that was @Abrita Chakravarty's observation.

Thank you, Rohit, for your suggestion!

Here's another solution:

StringCases[#[[2]] & /@ EntityProperties["Country"], 
  StartOfString ~~ "w" ~~ ___, IgnoreCase -> True] // Flatten

Daily Challenge (Day 7):

This one is from EIWL Chapter 31: make a histogram of where the letter "e" occurs in the words in WordList[].

Small or capital "w". But good catch - interesting that "WaterArrivals" and "Workforce" have different common names. For this challenge all canonical names beginning with "w" should be accepted.

Posted 5 years ago

@Valeriu Ungureanu In my experience, when working with entities it is advisable to check both CanonicalName and CommonName. CanonicalName is the actual field name in the database, so in many databases it is kept short and cryptic to avoid naming conflicts. For example, if you create your own SQL database and map it onto the entity framework using RelationalDatabase, you will see that the database field names are mapped to CanonicalName. If you restrict your search to CanonicalName only, you will miss many relevant properties.

Regarding the question, it didn't specify what name it was referring to (that's why I did an exhaustive search using the Or in my code) but I agree one can become pedantic and take "w" as a literal for the lower case. But I don't think that was the intention of the question.

POSTED BY: Muhammad Ali
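
The difference between the two kinds of name discussed here can be inspected directly. The following is only a sketch: `startsWithW` is a hypothetical helper, and the result depends on the current state of the Wolfram Knowledgebase (it needs network access), so it may differ from what was posted in this thread.

```wolfram
props = EntityProperties["Country"];

(* Hypothetical helper: True when a name starts with "w" or "W" *)
startsWithW[name_String] := StringStartsQ[name, "w", IgnoreCase -> True];
startsWithW[_] := False;  (* guard in case a name is not a string *)

byCanonical = Select[props, startsWithW[CanonicalName[#]] &];
byCommon = Select[props, startsWithW[CommonName[#]] &];

(* Properties whose canonical name starts with "w" but whose common name does not *)
onlyCanonical = Complement[byCanonical, byCommon];
{#, CommonName[#]} & /@ onlyCanonical
```

In the results posted above, the two such properties were "WaterArrivals" (common name "arrivals by sea") and "Workforce" (common name "employment").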

Agree!

My code for today's (Day 7) challenge:

Last /@ Position[Characters@WordList[], "e"] // Histogram
Posted 5 years ago

[image]

POSTED BY: Muhammad Ali

A shorter version:

Histogram@Flatten@StringPosition[WordList[], "e"]

With code like this, the Day 7 challenge becomes a simple problem :)

Posted 5 years ago

Hi Valeriu,

The counts are doubled in this solution because StringPosition returns a pair of numbers.

POSTED BY: Rohit Namjoshi
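
A small sketch of why the counts double, using a single arbitrary word ("entente" is chosen here only because it has several occurrences of "e"):

```wolfram
(* StringPosition returns {start, end} pairs, one per match *)
pairs = StringPosition["entente", "e"]
(* {{1, 1}, {4, 4}, {7, 7}} *)

(* Flatten keeps both numbers of each pair, so every position appears twice *)
doubled = Flatten[pairs]
(* {1, 1, 4, 4, 7, 7} *)

(* Taking only the start of each pair gives each position once *)
starts = pairs[[All, 1]]
(* {1, 4, 7} *)
```

The histogram of the doubled list has the same shape as the correct one, but every bar is twice as tall.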

Thank you, Rohit! You are right! For now, I can modify the code to obtain the correct result, but it will not be as short as the previous version:

Last /@ Flatten[StringPosition["e"]@WordList[], 1] // Histogram

Another solution (note that [[All, 1, 1]] keeps only the first "e" in each word):

DeleteCases[StringPosition[WordList[], "e"], {}][[All, 1, 1]] // Histogram

Another modification of the code:

Histogram@Flatten@Map[Union, StringPosition[WordList[], "e"], 2]

or

Histogram@Flatten[Union @@@ StringPosition[WordList[], "e"]]

Daily Challenge (Day 8): Use WL code to restructure the following dataset, so the "ID" is used as the row-identifier instead of the name for each row.

Dataset[
    <|"Padmé Amidala" -> <|"ID" -> "19435", "Planet" -> "Naboo"|>, 
      "Bail Organa" -> <|"ID" -> "11114", "Planet" -> "Alderaan"|>, 
      "Mon Mothma" -> <|"ID" -> "10712", "Planet" -> "Chandrila"|>|>]

Expected output: [image]

Posted 5 years ago

Here is my clunky attempt at it. Probably there is a better, more humane way to do this.

[image]

POSTED BY: Muhammad Ali

My less-than-ideal code, which you can check:

data = Dataset@<|
"Padmé Amidala" -> <|"ID" -> "19435", "Planet" -> "Naboo"|>, 
"Bail Organa" -> <|"ID" -> "11114", "Planet" -> "Alderaan"|>, 
"Mon Mothma" -> <|"ID" -> "10712", "Planet" -> "Chandrila"|>|>;
name = Keys@Normal[data];
id = Values@Normal[data[All, 1]];
planet = Values@Normal[data[All, 2]];
Dataset@Association@
Array[id[[#]] -> <|"Name" -> name[[#]], "Planet" -> planet[[#]]|> &,
Length@data]

Another possibility:

d = Dataset[Association[
   "Padmé Amidala" -> <|"ID" -> "19435", "Planet" -> "Naboo"|>, 
   "Bail Organa" -> <|"ID" -> "11114", "Planet" -> "Alderaan"|>, 
   "Mon Mothma" -> <|"ID" -> "10712", "Planet" -> "Chandrila"|>]]

Apply[Association, d[#, "ID"] -> <|"Name" -> #, "Planet" -> d[#, "Planet"]|> & /@ Keys[d]]
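
Another way to express the same restructuring is to walk over the name-to-row pairs with KeyValueMap. This is a sketch, not the official answer: the expected-output image is not available here, so the "Name" column label simply follows the replies above, and `rekeyed` is a name introduced for illustration.

```wolfram
d = Dataset[<|
    "Padmé Amidala" -> <|"ID" -> "19435", "Planet" -> "Naboo"|>,
    "Bail Organa" -> <|"ID" -> "11114", "Planet" -> "Alderaan"|>,
    "Mon Mothma" -> <|"ID" -> "10712", "Planet" -> "Chandrila"|>|>];

(* For each name -> row pair, build ID -> <|"Name" -> name, "Planet" -> planet|> *)
rekeyed = Association @ KeyValueMap[
    #2["ID"] -> <|"Name" -> #1, "Planet" -> #2["Planet"]|> &,
    Normal[d]];

Dataset[rekeyed]
```

Working on Normal[d] keeps the transformation in ordinary associations and wraps the result in Dataset only at the end.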

Daily Challenge (Day 9): A pangram is usually understood as a sentence in which every letter of the alphabet is used at least once. For the letters {"a", "o", "l", "r", "w", "f", "m"}, a pangram would be a word that uses all of the seven letters e.g. "Wolfram".

Use WL to find a pangram from the letters {"z", "t", "c", "a", "u", "h", "p"}.

If every letter is used exactly once, then the following code solves the problem:

In[1] :=  Intersection[StringJoin /@ Permutations[{"z", "t", "c", "a", "u", "h", "p"}],
          WordList[]]
Out[1] = {"chutzpa"}

or, thanks to Rohit's idea of using the function Complement:

In[2]:= l = {"z", "t", "c", "a", "u", "h", "p"};

If[Complement[l, Characters@#] == {}, #, Nothing] & /@ WordList[]

Out[2]= {"chutzpa", "chutzpah"}
Posted 5 years ago

For at least once

letters = {"z", "t", "c", "a", "u", "h", "p"};

WordList[] // Map[{#, Complement[letters, Characters[#]]} &] // 
  Select[Last@# == {} &] // Map[First]

(* {"chutzpa", "chutzpah"} *)
POSTED BY: Rohit Namjoshi
Posted 5 years ago

Shorter version

WordList[] // Select[Complement[letters, Characters[#]] == {} &]
(* {"chutzpa", "chutzpah"} *)
POSTED BY: Rohit Namjoshi

An equivalent version with Cases:

In[1]:= l = {"z", "t", "c", "a", "u", "h", "p"};
        Cases[x_ /; Complement[l, Characters@x] == {}]@WordList[]

Out[1]= {"chutzpa", "chutzpah"}
Posted 5 years ago

@Abrita Chakravarty I have a query regarding the question.

For the letters, {"a", "o", "l", "r", "w", "f", "m"}, can the pangram apart from containing at least one of each letter from the list also contain some of the remaining letters? Like for example "flamethrower" contains all of the letters from the list but also contains "e" twice, "h" and "t" once which are not part of the list. So is "flamethrower" also regarded as a pangram of this list in addition to "wolfram"?

POSTED BY: Muhammad Ali

Some other solutions that use set operations:

l = {"z", "t", "c", "a", "u", "h", "p"};
Select[ContainsAll[l]@Characters@# &]@WordList[]

or

Select[SubsetQ[Characters@#, l] &]@WordList[]

or

Select[Intersection[Characters@#, l] == Sort@l &]@WordList[]

or

Cases[x_ /; ContainsAll[l]@Characters@x]@WordList[]

or

Cases[x_ /; SubsetQ[Characters@x, l]]@WordList[]

or

Cases[x_ /; Intersection[Characters@x, l] == Sort@l]@WordList[]

Of course, we could also use the function If[].

Posted 5 years ago

@Abrita Chakravarty First, unlike WordList, DictionaryLookup supports patterns. Moreover, DictionaryLookup seems to have more words than WordList (probably a bug).

WordList[] // Length (* 39176 *)
DictionaryLookup[] // Length (* 92518 *)
Complement[WordList[], DictionaryLookup[]] (* {} *)

Here is my solution:

Clear[pangram];
pangram[characters_List] := Module[
  {$characterSet = "[" <> StringJoin @ characters <> "]*"},
  DictionaryLookup @ RegularExpression[
    StringJoin[
      "(?=" <> $characterSet <> # <> ")" & /@ characters,
      $characterSet]]];
pangram[{"a", "o", "l", "r", "w", "f", "m"}] (* {"wolfram"} *)
pangram[{"z", "t", "c", "a", "u", "h", "p"}] (* {"chutzpa", "chutzpah"} *)
pangram[{"e", "t", "l", "a", "b"}] (* {"ablate", "ballet", "battle", "beatable", "bleat", "eatable", "table", "tablet"} *)

It's orders of magnitude faster than doing surgery on WordList[] or DictionaryLookup[] alone.

POSTED BY: Muhammad Ali

No, a pangram would be a word that uses ONLY the letters from the given set, BUT could use a letter more than once. So for the letters {"a", "o", "l", "r", "w", "f", "m"}, "flamethrower" would not be considered a pangram.
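
Under this stricter reading, one possible check is to compare the set of letters in a word with the given set. The sketch below assumes lowercase input and that every letter of the set must still appear at least once, as in the original challenge; `pangramQ` and `wolframLetters` are hypothetical names introduced for illustration.

```wolfram
(* Hypothetical helper: True when the word's set of letters equals the given set.
   The comparison is case-sensitive. *)
pangramQ[word_String, letters_List] :=
  Union[Characters[word]] === Union[letters];

wolframLetters = {"a", "o", "l", "r", "w", "f", "m"};

pangramQ["wolfram", wolframLetters]       (* True *)
pangramQ["flamethrower", wolframLetters]  (* False: it also uses "e", "h" and "t" *)
pangramQ["chutzpah", {"z", "t", "c", "a", "u", "h", "p"}]  (* True: "h" is repeated *)

(* Applying the check to the challenge letters *)
Select[WordList[], pangramQ[#, {"z", "t", "c", "a", "u", "h", "p"}] &]
```

Union both sorts and removes duplicates, so repeated letters in the word do not affect the comparison.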

DictionaryLookup[] really does contain more words:

In[1]:= Complement[DictionaryLookup[], WordList[]] // Length

Out[1]= 53342
Posted 5 years ago

Taking the complement the other way shows that WordList is a subset of DictionaryLookup, so DictionaryLookup is a safe choice.

POSTED BY: Muhammad Ali

So we can write other solutions with DictionaryLookup[], for example:

In[1]:= DictionaryLookup[x__ /; SubsetQ[Characters@x, l]]

Out[1]= {"chutzpa", "chutzpah"}

Posted 5 years ago

Try your code on {"a", "o", "l", "r", "w", "f", "m"}

DictionaryLookup[x__ /; SubsetQ[Characters@x, {"a", "o", "l", "r", "w", "f", "m"}]]
{"flamethrower", "flamethrowers", "flatworm", "flatworms", "mayflower", "mayflowers", "wolfram"}

It produces an incorrect result.

POSTED BY: Muhammad Ali

It seems important to define the variable l first:

In[1]:= l = {"z", "t", "c", "a", "u", "h", "p"};
DictionaryLookup[x__ /; SubsetQ[Characters@x, l]]

Out[1]= {"chutzpa", "chutzpah"}

Yet another way:

DeleteMissing[{#, PartOfSpeech[#]} & /@ StringJoin /@ Permutations[{"z", "t", "c", "a", "u", "h", "p"}], 1, Infinity]