Add M49 standard country codes to a dataset?

Posted 6 years ago
POSTED BY: Tester Trying
7 Replies
Posted 6 years ago

Dear Hans Michel,

I have this dataset with all countries and multiple entries for different years per country. Ideally I would like to add two kinds of rows: one with the continents and one with geographical regions such as 'Caribbean', 'Central America' or 'North/South Europe'.

dataset: https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016

regions:

(image of the region list; not preserved)

POSTED BY: Tester Trying
Posted 6 years ago

Hi Tester,

That dataset is a CSV file. I think you mean adding columns for continent and geographical region, not rows. Here is one way to do that.

Download and import the dataset

suicideImport = Import["~/Downloads/master.csv", "Dataset", HeaderLines -> 1];

One country's name does not match the WL data, so rename it

suicide = 
  suicideImport[All, 
   If[#country == "Saint Vincent and Grenadines", <|#, "country" -> "Saint Vincent and the Grenadines"|>, #] &];

Add continent column

suicide = 
  suicide[All, <|#, "continent" -> CanonicalName@CountryData[#country]["Continent"]|> &];

Country and region data is available from the UNSD M49 methodology page. Download the Excel file and import it

unsdImport = 
  Import["~/Downloads/UNSD \[LongDash] Methodology.xlsx", {"Dataset", 1}, HeaderLines -> 1];

Generate map from country name to region

regionMap = 
  unsdImport[
     All, <|#"Country or Area" -> 
        If[#"Intermediate Region Name" == "", #"Sub-region Name", #"Intermediate Region Name"]|> &] //
     Normal // Association;

Add region to suicide dataset

suicide = suicide[All, <|#, "region" -> regionMap[#country]|> &]

There are still some issues: the country "Czech Republic" has no mapping to a region.
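One possible way to handle names that fail the lookup is to route them through a small alias table before consulting the region map. The sketch below is self-contained and uses toy stand-ins for `regionMap` and `suicide`; the `aliases` table is hypothetical, and it assumes the UNSD file lists the country as "Czechia", which should be checked against the downloaded file.

```wolfram
(* Toy stand-ins for the thread's regionMap and suicide dataset; values are illustrative only *)
regionMap = <|"Czechia" -> "Eastern Europe", "Austria" -> "Western Europe"|>;
suicide = Dataset[{
    <|"country" -> "Czech Republic", "year" -> 2000|>,
    <|"country" -> "Austria", "year" -> 2000|>}];

(* Hypothetical alias table: dataset spelling -> spelling assumed to be used in the UNSD file *)
aliases = <|"Czech Republic" -> "Czechia"|>;

(* Translate each name through the alias table (defaulting to the name itself), then look up the region *)
suicide = suicide[All, <|#, "region" -> regionMap[Lookup[aliases, #country, #country]]|> &];

(* List the countries that still have no region, if any *)
suicide[Select[MissingQ[#region] &], "country"] // Normal // DeleteDuplicates
```

The last line works because looking up an absent key in an association returns a `Missing[...]` value, so any country it reports needs another entry in the alias table.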

Total number of suicides by continent and year

suicide[GroupBy[{#continent, #year} &] /* Values,
 <|
   "continent" -> Query[Max, #continent &], 
   "year" -> Query[Max, #year &], 
   "total_suicides" -> Query[Total, #"suicides_no" &]
 |>]
POSTED BY: Rohit Namjoshi

Hello Tester, this should be something to get started with, pending some clarification of your post. Do you want to add another column or a row? And what should go in that column?

data = Import["https://unstats.un.org/unsd/methodology/m49/", "Data"];
Rest[Part[data,2,1,2,1,2,1][[All,2]]]/. Map[CountryData[#, "UNNumber"] -> # &, CountryData[All]] 

Using the UN number is problematic because one entry in the M49 data is missing there. There are ways to fix this, and the M49 codes themselves are all present. Nevertheless, not all numbers match a Wolfram Language CountryData entity.
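A rough way to see which codes have no matching entity is to build an explicit lookup table from UN number to country and check each code against it. This is only a sketch: `sampleCodes` is a hypothetical hand-typed list rather than the imported M49 table, and it assumes the codes are integers, as the replacement rule above implies.

```wolfram
(* UN numeric code -> country entity, built from the "UNNumber" property used above *)
unToCountry = Association[(CountryData[#, "UNNumber"] -> #) & /@ CountryData[All]];

(* Hypothetical sample of codes: 203 and 40 are Czechia and Austria in M49; 999 is a made-up code *)
sampleCodes = {203, 40, 999};

(* Look up each code; codes without a matching entity become Missing["NoEntity", code] *)
matches = AssociationMap[Lookup[unToCountry, #, Missing["NoEntity", #]] &, sampleCodes];

(* Keep only the codes that did not match any entity *)
Keys[Select[matches, MissingQ]]
```

Compared with a bare replacement rule, an explicit lookup leaves a marker for every unmatched code, so they are easy to collect and fix by hand.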

POSTED BY: Hans Michel