Calculating the popularity of languages over a set of countries

Posted 5 years ago

Hi all

I'm quite new to Alpha and have a specific question that I'm not sure Wolfram Alpha can answer. I have a set of countries and I want to know how popular each language is across all the countries combined (both as a percentage and as a number of people), if possible with a breakdown for each country. For interest, this is the list of countries where Wikipedia Zero is available, a service that provides Wikipedia with no data charges, predominantly in countries in the global south: http://wikimediafoundation.org/wiki/Wikipedia_Zero.

Thanks

John

15 Replies
Posted 5 years ago

Hi Nasser

Many thanks, but that's not quite right. I'm not looking for the most common language per country; I'm looking for the most common languages across the total population of a set of countries.

Thanks

John

Posted 5 years ago

For reference the set of countries are:

Angola, Anguilla, Antigua, Bangladesh, Botswana, British Virgin Islands, Burundi, Cameroon, Cayman Islands, Democratic Republic of Congo, East Timor, El Salvador, Ghana, Haiti, India, Indonesia, Ivory Coast, Jamaica, Jordan, Kazakhstan, Kenya, Kosovo, Kyrgyzstan, Madagascar, Malaysia, Mongolia, Montenegro, Morocco, Myanmar, Nepal, Niger, Nigeria, Pakistan, Panama, Philippines, Russia, Rwanda, Saudi Arabia, South Africa, Sri Lanka, St Kitts, Suriname, Tajikistan, Thailand, Tobago, Trinidad, Tunisia, Turks and Caicos Islands, Uganda and Ukraine

Not sure how this would be done in Wolfram Alpha Pro, but it's not so hard in Wolfram Language. If you look at

http://reference.wolfram.com/language/ref/LanguageData.html

you can probably work it out, e.g. using the "CountryLanguageFractions" property.

To find the fraction over two countries, you might use the formula

average = ((fraction in X)*(population of X) + (fraction in Y)*(population of Y))/((population of X) + (population of Y))

which generalizes to a list of countries.
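Written out for a whole list of countries, this is a population-weighted average. The symbols below are introduced only for illustration: p_c is the population of country c, and f_{c,l} is the fraction of that country's population speaking language l (taken as 0 when the language is not listed for that country).

```latex
F_{l} = \frac{\sum_{c} f_{c,l}\, p_{c}}{\sum_{c} p_{c}}
```

The numerator on its own is the estimated number of speakers of language l across all the countries.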

Posted 5 years ago

Hi Todd

Thanks so much for this. I'm sorry, but I'm having trouble understanding it. Would you be able to write an example input that I can type in for two countries, say Angola and Anguilla? That way I can expand the example to find the most common languages across the total population of all the countries I would like to input.

Many thanks

John

Sure, though maybe CountryData will be the faster way to go.

X=QuantityMagnitude[CountryData["Angola","Population"]];
Y=QuantityMagnitude[CountryData["Anguilla","Population"]];

The output of this

CountryData["Angola", "LanguagesFractions"]

is a list of rules of the form language -> fraction, so you need a way to match up the languages. You could do it like this

A=CountryData["Angola", "LanguagesFractions"];
B=CountryData["Anguilla", "LanguagesFractions"];
languages=Union[A[[All,1]],B[[All,1]]];

and then something like this

# -> ((# /. Append[A, _ -> 0])*X + (# /. Append[B, _ -> 0])*Y)/(X + Y) & /@ languages
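To see what the `# /. Append[A, _ -> 0]` idiom is doing, here is a minimal sketch with made-up numbers for two hypothetical countries (none of the names or values below come from CountryData). Appending the catch-all rule `_ -> 0` means a language missing from a country's list is looked up as 0 instead of being left unchanged.

```wolfram
(* Made-up numbers for two hypothetical countries; not real data *)
popX = 1000; popY = 3000;
fracX = {"LangA" -> 0.5, "LangB" -> 0.25};
fracY = {"LangB" -> 0.75};

(* Look up a language in a rule list, falling back to 0 when it is absent *)
lookup[rules_, lang_] := lang /. Append[rules, _ -> 0];

(* All languages that appear in either country *)
langs = Union[fracX[[All, 1]], fracY[[All, 1]]];

(* Population-weighted fraction for each language *)
combined = Map[
  Function[lang,
   lang -> (lookup[fracX, lang]*popX + lookup[fracY, lang]*popY)/(popX + popY)],
  langs]
(* {"LangA" -> 0.125, "LangB" -> 0.625} *)
```

Here LangA is spoken by 500 of the 4000 people in total, and LangB by 250 + 2250 = 2500 of them.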
Posted 5 years ago

Hi Todd

Can I check that I have entered this correctly? I had assumed one of the outputs was giving me the percentage of the total population who speak each language, but Wolfram Alpha tells me that only 60% of people in Angola speak Portuguese. Anguilla should barely affect this share of the total, because it only has 15,000 inhabitants, none of whom speak Portuguese.

https://www.wolframcloud.com/objects/68ba4fa3-0b9d-42bb-9916-6333910724ec

Could you explain what I'm seeing from the two outputs?

I'm sorry to keep asking questions

Thanks very much for your help

John

I see, yes, that is strange. I don't know. You should send a comment to the Wolfram Alpha team, for one thing because the percentages it gives add up to a little more than 100%, even taking rounding into account. From a quick glance at the Wikipedia page, it seems there might not be a reliable number for this one.

Posted 5 years ago

Thanks very much Todd, I've emailed support

Cheers

John

If the question is about Wolfram|Alpha, the best email address is info@wolframalpha.com .

Posted 5 years ago

Hi Bruce

I think this is a Mathematica problem, so I sent it to support@wolfram.com. I'm sure they'll pass it on if I've gone to the wrong place.

Cheers

John

Posted 5 years ago

So I think I've fixed that problem, but made myself a new one by not knowing what I'm doing.

I fixed the previous problem by removing a line of code that I'd added because I didn't understand the instructions properly.

I've got this to work:

X01=QuantityMagnitude[CountryData["Spain","Population"]];
X02=QuantityMagnitude[CountryData["Portugal","Population"]];

A01=CountryData["Spain", "LanguagesFractions"];
A02=CountryData["Portugal", "LanguagesFractions"];
languages=Union[A01[[All,1]],A02[[All,1]]];

# -> ((# /. Append[A01, _ -> 0])*X01 + (# /. Append[A02, _ -> 0])*X02)/(X01 + X02) & /@ languages

This gives me the following, which seems to be correct:

{Entity["Language", "Aragonese"] -> 0.00023, Entity["Language", "Asturian"] -> 0.0026, Entity["Language", "Basque"] -> 0.012, Entity["Language", "Calo"] -> 0.00093, Entity["Language", "CatalanValencianBalear"] -> 0.14, Entity["Language", "CatalonianSignLanguage"] -> 0.00038, Entity["Language", "Extremaduran"] -> 0.0042, Entity["Language", "Fala"] -> 0.00022, Entity["Language", "Galician"] -> 0.067, Entity["Language", "Gascon"] -> 0.000080, Entity["Language", "MirandaDoDouro"] -> 0.00028, Entity["Language", "Portuguese"] -> 0.18, Entity["Language", "RomaniVlax"] -> 9.2*10^-6, Entity["Language", "Spanish"] -> 0.59, Entity["Language", "SpanishSignLanguage"] -> 0.0021}

But I'm getting an error when I add the full list of countries. Could someone tell me what I'm doing wrong, please? I get these errors:

Part::partd: Part specification Missing[NotApplicable][[All,1]] is longer than depth of object. >>
Union::heads: Heads Part and List at positions 24 and 1 are expected to be the same. >>

When I enter:

X01 = QuantityMagnitude[
   CountryData["Republic of Angola", "Population"]];
X02 = QuantityMagnitude[CountryData["Anguilla", "Population"]];
X03 = QuantityMagnitude[
   CountryData["Antigua and Barbuda", "Population"]];
X04 = QuantityMagnitude[CountryData["Bangladesh", "Population"]];
X05 = QuantityMagnitude[CountryData["Barbados", "Population"]];
X06 = QuantityMagnitude[CountryData["Bermuda", "Population"]];
X07 = QuantityMagnitude[CountryData["Botswana", "Population"]];
X08 = QuantityMagnitude[
   CountryData["British Virgin Islands", "Population"]];
X09 = QuantityMagnitude[CountryData["Cameroon", "Population"]];
X10 = QuantityMagnitude[CountryData["Cayman Islands", "Population"]];
X11 = QuantityMagnitude[
   CountryData["Democratic Republic of the Congo", "Population"]];
X12 = QuantityMagnitude[CountryData["East Timor", "Population"]];
X13 = QuantityMagnitude[CountryData["El Salvador", "Population"]];
X14 = QuantityMagnitude[CountryData["Ghana", "Population"]];
X15 = QuantityMagnitude[CountryData["Guyana", "Population"]];
X16 = QuantityMagnitude[CountryData["Haiti", "Population"]];
X17 = QuantityMagnitude[CountryData["India", "Population"]];
X18 = QuantityMagnitude[CountryData["Indonesia", "Population"]];
X19 = QuantityMagnitude[CountryData["Ivory Coast", "Population"]];
X20 = QuantityMagnitude[CountryData["Jamaica", "Population"]];
X21 = QuantityMagnitude[CountryData["Jordan", "Population"]];
X22 = QuantityMagnitude[CountryData["Kazakhstan", "Population"]];
X23 = QuantityMagnitude[CountryData["Kenya", "Population"]];
X24 = QuantityMagnitude[CountryData["Kosovo", "Population"]];
X25 = QuantityMagnitude[CountryData["Kyrgyzstan", "Population"]];
X26 = QuantityMagnitude[CountryData["Madagascar", "Population"]];
X27 = QuantityMagnitude[CountryData["Malaysia", "Population"]];
X28 = QuantityMagnitude[CountryData["Mongolia", "Population"]];
X29 = QuantityMagnitude[CountryData["Montenegro", "Population"]];
X30 = QuantityMagnitude[CountryData["Morocco", "Population"]];
X31 = QuantityMagnitude[CountryData["Myanmar", "Population"]];
X32 = QuantityMagnitude[CountryData["Nepal", "Population"]];
X33 = QuantityMagnitude[CountryData["Niger", "Population"]];
X34 = QuantityMagnitude[CountryData["Nigeria", "Population"]];
X35 = QuantityMagnitude[CountryData["Pakistan", "Population"]];
X36 = QuantityMagnitude[CountryData["Panama", "Population"]];
X37 = QuantityMagnitude[CountryData["Philippines", "Population"]];
X38 = QuantityMagnitude[CountryData["Rwanda", "Population"]];
X39 = QuantityMagnitude[CountryData["Saudi Arabia", "Population"]];
X40 = QuantityMagnitude[CountryData["South Africa", "Population"]];
X41 = QuantityMagnitude[CountryData["Sri Lanka", "Population"]];
X42 = QuantityMagnitude[
   CountryData["Saint Kitts and Nevis", "Population"]];
X43 = QuantityMagnitude[CountryData["Suriname", "Population"]];
X44 = QuantityMagnitude[CountryData["Tajikistan", "Population"]];
X45 = QuantityMagnitude[CountryData["Thailand", "Population"]];
X46 = QuantityMagnitude[
   CountryData["Trinidad and Tobago", "Population"]];
X47 = QuantityMagnitude[CountryData["Tunisia", "Population"]];
X48 = QuantityMagnitude[
   CountryData["Turks and Caicos Islands", "Population"]];
X49 = QuantityMagnitude[CountryData["Uganda", "Population"]];
X50 = QuantityMagnitude[CountryData["Ukraine", "Population"]];

A01 = CountryData["Republic of Angola", "LanguagesFractions"];
A02 = CountryData["Anguilla", "LanguagesFractions"];
A03 = CountryData["Antigua and Barbuda", "LanguagesFractions"];
A04 = CountryData["Bangladesh", "LanguagesFractions"];
A05 = CountryData["Barbados", "LanguagesFractions"];
A06 = CountryData["Bermuda", "LanguagesFractions"];
A07 = CountryData["Botswana", "LanguagesFractions"];
A08 = CountryData["British Virgin Islands", "LanguagesFractions"];
A09 = CountryData["Cameroon", "LanguagesFractions"];
A10 = CountryData["Cayman Islands", "LanguagesFractions"];
A11 = CountryData["Democratic Republic of the Congo", 
   "LanguagesFractions"];
A12 = CountryData["East Timor", "LanguagesFractions"];
A13 = CountryData["El Salvador", "LanguagesFractions"];
A14 = CountryData["Ghana", "LanguagesFractions"];
A15 = CountryData["Guyana", "LanguagesFractions"];
A16 = CountryData["Haiti", "LanguagesFractions"];
A17 = CountryData["India", "LanguagesFractions"];
A18 = CountryData["Indonesia", "LanguagesFractions"];
A19 = CountryData["Ivory Coast", "LanguagesFractions"];
A20 = CountryData["Jamaica", "LanguagesFractions"];
A21 = CountryData["Jordan", "LanguagesFractions"];
A22 = CountryData["Kazakhstan", "LanguagesFractions"];
A23 = CountryData["Kenya", "LanguagesFractions"];
A24 = CountryData["Kosovo", "LanguagesFractions"];
A25 = CountryData["Kyrgyzstan", "LanguagesFractions"];
A26 = CountryData["Madagascar", "LanguagesFractions"];
A27 = CountryData["Malaysia", "LanguagesFractions"];
A28 = CountryData["Mongolia", "LanguagesFractions"];
A29 = CountryData["Montenegro", "LanguagesFractions"];
A30 = CountryData["Morocco", "LanguagesFractions"];
A31 = CountryData["Myanmar", "LanguagesFractions"];
A32 = CountryData["Nepal", "LanguagesFractions"];
A33 = CountryData["Niger", "LanguagesFractions"];
A34 = CountryData["Nigeria", "LanguagesFractions"];
A35 = CountryData["Pakistan", "LanguagesFractions"];
A36 = CountryData["Panama", "LanguagesFractions"];
A37 = CountryData["Philippines", "LanguagesFractions"];
A38 = CountryData["Rwanda", "LanguagesFractions"];
A39 = CountryData["Saudi Arabia", "LanguagesFractions"];
A40 = CountryData["South Africa", "LanguagesFractions"];
A41 = CountryData["Sri Lanka", "LanguagesFractions"];
A42 = CountryData["Saint Kitts and Nevis", "LanguagesFractions"];
A43 = CountryData["Suriname", "LanguagesFractions"];
A44 = CountryData["Tajikistan", "LanguagesFractions"];
A45 = CountryData["Thailand", "LanguagesFractions"];
A46 = CountryData["Trinidad and Tobago", "LanguagesFractions"];
A47 = CountryData["Tunisia", "LanguagesFractions"];
A48 = CountryData["Turks and Caicos Islands", "LanguagesFractions"];
A49 = CountryData["Uganda", "LanguagesFractions"];
A50 = CountryData["Ukraine", "LanguagesFractions"];
languages = Union[A01[[All, 1]], A02[[All, 1]], A03[[All, 1]], A04[[All, 1]], 
   A05[[All, 1]], A06[[All, 1]], A07[[All, 1]], A08[[All, 1]], 
   A09[[All, 1]], A10[[All, 1]], A11[[All, 1]], A12[[All, 1]], 
   A13[[All, 1]], A14[[All, 1]], A15[[All, 1]], A16[[All, 1]], 
   A17[[All, 1]], A18[[All, 1]], A19[[All, 1]], A20[[All, 1]], 
   A21[[All, 1]], A22[[All, 1]], A23[[All, 1]], A24[[All, 1]], 
   A25[[All, 1]], A26[[All, 1]], A27[[All, 1]], A28[[All, 1]], 
   A29[[All, 1]], A30[[All, 1]], A31[[All, 1]], A32[[All, 1]], 
   A33[[All, 1]], A34[[All, 1]], A35[[All, 1]], A36[[All, 1]], 
   A37[[All, 1]], A38[[All, 1]], A39[[All, 1]], A40[[All, 1]], 
   A41[[All, 1]], A42[[All, 1]], A43[[All, 1]], A44[[All, 1]], 
   A45[[All, 1]], A46[[All, 1]], A47[[All, 1]], A48[[All, 1]], 
   A49[[All, 1]], A50[[All, 1]]];

# -> (((((((((((((((((((((((((((((((((((((((((((((((((((# /. 
                    Append[A01, _ -> 0])*X01 + (# /. Append[
                    A02, _ -> 0])*X02) + (# /. Append[A03, _ -> 0])*
                    X03) + (# /. Append[A04, _ -> 0])*X04) + (# /. 
                    Append[A05, _ -> 0])*X05) + (# /. 
                    Append[A06, _ -> 0])*X06) + (# /. 
                    Append[A07, _ -> 0])*X07) + (# /. 
                    Append[A08, _ -> 0])*X08) + (# /. 
                    Append[A09, _ -> 0])*X09) + (# /. 
                    Append[A10, _ -> 0])*X10) + (# /. 
                    Append[A11, _ -> 0])*X11) + (# /. 
                    Append[A12, _ -> 0])*X12) + (# /. 
                    Append[A13, _ -> 0])*X13) + (# /. 
                    Append[A14, _ -> 0])*X14) + (# /. 
                    Append[A15, _ -> 0])*X15) + (# /. 
                    Append[A16, _ -> 0])*X16) + (# /. 
                    Append[A17, _ -> 0])*X17) + (# /. 
                    Append[A18, _ -> 0])*X18) + (# /. 
                    Append[A19, _ -> 0])*X19) + (# /. 
                    Append[A20, _ -> 0])*X20) + (# /. 
                    Append[A21, _ -> 0])*X21) + (# /. 
                    Append[A22, _ -> 0])*X22) + (# /. 
                    Append[A23, _ -> 0])*X23) + (# /. 
                    Append[A24, _ -> 0])*X24) + (# /. 
                    Append[A25, _ -> 0])*X25) + (# /. 
                    Append[A26, _ -> 0])*X26) + (# /. 
                    Append[A27, _ -> 0])*X27) + (# /. 
                    Append[A28, _ -> 0])*X28) + (# /. 
                    Append[A29, _ -> 0])*X29) + (# /. 
                    Append[A30, _ -> 0])*X30) + (# /. 
                    Append[A31, _ -> 0])*X31) + (# /. 
                    Append[A32, _ -> 0])*X32) + (# /. 
                    Append[A33, _ -> 0])*X33) + (# /. 
                    Append[A34, _ -> 0])*X34) + (# /. 
                    Append[A35, _ -> 0])*X35) + (# /. 
                    Append[A36, _ -> 0])*X36) + (# /. 
                    Append[A37, _ -> 0])*X37) + (# /. 
                    Append[A38, _ -> 0])*X38) + (# /. 
                    Append[A39, _ -> 0])*X39) + (# /. 
                    Append[A40, _ -> 0])*X40) + (# /. 
                    Append[A41, _ -> 0])*X41) + (# /. 
                    Append[A42, _ -> 0])*X42) + (# /. 
                    Append[A43, _ -> 0])*X43) + (# /. 
                    Append[A44, _ -> 0])*X44) + (# /. 
                    Append[A45, _ -> 0])*X45) + (# /. 
                  Append[A46, _ -> 0])*X46) + (# /. 
                Append[A47, _ -> 0])*X47) + (# /. 
              Append[A48, _ -> 0])*X48) + (# /. Append[A49, _ -> 0])*
          X49) + (# /. Append[A50, _ -> 0])*X50)/(X01 + X02 + X03 + 
       X04 + X05 + X06 + X07 + X08 + X09 + X10 + X11 + X12 + X13 + 
       X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 + X23 + 
       X24 + X25 + X26 + X27 + X28 + X29 + X30 + X31 + X32 + X33 + 
       X34 + X35 + X36 + X37 + X38 + X39 + X40 + X41 + X42 + X43 + 
       X44 + X45 + X46 + X47 + X48 + X49 + X50) & /@ languages

Any help would be appreciated

Thanks very much

John

You have a syntax error with the parentheses in the last expression: too many closing parentheses in the wrong places.

The formula I wrote was a little misleading. When you have a bunch of things there are faster ways to do this, e.g.,

populations=Map[Function[country,QuantityMagnitude[CountryData[country, "Population"]]], countrylist]

will give the list of populations (see the documentation of the individual function names).

Then you could use other functions like Total, Sum, Table, and once you have a more succinct piece of code it will be easier to see if it's right.
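As a small illustration of the Map idiom above, here is a sketch that runs without downloading any data. The association fakePopulation and its values are made up and stand in for CountryData[country, "Population"]; they are not real figures.

```wolfram
(* Hypothetical stand-in for CountryData[country, "Population"]; made-up values *)
fakePopulation = <|"CountryA" -> 1000, "CountryB" -> 3000|>;

countrylist = {"CountryA", "CountryB"};

(* One population per country, in the same order as countrylist *)
populations = Map[Function[country, fakePopulation[country]], countrylist]
(* {1000, 3000} *)

(* Combined population, which is the denominator of the weighted average *)
Total[populations]
(* 4000 *)
```

With the list in hand, the fifty separate X01 ... X50 assignments and the long sum in the denominator reduce to populations and Total[populations].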

Posted 5 years ago

Hi Todd

Thanks very much for your continued help. Honestly, I feel a bit lost now. I don't know how to fix the first list I've written, and I don't understand enough to know what's happening with the new piece of code you've suggested. I need to spend time working through all the documentation to learn how it all works. It's amazing software, but I'd like to get this working asap for a funding application. If the first bit of code is fixed, will it give me the answer? Is it just long, or is it going to give incorrect answers?

John

Sorry, it was one extra opening parenthesis. The syntax coloring should show you that the first one is pink in that long sequence of opening parentheses.

Doing it the more general way will make it easier to see if things are right. Define your list of countries, e.g. using the variable name countrylist as above. Get the list of populations as above. Then make the list of country language fractions in a similar manner.

languagefractions=Map[Function[country,CountryData[country, "LanguagesFractions"]], countrylist]

Then get the languages

languages=Apply[Union, languagefractions[[All,All,1]]];

Then get the number of language users in each country

users[population_, fractions_, language_]:= population*(language/.Append[fractions,_->0])

You could use Map or MapThread with Total, but maybe simpler to use Sum.

Map[Function[lang,lang->Sum[users[populations[[i]], languagefractions[[i]], lang], {i, 1, Length[countrylist]}]/Total[populations]],
languages]
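
One thing to watch for when running this over the full list: the Part::partd message quoted earlier shows Missing[NotApplicable] where a list of rules was expected, and the Union::heads message points at position 24, which in the posted code is A24 (Kosovo). So at least one country appears to have no language data, and languagefractions[[All,All,1]] would fail in the same way. A minimal sketch of one way around it, assuming such countries should simply be dropped, and using made-up stand-ins for the CountryData results:

```wolfram
(* Made-up stand-ins for the CountryData results; not real data *)
countrylist = {"CountryA", "CountryB", "CountryC"};
populations = {1000, 3000, 500};
languagefractions = {
   {"LangA" -> 0.5, "LangB" -> 0.25},
   {"LangB" -> 0.75},
   Missing["NotApplicable"]};  (* mimics a country with no language data *)

(* Positions of the countries whose language data is a list of rules *)
keep = Select[Range[Length[countrylist]], ListQ[languagefractions[[#]]] &];

usedcountries = countrylist[[keep]];
usedpopulations = populations[[keep]];
usedfractions = languagefractions[[keep]];

languages = Apply[Union, usedfractions[[All, All, 1]]];

users[population_, fractions_, language_] :=
  population*(language /. Append[fractions, _ -> 0]);

(* Estimated number of speakers of each language over the kept countries *)
speakers = Map[
   Function[lang,
    lang -> Sum[users[usedpopulations[[i]], usedfractions[[i]], lang],
      {i, 1, Length[usedcountries]}]],
   languages];

(* The same figures as fractions of the combined population of the kept countries *)
shares = Map[
   Function[rule, rule[[1]] -> rule[[2]]/Total[usedpopulations]],
   speakers];
```

Note that dropping a country also removes its population from the denominator, so the resulting fractions describe only the countries that were kept. The speakers list gives the head counts asked for in the original question, and shares gives the corresponding fractions.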