# Trying to analyze discussions around Covid using Twitter data

Posted 7 months ago
Hi! Here is a little about my idea and what I have figured out so far with my basic Mathematica knowledge. Hopefully you can suggest other ways to reach the objectives I have in mind.

Basically, I would like to check which words or topics are discussed most in my country's Twittosphere. I'm really interested in this data because I enjoy doing SciComm, and data about how the public interacts on the Internet will help me build better proposals whenever I have to do SciComm expos or similar. My country is also interesting because we're known for not investing much in science, for example. What reputation do scientists have in a not-so-scientific country? That would also be interesting to know.

To start, I picked 7 search words and hashtags that I know people in my country are using, based on the daily TT.

My first idea is to build word clouds from imported Twitter data. I tried using Mathematica v.11.2.0 on my Ubuntu computer, but sadly it can't process any of the Twitter code I write there, and I don't know why. So I decided to start working with a notebook in the Cloud. This is what I got using a script I found somewhere:

    twitter = ServiceConnect["Twitter"]
    result = twitter["TweetSearch", "Query" -> "#CoronavirusEnPeru", MaxItems -> 100];
    WordCloud@Flatten[Normal[StringSplit[#["Text"]] & /@ result]]

This produced a nice WordCloud, but it lacks filtering. The tweets I'm trying to analyze are all written in Spanish. So, using a list of Spanish stopwords, I created a list stepsp. My goal is to remove all Spanish stopwords to get a clearer picture. I also added some Twitter jargon that might be noisy.

This is what I tried in order to remove the Spanish stopwords, using my list stepsp:

    DeleteCases[Normal[result[[All,"Text"]]], Alternatives @@ stepsp]

And this is where I'm stuck right now. It seems that DeleteCases isn't actually doing anything.
I tried to produce a word cloud from that, but the computation seems to exceed what I'm able to do in the Cloud.

These are my questions:

Is DeleteCases a good way to remove stopwords? Why is that function not deleting what I want to delete?

Is there a way to obtain only tweets from a certain country? I tried using GeoLocation, but I don't know if this is the way to go.

How should I proceed with the Mathematica I have installed on my computer? This is what I get whenever I try to process anything:

    $CharacterEncoding: "The byte sequence {240} could not be interpreted as a character in the UTF-8 character encoding."

I have tried the same code that I'm using in the Cloud notebook on my own computer, but it doesn't work. And it seems that I won't be able to complete these tasks in the Cloud, since I might use up all the memory I have available.
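On the DeleteCases question, one likely explanation is that the list being filtered contains whole tweets, and DeleteCases only removes elements that match the pattern in full. A complete tweet never equals a single stopword, so nothing is deleted. Below is a minimal sketch of one way around that, assuming the tweets are split into lowercase words first. The `tweets` list and the contents of `stepsp` are made-up stand-ins for `result[[All, "Text"]]` and the real stopword list.

```wolfram
(* Hypothetical sample data standing in for Normal[result[[All, "Text"]]] *)
tweets = {"El virus en el Perú", "RT La cuarentena y la ciencia"};

(* Hypothetical short list standing in for the real stepsp *)
stepsp = {"el", "la", "en", "y", "rt"};

(* Split each tweet on whitespace after lowercasing, then flatten to one list of words *)
words = Flatten[StringSplit[ToLowerCase[#]] & /@ tweets];

(* Every element is now a single word, so the Alternatives pattern can match it *)
filtered = DeleteCases[words, Alternatives @@ stepsp];

WordCloud[filtered]
```

This sketch assumes the stopwords in stepsp are stored in lowercase. Punctuation, hashtags and URLs stay attached to the words, so they may need a separate cleaning step.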