Thank you very much for your reply, Jonathan.
I find your article very interesting and inspiring. Unfortunately, I am not a proficient Wolfram user. However, I am currently studying finance, and I would really like to learn how you did this sentiment analysis.
With Rohit's help, I went through the first part of the code, and now I am having some trouble with the WSJSentimentIndicator function:
WSJSentimentIndicator[date_] :=
Module[{d = date, archive, archivewords, WSJSI},
archive =
Import[StringJoin["http://www.wsj.com/public/page/archive-",
DateString[d, {"Year", "-", "MonthShort", "-", "DayShort"}],
".html"]];
archive =
StringDrop[archive,
StringPosition[archive,
DateString[d, {"MonthName", " ", "DayShort", ", ", "Year"}]][[1,
2]]];
archive =
StringTake[
archive, -1 + StringPosition[archive, "ARCHIVE FILTER"][[1, 1]]];
archivewords = ToLowerCase[DeleteStopwords[TextWords[archive]]];
WSJSI = #Positive/(#Negative + #Positive) &@
Counts[Classify["Sentiment", archivewords]] // N;
{WSJSI, archivewords, archive}]
So, if we update the code for the current WSJ archive URL and page layout, it should look like this:
WSJSentimentIndicator[date_ ] :=
Module[{d = date, archive, archivewords, WSJSI},
archive =
Import[StringJoin["https://www.wsj.com/news/archive/",
DateString[d, {"Year", "Month", "Day"}]]];
archive =
StringDrop[archive,
StringPosition[archive,
DateString[
d, {"MonthNameShort", " ", "DayShort", " ",
"Year"}]][[1, 2]]];
archive =
StringTake[
archive, -1 +
StringPosition[archive, "Most Popular Articles"][[1, 1]]];
archivewords = ToLowerCase[DeleteStopwords[TextWords[archive]]];
WSJSI = #Positive /(#Negative + #Positive) &@
Counts[Classify["Sentiment", archivewords]] // N;
{WSJSI, archivewords, archive}]
However, the code that produces the histogram does not work for me:
WSJSI = Flatten[First@WSJSentimentIndicator[#]&/@datelist]
Histogram[tsWSJSI, PlotLabel -> Style["Histogram of WSJ Sentiment indicator",Bold]]
If you have time, could you please help me solve this? I have a feeling that WSJSI does not handle 'datelist' correctly.
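One thing I noticed while writing this up: the histogram snippet assigns its result to WSJSI but then plots tsWSJSI, which is not defined anywhere in this post. Below is a minimal sketch of what I would try next. It assumes the updated WSJSentimentIndicator above is already defined and that the archive pages still load. The date range is only a placeholder I made up, since my actual datelist is not shown here, and I do not know whether tsWSJSI in the article was meant to be a separate time series object.

```wolfram
(* Hypothetical date range for illustration; replace with the real datelist *)
datelist = DateRange[{2020, 1, 6}, {2020, 1, 10}];

(* Keep only the first element of each result, i.e. the numeric indicator *)
WSJSI = First[WSJSentimentIndicator[#]] & /@ datelist;

(* Plot the same symbol that was just assigned *)
Histogram[WSJSI,
 PlotLabel -> Style["Histogram of WSJ Sentiment indicator", Bold]]
```

If this version plots correctly, the mismatch between the two symbol names would be the likely cause rather than the handling of datelist.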