Get an updated value for WordFrequencyData?

Hello community,

I have two questions regarding WordFrequencyData[]:

I noticed that the latest date available for this function is 2008 (12 years ago), even in the new version 12.1. I understand that the data comes from the "Google Books English n-gram public dataset".

I'm still trying to understand how this command (WordFrequencyData) works, so I may be missing something. Example:

WordFrequencyData["computer", "TimeSeries", {1900, Now}]
DateListPlot[%]

Now
Today
DateValue["Year"]
WordFrequencyData["computer", "TimeSeries", {1900, Today}]
WordFrequencyData["computer", "TimeSeries", {1900, DateValue["Year"]}]
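
One way to check where the returned series actually ends, independent of the requested range, is to query the result for its last date. This is only a minimal sketch, assuming the call returns a TimeSeries object; the variable name ts is introduced here for illustration:

```wolfram
(* Request the series up to the present; the data may stop earlier than requested *)
ts = WordFrequencyData["computer", "TimeSeries", {1900, Now}];

(* "LastDate" is a standard TimeSeries property giving the final time stamp *)
ts["LastDate"]
```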

My questions are:

1) Are there any estimates when this data will be updated?

2) Is there any workaround for this, perhaps using WebSearch[] in some way?

Thank you very much.

POSTED BY: Claudio Chaib
3 Replies

There is additional functionality that allows you to analyze the raw text data. Because the data is divided into 99 fragments (numbered 00 through 98), you can use the following to download them as a first step and continue from there:

 (* Start one asynchronous download per fragment and print each task when it finishes *)
 Table[URLDownloadSubmit[
   "http://commondatastorage.googleapis.com/books/syntactic-ngrams/eng/nodes." <> IntegerString[n, 10, 2] <> "-of-99.gz", 
   "~/Downloads/" <> IntegerString[n, 10, 2] <> ".gz", 
   HandlerFunctions -> <|"TaskFinished" -> Print|>], {n, 0, 98}]
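
Before launching all of the downloads, it may help to fetch a single fragment synchronously and confirm that it arrived. The following is a minimal sketch, assuming the same URL pattern as above; fragmentURL and target are hypothetical names introduced here, and the temporary directory is an arbitrary choice:

```wolfram
(* Hypothetical helper: build the URL of fragment n from the pattern used above *)
fragmentURL[n_Integer] := 
  "http://commondatastorage.googleapis.com/books/syntactic-ngrams/eng/nodes." <> 
   IntegerString[n, 10, 2] <> "-of-99.gz"

(* Arbitrary local path for the test download *)
target = FileNameJoin[{$TemporaryDirectory, "nodes-00.gz"}];

(* URLDownload blocks until the transfer finishes, unlike URLDownloadSubmit *)
URLDownload[fragmentURL[0], target];

(* Report the size in bytes if the file was written *)
If[FileExistsQ[target], FileByteCount[target], "download failed"]
```

If the single fragment downloads correctly, the asynchronous Table of URLDownloadSubmit calls can then be run for the full set.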

Best,

POSTED BY: Claudio Chaib