

Extract and detect specific semantic information from text

Posted 5 months ago
2 Replies
2 Total Likes

I am looking for an NLP/ML/deep learning technique to detect and extract useful information from a large amount of text data. Specifically, I want to detect whether a person actually visited a restaurant and ate some food there at a specific time.

For example, we may have three different sentences:

  • S1. John enjoyed spaghetti at Terranova located in New York City.
  • S2. John read an article about Terranova, the best spaghetti restaurant in New York City.
  • S3. John wanted to eat spaghetti at Terranova when he visited New York City, but he could not.

For these three sentences, my program should detect only S1, which describes an actual visit and eating activity at the restaurant. S2 and S3 do not indicate that he actually visited the restaurant. Additionally, once S1 has been detected, how can I extract the specific entities involved (person, location, and time information) from the sentence?

Could you please recommend any useful tools and algorithms to support this kind of task?


2 Replies

Have you looked at FindTextualAnswer?

Here's an example with 5 solutions:

```
s1 = "John enjoyed spaghetti at Terranova located in New York City."

FindTextualAnswer[s1, "Who did what?", 5, {"Probability", "String", "Position", "HighlightedSentence"}] // Column
```

The answer:

```
 {{0.0949745, "John enjoyed spaghetti at Terranova", {1, 35}}},
 {{0.00473055, "New York City", {48, 60}}},
 {{0.00355906, ".", {61, 61}}},
 {{0.0022374, "located", {37, 43}}},
 {{0.000319291, "in", {45, 46}}}
```


Other properties can be requested as well, such as "HighlightedSentence".
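For the second part of the question (pulling out the person and the location once a sentence such as S1 has been detected), one possible approach is the built-in TextCases function, optionally combined with a more targeted FindTextualAnswer question. This is only a minimal sketch: the entity types "Person" and "City" and the wording of the question are my own choices, not something from the original post, and the first evaluation may need to download model resources.

```
s1 = "John enjoyed spaghetti at Terranova located in New York City.";

(* Named-entity extraction: gives an association keyed by the requested entity types *)
TextCases[s1, {"Person", "City"}]

(* A targeted question aimed at the venue (the question wording is illustrative) *)
FindTextualAnswer[s1, "Where did John eat?"]
```

Adding "Date" to the list of types would be a natural way to try to pick up time expressions when the sentence contains one. S1 as written has none.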

Posted 5 months ago

I am just one reader of the many rich threads in this community, hoping to learn from others. I do have a Wolfram (free) developer evaluation licence running on Ubuntu 20.04, and I play with Jupyter notebooks.

Now, I am quite sure that Wolfram has the firepower to analyse your corpus, but I wonder whether you might get results more easily with the AntConc tool (get the latest version, which was just released).

In your case, are you not searching for collocations, that is, keywords separated by * wildcards?

enjoyed ... Terranova

ate ... Terranova
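The same kind of wildcard collocation search can also be sketched directly in Wolfram Language with string patterns, which may be handy for a quick first pass before reaching for a separate tool. This is a rough illustration under my own assumptions: the verb list is just the two keywords above, and the names `sentences`, `verbs` and `pattern` are made up for the example.

```
(* The three example sentences from the original question *)
sentences = {
   "John enjoyed spaghetti at Terranova located in New York City.",
   "John read an article about Terranova, the best spaghetti restaurant in New York City",
   "John wanted to eat spaghetti at Terranova when he visited New York City, but he could not."
   };

(* Illustrative keyword list; extend with other verbs as needed *)
verbs = "enjoyed" | "ate";

(* Verb as a whole word, then any text (the * wildcard), then the restaurant name *)
pattern = WordBoundary ~~ verbs ~~ WordBoundary ~~ __ ~~ "Terranova";

(* Keep only the sentences that contain the collocation *)
Select[sentences, StringContainsQ[pattern]]
```

On these three sentences the pattern should keep only S1. Note, though, that a pure keyword pattern knows nothing about negation: a sentence like "John never ate at Terranova" would still match, so this is a filter for candidates rather than a full answer to the question.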

I do ponder the possibility of linking AntConc into a Wolfram workflow. There is no formal API between the two, but AntConc can export its query results, which could then be imported into Wolfram for further visualisation of the corpus. There are test corpora embedded in AntConc.

My pennyworth.

POSTED BY: David Law