How good is Mathematica with Natural Language processing?

I've used NLTK a little bit, and I normally use Python to extract linguistic data. This is only my second day using Mathematica, and I'm wondering whether the following things are actually possible:

  1. I know there are thousands of languages included in M. English, it seems, offers a wide range of tools for linguistic analysis. How about other languages? Let's say I'd want to separate the syllables of Spanish... would that be doable here?

  2. What would you suggest I read as far as language processing goes...? Do people normally use M for that? Are there packages for that?

Thanks

3 Replies

In the last eight years I have used Mathematica quite a lot for Natural Language Processing and text mining. Here are a couple of links that describe that work:

[1] "Statistical thesaurus from NPR podcasts" : http://mathematicaforprediction.wordpress.com/2013/10/15/statistical-thesaurus-from-npr-podcasts/

[2] "Natural language processing with functional parsers" http://mathematicaforprediction.wordpress.com/2014/02/13/natural-language-processing-with-functional-parsers/

Both blog posts have links to Mathematica packages and guides for doing NLP.

The approaches in those links are more-or-less language agnostic. I have used them to make search engines that combine (i) English, Spanish, and French, and (ii) English and Malay.

You might also find this discussion interesting: "Convergence of synonym networks", http://community.wolfram.com/groups/-/m/t/227651

As for your question, "Let's say I'd want to separate the syllables of Spanish... would that be doable here?": I have not done syllable separation for Spanish myself; I have only used appropriate stemmers.
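
For illustration only, syllable separation for Spanish can be approximated with a handful of orthographic rules, whatever the host language. The sketch below is not taken from the packages linked above; it is written in Python (the language the question mentions) and every name in it (`syllabify`, `ONSETS`, `_nuclei`) is hypothetical. It assumes the usual textbook rules: two strong vowels split, a weak vowel joins its neighbour, and clusters such as "tr" or "ll" are never divided. It treats "y" as a consonant and ignores morphological exceptions such as prefixes (sub-rayar), so it is a starting point rather than a complete syllabifier.

```python
# Rule-based Spanish syllable splitter (approximate, orthographic rules only).

# Accented i/u break a diphthong, so they are grouped with the strong vowels.
STRONG = "aeoáéóíú"
WEAK = "iuü"
VOWELS = STRONG + WEAK

# Consonant pairs that always open a syllable together and are never split.
ONSETS = {"pr", "br", "tr", "dr", "cr", "gr", "fr",
          "pl", "bl", "cl", "gl", "fl", "ch", "ll", "rr"}


def _nuclei(low):
    """Return (start, end) spans of vowel groups forming one syllable nucleus."""
    spans = []
    i, n = 0, len(low)
    while i < n:
        if low[i] not in VOWELS:
            i += 1
            continue
        j = i + 1
        # Keep extending unless two strong vowels meet (that is a hiatus).
        while (j < n and low[j] in VOWELS
               and not (low[j - 1] in STRONG and low[j] in STRONG)):
            j += 1
        spans.append((i, j))
        i = j
    return spans


def syllabify(word):
    """Split a single Spanish word into a list of syllables."""
    low = word.lower()
    spans = _nuclei(low)
    if len(spans) < 2:
        return [word]  # zero or one nucleus: nothing to split
    cuts = [0]
    for (_, end), (start, _) in zip(spans, spans[1:]):
        run = low[end:start]  # consonants between two nuclei
        if len(run) >= 2 and run[-2:] in ONSETS:
            onset = 2  # inseparable cluster moves to the next syllable
        else:
            onset = min(1, len(run))  # otherwise at most one consonant moves
        cuts.append(start - onset)
    cuts.append(len(word))
    return [word[a:b] for a, b in zip(cuts, cuts[1:])]
```

For example, `"-".join(syllabify("instruir"))` gives `ins-truir` and `syllabify("poeta")` gives `["po", "e", "ta"]`. The same rules could be expressed with string patterns in Mathematica, but that port is left as an assumption here rather than something the thread demonstrates.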

POSTED BY: Anton Antonov

Thanks a lot, Anton. I'll take a look.

POSTED BY: Sean Clarke