Markov chain n-gram models

Posted 10 years ago
Here are links to two of my blog posts discussing the application of n-gram models to:
1. text generation,
2. genome data classification.

The second post has a discussion about using a modified Receiver Operating Characteristic (ROC) to select the best n of the n-gram models for different combinations of gene pairs.
Here is an example of ROC plots:

I plan to update this post with the Mathematica code I wrote and used. That code can also be found in MathematicaForPrediction at GitHub. The full article describing the genome data classification algorithm and experiments can be downloaded from this link.
POSTED BY: Anton Antonov
3 Replies
Here is the application of the n-gram model to text generation using the full text of the play "Hamlet" as training data:
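(* NGramMarkovChainText is not built in; it comes from the code mentioned in the original post and must be loaded first. *)
text = ExampleData[{"Text", "Hamlet"}];
(* For each n from 2 to 5: seed with the n words starting at word 1020, with 200 as the requested length. *)
genTexts = {#,
    NGramMarkovChainText[text, #,
     StringSplit[text][[1020 ;; 1020 + # - 1]], 200,
     WordSeparators -> {" ", "\n"}]} & /@ Range[2, 5]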

The table shows that the 5-gram generated text reads more coherently than the 2-gram one. All four randomly generated texts start from the same place in the play (word 1020 of the whitespace-split text).
(A more detailed discussion is given in my "Markov chains n-gram model implementation" blog post.)
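For readers who want to see the idea without loading the package, here is a minimal sketch of word-level n-gram text generation. It is not the NGramMarkovChainText implementation: the names nGramSuccessors and nGramGenerate are hypothetical, and I am assuming that the chain state is the previous n words (matching the n-word seed used above) and that the seed has at least n words.

```mathematica
(* Hypothetical helpers for illustration; not part of MathematicaForPrediction. *)

(* Map each run of n consecutive words to the list of words seen right after it. *)
nGramSuccessors[words_List, n_Integer?Positive] :=
  GroupBy[Partition[words, n + 1, 1], Most -> Last];

(* Random walk: repeatedly sample a successor of the last n generated words. *)
(* Assumes Length[seed] >= n. *)
nGramGenerate[words_List, n_Integer?Positive, seed_List, len_Integer] :=
  Module[{succ = nGramSuccessors[words, n], out = seed, next},
   Do[
    next = succ[Take[out, -n]];
    (* Dead end: this run of n words was never followed by anything in the training text. *)
    If[MissingQ[next], Break[]];
    AppendTo[out, RandomChoice[next]],
    {len}];
   StringRiffle[out, " "]];

(* Usage: up to 50 words after a 3-word seed taken from the play. *)
words = StringSplit[ExampleData[{"Text", "Hamlet"}]];
nGramGenerate[words, 3, words[[1020 ;; 1022]], 50]
```

Because repeated successors are kept in the lists, RandomChoice samples each next word in proportion to how often it followed the current state in the training text. Larger n leaves fewer choices per state, which is why the output follows the original play more closely.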
POSTED BY: Anton Antonov
Thank you, Jon, that is nice to hear!
POSTED BY: Anton Antonov
Posted 10 years ago
Thanks for the articles you have been publishing. They have been very helpful and illuminating.
POSTED BY: Jon Rogers