
[WSS16] Essay Grading Assistant

Many core subject areas of academia have found success in automated grading, as it often saves a good amount of time. With major advancements in natural language processing and computational linguistics, there has been considerable interest in extending this type of grading to writing. Many school districts around the country assess student writing according to rubrics they have developed, often inspired by larger entities such as the Common Core State Standards. As a result, these rubrics are often rigorous and can be confusing to implement. Additionally, consistency in grading policies is difficult in this area, largely due to the subjective nature of language, communication, and creativity. That being said, this is certainly an exciting field, and given some thought, one could imagine many ways to tackle this problem.

Mathematica provides many tools to explore and potentially guide this process. Let's take a look at the first 10 sentences from a popular document:

doi = Take[
  TextSentences[ExampleData[{"Text", "DeclarationOfIndependence"}]], 
  10]

To see how the writers contextually progressed through the document, we can Classify each sentence with Mathematica's built-in "FacebookTopic" classifier:

classified = Classify["FacebookTopic", #] & /@ doi

{"QuotesAndLifePhilosophy", "QuotesAndLifePhilosophy", "Politics", "QuotesAndLifePhilosophy", "Politics", "Health", "Politics", "SchoolAndUniversity", "Politics", "Politics"}
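As a rough way to summarize that progression, the topic labels can be tallied and grouped into runs of consecutive sentences. This is only a sketch: it copies the labels from the output above so that it runs without calling Classify, the names topics, topicCounts, topicRuns, and topicShifts are mine, and treating the number of topic changes as a hint about structure is my own assumption, not something built into Mathematica.

```mathematica
(* Labels copied from the Classify output above *)
topics = {"QuotesAndLifePhilosophy", "QuotesAndLifePhilosophy", "Politics",
   "QuotesAndLifePhilosophy", "Politics", "Health", "Politics",
   "SchoolAndUniversity", "Politics", "Politics"};

(* How often each topic occurs *)
topicCounts = Counts[topics]
(* <|"QuotesAndLifePhilosophy" -> 3, "Politics" -> 5, "Health" -> 1, "SchoolAndUniversity" -> 1|> *)

(* Runs of consecutive sentences that share a topic *)
topicRuns = Split[topics];

(* Number of times the topic changes from one sentence to the next *)
topicShifts = Length[topicRuns] - 1
(* 7 *)
```

Half of the ten sentences land in "Politics", and the label changes seven times across them, which is about what one would expect from a document that alternates between principle and grievance.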

And check the polarity of each statement:

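(* Pair each topic with its sentence index, then map it to the sentiment. *)
(* TextSentences[doi] wraps each sentence in a list, hence the {"Neutral"} form. *)
result = Thread[
  Transpose[{classified, Range[Length[classified]]}] -> 
   (Classify["Sentiment", #] & /@ TextSentences[doi])]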

{{"QuotesAndLifePhilosophy", 1} -> {"Neutral"}, {"QuotesAndLifePhilosophy", 2} -> {"Neutral"}, {"Politics", 3} -> {"Neutral"}, {"QuotesAndLifePhilosophy", 4} -> {"Neutral"}, {"Politics", 5} -> {"Neutral"}, {"Health", 6} -> {"Neutral"}, {"Politics", 7} -> {"Neutral"}, {"SchoolAndUniversity", 8} -> {"Positive"}, {"Politics", 9} -> {"Negative"}, {"Politics", 10} -> {"Neutral"}}

We can create a graph which may be easier to visualize in terms of structure:

Graph[Partition[result, 2, 1] /. {a_, b_} -> a \[DirectedEdge] b, 
 VertexLabels -> "Name", GraphLayout -> "LayeredDigraphEmbedding"]

[Figure: doi topics]

For writing rubrics, this kind of graph can serve as a visualization of essay structure. Additionally, the TextStructure and WordData functions open up many possibilities for text analysis. For example, are you aware that WordData hyphenates the word "really" into three parts, which would make it a three-syllable word (a polysyllable)?

WordData["really", "Hyphenation"]

{"re", "al", "ly"}

We can have a little fun with that...

Riffle[WordData["really", "Hyphenation"], " "] // StringJoin // Speak
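The same hyphenation data can give a rough polysyllable list for a whole sentence. The sketch below is not part of the application itself; syllableEstimate and polysyllables are hypothetical helper names, and counting hyphenation parts as syllables is an approximation. Words that WordData does not know are counted as one syllable, which is another assumption.

```mathematica
(* Rough syllable estimate: the number of hyphenation parts reported by WordData. *)
(* Words without hyphenation data fall back to 1 (an assumption of this sketch). *)
syllableEstimate[word_String] := 
 With[{parts = Quiet[WordData[ToLowerCase[word], "Hyphenation"]]},
  If[MatchQ[parts, {__String}], Length[parts], 1]]

(* Words with three or more estimated syllables *)
polysyllables[text_String] := 
 Select[TextWords[text], syllableEstimate[#] >= 3 &]

polysyllables["I really enjoyed this remarkable essay."]
```

A count like Length[polysyllables[text]] is the kind of input that basic readability formulas use, so it could feed a readability score later on.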

Just in case you spell it wrong, there's always

SpellingCorrectionList["reely"]

{"leery", "reel", "rely", "freely", "reels", "reedy", "reply", "reel y", "really", "Reilly", "relay", "Elysee", "realty", "realm"}

Using these tools, along with the rubrics developed and lots more, I have begun to develop an application that takes a district's rubric and several scored writing prompts. A teacher will supply a class of student essays, and the application will send back statistics on spelling errors, grammar errors, suggestions for better prose, and basic readability scores. The teacher is then given the opportunity to provide feedback on the essays. This is not a requirement, but it will produce better results. Upon the final submission, the application uses neural network classifiers, taking in both the original and adjusted essays, to determine reasonable scores for each point of the rubric. The application also produces charts related to the student essay, including an analysis of the uniqueness of student word choice, obtained by matching against the most frequently used English words. The teacher can override the scores if they wish, and then all the information is appended to the database, and the classifiers are scheduled for retraining.
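To give a flavor of the statistics step, here is a minimal sketch, not the application's actual code. essayStats is a hypothetical helper name, the sample essay is made up, and I am assuming that DictionaryWordQ is a good enough test for possible misspellings and that the built-in WordList[] of common words can stand in for a list of the most frequently used English words.

```mathematica
(* Hypothetical helper: basic statistics for one essay *)
essayStats[essay_String] := Module[{words, distinct, misspelled, common},
  words = TextWords[essay];
  distinct = DeleteDuplicates[ToLowerCase /@ words];
  (* Words not recognized as dictionary words are treated as possible misspellings *)
  misspelled = DeleteDuplicates[Select[words, ! DictionaryWordQ[#] &]];
  (* WordList[] gives a built-in list of common English words *)
  common = ToLowerCase /@ WordList[];
  <|
   "WordCount" -> Length[words],
   "SentenceCount" -> Length[TextSentences[essay]],
   "PossibleMisspellings" -> misspelled,
   (* Up to three suggested corrections per flagged word *)
   "Suggestions" -> 
    AssociationMap[Take[SpellingCorrectionList[#], UpTo[3]] &, misspelled],
   (* Share of distinct words that are not in the common-word list *)
   "UncommonWordFraction" -> 
    N[Length[Complement[distinct, common]]/Max[1, Length[distinct]]]
   |>]

(* Made-up sample input *)
essayStats["I reely enjoyed the book. It was a remarkable story."]
```

Returning an Association keeps each statistic addressable by name, so the same structure could be mapped over a whole class of essays and charted.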

The results so far are promising, but there is much work to be done in terms of identification of phrases, repetitiveness, and analysis of Google's Ngram database, as well as work on lexical graphing and computational linguistics.
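As one possible starting point for the repetitiveness work, repeated word sequences can be counted directly. This is a sketch under my own assumptions: repeatedNGrams is a hypothetical name, the sample text is made up, and a simple lowercase word n-gram count is only one of many ways to define repetition.

```mathematica
(* Hypothetical helper: word n-grams that occur more than once in a text *)
repeatedNGrams[text_String, n_Integer : 2] := 
 Select[Counts[Partition[ToLowerCase /@ TextWords[text], n, 1]], # > 1 &]

(* Made-up sample input; with n = 2 this looks for repeated word pairs *)
repeatedNGrams["The results are very good. The results are very clear."]
```

Raising n finds longer repeated phrases, at the cost of fewer matches in short essays.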

POSTED BY: John Corley