GOAL OF THE PROJECT: To build an emotion categorization system that receives free-form text input from users and outputs a list of the emotions expressed in the text.
SUMMARY OF WORK: The work is based on the hourglass model of human emotions, which classifies emotions along the four dimensions of pleasantness, sensitivity, attention and aptitude, each of which has six levels called sentic levels, for a total of 24 emotions. We take data from the SenticNet project and use it to build a pattern matcher that scores tweets against the pre-modeled SenticNet data. We then use the scored tweets to train four classifiers, one per dimension, using the Markov method.
RESULTS AND FUTURE WORK: Because even a highly efficient pattern matcher takes a long time to process individual tweets, and because of the time constraints on the project, we used only 320,000 analysed and scored tweets out of the 2,360,155 in the HCCorpus to train the classifiers. With these we were able to produce a usable web form, https://www.wolframcloud.com/objects/user-a543cc5b-996e-42ff-a4df-5050c6ce7842/SaySomethingApp, that accepts input and produces output that is sometimes reasonable. Emotion categorization is an open-ended and difficult field, and our effort here is to see which methods and models deserve the most extensive effort in future work.
In the future we hope to train the classifiers on a much larger corpus of tweets, blog posts, news articles and other forms of user-generated content (UGC). If results are still not satisfactory, we will explore other models of human emotion, such as the Circumplex model, the Vector model, the Positive activation-negative activation (PANA) model, Plutchik's model, the PAD (Pleasure, Arousal and Dominance) emotional state model and the Lövheim cube of emotion.
My project is about emotion categorization of text. It is basically a system for extracting emotion-related information from free-form text entered through our web interface. Current sentiment analysis classifies text into two categories, either negative or positive. While in some cases this might be enough, it only scratches the surface of a much more complex system of affective information embedded within text. Our model uses the Hourglass of Emotions, shown in the graphic above, in which we classify affective information along the four dimensions of pleasantness, sensitivity, attention and aptitude, each of which has six sentic levels representing the intensity of the emotion.
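As a rough illustration of how a score on one dimension could be turned into a sentic level, here is a minimal Python sketch. The level labels are those of the published Hourglass of Emotions and should be checked against the SenticNet material before being relied on. The equal-width bins over [-1, 1] and the names SENTIC_LEVELS and sentic_level are assumptions made for this example, not part of the project code.

```python
# Level labels per dimension, ordered from most negative to most positive.
# Labels follow the published Hourglass of Emotions; verify before relying on them.
SENTIC_LEVELS = {
    "pleasantness": ("grief", "sadness", "pensiveness", "serenity", "joy", "ecstasy"),
    "attention": ("amazement", "surprise", "distraction", "interest", "anticipation", "vigilance"),
    "sensitivity": ("terror", "fear", "apprehension", "annoyance", "anger", "rage"),
    "aptitude": ("loathing", "disgust", "boredom", "acceptance", "trust", "admiration"),
}


def sentic_level(dimension, value):
    """Map a score in [-1, 1] to one of six labels.

    ASSUMPTION: six equal-width bins of width 1/3. The model's real level
    boundaries may differ.
    """
    if not -1.0 <= value <= 1.0:
        raise ValueError("value must lie in [-1, 1]")
    labels = SENTIC_LEVELS[dimension]
    # Clamp the index so that value == 1.0 falls in the top bin.
    index = min(int((value + 1.0) * 3), len(labels) - 1)
    return labels[index]
```

Under this binning, a pleasantness value of 0.089 falls in the "serenity" level and an attention value of -0.1 falls in "distraction".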
Sentics specifies the affective information associated with real-world entities. In sentic computing, whose name derives from the Latin 'sentire' (root of words such as sentiment and sentience) and 'sensus' (as in common sense), the analysis of natural language is based on common-sense reasoning tools, which enable the analysis of text not only at the document, page or paragraph level, but also at the sentence, clause and concept level.
Some of the most popular techniques for opinion mining simply focus on word co-occurrence frequencies and statistical polarity associated with words. Such approaches can correctly infer the polarity of unambiguous text with simple phrase structure and in a specific domain (i.e., the one the statistical classifier has been trained with). One of the main characteristics of natural language, however, is ambiguity. A word like big does not really hold any polarity on its own as it can either be negative, e.g., in the case of big problem, or positive, e.g., in big meal, but most statistical methods assign a positive polarity to it, as this often appears in a positive context.
Sentic computing proposes the ensemble application of AI and Semantic Web techniques, for knowledge representation and inference; mathematics, for carrying out tasks such as graph mining and multi-dimensionality reduction; linguistics, for discourse analysis and pragmatics; psychology, for cognitive and affective modeling; sociology, for understanding social network dynamics and social influence; finally ethics, for understanding related issues about the nature of the mind and the creation of emotional machines.
Different emotion classes from different authors

Author | # Emotions | Basic emotions
Ekman | 6 | Anger, disgust, fear, joy, sadness, surprise
Parrot | 6 | Anger, fear, joy, love, sadness, surprise
Frijda | 6 | Desire, happiness, interest, surprise, wonder, sorrow
Plutchik | 8 | Acceptance, anger, anticipation, disgust, joy, fear, sadness, surprise
Tomkins | 9 | Desire, happiness, interest, surprise, wonder, sorrow
Matsumoto | 22 | Joy, anticipation, anger, disgust, sadness, surprise, fear, acceptance, shy, pride, appreciate, calmness, admire, contempt, love, happiness, exciting, regret, ease, discomfort, respect, like
This is how we did the scoring work:
First we get our database of 50,000 affect -> sentic vector mappings, so that an affect concept like alittlespecific has a pleasantness value of 0.089, a sensitivity of 0.132, an attention of -0.1, and an aptitude of 0.119.
Below is a subset of our pattern rules. For example, given the affect concept alittlehungry, the pattern matcher searches a tweet for every combination of that concept with all the other concepts in the concept database. It matches as much as is necessary but no more, then takes the Mean of all the matches to get a single vector for the tweet. This is the most compute-intensive part of the entire process.
Below is the function that processes the tweets, which are arbitrary statements of up to 140 characters.
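The following is a minimal Python sketch of the scoring idea described above, not the project's own function. It assumes that concept keys are words run together, as in alittlehungry, and that a greedy longest-match-first scan is an acceptable stand-in for the pattern rules. Only the alittlespecific vector comes from the text. The other vectors, and the names CONCEPTS, MAX_WORDS and score_tweet, are hypothetical.

```python
import re
from statistics import fmean

# Vector order: (pleasantness, sensitivity, attention, aptitude).
# Toy concept table: only "alittlespecific" is taken from the text.
CONCEPTS = {
    "alittlespecific": (0.089, 0.132, -0.1, 0.119),
    "alittlehungry": (-0.2, 0.1, 0.0, -0.1),  # hypothetical values
    "goodnews": (0.8, 0.0, 0.4, 0.6),         # hypothetical values
}
MAX_WORDS = 3  # longest concept, in words, in this toy table


def score_tweet(text, concepts=CONCEPTS, max_words=MAX_WORDS):
    """Return the mean sentic vector of all matched concepts, or None if none match."""
    tokens = re.findall(r"[a-z']+", text.lower())
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so multi-word concepts win.
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            key = "".join(tokens[i:i + n])  # ASSUMPTION: keys are words run together
            if key in concepts:
                matches.append(concepts[key])
                i += n  # skip past the matched words
                break
        else:
            i += 1  # no concept starts at this token
    if not matches:
        return None
    # Component-wise mean over every matched vector.
    return tuple(fmean(component) for component in zip(*matches))
```

Calling score_tweet("I am a little hungry but good news arrived") matches alittlehungry and goodnews and returns the component-wise mean of their two vectors. A tweet with no known concept returns None, so it can be left out of the training data.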
Making it possible to compute affective information from text and multimedia is a task that cannot be ignored. The future of AI and humanity hangs on our ability to understand emotion and compute it accurately, so that we avoid chaotic consequences such as AI that does not understand human feelings yet has to interact with us daily. We will have to be able to tell AI how to feel, and also give it the ability to understand how we feel from our communication with it. The field is currently very hazy, with idiosyncratic definitions and methodologies from the many researchers interested in it, which makes it difficult to standardize and compute.
The goal of this project is to start a computational-experimental approach: we take a dataset from published research and try to reproduce its conclusions. If success exceeds some threshold, we continue digging into the paradigm; if results are not encouraging, we switch approach. If we eventually run out of established paradigms, we may have to invent novel techniques to deal with the problem, using every tool from machine learning to big data analysis.
In the immediate future we would like to train the classifiers on all the tweets, since we used only a small fraction due to time and compute constraints. Furthermore, we will analyse the news articles and blog posts. This is as far as we plan to go with the Hourglass model; if results are not satisfactory, we will switch paradigms. Eventually the goal is to build machines capable of understanding human emotions in any form of human expression, and also emotional machines able to generate emotions that human beings can recognize and interact with.
Background Info Links/References
SenticNet: http://sentic.net
E. Cambria, A. Hussain. Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis. Cham, Switzerland: Springer, 2015. ISBN: 978-3-319-23654-4