With the recent development of deep learning, research in artificial intelligence (AI) has gained new vigor and prominence. Machine learning, however, suffers from three big issues, namely:
1. Dependency issue: it requires (a lot of) training data and it is domain-dependent.
2. Consistency issue: different training and/or tweaking lead to different results.
3. Transparency issue: the reasoning process is unintelligible (black-box algorithms).
At SenticNet, we address these issues in the context of natural language processing (NLP) by coupling machine learning with linguistics and commonsense reasoning. In particular, we apply an ensemble of commonsense-driven linguistic patterns and statistical NLP: the former are triggered when prior knowledge is available, the latter is used as a backup plan when both semantics and sentence structure are unknown. Machine learning, in fact, is only useful for making a "good guess" because it only encodes correlation and its decision-making process is merely probabilistic. To use Noam Chomsky's words, "you do not get discoveries in the sciences by taking huge amounts of data, throwing them into a computer and doing statistical analysis of them: that's not the way you understand things, you have to have theoretical insights".
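This patterns-first, statistics-as-fallback design can be sketched as follows. The pattern table, word lists, and function names below are hypothetical placeholders for illustration, not the actual SenticNet implementation:

```python
# Minimal sketch of an ensemble that prefers knowledge-driven patterns
# and falls back to a statistical "good guess". All data here is toy data.

def pattern_polarity(sentence, knowledge_base):
    """Return a polarity if a known commonsense concept fires, else None."""
    for concept, polarity in knowledge_base.items():
        if concept.replace("_", " ") in sentence.lower():
            return polarity
    return None

def statistical_polarity(sentence):
    """Backup plan: a crude word-count model standing in for an ML classifier."""
    positive = {"nice", "great", "good"}
    negative = {"bad", "expensive", "awful"}
    tokens = sentence.lower().split()
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    return "positive" if score >= 0 else "negative"

def classify(sentence, knowledge_base):
    # Patterns are triggered when prior knowledge is available;
    # otherwise, fall back to the statistical guess.
    result = pattern_polarity(sentence, knowledge_base)
    return result if result is not None else statistical_polarity(sentence)

kb = {"small_room": "negative"}  # hypothetical commonsense entry
print(classify("The hotel had a small room", kb))  # pattern fires
print(classify("The view was nice", kb))           # falls back to statistics
```

The design choice is that the statistical model never overrides prior knowledge: it is consulted only when no pattern matches.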
Our multi-disciplinary approach to natural language understanding, termed sentic computing, aims to bridge the gap between statistical NLP and the many other disciplines that are necessary for understanding human language, such as linguistics, commonsense reasoning, affective computing, and more. Sentic computing, whose name derives from the Latin 'sensus' (as in commonsense) and 'sentire' (root of words such as sentiment and sentience), enables the analysis of text not only at document, page, or paragraph level, but also at sentence, clause, and concept level. In particular, sentic computing's novelty gravitates around three key shifts:
1. Shift from mono- to multi-disciplinarity – evidenced by the concomitant use of AI and Semantic Web techniques, for knowledge representation and inference; mathematics, for carrying out tasks such as graph mining and multi-dimensional scaling; linguistics, for discourse analysis and pragmatics; psychology, for cognitive and affective modeling; sociology, for understanding social network dynamics and social influence; and, finally, ethics, for understanding related issues about the nature of mind and the creation of emotional machines.
2. Shift from syntax to semantics – enabled by the adoption of the bag-of-concepts model instead of simply counting word co-occurrence frequencies in text. Working at concept level entails preserving the meaning carried by multi-word expressions such as cloud_computing, which represent ‘semantic atoms’ that should never be broken down into single words. In the bag-of-words model, for example, the concept cloud_computing would be split into cloud and computing, which may wrongly activate concepts related to the weather and, hence, compromise categorization accuracy.
3. Shift from statistics to linguistics – implemented by allowing sentiments to flow from concept to concept based on the dependency relations between clauses. The sentence “iPhone7 is expensive but nice”, for example, is equivalent to “iPhone7 is nice but expensive” from a bag-of-words perspective. However, the two sentences bear opposite polarities: the former is positive, as the user seems willing to make the effort to buy the product despite its high price; the latter is negative, as the user complains about the price of iPhone7 although he/she likes it.
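The bag-of-concepts idea behind the shift from syntax to semantics can be illustrated with a toy tokenizer. The multiword lexicon and function names below are illustrative assumptions, not SenticNet's actual concept parser:

```python
# Sketch of bag-of-words vs. bag-of-concepts on the cloud_computing example.
# MULTIWORD is a toy lexicon of 'semantic atoms' to keep intact.

MULTIWORD = {("cloud", "computing"): "cloud_computing"}

def bag_of_words(text):
    # Naive model: every whitespace-separated token is a feature.
    return text.lower().split()

def bag_of_concepts(text):
    # Greedily merge known multi-word expressions into single concepts.
    tokens = text.lower().split()
    concepts, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in MULTIWORD:            # preserve the 'semantic atom'
            concepts.append(MULTIWORD[pair])
            i += 2
        else:
            concepts.append(tokens[i])
            i += 1
    return concepts

print(bag_of_words("cloud computing is growing"))
# 'cloud' alone may wrongly activate weather-related concepts
print(bag_of_concepts("cloud computing is growing"))
```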
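The shift from statistics to linguistics can likewise be sketched on the iPhone7 example. The lexicon scores and the rule that the clause after 'but' dominates are toy stand-ins for sentic patterns, not the actual dependency-based implementation:

```python
# Sketch: an adversative conjunction makes the second clause carry
# the speaker's final stance, so word order changes overall polarity.

LEXICON = {"expensive": -0.5, "nice": 0.7}  # hypothetical polarity scores

def clause_polarity(clause):
    return sum(LEXICON.get(w, 0.0) for w in clause.lower().split())

def sentence_polarity(sentence):
    text = sentence.lower()
    if " but " in text:
        # the clause after 'but' dominates the overall sentiment
        _, after = text.split(" but ", 1)
        return clause_polarity(after)
    return clause_polarity(text)

print(sentence_polarity("iPhone7 is expensive but nice"))  # positive
print(sentence_polarity("iPhone7 is nice but expensive"))  # negative
```

A pure bag-of-words model would score both sentences identically, since they contain exactly the same tokens.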
Sentic computing takes a holistic approach to natural language understanding by handling the many sub-problems involved in extracting meaning and polarity from text. While most works approach it as a simple categorization problem, sentiment analysis is actually a 'suitcase' research problem that requires tackling many NLP tasks. As Marvin Minsky would say, the expression 'sentiment analysis' itself is a big suitcase (like many others related to affective computing, e.g., emotion recognition or opinion mining) that all of us use to encapsulate our jumbled ideas about how our minds convey emotions and opinions through natural language. Sentic computing addresses the composite nature of the problem via a three-layer structure that concomitantly handles tasks including:
1. Concept extraction, to deconstruct text into words and multiword expressions.
2. Subjectivity detection, to filter out neutral content.
3. Temporal tagging, for time-expression analysis and recognition.
4. Named-entity recognition, to locate and classify named entities into pre-defined categories.
5. Personality recognition, to distinguish between different personality types of users.
6. Sarcasm detection, to detect and handle sarcasm in opinions.
7. Aspect extraction, to enable aspect-based sentiment analysis.
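A few of these subtasks can be sketched as a minimal pipeline. Each stage below is a crude placeholder for the corresponding NLP task, not the actual three-layer sentic architecture, and all word lists are hypothetical:

```python
# Toy 'suitcase' pipeline: concept extraction, subjectivity detection,
# and polarity detection chained together. All lexicons are illustrative.

def extract_concepts(text):
    # Placeholder for real concept extraction (see bag-of-concepts above).
    return text.lower().split()

def is_subjective(concepts):
    # Placeholder subjectivity detector: does any opinionated word appear?
    return any(c in {"nice", "awful", "expensive"} for c in concepts)

def detect_polarity(concepts):
    scores = {"nice": 1, "awful": -1, "expensive": -1}
    return sum(scores.get(c, 0) for c in concepts)

def analyze(text):
    concepts = extract_concepts(text)
    if not is_subjective(concepts):
        return "neutral"   # subjectivity detection filters out neutral content
    return "positive" if detect_polarity(concepts) > 0 else "negative"

print(analyze("The screen is nice"))
print(analyze("The meeting is at noon"))
```

The point of the sketch is the ordering: filtering out neutral content before polarity detection avoids forcing a positive/negative label onto factual statements.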
The core element of sentic computing is SenticNet, a knowledge base of 50,000 commonsense concepts. Unlike many other sentiment analysis resources, SenticNet is not built by manually labeling pieces of knowledge coming from general NLP resources such as WordNet or DBpedia. Instead, it is automatically constructed by applying graph-mining and multi-dimensional scaling techniques to the affective commonsense knowledge collected from three different sources, namely: WordNet-Affect, Open Mind Common Sense, and GECKA. This knowledge is represented redundantly at three levels (following Minsky's panalogy principle): semantic network, matrix, and vector space. Subsequently, semantics and sentics are calculated through the ensemble application of spreading activation, neural networks, and an emotion categorization model. More details about this process are provided in the latest sentic computing book (chapter 2).
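To give an intuition for the spreading-activation step, here is a sketch over a toy semantic network. The graph, the decay factor, and the function name are illustrative assumptions, not SenticNet's actual data or algorithm:

```python
# Sketch of spreading activation: energy injected at a seed concept
# propagates to neighbors with decay, activating related concepts.

GRAPH = {  # toy semantic network of commonsense concepts
    "birthday_party": ["cake", "celebrate"],
    "cake": ["sweet"],
    "celebrate": ["joy"],
}

def spread(seed, decay=0.5, depth=2):
    activation = {seed: 1.0}
    frontier = [seed]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for neighbor in GRAPH.get(node, []):
                energy = activation[node] * decay
                # keep the strongest activation reaching each concept
                if energy > activation.get(neighbor, 0.0):
                    activation[neighbor] = energy
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return activation

print(spread("birthday_party"))
```

Concepts closer to the seed receive more energy, so related but unstated concepts (e.g. joy) become available for affective inference.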
SenticNet can be used for different sentiment analysis tasks, including polarity detection, which is performed by means of sentic patterns. Such patterns are applied to the dependency syntactic tree of a sentence, as shown in Fig(a) below. The only two words that have intrinsic polarity are shown in yellow; the words that modify the meaning of other words, in a manner similar to contextual valence shifters, are shown in blue. A baseline that completely ignores sentence structure, as well as words that have no intrinsic polarity, is shown in Fig(b): the only two words left are negative and, hence, the total polarity is negative. However, the syntactic tree can be re-interpreted in the form of a ‘circuit’ where the ‘signal’ flows from one element (or subtree) to another, as shown in Fig(c). After removing the words not used for polarity calculation (in white), a circuit with elements resembling electronic amplifiers, logical complements, and resistors is obtained, as shown in Fig(d). More details are provided in the latest sentic computing book (chapter 3).
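The circuit metaphor can be sketched as composable elements: words with intrinsic polarity act as signal sources, intensifiers as amplifiers, and negations as logical complements. The gain value and the composition below are illustrative assumptions, not the actual sentic-pattern circuitry:

```python
# Sketch of the 'circuit' view of a sentence: each element transforms
# the polarity signal flowing through it. All values are toy values.

def source(polarity):
    # A word with intrinsic polarity is a signal source.
    return lambda: polarity

def amplifier(element, gain=2.0):
    # An intensifier such as 'very' boosts the incoming signal,
    # clamped to the [-1, 1] polarity range.
    return lambda: max(-1.0, min(1.0, element() * gain))

def complement(element):
    # A negation such as 'not' flips the incoming signal.
    return lambda: -element()

# "not very good" wired as complement(amplifier(source(good)))
good = source(0.5)
circuit = complement(amplifier(good))
print(circuit())
```

Composing elements this way mirrors how the signal in Fig(c) and Fig(d) flows from one subtree to the next before the final polarity is read off.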