How sentiment analysis works
What is sentiment analysis, and why is it important?
Sentiment analysis is a means of determining the attitude of the author of (primarily) written text: whether it leans positive, negative or neutral. This is something that most humans do instinctively, but what if the data sample is in the millions, encompassing multiple languages and formats?
Automated sentiment analysis can be used to process vast quantities of data, and report on who feels what about whom. With sophisticated sentiment analysis systems, nuances such as personal bias can be identified, and strength of feeling quantified. Any text data can be scanned, from social media posts to news articles, online reviews and survey results, and a numerical score given.
In order to do this, sentiment analysis must be able to comprehend the meaning of language, identify and link subjects, and correctly interpret irregular ways of communicating – such as slang, jargon or sarcasm.
A potted history of sentiment analysis
Sentiment analysis programmes emerged from the need to automate the time-consuming work of scanning press clippings, reviews, market research, customer service stats, survey responses and customer feedback to establish how the authors felt about an organisation, business, product, brand, service or individual. They also became part of social media monitoring, where they are used to track public opinion.
In its earliest iteration, sentiment analysis followed a simple rules-based approach. Text documents were broken down into constituent parts, those that expressed an opinion or position on the subject were corralled, and a sentiment score was given to each. A blunt tool, it provided a negative score for bad commentary and a positive score for anything represented in a good light.
In order to step beyond bald positive or negative ratings, sentiment libraries were established – extensive compilations of adjectives given manual scores by human coders which allowed the programme to judge the relative difference between them: the difference between “poor” and “appalling”, for example. Using these libraries combined with proximity rules, plus a hit count, the programme could more accurately score the sentiment being expressed about a specific subject.
An extra layer of accuracy was added by factoring in negators and intensifiers. Combinations of words can express greater strength of feeling – such as adding “very” or “extremely” to an adjective – and the system needed to take this into account. Likewise, the use of a negator – “not” or “wasn’t” – reverses the sentiment, so it also needs to be factored in.
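The rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than any particular product's method: the word lists, the scores and the multipliers are all invented for the example, and a modifier is only applied when it sits directly in front of a scored word.

```python
import re

# Hypothetical hand-scored sentiment library (values are illustrative only).
LEXICON = {"good": 1.0, "excellent": 2.0, "poor": -1.0, "appalling": -2.0}
# Hypothetical intensifier multipliers.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
# Hypothetical negator list.
NEGATORS = {"not", "wasn't", "isn't", "never"}


def score_text(text: str) -> float:
    """Sum lexicon scores, applying modifiers that directly precede a scored word."""
    # Normalise curly apostrophes so "wasn’t" matches the negator list.
    tokens = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        value = LEXICON[token]
        j = i - 1
        # Walk backwards over the run of modifiers immediately before the word.
        while j >= 0 and (tokens[j] in INTENSIFIERS or tokens[j] in NEGATORS):
            if tokens[j] in INTENSIFIERS:
                value *= INTENSIFIERS[tokens[j]]  # strengthen the feeling
            else:
                value = -value  # a negator reverses the sentiment
            j -= 1
        total += value
    return total


print(score_text("The service was very good"))    # 1.5
print(score_text("The food was not good"))        # -1.0
print(score_text("The room wasn't very good"))    # -1.5
```

Even this small sketch shows why such systems need constant maintenance: every new word, modifier or turn of phrase has to be added by hand.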
How sentiment analysis works today
The system of text analytics described above provides a broad sense of the sentiment prevalent in a document, but lacks nuance and requires ongoing human intervention to recode language rules and update sentiment libraries. A rules-based system works for predictable, repetitive text-based articles, but stalls when faced with the variety and illogic of human communication, combined with the ever-evolving structure of the written word. New rules constantly have to be written. To be effective, sentiment analysis needs to be more sophisticated than these early beginnings.
Multi-layered sentiment analysis assigns sentiment not just to documents or phrases, but also to individual entities, themes, categories, and topics within them. This allows for a restaurant review, for example, to be judged neutral overall, while specific subjects within it (the service, the food, the ambiance) are scored positively or negatively, thus allowing the restaurateur to know the areas where they can improve.
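The restaurant example can be sketched as follows. The code is a simplified illustration under stated assumptions: the lexicon, the aspect keyword lists and the rule that a sentence's score belongs to any aspect it mentions are all hypothetical choices made for the example.

```python
import re
from collections import defaultdict

# Hypothetical sentiment library and aspect keywords (illustrative only).
LEXICON = {"friendly": 1.0, "delicious": 2.0, "slow": -1.0,
           "noisy": -1.0, "cold": -1.0, "bland": -2.0}
ASPECTS = {
    "service": {"service", "staff", "waiter"},
    "food": {"food", "meal", "dish"},
    "ambiance": {"ambiance", "atmosphere", "room"},
}


def tokenise(text: str) -> list:
    return re.findall(r"[a-z']+", text.lower())


def document_score(review: str) -> float:
    """Overall score: the sum of every scored word in the review."""
    return sum(LEXICON.get(t, 0.0) for t in tokenise(review))


def aspect_scores(review: str) -> dict:
    """Assign each sentence's score to the aspects that sentence mentions."""
    scores = defaultdict(float)
    for sentence in re.split(r"[.!?]+", review):
        tokens = tokenise(sentence)
        sentence_score = sum(LEXICON.get(t, 0.0) for t in tokens)
        for aspect, keywords in ASPECTS.items():
            if keywords & set(tokens):
                scores[aspect] += sentence_score
    return dict(scores)


review = "The food was delicious. The service was slow. The room was noisy."
print(document_score(review))  # 0.0, neutral overall
print(aspect_scores(review))   # food is positive; service and ambiance are negative
```

The overall score hides the detail that the per-aspect scores reveal, which is the point of scoring entities and themes separately.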
Such a system requires machine learning to function – the ability of a computer system to build an internal library of meanings and related phrases through repetition and experience. Many sentiment analysis solutions combine machine learning with a rules-based programme to account for the potential deficiencies in both. The former handles natural language processing, and the latter gives it a man-made foundation on which to build its conclusions. They work together by identifying patterns in the text which represent sentiment-bearing phrases, then applying sentiment analysis algorithms, such as factoring in intensifiers, and then employing machine learning to factor in syntax. Where the text is marked up in HTML, document structure can also be considered, with headlines and first sentences given greater importance. This approach identifies weighted sentiment phrases and gives a sentiment score for each entity.
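As a rough illustration of how document structure might feed into the score, the sketch below uses Python's standard html.parser module to give words inside headings more weight than words in body text. The lexicon and the weights are assumptions made for the example, and the machine learning stage is left out entirely.

```python
import re
from html.parser import HTMLParser

# Hypothetical sentiment library and structural weights (illustrative only).
LEXICON = {"great": 1.0, "poor": -1.0, "appalling": -2.0}
TAG_WEIGHTS = {"h1": 2.0, "h2": 1.5}  # text in any other tag has weight 1.0


class WeightedSentimentParser(HTMLParser):
    """Accumulates a sentiment score, weighting words by their enclosing tags."""

    def __init__(self):
        super().__init__()
        self._open_tags = []
        self.score = 0.0

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to (and including) the matching opening tag, if there is one.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        # Use the largest weight among the tags currently open.
        weight = max((TAG_WEIGHTS.get(t, 1.0) for t in self._open_tags), default=1.0)
        for token in re.findall(r"[a-z']+", data.lower()):
            self.score += weight * LEXICON.get(token, 0.0)


def weighted_score(html: str) -> float:
    parser = WeightedSentimentParser()
    parser.feed(html)
    parser.close()
    return parser.score


page = "<h1>Appalling service</h1><p>The food was great.</p>"
print(weighted_score(page))  # -3.0: the headline counts double
```

Without the weighting, the same page would score -1.0; the headline pulls the result further negative because it is treated as more important.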
The cutting edge
The evolution of data science and machine learning techniques has engendered different approaches to sentiment analysis. Advances in deep learning have enabled greater precision in measurement. A machine learning model can be trained to identify unknown nouns, for example, by processing large volumes of data with tagged examples, and employing neural networks to establish what nouns ‘look’ like. That same model can be used to identify other parts of speech.
With sufficient examples, the model can even learn to differentiate between double meanings for the same word or phrase, based on context. Sentiment analysis systems are able to ‘remember’ where they have seen a word before. Recurrent neural networks can loop data to form this ‘memory’. In language analytics, Long Short-Term Memory (LSTM) is a recurrent network that retains a concept of prior relationships in data. This level of sophistication in sentiment analysis is producing similar results to those delivered by human test groups scoring the same data.
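To make the idea of ‘memory’ concrete, here is a toy, single-unit LSTM cell written in plain Python. It follows the standard LSTM equations (forget, input and output gates plus a cell state), but everything else is an assumption for illustration: the weights are hand-picked rather than trained, and the inputs are single numbers rather than word representations.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical, untrained weights chosen so that the cell tends to keep its state.
WEIGHTS = {
    "wf": 0.0, "uf": 0.0, "bf": 2.0,  # forget gate (high bias: mostly remember)
    "wi": 1.0, "ui": 0.0, "bi": 0.0,  # input gate
    "wo": 0.0, "uo": 0.0, "bo": 1.0,  # output gate
    "wg": 2.0, "ug": 0.0, "bg": 0.0,  # candidate value
}


def lstm_step(x, h_prev, c_prev, w=WEIGHTS):
    """One time step of a single-unit LSTM cell with a scalar input."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # how much old state to keep
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # how much new input to admit
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # how much state to expose
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate new content
    c = f * c_prev + i * g   # cell state: the 'memory' carried between steps
    h = o * math.tanh(c)     # hidden state: the output at this step
    return h, c


def run_sequence(inputs, w=WEIGHTS):
    """Feed a sequence through the cell and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, w)
    return h


# The two sequences differ only in their first element, yet the final outputs
# differ in sign: the cell has carried the early signal forward.
print(run_sequence([1.0, 0.0, 0.0]))
print(run_sequence([-1.0, 0.0, 0.0]))
```

In a real system the weights would be learned from tagged examples and the cell would have many units, but the mechanism by which earlier words influence later ones is the same.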
The future of sentiment analysis
The future will see the honing of sentiment analysis tools to give insight into how sentiment changes, and why it does so. This will include the ability to differentiate between the opinions and feelings of individual stakeholder groups towards a specific subject, and to outline multi-stakeholder perspectives. Rather than combined sentiment towards a company, it will be able to identify the feelings aroused by specific news stories or topics in connection to that company: so negative scoring in social media posts isn’t cancelled out by rising stock prices on the back of unpopular, but financially beneficial, behaviour, for example.
A new level of nuance will be achieved using artificial intelligence methods: weighting applied to individual authors who carry greater influence on particular topics; analysis that recognises the emotional state of the author as well as their literal meaning; and sentiment analysis of videos and images, social media memes and stories. To do this, the scoring system will likely have to evolve from the current one-dimensional positive-to-negative scale to a multi-dimensional scale capable of distilling the multitude of sentiments that can be expressed through the written word.
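One way to picture author weighting on a multi-dimensional scale is an influence-weighted average, sketched below. The dimension names, the influence weights and the choice of a simple weighted mean are all assumptions for illustration; the article does not prescribe a particular formula.

```python
def weighted_profile(posts):
    """Combine per-post sentiment vectors into one influence-weighted profile.

    posts: list of (influence_weight, {dimension: score}) pairs.
    """
    totals = {}
    weight_sum = 0.0
    for weight, scores in posts:
        weight_sum += weight
        for dimension, value in scores.items():
            totals[dimension] = totals.get(dimension, 0.0) + weight * value
    if weight_sum == 0:
        return {}
    return {dimension: total / weight_sum for dimension, total in totals.items()}


# Hypothetical data: one influential author and one ordinary author.
posts = [
    (3.0, {"anger": 1.0, "trust": 0.0}),
    (1.0, {"anger": 0.0, "trust": 1.0}),
]
print(weighted_profile(posts))  # {'anger': 0.75, 'trust': 0.25}
```

An unweighted average would give 0.5 on both dimensions; weighting by influence shifts the profile towards the author with the larger audience.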
Why sentiment analysis is important
Sentiment analysis is a central element of reputation intelligence, the process by which all mentions of an organisation or brand across every media channel can be captured, processed and analysed. It reveals who is saying what, and where they are saying it. This gives vital insight into that organisation’s reputational standing among its many stakeholders.
As part of this, sentiment analysis provides a more detailed understanding of what a diverse base of stakeholders feels about a particular organisation, brand, product, service or event. It is employed by market and business analysts to understand investor enthusiasm or reticence; by marketing and customer support to track how consumers are responding to advertising and product quality; by human resources to identify workforce satisfaction – or otherwise; and by public relations professionals to gain oversight of how their clients are being reported in the media.