Sentiment analysis methods, now and in the future
Sentiment analysis measures positive and negative tone, also called polarity or affect, primarily in written text. Applications started with brand and product marketing and are expanding across customer service, finance, health care, law, and even politics. Recent advances in deep learning enable new levels of precision in measuring sentiment. However, business use of sentiment is still at an early and expanding stage.
How to measure sentiment?
Measuring what people think of companies, brands, or other entities is complex because people and language are complex. Quantifying sentiment requires solving three big challenges: identifying what language means, interpreting exceptional ways of communicating, and correctly identifying and linking subjects and topics.
First, most studies and practical experience show that a group of human raters reaches only roughly 65-80% agreement on sentiment, even with carefully controlled methodologies. As a result, models for understanding natural language cannot be trained against a definitive ground truth.
Second, content includes sarcasm, hyperbole, slang, industry jargon, and sentences with complex modifiers of negatives, e.g. “I wouldn’t quite say BigCo is great”. For corporate reputation, sectors such as defense, life science, and technology bring their own complexities.
A third challenge lies in correctly identifying entities and then associating them with relevant topics and sentiment. For example, is F. Zakaria the same as Fareed Zakaria? When does “the Times” refer to the Financial Times and when to the New York Times? Is SEAT an automotive company or a furniture maker? A long article or research report analyzing a sector may refer to many companies. alva built, and continues to evolve, a complex series of rules paired with over a million database entries to correctly identify authors and sources.
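Conceptually, this kind of resolution can be sketched as a lookup over an alias table with context-based disambiguation. The alias entries and context cues below are invented for illustration and are not alva’s actual rules or data:

```python
# Minimal sketch of alias-based entity resolution (hypothetical data).
ALIASES = {
    "f. zakaria": "Fareed Zakaria",
    "fareed zakaria": "Fareed Zakaria",
    "financial times": "Financial Times",
    "new york times": "New York Times",
}

def resolve(mention, context=""):
    """Map a surface mention to a canonical entity, using context when ambiguous."""
    key = mention.lower().strip()
    if key == "the times":
        ctx = context.lower()
        if "new york" in ctx:
            return "New York Times"
        if "financial" in ctx:
            return "Financial Times"
        return None  # unresolved without more context
    return ALIASES.get(key)

print(resolve("F. Zakaria"))  # Fareed Zakaria
```

Production systems replace the hand-written cues with statistical disambiguation over far larger alias databases, but the lookup-plus-context pattern is the same.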
Which sentiment analysis methods can you use?
Solutions for these challenges have progressed from early knowledge- and rule-based approaches to statistical approaches and now to deep learning. Sentiment systems started with lexicons or dictionaries that essentially count good and bad words, along with modifiers. For example, the sentence “Despite early struggles, Beyond Meat projects very promising earnings growth” could be scored -2 for “struggles”, +1 for “promising” multiplied by 3 due to the modifier “very”, and another +1 for “growth”, for a total of +2. This approach does not link the entity, Beyond Meat, to the topic of earnings growth. Many sentiment offerings and alternative data feeds remain at this basic level.
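A lexicon-based scorer of this kind fits in a few lines of Python. The lexicon and modifier weights below are illustrative, not a production dictionary:

```python
# Toy lexicon scorer: sum word polarities, multiplying by a preceding modifier.
LEXICON = {"struggles": -2, "promising": 1, "growth": 1}
MODIFIERS = {"very": 3}

def lexicon_score(text):
    tokens = text.lower().replace(",", "").split()
    score = 0
    for i, tok in enumerate(tokens):
        if tok in LEXICON:
            weight = MODIFIERS.get(tokens[i - 1], 1) if i > 0 else 1
            score += LEXICON[tok] * weight
    return score

sentence = "Despite early struggles, Beyond Meat projects very promising earnings growth"
print(lexicon_score(sentence))  # -2 + (1 * 3) + 1 = 2
```

Note that nothing in this scorer connects the score to Beyond Meat or to the topic of earnings growth, which is exactly the limitation described above.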
The advent of HTML for web pages encouraged using the structure of documents. Terms and topics in headlines, initial paragraphs, and the first sentence of paragraphs are often weighted more heavily than text later in documents. This approach not only improves accuracy and efficiency but also reflects studies showing that many audiences do not read entire articles.
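A position-weighting scheme along these lines might look as follows; the specific weights (3x for headline terms, 2x for the first sentence of a paragraph) are assumptions for illustration only:

```python
# Sketch of position-based term weighting: headline and leading-sentence
# terms count more than terms buried later in a document.
def weighted_terms(headline, paragraphs):
    weights = {}
    for term in headline.lower().split():
        weights[term] = weights.get(term, 0.0) + 3.0      # headline weight
    for para in paragraphs:
        for si, sentence in enumerate(para.split(". ")):
            w = 2.0 if si == 0 else 1.0                   # lead sentence weight
            for term in sentence.lower().split():
                weights[term] = weights.get(term, 0.0) + w
    return weights

w = weighted_terms("BigCo earnings beat", ["BigCo earnings rose. Analysts cheered."])
```

Here “earnings” accumulates weight from both the headline and the paragraph’s first sentence, so it outranks terms appearing only in later sentences.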
Next, practitioners adopted statistical techniques, often starting with “bag of words”. This approach creates a grid of how often each word occurs within a document. It does not account for document structure or semantics. The grid provides a foundation for other analytics such as term frequency, or how often a word appears in a document. Because common words carry little information, an algorithm can weight infrequently used words more heavily, as they are more likely to characterize a document. In practice, advanced natural language processing systems have progressed to creating “word embedding” maps that link related terms. These maps are then used as inputs to deep learning models.
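The bag-of-words grid and the down-weighting of common words can be sketched together as a simplified tf-idf calculation; the sample documents are invented:

```python
# Bag of words with term frequency, scaled so that words appearing in many
# documents score low and rarer, more descriptive words score high (tf-idf).
import math
from collections import Counter

docs = [
    "earnings growth beat expectations",
    "earnings missed expectations badly",
    "new product launch delights customers",
]

def tf_idf(docs):
    tokenized = [doc.split() for doc in docs]
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)  # the "grid" row for this document
        scores.append({t: tf[t] * math.log(len(docs) / doc_freq[t]) for t in tf})
    return scores

scores = tf_idf(docs)
```

In the first document, “growth” (appearing in one document) scores higher than “earnings” (appearing in two), matching the intuition that rarer terms better describe a document.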
Sentiment analysis with deep learning and machine learning
Statistical approaches evolved into machine learning models which can be continually trained and improved. While much focus has moved to deep learning using neural networks, machine learning approaches remain valid particularly for smaller training data sets or where computational efficiency is key.
Machine learning algorithms
Machine learning algorithms for sentiment generally center on support vector machines (SVMs) and random forests. SVMs find boundaries that group and separate data points. Consider a two-dimensional grid with points showing the locations of cats and dogs. The model will find a line (vector) that best groups and separates the locations of either animal. In practice, SVMs find groupings and relationships across many variables in what’s termed an n-dimensional space, which is challenging to visualize. For example, a model could be trained using words and phrases from corporate earnings releases and then used to categorize new earnings releases into sentiment scores.
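A full SVM is too long to sketch here, but the core idea of learning a linear boundary that separates two classes can be shown with a perceptron, a much simpler linear classifier than an SVM. The vocabulary and training phrases below are invented:

```python
# Perceptron sketch (a simpler linear classifier than an SVM): learn a
# separating line between positive and negative phrases in word-count space.
VOCAB = ["strong", "growth", "beat", "weak", "miss", "decline"]

def featurize(text):
    toks = text.lower().split()
    return [toks.count(w) for w in VOCAB]

def train(samples, epochs=20, lr=1.0):
    w, b = [0.0] * len(VOCAB), 0.0
    for _ in range(epochs):
        for text, label in samples:          # label: +1 positive, -1 negative
            x = featurize(text)
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
            if pred != label:                # update weights only on mistakes
                w = [wi + lr * label * xi for wi, xi in zip(w, x)]
                b += lr * label
    return w, b

samples = [
    ("strong growth beat", 1),
    ("weak miss decline", -1),
    ("growth beat", 1),
    ("miss decline", -1),
]
w, b = train(samples)

def predict(text):
    x = featurize(text)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

An SVM differs by choosing, among all separating boundaries, the one with the widest margin to the nearest points, and by supporting kernels for non-linear boundaries.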
The random forest technique creates decision trees that step through aspects or variables of a problem. If a decision tree is categorizing animals into cats and dogs, a tree could walk through a series of questions such as “Does the animal have a tail?”, “Does the animal weigh more than 50 lbs.?”, “Does it bark?”. The algorithm will recognize that some questions have no predictive value (presence of a tail), some have partial value (weight), and some have strong value (barking). The random forest algorithm generates many trees from random samples of the data and variables, then combines their votes into a final prediction, which dilutes the influence of uninformative questions.
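The voting step can be illustrated with a few hand-written “trees” over the cat-and-dog example. A real random forest learns its trees automatically from random samples of data and features; these rules are written by hand purely for illustration:

```python
# Ensemble voting as in a random forest: several shallow decision "trees"
# vote, and the majority wins.
def tree_tail(animal):    # no predictive value: cats and dogs both have tails
    return "dog" if animal["has_tail"] else "cat"

def tree_weight(animal):  # partial value: heavy animals are usually dogs
    return "dog" if animal["weight_lbs"] > 50 else "cat"

def tree_bark(animal):    # strong value: barking is a reliable dog signal
    return "dog" if animal["barks"] else "cat"

def forest_predict(animal, trees=(tree_tail, tree_weight, tree_bark)):
    votes = [t(animal) for t in trees]
    return max(set(votes), key=votes.count)  # majority vote

pet = {"has_tail": True, "weight_lbs": 12, "barks": True}
print(forest_predict(pet))  # "dog": the weight tree is outvoted
```

Even though the tail and weight trees mislead on this example, the majority vote across trees still reaches the right answer, which is the point of the ensemble.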
Deep learning represents the current frontier for sentiment. Deep learning uses neural networks intended to partially replicate how nervous systems function. Spreadsheets offer an analogy. An input, such as an article about a company, is first converted into standardized terms, or tokens, that normalize for plurals, verb tenses, or conjoined words in languages like German that tend to agglutinate, or combine, words. The tokens are fed into the top of the spreadsheet as a set of values. Each layer in the spreadsheet contains weighting values that are multiplied against inputs from the prior layer. At the bottom, a set of values comes out that is used to categorize the input into sentiment scores.
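The “spreadsheet” analogy can be written directly as code: values flow through layers, each multiplying its inputs by weights and applying a squashing function. The token values and weights below are arbitrary numbers for illustration; real networks learn their weights from training data:

```python
# Forward pass of a tiny feedforward network, mirroring the spreadsheet analogy.
import math

def layer(inputs, weights):
    # Each output unit: weighted sum of the prior layer, squashed through tanh.
    return [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in weights]

tokens = [1.0, 0.5, -0.3]                  # e.g. normalized token values
hidden_w = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]]
output_w = [[0.7, -0.6]]

hidden = layer(tokens, hidden_w)           # middle rows of the "spreadsheet"
score = layer(hidden, output_w)[0]         # bottom value, mapped to a sentiment
print(score)
```

Training consists of nudging the weights so that the bottom value agrees with human-scored examples; only the forward computation is shown here.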
In practice, neural networks take many forms in how cells pass information to each other. Research first focused on convolutional neural networks, used in applications like image recognition where data is presented as a single whole. Convolutional networks break the data into many parts and then build them up to recognize features. For example, a picture may be broken into pixels, then edges are identified, eyes and noses are recognized, and finally a face is categorized as feline or canine. In effect, convolutional networks live in the moment with no awareness of prior data or lessons.
Long short term memory
However, words tend to hold meaning based on their context. Sentiment applications improved with recurrent neural networks, which consume data as a sequence or series. Recurrent networks loop data repeatedly through the same cells to create a form of memory. A form of recurrent network called Long Short-Term Memory, or LSTM, applies particularly well to language analytics. LSTMs include cells that retain a memory of prior relationships in the data and, similarly to neurons, require a certain threshold value to activate. Alva applies LSTM neural nets in sentiment calculations.
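One step of an LSTM cell can be written out to show the gating that lets the cell retain or forget memory across a sequence. Real LSTMs use learned weight matrices over vectors; the scalar weights below are arbitrary values chosen only for readability:

```python
# Single LSTM cell step in plain Python, scalar weights for illustration.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    f = sigmoid(w["f"] * x + w["fh"] * h_prev)    # forget gate: keep old memory?
    i = sigmoid(w["i"] * x + w["ih"] * h_prev)    # input gate: admit new info?
    g = math.tanh(w["g"] * x + w["gh"] * h_prev)  # candidate memory content
    o = sigmoid(w["o"] * x + w["oh"] * h_prev)    # output gate
    c = f * c_prev + i * g                        # updated cell memory
    h = o * math.tanh(c)                          # emitted hidden state
    return h, c

w = {"f": 1.0, "fh": 0.5, "i": 1.0, "ih": 0.5,
     "g": 1.0, "gh": 0.5, "o": 1.0, "oh": 0.5}
h, c = 0.0, 0.0
for x in [0.5, -0.2, 0.8]:    # a toy token sequence fed one value at a time
    h, c = lstm_step(x, h, c, w)
```

The cell state `c` carries information forward across steps, while the sigmoid gates, like neuron thresholds, decide how much of it to keep, update, and expose at each step.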
At this point, deep learning reproduces a reasonable proportion of sentiment scored by people. In practice, smaller sentences or samples may produce less predictable results, but larger data sets tend to show sentiment calculations well aligned with training and test data sets. While we can expect continued improvements in sentiment measurement, future progress will focus on practically applying sentiment and achieving greater nuance in understanding how different groups of people view a subject.
Sentiment calculations are likely to evolve with:
- Multi-stakeholder perspectives reflecting, for example, views held by employees compared to investors. Alva currently provides multi-stakeholder views. However, relationships between stakeholder perspectives and markets are not well understood. For example, when large American automotive companies announce layoffs, press coverage and social media are typically quite negative even as the stock price increases.
- Models recognizing that topics carry the sentiment. While a company’s reputation and brand may be slower to change, shorter-term sentiment is driven by specific events or topics. While many current solutions simply sum up total sentiment toward a company at a point in time, the reality is that sentiment changes based on events and actions, and different stakeholders or demographic groups may hold differing views on these topics.
- Greater focus on authors and sources (publications) as a source of signals. A tweet by Warren Buffett clearly affects sentiment toward a company more than one by this author. In practice, we can identify clusters of authors that tend to most impact sentiment. For example, a leading crypto research boutique identified that software developers employed by crypto exchanges may have few followers but are strong predictors of attitudes toward crypto coins and tokens.
- Identifying specific emotional states rather than simply good or bad affect. While IBM Watson and other systems claim to achieve this now, the industry has generally not yet begun to tailor communications based on classifications such as frustrated, sad, or impolite.
- Image and video sentiment measuring affect and impact of pictures, social media memes, or services like Instagram Stories videos.
Overall, sentiment applications currently quantify polarity with sufficient accuracy to plan communications actions and investments. We can expect continued advances in deep learning to further improve the accuracy of automated models compared to human-scored data. More importantly, analytics will evolve to improve understanding of why and how sentiment changes.