
AI-Powered Sentiment Analysis Transforming Investment Research

Feb 14, 2025 | 15 min read | by Alex Hoffmann

Traditionally, financial analysts have operated in a world dominated by numbers, meticulously examining revenue figures, profit margins, cash flow statements, and balance sheets to uncover investment opportunities. Spreadsheets ruled supreme, with tools evolving from Lotus 1-2-3 to sophisticated Excel models. This quantitative focus served the industry well, but a parallel revolution has been quietly developing in how we analyze the qualitative aspects of companies.

The industry's initial approach to analyzing text was surprisingly mechanical. We built sentiment models by counting the negative words in earnings calls, tabulating the frequency of terms like "challenging" or "confident" in annual reports, and creating rudimentary scoring systems based on word frequency. Eventually, basic natural language processing entered the scene, enabling analysts to scrape Twitter feeds and apply simple statistical sentiment models. Today, we stand at the frontier of a new era where large language models (LLMs) offer unprecedented capabilities to truly understand the nuance, context, and sentiment in financial communications, potentially unlocking significant value for investment professionals willing to embrace this technology.

Check out the sentiment feature of Marvin Labs, or the results from Microsoft (MSFT), NVIDIA (NVDA) and many more companies in the app to see how we apply LLMs to sentiment analysis. Read on to learn how we built our sentiment model, how it works, and how it can help you discover investment opportunities that would have slipped through otherwise.

Key Points

Traditional investment analysis focused primarily on numerical data, but modern approaches now incorporate sophisticated sentiment analysis of all company communications
Early sentiment analysis methods relied on simple word counting and basic categorization, missing context and nuance
Modern LLM-powered sentiment analysis understands semantic meaning, context, and subtle implications in financial communications
Research indicates sentiment analysis can effectively identify investment opportunities that numerical analysis alone might miss
Building a sentiment model requires careful consideration of data sources, model selection, and integration with existing workflows. We show you how we built our sentiment model.
A test portfolio based solely on sentiment signals demonstrated significant returns over a five-year period
While sentiment analysis provides valuable signals, it works best as a complementary tool within a comprehensive research process

The Evolution of Sentiment Analysis in Finance

From Numbers to Words

Financial analysis has historically centered on quantitative metrics. Analysts would parse through income statements, balance sheets, and cash flow statements, building intricate models to project future performance. According to a 2021 CFA Institute survey, over 90% of investment professionals consider financial statement analysis their primary analytical tool.

However, numbers tell only part of the story. Research by Loughran and McDonald demonstrates that the language in corporate disclosures provides valuable information beyond the financial figures. This recognition led to the first generation of sentiment analysis in finance.

First-Generation Approaches

Early sentiment analysis methods in finance were remarkably straightforward. Analysts created dictionaries of positive and negative words, then calculated sentiment scores based on word frequency. The Harvard Psychosociological Dictionary and the Loughran-McDonald financial sentiment word lists became standard tools. While groundbreaking at the time, these approaches failed to capture context, sarcasm, or the complex ways in which language conveys meaning.

By the 2010s, more sophisticated methods emerged. Natural Language Processing (NLP) techniques began analyzing syntactic patterns, and machine learning algorithms started identifying subtle relationships between linguistic features and market reactions. A Stanford University study found that these improved models could predict market reactions to earnings announcements with greater accuracy than traditional methods.

The LLM Revolution

The introduction of transformer-based language models in 2017 and their widespread adoption and quality improvements since 2022 marked a turning point. These models understand language in context, recognizing subtle cues that earlier systems missed.

Today's LLMs don't just count words or apply rigid rules. They understand semantic relationships, detect subtle shifts in tone, and recognize the implications of statements within broader economic and business contexts.

How LLM-Powered Sentiment Analysis Works

Beyond Word Counting

Traditional sentiment analysis essentially counted words from predefined lists. If a CEO used five positive words and three negative words, the statement received a positive score. This approach fails spectacularly with statements like: "We are not seeing the positive results we anticipated, and our outlook lacks the optimism we projected last quarter."

Modern LLMs understand these contextual relationships. They process text through multiple attention layers that weigh the relationships between words and phrases. According to research from MIT Technology Review, this enables them to accurately interpret statements like: "While facing headwinds in our core markets, we've managed to stabilize cash flow and expect gradual improvement throughout next quarter."
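To make the failure mode concrete, here is a minimal sketch of a first-generation word-count scorer applied to the two statements quoted above. The word lists are tiny hypothetical stand-ins for illustration, not the Loughran-McDonald lists or anything we use in production.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
POSITIVE = {"positive", "optimism", "confident", "improvement", "strong"}
NEGATIVE = {"challenging", "headwinds", "decline", "weak", "loss"}


def word_count_score(text: str) -> int:
    """Return (# positive words) - (# negative words), ignoring all context."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives


negated = (
    "We are not seeing the positive results we anticipated, and our outlook "
    "lacks the optimism we projected last quarter."
)
cautious = (
    "While facing headwinds in our core markets, we've managed to stabilize "
    "cash flow and expect gradual improvement throughout next quarter."
)

negated_score = word_count_score(negated)    # 2: "positive" and "optimism" both count
cautious_score = word_count_score(cautious)  # 0: "headwinds" cancels "improvement"
```

The clearly negative statement comes out positive because the counter cannot see "not" or "lacks", while the cautiously constructive statement nets to zero. That gap is exactly what context-aware models are meant to close.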

Capturing Semantic Meaning

LLMs excel at understanding implicit meaning in financial communications. When a CFO states, "We remain committed to our previous guidance," an LLM can assess whether surrounding context suggests confidence or hedging. Research published in the Journal of Financial Economics demonstrates that these models can identify linguistic patterns associated with future earnings surprises, corporate fraud, or management turnover.

The models also recognize industry-specific language and terminology. For example, "challenging environment" carries different implications in banking versus retail, distinctions that LLMs learn through their training on vast financial text corpora.

Learnings from Building a Sentiment Model: Best Practices and Considerations

When we set out to build the Marvin Labs sentiment analysis feature, we had a few guiding principles in mind. We wanted to ensure that our model was not only accurate but actually measured what we set out to measure. Here are some key takeaways from our experience:

Data Considerations: What to Include

The quality of input data significantly impacts sentiment analysis results. We only include primary sources of information to ensure that we are capturing the views of management and the company itself.

Further, we only include sources where management has regulatory requirements to offer a fair and balanced assessment - though there are some notable executives that stretch this.

High-quality sources that we include are:

Annual and interim reports such as 10-Ks and 10-Qs or their equivalents in other jurisdictions
Prepared remarks and Q&A sessions from earnings calls
Press releases and regulatory filings rendered pursuant to exchange rules or regulations
Select high-quality conferences such as the JPM Healthcare Conference, the Morgan Stanley Technology Conference, the Goldman Sachs Technology and Internet Conference, and several others
Investor day presentations and other investor-targeted company events
High-profile public sales and marketing presentations such as Apple's WWDC, Microsoft Build, NVIDIA GTC, and others

The big primary source that we exclude is marketing material such as (non-regulatory) press releases, product launches, company blogs, or promotional content. We believe there is - understandably - not enough incentive for the company to provide a fully fair and balanced assessment in those documents. We only include them when companies file them with the SEC or the equivalent regulator or stock exchange, as we believe that those regulatory bodies ensure a minimum standard of fair and balanced coverage.

Data Considerations: What to Exclude

We deliberately excluded third-party sources of information from our model. This includes:

Financial news sources such as Bloomberg, Reuters, CNBC, Financial Times, and Wall Street Journal
Social media such as Twitter, Reddit, and StockTwits

We are aware that other sentiment models sometimes include these sources. One of the most prominent examples, the VADER sentiment model, is even deliberately designed to work with social media content.

However, we believe that this dilutes the quality of the sentiment signal from two separate directions.

First, our sentiment model is meant to capture the sentiment of management and the company itself, rather than the sentiment of third parties about a company.

Second, the distribution of third-party coverage of various companies is very top-heavy. Outside a particular subset of companies - typically consumer-facing brands and meme stocks - most companies simply do not generate enough content on social media or in traditional media to draw any conclusions. Have you ever looked at the Twitter / X account of ADP (currently roughly the 100th-largest company by market capitalization globally)? How often is it mentioned in the financial press? This lack of coverage is even more pronounced for smaller companies, which may not have a significant social media presence or media coverage.

Company vs Document Sentiment: Picking the Right Basis of Observation

Some sentiment models provide a sentiment score on a per-document basis: how good or bad is the annual report, the quarterly report, the earnings call?

We believe that this is the wrong approach. The sentiment of a document is not the same as the sentiment of a company, which is made up of a plethora of documents and other forms of communication. Each of those has a different weight and relevance to the overall sentiment of the company: an annual report is more important than a random press release. Further, an annual report, a quarterly report, or an earnings call does not exist in isolation. Each is part of a larger narrative that the company tells its investors and other stakeholders over time and across a variety of documents.

That is why we provide only a company sentiment score calculated daily which weighs each document appropriately based on its importance, relevance, and relation to each other.

We update our company sentiment score shortly after the publication of each document (typically within 10-30 minutes), so you can always see the most recent sentiment score of a company.
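As a rough illustration of document weighting, the sketch below rolls per-document scores up into one company score. Everything specific here is an assumption made for the example: the document-type weights, the 90-day half-life, the [-1, 1] score range, and the function names are hypothetical and do not describe the actual Marvin Labs weighting, which also accounts for how documents relate to each other.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional

# Hypothetical importance weights per document type (assumption).
TYPE_WEIGHT = {"annual_report": 3.0, "earnings_call": 2.0, "press_release": 1.0}
# Hypothetical half-life: a document's influence halves every 90 days (assumption).
HALF_LIFE_DAYS = 90.0


@dataclass
class Document:
    doc_type: str
    published: date
    sentiment: float  # assumed per-document score in [-1, 1]


def company_sentiment(docs: Iterable[Document], as_of: date) -> Optional[float]:
    """Weighted average of document scores; None if no document is available."""
    weighted_sum = 0.0
    total_weight = 0.0
    for doc in docs:
        age_days = (as_of - doc.published).days
        if age_days < 0:
            continue  # skip documents published after the valuation date
        weight = TYPE_WEIGHT.get(doc.doc_type, 1.0) * 0.5 ** (age_days / HALF_LIFE_DAYS)
        weighted_sum += weight * doc.sentiment
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None


today = date(2025, 2, 14)
docs = [
    Document("annual_report", today, 0.5),
    Document("press_release", today, -0.5),
]
score = company_sentiment(docs, today)  # (3 * 0.5 + 1 * -0.5) / 4 = 0.25
```

In this toy case the upbeat annual report outweighs the downbeat press release three to one, so the company score stays positive. A per-document view would instead show two contradictory readings.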

Result Calibration

Some companies will have a better base sentiment than others, irrespective of how well they are doing and how well the sentiment model is calibrated. A toy manufacturer that can describe its products as bringing joy to its customers will always have a higher sentiment score than a boring bank.

That's why it is important to calibrate the sentiment model to the company and its peers. We do this by comparing the sentiment score of a company with its peers in the same industry, and normalizing sentiment across industries and across market phases to a score between 0 and 100.
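One simple way to picture peer calibration is a percentile rank within each industry, scaled to 0-100. This is a sketch under stated assumptions, not our production method: the tickers, raw scores, and the choice of percentile ranking are hypothetical, and it ignores the normalization across market phases mentioned above.

```python
from collections import defaultdict
from typing import Dict


def calibrate(raw_scores: Dict[str, float], industry_of: Dict[str, str]) -> Dict[str, float]:
    """Map raw scores to 0-100 by (mid-rank) percentile within each industry."""
    peers_by_industry = defaultdict(list)
    for ticker, score in raw_scores.items():
        peers_by_industry[industry_of[ticker]].append(score)

    calibrated = {}
    for ticker, score in raw_scores.items():
        peers = peers_by_industry[industry_of[ticker]]
        if len(peers) == 1:
            calibrated[ticker] = 50.0  # no peers to compare against
            continue
        below = sum(peer < score for peer in peers)
        ties = sum(peer == score for peer in peers) - 1  # exclude the company itself
        calibrated[ticker] = 100.0 * (below + 0.5 * ties) / (len(peers) - 1)
    return calibrated


# Hypothetical raw scores: toy makers sound upbeat, banks sound sober.
raw = {"TOY_A": 0.9, "TOY_B": 0.7, "TOY_C": 0.8,
       "BANK_A": 0.2, "BANK_B": 0.1, "BANK_C": 0.3}
industry = {"TOY_A": "toys", "TOY_B": "toys", "TOY_C": "toys",
            "BANK_A": "banks", "BANK_B": "banks", "BANK_C": "banks"}
scores = calibrate(raw, industry)
```

Here TOY_B has a raw score of 0.7 but lands at 0 because it is the least upbeat toy maker, while BANK_C lands at 100 with a raw score of only 0.3. The calibrated number answers "how does this company sound relative to its peers?" rather than "how cheerful is its vocabulary?"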

Performance Analysis: Sentiment as an Investment Signal

Methodology

To evaluate sentiment analysis as an investment signal, we constructed a weekly rebalancing portfolio using sentiment scores derived from our LLM-based system. The methodology included:

Analyzing a universe of stocks using our sentiment model
Taking long positions in the ten companies with the highest sentiment scores
Taking short positions in the ten companies with the lowest sentiment scores
Equal-weighting positions within each group
Weekly rebalancing based on updated sentiment scores

The test covered a five-year period from 2018 to present, including the market turbulence of 2020 and subsequent recovery.
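The portfolio construction steps listed above can be sketched as a single weighting function that would be called once per weekly rebalance. The function name, the ticker labels, and the convention of expressing weights as fractions of long notional are assumptions made for the example.

```python
from typing import Dict


def long_short_weights(scores: Dict[str, float], n: int = 10) -> Dict[str, float]:
    """Equal-weight long the top-n and short the bottom-n names by sentiment score.

    Weights are fractions of long notional: longs sum to +1, shorts to -1.
    """
    if len(scores) < 2 * n:
        raise ValueError("universe too small for non-overlapping long and short books")
    ranked = sorted(scores, key=scores.get, reverse=True)
    weights = {ticker: 1.0 / n for ticker in ranked[:n]}
    weights.update({ticker: -1.0 / n for ticker in ranked[-n:]})
    return weights


# Hypothetical universe of 25 names with calibrated scores 0, 4, 8, ..., 96.
universe = {f"T{i}": 4.0 * i for i in range(25)}
weights = long_short_weights(universe)
```

Names in the middle of the ranking receive no weight at all, so the portfolio only expresses a view where the sentiment signal is most extreme.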

Results

As shown in the chart below, a portfolio with a long-notional value of $100 would have yielded approximately $60 over the five-year period without experiencing significant drawdowns¹.

Return on a Sentiment Long-Short Portfolio. Source: Marvin Labs, Bloomberg

The performance is particularly noteworthy for its consistency. During the March 2020 market crash, the strategy demonstrated relative stability, suggesting that sentiment signals may capture information not fully reflected in price movements during periods of market stress.

According to analysis from Goldman Sachs Global Investment Research, sentiment-based signals often exhibit low correlation with traditional factors like value, momentum, and quality, potentially adding diversification benefits to multi-factor investment approaches.

Risk-Adjusted Performance

The strategy demonstrated favorable risk-adjusted metrics:

Sharpe Ratio: 1.3
Maximum Drawdown: 8.2%
Annualized Volatility: 9.6%

These figures compare favorably to both broad market indices and other alternative data strategies during the same period¹.
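For readers who want to compute comparable metrics on their own return series, here is a minimal sketch using common textbook conventions. The conventions are assumptions, not a description of how the figures above were produced: weekly simple returns, 52 periods per year, arithmetic annualization, and a zero risk-free rate by default.

```python
import math
import statistics
from typing import Dict, Sequence

WEEKS_PER_YEAR = 52  # assumed annualization factor for weekly data


def risk_metrics(weekly_returns: Sequence[float], risk_free_annual: float = 0.0) -> Dict[str, float]:
    """Annualized volatility, Sharpe ratio, and maximum drawdown from weekly returns."""
    annual_return = statistics.mean(weekly_returns) * WEEKS_PER_YEAR
    annual_vol = statistics.stdev(weekly_returns) * math.sqrt(WEEKS_PER_YEAR)
    sharpe = (annual_return - risk_free_annual) / annual_vol if annual_vol else float("nan")

    # Maximum drawdown: largest peak-to-trough fall of the compounded equity curve.
    equity = peak = 1.0
    max_drawdown = 0.0
    for weekly_return in weekly_returns:
        equity *= 1.0 + weekly_return
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, 1.0 - equity / peak)

    return {
        "annualized_volatility": annual_vol,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }


# Toy series: up 10%, down 50%, up 10%. The trough is 50% below the peak.
metrics = risk_metrics([0.10, -0.50, 0.10])
```

Different annualization or risk-free choices shift the Sharpe ratio noticeably, so it is worth confirming that the same conventions are used on both sides of any comparison.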

Practical Applications in Investment Research

Opportunity Discovery and Risk Management

Despite the encouraging performance, sentiment analysis works best as a complementary tool rather than a standalone investment approach. A Morgan Stanley research report found that combining sentiment signals with traditional fundamental factors improved risk-adjusted returns by 15-20% compared to using either approach in isolation.

Sentiment analysis excels at identifying potential opportunities and risks, but detailed fundamental analysis - quantitative and qualitative - remains essential for investment decision-making. The most effective approach integrates sentiment signals into existing research workflows, for example as a screening tool to identify companies for further analysis, or as a risk management tool to flag potential red flags. We encourage investors to use sentiment analysis as a starting point for deeper investigation rather than a replacement for traditional research methods.

The Future of LLM-powered Sentiment Analysis in Investment Research

The integration of LLM-powered sentiment analysis into investment research continues to evolve. Several trends show particular promise:

Multimodal analysis: Combining text, audio, and video analysis to detect inconsistencies between a speaker's words and their tone or body language during presentations
Improved model understanding: As LLMs become more sophisticated, they will better understand the nuances of financial language, including sarcasm, humor, and cultural references that may impact sentiment interpretation
Cross-entity sentiment networks: Mapping sentiment relationships between companies, suppliers, customers, and competitors to uncover broader industry trends

FAQ

1. What is sentiment analysis in the context of financial research?
Sentiment analysis in finance is the process of analyzing the tone, emotion, and attitude expressed in company communications (such as earnings calls, annual reports, and press releases) to gauge management's outlook and confidence. Modern sentiment analysis uses large language models to understand the nuance and context of these communications, going beyond simple word counting to capture semantic meaning.

2. How does LLM-powered sentiment analysis differ from traditional approaches?
Traditional sentiment analysis simply counted positive and negative words from predefined lists, missing context and nuance. LLM-powered approaches understand semantic relationships, recognize industry-specific terminology, detect subtle shifts in tone, and interpret statements within broader business contexts. This enables them to accurately assess sentiment even when positive words are used in negative contexts or vice versa.

3. What data sources should be included in a financial sentiment model?
The most reliable sentiment models focus on primary sources where management has regulatory requirements to provide fair and balanced assessments. These include annual and quarterly reports (10-Ks and 10-Qs), earnings call transcripts, regulatory filings, investor day presentations, and select high-profile conferences. Marketing materials and third-party sources (news media, social media) are generally less reliable for assessing management sentiment.

4. Can sentiment analysis be used as a standalone investment strategy?
While our test portfolio based solely on sentiment signals showed promising returns, sentiment analysis works best as a complementary tool rather than a standalone approach. Research shows that combining sentiment signals with traditional fundamental factors improves risk-adjusted returns significantly compared to using either approach in isolation. Sentiment analysis excels as a screening tool to identify companies for further fundamental research.

5. How are sentiment scores calibrated across different companies and industries?
Some companies naturally have a higher "base sentiment" than others due to their industry or business model (e.g., a toy manufacturer versus a bank). Effective sentiment models calibrate scores by comparing a company's sentiment with peers in the same industry and normalizing across industries and market phases. Marvin Labs normalizes sentiment to a score between 0 and 100 for easy comparison.

6. How quickly does sentiment analysis reflect new information?
Modern sentiment analysis tools like Marvin Labs update company sentiment scores shortly after the publication of each new document, typically within 10-30 minutes. This allows investors to quickly assess changes in management tone or outlook following earnings releases, investor presentations, or other important communications.

7. What risk-adjusted performance metrics can be expected from sentiment-based strategies?
In our five-year backtest, a sentiment-based long-short portfolio demonstrated a Sharpe Ratio of 1.3, maximum drawdown of 8.2%, and annualized volatility of 9.6%. These figures compare favorably to both broad market indices and other alternative data strategies. The strategy showed particular stability during market stress periods like the March 2020 crash.

8. Why focus on company-level sentiment rather than document-level sentiment?
Document-level sentiment analysis (scoring individual reports or calls) misses the broader narrative a company builds across multiple communications over time. Company-level sentiment weighs each document appropriately based on its importance and relevance, creating a more holistic view of management's outlook. This approach recognizes that documents don't exist in isolation but form part of an ongoing narrative.

9. What future developments are expected in LLM-powered sentiment analysis?
Promising trends include multimodal analysis (combining text, audio, and video analysis to detect inconsistencies in communication), increasingly sophisticated understanding of financial language nuances, and cross-entity sentiment networks that map relationships between companies, suppliers, customers, and competitors to uncover broader industry trends.

10. How does sentiment analysis correlate with traditional investment factors?
According to research from Goldman Sachs Global Investment Research, sentiment-based signals often exhibit low correlation with traditional factors like value, momentum, and quality. This low correlation means sentiment analysis can provide diversification benefits when added to multi-factor investment approaches, potentially identifying opportunities that traditional metrics might miss.

Footnotes

¹ We want to clarify that this is not investment advice. Any such analysis from any software vendor - including Marvin Labs - should be taken with a grain of salt.

by Alex Hoffmann

Alex is the co-founder and CEO of Marvin Labs. Prior to that, he spent five years in credit structuring and investments at Credit Suisse. He also spent six years as co-founder and CTO at TNX Logistics, which exited via a trade sale. In addition, Alex spent three years in special-situation investments at SIG-i Capital.