BLOG — Dec 20, 2023

Why is using data and data science both interesting and important within financial markets?

Using data as part of the investment management process is far from new. So why are data and data science both interesting and important within financial markets, and why is alternative data so valuable?

We explored these topics at our recent S&P Global Quantitative Investment Management Forum in London.

Firstly, how do we define alternative data? Is it defined by asset class or geography - anything outside of Equities, or outside the US? We observe, for example, rising use of our proprietary capital flow data and systematic credit strategies. Is alternative data defined by public vs private markets, or by time-frequency - such as tick data? Does reliability mark the boundary between traditional and alternative data sets? For example, if a truly unique data set disappears there is nothing to replace it, whereas traditional data could be defined as having multiple sources. Is alternative data defined as data used within a financial market context but created for, or originating from, a different purpose? Or is alternative data defined by human and emotional bias - data so differentiated that it is deemed to incur more risk in order to generate alpha?

Ultimately, data is a spectrum between traditional and alternative rather than a binary - and, straightforward as it may seem, data conveys information. Here are our panel's thoughts on why and how best to leverage data and optimize data-driven strategies:

  • Data, including alternative data, provides a better connection with the real economy - a more effective way to calculate and price risk, and to reconcile the price-formation process with the changing state of the economy.

  • Data science is now required at every level of the value chain - from data collection, alpha signal extraction and portfolio construction, to trade/strategy execution and risk management. Data science at scale is now a requirement to gain a competitive advantage, especially when moving into new/alternative asset classes.

  • As such, collect as much data as possible, from as close to the source as possible, understand its specificities and remove data bias for each intended use. Noise removal is imperative to understand the true signals offered by the data.

  • However, data must not be blindly injected into systems. Data must be used with intentionality and purpose to fill a specific information gap. This approach will prove more additive over time than purely focusing on returns.

  • Combining data sets is more powerful than using data in isolation - but contextualization is important. Combining market data, traditional data and alternative data, and overlaying this with domain expertise, will enable you to build a complete picture of the assets you are interested in.

  • Mapping is key - alternative data needs to be mapped to become meaningful, and mapping between datasets and between equity and fixed income assets remains a significant challenge. Efficient and accurate mapping of data relationships is essential to extract more signal from the 'noise', rather than building a picture manually. Building more complex data relationships will provide an information advantage. Our S&P Global Cross Reference suite allows seamless linking of reference data by company, security, and industry to better manage data integration and minimize manual processes: Cross Reference Services.

  • Cross-Asset signals: To further overcome the challenge of mapping between asset classes as part of a data-driven cross-asset strategy, S&P Global's Alpha Signals factor library provides a suite of credit-derived equity factors, our Bond-Linked Equity Signals, built on top of our proprietary CDS Pricing Dataset and Bond Pricing Data, using a robust mapping algorithm. The quantitative feed provides a daily view of how credit markets impact equities, and these indicators have proven to provide unique information that has low commonality with both fundamental and alternative datasets. Our innovation has enabled us to create alpha-generative signals originating from our mapping capabilities, helping our customers make the connection between credit and equity markets. Furthermore, our quantitative analysts have extended this research to combine our Bond-Linked Equity Factors with Equity Short Interest for an enhanced security selection process in large-cap equities in the US and Developed Europe.

  • Returning to the theme of traditional vs alternative data, there are still significant insights to be gained from traditional data, which remains the foundation of investment management, as demonstrated by our pioneering S&P Capital IQ Financials Dataset, point-in-time Compustat® Financials Dataset and comprehensive S&P Capital IQ Estimates Dataset.

  • Moreover, our product innovation enables us to provide new alternative insights for our customers by leveraging our traditional data sets, for example with our forthcoming S&P Global Company Connections: Detailed Estimates product based on sell-side analyst estimates. Sell-side analyst coverage data provides a new and rich source for establishing connections between firms, as analysts (given their industry expertise) are likely to cover fundamentally related firms. Company-to-company connections are derived from the number of shared sell-side analysts between a pair of companies. As our panel highlights, enhanced data signals can be generated from the linking of data through company relationships.

  • The extraction of textual and tonal data by machine learning is an important step forward, and with the industry's increased focus on LLMs, the importance of textual data is growing significantly. Text is not only used for model-building, but is also increasingly used to gain a timing advantage in execution algorithms and to improve risk management, as well as to enhance alpha generation.

  • Text is also a key area of growth for our own S&P Global products: most unstructured data is in textual form from sources such as emails, transcripts, articles and documents. These text files are usually difficult, time-consuming and expensive to analyze and utilize. S&P Global identifies primary sources of textual information that can be parsed and structured for ease of use, bypassing the entire process of sourcing, cleansing and maintaining the data, while enabling metadata tagging and linking to other datasets such as financials and estimates. Our Textual Data Suite includes machine-readable transcripts, filings and broker research, as well as our Textual Data Analytics: Sentiment Scores & Behavioral Metrics Dataset.

  • Bringing together our cross-asset and textual themes, new research from our Quantamental Research team combines our Textual Data Analytics and Credit Default Swap Pricing datasets to examine the effect of earnings call sentiment, derived from our earnings call transcripts, on CDS spreads: Watch Your Language: Executives' Remarks on Earnings Calls Impact CDS Spreads
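
To make the shared-analyst idea from the Company Connections point above more concrete, here is a minimal Python sketch of counting how many sell-side analysts cover both companies in a pair. It is an illustration only and not S&P Global's methodology: the function name shared_analyst_counts, the analyst labels and the company identifiers are all hypothetical, and a real implementation would likely weight, normalize or filter these raw counts.

```python
from collections import defaultdict
from itertools import combinations


def shared_analyst_counts(coverage):
    """Count shared analysts for every pair of companies.

    coverage: dict mapping an analyst label to the companies that analyst
    covers (hypothetical input format, assumed for this sketch).
    Returns a dict mapping (company_a, company_b) -> number of shared analysts,
    with each pair stored in sorted order.
    """
    counts = defaultdict(int)
    for companies in coverage.values():
        # Deduplicate and sort so each pair has one canonical key.
        for pair in combinations(sorted(set(companies)), 2):
            counts[pair] += 1
    return dict(counts)


# Invented example data: three analysts, four companies.
coverage = {
    "analyst_1": ["AAA", "BBB", "CCC"],
    "analyst_2": ["AAA", "BBB"],
    "analyst_3": ["BBB", "CCC", "DDD"],
}

connections = shared_analyst_counts(coverage)

# Rank pairs by connection strength (most shared analysts first).
ranked = sorted(connections.items(), key=lambda item: (-item[1], item[0]))
for (company_a, company_b), shared in ranked:
    print(f"{company_a}-{company_b}: {shared} shared analyst(s)")
```

In this toy data, AAA and BBB share two analysts, while AAA and DDD share none and so do not appear in the output at all. A raw count like this favours heavily covered companies, which is one reason a production signal would probably normalize by total coverage.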

In conclusion, data is core to the investment management process - increasingly so with LLMs becoming more mainstream - and alternative data is becoming more necessary. However, more data and bigger models are not necessarily better - the key is to understand what you are asking for, and to keep the intention and application of the data in focus.

And as for what defines traditional vs alternative data? In a world of AI, nothing will replace human understanding of data, and this is what ultimately unites all data.

For more information on how to access these data sets, please contact the sales team at: h-ihsm-global-equitysalesspecialists@spglobal.com or visit www.marketplace.spglobal.com.

S&P Global provides industry-leading data, software and technology platforms and managed services to tackle some of the most difficult challenges in financial markets. We help our customers better understand complicated markets, reduce risk, operate more efficiently and comply with financial regulation.


This article was published by S&P Global Market Intelligence and not by S&P Global Ratings, which is a separately managed division of S&P Global.