Blog — 3 Sep, 2021

Probability of Default and Scoring Models: Similarities and Differences

Recent years have seen a proliferation of statistical models that combine financial ratios and socioeconomic/macroeconomic factors with advanced mathematical techniques to estimate the creditworthiness of publicly listed or privately held companies in a simplified, quick, automated and scalable way. Fundamentals-based credit risk models usually come in two flavors, depending on the asset class they aim to cover:

  1. Probability of Default (PD) models, which are trained and calibrated on default flags and are particularly useful for small- and medium-sized enterprises (SMEs).
  2. Scoring models, which usually leverage the ratings of an established rating agency to generate a credit score for low-default asset classes, such as high-revenue corporations.

The two approaches tend to overlap on medium and large companies and, in most instances, produce the same or very comparable assessments. At times, however, they can provide divergent credit risk assessments for the same companies, with a difference of several credit score notches. This is not surprising since the models belong to different families of analytics, have been trained on different data sets and are characterized by different “DNAs” (medium-term assessments for PD models and long-term assessments for scoring models).

At S&P Global Market Intelligence, we offer both types of statistical models that are part of the Credit Analytics suite:

  • A PD model: PD Model Fundamentals (PDFN) covers publicly listed and privately owned corporations and banks, with no revenue or asset-size limitations.
  • A scoring model: CreditModel™ (CM) covers medium and large corporations (with total revenue above US$25 million), and banks and insurance companies (with total assets above US$100 million).1

We also offer a powerful analytical tool, the absolute contribution, to identify the drivers of the model output differences. Using the absolute contribution, we can see the main drivers of a weak credit risk assessment for each model. Table 1 below shows the three major common drivers of a score worse than ‘b-’ for non-financial corporates domiciled in North America:

Table 1: Main Drivers of weak credit assessment for North American non-financial companies

Source: S&P Global Market Intelligence, as of July 30, 2021. For illustrative purposes only.
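
To give a flavor of how such a driver analysis works, the snippet below is a minimal, purely illustrative sketch: it decomposes a toy additive score into per-factor contributions and ranks the factors by the magnitude of their contribution. The factor names, weights and values are hypothetical and are not the actual PDFN or CM inputs or coefficients.

```python
# Illustrative sketch only: a toy additive "score" whose drivers are ranked by
# absolute contribution. Factor names, weights and values are hypothetical and
# are NOT the actual PDFN/CM inputs or coefficients.

# Hypothetical standardized financial ratios for one company
factors = {
    "return_on_assets": -1.8,        # weak profitability
    "total_equity_to_assets": -2.4,  # negative equity
    "cash_from_operations": -0.9,
    "revenue_growth": 0.4,
}

# Hypothetical weights of an additive scoring function
weights = {
    "return_on_assets": 0.35,
    "total_equity_to_assets": 0.30,
    "cash_from_operations": 0.20,
    "revenue_growth": 0.15,
}

# Contribution of each factor to the overall (toy) score
contributions = {name: weights[name] * value for name, value in factors.items()}

# Rank the drivers by the magnitude of their contribution
drivers = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)

for name, contrib in drivers:
    print(f"{name:25s} contribution = {contrib:+.2f}")
```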

Most of the time, both models will generate the same weak output (i.e., worse than ‘b-’) within one notch for a given company, whenever its financial statement contains items that are weak across the board. In limited cases, however, there can be significant differences: one model may assign a score worse than ‘b-’, while the other may assign a score that is six or more notches higher. Since each model looks at a limited subset of financial ratios, this can happen when a company files a financial statement with a mixed profile, one that is partly “good” and partly “bad”2.
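
To make the notion of a “notch” concrete, the sketch below measures the distance between two model outputs on a simplified lowercase score scale; the scale and the example scores are illustrative only and do not reproduce either model’s mapping.

```python
# Illustrative sketch only: measuring the divergence between two credit
# assessments in notches on a simplified lowercase score scale.

# Ordered score scale, from strongest to weakest (illustrative)
SCALE = ["aaa", "aa+", "aa", "aa-", "a+", "a", "a-",
         "bbb+", "bbb", "bbb-", "bb+", "bb", "bb-",
         "b+", "b", "b-", "ccc+", "ccc", "ccc-", "cc", "c"]

def notch_difference(score_a: str, score_b: str) -> int:
    """Number of notches separating two scores on the scale above."""
    return abs(SCALE.index(score_a) - SCALE.index(score_b))

# Example: a 'bb+' score from one model vs. a 'ccc+' score from the other
print(notch_difference("bb+", "ccc+"))  # -> 6
```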

One such case is Just Energy Group Inc. (JE), a Toronto-based multi-utilities company that was founded in 19973 and provides electricity and natural gas commodities in the U.S. and Canada. As shown in Figure 1, CM 3.0 assigns a ‘bb+’ score to JE (top panel), while PDFN 2.0 assigns a ‘ccc+’ score (bottom panel), six notches worse.

Figure 1: CM 3.0 (Top Panel) and PDFN 2.0 (Bottom Panel) output for JE, using FY2020 financials

* Currency in millions of CDN dollars.
Source: S&P Global Market Intelligence, as of July 30, 2021. For illustrative purposes only.

The company shows negative Total Equity and negative Retained Earnings/Total Assets, inputs that are used by PDFN 2.0 but not by CM 3.0 and that have a high absolute contribution to the ‘ccc+’ score. So, why did we not choose the same set of financial ratios when training these two models? There are two main reasons:

  1. Data availability: Companies may not consistently report all items in the financial statement. For example, private companies do not usually report cash flow items.
  2. Model’s DNA: Each model is trained on a different dataset (external ratings or default flags) and optimized accordingly by choosing the set of variables that maximizes model performance (a simplified illustration of this idea follows below).
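
As a purely hypothetical illustration of the second point, the sketch below runs a greedy forward selection of candidate ratios against synthetic default flags, keeping the variables that most improve discriminatory power (AUC). The candidate ratios, the synthetic data and the selection rule are illustrative assumptions and do not describe the actual PDFN or CM training process.

```python
# Illustrative sketch only: greedy forward selection of financial ratios that
# maximize discriminatory power (AUC) against default flags. Candidate ratios,
# data and the stopping rule are hypothetical, not the actual PDFN/CM process.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical training set: five candidate ratios and a 0/1 default flag
candidates = ["roa", "equity_to_assets", "current_ratio",
              "debt_to_ebitda", "revenue_growth"]
X = rng.normal(size=(2000, len(candidates)))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=2000) < -1.5).astype(int)

selected = []
best_auc = 0.0
while len(selected) < len(candidates):
    trial_aucs = {}
    for j in range(len(candidates)):
        if j in selected:
            continue
        cols = selected + [j]
        model = LogisticRegression(max_iter=1000).fit(X[:, cols], y)
        trial_aucs[j] = roc_auc_score(y, model.predict_proba(X[:, cols])[:, 1])
    j_best, auc = max(trial_aucs.items(), key=lambda kv: kv[1])
    if auc <= best_auc + 1e-4:   # stop when the gain becomes negligible
        break
    selected.append(j_best)
    best_auc = auc

print("Selected ratios:", [candidates[j] for j in selected],
      "AUC:", round(best_auc, 3))
```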

Since each model looks at a specific set of financials, we advocate using both models when analyzing the same company to get a more holistic view of credit risk:

  • When both models assign a weak credit risk profile to the same company, the company may have overall weak financials, making it a risky business partner.
  • When model outputs diverge, it is useful to remember that CM was trained on credit ratings and, as such, its scores show similar dynamics (i.e., they are relatively static and provide a long-term view of credit risk). Conversely, PDFN was trained on default flags, so its PD values and mapped scores are more dynamic and respond more quickly to changing company financials.4
  • In the case of marked divergences, the analysis can be complemented with additional information available on the S&P Capital IQ platform. This could include analyzing financials, reviewing news and key developments, considering the debt structure and the maturity schedule of all liabilities and adding a peer comparison.

Learn more about S&P Global Market Intelligence’s Credit Analytics models.



1 S&P Global Ratings does not contribute to or participate in the creation of credit scores generated by S&P Global Market Intelligence. Lowercase nomenclature is used to differentiate S&P Global Market Intelligence PD credit model scores from the credit ratings issued by S&P Global Ratings.

2 More details are discussed in the research paper Understanding Drivers of Credit Risk (2021) published by S&P Global Market Intelligence.

3 Source: S&P Capital IQ Platform, as of July 30, 2016.

4 This is also reflected in the choice of the financials. For example, in PDFN Private, short-term liabilities are included in one of the inputs.
