Credit Risk Stress Testing in R

This project includes stress testing, credit risk modeling, and capital impact analysis using macroeconomic scenarios on a loan portfolio.

Overview

This project implements a stress testing framework combining macroeconomic scenario simulation, credit risk modeling, and capital impact analysis, using real-world French economic data. The analysis demonstrates how adverse economic conditions affect a loan portfolio’s expected losses and regulatory capital adequacy.

Data and Sources

GDP, Unemployment, Inflation: Quarterly data from Eurostat, covering France.

Data cleaning includes calculating GDP growth rates, quarterly unemployment rates, and inflation.

Historical macroeconomic data is used to fit a Vector Autoregression (VAR) model for scenario simulation.

Install and import packages

# if not installed: 
# install.packages(c("tidyverse", "vars", "eurostat", "zoo", "dplyr"))

library(tidyverse)
library(vars)
library(eurostat)
library(zoo)
library(dplyr)

Set up parameters

gdp_shock_size <- 2
unemp_shock_size <- 1
infl_shock_size <- 1.5
n_quarters <- 8

n_credits <- 1000

CET1_initial <- 1000
RWA <- 10000

Download data

# GDP 
# search_eurostat("gdp france quarterly")
gdp_raw <- get_eurostat("namq_10_gdp", filters = list(
  geo = "FR", 
  na_item = "B1GQ", 
  unit = "CLV10_MEUR"
))

# Unemployment 
unemp_raw <- get_eurostat("une_rt_q", filters = list(
  geo = "FR", 
  sex = "T", 
  age = "Y20-64",
  unit = "PC_ACT"
))
# Inflation
inflation_raw <- get_eurostat("prc_hicp_midx", filters = list(
  geo = "FR",
  coicop = "CP00",
  unit = "I15"
))

Clean data

# GDP : growth rate
gdp_q <- gdp_raw %>%
  arrange(time) %>%
  mutate(
    time = as.yearqtr(time),
    gdp_growth = 100 * (values / lag(values) - 1)) %>%
  filter(!is.na(gdp_growth)) %>%
  dplyr::select(time, gdp_growth)

gdp_q_unique <- gdp_q %>%
  group_by(time) %>%
  summarise(gdp_growth = mean(gdp_growth, na.rm = TRUE))


# Unemployment rate
unemp_q <- unemp_raw %>%
  dplyr::select(time, unemployment_rate = values) %>%
  arrange(time) %>%
  filter(!is.na(unemployment_rate)) %>%
  mutate(time = as.yearqtr(time))

unemp_q_unique <- unemp_q %>%
  group_by(time) %>%
  summarise(unemployment = mean(unemployment_rate, na.rm = TRUE))


# Inflation
# Convert monthly index to quarterly average index
inflation_q <- inflation_raw %>%
  mutate(quarter = as.yearqtr(time)) %>%
  group_by(quarter) %>%
  summarise(avg_index = mean(values, na.rm = TRUE)) %>%
  arrange(quarter) %>%
  # Compute quarter-on-quarter inflation rate in %
  mutate(inflation = 100 * (avg_index / lag(avg_index) - 1)) %>%
  filter(!is.na(inflation))

inflation_q <- inflation_q %>%
  rename(time = quarter) %>%
  dplyr::select(time, inflation)


# Merge
macro_data <- gdp_q_unique %>%
  inner_join(unemp_q_unique, by = "time") %>%
  inner_join(inflation_q, by = "time") %>%
  arrange(time)


# For later: Assuming that initial GDP level is the last observed GDP level from the original data
last_gdp_level <- as.numeric(tail(gdp_raw$values, 1))

macro_data <- macro_data %>%
  mutate(GDP_level = last_gdp_level * cumprod(1 + gdp_growth / 100))

Vector Autoregression VAR(1) model

A VAR model on GDP growth, unemployment, and inflation to capture dynamic relationships and forecast future quarters.

# Time series matix
macro_ts <- macro_data %>%
  dplyr::select(gdp_growth, unemployment, inflation) %>%
  as.matrix()

# Fit model
var_model <- VAR(macro_ts, p = 1, type = "const")
summary(var_model)

## 
## VAR Estimation Results:
## ========================= 
## Endogenous variables: gdp_growth, unemployment, inflation 
## Deterministic variables: const 
## Sample size: 88 
## Log Likelihood: -199.379 
## Roots of the characteristic polynomial:
## 0.9622 0.2606 0.215
## Call:
## VAR(y = macro_ts, p = 1, type = "const")
## 
## 
## Estimation results for equation gdp_growth: 
## =========================================== 
## gdp_growth = gdp_growth.l1 + unemployment.l1 + inflation.l1 + const 
## 
##                 Estimate Std. Error t value Pr(>|t|)  
## gdp_growth.l1   -0.26162    0.10647  -2.457   0.0161 *
## unemployment.l1  0.05025    0.13628   0.369   0.7132  
## inflation.l1     0.23012    0.22693   1.014   0.3135  
## const           -0.31933    1.18925  -0.269   0.7890  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 1.12 on 84 degrees of freedom
## Multiple R-Squared: 0.0733,  Adjusted R-squared: 0.0402 
## F-statistic: 2.215 on 3 and 84 DF,  p-value: 0.09238 
## 
## 
## Estimation results for equation unemployment: 
## ============================================= 
## unemployment = gdp_growth.l1 + unemployment.l1 + inflation.l1 + const 
## 
##                 Estimate Std. Error t value Pr(>|t|)    
## gdp_growth.l1   -0.09941    0.02713  -3.665 0.000433 ***
## unemployment.l1  0.94586    0.03472  27.242  < 2e-16 ***
## inflation.l1    -0.10446    0.05782  -1.807 0.074374 .  
## const            0.50552    0.30300   1.668 0.098964 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.2854 on 84 degrees of freedom
## Multiple R-Squared: 0.919,   Adjusted R-squared: 0.9161 
## F-statistic: 317.8 on 3 and 84 DF,  p-value: < 2.2e-16 
## 
## 
## Estimation results for equation inflation: 
## ========================================== 
## inflation = gdp_growth.l1 + unemployment.l1 + inflation.l1 + const 
## 
##                 Estimate Std. Error t value Pr(>|t|)  
## gdp_growth.l1    0.01651    0.05250   0.314   0.7540  
## unemployment.l1 -0.12143    0.06720  -1.807   0.0744 .
## inflation.l1     0.23233    0.11190   2.076   0.0409 *
## const            1.36310    0.58644   2.324   0.0225 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Residual standard error: 0.5524 on 84 degrees of freedom
## Multiple R-Squared: 0.1345,  Adjusted R-squared: 0.1035 
## F-statistic:  4.35 on 3 and 84 DF,  p-value: 0.006728 
## 
## 
## 
## Covariance matrix of residuals:
##              gdp_growth unemployment inflation
## gdp_growth      1.25469      0.13320   0.03814
## unemployment    0.13320      0.08145  -0.05403
## inflation       0.03814     -0.05403   0.30510
## 
## Correlation matrix of residuals:
##              gdp_growth unemployment inflation
## gdp_growth      1.00000       0.4167   0.06165
## unemployment    0.41667       1.0000  -0.34276
## inflation       0.06165      -0.3428   1.00000

Unemployment is highly persistent (strong AR coefficient, R²=0.92) and strongly influenced by past GDP growth (negative, significant).
GDP Growth shows weak predictability (low R²), with only its own lag being significant.
Inflation is somewhat persistent and negatively linked to unemployment, but overall model fit is low.
Residuals: Moderate correlation between GDP and unemployment, negative between unemployment and inflation.
Stress Testing Implication: Shocks to unemployment are likely to last, and GDP downturns will increase unemployment. Inflation dynamics are less reliable in this model.

Simulate macro paths

Generate baseline and stressed macroeconomic paths. The stress scenario applies potential shocks to GDP (-2%), unemployment (+1%), and inflation (+1.5%), decreasing exponentially over time.

# Baseline:
set.seed(123)
sim_base <- predict(var_model, n.ahead = n_quarters) 

gdp_sim <- sim_base$fcst$gdp_growth[,1]
unemp_sim <- sim_base$fcst$unemployment[,1]
infl_sim <- sim_base$fcst$inflation[,1]

sim_data <- tibble(
  quarter = 1:n_quarters,
  scenario = "baseline",
  gdp_growth = gdp_sim,
  unemployment = unemp_sim,
  inflation = infl_sim
)

sim_data <- sim_data %>%
  group_by(scenario) %>%
  mutate(GDP_level = last_gdp_level * cumprod(1 + gdp_growth / 100)) %>%
  ungroup()

# Stressed:
gdp_stress <- numeric(n_quarters)
unemp_stress <- numeric(n_quarters)
infl_stress <- numeric(n_quarters)
# Apply shocks at quarter 1
for (i in 1:n_quarters) {
  gdp_stress[i] <- gdp_sim[i] - gdp_shock_size * exp(-0.3 * (i - 1)) 
  unemp_stress[i] <- unemp_sim[i] + unemp_shock_size * exp(-0.3 * (i - 1))
  infl_stress[i] <- infl_sim[i] + infl_shock_size * exp(-0.3 * (i - 1))
}

sim_data_stress <- tibble(
  quarter = 1:n_quarters,
  scenario = "stress",
  gdp_growth = gdp_stress,
  unemployment = unemp_stress,
  inflation = infl_stress
)

sim_data_stress <- sim_data_stress %>%
  group_by(scenario) %>%
  mutate(GDP_level = last_gdp_level * cumprod(1 + gdp_growth / 100)) %>%
  ungroup()


# Merge
sim_all <- bind_rows(sim_data, sim_data_stress)

Visualize macro paths

sim_all %>%
  pivot_longer(cols = c(gdp_growth, unemployment, inflation), names_to = "variable", values_to = "value") %>%
  ggplot(aes(x = quarter, y = value, color = scenario)) +
  geom_line(size = 1) +
  facet_wrap(~variable, scales = "free_y") +
  theme_minimal() + 
  scale_color_brewer(palette = "Set1") +
  labs(title = "VAR Scenario Simulation: Baseline vs Stressed", y = "", x = "Quarter")

GDP Growth: Under the stressed scenario, GDP growth drops sharply negative in the first quarter and remains well below baseline throughout, only partially recovering by quarter 8. This indicates a significant and persistent economic downturn under stress.
Inflation: Inflation falls quickly under stress, dropping from 2% to near 0.5% over 8 quarters, while the baseline remains stable. This suggests that the stress scenario leads to strong disinflationary pressures, likely due to weak demand.
Unemployment: Unemployment rises in the baseline but much more sharply under stress, peaking early and then gradually declining but staying above the baseline. This reflects the persistence of labor market weakness following a shock.

The stressed scenario results in a deep, prolonged GDP contraction, rapid disinflation, and a persistent rise in unemployment compared to baseline. This highlights significant macroeconomic risks and potential credit deterioration in adverse conditions.

Simulate a loan portfolio

Simulation of a synthetic loan portfolio with Exposure at Default (EAD), Loss Given Default (LGD), and sector assignments.

set.seed(123) # ensure the random numbers are reproducible

portfolio <- tibble( 
  id = 1:n_credits,
  sector = sample(c("Industry", "Services", "Construction"), n_credits, replace = TRUE), #randomly assign a sector to each loan
  EAD = runif(n_credits, 100000, 1000000), #exposure at default
  LGD = runif(n_credits, 0.2, 0.5) #loss given default
)

Simulate macroeconomic data for PD model

Note: The Probability of Default model is based on synthetic training data for illustrative purposes. Real-world calibration would require actual default histories.

set.seed(456)
gdp_growth_train <- rnorm(40, mean = 0.5, sd = 1)

macro_data_pd <- tibble(
  quarter = 1:40,
  GDP = gdp_growth_train,
  unemployment = 8 + cumsum(rnorm(40, 0, 0.2)),
  inflation = 1.5 + cumsum(rnorm(40, 0, 0.1)),
  default_rate = 0.02 + 0.005 * (0.5 - GDP) + 0.01 * (unemployment - 8) + rnorm(40, 0, 0.002)
)

Probability of Default model

A Probability of Default (PD) model linking macroeconomic variables to default rates using logistic regression.

# Probability of default (PD) as a function of macro variables
macro_data_pd <- macro_data_pd %>%
  mutate(default_flag = ifelse(default_rate > median(default_rate), 1, 0))
# 1 if default rate above median, 0 otherwise

model_pd <- glm(default_flag ~ GDP + unemployment + inflation, data = macro_data_pd, family = binomial)
# fits a generalized linear model

Predict PD for each macro path

sim_all <- sim_all %>%
  mutate(GDP = gdp_growth)

sim_all <- sim_all %>%
  mutate(
    PD_sim = predict(
      model_pd,
      newdata = tibble(GDP = GDP, unemployment = unemployment, inflation = inflation),
      type = "response"
    ),
  )

Expand portfolio and compute losses

Calculation of expected losses by sector and quarter, adjusting PD by sector risk premiums.

portfolio_expanded <- sim_all %>%
  dplyr::select(scenario, quarter, PD_sim) %>%
  crossing(portfolio) %>%
  mutate(
    sector_adj = case_when(
      sector == "Industry" ~ 0.01,
      sector == "Services" ~ 0.005,
      sector == "Construction" ~ 0.02,
      TRUE ~ 0
    ),
    PD_sim = pmin(PD_sim + sector_adj, 0.5), # makes sure PD does not exceed 50%
    Expected_Loss = PD_sim * LGD * EAD
  )

# Losses by quarter
losses_by_sector_quarter <- portfolio_expanded %>%
  group_by(scenario, quarter, sector) %>%
  summarise(Total_Loss = sum(Expected_Loss)/1e6, .groups = "drop")

ggplot(losses_by_sector_quarter, aes(x = quarter, y = Total_Loss, color = sector)) +
  geom_line(size = 1) +
  facet_wrap(~scenario) +
  theme_minimal() + 
  scale_color_brewer(palette = "Set1") +
  labs(title = "Expected Losses Over Time by Sector",
       x = "Quarter", y = "Expected Loss (Million €)")

Baseline scenario: Expected losses are low and stable (around €2 million) for all sectors across all quarters, indicating normal credit risk in a stable economic environment.
Stress scenario: At the start of stress (quarter 1), expected losses spike sharply to over €30 million for all sectors. This reflects a severe but plausible economic shock causing a sudden deterioration in credit quality.
Recovery: After the initial spike, losses decline rapidly over the next few quarters, stabilizing at a higher level than the baseline but much lower than the peak. This pattern suggests that the economic shock is short-lived but has a lasting impact.
Sector comparison: All three sectors react similarly, with only minor differences in the magnitude of the loss spike and recovery pace. This indicates that the macroeconomic stress affects all sectors broadly and simultaneously.

This highlights the importance of capital buffers to absorb losses during severe downturns and the systemic nature of macro shocks.

Solvency impact

Assessment of the impact of stressed losses on CET1 capital and capital adequacy ratios, assuming initial capital and risk-weighted assets.

final_losses_stress <- losses_by_sector_quarter %>%
  filter(scenario == "stress") %>%
  summarise(total = sum(Total_Loss)) %>%
  pull(total)

CET1_final <- CET1_initial - final_losses_stress
CET1_ratio <- CET1_final / RWA
CET1_ratio_pct <- CET1_ratio * 100

print(paste0("Total expected stress loss: ", round(final_losses_stress,1), "M€"))

## [1] "Total expected stress loss: 241.3M€"

if (final_losses_stress > CET1_initial) {
  warning("⚠️ Losses exceed CET1 — check calibration.")
}

print(paste0("Final CET1: ", round(CET1_final, 1), "M€"))

## [1] "Final CET1: 758.7M€"

print(paste0("CET1 ratio under stress: ", round(CET1_ratio_pct, 2), " %"))

## [1] "CET1 ratio under stress: 7.59 %"

capital_buffer_pct <- (CET1_final / CET1_initial) * 100
print(paste0("Capital buffer remaining under stress: ", round(capital_buffer_pct, 2), "%"))

## [1] "Capital buffer remaining under stress: 75.87%"

print(paste0("Capital buffer consumed under stress: ", round(100 - capital_buffer_pct, 2), "%"))

## [1] "Capital buffer consumed under stress: 24.13%"

CET1 ratio under stress (7.59%) remains above the minimum regulatory requirement (4.5% or 7% including buffer)
Capital buffer remaining of 75.87%), indicating resilience.

Although losses are significant, the CET1 ratio stays above regulatory minimums, and most of the capital buffer remains intact. This suggests a solid capital position and the ability to absorb severe macroeconomic shocks without breaching regulatory thresholds.

Limitations

The PD model is synthetic and illustrative; real default data would improve accuracy.
Scenario shocks and decay rates can be refined with expert input.
Does not take into account other risk factors such as interest rates or FX rates.

View the GitHub Repository