Survival analysis: applications to credit risk default modelling
thesis posted on 2022-03-28, 17:23 authored by Mark Thackham
Credit-granting institutions lend money to customers, some of whom may fail to make contractual repayments (namely principal, interest and fees), thereby defaulting on their obligation. Firms employ quantitative credit risk management techniques to estimate and appropriately control their credit risk, ensuring the firm's risk profile remains within its risk appetite, thus contributing to a safely run firm and the stability of the wider economy. Quantitative credit risk management techniques are used to estimate Probability of Default (PD), Exposure at Default (EAD) and Loss Given Default (LGD). These are inputs to the calculation of expected loss (EL), used for loan-loss provisions required under international accounting standards (IASB (2014), FASB (2016)), as well as unexpected loss (UL), required by institutions granted regulatory approval under the Basel Accords (BIS, 2006) to use the Advanced Internal Ratings Based (A-IRB) Approach for minimum credit capital. This thesis focuses on applying survival analysis to quantify the risk of credit default used for PD. Institutions already use their own internal data and analytical techniques to quantify the risk of credit default, so the refinements in this thesis could further help firms control their credit risk profile. To be granted regulatory and audit approval, quantitative credit risk models need to have intuitive drivers and functional form. Regression approaches are therefore regularly adopted. Logistic regression is common (Baesens et al. (2003), Lessmann et al. (2015)), but survival models achieve comparable accuracy while providing additional benefits, such as the inclusion of censored data and estimation over multiple time horizons (Bellotti and Crook (2009), Stepanova and Thomas (2002) and Tong et al. (2012)). Survival analysis describes studies in which subjects are followed in anticipation that they will encounter an event of interest.
Originating with Edmund Halley's life table of human mortality (1693) and its extension by Daniel Bernoulli (1760), who demonstrated the increase in human survival if the competing risk of smallpox were eliminated as a cause of death, survival analysis now spans applications across multiple disciplines, such as biomedical science, industrial life testing (Kalbfleisch and Prentice, 2002) and finance (Lessmann et al., 2015). Regression techniques and the method of partial likelihood were introduced by David Cox (1972, 1975) and remain prominent (Hosmer et al., 2008). The Cox model has since been extended, notably by Crowley and Hu (1977) to cater for time-varying covariates and by, for example, Sy and Taylor (2000) to cater for mixture-cure models. Over three chapters, comprising two published papers and one manuscript prepared for publication, this thesis explores computational enhancements to the application of survival analysis, competing risk analysis and mixture-cure analysis to estimating the risk of credit default. These enhancements are: (1) joint estimation of regression coefficients and the baseline hazard using constrained maximum likelihood, where the constraint ensures the latter's non-negativity; (2) calculation of an asymptotic variance-covariance matrix that allows inferences to be drawn for regression estimates; and (3) improved accuracy of parameter estimates in certain settings, as demonstrated via simulation. Applied to credit risk modelling, the methods in this thesis provide regression parameter estimates of comparable accuracy to those obtained using partial likelihood, with the added benefit of returning a baseline hazard estimate with relatively low variability, along with asymptotic variance estimates for the baseline hazard and all regression parameters. This additional information allows clearer resolution of the shape and statistical significance of the underlying baseline hazard for the risk of credit default.
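As an illustration only of enhancement (1), the sketch below jointly estimates a regression coefficient and a baseline hazard by maximising the full likelihood under a non-negativity constraint. It is not the thesis's algorithm: the piecewise-constant baseline hazard, the five equal-width bins, the simulated data, the use of SciPy's L-BFGS-B bounds to impose the constraint, and all function and variable names are assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize


def exposure_matrix(times, edges):
    """Time each subject spends at risk in each baseline-hazard bin."""
    lower, width = edges[:-1], np.diff(edges)
    return np.clip(times[:, None] - lower[None, :], 0.0, width[None, :])


def neg_log_lik(params, X, times, events, edges):
    """Negative full log-likelihood of a piecewise-constant PH model."""
    p = X.shape[1]
    beta, theta = params[:p], params[p:]
    eta = X @ beta
    # Bin containing each observed time (the last bin is closed on the right).
    k = np.clip(np.searchsorted(edges, times, side="right") - 1, 0, theta.size - 1)
    cum_base_haz = exposure_matrix(times, edges) @ theta
    log_haz = np.log(theta[k]) + eta
    return -(np.sum(events * log_haz) - np.sum(np.exp(eta) * cum_base_haz))


def fit(X, times, events, n_bins=5):
    """Jointly estimate beta and the bin-wise baseline hazard theta."""
    edges = np.linspace(0.0, times.max(), n_bins + 1)
    p = X.shape[1]
    # Start from beta = 0 and the crude overall event rate in every bin.
    x0 = np.concatenate([np.zeros(p), np.full(n_bins, events.sum() / times.sum())])
    # Constraint: beta is free, each baseline hazard value must stay positive.
    bounds = [(None, None)] * p + [(1e-8, None)] * n_bins
    res = minimize(neg_log_lik, x0, args=(X, times, events, edges),
                   method="L-BFGS-B", bounds=bounds)
    return res.x[:p], res.x[p:], edges


# Simulated (assumed) data: one covariate, constant true baseline hazard.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 1))
true_beta, true_h0 = 0.7, 0.1
T = rng.exponential(1.0 / (true_h0 * np.exp(true_beta * X[:, 0])))  # default times
C = rng.uniform(0.0, 20.0, size=n)                                  # censoring times
times = np.minimum(T, C)
events = (T <= C).astype(float)

beta_hat, theta_hat, edges = fit(X, times, events)
print("beta estimate:", beta_hat)
print("baseline hazard per bin:", theta_hat)
```

Unlike partial likelihood, which eliminates the baseline hazard from the estimation, this kind of joint fit returns the baseline hazard alongside the regression coefficient, which is the benefit the abstract describes.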
For the survival analysis and competing risk analysis approaches in this thesis, time-varying covariates are included, providing the additional flexibility of modelling covariates whose values change over time.
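Time-varying covariates are commonly supplied to survival software in a counting-process layout, with one row per interval over which the covariate is constant. The sketch below shows that layout for a single loan. The helper `to_counting_process`, the field names, and the loan-to-value figures are hypothetical and chosen only for illustration; the abstract does not specify the data format used in the thesis.

```python
def to_counting_process(loan_id, covariate_path, end_time, defaulted):
    """Split one loan into (start, stop] rows with a constant covariate each.

    covariate_path: list of (change_time, value), sorted by time, first at 0.
    end_time: time of default or censoring.
    defaulted: True if the loan defaulted at end_time, False if censored.
    """
    rows = []
    for i, (start, value) in enumerate(covariate_path):
        if start >= end_time:
            break  # covariate changes after follow-up ended are ignored
        next_change = covariate_path[i + 1][0] if i + 1 < len(covariate_path) else end_time
        stop = min(next_change, end_time)
        # Only the final interval can carry the default event.
        event = int(defaulted and stop == end_time)
        rows.append({"id": loan_id, "start": start, "stop": stop,
                     "x": value, "event": event})
    return rows


# Hypothetical loan: loan-to-value ratio updated at months 0, 6 and 12,
# with default observed at month 15.
rows = to_counting_process("loan-1", [(0, 0.60), (6, 0.75), (12, 0.90)],
                           end_time=15, defaulted=True)
for row in rows:
    print(row)
```

Each row contributes to the likelihood only over its own interval, so the hazard at any time uses the covariate value in force at that time.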