Macquarie University

The visualization, unbiased estimation and interpretation of distributional regression models

thesis
posted on 2022-11-04, 02:57 authored by Stanislaus Stadlmann

Distributional regression is a modern approach to regression modeling that makes it possible to link multiple parameters of any parametric response distribution, not just the mean, to structured additive predictors that can take parametric and non-parametric forms. This thesis contributes to the field in three ways. First, it develops a framework for the visualization of distributional regression models that focuses on predicted conditional moments and the shape of the whole distribution, instead of relying solely on distributional parameters as is commonly done. The framework is implemented as an extensive interactive R package named distreg.vis, with a focus on usability. Second, it identifies a bias in the estimated coefficients of all parameters of a distributional regression model that arises when the model equation of one parameter is incorrectly specified. A solution for two-parameter distributions, based on a numerically solved system of ordinary differential equations (ODE) constructed from the covariance matrix of the parameters' maximum likelihood estimates (MLE), is outlined, implemented and tested in a simulation study. Third, it fills a gap in the interpretation of fitted distributional regression models. Existing metrics for ranking the importance of variables in linear regression models are discussed, with “relative weights” and “hierarchical partitioning” standing out as the most suitable because of their robustness to the scale of covariates, their consideration of cross-correlation between variables, their order independence and their suitability for effects with more than one degree of freedom. These metrics are then extended to generalized linear models (GLM) and generalized additive models for location, scale and shape (GAMLSS) with linear predictors, taking into account the possibly multi-parametric response structure and the likelihood-based nature of the fitted regression models. The extensions are implemented in an R package called vibe, which provides methods compatible with several other packages. The contributions are showcased using datasets on wages in the Mid-Atlantic region of the USA, gym visitor numbers in Göttingen, extreme rainfall in Tasmania, Australia, patient satisfaction with a health care provider in North Macedonia and malnutrition scores in India.

History

Table of Contents

1 Introduction -- 2 Distributional regression models -- 3 Interactively visualizing distributional regression models -- 4 Parameter orthogonality transformations in distributional regression models -- 5 Variable importance in likelihood-based regression models -- 6 Conclusion and discussion -- A Appendix -- References

Notes

Cotutelle thesis in conjunction with the Georg-August Universität Göttingen. A thesis jointly submitted to Macquarie University, Sydney (Department of Mathematics and Statistics, Faculty of Science and Engineering) and Georg-August Universität Göttingen (Chairs of Statistics and Econometrics, Faculty of Business and Economics) for the degrees of Doctor of Philosophy (Sydney) and Dr. rer. pol. (Göttingen).

Awarding Institution

Macquarie University

Degree Type

Thesis PhD

Degree

Thesis (PhD), Department of Mathematics and Statistics, Faculty of Science and Engineering, Macquarie University

Department, Centre or School

Department of Mathematics and Statistics

Year of Award

2021

Principal Supervisor

Gillian Heller

Additional Supervisor 1

Maurizio Manuguerra

Rights

Copyright: The Author. Copyright disclaimer: https://www.mq.edu.au/copyright-disclaimer

Language

English

Extent

169 pages
