The estimation of semiparametric generalized linear models
Thesis posted on 29.03.2022, 01:59 by Busayasachee Puang-Ngern
In this thesis, a novel method for fitting the semiparametric generalized linear model (SP-GLM) is developed and tested. We demonstrate that this method provides an effective fitting algorithm for the SP-GLM, particularly for very large data sets. We also propose a special case of the SP-GLM that assumes the canonical link function, which simplifies the fitting algorithm, and we discuss how to fit it. GLMs are widely used for data analysis. However, in some applications GLMs fit poorly when the distribution selected for the response data is inaccurate. The SP-GLM, with its nonparametric reference density, extends conventional GLMs. It offers flexibility in regression modelling by relaxing the requirement of a known response distribution: the response variable is only required to have a distribution from some exponential family. However, a limitation has been observed when applying the existing SP-GLM method (Huang, 2014) to large data sets, presumably because the number of constraints in the SP-GLM increases significantly with the sample size. The new SP-GLM methods proposed in this thesis make it possible to fit the SP-GLM to very large data sets. The focus of this research is on estimation of, and inference for, the regression coefficients. An iterative algorithm is developed to estimate the regression coefficients and the reference density simultaneously. The asymptotic properties of the estimators subject to active constraints are also provided. The performance of the proposed methods is tested through simulation studies and real data applications. The simulation results indicate that the proposed methods are effective, giving accurate estimates of the regression coefficients as well as accurate inference.
We conclude that the proposed model fitting methods enhance the capacity of the SP-GLM to handle very large data sets, with fast convergence.
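To make the model structure described above more concrete, the following is a minimal sketch of the exponential tilting idea that underlies an SP-GLM with a nonparametric reference density. It is not the fitting algorithm developed in this thesis. It assumes a discrete reference distribution on a small set of support points and shows how a tilt parameter can be chosen so that the tilted distribution has a prescribed mean. The function names (`tilt`, `tilted_mean`, `solve_theta`), the support, the uniform reference probabilities and the target mean are all hypothetical and chosen only for illustration.

```python
import math


def tilt(support, ref_probs, theta):
    """Exponentially tilt a discrete reference distribution by theta.

    Returns probabilities proportional to ref_probs[k] * exp(theta * support[k]).
    """
    exponents = [theta * y for y in support]
    shift = max(exponents)  # subtract the max exponent for numerical stability
    weights = [p * math.exp(e - shift) for p, e in zip(ref_probs, exponents)]
    total = sum(weights)
    return [w / total for w in weights]


def tilted_mean(support, ref_probs, theta):
    """Mean of the tilted distribution."""
    probs = tilt(support, ref_probs, theta)
    return sum(p * y for p, y in zip(probs, support))


def solve_theta(support, ref_probs, target_mean, lo=-50.0, hi=50.0, tol=1e-10):
    """Find theta whose tilted mean equals target_mean, by bisection.

    The tilted mean is increasing in theta, so bisection applies as long as
    target_mean lies strictly between min(support) and max(support).
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tilted_mean(support, ref_probs, mid) < target_mean:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical example: uniform reference distribution on {0, 1, 2, 3, 4}
support = [0.0, 1.0, 2.0, 3.0, 4.0]
ref_probs = [0.2] * 5
target = 3.0  # stands in for a mean supplied by the regression part of the model

theta = solve_theta(support, ref_probs, target)
probs = tilt(support, ref_probs, theta)
```

In an SP-GLM the target mean for each observation would come from the regression coefficients through the link function, and the reference probabilities would themselves be estimated. Solving one such mean equation per observation gives a sense of why the number of constraints grows with the sample size.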