Posted on 2025-07-15, 05:08. Authored by Stuart Muwanga Sebiranda
<p dir="ltr"><i>Background</i>: Although machine learning has played a significant role in combating COVID-19, there is limited research leveraging these techniques to predict COVID-19 cases and identify the key variables contributing to their rise in developing countries. As the pandemic continues to exacerbate disparities between developing and developed nations, addressing this inequality is crucial and aligns closely with a key United Nations Sustainable Development Goal.</p><p dir="ltr"><i>Aim</i>: This thesis employs machine learning to model the spread of COVID-19 in a developing country, Uganda. Machine learning models are used to identify the factors leading to increased COVID-19 cases, and the best-performing model for predicting COVID-19 cases is identified.</p><p dir="ltr"><i>Methods</i>: The daily and cumulative numbers of COVID-19 cases are modelled using four machine learning models: a random forest regressor, a CatBoost regressor, a support vector regressor, and ordinary least squares regression. The best-performing model is identified using four regression metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), the coefficient of determination (R-squared) and the adjusted R-squared.</p><p dir="ltr"><i>Conclusion</i>: The CatBoost regressor emerged as the best-performing model. SHAP values were used to identify the significant variables within the CatBoost model. Subsequent analysis revealed that meteorological, economic and healthcare factors affected COVID-19 case numbers in Uganda.</p>
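<p dir="ltr"><i>Illustrative sketch</i>: The four regression metrics named in the Methods can be computed as in the minimal example below. This is not the thesis's code. The function name <code>regression_metrics</code>, the parameter <code>n_features</code> (the number of predictors, needed for the adjusted R-squared) and the sample values are hypothetical and chosen only for illustration.</p>

```python
import numpy as np


def regression_metrics(y_true, y_pred, n_features):
    """Return MAE, MSE, R-squared and adjusted R-squared for one model.

    n_features is the number of predictors used by the model
    (a hypothetical parameter name, assumed for this sketch).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    if n <= n_features + 1:
        raise ValueError("adjusted R-squared needs n > n_features + 1")

    residuals = y_true - y_pred
    mae = np.mean(np.abs(residuals))          # Mean Absolute Error
    mse = np.mean(residuals ** 2)             # Mean Squared Error

    ss_res = np.sum(residuals ** 2)           # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                # undefined if y_true is constant

    # Adjusted R-squared penalises R-squared for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

    return {"mae": mae, "mse": mse, "r2": r2, "adj_r2": adj_r2}


# Made-up observed and predicted case counts, for illustration only.
observed = [1, 2, 3, 4, 5]
predicted = [1, 2, 3, 4, 6]
metrics = regression_metrics(observed, predicted, n_features=1)
```

<p dir="ltr">Lower MAE and MSE and higher R-squared and adjusted R-squared indicate a better fit, so candidate models can be ranked by calling the function once per model on the same held-out data.</p>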