Doctoral Degrees (Statistics)

Permanent URI for this collection: https://hdl.handle.net/10413/7126


Bayesian spatial models with application to HIV, TB and STI modeling in Kenya.
(2014) Owino, Ngesa Oscar.; Mwambi, Henry Godwell.; Achia, Thomas Noel Ochieng.
This dissertation is concerned with developing and extending statistical models in the area of spatial modeling, with particular interest in applications to HIV, TB and HSV-2 data. Hierarchical spatial modeling is a common and useful approach for modeling complex spatially correlated data in many settings in epidemiological, public health and ecological studies. Chapter 1 of this thesis gives a chronological development of disease mapping models, from non-spatial to spatial and from single disease models to multiple disease models. In Chapter 2, a new model that relaxes the over-restrictive normal distribution assumption on the spatially unstructured random effect by using the generalised Gaussian distribution is introduced and investigated. Chapter 3 provides a framework for including sampling weights in the Bayesian hierarchical disease mapping model. In this model, the design effect is used to re-scale the sample sizes. A new model for overdispersed spatially correlated binary data is developed in Chapter 4; in this model, the overdispersion parameter is modeled by a beta random effect that is also allowed to vary spatially. In Chapter 5, the common multiple spatial disease mapping models are reviewed and adapted for the binary data at hand, since the original models were developed for Poisson count data. The methodologies developed in this dissertation widen the toolbox for spatial analysis and disease mapping in epidemiology and public health studies.
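The abstract states only that the design effect is used to re-scale the sample sizes; it does not give the formula. A minimal sketch of one common convention follows, assuming the effective sample size is the observed size divided by the design effect and that the case count is scaled by the same factor so the observed proportion is unchanged. The function name `effective_counts` and the example figures are hypothetical and not taken from the thesis.

```python
def effective_counts(cases, sample_size, design_effect):
    """Rescale a binomial numerator and denominator by a survey design effect.

    Assumed convention (not confirmed by the abstract):
    n_eff = n / deff and y_eff = y / deff.
    """
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    n_eff = sample_size / design_effect
    # Scaling the cases by the same factor keeps the observed proportion fixed.
    y_eff = cases / design_effect
    return y_eff, n_eff


# Hypothetical region: 60 positives among 400 respondents, design effect 1.6.
y_eff, n_eff = effective_counts(60, 400, 1.6)
print(y_eff, n_eff, y_eff / n_eff)
```

Under this convention a design effect above 1 shrinks the effective sample size, so the binomial likelihood in a hierarchical model carries less information than the raw counts would suggest.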
A frequentist and a Bayesian approach to estimating HIV prevalence accounting for non-response using population-based survey data.
(2016) Chinomona, Amos.; Mwambi, Henry Godwell.
Enhanced and novel frequentist and Bayesian approaches to estimating disease measures such as HIV prevalence, utilizing recent advances in statistical computing software, are explored and applied to population-based complex survey data. In particular, design-consistent estimates and logistic regression models for HIV prevalence are respectively computed and fitted using each of the approaches. Practical survey data are rarely obtained using simple random sampling schemes; instead, complex sampling designs that are intended to reflect complex underlying population structures are employed. These designs usually involve stratification, multistage sampling and unequal selection probabilities of sampling units, giving rise to data that are hierarchical (multilevel), clustered, and hence correlated. This is particularly true for large-scale population-based surveys. Consequently, this often gives rise to units that are correlated within clusters as well as multiple sources of variability, rendering standard statistical methods based on the assumption of independence of units inappropriate. Survey logistic regression models built from a generalized linear modelling framework were used to explain the variation in HIV prevalence, accounting for the non-independence of the units. In addition, a hierarchical logistic regression model built from a generalized linear mixed modelling framework was used to capture the variability and correlation of the units within clusters and further determine how different layers interact and impact on a response variable. In particular, the logistic regression models for HIV prevalence on demographic, behavioural and socio-economic variables were developed from a frequentist and a Bayesian perspective. Statistical methods that incorporate prior known information about unknown parameters are vital in most scientific and biological research, especially in studies where replicative experimental investigations are not possible.
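To illustrate why unequal selection probabilities matter for the design-consistent estimates mentioned above, here is a minimal sketch of a weighted (ratio-type) prevalence estimate. It is an assumption for illustration, not the estimator used in the thesis; the function name `weighted_prevalence` and the toy data are hypothetical. It produces a point estimate only, since a design-based variance would also need the strata and cluster identifiers.

```python
def weighted_prevalence(outcomes, weights):
    """Weighted prevalence: sum(w_i * y_i) / sum(w_i) for binary outcomes y_i."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Toy data: 1 = positive test, 0 = negative; weights are hypothetical
# sampling weights (inverse selection probabilities).
outcomes = [1, 0, 0, 1, 0]
weights = [100.0, 300.0, 200.0, 50.0, 350.0]

naive = sum(outcomes) / len(outcomes)          # ignores the design
weighted = weighted_prevalence(outcomes, weights)
print(naive, weighted)
```

In this toy example the positive respondents carry small weights, so the weighted estimate is well below the naive sample proportion.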
The Bayesian statistical paradigm offers a framework in which a prior distribution of a parameter can be combined with the likelihood of the observed data to obtain a posterior distribution for explaining the stochastic variation in a response variable. Computer-intensive simulation-based algorithms such as Markov chain Monte Carlo (MCMC) methods were used to draw samples from the posterior distribution for inference purposes. A Bayesian logistic regression model for HIV prevalence on demographic and socio-economic variables was fitted from a generalized linear modelling framework using MCMC algorithms. Furthermore, practical complex survey data are often characterized by missing observations due to non-response, a phenomenon that is true of the data used for the current research. Often, the analyses of such data take a complete case approach, that is, list-wise deletion of all cases with missing observations, assuming that missing values are missing completely at random (MCAR). In the current research, we systematically simulate or generate multiple values for the missing observations under a multiple imputation method accounting for the structure of the data. A rectangular complete data set is produced, and the variability or uncertainty induced by the very process of imputing the values for the missing observations is accounted for. The study utilizes complex (multi-layered and clustered data with missing values) survey data obtained from the 2010-11 Zimbabwe Demographic and Health Survey (2010-11 ZDHS). The results show that HIV prevalence varies considerably across subgroups of the population. All the analyses are done using R statistical software packages.
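The abstract does not say how estimates from the imputed data sets are combined. A standard choice is Rubin's rules, sketched below in Python for illustration only (the thesis itself reports using R). The function name `pool_rubin` and the numbers are hypothetical. The total variance adds the average within-imputation variance to an inflated between-imputation variance, which is how the uncertainty induced by imputing is carried into the final estimate.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical log-odds coefficient estimated on three imputed data sets.
estimates = [0.50, 0.54, 0.52]
variances = [0.010, 0.012, 0.011]
pooled, total_var = pool_rubin(estimates, variances)
print(pooled, total_var)
```

A complete case analysis would report only a single within-sample variance; the between-imputation term is what distinguishes the pooled result.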
