Statistics
Permanent URI for this community: https://hdl.handle.net/10413/6771
Browsing Statistics by Author "Achia, Thomas Noel Ochieng."
Item Application of survival analysis methods to study under-five child mortality in Uganda. (2013) Nasejje, Justine.; Achia, Thomas Noel Ochieng.; Mwambi, Henry G.
Infant and child mortality rates are among the key health indicators of a community or country. The fourth Millennium Development Goal is that, by 2015, all United Nations member countries are expected to have reduced their infant and child mortality rates by two-thirds. Uganda is one of the countries in sub-Saharan Africa with high infant and child mortality rates and therefore needs to identify the factors strongly associated with these high rates in order to provide alternative interventions or maintain the existing ones. The Uganda Demographic Health Survey (UDHS), funded by USAID, UNFPA, UNICEF, Irish Aid and the United Kingdom government, provides a data set which is rich in information. This information has attracted many researchers, and some of it can be used to help Uganda monitor her infant and child mortality rates to achieve the fourth Millennium Development Goal. Survival analysis techniques and frailty modelling are well-developed statistical tools for analysing time-to-event data. These methods were adopted in this thesis to examine factors affecting under-five child mortality in Uganda, using the UDHS data for 2011 and the R and STATA software. Results obtained by fitting the Cox proportional hazards model and frailty models, and drawing inference using both the frequentist and Bayesian approaches, showed that demographic factors (sex of the household head, sex of the child and number of births in the past one year) are strongly associated with high under-five child mortality rates.
Heterogeneity or unobserved covariates were found to be significant at the household level but insignificant at the community level.

Item Bayesian modelling of non–gaussian time series of serve acute respiratory illness. (2019) Musyoka, Raymond Nyoka.; Mwambi, Henry.; Achia, Thomas Noel Ochieng.; Gichangi, Anthony Simon Runo.
Respiratory syncytial virus (RSV), human metapneumovirus (HMPV) and influenza are some of the major causes of acute lower respiratory tract infections (ALRTI) in children. Children younger than 1 year are the most susceptible to these infections. RSV and influenza infections occur seasonally in temperate climate regions. In chapter 2, we developed statistical models that were assessed and compared to predict the relationship between weather and RSV incidence. HMPV causes symptoms similar to those caused by RSV. Currently, only a few models satisfactorily capture the dynamics of time series data of these two viruses. In chapter 3, we used a negative binomial model to investigate the relationship between RSV and HMPV while adjusting for climatic factors. In chapter 4, we considered multiple viruses, incorporating the time-varying effects of these components. The occurrence of different diseases in time contributes to multivariate time series data. In this chapter, we describe an approach to analyze multivariate time series of disease counts and model the contemporaneous relationship between pathogens, namely RSV, HMPV and Flu. The use of the models described in this study could help public health officials predict increases in each pathogen infection incidence among children and help them prepare and respond more swiftly to increasing incidence in low-resource regions or communities. We conclude that preventing and controlling RSV infection subsequently reduces the incidence of HMPV. Respiratory syncytial virus (RSV) is one of the major causes of acute lower respiratory tract infections (ALRTI) in children.
Children younger than 1 year are the most susceptible to RSV infection. RSV infections occur seasonally in temperate climate regions. Based on RSV surveillance and climatic data, we developed statistical models that were assessed and compared to predict the relationship between weather and RSV incidence among refugee children younger than 5 years in Dadaab refugee camp in Kenya. Most time-series analyses rely on the assumption of Gaussian-distributed data. However, surveillance data often do not have a Gaussian distribution. We used a generalised linear model (GLM) with a sinusoidal component over time to account for seasonal variation and extended it to a generalised additive model (GAM) with smoothing cubic splines. Climatic factors were included as covariates in the models before and after timescale decompositions, and the results were compared. Models with decomposed covariates fit RSV incidence data better than those without. The Poisson GAM with decomposed covariates of climatic factors fit the data well and had higher explanatory and predictive power than the GLM. The best model predicted the relationship between atmospheric conditions and RSV infection incidence among children younger than 5 years. Human metapneumovirus (HMPV) causes symptoms similar to those caused by respiratory syncytial virus (RSV). The modes of transmission and dynamics of these epidemics remain poorly understood. Climatic factors have long been suspected of influencing the number of cases in these epidemics. Currently, only a few models satisfactorily capture the dynamics of time series data of these two viruses. In this study, we used a negative binomial model to investigate the relationship between RSV and HMPV while adjusting for climatic factors. We specifically aimed at establishing the heterogeneity in the autoregressive effect to account for the influence between these viruses. Our findings showed that RSV contributed to the severity of HMPV.
This was achieved through comparison of 12 models of various structures, including those with and without interaction between climatic cofactors. Most models neither consider multiple viruses nor incorporate the time-varying effects of these components. Common etiologies of acute respiratory illness (ARI) identified in developing countries include respiratory syncytial virus (RSV), human metapneumovirus (HMPV), influenza viruses (Flu), parainfluenza viruses (PIV) and rhinoviruses, with mixed co-infections in the respiratory tract that make the etiology of ARI complex. The occurrence of different diseases in time contributes to multivariate time series data. In this work, the surveillance data are aggregated by month and are not available at an individual level. This may lead to over-dispersion; hence the use of the negative binomial distribution. In this paper, we describe an approach to analyze multivariate time series of disease counts. A model previously used in the literature to address dependence between two different disease pathogens is extended. We model the contemporaneous relationship between pathogens, namely RSV, HMPV and Flu, from surveillance data on children under 5 years in a refugee camp (Dadaab) to investigate serial correlation. The models evaluate the presence of heterogeneity in the autoregressive effect for the different pathogens and whether, after adjusting for seasonality, an epidemic component could be isolated within or between the pathogens. The model helps in distinguishing between the endemic and epidemic components of the time series, which allows the regular pattern to be separated from irregularities and outbreaks. The use of the models described in this study can help public health officials predict increases in each pathogen infection incidence among children and help them prepare and respond more swiftly to increasing incidence in low-resource regions or communities.
This knowledge helps public health officials to prepare for, and respond more effectively to, increasing RSV incidence in low-resource regions or communities. The study has improved our understanding of the dynamics of RSV and HMPV in relation to climatic cofactors, thereby setting a platform to devise better intervention measures to combat the epidemics. We conclude that preventing and controlling RSV infection subsequently reduces the incidence of HMPV.

Item Bayesian spatial models with application to HIV, TB and STI modeling in Kenya. (2014) Owino, Ngesa Oscar.; Mwambi, Henry Godwell.; Achia, Thomas Noel Ochieng.
This dissertation is concerned with developing and extending statistical models in the area of spatial modeling, with particular interest in application to HIV, TB and HSV-2 data. Hierarchical spatial modeling is a common and useful approach for modeling complex spatially correlated data in many settings in epidemiological, public health and ecological studies. Chapter 1 of this thesis gives a chronological development of disease mapping models, from non-spatial to spatial and from single disease models to multiple disease models. In Chapter 2, a new model that relaxes the over-restrictive normal distribution assumption on the spatially unstructured random effect by using the generalised Gaussian distribution is introduced and investigated. The third chapter provides a framework for including sampling weights in the Bayesian hierarchical disease mapping model. In this model, the design effect is used to re-scale the sample sizes. A new model for over-dispersed spatially correlated binary data is developed in chapter 4 of this thesis; in this model, the over-dispersion parameter is modeled by a beta random effect which is also allowed to vary spatially. In chapter 5, the common multiple spatial disease mapping models are reviewed and adopted for the binary data at hand, since the original models were developed for Poisson count data.
The methodologies developed in this dissertation widen the toolbox for spatial analysis and disease mapping in applications in epidemiology and public health studies.

Item Garch modelling of volatility in the Johannesburg Stock Exchange index. (2013) Mzamane, Tsepang Patrick.; Achia, Thomas Noel Ochieng.; Mwambi, Henry G.
Modelling and forecasting stock market volatility is a critical issue in various fields of finance and economics. Forecasting volatility in stock markets finds extensive use in portfolio management, risk management and option pricing. The primary objective of this study was to describe the volatility in the Johannesburg Stock Exchange (JSE) index using univariate and multivariate GARCH models. We used daily log-returns of the JSE index over the period 6 June 1995 to 30 June 2012. In the univariate GARCH modelling, both asymmetric and symmetric GARCH models were employed. We investigated volatility in the market using the simple GARCH, GJR-GARCH, EGARCH and APARCH models under different distributional assumptions for the error terms. The study indicated that volatility in the residuals and the leverage effect were present in the JSE index returns. Secondly, we explored the dynamics of the correlation between the JSE index, FTSE-100 and NASDAQ-100 index on the basis of weekly returns over the period 6 June 1995 to 30 June 2012. The DCC-GARCH (1,1) model was employed to study the correlation dynamics. These results suggested that the correlation between the JSE index and the other two indices varied over time.

Item Multilevel modelling of HIV in Swaziland using frequentist and Bayesian approaches. (2012) Vilakati, Sifiso E.; Achia, Thomas Noel Ochieng.; Mwambi, Henry G.
Multilevel models account for different levels of aggregation that may be present in the data.
Researchers are sometimes faced with the task of analysing data that are collected at different levels, such that attributes of individual cases are provided as well as attributes of groupings of these individual cases. Data with multilevel structure are common in the social sciences and other fields such as epidemiology. Ignoring hierarchies in data (where they exist) can have damaging consequences for subsequent statistical inference. This study applied multilevel models from frequentist and Bayesian perspectives to the Swaziland Demographic and Health Survey (SDHS) data. The first model fitted to the data was a Bayesian generalised linear mixed model (GLMM) using two estimation techniques: the Integrated Nested Laplace Approximation (INLA) and Markov chain Monte Carlo (MCMC) methods. The study aimed at identifying determinants of HIV in Swaziland as well as comparing the different statistical models. The outcome variable of interest in this study is HIV status, which is binary; the logit link was used in all the models fitted. The results of the analysis showed that the INLA estimation approach is superior to the MCMC approach in Bayesian GLMMs in terms of computational speed. The INLA approach produced the results within seconds, compared to the many minutes taken by the MCMC methods. There were minimal differences observed between the Bayesian multilevel model and the frequentist multilevel model. A notable difference observed between the Bayesian GLMMs and the multilevel models is that of differing estimates for cluster effects. In the Bayesian GLMM, the estimates for the cluster effects are larger than the ones from the multilevel models. The inclusion of cluster-level variables in the multilevel models reduced the unexplained group-level variation.
In an attempt to identify key drivers of HIV in Swaziland, this study found that age, age at first sex, marital status and the number of sexual partners one had in the last 12 months are associated with HIV serostatus. Weak between-cluster variations were found in both men and women.

Item Multivariate analysis of the BRICS financial markets. (2013) Ijumba, Claire.; Achia, Thomas Noel Ochieng.; Mwambi, Henry Godwell.
The co-movements and integration of financial markets have been a subject of great concern among many researchers and economists due to an interest in the impacts of stock market integration in terms of international portfolio diversification, asset allocation and asset pricing efficiency. Understanding the interdependence among financial markets is thus of immense importance, especially to investors and stakeholders in making viable decisions, managing risks and monitoring portfolio performance. In this thesis, we investigated the levels of interdependence and dynamic linkages among the five emerging economies known as the BRICS (Brazil, Russia, India, China and South Africa) using vector autoregressive (VAR), univariate GARCH(1,1) and multivariate GARCH models. Our data sample consisted of the BRICS weekly returns for the period January 2000 to December 2012. We used a VAR model to examine the linear dependence among the BRICS markets. The results from the VAR model analysis provided some evidence of unidirectional linear dependencies of the Indian and Chinese markets on the Brazilian stock market. The univariate GARCH(1,1) and multivariate GARCH models were employed to explore the volatility and dynamic correlation in the BRICS stock returns respectively. The results of the univariate GARCH model suggested volatility persistence among all the BRICS stock returns, where China appeared to be the most volatile, followed by the Russian stock market, while the South African market was found to be the least volatile.
Results from the multivariate GARCH models revealed similar volatility persistence. Furthermore, we found that the correlations among the five emerging markets varied with time. From this study, evidence of interdependence among the BRICS cannot be rejected. Moreover, it appears that there are other factors apart from the internal markets themselves that may affect the volatility and correlation among the BRICS.

Item The role of immune-genetic factors in modelling longitudinally measured HIV bio-markers including the handling of missing data. (2013) Odhiambo, Nancy.; Achia, Thomas Noel Ochieng.; Mwambi, Henry G.
Since the discovery of AIDS among gay men in 1981 in the United States of America, it has become a major world pandemic with over 40 million individuals infected worldwide. According to the Joint United Nations Programme against HIV/AIDS epidemic updates in 2012, 28.3 million individuals are living with HIV worldwide, 23.5 million among them coming from sub-Saharan Africa and 4.8 million individuals residing in Asia. The report showed that approximately 1.7 million individuals have died from AIDS-related causes, 34 million ± 50% know their HIV status, a total of 2.5 million individuals are newly infected, 14.8 million individuals are eligible for HIV treatment and only 8 million are on HIV treatment (Joint United Nations Programme on HIV/AIDS and health sector progress towards universal access: progress report, 2011). Numerous studies have been carried out to understand the pathogenesis and the dynamics of this deadly disease (AIDS), but its pathogenesis is still poorly understood. More understanding of the disease is still needed so as to reduce the rate of its acquisition. Researchers have come up with statistical and mathematical models which help in understanding and predicting the progression of the disease better, so as to find ways in which its acquisition can be prevented and controlled.
Previous studies on HIV/AIDS have shown that inter-individual variability plays an important role in susceptibility to HIV-1 infection, its transmission, progression and even response to antiviral therapy. Certain immuno-genetic factors (human leukocyte antigen (HLA), Interleukin-10 (IL-10) and single nucleotide polymorphisms (SNPs)) have been associated with the variability among individuals. In this dissertation we reaffirm, through statistical modelling and analysis, previous studies that have shown that immuno-genetic factors could play a role in susceptibility, transmission, progression and even response to antiviral therapy. This is done using the Sinikithemba study data from the HIV Pathogenesis Programme (HPP) at Nelson Mandela Medical school, University of KwaZulu-Natal, consisting of 451 HIV-positive and treatment-naive individuals, to model how the HIV bio-markers (viral load and CD4 count) are associated with the immuno-genetic factors using linear mixed models. We finalize the dissertation by dealing with drop-out, which is a pervasive problem in longitudinal studies regardless of how well they are designed and executed. We demonstrate the application and performance of multiple imputation (MI) in handling drop-out using longitudinal count data from the Sinikithemba study with log viral load as the response. Our aim is to investigate the influence of drop-out on the evolution of HIV bio-markers in a model including selected genetic factors as covariates, assuming the missingness mechanism is missing at random (MAR). We then compare the results obtained from the MI method to those obtained from the incomplete dataset. The results show that the findings obtained from the two analyses differ considerably.
Therefore, there is a need to account for drop-out, since ignoring it can lead to biased results.

Item Statistical methods for analysing complex survey data : an application to HIV/AIDS in Ethiopia. (2013) Mohammed, Mohammed Omar Musa.; Zewotir, Temesgen Tenaw.; Achia, Thomas Noel Ochieng.
The HIV/AIDS pandemic is currently the most challenging public health matter facing third world countries, especially those in Sub-Saharan Africa. Ethiopia, in East Africa, with a generalised and highly heterogeneous epidemic, is no exception, with HIV/AIDS affecting most sectors of the economy. The first case of HIV in Ethiopia was reported in 1984. Since then, HIV/AIDS has become a major public health concern, leading the Government of Ethiopia to declare a public health emergency in 2002. In 2011, the adult HIV/AIDS prevalence in Ethiopia was estimated at 1.5%. Approximately 1.2 million Ethiopians were living with HIV/AIDS in 2010. Surveys are an important and popular tool for collecting data. Analytical use of survey data, especially health survey data, has become very common, with a focus on the association of particular outcome variables with explanatory variables at the population level. In this study we used the data from the 2005 Ethiopian Demographic and Health Survey (EDHS 2005) and identified key demographic, socioeconomic, sociocultural, behavioral and proximate determinants of HIV/AIDS risk. Most survey analysts ignore complex survey design issues such as clustering, stratification and unequal probability of selection (weights). This study takes the complex survey design into account, because failure to do so leads to biased parameter estimates and standard errors, wide confidence intervals and incorrect statistical tests. In this study, three statistical approaches were used to analyse the complex survey data.
The first approach was a survey logistic regression used to model the binary outcome (HIV serostatus) and a set of explanatory variables (the dependence of the HIV risk factors). The difference between survey logistic regression and ordinary logistic regression is that the survey logistic regression approach takes the study design into account during analysis. The second approach was a multilevel logistic regression model that assumed that the data structure in the population was hierarchical, and that individuals within households were selected from clusters that were randomly selected from a national sampling frame. We considered a three-level model for our analysis. This second approach considered the results from frequentist and Bayesian multilevel models. Bayesian methods can provide accurate estimates of the parameters and the uncertainty associated with them. The third approach was a spatial modelling approach in which model parameters were estimated under the Integrated Nested Laplace Approximation (INLA) paradigm.

Item Statistical modelling of the relationship between intimate partner violence and HIV infection among women in Zimbabwe. (2014) Chimatira, Isobella.; Achia, Thomas Noel Ochieng.; Mwambi, Henry Godwell.
Zimbabwean women between the ages of 15 and 49 years are among the women most affected by HIV and Intimate Partner Violence in the world. The high rates of HIV infection among women have raised an alarm and stimulated research on the problem of violence against women. Intimate Partner Violence (IPV) is a well-known violation of human rights and is a problem in public health. It usually overlaps with the HIV/AIDS epidemic and has been reported to be a determinant of women's risk for HIV. The present study explored relevant statistical methods for modelling the relationship between IPV and HIV in Zimbabwe. The data used in the current research are from a Demographic and Health Survey (DHS) conducted in Zimbabwe for 2005-06.
The study aimed at analysing the relationship between IPV and HIV using the following explanatory variables: age; marital status; religion; education; wealth index; region; decision making; media exposure; STI; physical and sexual violence. Principal Component Analysis was used to create indices of IPV, media exposure and decision making among women in the age group 15 - 49. Survey Logistic Regression models accounting for the multi-stage survey design were also used to adjust for socio-demographic and socio-economic factors. In order to explore the relationship between IPV and HIV prevalence among women, a generalised linear mixed model was adapted, controlling for socio-demographic variables and treating DHS survey clusters as random effects. Since IPV takes more than two categories, Multinomial Logit Modelling was used to analyse the relationship of IPV with socio-demographic and socio-economic variables. The results from the survey logistic regression modelling were as follows: unadjusted odds ratios (OR) for sexual or physical IPV ranged from 0.91 to 1.09, and the 95% confidence intervals (CI) were (0.72, 1.14) for sexual and (0.92, 1.28) for physical violence. The adjusted odds ratios were 0.82 [95% CI: 0.63, 1.06] for sexual violence and 1.12 [95% CI: 0.97, 1.36] for physical violence. Both survey logistic regression models and generalised linear mixed models found no association between HIV and IPV among women in Zimbabwe. This study provides further evidence that IPV and HIV are not associated. In addition, the analysis revealed that the covariates associated with HIV and IPV were age, education, marital status, STI, religion and wealth index. As a result, the study recommends more research to find the situations or circumstances under which IPV is associated with HIV prevalence.
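
As a reading aid for the odds ratios quoted in the abstract above, the following minimal sketch checks whether each reported 95% confidence interval contains 1, the value that corresponds to no association. It is not the thesis's analysis: the class and function names are hypothetical, and the numbers are the adjusted estimates from the abstract with the colons read as decimal points.

```python
from typing import NamedTuple


class OddsRatio(NamedTuple):
    """An odds ratio with the bounds of its 95% confidence interval."""
    label: str
    estimate: float
    ci_low: float
    ci_high: float


def ci_includes_null(result: OddsRatio, null_value: float = 1.0) -> bool:
    """Return True when the interval contains the null value (OR = 1)."""
    return result.ci_low <= null_value <= result.ci_high


# Adjusted estimates as quoted in the abstract, with "0:82" read as 0.82.
adjusted = [
    OddsRatio("sexual violence", 0.82, 0.63, 1.06),
    OddsRatio("physical violence", 1.12, 0.97, 1.36),
]

for result in adjusted:
    verdict = "includes" if ci_includes_null(result) else "excludes"
    print(f"{result.label}: OR {result.estimate:.2f}, 95% CI {verdict} 1")
```

Both intervals span 1, which is consistent with the abstract's statement that the survey logistic regression models found no association between IPV and HIV.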