A perspective on incomplete data in longitudinal multi-arm clinical trials, with emphasis on pattern-mixture-model based methodology.
Missing data are common in longitudinal clinical trials. Rubin described three missing data mechanisms, distinguished by the level of dependence between the missing data process and the measurement process: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR). Data are MCAR when the probability that data are missing is independent of both the observed and the unobserved data. Data are MAR when, conditional on the observed data, the probability that data are missing does not depend on the unobserved data. When neither MCAR nor MAR holds, data are MNAR. The aim of this thesis is to discuss the statistical methodology required for analysing incomplete outcome data and to provide valid statistical methods for the MCAR, MAR and MNAR scenarios. The thesis does not address analyses in which covariate data are missing. Under MCAR, complete-case and available-case analyses are valid. When data are MAR, multiple imputation, likelihood-based models, inverse probability weighting and Bayesian models are valid. When data are MNAR, pattern-mixture, selection and shared-parameter models are valid.
These methods are illustrated by an in-depth analysis of two data sets with missing data. The first data set is from the SAPiT trial, an open-label, randomised controlled trial in HIV-tuberculosis co-infected patients. Patients were randomised to three arms, each initiating antiretroviral therapy at a different time. CD4+ count, an indicator of HIV progression, was measured at baseline and every 6 months for 24 months. The primary question was whether the CD4+ count trajectory over time differed between the three treatment arms. The assumption that missing data are MCAR was not supported by the observed data. We performed a range of sensitivity analyses under both MAR and MNAR assumptions.
The second data set is from an 8-week, placebo-controlled, randomised clinical trial conducted to determine the effectiveness of hypericum (St John’s Wort) or sertraline in reducing depression, as measured by the Hamilton depression scale. The trial randomised 340 participants, 28% of whom were lost to follow-up before Week 8. We performed a sensitivity analysis under different assumptions about the missing data process. The missing data mechanism was not MCAR. Under MAR assumptions, some of the sensitivity analyses found no difference between either treatment arm and placebo, while others found a significant difference between sertraline and placebo, but not between hypericum and placebo. This re-analysis contributed to the literature on the effectiveness of St John’s Wort because it changed the conclusions of the original analysis.