Statistical modelling of availability of major food cereals in Lesotho: application of regression models and diagnostics.
Date
2012
Abstract
Regression models applied to cereals data are often limited to estimating and predicting crop
production or yield. The general approach has been to fit the model with little consideration
of the problems that accompany the application of regression models to real-life data, such
as collinearity, poor model fit and violation of model assumptions. Left uncorrected, these problems
may limit the applicability and usefulness of the models and compromise the validity of the
results. We applied regression models and diagnostics to national and household data to model
the availability of the main cereals in Lesotho, namely maize, sorghum and wheat. The application
includes the linear regression model, regression and collinearity diagnostics, the Box-Cox
transformation, ridge regression, quantile regression, and logistic regression and its extensions
to multiple nominal and ordinal responses.
The linear model with a first-order autoregressive process, AR(1), was used to determine factors
that affected the availability of cereals at the national level. Case deletion diagnostics were used
to identify extreme observations that influence different quantities of the fitted regression model,
such as the estimated parameters, the predicted values and the covariance matrix of the estimates.
Collinearity diagnostics detected more than one collinear relationship coexisting in the data set,
identified the variables involved in each relationship, and assessed the potential negative impact
of collinearity on the estimated parameters. Ridge regression remedied the collinearity problems
by controlling the inflation and instability of the estimates. The Box-Cox transformation corrected
non-constant variance and the longer and heavier tails of the distribution of the data. These
remedies increased the applicability and usefulness of the linear models in modelling the
availability of cereals.
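
As a rough illustration of the collinearity diagnostics and the ridge remedy described in this paragraph, the following minimal sketch computes variance inflation factors and ridge estimates. It uses synthetic data only. The function names `vif` and `ridge`, the ridge constant `k = 5.0` and the simulated predictors are assumptions made for the example; they are not taken from the thesis or its data.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (X has no intercept column)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each predictor
    corr = (Xs.T @ Xs) / X.shape[0]             # predictor correlation matrix
    return np.diag(np.linalg.inv(corr))         # VIF_j = j-th diagonal of its inverse

def ridge(X, y, k):
    """Ridge estimates on standardized predictors; k = 0 gives least squares."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()                           # centring removes the intercept
    p = X.shape[1]
    return np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)

# Synthetic data (hypothetical): x2 is almost a copy of x1, so the two are collinear.
rng = np.random.default_rng(0)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 1.0 + 2.0 * x1 + 1.0 * x3 + rng.normal(size=n)

print("VIF:          ", vif(X))
print("least squares:", ridge(X, y, 0.0))
print("ridge (k=5):  ", ridge(X, y, 5.0))
```

Large variance inflation factors for the first two predictors flag the collinear pair, and the ridge coefficient vector has a smaller norm than the least squares one. That shrinkage is the sense in which ridge regression controls inflated estimates.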
Quantile regression, as a robust regression method, was applied to the household data as an
alternative to classical regression. Classical regression estimates from the ordinary least squares
method are sensitive to outliers and to distributions with longer and heavier tails than the normal
distribution. Quantile regression estimates appear to be more efficient than least squares estimates
for a wide range of error distributions. We studied the availability of cereals further by
categorizing households according to the availability of different cereals, and applied the logistic
regression model and its extensions. Logistic regression was applied to model availability versus
non-availability of cereals. Multinomial logistic regression was applied to model availability with
multiple nominal categories. Ordinal logistic regression was applied to model availability with
ordinal categories, which made full use of the available information. The three variants of the
logistic regression model gave results that agree with one another and with the results from the
linear regression and quantile regression models.
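
The sensitivity of least squares to outliers, and the robustness of quantile regression noted earlier in this paragraph, can be seen in the simplest possible case: an intercept-only model. The sketch below is illustrative only. The helper names `check_loss`, `total_loss` and `best_constant` and the two small data sets are hypothetical and do not come from the thesis.

```python
from statistics import mean

def check_loss(residual, tau):
    """Quantile-regression check (pinball) loss for a single residual."""
    return residual * (tau - (1 if residual < 0 else 0))

def total_loss(values, q, tau):
    """Sum of check losses when every value is predicted by the constant q."""
    return sum(check_loss(v - q, tau) for v in values)

def best_constant(values, tau):
    """Intercept-only quantile fit.

    A minimizer of the total check loss always exists at one of the data
    points, so it is enough to search over the observed values.
    """
    return min(sorted(values), key=lambda q: total_loss(values, q, tau))

# Hypothetical data: the second list replaces one value with an extreme outlier.
clean = [2.0, 3.0, 4.0, 5.0, 6.0]
with_outlier = [2.0, 3.0, 4.0, 5.0, 60.0]

# Least squares fit of a constant is the mean: it moves from 4.0 to 14.8.
print(mean(clean), mean(with_outlier))
# Median (tau = 0.5) fit stays at 4.0 in both cases.
print(best_constant(clean, 0.5), best_constant(with_outlier, 0.5))
```

With covariates, the same loss is minimized over regression coefficients rather than over a single constant, which is typically done by linear programming.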
Description
Thesis (Ph.D.)-University of KwaZulu-Natal, Durban, 2012.
Keywords
Mathematical statistics, Probabilities, Linear models (Statistics), Theses--Statistics and actuarial science.