A universal segment approach for the prediction of the activity coefficient.
Date
2016
Abstract
This study investigated the prediction, measurement and modelling of solid-liquid equilibrium
for active pharmaceutical ingredients and solvents employed in the pharmaceutical industry.
Available and new experimental data, novel measuring techniques, and revisions of existing
predictive thermodynamic activity coefficient models were investigated. Thereafter, and more
centrally, a novel model for the prediction of activity coefficients at solid-liquid
equilibrium, which incorporates global optimization strategies in its training, is presented.
The model draws on the segment-interaction approach (via segment surface area) to solid-liquid
equilibrium modelling for molecules, and extends this concept to interactions between
functional groups. Ultimately, a group-interaction predictive method is proposed that is based
on the popular UNIFAC-type method (Fredenslund et al., 1975). The model is termed the
Universal Segment Activity Coefficient (UNISAC) model.
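As background to how an activity coefficient enters a solubility prediction, the sketch below uses the standard simplified solid-liquid equilibrium relation, ln(x * gamma) = -(dH_fus / R) * (1/T - 1/T_m), which neglects heat-capacity differences and solid-solid transitions. It is not the UNISAC model itself. The function name and the numerical inputs are hypothetical, and gamma is treated as a fixed input, whereas in practice it depends on composition and the equation must be solved iteratively.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)


def sle_solubility(T, T_m, dH_fus, gamma=1.0):
    """Mole-fraction solubility from the simplified SLE relation.

    T      : system temperature, K
    T_m    : melting temperature of the pure solute, K
    dH_fus : enthalpy of fusion of the solute, J/mol
    gamma  : activity coefficient of the solute (1.0 gives the ideal solubility)
    """
    ideal = math.exp(-(dH_fus / R) * (1.0 / T - 1.0 / T_m))
    return ideal / gamma


# Hypothetical solute properties, for illustration only.
x_ideal = sle_solubility(T=298.15, T_m=450.0, dH_fus=30000.0)
x_nonideal = sle_solubility(T=298.15, T_m=450.0, dH_fus=30000.0, gamma=2.0)
```

A predictive model such as those discussed here supplies gamma; an activity coefficient above unity lowers the predicted solubility relative to the ideal value.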
A detailed literature review was conducted, with respect to the application of the popular
predictive models to solid-liquid phase equilibrium (SLE) problems, involving structurally
complex solutes, using experimental data available in the literature (Moodley et al., 2016 (a)).
This was undertaken to identify any practical and theoretical limitations in the available
models. Activity coefficient predictions by the NRTL-SAC (Chen and Song, 2004; Chen and
Crafts, 2006), UNIFAC (Fredenslund et al., 1975), modified UNIFAC (Dortmund) (Weidlich
and Gmehling, 1987), COSMO-RS (OL) (Grensemann and Gmehling, 2005), and COSMO-SAC
(Lin and Sandler, 2002) models were carried out, based on available group constants and sigma
profiles, in order to evaluate the predictive capabilities of these models.
The quality of the models was assessed based on the percentage deviation between experimental
data and model predictions. The NRTL-SAC model was found to provide the best replication of
solubility rank for the cases tested. It was, however, not as widely applicable as the majority
of the other models tested, due to the lack of available model parameters in the literature. These
results are consistent with the comprehensive comparison conducted by Diedrichs and Gmehling
(2011).
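The abstract does not give the exact formula used for the percentage deviation. A minimal sketch of one common definition, the mean absolute relative deviation expressed as a percentage, is shown below. The function name and the sample values are hypothetical.

```python
def mean_abs_pct_deviation(x_exp, x_pred):
    """Mean absolute percentage deviation between experimental and predicted values.

    Assumes every experimental value is non-zero (true for measured solubilities).
    """
    if len(x_exp) != len(x_pred) or not x_exp:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - e) / abs(e) for e, p in zip(x_exp, x_pred))
    return 100.0 * total / len(x_exp)


# Hypothetical mole-fraction solubilities, for illustration only.
deviation = mean_abs_pct_deviation([0.10, 0.20], [0.11, 0.18])
```

Because each error is divided by the experimental value, sparingly soluble systems carry as much weight as highly soluble ones.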
Once the limitations of the existing predictive methods had been identified, the UNISAC model
was proposed (Moodley et al., 2015 (b)). The predictive model was initially applied to solid-liquid
systems containing a set of 18 structurally diverse, complex pharmaceuticals in a variety of
solvents, and compared to popular qualitative solubility prediction methods, such as NRTL-SAC
and the UNIFAC-based methods. Furthermore, the Akaike Information Criterion (AIC)
(Akaike, 1974) and Focused Information Criterion (FIC) (Claeskens and Hjort, 2003) were
used to establish the relative quality of the solubility predictions. The AIC scores recommend
the UNISAC model for over 90% of the test cases, while the FIC scores recommend UNISAC
in over 75% of the test cases.
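For readers unfamiliar with the AIC, the sketch below shows how it can rank competing models from their residuals. It assumes the least-squares form under Gaussian errors, AIC = n ln(RSS/n) + 2k with additive constants dropped. The abstract does not state which form the thesis uses, and the residuals and parameter counts here are hypothetical.

```python
import math


def aic_least_squares(residuals, n_params):
    """AIC for a least-squares fit assuming Gaussian errors (constants dropped).

    residuals : differences between experimental and predicted values
    n_params  : number of fitted model parameters, k
    Requires a non-zero residual sum of squares.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


# Hypothetical residuals for two models on the same data set.
scores = {
    "model_a": aic_least_squares([0.1, -0.1, 0.1, -0.1], n_params=2),
    "model_b": aic_least_squares([0.2, -0.2, 0.2, -0.2], n_params=2),
}
preferred = min(scores, key=scores.get)  # the lower AIC is preferred
```

The 2k term penalizes extra parameters, so a model with slightly larger residuals can still be preferred if it uses far fewer parameters.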
The sensitivity of the UNISAC model parameters was highlighted during the initial testing
phase, which indicated the need for a more rigorous method of determining the model
parameters, namely optimization to the global minimum. The Krill Herd algorithm
(Gandomi and Alavi, 2012) was selected to accomplish this.
To verify the suitability of this decision, the algorithm was applied to phase stability (PS) and
phase equilibrium calculations in non-reactive (PE) and reactive (rPE) systems, where global
minimization of the total Gibbs energy is necessary. The results were compared to other
methods from the literature (Moodley et al., 2015 (c)). The Krill Herd algorithm was found to
reliably determine the desired global optima in PS, PE and rPE problems. The algorithm
outperformed or matched all other methods considered for comparison, including swarm
intelligence and genetic algorithms, with an average success rate of 89.5%, and with an average
number of function evaluations of 1406.
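The two performance figures quoted above are the kind of summary typically computed over repeated independent runs of a stochastic optimizer. The sketch below shows one way to tally them. It does not implement the Krill Herd algorithm. The success tolerance, the choice to average evaluations over all runs rather than only successful ones, and every name and value are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    best_value: float  # lowest objective value found in this run
    evaluations: int   # number of objective function evaluations used


def summarize_runs(runs, global_minimum, tol=1e-5):
    """Return (success rate in %, mean number of function evaluations).

    A run counts as a success if its best value lies within `tol` of the
    known global minimum (an assumed criterion).
    """
    if not runs:
        raise ValueError("at least one run is required")
    successes = sum(1 for r in runs if abs(r.best_value - global_minimum) <= tol)
    success_rate = 100.0 * successes / len(runs)
    mean_evaluations = sum(r.evaluations for r in runs) / len(runs)
    return success_rate, mean_evaluations


# Hypothetical results of four runs on a problem whose global minimum is -1.0.
runs = [Run(-1.0, 1000), Run(-1.0, 2000), Run(-0.5, 3000), Run(-1.0 + 1e-7, 2000)]
rate, mean_nfe = summarize_runs(runs, global_minimum=-1.0)
```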
The UNISAC model was then reviewed, and extended, to incorporate the significantly more
detailed group fragmentation scheme of Moller et al. (2008), to improve the range of
application of the model. New UNISAC segment group area parameters were obtained by
data fitting, using the Krill Herd algorithm as the optimization tool. This
Extended UNISAC model was then used to predict SLE compositions, or temperatures, for a
large volume of experimental binary and ternary system data available in the literature (over
4000 data points), and was compared to predictions by the UNIFAC-based and COSMO-based
models (Moodley et al., 2016 (d)).
The AIC scores suggest that the Extended UNISAC model is superior to the original UNIFAC,
modified UNIFAC (Dortmund) (2013), COSMO-RS(OL), and COSMO-SAC models, with
relative AIC scores of 1.95, 4.17, 2.17 and 2.09, respectively. In terms of percentage deviations
alone between experimental and predicted values, the modified UNIFAC (Dortmund) and
original UNIFAC models proved superior at 21.03% and 29.03%, respectively; however, the
Extended UNISAC model was a close third at 32.99%.
As a conservative measure to ensure that inter-correlation of the training set did not occur,
previously unmeasured data was desired as a test set, to verify the ability of the Extended
UNISAC model to estimate data outside of the training set. To accomplish this, SLE
measurements were conducted for systems containing diosgenin, estriol, prednisolone,
hydrocortisone, betulin and estrone. These measurements were undertaken in over 10 diverse
organic solvents, and water, at atmospheric pressure, within the temperature range 293.2-328.2
K, by employing combined digital thermal analysis and thermal gravimetric analysis to
determine compositions at saturation (Moodley et al., 2016 (e); Moodley et al., 2016 (f);
Moodley et al., 2016 (g)).
This previously unmeasured test set data was compared to predictions by the Extended
UNISAC, UNIFAC-based and COSMO-based methods. It was found that the Extended
UNISAC model can qualitatively predict the solubility in the systems measured (where
applicable), comparably to the other popular methods tested. A desirable advantage of the
Extended UNISAC model is that the number of model parameters required to describe mixture
activities is far lower than for the group contribution and COSMO-based methods.
Future developments of the Extended UNISAC model were then considered, which included
the preliminary testing of alternate combinatorial expressions, to better account for size-shape
effects on the activity coefficient. The limitations of the Extended UNISAC model are also
discussed.
Description
Doctor of Philosophy in Chemical Engineering. University of KwaZulu-Natal, Durban, 2016.