Activity of complex multifunctional organic compounds in common solvents.
Models for predicting activity coefficients are important tools for designing major unit operations (distillation columns, liquid-liquid extractors, etc.). In the petrochemical and chemical industries, well-established methods such as UNIFAC and ASOG are routinely employed to predict the activity coefficient. These methods, however, rely on binary group interaction parameters that must be fitted to reliable experimental data, and for this reason they are often not applicable to systems involving complex molecules. In such systems, solid-liquid equilibria are typically of interest, where the solid is a pharmaceutical product or intermediate, or a molecule of similar complexity (the term complex here refers to molecules containing several functional groups that are polar, hydrogen bonding, or lead to mesomeric structures in equilibrium). In many applications, economic and environmental considerations mean that a list of no more than 20 solvents is usually considered. The objective of this work is therefore to develop a method for predicting the activity coefficient of complex multifunctional compounds in some common solvents. The segment activity coefficient approaches of the Hansen, MOSCED and NRTL-SAC models show that it should be possible to “interpolate” between solvents if suitable reference solvents are available (e.g. non-polar, polar and hydrogen bonding). It is therefore useful to classify the solvents into categories within which analogous behaviour should be observed. To accomplish this, a significant amount of data needs to be collected for the common solvents. Data with water as the solvent were freely available, and multiple sources with suitable data were found. Both infinite dilution activity coefficient (y∞) and SLE (solid-liquid equilibrium) data were used for model development. 
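SLE data enter model development through the activity coefficient of the solute at saturation, which can only be evaluated when the heat of fusion and melting temperature of the solute are known. The following is a minimal sketch of that conversion, assuming the simplified Schröder-van Laar relation (pure solid phase, heat capacity terms neglected, no solid-solid transitions); this form is an assumption, not a relation stated in this abstract, and the function names are hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)


def ideal_solubility(h_fus, t_melt, temperature):
    """Ideal mole-fraction solubility of a solid solute.

    Assumes the simplified Schroeder-van Laar relation:
        ln(x_ideal) = -(h_fus / R) * (1/T - 1/T_melt)
    h_fus is the heat of fusion in J/mol, temperatures are in K.
    """
    return math.exp(-h_fus / R * (1.0 / temperature - 1.0 / t_melt))


def activity_coefficient_from_sle(x_sat, h_fus, t_melt, temperature):
    """Solute activity coefficient at saturation from one SLE data point.

    x_sat is the measured mole-fraction solubility at `temperature`.
    Since x_sat * gamma = x_ideal, gamma = x_ideal / x_sat.
    """
    if not 0.0 < x_sat <= 1.0:
        raise ValueError("x_sat must be a mole fraction in (0, 1]")
    return ideal_solubility(h_fus, t_melt, temperature) / x_sat


if __name__ == "__main__":
    # Hypothetical solute: values chosen for illustration only.
    gamma = activity_coefficient_from_sle(
        x_sat=1.0e-4, h_fus=25_000.0, t_melt=420.0, temperature=298.15
    )
    print(f"gamma at saturation: {gamma:.4g}")
```

For sparingly soluble solutes the saturation value is often taken as an estimate of y∞, which is itself an approximation that should be checked case by case.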
The y∞ data were taken from the DDB (Dortmund Data Bank), and the SLE data were taken from Beilstein, Chemspider and the DDB. The limiting factor in using SLE data was the availability of fusion data (heat of fusion and melting temperature) for the solute. Since y∞ in water is essentially a pure component property, it was modelled as such, drawing on experience gained previously by this group. The overall RMD percentage (in ln y∞) for the training set was 7.3 % for 630 compounds. For the test set the RMD (in ln y∞) was 9.1 % for 25 fairly complex compounds. The temperature dependence of y∞ is typically ignored in model development of this kind. Nevertheless, it was investigated here, and a very simple general correlation was found to predict the temperature dependence with moderate accuracy for compounds with low solubility. Data for solvents other than water were very scarce and insufficient for developing a model of reasonable accuracy. A novel method is therefore proposed for alkane solvents, which allows the value in any alkane solvent to be converted to a value in hexane. The method relies on a first-principles application of the solution-of-groups concept. Quite unexpectedly, in the course of developing the method, several shortfalls were uncovered in the combinatorial expressions used by UNIFAC and mod. UNIFAC. These shortfalls were accounted for empirically, and a new expression for the infinite dilution activity coefficient is proposed. This expression is, however, not readily applicable to mixtures and therefore requires further attention. The method extends the data available in hexane (chosen because it is a common solvent for complex compounds). As with the y∞ data in water, the y∞ data in hexane were modelled as a pure component property. The overall RMD percentage (in ln y∞) for the training set was 21.4 % for 181 compounds. 
For the test set the RMD (in ln y∞) was 11.7 % for 14 fairly complex compounds. The great advantage of both methods is that, because y∞ is treated as a pure component property, the number of model parameters grows linearly with the number of groups, whereas for mixture models (UNIFAC, ASOG, etc.) it grows quadratically. For both the water and the hexane method, the predictions of the method developed in this work were compared with those of UNIFAC, mod. UNIFAC, COSMO-RS(OL) and COSMO-SAC. Since water and hexane are not the only solvents of practical interest, a method was developed to interpolate the behaviour in alcohols from the behaviour in water and hexane. The ability to predict the infinite dilution activity coefficient in various solvents allowed the prediction of several other properties, viz. the air-water partition coefficient, the octanol-water partition coefficient, and the behaviour in water-alcohol cosolvent mixtures. In most cases the predictions of these properties were good, even for the fairly complex compounds tested.
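To illustrate how partition coefficients follow from infinite dilution activity coefficients, the sketch below uses standard dilute-solution relations. It is not necessarily the procedure used in this work: it assumes an ideal gas phase, mutually immiscible solvent phases, and approximate molar volumes of the pure solvents near 298 K. All function names, default values and inputs are hypothetical.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol K)

# Approximate molar volumes of the pure liquids near 298 K (assumed values).
V_WATER = 1.807e-5   # m^3/mol
V_OCTANOL = 1.58e-4  # m^3/mol, pure 1-octanol (not water-saturated)


def air_water_partition(gamma_inf_water, p_sat, temperature, v_water=V_WATER):
    """Dimensionless air-water partition coefficient C_air / C_water.

    Dilute solution and ideal gas assumed:
        p = x * gamma_inf * p_sat,  C_air = p / (R T),  C_water = x / v_water
    p_sat is the vapour pressure of the pure liquid solute in Pa.
    """
    return gamma_inf_water * p_sat * v_water / (R * temperature)


def log10_octanol_water_partition(gamma_inf_water, gamma_inf_octanol,
                                  v_water=V_WATER, v_octanol=V_OCTANOL):
    """log10 of the octanol-water partition coefficient C_oct / C_water.

    Equal solute activity in both phases gives
        x_oct / x_water = gamma_inf_water / gamma_inf_octanol,
    and concentrations follow from C = x / v for each dilute phase.
    """
    ratio = (gamma_inf_water * v_water) / (gamma_inf_octanol * v_octanol)
    return math.log10(ratio)


if __name__ == "__main__":
    # Hypothetical solute: values chosen for illustration only.
    k_aw = air_water_partition(gamma_inf_water=2.0e3, p_sat=50.0,
                               temperature=298.15)
    log_kow = log10_octanol_water_partition(gamma_inf_water=2.0e3,
                                            gamma_inf_octanol=5.0)
    print(f"K_aw = {k_aw:.3g}, log10 K_ow = {log_kow:.3g}")
```

Because real octanol-rich phases contain a substantial amount of water, the molar volume and activity coefficient of the water-saturated phase would be the more rigorous inputs; they are exposed as parameters here for that reason.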