Two recent, fully open-source COSMO-SAC models are assessed for the first time on the basis of the world’s largest phase equilibrium database, the Dortmund Data Bank (DDB). The performance of COSMO-SAC 2010 and COSMO-SAC-dsp (2013) is studied for vapor-liquid equilibrium (VLE) and infinite dilution activity coefficient predictions and benchmarked against the group contribution models UNIFAC and mod. UNIFAC(DO). For this purpose, binary mixtures formed from 2,295 components are investigated. This leads to 10,897 infinite dilution activity coefficient mixtures and 6,940 VLE mixtures, which correspond to 29,173 and 139,921 data points, respectively. The assessment is organized top-down to preserve a well-structured overview. On the first level, the error distribution over all binary data sets is studied, distinguishing between aqueous and non-aqueous mixtures. In the second, more detailed step, the model performance is analyzed in terms of chemical families. Here, the prediction accuracy for all chemical main-family combinations, e.g. Alkanes + Alcohols or Alkanes + Esters, is taken into account. Because the data set is large, a MATLAB program was created and is provided so that the interested reader can study the models for specific mixture types. With this tool, the models’ accuracy can be investigated further on two levels of chemical sub-families for a chosen main-family combination, e.g. Alkanes + Alcohols. This aims to establish a basis for further improvements of the COSMO-SAC models by considering their performance for given functional groups.
The comprehensive assessment shows a clear improvement from COSMO-SAC 2010 to COSMO-SAC-dsp and from UNIFAC to mod. UNIFAC(DO). The mean absolute deviation of infinite dilution activity coefficient predictions is reduced from 95 % to 86 % (COSMO-SAC 2010 to COSMO-SAC-dsp) and from 73 % to 58 % (UNIFAC to mod. UNIFAC(DO)). A combined mean absolute deviation is introduced to study the temperature, pressure and vapor mole fraction errors of VLE predictions, and it is reduced from 4.77 % to 4.63 % (COSMO-SAC 2010 to COSMO-SAC-dsp) and from 4.47 % to 3.51 % (UNIFAC to mod. UNIFAC(DO)). Detailed error analyses show that the accuracy of the COSMO-SAC models depends mainly on the chemical family types, but not on molecular size asymmetry or polarity. The comparison of aqueous and non-aqueous data sets shows that each method is more accurate for non-aqueous mixtures. In addition, the analysis emphasizes that the COSMO-SAC-dsp model development was meaningful and that dispersive interactions should be taken into account, even though they are only a small part of the total intermolecular interaction energy. Mixtures for which dispersion parameters were established showed an improvement. The present results may serve as a reference for the reliability of predictions with COSMO-SAC methods and provide directions for future developments.
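To illustrate how a percentage deviation such as those quoted above could be computed, the following is a minimal sketch only. It assumes the deviation is a mean absolute relative deviation of predicted against experimental infinite dilution activity coefficients; the abstract does not state the exact definition, and the combined VLE measure is not reproduced here. The function name and all numerical values are hypothetical.

```python
from typing import Sequence


def mean_absolute_relative_deviation(
    predicted: Sequence[float], experimental: Sequence[float]
) -> float:
    """Return the mean absolute relative deviation in percent.

    Assumed definition (not taken from the source):
        100 / N * sum(|pred_i - exp_i| / |exp_i|)
    """
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for pred, exp in zip(predicted, experimental):
        # Relative error of one data point with respect to the experiment
        total += abs(pred - exp) / abs(exp)
    return 100.0 * total / len(predicted)


# Hypothetical activity coefficients, for illustration only
gamma_exp = [2.0, 10.0, 50.0]
gamma_pred = [1.5, 12.0, 40.0]
mard = mean_absolute_relative_deviation(gamma_pred, gamma_exp)
print(f"mean absolute relative deviation: {mard:.2f} %")
```

A relative measure of this kind weights every data point equally regardless of the magnitude of the activity coefficient, which matters for values that span several orders of magnitude.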