file                 property            source                             # of molecules
solubility_data.txt  Aqueous solubility  http://www.vcclab.org/lab/alogps/  1,311

notes on solubility data

This data set was used in ALOGPS 2.0 (except for several molecules that were corrected by Dr. A. Yan after publication of the following article):

Tetko, I.V.; Tanchuk, V.Yu.; Kasheva, T.N.; Villa, A.E.P. Estimation of Aqueous Solubility of Chemical Compounds using E-state Indices. J. Chem. Inf. Comput. Sci., 2001, 41, 1488-1493.

This set is derived from

Huuskonen, J. Estimation of aqueous solubility for a diverse set of organic compounds based on molecular topology. J. Chem. Inf. Comput. Sci., 2000, 40, 773-777.

after correction of SMILES and CAS RN of compounds. It was also used in:

Yan, A.; Gasteiger, J. Prediction of aqueous solubility of organic compounds based on a 3D structure representation. J. Chem. Inf. Comput. Sci., 2003, 43, 429-434.

A new version of the program, ALOGPS 2.1, developed with this set is available on-line at http://www.vcclab.org

The first 878 molecules correspond to the training set used in the article, which is also the training set of Huuskonen. ALOGPS 2.1 was developed using all molecules.
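
The split described above can be reproduced with a few lines of Python. This is only a sketch: the layout of solubility_data.txt is not described in these notes, so the reader below assumes one molecule per non-blank line and no header line, and the function names (read_records, split_records) are hypothetical. Adjust the reader if the file is laid out differently.

```python
from typing import List, Sequence, Tuple

N_TRAIN = 878  # number of leading molecules that form the training set


def read_records(path: str) -> List[str]:
    """Read one record per non-blank line.

    ASSUMPTION: the file holds one molecule per line with no header;
    the actual layout of solubility_data.txt is not documented here.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]


def split_records(
    records: Sequence[str], n_train: int = N_TRAIN
) -> Tuple[List[str], List[str]]:
    """Return (first n_train records, all remaining records)."""
    if len(records) < n_train:
        raise ValueError(
            f"expected at least {n_train} records, got {len(records)}"
        )
    return list(records[:n_train]), list(records[n_train:])
```

Typical use would be train, rest = split_records(read_records("solubility_data.txt")). With the 1,311 molecules listed above, rest would hold the remaining 433 records. These notes do not say how the remaining molecules were used in the article.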