|file||property||source||# of molecules|
Notes on solubility data
This data set was used in ALOGPS 2.0 (except for several molecules that were corrected by Dr. A. Yan after publication of the following article):
Tetko, I.V.; Tanchuk, V.Yu.; Kasheva, T.N.; Villa, A.E.P. Estimation of Aqueous Solubility of Chemical Compounds using E-state Indices. J. Chem. Inf. Comput. Sci., 2001, 41, 1488-1493.
This set is derived from
Huuskonen, J. Estimation of aqueous solubility for a diverse set of organic compounds based on molecular topology. J. Chem. Inf. Comput. Sci., 2000, 40, 773-777.
after correction of the SMILES and CAS RN of the compounds. It was also used in:
Yan, A.; Gasteiger, J. Prediction of aqueous solubility of organic compounds based on a 3D structure representation. J. Chem. Inf. Comput. Sci., 2003, 43, 429-434.
A new version of the program, ALOGPS 2.1, developed with this set, is available online at http://www.vcclab.org
The first 878 molecules correspond to the training set used in the article, which is also the training set of Huuskonen. ALOGPS 2.1 was developed using all molecules.
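
To reproduce the split described above, the records can be divided by position once they have been read in file order. The following is a minimal Python sketch, not part of the original distribution. The helper name split_train_rest is hypothetical, and how the file is parsed into records is left open because these notes do not describe the file layout. Only the count of 878 comes from the notes.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Size of the training set as stated in the notes above.
N_TRAIN = 878


def split_train_rest(records: Sequence[T], n_train: int = N_TRAIN) -> Tuple[List[T], List[T]]:
    """Split records (kept in original file order) into the first n_train and the rest.

    Hypothetical helper: `records` may be any sequence, for example rows
    already parsed from the data file.
    """
    if len(records) < n_train:
        raise ValueError(
            "expected at least %d records, got %d" % (n_train, len(records))
        )
    return list(records[:n_train]), list(records[n_train:])


# Placeholder records for illustration only; 1000 is NOT the real size of the data set.
placeholder_records = ["mol%d" % i for i in range(1000)]
train, rest = split_train_rest(placeholder_records)
```

The split is purely positional, so it is only valid if the records keep the order of the original file. Sorting or shuffling before the split would mix training molecules with the remaining ones.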