v_0.30 change note:
- improved accuracy of the model (RMSE reduced to 0.42)
This app is based on a model trained on more than 10000 molecules with more than 60000
solubility data points at different temperatures and in different solvents.
About 9000 solubility data points in water were taken from the AqSolDB database
https://doi.org/10.1038/s41597-019-0151-1
and another 54000 data points from BigSolDB https://doi.org/10.26434/chemrxiv-2023-qqslt
The machine learning models behind this app are XGBoost and a convolutional neural network
with a customized loss function implemented in PyTorch. After training the two models,
the final prediction is taken as a weighted combination of their outputs.
Featurization of the molecules was based on the RDKit package.
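The weighting of the two model outputs can be sketched as below. This is only a minimal illustration, not the app's actual code: the function name weighted_prediction, the parameter xgb_weight, and the default weight of 0.5 are all assumptions, since the weights used by the app are not stated here.

```python
def weighted_prediction(xgb_pred, cnn_pred, xgb_weight=0.5):
    """Blend two lists of predicted log-solubility values.

    xgb_weight is a hypothetical weight for the XGBoost output;
    the neural network output receives (1 - xgb_weight).
    """
    if not 0.0 <= xgb_weight <= 1.0:
        raise ValueError("xgb_weight must lie between 0 and 1")
    if len(xgb_pred) != len(cnn_pred):
        raise ValueError("both models must predict the same number of samples")
    cnn_weight = 1.0 - xgb_weight
    # Element-wise weighted average of the two model outputs.
    return [xgb_weight * a + cnn_weight * b for a, b in zip(xgb_pred, cnn_pred)]


# Example with made-up predictions for two samples.
blended = weighted_prediction([-2.0, 0.0], [-4.0, 2.0], xgb_weight=0.25)
print(blended)
```

In practice a weight like this would typically be chosen on a validation set rather than fixed by hand.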
The model achieved an RMSE of about 0.44 on the tested solubility values, on a logarithmic scale.
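For reference, the RMSE quoted above can be computed from measured and predicted log-scale solubility values as in the sketch below. The function name rmse and the sample numbers are illustrative assumptions, not data from the app.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length lists of values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up log-scale solubility values, for illustration only.
measured = [-1.0, -2.5, 0.3]
predicted = [-1.4, -2.0, 0.1]
print(round(rmse(measured, predicted), 3))
```

If the logarithm is base 10 (an assumption, as the base is not stated here), an RMSE of 0.44 corresponds to a typical error factor of about 10**0.44, or roughly 2.75, in the solubility itself.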