The ability to predict the skin sensitization potential of small organic molecules is highly important for the development and safe application of cosmetics, drugs, and pesticides. Skin Doctor is an in silico model for predicting the skin sensitization potential of small molecules, based on the largest local lymph node assay (LLNA) data set reported so far. It offers the user a robust and intuitive definition of the applicability domain, paired with additional indicators of the reliability of predictions. Details on the methods and the performance of the models are provided in a publication on Skin Doctor.
Enter SMILES, draw a molecule, or upload a file (.smi or .sdf) containing up to 100k molecules. A .smi file must contain exactly one SMILES per row. If additional information is given for a SMILES, it should be separated from the SMILES by a single space character. Optionally, select a decision threshold different from the default value by moving the decision threshold sliders to the desired value. Click the submit button to start the calculations. Users will then be forwarded to a results page.
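The .smi layout described above can be sketched in a few lines of Python. This is only an illustration of the stated rules (one SMILES per row, optional extra information after a single space, at most 100k molecules). The function names `format_smi` and `parse_smi` are hypothetical and are not part of Skin Doctor.

```python
MAX_MOLECULES = 100_000  # upload limit stated in the text ("up to 100k")

def format_smi(records):
    """Render (smiles, info) pairs as .smi text.

    One SMILES per row; optional extra information follows after a
    single space. Hypothetical helper, not part of Skin Doctor.
    """
    if len(records) > MAX_MOLECULES:
        raise ValueError("more molecules than a single upload allows")
    rows = []
    for smiles, info in records:
        # A SMILES string itself must not contain whitespace, otherwise
        # the first space could not serve as the field separator.
        if not smiles or any(ch.isspace() for ch in smiles):
            raise ValueError(f"invalid SMILES field: {smiles!r}")
        rows.append(f"{smiles} {info}" if info else smiles)
    return "\n".join(rows) + "\n"

def parse_smi(text):
    """Split each non-empty row at the first space into (smiles, info)."""
    records = []
    for row in text.splitlines():
        if row.strip():
            smiles, _, info = row.partition(" ")
            records.append((smiles, info))
    return records

# Two rows: one with additional information, one without.
example = format_smi([("CCO", "ethanol"), ("c1ccccc1", "")])
```

Writing `example` to a file with the `.smi` extension gives a file in the layout described above. Whether the server accepts further variations, such as tab separators, is not stated here.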
The results of the prediction will be displayed as a colored table. Additionally, users can download the results as a .csv file or check the results online at a later point in time via the provided web link. Results will be deleted after 60 days or as soon as the user presses the “Delete results” button.
In the “Show/hide columns:” section, you can select which columns are displayed in the results table. The results table contains, among others, the following columns:
Table 1. Explanation of the most important output columns.
Individual integer for each molecule submitted
SMILES as submitted by the user
SMILES after molecule preprocessing
2D structure of the preprocessed molecule
Applicability domain (AD)
States whether the molecule of interest is within the applicability domain of the model.
Mean similarity to 5 nearest neighbors
States the mean similarity to the five nearest neighbors in MACCS space, which is used to define the AD of the models.
Predicted activity with decision threshold
States whether the query compound is predicted to be a sensitizer or a non-sensitizer based on the selected decision threshold.
A prediction may be unreliable if the distance to the decision threshold or the number of consecutive nearest neighbors with the same activity as predicted is too small.
Distance to decision threshold
States the distance of the prediction to the selected decision threshold. This might be an indicator of the reliability of the prediction. For further information see [link to be added as soon as resource is available].
Number of consecutive nearest neighbors with same activity
States the number of consecutive nearest neighbors with the same activity as predicted. This might be an indicator of the reliability of the prediction. For further information see [link to be added as soon as resource is available].
Code for any errors or warnings thrown during the preparation of molecular structures. See Table 2 for explanation.
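The three numeric columns in Table 1 (mean similarity to the five nearest neighbors, distance to the decision threshold, and number of consecutive nearest neighbors with the same activity) can be illustrated with a minimal Python sketch. It rests on assumptions that are not confirmed by this page: fingerprints are represented as sets of on-bit indices, similarity is measured with the Tanimoto coefficient, and the model output is a numeric score compared against the threshold. All function names are hypothetical. See the Skin Doctor publication for the actual definitions.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.

    Assumption: the similarity measure used by Skin Doctor is not stated here.
    """
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def rank_neighbors(query, training):
    """Return (similarity, is_sensitizer) pairs, most similar first.

    training: list of (fingerprint_set, is_sensitizer) tuples (toy stand-in
    for the training data in MACCS space).
    """
    scored = [(tanimoto(query, fp), active) for fp, active in training]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

def mean_similarity(ranked, k=5):
    """Mean similarity to the k nearest neighbors (k=5 as in Table 1)."""
    top = ranked[:k]
    if not top:
        raise ValueError("no neighbors available")
    return sum(sim for sim, _ in top) / len(top)

def distance_to_threshold(score, threshold):
    """Absolute distance of a prediction score from the decision threshold."""
    return abs(score - threshold)

def consecutive_agreeing_neighbors(ranked, predicted_active):
    """Count nearest neighbors, in rank order, that share the predicted
    activity, stopping at the first neighbor that disagrees."""
    count = 0
    for _, active in ranked:
        if active != predicted_active:
            break
        count += 1
    return count

# Toy data: bit sets are made up and are not real MACCS keys.
query = {1, 2, 3, 4}
training = [
    ({1, 2, 3, 4}, True),
    ({1, 2, 3}, True),
    ({1, 2}, False),
    ({5, 6}, False),
    ({1, 2, 3, 4, 5}, True),
    ({2, 3, 4, 7}, True),
]
ranked = rank_neighbors(query, training)
```

In this sketch, a small distance to the threshold or a short run of agreeing neighbors would flag a prediction as potentially unreliable. The cut-off values Skin Doctor applies for either indicator are not given on this page.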
Table 2. Errors and Warnings.
Error message or warning
Invalid or empty input. No output was produced. In combination with one of the other messages, the other message gives the reason for the invalidity.
The salt filter identified a multi-compound SMILES for which the core component could not be determined. A result was generated from the original input, but is probably unreliable.
The salt filter has removed at least one component of the input SMILES.
Element types other than those present in the training data were detected. A result was generated but is probably unreliable.
The molecule was broken during the canonicalization procedure. Always occurs together with ‘!1’.
The molecule was broken during the neutralization procedure. Always occurs together with ‘!1’.
Anke Wilm: firstname.lastname@example.org
Johannes Kirchmair: email@example.com