About Skin Doctor

The ability to predict the skin sensitization potential of small organic molecules is of high importance to the development and safe application of cosmetics, drugs and pesticides. Skin Doctor is an in silico model for the prediction of the skin sensitization potential of small molecules based on the largest local lymph node assay (LLNA) data set reported so far. A robust and intuitive definition of the applicability domain, paired with additional indicators of the reliability of predictions, is offered to the user. Details on the methods and the performance of the models are provided in a publication on Skin Doctor.

How to use the Skin Doctor web service

Submitting calculations

Enter a SMILES string, draw a molecule, or upload a file (.smi or .sdf) containing up to 100k molecules. A .smi file has to contain exactly one SMILES per row. If additional information is given for a SMILES, it should be separated from the SMILES by a single space character. Optionally, select a decision threshold different from the default value by moving the decision threshold slide bars to the desired value. Click the submit button to start the calculations. Users will then be forwarded to a results page.
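The .smi layout described above can be checked locally before uploading. The following is a minimal sketch, not the web service's own parser. The function name `parse_smi_lines` is hypothetical, skipping blank rows is an assumption, and the chemical validity of each SMILES is not checked.

```python
MAX_MOLECULES = 100_000  # upload limit stated above ("up to 100k molecules")


def parse_smi_lines(lines):
    """Split each row into (smiles, extra_info) at the first space character.

    Hypothetical helper: it only checks the row layout described above,
    not whether the SMILES itself is chemically valid.
    """
    records = []
    for line_number, raw in enumerate(lines, start=1):
        row = raw.rstrip("\r\n")
        if not row.strip():
            continue  # assumption: blank rows are ignored
        smiles, _, extra = row.partition(" ")
        if not smiles:
            raise ValueError(f"row {line_number}: row starts with a space, no SMILES found")
        records.append((smiles, extra))
    if len(records) > MAX_MOLECULES:
        raise ValueError(f"{len(records)} molecules exceed the limit of {MAX_MOLECULES}")
    return records


if __name__ == "__main__":
    example = ["CCO ethanol\n", "c1ccccc1\n"]
    for smiles, extra in parse_smi_lines(example):
        print(smiles, "|", extra)
```

Everything after the first space is kept as one piece of additional information, so names containing spaces stay intact.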


The results of the prediction are displayed as a colored table. Users can also download the results as a .csv file or check them online at a later point in time via the provided web link. Results are deleted after 60 days or as soon as the user presses the “Delete results” button.
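A downloaded .csv file can be summarized with a few lines of Python. This is a minimal sketch using only the standard library. The function name `count_values` is hypothetical. The header used in the example is taken from the column names shown in the results table, and the assumption that the .csv file uses the same header text has not been verified. The two data rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_values(csv_text, column):
    """Count how often each value occurs in one column of a results .csv."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column not found: {column}")
    return Counter(row[column] for row in reader)


if __name__ == "__main__":
    # Illustrative content only; real header names and values may differ.
    sample = (
        "Filtered SMILES,Predicted activity with decision threshold\n"
        "CCO,non-sensitizer\n"
        "O=Cc1ccccc1,sensitizer\n"
    )
    print(count_values(sample, "Predicted activity with decision threshold"))
```

For a file on disk, the text can be read with `open(path, newline="").read()` and passed to the same function.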

In the “Show/hide columns:” section, users can select which columns are displayed in the results table. The results table contains, among others, the following columns:

Table 1. Explanation of the most important output columns.

Column name



Individual integer for each molecule submitted


SMILES as submitted by the user

Filtered SMILES

SMILES after molecule preprocessing

2D structure

2D structure of the preprocessed molecule

Applicability domain (AD)

States whether the molecule of interest is within the applicability domain of the model

Mean similarity to 5 nearest neighbors

States the mean similarity to the five nearest neighbors in MACCS space, which is used to define the AD of the models.

Predicted activity with decision threshold

States whether the query compound is predicted to be a sensitizer or a non-sensitizer based on the selected decision threshold.

Reliability warnings

A prediction may be unreliable if the distance to the decision threshold or the number of consecutive nearest neighbors with the same activity as predicted is too small.

Distance to decision threshold

States the distance of the prediction to the selected decision threshold. This might be an indicator of the reliability of the prediction. For further information see [link to be added as soon as resource is available].

Number of consecutive nearest neighbors with same activity

States the number of consecutive nearest neighbors with the same activity as predicted. This might be an indicator of the reliability of the prediction. For further information see [link to be added as soon as resource is available].


Code for any errors or warnings thrown during the preparation of molecular structures. See Table 2 for explanation.
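The applicability domain and reliability columns in Table 1 can be illustrated with a short sketch. It is not the Skin Doctor implementation and rests on several assumptions: similarity is taken to be the Tanimoto coefficient on fingerprints given as sets of on-bit indices, the model output is taken to be a score compared to the threshold with `>=`, consecutive neighbors are counted starting from the nearest one, and the cut-offs `min_distance` and `min_agreeing` are hypothetical parameters. All function names are hypothetical as well.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mean_similarity_to_nearest(query, training, k=5):
    """Mean similarity of the query to its k most similar training fingerprints."""
    if not training:
        raise ValueError("training set is empty")
    sims = sorted((tanimoto(query, fp) for fp in training), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


def consecutive_agreeing_neighbors(predicted, neighbor_labels):
    """Count neighbors, nearest first, that share the predicted label until the first mismatch."""
    count = 0
    for label in neighbor_labels:
        if label != predicted:
            break
        count += 1
    return count


def assess(score, threshold, neighbor_labels, min_distance, min_agreeing):
    """Return (predicted label, distance to threshold, agreeing neighbors, warnings)."""
    # Assumption: scores at or above the threshold are labeled "sensitizer".
    predicted = "sensitizer" if score >= threshold else "non-sensitizer"
    distance = abs(score - threshold)
    agreeing = consecutive_agreeing_neighbors(predicted, neighbor_labels)
    warnings = []
    if distance < min_distance:  # hypothetical cut-off
        warnings.append("close to decision threshold")
    if agreeing < min_agreeing:  # hypothetical cut-off
        warnings.append("few agreeing nearest neighbors")
    return predicted, distance, agreeing, warnings


if __name__ == "__main__":
    labels = ["sensitizer", "non-sensitizer", "sensitizer"]
    print(assess(0.55, 0.5, labels, min_distance=0.1, min_agreeing=2))
```

In this sketch a prediction collects a warning for each indicator that falls below its cut-off, which mirrors the two conditions named in the “Reliability warnings” row.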

Table 2. Errors and Warnings.


Error message or warning


Invalid or empty input. No output was produced. If this message appears together with one of the other messages, the other message gives the reason why the input is invalid.


The salt filter identified a multi-compound SMILES for which the core component could not be determined. A result was generated from the original input, but is probably unreliable.


The salt filter has removed at least one component of the input SMILES.


Element types other than those present in the training data were detected. A result was generated but is probably unreliable.


The molecule was broken during the canonicalization procedure. Always occurs together with ‘!1’.


The molecule was broken during the neutralization procedure. Always occurs together with ‘!1’.

Contact, Suggestions and Bug Report

Anke Wilm: wilm@zbh.uni-hamburg.de

Johannes Kirchmair: kirchmair@zbh.uni-hamburg.de