One first needs to download the samples from SCUBIDOO. It is advised to start with the smallest sample (S). If the virtual screening does not yield promising results, one can screen the bigger samples (M or L). These contain more compounds from the database and therefore increase the chances of finding promising products. The samples are provided in isomeric canonical SMILES format (.ism), a common line notation that encodes the 2D structure of a molecule as text. Once downloaded, the sample(s) must be prepared for virtual screening (docking or similarity search).
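As a quick sanity check on a downloaded sample, the file can be read with a few lines of Python. This is only a minimal sketch: it assumes the usual SMILES file layout of one molecule per line, with the SMILES string first and an identifier after whitespace. The function name `parse_smiles_lines` and the file name in the usage note are placeholders, not part of SCUBIDOO.

```python
def parse_smiles_lines(lines):
    """Parse SMILES records, assuming '<SMILES> <identifier>' on each line.

    Blank lines are skipped. A line without an identifier gets an empty one.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first run of whitespace only, so that identifiers
        # containing spaces are kept whole.
        parts = line.split(None, 1)
        smiles = parts[0]
        identifier = parts[1].strip() if len(parts) > 1 else ""
        records.append((smiles, identifier))
    return records
```

With a downloaded file (hypothetical name), `with open("sample_S.ism") as handle: records = parse_smiles_lines(handle)` gives a list whose length is the number of products in the sample.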
Next, one needs to convert the 2D information of the products (SMILES) into 3D structures (conformers), usually encoded in .sdf or .mol2 format. Depending on the docking program used, software for this step may already be included. If not, the conversion can be done with Open Babel, FROG2 (online service), OMEGA or Corina. For a more exhaustive list, visit vls3d.
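For Open Babel, the conversion might look like the sketch below, which calls the `obabel` command-line tool from Python. It assumes Open Babel is installed and on the PATH. The input format is stated explicitly with `-ismi` rather than relying on the file extension, and `--gen3d` asks Open Babel to generate 3D coordinates. The helper names and file names are placeholders. Depending on the docking program, further preparation may still be needed afterwards.

```python
import shutil
import subprocess


def build_obabel_command(smiles_path, sdf_path):
    """Return the obabel call that converts a SMILES file to a 3D SDF file."""
    return [
        "obabel",
        "-ismi", smiles_path,  # read the input as SMILES
        "-osdf",               # write SDF
        "-O", sdf_path,        # output file
        "--gen3d",             # generate 3D coordinates
    ]


def convert_to_3d(smiles_path, sdf_path):
    """Run the conversion; raises if obabel is missing or the call fails."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel was not found on the PATH")
    subprocess.run(build_obabel_command(smiles_path, sdf_path), check=True)
```

A call such as `convert_to_3d("sample_S.ism", "sample_S.sdf")` would then write one 3D structure per product. For large samples this step can take a while, so testing it on a few lines of the file first is a reasonable precaution.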
Once the sample has been correctly prepared, it can be docked into the target of interest. A plethora of docking software is available, each with its own FAQ. A non-exhaustive list is available here.
The docking procedure aims at identifying promising products (low-hanging fruit) within the sample. These products can later be explored in greater depth. One usually visually inspects the top 500 molecules, ranked according to their docking score.
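Selecting the molecules to inspect can be sketched as a simple sort. The example assumes the docking scores have already been collected into a dictionary keyed by product ID, and that a lower (more negative) score is better, which holds for many but not all docking programs; the `lower_is_better` switch covers the other case. The function name, the IDs and the scores are made up for illustration.

```python
def top_ranked(scores, n=500, lower_is_better=True):
    """Return the n best (product_id, score) pairs, best first."""
    ranked = sorted(
        scores.items(),
        key=lambda item: item[1],
        reverse=not lower_is_better,
    )
    return ranked[:n]


# Illustrative placeholder IDs and scores, not real SCUBIDOO data.
example_scores = {"P1": -7.2, "P2": -9.8, "P3": -5.1, "P4": -8.4}
best_two = top_ranked(example_scores, n=2)
```

The ranked list only decides which molecules to look at first; the visual inspection of the binding poses remains the actual selection step.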
Since every product in SCUBIDOO is the assembly of two building blocks, one looks for a promising building block (a tree) rather than for a hit (a molecule one wants to synthesize and test later on, i.e. a really good-looking fruit). The latter is of course possible, but less likely, since only a small representative portion of the database was docked.
Once a promising product has been identified, one needs to identify the promising building block(s) (step 4) by deconstructing the promising product (step 3). To do so, simply enter the ID of the promising product into the SCUBIDOO search form. The search returns the synthetic information of the promising product and the building blocks required for its synthesis. One can then select the promising building block (tree) that one would like to study further (usually the fragment that makes the most compelling interactions with the receptor) and download all its derivatives. This can be done by selecting all the compatible reactions or only the reaction(s) of interest. This procedure yields a small dataset of derivatives (step 5), all made from the promising building block. These derivatives can then be docked in the hope of identifying hits (step 6).