Advances in Virtual High-Throughput Screening

Virtual high-throughput screening is widely used in drug discovery to identify potential molecules to test from the massive drug-like chemistry space. Approaches may be structure- or ligand-based, and increasingly tools are being developed to facilitate and validate such strategies. At the ACS meeting in New Orleans we had the opportunity to meet many of the people involved in developing and applying such virtual high-throughput screening approaches.

Carsten Detering (BioSolveIT, United States) opened the session with “Setting up a discovery pipeline in KNIME and PipelinePilot: High-throughput de novo design utilizing gigantic virtual chemistry spaces.” He described a ROCS shape search of 1.2 x 10^7 compounds completed in a minute and a Pfizer virtual screen of 3 x 10^12 molecules completed in 10-15 minutes. In a further example with Bayer, 116 of 172 molecules were plant-active compounds.

Frank Boeckler (Eberhard Karls University, Germany) discussed “New targets addressed by DEKOIS 2.0: Demanding evaluation kits for objective in-silico screening.” This automated process enables the creation of tailor-made decoy sets for any given set of bioactives. It facilitates target-dependent validation of docking algorithms and scoring functions, helping to save time and resources.

Evan Bolton (National Center for Biotechnology Information, NIH, United States) lectured on “PubChem3D: A virtual screening platform.” PubChem3D is an extension of PubChem resources to include a 3D layer, providing users with new capabilities to search, subset, visualize, analyze, and download data. With the ability to uncover latent structure-activity relationships of chemical structures while complementing 2D similarity analysis approaches, PubChem3D represents a new resource for scientists to exploit when exploring the biological annotations in PubChem.

Sean Ekins (Collaborative Drug Discovery, United States) then presented “Dual-event machine learning models to accelerate drug discovery.” He described how cytotoxicity and bioactivity data were combined to produce dual-event Bayesian models (using Discovery Studio) for identifying compounds with activity against M. tuberculosis and a relative lack of cytotoxicity versus Vero cells. Over 38,000 compounds from different libraries were virtually screened, and 17 of 106 predicted hits were empirically shown to be active. In one example, a GSK antimalarials library was virtually screened and 5 of 7 predicted hits were active versus M. tuberculosis, leading to one molecule being tested in vivo. (slideshare).
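To make the "dual-event" idea concrete, the following is a minimal Python sketch of the labeling step only, not the Bayesian models built in Discovery Studio. It assumes a compound counts as a positive training example only when it is both active and relatively non-cytotoxic. The field names, the potency cutoff, and the selectivity cutoff are hypothetical values chosen for illustration and are not taken from the talk. The hit rates at the end use only the counts reported above.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mic_um: float   # hypothetical potency against M. tuberculosis (lower is better)
    cc50_um: float  # hypothetical Vero cell cytotoxicity (higher is better)


def dual_event_label(c, mic_cutoff=10.0, selectivity_cutoff=10.0):
    """Return True only when both events hold: active AND selective.

    Both cutoffs are illustrative assumptions, not values from the talk.
    """
    active = c.mic_um <= mic_cutoff
    selective = (c.cc50_um / c.mic_um) >= selectivity_cutoff
    return active and selective


def hit_rate(n_confirmed, n_tested):
    """Fraction of predicted hits that were confirmed experimentally."""
    return n_confirmed / n_tested


# Made-up compounds: potent and selective, potent but cytotoxic, inactive.
compounds = [
    Compound("A", mic_um=1.0, cc50_um=50.0),
    Compound("B", mic_um=2.0, cc50_um=5.0),
    Compound("C", mic_um=50.0, cc50_um=1000.0),
]
labels = [dual_event_label(c) for c in compounds]

overall_rate = hit_rate(17, 106)  # counts reported in the talk
gsk_rate = hit_rate(5, 7)         # GSK antimalarials example
```

With these assumed cutoffs only compound A is labeled positive. The reported counts correspond to hit rates of roughly 16% overall and 71% for the GSK library.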

Vladimir Poroikov (Orekhovich Institute of Biomedical Chemistry, Russian Federation) subsequently presented “Virtual high-throughput screening of novel pharmacological agents based on PASS predictions.” The robustness of the PASS algorithm for heterogeneous datasets has been widely demonstrated. PASS provides qualitative (yes/no) predictions of biological activity spectra for over 4000 biological activities.

Finally, Simon Krige (Cresset Biomolecular Discovery, United Kingdom) discussed “How GPUs can find your next hit: Accelerating virtual screening with OpenCL.” He described how OpenCL-based screening runs about 40 times faster on a GPU than on a CPU and is 25 times cheaper. For example, a single NVIDIA or AMD graphics card delivers the same screening performance as more than 40 modern CPU cores. Such a dramatic speed increase means that a few million compounds can be screened overnight on a single desktop box with 4 GPUs rather than on a Linux cluster. A small cluster equipped with GPU coprocessors can be used to screen virtual libraries of tens or hundreds of millions of molecules. Such databases were previously accessible only to 2D methods.
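The figures quoted in this talk can be turned into a rough back-of-the-envelope estimate, sketched here in Python. The 40-core equivalence per GPU comes from the talk, where it was given as "more than 40", so the result is a lower bound. The library size of 3 million compounds and the 12-hour "overnight" window are assumptions chosen only to illustrate the arithmetic.

```python
def cpu_core_equivalents(n_gpus, cores_per_gpu=40):
    """CPU cores matched by n_gpus, using the talk's ~40 cores per GPU."""
    return n_gpus * cores_per_gpu


def required_rate_per_gpu(n_compounds, hours, n_gpus):
    """Compounds per second each GPU must process to finish in time."""
    return n_compounds / (hours * 3600) / n_gpus


# A single desktop box with 4 GPUs, as in the talk.
desktop_cores = cpu_core_equivalents(4)

# Assumed workload: 3 million compounds in a 12-hour overnight run.
rate = required_rate_per_gpu(3_000_000, hours=12, n_gpus=4)
```

Under these assumptions the 4-GPU desktop stands in for at least 160 CPU cores, and each GPU would need to process about 17 compounds per second.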

These presentations by virtual screening software developers and scientists involved in applications of these technologies suggest that virtual high-throughput screening may be increasingly utilized to accelerate drug discovery efforts.

Sean Ekins and Joel Freundlich, Symposium Organizers