2010 Herman Skolnik Award Symposium

The Herman Skolnik Award recognized Anton J. “Tony” Hopfinger’s research and service to the fields of cheminformatics and of computational chemistry and biochemistry. Tony is perhaps best known as the father of multi-dimensional QSAR, including the 4D-QSAR and Molecular Shape Analysis methodologies. More recently, Tony introduced a pseudo structure-based method, Membrane-Interaction (MI) QSAR analysis, to estimate ADME-Tox properties involving membrane transport processes.

The Herman Skolnik Award Symposium honoring Tony Hopfinger at the Boston ACS meeting provided a great overview of how Tony’s work, past and present, is bringing QSAR and molecular modeling together into a single body of ideas rather than leaving them to evolve as separate fields. The morning session of the Symposium focused on the cheminformatics aspects of combining molecular modeling and simulation to extend and refine the molecular descriptors used in the QSAR paradigm.

The first speaker was Emilio Xavier Esposito of exeResearch, LLC. Emilio’s talk focused on the physico-chemical features of polyphenols, with the aim of gaining a better understanding of antioxidants. The dominant physical feature of antioxidants is the phenol group; polyphenols, according to Alton Brown. The proposed antioxidant-tyrosinase mechanism, based on a series of experimentally determined mushroom tyrosinase structures, provides insight into the molecular interactions that drive the reaction. While the enzyme structures illustrate the molecular interactions important for tyrosinase inhibition, they do not always clarify what makes a good inhibitor or how the reaction proceeds. Binary QSAR models were constructed to indicate the important antioxidant molecular features. Emilio explored models built from fingerprints (MACCS keys), traditional molecular descriptors (2D and 2½D), VolSurf-like molecular descriptors (3D), and molecular dynamics-based descriptors (4D-Fingerprints), and discussed the relationship between the polyphenols' biologically relevant molecular features, as determined by each set of descriptors, and their antioxidant abilities.

The next speaker was Jürgen Bajorath of the University of Bonn. Although Jürgen was ill and unable to travel to Boston, he gave his talk, “Engineering and 3D protein-ligand interaction scaling of 2D fingerprints,” via Skype. In this talk he discussed the refinement and advancement of molecular descriptors for SAR analysis. While fingerprints have long been the preferred descriptors for similarity searching and SAR studies, the standard fingerprints typically have a constant bit string format and are used as individual database search tools. However, by applying “engineering” techniques such as “bit silencing,” fingerprint reduction, and “recombination,” standard fingerprints can be tuned in a compound class-directed manner and converted into size-reduced versions with higher search performance. It is also possible to combine preferred bit segments from fingerprints of distinct design and generate “hybrids” that exceed the search performance of their parental fingerprints. Furthermore, effective 2D fingerprint representations can be generated from strongly interacting parts of ligands in complex crystal structures. These “interacting fragment” fingerprints focus search calculations on pharmacophore elements without the need to encode interactions directly. Moreover, 3D protein-ligand interaction information can implicitly be taken into account in 2D similarity searching through fingerprint scaling techniques that emphasize characteristic bit patterns.
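As a rough illustration of the bit-silencing idea described above, the following Python sketch compares two fingerprints by Tanimoto similarity before and after a set of bit positions is switched off. The fingerprints, the choice of silenced bits, and the function names are hypothetical and chosen only for illustration; the talk's actual procedure for selecting which bits to silence is not reproduced here.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient for fingerprints given as sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def silence(fp: set, silenced: set) -> set:
    """Return a copy of the fingerprint with the silenced bit positions turned off."""
    return fp - silenced


# Hypothetical toy fingerprints (on-bit positions), not real MACCS keys.
reference = {1, 4, 7, 9, 12}
candidate = {1, 4, 7, 15, 20}

# Assumption: these bits were judged to carry no class-specific signal.
noisy_bits = {9, 12, 15, 20}

before = tanimoto(reference, candidate)
after = tanimoto(silence(reference, noisy_bits), silence(candidate, noisy_bits))

print(f"Tanimoto before silencing: {before:.3f}")
print(f"Tanimoto after silencing:  {after:.3f}")
```

In this toy case the two compounds share only their class-characteristic bits, so removing the remaining bits raises their similarity; in practice the silenced set would have to be derived from data for a given compound class.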

The next speaker of the morning session was Y. Jane Tseng of the National Taiwan University. Jane discussed her group’s recent work on the development of in silico binary QSAR models for the prediction of a compound’s potential to block the human ether-a-go-go related gene (hERG) ion channel. The blockage of the hERG potassium ion channel is a major factor related to cardiotoxicity. Hence, binding to this channel has become an important biological endpoint in side effects screening. A structurally diverse hERG data set of 250 compounds was used to construct a set of two-state hERG QSAR models. The descriptor pool used to construct the models consisted of 4D-fingerprints generated from the thermodynamic distribution of conformer states available to a molecule, 204 traditional 2D descriptors, and 76 3D VolSurf-like descriptors computed using the Molecular Operating Environment (MOE) software. One model is a continuous partial least squares (PLS) QSAR hERG binding model. Another related model is an optimized binary QSAR model that classifies compounds as active or inactive. This binary model achieves 91% accuracy over a large range of molecular diversity spanning the training set. An external test set was constructed from the condensed PubChem bioassay database containing 816 compounds and successfully used to validate the binary model. The binary QSAR model permits a structural interpretation of possible sources for hERG activity. In particular, the presence of a polar negative group at a distance of 6 to 8 Å from a hydrogen bond donor in a compound is predicted to be a quite structure-specific pharmacophore that increases hERG blockage.
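The distance-based pattern mentioned at the end of the paragraph above can be sketched as a simple geometric check. The Python example below tests whether any polar negative feature lies 6 to 8 Å from a hydrogen bond donor in a single conformer. The feature labels, coordinates, and function name are assumptions made for illustration; the models in the talk were built from 4D-fingerprints over conformer ensembles, which this single-conformer check does not attempt to reproduce.

```python
import math
from itertools import product


def has_feature_pair(features, type_a, type_b, d_min, d_max):
    """Return True if any type_a/type_b feature pair lies within [d_min, d_max] Å."""
    a_points = [xyz for kind, xyz in features if kind == type_a]
    b_points = [xyz for kind, xyz in features if kind == type_b]
    return any(
        d_min <= math.dist(p, q) <= d_max for p, q in product(a_points, b_points)
    )


# Hypothetical feature list for one conformer: (feature type, (x, y, z) in Å).
conformer = [
    ("polar_negative", (0.0, 0.0, 0.0)),
    ("hbond_donor", (7.0, 0.0, 0.0)),
    ("hbond_donor", (2.5, 1.0, 0.0)),
]

flagged = has_feature_pair(conformer, "polar_negative", "hbond_donor", 6.0, 8.0)
print("6-8 Å polar negative / donor pattern present:", flagged)
```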

Robert D. Clark of Simulations Plus, Inc. in Lancaster, California, gave an informative talk that highlighted the challenge of evaluating pharmacophore model performance. He likened the problem to having the ability to tell “the good from the bad and the ugly.” Pharmacophore models are useful when they provide qualitative insight into the interactions between ligands and their target macromolecules, and are therefore in many ways more akin to molecular simulations than to quantitative structure-activity relationships (QSARs) based on the partition of activity across a set of molecular descriptors. When the performance of a pharmacophore model is assessed quantitatively, it is usually in terms of its ability to recover known ligands or, less often, in terms of how well it distinguishes ligands from non-ligands. This status as a classification technique also sets it apart from more numerical QSAR methods, in part because of fundamental differences in what being "good" means. Carefully defining what "good" classification is, however, can make creative combination with other techniques a productive way to capture the value of their intrinsic complementarity.
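The two ways of scoring a pharmacophore model mentioned above, recovery of known ligands and discrimination between ligands and non-ligands, can be made concrete with standard classification metrics. The Python sketch below computes recall, specificity, and the Matthews correlation coefficient from hypothetical screening results. These particular metrics are an assumption made for illustration, not necessarily the ones advocated in the talk.

```python
import math


def confusion_counts(predicted, actual):
    """Count true/false positives and negatives for boolean hit lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    return tp, tn, fp, fn


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 if any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical screen: 4 known ligands followed by 6 non-ligands.
actual = [True] * 4 + [False] * 6
predicted = [True, True, True, False, True, False, False, False, False, False]

tp, tn, fp, fn = confusion_counts(predicted, actual)
recall = tp / (tp + fn)          # fraction of known ligands recovered
specificity = tn / (tn + fp)     # fraction of non-ligands correctly rejected
mcc = matthews_cc(tp, tn, fp, fn)

print(f"recall={recall:.2f} specificity={specificity:.2f} MCC={mcc:.2f}")
```

A model can score well on recall alone simply by matching almost everything, which is one reason the choice of metric shapes what "good" means.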

The afternoon session moved from the cheminformatics aspects to the molecular modeling and simulation side of the QSAR paradigm. In this session, speakers described how their research efforts combine molecular modeling and QSAR techniques.

Curt M. Breneman of Rensselaer Polytechnic Institute kicked off the afternoon with an intriguing look at using QSAR approaches to learn from protein crystal structures. In practice, there is no inherent disconnect between the descriptor-based cheminformatics methods commonly used for predicting small molecule properties and those that can be used to understand and predict protein behaviors. Examples of such connections include the development of predictive models of protein/stationary phase binding in HIC and ion-exchange chromatography, protein/ligand binding mode characterization through PROLICSS analysis of crystal structures, and the use of PESD binding site signatures for pose scoring and predicting off-target drug interactions. In all of these cases, models were created using descriptors based on protein electronic and structural features and modern machine learning methods that include model validation tools and domain of applicability assessment metrics.

William L. Jorgensen of Yale University discussed his research group’s success designing novel HIV reverse transcriptase inhibitors. Drug development is being pursued through computer-aided structure-based design. For de novo lead generation, the BOMB program builds combinatorial libraries in a protein binding site using a selected core and substituents, and QikProp is applied to filter all designed molecules to ensure that they have drug-like properties. Monte Carlo/free-energy perturbation simulations are then executed to refine the predictions for the best scoring leads including ca. 1000 explicit water molecules and extensive sampling for the protein and ligand. FEP calculations for optimization of substituents on an aromatic ring and for choice of heterocycles are now common. Alternatively, docking with Glide is performed with the large databases of purchasable compounds to provide leads, which are then optimized via the FEP-guided route. Successful application has been achieved for HIV reverse transcriptase, FGFR1 kinase, and macrophage migration inhibitory factor (MIF); micromolar leads have been rapidly advanced to extraordinarily potent inhibitors.

José S. Duca of Novartis provided an overview of the evolution of structure-based drug design (SBDD) and nD-QSAR methods during his time as a postdoc in Tony’s lab. José discussed case studies in which QSAR and SBDD have worked in concert during the discovery process of pre-clinical candidates. The importance of incorporating time-dependent sampling to improve the quality of the nD-QSAR models (n=3,4) was discussed and compared to simplified low-dimensional QSAR models.

The Skolnik Award Symposium concluded with Tony stressing that QSAR analysis and molecular modeling/simulation methods can often be complementary and, when combined in a study, yield results greater than the sum of their parts. Modeling and simulation offer the ability to design custom, information-rich trial descriptors for a QSAR analysis. In turn, QSAR analysis is able to discern which of the custom descriptors most fully relate to the behavior of an endpoint of interest. Grid cell occupancy descriptors (GCODs) of 4D-QSAR analysis form one useful set of custom QSAR descriptors from modeling and simulation for describing ligand-receptor interactions. These descriptors characterize the relative spatial occupancy of all the atoms of a molecule over the set of conformations available to the molecule when in a particular environment. GCODs permit the construction of a 4D-QSAR equation for virtual screening, as well as a spatial pharmacophore of the 4D-QSAR equation for exploring mechanistic insight. Applications that can particularly benefit from combining QSAR analysis and modeling/simulation tools are those in which a model chemical system is needed to determine the sought-after property. One such application is the transport of molecules through biological compartments, an integral part of many ADMET properties. For example, the reliable estimation and characterization of the diffusion of organic compounds through cellular membranes is greatly enhanced by simulation modeling and by the subsequent extraction of properties from the simulation trajectories as custom descriptors to build a corresponding QSAR-based diffusion model. The key descriptors of the QSAR models, in turn, also permit the investigator to probe and postulate detailed molecular mechanisms of action.
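To make the idea of grid cell occupancy descriptors more tangible, the Python sketch below bins atom coordinates from a small conformational ensemble into cubic grid cells and reports, for each cell, the fraction of conformations in which it is occupied. The 1 Å cell size, the toy coordinates, and the function names are assumptions made for illustration. The sketch treats all atoms and conformations equally and omits steps, such as molecular alignment, that a real 4D-QSAR analysis would need.

```python
import math
from collections import Counter


def grid_cell(xyz, cell_size=1.0):
    """Map a 3D coordinate (in Å) to the integer index of its cubic grid cell."""
    return tuple(math.floor(c / cell_size) for c in xyz)


def occupancy_descriptors(conformations, cell_size=1.0):
    """Fraction of conformations in which each grid cell holds at least one atom."""
    counts = Counter()
    for atoms in conformations:
        # A cell counts once per conformation, however many atoms fall in it.
        occupied = {grid_cell(xyz, cell_size) for xyz in atoms}
        counts.update(occupied)
    n = len(conformations)
    return {cell: hits / n for cell, hits in counts.items()}


# Hypothetical ensemble: three conformations of a two-atom molecule.
ensemble = [
    [(0.2, 0.3, 0.1), (1.4, 0.2, 0.0)],
    [(0.4, 0.1, 0.2), (1.1, 1.3, 0.0)],
    [(0.3, 0.2, 0.3), (1.6, 0.4, 0.1)],
]

gcods = occupancy_descriptors(ensemble)
for cell, fraction in sorted(gcods.items()):
    print(cell, round(fraction, 3))
```

Each cell's occupancy value could then serve as one trial descriptor in a regression against the measured endpoint, with the fitting step selecting the cells that matter.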

Emilio Esposito, Co-organizer, 2010 Herman Skolnik Award Symposium