Vol. 63, No. 1: Spring, 2011

Chemical Information Bulletin

A Publication of the Division of Chemical Information of the ACS

Volume 63 No. 1 (Spring) 2011

 

David Martinsen, Editor
American Chemical Society
Publications Division
1155 16th Street NW
Washington, DC 20036 USA
d_martinsen@acs.org


ISSN: 0364-1910
Chemical Information Bulletin,
©Copyright 2011 by the Division of Chemical Information of the American Chemical Society.

Message from the Chair

It is a great honor to begin my tenure as Chair of the ACS Division of Chemical Information in 2011, the International Year of Chemistry. Yet while there is great excitement that comes with this momentous time for chemists worldwide, the continued malaise of the global economy engenders both uncertainty and hesitancy. It is in such uncertain times that participation in CINF makes all the more sense, whether it is for networking to find a new job or honing one's skills to keep a current job.

According to the U.S. Bureau of Labor Statistics, 70% of jobs are found through networking. Networking with colleagues in CINF and related ACS Divisions such as the Division of Computers in Chemistry (COMP) is a primary benefit of CINF membership and of attending CINF events. The strong scientific programs that have been assembled by current CINF Program Chair Rachelle Bienstock and her predecessor Rajarshi Guha (CINF Chair Elect) provide another powerful reason to attend CINF meetings. Another significant reason to attend CINF events is that its members (present company excluded) are simply delightful human beings: dedicated, motivated, intelligent, charming, unpretentious, and interesting people.

If you have not been involved in CINF events, I urge you to get involved. We start out with the Long Range Planning Meeting and breakfast on Saturday, March 26th, followed immediately by committee meetings until noon, a luncheon at noon, and the Executive Committee Meeting in the afternoon. There is a Welcoming Reception on Sunday, Harry’s Party on Monday, and the CINF Luncheon on Tuesday, followed by the CINF Reception Tuesday evening. Everyone is welcome to attend any or all of these events (except for the Executive Committee Meeting).

Finally, I must thank all the recent CINF Chairs for their tremendous help and support: Dave Martinsen, Svetlana Korolev, and Carmen Nitsche are simply amazing. That said, I owe the biggest debt to Carmen, CINF Past Chair. Her dedication to CINF is truly inspiring, and her gracious help and guidance over the past year have been greatly appreciated.

I look forward to seeing you in Anaheim!

Warmest regards,

Gregory M. Banik, Ph.D., Chair
ACS Division of Chemical Information (CINF)

Letter from the Editor

I would simply like to thank all those who made contributions to this issue of the Chemical Information Bulletin. Svetla Baykoucheva provided two thought-provoking interviews and Bob Buntrock contributed several book reviews. Our new division chair, Greg Banik, shared his thoughts on the year (welcome, Greg!). Also included are highlights from the technical program from our Program Chair, Rachelle Bienstock, as well as the technical program itself. Be sure to check out the product announcements from our sponsors, as well as the information about submissions for awards. Thanks to Mark Luchetti for the cover page design. Finally, I would also like to recognize the efforts of our webmaster, Danielle Dennie, who designed the templates for the e-CIB and flowed all of the content seamlessly into the site (well, at least it looked seamless to me).

Dave Martinsen
Guest Editor

Sponsors & Product Announcements

Here are some sponsor and product announcements.

CINF Sponsors - Spring 2011

The American Chemical Society Division of Chemical Information (CINF) is very fortunate to receive generous financial support from our sponsors. This support helps maintain the high quality of the Division’s programming, promotes communication between members at social functions at the ACS Spring 2011 National Meeting in Anaheim, CA, and supports other divisional activities during the year, including scholarships to graduate students in Chemical Information. The Division gratefully acknowledges contributions from the following sponsors:

Gold:

Silver:

Bronze:

Opportunities are available to sponsor Division of Chemical Information events, speakers, and material. Our sponsors are acknowledged on the CINF web site, in the Chemical Information Bulletin, on printed meeting materials, and at any events for which we use your contribution. Please feel free to contact me if you would like more information about supporting CINF.

Graham Douglas
Chair, Fundraising Committee
Email: Fundraising@acscinf.org
Tel: 510-407-0769

The ACS CINF Division is a non-profit tax-exempt organization (pdf) with taxpayer ID no. 52-6054220.

Accelrys Draw 4.0


Accelrys Draw 4.0, the latest release of Accelrys's chemical drawing application, is now available for download at no charge for academic and non-commercial personal use at www.symyx.com/getdraw. Accelrys Draw 4.0 features the following enhancements:

  • Biological sequence editor
  • Multi-tabbed user interface
  • Structure resolver - extends name-to-structure conversion to DiscoveryGate, ChemSpider, PubChem, and NCI/CADD web resources
  • View files with structure-based thumbnail images in Windows Explorer
  • Dynamic toolstrips
  • Customizable atom toolstrip
  • Automatic, customizable coloring of atom labels
  • Reading of ChemDraw CDX files
  • Enhanced stereochemistry labels

The no-fee download of Accelrys Draw 4.0 for academic and non-commercial personal use contains all the functionality in the commercial version, and it is available now. For more information, visit www.symyx.com/getdraw. (Note: these URLs will move to the Accelrys domain in the near future, but redirects will be put in place to maintain access.)

Contact Information:
Keith T Taylor PhD, MRSC
Advisory Product Manager, Chemistry
Accelrys, Inc.
2440 Camino Ramon, Suite 300
San Ramon, CA 94583
Ph: +1-925-543-7525
Cell: +1 209 221 9415
Fax: +1-925-543-7553

Reaxys & IDBS E-WorkBook


Reaxys and IDBS: working together to provide seamless environments for researchers

Elsevier and IDBS recently announced that Reaxys is now interoperable with the IDBS E-WorkBook Suite. This partnership creates a new mechanism that integrates the best-in-class content from journals and patents with documented proprietary scientific results. E-WorkBook users searching for relevant chemical data can now smoothly transition into Reaxys, the workflow solution that provides extensive information on chemical compounds, related physical and pharmacological properties, and synthesis information, and then save their data and findings in their workflow. With Reaxys available via E-WorkBook, a new group of researchers can now access this extensive repository of experimentally validated data.

"There is a genuine need for automatically integrating relevant chemistry information directly into the research process," said Neil Kipling, founder and CEO of IDBS. "Our partnership with Elsevier delivers essential chemical information to scientists just in time and at the point of use."

Mark van Mierle, Managing Director of Elsevier Information Systems GmbH, added: "Our customers want seamless, interoperable environments for their researchers. Our partnership with IDBS responds to customer needs and should further improve individual workflows and company productivity."

"Bringing information together in a federated search reduces the time to make decisions on which molecule to make next," said Robert Glen, Professor of Molecular Sciences Informatics at Cambridge University. "The integration of E-WorkBook and Reaxys provides an exciting new approach to improving productivity."

IDBS and Elsevier will continue to work closely together to provide additional innovative content-related functions to Reaxys and E-WorkBook to significantly improve how researchers interact with the world’s best data. For more information on Reaxys and IDBS, please visit our website.

Reaxys 2011 PhD Prize


The Reaxys 2011 PhD Prize is open: celebrating innovation and creativity in young chemists

Elsevier Properties SA recently announced that the 2011 Reaxys PhD Prize, a global competition for candidates currently studying for a PhD or having completed a PhD within the last 12 months, is open, with a final submission date of 28th February 2011. The prize will be awarded for original and innovative research in organic, organometallic and inorganic chemistry to the candidates who demonstrate excellence in methodology and approach in a peer-reviewed publication. Three prize winners will each receive a check for $2000 and be invited to present their research at the Winners’ Symposium, to be held during the 14th Asian Chemical Congress, 5 – 8 September, 2011 in Bangkok, Thailand.

David Evans PhD, Scientific Affairs Director at Elsevier Properties SA, said, "The Reaxys PhD Prize celebrates innovation and creativity in chemistry research from around the world, values which lie at the heart of Reaxys itself." He continued, "In 2010 we received over 300 submissions from around the world covering the breadth of modern chemistry including representatives from most of the leading chemistry universities. The quality of research was outstanding, and the finalists and winners are clearly at the cutting edge of chemistry research. A high bar has been set for 2011."

All entries will be evaluated by a review board of leading international chemists, chaired by the following members of the Reaxys Advisory Board:

  • Professor A. G. M. Barrett - Imperial College London, UK
  • Professor B. M. Trost - Stanford University, USA
  • Professor H. N. C. Wong - Chinese University of Hong Kong, China

Submissions are reviewed based upon originality, innovation, importance to the field, applicability, rigor of approach and publication quality. For more information, including submission details and requirements, please visit our website.

Reaxys is a registered trademark owned and protected by Elsevier Properties SA and used under license.

FIZ CHEMIE Chemisches Zentralblatt

FIZ CHEMIE has digitized the entire contents of the first and oldest abstracts journal published in the field of chemistry, the German Chemisches Zentralblatt. Beginning in 1830, approximately 900,000 page images with about 2 million abstracts cover 140 years of research progress in pharmaceutical science and chemistry. InfoChem, a software company for chemoinformatics, has applied advanced data mining technologies to the whole content and created a database that allows combined full-text, structure and substructure search throughout the page images. Thus, researchers are able to scan 140 years of scientific knowledge and patents published in the period from 1830 to 1969.

The Chemisches Zentralblatt Structure Database is offered optionally as a web application or as an in-house system. The web application is hosted on the InfoChem server. Access is provided on a licence basis. By purchasing the in-house solution, customers get the database together with the original pdf files to integrate them into their company systems. Customized solutions are offered as packages.

For more information, please visit our website or http://www.infochem.de/content/downloads/czb.pdf.

ACS Access to JCIM

Complimentary access to the Journal of Chemical Information and Modeling 2011 Sample Issue

ACS Publications invites you to explore the 2011 sample issue of the Journal of Chemical Information and Modeling, available online free of charge until the end of the year.
 
On Bibliometric Analysis of Chinese Research on Cyclization, MALDI-TOF, and Antibiotics: Methodological Concerns
 Petr Heneberg
Comments on "On Bibliometric Analysis of Chinese Research on Cyclization, MALDI-TOF, and Antibiotics: Methodological Concerns"
 Jiang Li and Peter Willett

Classifying Large Chemical Data Sets: Using A Regularized Potential Function Method
 Hamse Y. Mussa, Lezan Hawizy, Florian Nigsch, and Robert C. Glen
Cross-Target View to Feature Selection: Identification of Molecular Interaction Features in Ligand-Target Space
 Satoshi Niijima, Hiroaki Yabuuchi, and Yasushi Okuno
New Fragment Weighting Scheme for the Bayesian Inference Network in Ligand-Based Virtual Screening
 Ammar Abdo and Naomie Salim
Molecular Docking and Pharmacophore Filtering in the Discovery of Dual-Inhibitors for Human Leukotriene A4 Hydrolase and Leukotriene C4 Synthase
 Sundarapandian Thangapandian, Shalini John, Sugunadevi Sakkiah, and Keun Woo Lee
Would the Pseudocoordination Centre Method Be Appropriate To Describe the Geometries of Lanthanide Complexes?
 Danilo A. Rodrigues, Nivan B. da Costa, Jr., and Ricardo O. Freire
Transplant-Insert-Constrain-Relax-Assemble (TICRA): Protein-Ligand Complex Structure Modeling and Application to Kinases
 Siavash Meshkat, Anthony E. Klon, Jinming Zou, Jeffrey S. Wiseman, and Zenon Konteatis
Discovery of Chemical Compound Groups with Common Structures by a Network Analysis Approach (Affinity Prediction Method)
 Shigeru Saito, Takatsugu Hirokawa, and Katsuhisa Horimoto
Assessing the Performance of the MM/PBSA and MM/GBSA Methods. 1. The Accuracy of Binding Free Energy Calculations Based on Molecular Dynamics Simulations
 Tingjun Hou, Junmei Wang, Youyong Li, and Wei Wang
StructRank: A New Approach for Ligand-Based Virtual Screening
 Fabian Rathke, Katja Hansen, Ulf Brefeld, and Klaus-Robert Müller
Quantum Mechanics/Molecular Mechanics Strategies for Docking Pose Refinement: Distinguishing between Binders and Decoys in Cytochrome c Peroxidase
 Steven K. Burger, David C. Thompson, and Paul W. Ayers
Comments on the Article "Evaluation of pKa Estimation Methods on 211 Druglike Compounds"
 John C. Shelley, David Calkins, and Arron P. Sullivan
Calculation of the Solvation Free Energy of Neutral and Ionic Molecules in Diverse Solvents
 Sehan Lee, Kwang-Hwi Cho, Chang Joon Lee, Go Eun Kim, Chul Hee Na, Youngyong In, and Kyoung Tai No
Sequence, Structure, and Active Site Analyses of p38 MAP Kinase: Exploiting DFG-out Conformation as a Strategy to Design New Type II Leads
 Preethi Badrinarayan and G. Narahari Sastry
Rational Approaches for the Design of Effective Human Immunodeficiency Virus Type 1 Nonnucleoside Reverse Transcriptase Inhibitors
 Sergio R. Ribone, Mario A. Quevedo, Marcela Madrid, and Margarita C. Briñón
Importance of Receptor Flexibility in Binding of Cyclam Compounds to the Chemokine Receptor CXCR4
 Alfonso R. Lam, Supriyo Bhattacharya, Kevin Patel, Spencer E. Hall, Allen Mao, and Nagarajan Vaidehi
Automated Selection of Compounds with Physicochemical Properties To Maximize Bioavailability and Druglikeness
 Taiji Oashi, Ashley L. Ringer, E. Prabhu Raman, and Alexander D. MacKerell, Jr.
Bacterial Carbohydrate Structure Database 3: Principles and Realization
 Philip V. Toukach
CYANOS: A Data Management System for Natural Product Drug Discovery Efforts Using Cultured Microorganisms
 George E. Chlipala, Aleksej Krunic, Shunyan Mo, Megan Sturdy, and Jimmy Orjala
ThermoData Engine (TDE): Software Implementation of the Dynamic Data Evaluation Concept. 5. Experiment Planning and Product Design
 Vladimir Diky, Robert D. Chirico, Andrei F. Kazakov, Chris D. Muzny, Joseph W. Magee, Ilmutdin Abdulagatov, Jeong Won Kang, Kenneth Kroenlein, and Michael Frenkel

Free access to other ACS Sample Issues.
 
ACS Publications offers complimentary access to the first issue of the year for all 39 of its journals.
 

Thieme Chemistry

Thieme Chemistry has published highly evaluated information about synthetic and general chemistry for professional chemists and advanced students since 1909. Our portfolio of products includes the well-known journals SYNFACTS, SYNLETT and SYNTHESIS; the renowned synthetic methodology reference work Science of Synthesis; RÖMPP, the largest and most renowned chemical encyclopedia published in German; and a selected range of monographs. www.thieme-chemistry.com

RSC Publishing Platform

RSC Publishing Platform reaches the one million milestone


The one millionth publication to appear on the RSC Publishing Platform went online recently in a landmark achievement for the learned society. The seven figure milestone was reached as the RSC's exceptional range of peer-reviewed journals, magazines, books, databases and publishing services to the chemical science community more than doubled in output in the last three years.

Royal Society of Chemistry editorial director James Milne said: "This marks a significant landmark for the RSC Publishing Platform. Delivering the millionth record, a paper published in the journal Nanoscale, demonstrates not only the significance of the RSC in terms of disseminating high-quality research content worldwide but also with many millions of article downloads each year, the value researchers place on being able to access this content through our new publishing platform."

In the last four years RSC Publishing has gone from being the fifth largest publisher in chemistry to challenging Wiley in third place.

Read more about this growth at: http://www.rsc.org/AboutUs/News/PressReleases/2011/Million.asp

View the one millionth publication, "Controlled assembly of plasmonic colloidal nanoparticle clusters", at: http://pubs.rsc.org/en/Content/ArticleLanding/2011/NR/C0NR00804D

InfoChem's Chemisches Zentralblatt

InfoChem Launches Chemisches Zentralblatt Structure Database

At the end of 2010, InfoChem GmbH launched the structure searchable version of Chemisches Zentralblatt, a powerful new way of gaining information from an essential resource for chemists, researchers and intellectual property professionals.

Chemisches Zentralblatt is the first and oldest abstracts journal published in chemistry, covering the literature from 1830 to 1969 and describing the "birth" of chemistry as a science. Over the period of 140 years, Chemisches Zentralblatt has published 900,000 pages, containing two million abstracts. InfoChem was able to identify one million unique names and 500,000 unique structures in these documents. Now, the structure searchable database provides non-German speaking users with the opportunity to query this valuable source in the language of chemistry.

Using modern scanning technology, FIZ CHEMIE has digitized the entire content of Chemisches Zentralblatt. Then InfoChem produced the structure searchable database by applying specialized software tools for OCR, chemical named entity extraction and name to structure conversion. InfoChem used its exceptional skills and experience in German naming conventions to achieve optimal conversion results.

Chemisches Zentralblatt is available as a web-based application or as an in-house database. Scientists can search structures, substructures and full-text. Then from the hit list, users can link directly to the original page in Chemisches Zentralblatt containing the information. Applications may include preparative chemistry and prior art searches.

About InfoChem GmbH

Founded in 1989 and based in Munich (Germany), InfoChem has over 20 years' experience in the development and integration of sophisticated software tools for the storage and handling of structure and reaction information. For more information, please visit our website.

InfoChem is pleased to announce that we now have a representative in the UK. Dr Stephanie North has over 25 years' experience in chemical information within the pharmaceutical industry and is delighted to be working with the InfoChem team.

Contact address: PO Box 240, Royston, Hertfordshire SG8 1DA, UK; e-mail: sn@infochem.de.

 

National Meeting

ACS Chemical Information Division (CINF)
Spring, 2011 ACS National Meeting
Anaheim, CA (March 27-31)


Technical Program Highlights

Dr. Martin Walker has organized a very special symposium for this Spring Anaheim meeting in honor of his mentor, Dr. James Hendrickson: "Fifty Years of Computers in Organic Chemistry: A Symposium in Honor of James B. Hendrickson". Dr. Hendrickson, Professor Emeritus of Chemistry, Brandeis University, was a pioneer in the field of computer-aided organic synthesis design and one of its early visionaries. He designed the programs SYNGEN and WebReactions, and much current work in the field is built on the early work of his research group. Many of the successful students who trained with Dr. Hendrickson over his long career, or those whose work was built on ideas and concepts originating from Dr. Hendrickson's work, will be speaking in this symposium, including Dr. Paul A. Wender, Bergstrom Professor in Chemistry, Stanford University; Dr. Phil S. Baran, Professor, Scripps Research Institute; Dr. Valentina Eigner-Pitto, InfoChem GmbH; and Dr. Orr Ravitz, SimBioSys Inc.

CINF was ahead of the Golden Globe awards this year when Dr. Steve Bachrach and Dr. Henry Rzepa planned our symposium "Internet and Chemistry: Social Networking". Use of the internet is pervasive and this symposium will focus on how it can be effectively used to promote the exchange of chemical ideas and chemical information. The symposium will feature presentations by Dr. Peter Murray-Rust on "Collaborative Agile Internet Projects: The Green Chain Reaction" and Dr. Antony Williams, the developer of the successful and highly useful ChemSpider, as well as presentations on the CAS Registry by Dr. Roger Schenck, "Publishing and Consuming Scientific Literature in a Digital, Device Agnostic World" by Dr. David Martinsen (ACS) and OpenTox by Dr. D.A. Gallagher.

Dr. X. Simon Wang has organized a symposium on "Natural Products and Drug Discovery" which will feature some interesting talks on screening traditional Indonesian herbs by Dr. D. Barlow, and identifying antiviral leads from nature for common cold and flu treatment by Dr. J. M. Rollinger, as well as presentations on cheminformatic analysis of natural product data by Dr. Jose Medina-Franco’s group and data mining by Drs. Baker and Fourches with Dr. Alex Tropsha. Discoveries in the area of natural product cancer treatment will be presented by Dr. Lawrence Hurley, and natural product 5HT-1A inhibitors will be discussed by Dr. X. Simon Wang. A discussion on patenting traditional medicines from natural products will be presented as well by Drs. Zabliski and Schenck.

Drs. Maciej Haranczyk and Jose Medina-Franco have organized "Integration of Combinatorial Chemistry with Cheminformatics: Current Trends and Future Directions in Drug Discovery and Material Science". This symposium features presentations by Dr. Dimitris Agrafiotis and Dr. W. Zheng on combinatorial library design and presentations on high throughput screening by Dr. Peter Shenkin. There will also be presentations on managing combinatorial libraries by Dr. Carsten Detering and fragment based design by Dr. Meireles. We are thankful to the CSA Trust for cosponsoring, and to Drs. Irina Sens and Peter Rusch for organizing, “Open Data, Open Science, Open Knowledge”, featuring presentations on visual search in scientific research data by Dr. Sens, curated scientific data resources by Dr. C.R. Groom and Open Data by Peter Murray-Rust.

Leah Solla, Robert McFarland, and Norah Xiao have organized "Data Archiving, E-Science and Primary Data", which features presentations on Librarian 2.0 data management (Blanton-Kent), PubChem (S. Swamidass), hosting a computing centric resource for chemistry data by Tony Williams and data curation profiles by Jeremy Garritano.

Dr. Guenter Grethe has organized the selection and review of our student award posters for the CINF Scholarship for Scientific Excellence (sponsored by Accelrys), which will be presented in a poster session on Sunday evening, March 27th.

We will also have some interesting presentations in our general papers session on Wednesday morning, March 30th, and a small CINF poster session as part of the Sci Mix session late Monday evening.

The papers to be presented are too numerous to mention each by title and author, so I hope only to have given you a flavor of the rich variety of topics and material. The Spring 2011 Anaheim meeting is the first that I have organized as CINF program chairperson. I want to thank Dr. Rajarshi Guha, past program chair, for all his advice and assistance. With the capable assistance of all the symposium chairpersons, I think we have put together an interesting program which will cover the diverse interests of the CINF membership. Please come and experience it firsthand!

Rachelle Bienstock
Chair, Program Committee

Committee Meetings and Social Events

Saturday, March 26

Long Range Planning and Breakfast Meeting
Room 201 A, Anaheim Convention Center
7:30 AM – 9:00 AM

Program Committee Meeting
Room 201 B, Anaheim Convention Center
9:00 AM – 12:00 PM

Membership Committee Meeting
Room 202 B, Anaheim Convention Center
9:00 AM – 10:30 AM

Finance Committee Meeting
Room 202 A, Anaheim Convention Center
11:00 AM – 12:00 PM

Communications & Publications Committee
Room 203 A, Anaheim Convention Center
9:00 AM – 12:00 PM

CINF Fundraising Committee Meeting
Room 202 A, Anaheim Convention Center
10:00 AM – 11:00 AM

CINF Careers Committee Meeting
Room 202 B, Anaheim Convention Center
10:30 AM – 12:00 PM

CINF Education Committee Meeting
Room 203 B, Anaheim Convention Center
9:00 AM – 12:00 PM

CINF Awards Committee Meeting
Room 202 A, Anaheim Convention Center
9:00 AM – 10:00 AM

CINF Functionary Luncheon
Room 201 A, Anaheim Convention Center
12:00 PM – 1:00 PM

CINF Executive Committee Meeting
Room 201 B, Anaheim Convention Center
1:00 PM – 5:30 PM

 

Sunday, March 27

CINF-CSA Trust Group Meeting
Orangewood 3, Clarion Hotel Anaheim Resort
12:00 PM – 2:00 PM

CINF Sunday Welcoming Reception & CINF Scholarships for Scientific Excellence Posters
Ballroom A, Anaheim Convention Center
6:30 PM – 8:30 PM
Reception co-sponsored by ACS Publications, Bio-Rad Laboratories, CambridgeSoft, InfoChem & Thieme Chemistry
Scholarships for Scientific Excellence sponsored by Accelrys.

 

Monday, March 28

CINF Open Meeting
Room 204 A, Anaheim Convention Center
4:20 – 4:30 PM

ACS Publications/CAS Open Meeting
Room 204 A, Anaheim Convention Center
4:30 – 5:30 PM

Harry's Party
2nd Floor Suite
Sheraton Park Hotel at the Anaheim Resort
5:30 – 8:00 PM
Sponsored exclusively by FIZ CHEMIE Berlin
* Use ACS Shuttle #2

 

Tuesday, March 29

CINF Luncheon
Anaheim Marriott, Platinum Room 7
12:00 PM – 1:30 PM
Sponsored exclusively by RSC Publishing
* Ticketed event
Speaker: Richard Walter - Jack the Ripper, Unveiled
Richard Walter, co-author of Profiling Killers: A Revised Classification Model for Understanding Sexual Murder and co-founder of the Vidocq Society, is featured in "The Murder Room: The Heirs of Sherlock Holmes Gather to Solve the World's Most Perplexing Cold Cases" by Michael Capuzzo.

CINF Reception
Room 204 B, Anaheim Convention Center
6:30 PM – 8:30 PM
Reception hosted by the ACS Division of Chemical Information
* Cash bar

 

Wednesday, March 30

CINF-CIC Collaborative Working Group
Room 204 A, Anaheim Convention Center
12:00 PM – 5:00 PM

Technical Program Listing

CINF Symposia

R. Bienstock, Program Chair

SUNDAY MORNING

Section A
Anaheim Convention Center
213 C

50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson - Cosponsored by ORGN
M. Walker, Organizer, Presiding
9:00   Introductory Remarks.
9:10 1 James Hendrickson: A life-long quest for systematizing organic synthesis.
G. Grethe
10:00 2 Reaction classification, an enduring success story.
V. Eigner-Pitto, H. Kraut, H. Saller, H. Matuszczyk, P. Loew, G. Grethe
10:30   Intermission.
10:40 3 Back to the future of synthesis planning: How new technology and new resources revitalize the vision of computer aided synthesis design.
J. Law, M. Mirzazadeh, A. P. Cook, O. Ravitz, P. A. Johnson, A. Simon

SUNDAY AFTERNOON

Section B
Anaheim Convention Center
211 B

Integration of Combinatorial Chemistry with Cheminformatics: Current Trends and Future Directions in Drug Discovery and Material Science
J. Medina-Franco, Organizer
M. Haranczyk, Organizer, Presiding
1:00   Introductory Remarks.
1:05 4 Experimental design for high throughput materials development.
J. N. Cawse
1:30 5 High-throughput strategies for synthesis and characterization of metal-organic frameworks for CO2 capture.
K. Sumida
1:55 6 Combinatorial library design revisited: Finding new uses for old tools.
D. K. Agrafiotis, V. S. Lobanov
2:20 7 How to screen 10^14 cores per second.
P. S. Shenkin, K. P. Lorton
2:45   Intermission.
3:00 8 Synergies of combinatorial chemistry and fragment-based drug design for efficient generation of focused virtual libraries.
L. Meireles, G. Mustata, I. Bahar
3:25 9 Chemical library design: From diversity, similarity, and multicriterion optimization to a versatile cheminformatics content management system (CCMS).
W. Zheng
3:50 10 Six years of collaborative drug discovery in the cloud.
B. Bunin, S. Ekins, M. Hohman, K. Gregory, B. Prom, S. Ernst
4:15 11 Managing giant combinatorial chemistry spaces in silico.
C. Detering, H. Claussen, M. Lilienthal, C. Lemmen


Section A
Anaheim Convention Center
213 C

50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson - Cosponsored by ORGN
M. Walker, Organizer, Presiding
1:30 12 Toward the ideal synthesis: The role of step economy and function oriented synthesis in first-in-class approaches to HIV eradication, overcoming cancer resistance and treating Alzheimer's disease.
P. A. Wender
2:20 13 Aiming for the ideal synthesis.
P. S. Baran
3:10   Final introduction.
3:25 14 Half a century of computers in chemistry.
J. B. Hendrickson

SUNDAY EVENING

Section A

CINF Scholarship for Scientific Excellence - Financially supported by Accelrys
G. Grethe, Organizer
6:30 - 8:30
  15 Exhaustive docking protocol with SAR-based pose selection.
F. Klepsch, G. F. Ecker
  16 Comparison of weighted and unweighted consensus approaches in QSAR/QSPR.
D. Zhuang, A. Lee, R. Fraczkiewicz, M. Waldman, B. Clark, W. Woltosz
  17 When is chemical similarity significant? The statistical distribution of chemical similarity scores and its extreme values.
P. Baldi, R. J. Nasr
  18 Reaction prediction as ranking molecular orbital interactions.
M. A. Kayala, C. A. Azencott, J. H. Chen, P. Baldi
  19 Re-examining the tubulin-binding conformation of antitumor epothilones using QSAR and crystallographic refinement.
S. A. Johnson, A. J. Smith, J. P. Snyder, K. N. Houk
  20 Efficient core structure searches using various fingerprinting methodologies: Advantages, particularities and pitfalls.
S. M. Furrer, D. J. Wild
  21 DockingDB: A cyberinfrastructure for computer-aided drug design based on ChemDB.
P. M. Rigor

MONDAY MORNING

Section A
Anaheim Convention Center
207 C

Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
R. Bienstock, Organizer
X. Wang, Organizer, Presiding
8:30   Introductory Remarks.
8:35 22 Protein Fold Topology: Will it aid drug discovery or is it the reason natural products have drug properties?
R. J. Quinn, E. Kellenberger Abstract
9:05 23 Screening of herbs used in traditional Indonesian medicine for inhibitors of aldose reductase.
D. Barlow, S. Naeem, P. Hylands Abstract
9:35 24 Common cold and flu: Computational strategies for the identification of antiviral leads from nature.
J. M. Rollinger, J. Kirchmair, U. Grienke, D. Schuster, K. R. Liedl, M. Schmidtke Abstract
10:05   Intermission.
10:20 25 Chemoinformatic analysis of natural products: Towards the discovery of DNA methyltransferase inhibitors of natural origin.
J. Medina-Franco, F. López-Vallejo, R. Guha, A. Bender, D. Kuck, F. Lyko Abstract
10:50 26 Lessons from covalent inhibitor modeling.
O. Eidam, S. Bonazzi, S. Guttinger, J. Wach, I. Zemp, U. Kutay, K. Gademann Abstract

Section B
Anaheim Convention Center
202 A

Open Data Open Data-, Open Science-, Open Knowledge- Financially supported by Chemical Structure Association Trust
P. Rusch, Organizer
I. Sens, Organizer, Presiding
9:00   Introductory Remarks.
9:10 27 Open Data and the Panton Principles.
P. Murray-Rust Abstract Presentation
9:35 28 Making priors a priority.
M. D. Segall, A. Chadwick Abstract  Presentation (pdf)
10:00   Intermission.
10:10 29 Ensuring sustainability of a comprehensive and highly curated scientific data resource.
I. J. Bruno, C. R. Groom Abstract  Presentation (pdf)
10:35 30 Visual search in scientific research data.
I. Sens, O. Koepler Abstract

MONDAY AFTERNOON

Section A
Anaheim Convention Center
204 C

Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
X. Wang, Organizer
R. Bienstock, Presiding
1:30 31 Specific targeting of the G-quadruplex in the c-Myc promoter with ellipticine.
T. A. Brooks, V. Gokhale, R. Brown, L. H. Hurley Abstract
2:00 32 Exploring natural products for drug discovery by mining biomedical information resources.
N. Baker, N. Rice, D. Fourches, E. Muratov, A. Tropsha Abstract
2:30 33 In silico strategies in natural product research to combat inflammation and lifestyle diseases: Identification of FXR-inducing triterpenes from Ganoderma lucidum.
U. Grienke, J. Mihály-Bison, D. Schuster, D. Guo, B. R. Binder, G. Wolber, H. Stuppner, J. M. Rollinger Abstract
3:00   Intermission.
3:15 34 Discovery of natural product-derived 5HT-1A receptor binders by QSAR modeling of known inhibitors, virtual screening and experimental validation.
X. S. Wang Abstract
3:45 35 Traditional medicine patents lead to enhanced drug discovery derived from natural products.
J. Zabilski, R. Schenck Abstract

Section B
Anaheim Convention Center
204 A

Data Archiving, E-Science, and Primary Data
R. McFarland, N. Xiao, Organizers
L. Solla, Organizer, Presiding
1:30   Introductory Remarks.
1:40 36 Librarian2.0: Synthesizing data management and subject expertise.
B. Blanton-Kent, S. Lake, A. Sallans Abstract
2:05 37 Anatomy of a PubChem project.
S. Swamidass, B. Calhoun, M. Browning Abstract
2:30 38 Evolution of the University of Minnesota Libraries' approach to e-scholarship.
M. Lafferty, L. Johnston Abstract
2:55   Intermission.
3:05 39 Hosting a compound centric community resource for chemistry data.
A. J. Williams, V. Tkachenko, R. Kidd Abstract
3:30 40 Library data services in the social sciences: Lessons for science?
K. Peter Abstract
3:55 41 Using Data Curation Profiles (DCPs) as a means of raising data management awareness.
J. R. Garritano Abstract

MONDAY EVENING

Section A
Anaheim Convention Center
Hall B

Sci-Mix
R. Bienstock, Organizer
8:00 - 10:00 16. See previous listings.
  42 Synthesis of 3-halo-2-butanones.
J. Porter Abstract
  43 Visualizing molecule similarity.
K. Boda Abstract

TUESDAY MORNING

Section A
Anaheim Convention Center
204 A

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
8:25   Introductory Remarks.
8:30 44 Collaborative agile Internet projects: The Green Chain Reaction.
P. Murray-Rust, S. E. Adams, L. Hawizy, D. M. Jessop Abstract
9:10 45 Re-imagining scientific communication for the 21st century: Is chemistry low hanging fruit or the worst-case scenario?
C. Neylon Abstract
9:50 46 Quixote: An Internet project to build a distributed Open Knowledgebase for quantum chemistry.
P. Murray-Rust, J. Thomas, P. Echenique, J. Estrada, M. D. Hanwell, S. E. Adams, W. Phadungsukanan, L. Westerhoff Abstract
10:30   Intermission.
10:40 47 Catching the mobile wave.
S. M. Muskal Abstract
11:20 48 Chemistry in your pocket: Shrinking cheminformatics applications for mobile devices.
A. M. Clark Abstract

CINF LUNCHEON

Anaheim Marriott
Platinum Room 7
12:00 PM – 1:30 PM (Ticketed Event)

TUESDAY AFTERNOON

Section A
Anaheim Convention Center
204 A

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
1:30 49 chemicalize.org: Adding chemistry to Web pages and predicted data and links to structures.
A. Allardyce, A. Stracz, D. Bonniot, F. Csizmadia Abstract
2:10 50 Using Campus Guides for leveraging Web 2.0 technologies and promoting the chemistry and life sciences information resources.
S. Baykoucheva Abstract
2:50   Intermission.
3:00 51 How the web has weaved a web of interlinked chemistry data.
A. J. Williams Abstract
3:40 52 What is the Internet doing to chemistry and our brains?
S. Heller Abstract

WEDNESDAY MORNING

Section A
Anaheim Convention Center
204 B

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
8:30 53 Bridging the gap: Publishing and consuming the scientific literature in a digital, device-agnostic world.
D. P. Martinsen Abstract
9:10 54 Open access in chemistry: Information wants to be free?
J. Kuras, B. Vickery, D. Kahn Abstract
9:50   Intermission.
10:00 55 OpenTox: An open-source web-service platform for toxicity prediction.
D. A. Gallagher, B. Hardy, S. Chawla Abstract
10:40 56 CAS Registry: Maintaining the gold standard for chemical substance information.
R. Schenck, J. Zabilski Abstract
11:20 57 Evolution of the science journal and the chemical publication.
H. S. Rzepa Abstract

Section B
Anaheim Convention Center
201 C

General Papers
R. Bienstock, Organizer, Presiding
9:00   Introductory Remarks.
9:05 58 Collaborative QSAR analysis of Ames mutagenicity.
E. Muratov, D. Fourches, A. Artemenko, V. Kuz'min, G. Zhao, A. Golbraikh, P. Polischuk, E. Varlamova, I. Baskin, V. Palyulin, N. Zefirov, L. Jiazhong, P. Gramatica, T. Martin, F. Hormozdiari, P. Dao, C. Sahinalp, A. Cherkasov, T. Oberg, R. Todeschini, V. Poroikov, A. Zaharov, A. Lagunin, D. Filimonov, A. Varnek, D. Horvath, G. Marcou, C. Muller, L. Xi, H. Liu, X. Yao, K. Hansen, T. Schroeter, K. Muller, I. Tetko, I. Sushko, S. Novotarskyi, N. Baker, J. Reed, J. Barnes, A. Tropsha Abstract
9:25 59 How (not) to build a toxicity model.
A. C. Lee, R. Clark, M. Waldman, J. Chung, R. Fraczkiewicz, W. S. Woltosz Abstract
9:45 60 Metabolic site prediction using artificial neural network ensembles.
M. Waldman, R. Fraczkiewicz, J. Zhang, R. D. Clark, W. S. Woltosz Abstract
10:05 61 Withdrawn.
10:25   Intermission.
10:35 62 Use and results of using an online chemistry laboratory package in a large general chemistry course.
R. L. Nafshun Abstract
10:55 63 Reaction prediction as ranking molecular orbital interactions.
M. A. Kayala, C. A. Azencott, J. H. Chen, P. Baldi Abstract

WEDNESDAY AFTERNOON

Section A
Anaheim Convention Center
204 B

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
1:40 64 Automated semantic data embargo and publication by the CLARION project.
S. E. Adams, N. Day, J. Downing, B. Brooks, P. Murray-Rust Abstract
2:20 65 Chemical eCommerce.
K. Gubernator Abstract
3:00   Intermission.
3:10 66 Waiting on the Chemical Internet.
S. M. Bachrach Abstract
3:50 67 Rapid dissemination of chemical information for people and machines using Open Notebook Science.
J. Bradley, A. S. Lang Abstract

 

Abstracts

CINF Symposia

ACS Chemical Information Division (CINF)
Spring, 2011 ACS National Meeting
Anaheim, CA (March 27-31)

R. Bienstock, Program Chair

SUNDAY MORNING

Section A
Anaheim Convention Center
213 C

50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson - Cosponsored by ORGN
M. Walker, Organizer, Presiding
9:00   Introductory Remarks.
9:10 1 James Hendrickson: A life-long quest for systematizing organic synthesis.
G. Grethe
Self employed, 352, CA, United States

During his long academic tenure, James Hendrickson was interested in applying logic and systematic characterization of molecules and reactions to organic synthesis. Starting in the early 1970s, his work gradually evolved from a mathematical presentation of the structural and functional features of molecules and their reactions to the development of systematic signatures for organic reactions. In this presentation we will discuss the individual steps along the way, illustrated by examples. Some recent developments by other groups in the area of reaction classification will be mentioned.
10:00 2 Reaction classification, an enduring success story.
V. Eigner-Pitto, H. Kraut, H. Saller, H. Matuszczyk, P. Loew, G. Grethe
InfoChem GmbH, Munich, Germany; None, United States

Beginning in the late 1980s InfoChem started to develop a deep understanding of the storage and handling of chemical structure and reaction information. The first major project was the development of an electronic version of the printed abstract series “ChemInform” published by FIZ CHEMIE Berlin. Then in 1989 InfoChem acquired an exclusive license to a reaction database (SPRESI) of (initially) 2.3 million records. Since the reaction database management systems (REACCS and ORAC) commercially available at that time could not handle more than 500,000 records, InfoChem was forced to conceive a concept for the selection of meaningful subsets of SPRESI. Based on a high quality reaction center detection module, InfoChem's sophisticated reaction type classification application, “Classify”, remains unique to this day. This concept allowed the generation of widely used reaction type databases such as ChemReact (400,000 reaction types) and ChemSynth (100,000 reaction types). Classify also enables reaction type searching, and clustering of reaction databases, and, in particular, it is the only way of linking different reaction databases. The world's major vendors of chemical information have adopted this technology to enhance the reaction retrieval capabilities of their products. More recent developments at InfoChem have resulted in a processing tool for detecting name reactions in any reaction database, and the retrosynthesis tool ICSYNTH, both of which are based on the company's earlier fundamental work. This talk will briefly present the background and technology of these software modules and their efficient use in the field of modern reaction planning.
10:30   Intermission.
10:40 3 Back to the future of synthesis planning: How new technology and new resources revitalize the vision of computer aided synthesis design.
J. Law, M. Mirzazadeh, A. P. Cook, O. Ravitz, P. A. Johnson, A. Simon
SimBioSys Inc., Toronto, Ontario, Canada; School of Chemistry, University of Leeds, Leeds, United Kingdom

Sophisticated systems like LHASA and SYNGEN were regarded in the late 1980's as a great promise to the field of organic synthesis. Their intent, as Hendrickson stated, was “not to replace art ... but to show where real art lies”. Sparked by the introduction of retrosynthetic analysis, the newborn field of computer aided synthesis design proved that chemical perception and synthetic thinking can be formulated in an algorithmic fashion. However, the vision of routine use of such tools has not materialized, and research in that area came to a lull in the early 1990's. The major obstacle was the difficulty of generating high quality and up-to-date databases of synthetic transforms. We show how our retrosynthetic analysis system, ARChem, capitalizes on the advent of comprehensive reaction databases and the dramatic progress in computing capabilities to automatically generate expansive synthetic rule-sets, which pave the way to representation and application of synthetic strategies.

SUNDAY AFTERNOON

Section B
Anaheim Convention Center
211 B

Integration of Combinatorial Chemistry with Cheminformatics: Current Trends and Future Directions in Drug Discovery and Material Science
J. Medina-Franco, Organizer
M. Haranczyk, Organizer, Presiding
1:00   Introductory Remarks.
1:05 4 Experimental design for high throughput materials development.
J. N. Cawse
Cawse and Effect LLC, Pittsfield, MA, United States

High-throughput methods of chemical experimentation present a challenge to experimental planning. Experiments run in arrays of dozens to hundreds require rethinking of the classic methods of Design of Experiments. This talk will review the adaptation of classical methods and improvisation of new methods for high throughput systems. These methods are becoming more important as laboratories for chemistry and materials science are being equipped with the robots and high-speed analytical tools for the acceleration of research. In particular, the use of these methods for effective protection of a chemical patent will be discussed.
1:30 5 High-throughput strategies for synthesis and characterization of metal-organic frameworks for CO2 capture.
K. Sumida
Department of Chemistry, University of California, Berkeley, Berkeley, CA, United States; Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, United States

High-throughput methodologies are a tremendously versatile platform for the discovery of next-generation materials (metal-organic frameworks) for CO2 capture. However, the considerable impact that the reaction conditions employed in the synthetic step can have on the material properties results in a large number of synthetic trials, which result in tremendous quantities of data from powder X-ray diffraction and gas adsorption experiments. An ideal computational support system in this regard would allow rapid, automated identification of the highest performance materials, and provide feedback to the high-throughput synthetic step, such that the preparation of a material may be more rigorously optimized, and new target materials that might show high CO2 capture performance can be identified. Here, we discuss our overall progress towards this goal, and present a number of examples in which the system has been employed to discover the optimal synthetic conditions for the preparation of new metal-organic frameworks for CO2 capture.
1:55 6 Combinatorial library design revisited: Finding new uses for old tools.
D. K. Agrafiotis, V. S. Lobanov
Informatics, Johnson & Johnson Pharmaceutical Research & Development, LLC, Spring House, PA, United States

In the 15 years since our first publication on diversity analysis and library design, the field of combinatorial chemistry has traversed the entire length of the hype curve, from the initial excitement, to the peak of inflated expectations, to the trough of disillusionment, and finally to the plateau of productivity. Along the way, many of the tools that were originally developed for analyzing massive virtual libraries were either forgotten or adapted to the realities of modern pharmaceutical research. While the need to mine massive combinatorial libraries is no longer there, the tools have found a new life in supporting and automating smaller parallel synthesis efforts in lead generation and lead optimization. In this talk, we review some of these earlier technologies and describe their adaptation and integration in today's discovery workflows.
2:20 7 How to screen 10^14 cores per second.
P. S. Shenkin, K. P. Lorton
Schrodinger, New York, NY, United States

We describe Schrodinger's attachment-based core-hopping method and present results achieved using it. The method starts with a template compound in which core and side-chains are identified. The core is replaced by new cores from a library while maintaining side-chain positions as well as possible. No receptor is required, but if a docked pose is available, receptor interactions can be conserved. Several scores are computed. These include a synthesizability score as well as a score reflecting how well side-chain positions are maintained. A combination of GPU processing, multithreading, and automatic linker addition leads to an overall screening rate in excess of 1.0e14 unique cores per second.
2:45   Intermission.
3:00 8 Synergies of combinatorial chemistry and fragment-based drug design for efficient generation of focused virtual libraries.
L. Meireles, G. Mustata, I. Bahar
Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, United States

While combinatorial chemistry used to emphasize rapid synthesis and screening of large libraries of compounds, the current trend is to synthesize much smaller focused compound libraries. In this talk, we present our recently developed computational strategy that combines combinatorial chemistry and fragment-based drug design techniques, fragment linking and fragment growing, to generate focused virtual libraries more efficiently. Once combinatorial chemistry scaffolds are placed in the binding site, fragments can be grown from and/or linked to the scaffold side chains to maximize favorable interactions with the target protein. Different methods for placing the scaffold on the binding site will be discussed along with rules that are essential for effective filtering. One advantage offered by our strategy is that it can also be universally applied to design compounds that replicate onto combinatorial chemistry scaffolds the essential binding features of proteins, peptides and small molecules. The application of the methodology to designing inhibitors of c-Myc-Max protein interaction will be presented.
3:25 9 Chemical library design: From diversity, similarity, and multicriterion optimization to a versatile cheminformatics content management system (CCMS).
W. Zheng
Pharmaceutical Sciences, North Carolina Central University, Durham, North Carolina, United States

Combinatorial chemistry and high throughput screening research often involve the generation, storage and analysis of large datasets. These data are often complex and heterogeneous in nature. To enable the most efficient design of chemical libraries and biological assays, various computational methods have been developed in the past 15 years. More recent research in chemical genomics and systems chemical biology requires the integration of different data sources and computational tools. For example, target family- and pathway-based library design may require information about biological targets and pathways. These requirements call for an integrated system that can organize data, models and computational tools in a flexible and extensible fashion. In this talk, I will first briefly review some concepts for library design, and then describe our effort to develop a flexible cheminformatics content management system (with tagging, sharing as well as user uploading of data and tools).
3:50 10 Six years of collaborative drug discovery in the cloud.
B. Bunin, S. Ekins, M. Hohman, K. Gregory, B. Prom, S. Ernst
Collaborative Drug Discovery (CDD, Inc.), Burlingame, CA, United States

Collaborative Drug Discovery hosts a widely used drug discovery data cloud platform with advanced collaborative capabilities for distributed researchers. The CDD Vault, Collaborate, and Public together host private, collaborative (selectively shared), and public data spanning the competitive, precompetitive, and neglected disease domains including publicly disclosed collaborations with GlaxoSmithKline, Pfizer, and the Bill & Melinda Gates Foundation, as well as with hundreds of academic and biotech startup companies. CDD provides a novel, collaborative approach for integrating experimental and computational screening with distributed data collection, storage, visualization and analysis - balancing privacy-security with encouraging collaborations, when desired. Experiences will be shared with researchers using the “CDD Vault” - a secure, private industrial-strength database combining traditional drug discovery informatics (registration and SAR) with social networking capabilities. CDD Collaborate enables real-time collaboration by securely exchanging selected confidential data. Traditional drug discovery capabilities include the ability to import/export to Excel™ and sdfiles, Boolean queries for potency, selectivity, and therapeutic windows for small molecule enzyme, cell, and animal data, substructure and Tanimoto similarity search, physical chemical property search, as well as IC50 calculation/curve generation, heat-maps, and Z/Z' statistics for archived data (protocols, molecules, plates, hyperlinked files). CDD Public has unique, constantly growing drug discovery SAR content.
4:15 11 Managing giant combinatorial chemistry spaces in silico.
C. Detering, H. Claussen, M. Lilienthal, C. Lemmen
BioSolveIT, Sankt Augustin, NRW, Germany

We will introduce a method which catches the two aforementioned birds (chemical complexity and chemical universe) with one stone: by cleverly searching a fragment space on the fly without the need to enumerate compounds, the computational overhead is kept to a minimum, and thus, search times are low (minutes for 10^10 molecules). Secondly, if the fragment space is composed of the in-house available chemistry, results obtained are much more likely to be synthesizable, as the chemical reaction protocol is automatically delivered together with the hits. We will show a few validation cases from the industry, and look at the properties of one publicly available fragment space which contains 12 billion molecules.

Section A
Anaheim Convention Center
213 C

50 Years of Computers in Organic Chemistry: Symposium in Honor of James B. Hendrickson - Cosponsored by ORGN
M. Walker, Organizer, Presiding
1:30 12 Toward the ideal synthesis: The role of step economy and function oriented synthesis in first-in-class approaches to HIV eradication, overcoming cancer resistance and treating Alzheimer's disease.
P. A. Wender
Department of Chemistry, Stanford University, Stanford, CA, United States

Jim Hendrickson has had a major impact on how we think about synthesis. He was also an inspiring influence on my early career. Evolving from that time are programs in our group directed at the eradication of HIV (Science 2008, 649), overcoming resistant cancer, the major cause of chemotherapy failure (PNAS 2008, 12128), and novel strategies for treating Alzheimer's disease (Neurobiology of Disease 2009, 332). A major aspect of these programs is the singular importance of step economy in synthesis and how that can be achieved by computational analysis, new reactions and function oriented synthesis (Accounts 2008, 40). In this lecture we will show three case studies of how step economy provides a key to addressing major therapeutic challenges of our time.
2:20 13 Aiming for the ideal synthesis.
P. S. Baran
Department of Chemistry, Scripps Research Institute, La Jolla, CA, United States

Our laboratory is focused on the practical total synthesis of complex natural products such as alkaloids and terpenes by aiming to achieve the “ideal synthesis”. Hendrickson defined such a synthesis in 1975, stating: “The ideal synthesis creates a complex molecule ... in a sequence of only construction reactions involving no intermediary refunctionalizations, leading directly to the target, not only its skeleton but also its correctly placed functionality.” (JACS 1975, 97, 5784). In order to achieve this level of efficiency one must minimize superfluous refunctionalization steps such as protecting group and non-strategic redox chemistry. Such considerations require exquisite control of chemoselectivity by the invention of chemistry and logical frameworks to aid in the planning of such routes. This invention-oriented approach to total synthesis will be illustrated with several case studies from our laboratory.
3:10   Final introduction.
3:25 14 Half a century of computers in chemistry.
J. B. Hendrickson
Department of Chemistry, Brandeis University, Waltham, MA, United States

My half-century of chemistry and computers may be divided into three areas. The first was to calculate the lowest-energy conformations of the 6-10-membered cycloalkane rings, and then their pseudorotation energies, to assist in synthesis planning. The second area was to define a process to seek the optimal plans for efficient synthesis design. We developed a process to find just the few shortest synthesis routes to any input target structure and this has resulted in the SynGen program. This effort led to the third area, the development of a general system to afford a unique, linear string to describe any organic reaction, defined by its input reactant and product structures, irrespective of mechanism or number of operational steps in the reaction. This has afforded a program to assign a unique “signature” for any given reaction and has the important feature of providing searchable indexing for any reaction database.

SUNDAY EVENING

Section A

CINF Scholarship for Scientific Excellence Financially supported by Accelrys
G. Grethe, Organizer
6:30 - 8:30
15 Exhaustive docking protocol with SAR-based pose selection.
F. Klepsch, G. F. Ecker
Department of Medicinal Chemistry, University of Vienna, Vienna, Austria

The polyspecific nature of the transmembrane drug efflux pump P-glycoprotein (P-gp) represents a great impediment for standard docking protocols. Furthermore, a large (~6000 Å³) transmembrane binding cavity consisting of several binding sites, the high flexibility of P-gp, and the lack of structural information render the correct ranking of docking poses a quite challenging task. Thus, we present a docking protocol that combines exhaustive conformational sampling of propafenone-type P-gp inhibitors with common scaffold clustering and SAR-based pose selection. The resultant binding hypotheses are in agreement with experimental data, which strengthens the validity of this approach. Analogous protocols were applied to other membrane proteins, like the GABA-A receptor and the serotonin transporter. We acknowledge financial support provided by the Austrian Science Fund, grant F03502.
16 Comparison of weighted and unweighted consensus approaches in QSAR/QSPR.
D. Zhuang, A. Lee, R. Fraczkiewicz, M. Waldman, B. Clark, W. Woltosz
Life Science, Simulations Plus, Inc, Lancaster, CA, United States

Two flavors of making consensus categorical predictions in QSAR/QSPR, the 'unweighted consensus' and the 'weighted consensus' approaches, were compared with several datasets using ADMET Predictor(TM). While the unweighted method gives equal weight to every member model, the weighted method implicitly assigns different weights to the outcomes of its member models. To find out if there is any benefit of using one approach over the other, we constructed several datasets, which have different structural characteristics (balanced, imbalanced, diverse, non-diverse, etc.), and built predictive models from them. The performances of the two approaches on these datasets were compared head-to-head using a paired t-test. Our results show that the performances of the two approaches on the selected datasets are statistically equal, and thus in general there is no clear advantage of using one approach over the other. Possible reasons for the observation will be discussed.
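Editorial illustration: the distinction between the two consensus schemes can be sketched in a few lines of Python. This is a minimal sketch only, not the method used in ADMET Predictor(TM); the function names, the explicit weight values, and the tie-breaking rule are all assumptions made for illustration (the abstract describes the weighting as implicit).

```python
from typing import Sequence


def unweighted_consensus(votes: Sequence[int]) -> int:
    """Majority vote over binary (0/1) member-model predictions.

    Every model counts equally; ties are resolved in favor of class 1
    (an arbitrary choice made for this sketch).
    """
    return int(2 * sum(votes) >= len(votes))


def weighted_consensus(votes: Sequence[int], weights: Sequence[float]) -> int:
    """Weighted vote: each model's prediction counts in proportion to its weight.

    The weights here are hypothetical, e.g. a per-model validation accuracy.
    """
    if len(votes) != len(weights):
        raise ValueError("votes and weights must have the same length")
    total = sum(weights)
    positive = sum(w for v, w in zip(votes, weights) if v == 1)
    return int(2 * positive >= total)


if __name__ == "__main__":
    votes = [1, 0, 0]            # three member models, one predicts class 1
    weights = [0.9, 0.3, 0.3]    # hypothetical weights favoring the first model
    print(unweighted_consensus(votes))         # 0: two of three models say 0
    print(weighted_consensus(votes, weights))  # 1: the heavily weighted model wins
```

The two schemes disagree only when a minority of models carries most of the weight, which is one plausible reason their overall performance can turn out statistically indistinguishable.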
17 When is chemical similarity significant? The statistical distribution of chemical similarity scores and its extreme values.
P. Baldi, R. J. Nasr
Department of Computer Science, University of California, Irvine, Irvine, CA, United States

As repositories of chemical molecules continue to expand and become more open, it becomes increasingly important to develop tools to search them efficiently and assess the statistical significance of chemical similarity scores. Here, we develop a framework for modeling, predicting, and approximating the distributions of chemical similarity scores and their extreme values in large databases. From the distributions of the scores and their analytical forms, Z-scores, E-values, and p-values are derived to assess the significance of similarity scores. In addition, the framework also allows one to predict the value of standard chemical retrieval metrics, such as sensitivity and specificity at fixed thresholds, or receiver operating characteristic (ROC) curves at multiple thresholds, and to detect outliers in the form of atypical molecules. Numerous and diverse experiments that have been performed, in part with large sets of molecules from the ChemDB, show remarkable agreement between theory and empirical results.
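Editorial illustration: the idea of scoring the significance of a similarity value can be sketched as follows. This is a simplified sketch under stated assumptions, not the authors' framework: it uses Tanimoto similarity on fingerprints represented as sets of on-bit indices, and an empirical Z-score against a sample of background scores rather than the analytical distributions described in the abstract. The function names and example data are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def z_score(score: float, background: Sequence[float]) -> float:
    """Empirical Z-score of a similarity score relative to background scores.

    The background would typically be the scores of the query against a
    large random sample of database molecules (an assumption of this sketch).
    """
    sigma = pstdev(background)
    if sigma == 0:
        raise ValueError("background scores have zero variance")
    return (score - mean(background)) / sigma


if __name__ == "__main__":
    query = {1, 2, 3, 4}
    hit = {2, 3, 4, 5}
    score = tanimoto(query, hit)          # 3 shared bits out of 5 in the union
    background = [0.1, 0.2, 0.3]          # hypothetical background similarity scores
    print(score, z_score(score, background))
```

A large Z-score indicates that the observed similarity is far above what is typical for the background sample; converting it to a p-value or E-value requires a model of the score distribution, which is the subject of the talk.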
18 Reaction prediction as ranking molecular orbital interactions.
M. A. Kayala, C. A. Azencott, J. H. Chen, P. Baldi
Department of Computer Science, University of California, Irvine, Irvine, CA, United States

Being able to predict the course of chemical reactions is essential to the practice of chemistry. While computational approaches to this problem have been extensively studied in the past, a fast, accurate, and scalable solution has yet to be described. Here, we propose a novel formulation of reaction prediction as a machine learning ranking problem: given a set of molecules and a description of conditions, learn a ranking over potential filled-to-unfilled molecular orbital (MO) interactions approximating the corresponding transition state energy ranking. Using an existing rule-based expert system (ReactionExplorer), we derive a restricted chemistry dataset consisting of 1300 full multi-step reactions with 2200 distinct starting materials and intermediates. This yields 3600 predicted MO interactions and 14 million unpredicted MO interactions. A two-stage machine learning scheme is used to learn the model. First, we train reactive site predictors using a combination of topological and real-valued global features to filter out 61% and 44% of non-predicted filled and unfilled MOs with a 0.0001% error rate. Then various ranking models are trained on the MO interactions using features engineered to approximate transition state entropy and enthalpy. Using cross-validation, current best models recover a perfect-ranking 61% of the time and recover a within-4-ranking 95% of the time.
19 Re-examining the tubulin-binding conformation of antitumor epothilones using QSAR and crystallographic refinement.
S. A. Johnson, A. J. Smith, J. P. Snyder, K. N. Houk
Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, United States; Department of Chemistry, Emory University, Atlanta, GA, United States

Several different bioactive conformations of epothilones, potent anti-tumor compounds, have been reported in the literature. We proposed to provide additional support to one of these conformations using a QSAR-based approach. By assuming a common pharmacophore for a set of epothilone analogs, we clustered conformations of these analogs using dihedral angles responsible for orienting functional groups with known SAR effects. We identified clusters common among the most active compounds, and developed simple QSAR models that relate the experimental IC50 values to the conformational strain energy. The resulting epothilone conformer that minimizes strain energy in the active epothilone analogs is different from previously proposed conformers. This conformation demonstrates good agreement when refined in the experimental electron crystallographic density for tubulin-bound epothilone.
20 Efficient core structure searches using various fingerprinting methodologies: Advantages, particularities and pitfalls.
S. M. Furrer, D. J. Wild
School of Informatics and Computing, Indiana University, Bloomington, IN, United States; Science & Technology, Givaudan Flavors Corp, Cincinnati, OH, United States

The complexity of medicinal chemistry patent applications, as well as the number of compounds enumerated as examples, has increased spectacularly in recent years. Finding the structures of major interest using traditional methods is often a difficult task. Molecular fingerprinting methods are excellent tools to rapidly organize chemical information. Different fingerprinting methods, however, represent structural characteristics in different ways. Multiple fingerprinting methodologies were evaluated in their capacity to differentiate and isolate core compounds in chemical patents. It was found that the fingerprint designs as well as medicinal chemistry approaches have a significant impact on the overall performance: different tools shed different "lights" over the molecular landscape. Modal fingerprints were investigated to focus on core compounds in patents, through a relative over-expression of co-occurring molecular features. Concrete examples will be given based on several major patent cases.
21 DockingDB: A cyberinfrastructure for computer-aided drug design based on ChemDB.
P. M. Rigor
School of Information and Computer Sciences, University of California in Irvine, Irvine, CA, United States

Although there are several open-source and commercially available computational tools for virtual high-throughput drug screening -- including DOCK, Autodock and Schroedinger's Maestro -- there is still a lack of a more general, tool-agnostic and scalable framework that is able to leverage the advantages offered by readily available docking and molecular dynamics programs in a high-performance computing (HPC) environment. We have developed a cyber-infrastructure built on top of an HPC pipeline and existing proteomics and chemical informatics tools -- such as ChemDB and SCRATCH -- to support an iterative computer-aided drug design methodology. We have applied our approach to two biological problems and describe preliminary results. Moreover, growing extensions to the pipeline and related tools are discussed.

MONDAY MORNING

Section A
Anaheim Convention Center
207 C

Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
R. Bienstock, Organizer
X. Wang, Organizer, Presiding
8:30   Introductory Remarks.
8:35 22 Protein Fold Topology: Will it aid drug discovery or is it the reason natural products have drug properties?
R. J. Quinn, E. Kellenberger
Eskitis Institute, Griffith University, Brisbane, Queensland, Australia; Université de Strasbourg, Illkirch, France

Natural products are made by nature through interacting with biosynthetic enzymes. Natural products also exert their effect as drugs by interaction with proteins. We have explored the question of whether the recognition of the natural product by biosynthetic enzymes translates to recognition of the therapeutic target. Molecular modeling of flavonoid biosynthetic enzymes and protein kinases with a series of natural product kinase inhibitors led to the development of the concept of Protein Fold Topology (PFT). PFT describes cavity recognition points unrelated to protein fold similarity. The topology or spatial properties are preserved even though there is deformation of the protein elements that participate in the protein-ligand interactions. We observe helices or β-sheets as equivalent in providing the invariant topology for protein-ligand interaction and, as such, are seeking to find automated methods to interrogate these interactions.
9:05 23 Screening of herbs used in traditional Indonesian medicine for inhibitors of aldose reductase.
D. Barlow, S. Naeem, P. Hylands
Pharmacy, King's College London, London, United Kingdom

Virtual screening of phytochemical constituents of herbs used in traditional Indonesian medicine has been performed to search for novel leads active against the enzyme aldose reductase (AR). The screening was performed using the docking software MolDock, and the activities (IC50s) of the docked compounds were predicted using an artificial neural network (ANN) trained using the crystallographic data for AR complexes involving inhibitors of known potency. The ANN gave a mean accuracy of ~98% for the activities of those compounds involved in the known protein crystal structures. The trained ANN was used to predict the IC50s for all carboxyl-containing compounds in the database of Indonesian herbal constituents, and the predicted IC50 values ranged from 17 nM to 118 mM. Selected hits were subsequently tested in vitro against human recombinant AR and, while some of these proved to be about as active as predicted, others proved significantly less potent than predicted.
9:35 24 Common cold and flu: Computational strategies for the identification of antiviral leads from nature.
J. M. Rollinger, J. Kirchmair, U. Grienke, D. Schuster, K. R. Liedl, M. Schmidtke
Institute of Pharmacy and Center for Molecular Biosciences, University of Innsbruck, Innsbruck, Austria; Institute of Theoretical Chemistry and Center for Molecular Biosciences, University of Innsbruck, Innsbruck, Austria; Institute of Virology and Antiviral Therapy, Friedrich Schiller University, Jena, Germany

The search for new drug leads against respiratory viruses remains an area of active investigation. In this regard, natural products offer tremendous potential as a source of antivirals. In our lab, several virtual screening campaigns on 3D natural product databases, such as pharmacophore searches, similarity-based approaches and docking, have proven to be highly efficient for the target-oriented identification of bioactive candidates. Integration of these heuristic approaches with empirical ones, like ethnopharmacology and in vitro extract screening, is a helpful strategy for prioritizing compounds to be isolated from natural sources and pharmacologically tested. Here we demonstrate the application of different in silico techniques for the discovery of new anti-rhinoviral and anti-influenza virus natural compounds using well defined molecular targets, such as the hydrophobic pocket in the rhinoviral capsid and the influenza virus neuraminidase.
10:05   Intermission.
10:20 25 Chemoinformatic analysis of natural products: Towards the discovery of DNA methyltransferase inhibitors of natural origin.
J. Medina-Franco, F. López-Vallejo, R. Guha, A. Bender, D. Kuck, F. Lyko
Torrey Pines Institute for Molecular Studies, Port St. Lucie, Florida, United States; NIH Chemical Genomics Center, Rockville, Maryland, United States; Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom; Division of Epigenetics, Deutsches Krebsforschungszentrum, Heidelberg, Germany

A comparative diversity analysis of natural products, drugs, the Molecular Libraries Small Molecule Repository (MLSMR), and combinatorial libraries is presented in this work. To this end, a multiple criteria strategy was employed including physicochemical properties, scaffolds and different fingerprints as molecular descriptors. The approach enabled a comprehensive analysis of property space coverage, the degree of overlap between collections, scaffold and structural diversity and overall structural novelty. Since several natural products contained in dietary products are implicated in the inhibition of DNA methyltransferases (DNMTs), which are emerging targets for the treatment of cancer, we conducted a docking-based virtual screening of a natural product database with a homology model of the catalytic domain of DNMT1. Herein we discuss the results of the virtual screening that represents a first step towards the systematic screening of compounds with natural origin targeting DNMTs.
10:50 26 Lessons from covalent inhibitor modeling.
O. Eidam, S. Bonazzi, S. Guttinger, J. Wach, I. Zemp, U. Kutay, K. Gademann
Chemical Synthesis Laboratory, EPFL, Lausanne, VD, Switzerland; Department of Pharmaceutical Chemistry, UCSF, San Francisco, CA, United States; Institut fur Biochemie, ETHZ, Zurich, ZH, Switzerland

Leptomycin B (LMB) has antifungal, antibacterial and anti-tumor activity and is an important “tool compound” in cell biology. It inhibits the export of certain proteins from the nucleus through specific alkylation of Cys528 of human CRM1. The recently published x-ray structure of CRM1 motivated us to model LMB to rationalize the activity of recently discovered LMB analogues. A manual modeling approach combined with all-atom energy minimizations was used. We found that modeling was largely guided by the structural environment, and steric and geometric restraints imposed both from the binding site and the ligand. Mechanistic considerations of covalent inhibitor binding highlight important residues in the binding site, and the internal energy of the ligand may play a crucial role in the binding mode of covalent inhibitors. Perhaps the most important lesson is that manual modeling can generate models useful for the design of future analogues.

Section B
Anaheim Convention Center
202 A

Open Data: Open Data, Open Science, Open Knowledge - Financially supported by Chemical Structure Association Trust
P. Rusch, Organizer
I. Sens, Organizer, Presiding
9:00   Introductory Remarks.
9:10 27 Open Data and the Panton Principles.
P. Murray-Rust
Department of Chemistry, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom

Although an increasing amount of chemical data is becoming visible on the Internet, it cannot be re-used without explicit permission to avoid potentially breaking copyright. The Open Knowledge Foundation and Science Commons have collaborated on a definition of Open Data and produced a set of principles and practices (Panton Principles) to help authors and publishers assert that their published data is truly Open. An example of fully Open Data is shown in Crystaleye http://wwmm.ch.cam.ac.uk/crystaleye with over 200,000 crystallographic datasets from the literature. Several publishers are adopting Panton, and this presentation will show the advantages of doing so.

Presentation

9:35 28 Making priors a priority.
M. D. Segall, A. Chadwick
Optibrium Ltd., Cambridge, United Kingdom; Tessella plc., Burton upon Trent, Staffs, United Kingdom

When we build a predictive model of a drug property we rigorously assess its predictive accuracy, but we are rarely able to address the most important question, “How useful will the model be in making a decision in a practical context?” To answer this requires an understanding of the prior probability distribution and hence prevalence of negative outcomes due to the property. We will illustrate the importance of the prior to assess the utility of a model to select or eliminate compounds for further investigation. A better understanding of the prior probabilities of adverse events due to key factors will improve our ability to make good decisions in drug discovery, finding higher quality molecules more efficiently. As the data necessary to estimate these priors does not include proprietary compound structures, this presents an opportunity for collaboration to improve the basis for good decision-making for all.

Presentation (pdf)
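Editorial illustration of the point made in the abstract above (not the authors' code): by Bayes' theorem, the probability that a compound flagged by a model truly has a liability depends on the prior prevalence of that liability as well as on the model's accuracy. The sketch below assumes the model is summarized by a sensitivity and a specificity; the function name and the numbers are hypothetical.

```python
def positive_predictive_value(prior, sensitivity, specificity):
    """Probability that a flagged compound is a true positive (Bayes' theorem).

    prior        -- assumed prevalence of the adverse property (0..1)
    sensitivity  -- P(model flags | property present)
    specificity  -- P(model does not flag | property absent)
    """
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    total = true_pos + false_pos
    if total == 0.0:
        raise ValueError("the model never flags anything under these inputs")
    return true_pos / total


if __name__ == "__main__":
    # Same hypothetical model (90% sensitivity, 90% specificity),
    # evaluated under different assumed priors.
    for prior in (0.5, 0.2, 0.05):
        ppv = positive_predictive_value(prior, 0.9, 0.9)
        print(f"prior={prior:.2f}  P(true liability | flagged)={ppv:.2f}")
```

With these assumed figures, a flag is right 90% of the time when half of all compounds carry the liability, but only about 32% of the time when the prevalence is 5%, so the same model could discard mostly good compounds if the prior is low.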

10:00   Intermission.
10:10 29 Ensuring sustainability of a comprehensive and highly curated scientific data resource.
I. J. Bruno, C. R. Groom
CCDC, Cambridge, Cambridgeshire, United Kingdom

The Cambridge Crystallographic Data Centre (CCDC) has been established as the primary repository for the experimentally determined 3D structures of organic and organometallic compounds for over 45 years. Individual data sets are available to the scientific community free of charge through CCDC's structure request service. Additionally, structures are made available as part of the Cambridge Structural Database (CSD). Structures in the CSD are expertly curated by editorial staff so as to facilitate reliable and sophisticated retrieval, visualisation and analysis by software that the centre also develops. The CSD and associated software are made available on a subscription basis with significant discounts applied for academic institutions. The income generated from subscriptions has ensured until now the sustainability of a comprehensive and highly curated scientific resource. This presentation will discuss the implications that increasing throughput and scientific complexity have for the way CCDC must operate, opportunities for alternative distribution models that respond to evolving expectations of the scientific community, and the pitfalls we must avoid to ensure sustainability in the years ahead.

Presentation (pdf)

10:35 30 Visual search in scientific research data.
I. Sens, O. Koepler
German National Library of Science and Technology, Hannover, Germany

In recent discussions among research institutions and research funding agencies, scientific research data has been identified as being of strategic interest. As a consequence, there are ongoing efforts to establish an infrastructure to support storage, long-term preservation, and access to scientific research data. Registration of datasets with DOI names makes research data citable and searchable. To date, a number of operational Digital Library systems for scientific research data already exist. Datasets often comprise numeric data on continuous or discrete scales and are often associated with textual metadata including data description, author and origin information. While searching in textual metadata is commonly available, content-based access to the research data is an open challenge, even though visualisation and visual analysis of numeric data are common when processing scientific research data. To close this gap in the information retrieval process, we report on a concept and first implementations to support visual retrieval and exploration in a specific class of primary research data, namely, time-oriented data. The concept discusses relevant challenges for a general approach to scientific primary data, and we present first implementations on a real-world dataset.

MONDAY AFTERNOON

Section A
Anaheim Convention Center
204 C

Natural Products and Drug Discovery: Cheminformatics and Computational Chemistry
X. Wang, Organizer
R. Bienstock, Presiding
1:30 31 Specific targeting of the G-quadruplex in the c-Myc promoter with ellipticine.
T. A. Brooks, V. Gokhale, R. Brown, L. H. Hurley
College of Pharmacy, University of Arizona, United States; Arizona Cancer Center, University of Arizona, United States; BIO5 Institute, University of Arizona, United States

Previous studies have shown that the G-quadruplex in the c-Myc promoter is the silencer element for transcriptional control. More recent studies have shown the involvement of NM23-H2 and nucleolin in the activation and silencing of c-Myc transcription. Using a computational overlay of c-Myc G-quadruplex-binding compounds and virtual screening, we have identified ellipticine as a potential G-quadruplex-interactive compound. Then, by taking advantage of a Burkitt's lymphoma cell line in which only the non-translocated allele is under the direct control of the promoter containing the G-quadruplex, we were able to show that the c-Myc-lowering effect is directly due to interaction with the G-quadruplex. In follow-up studies using CADD we designed further ellipticine analogs. These studies provide the best available cellular evidence not only for the presence of G-quadruplex in the promoter elements of oncogenes such as MYC but also that inhibition of specific transcription can be mediated by small molecules that bind to this promoter element.
2:00 32 Exploring natural products for drug discovery by mining biomedical information resources.
N. Baker, N. Rice, D. Fourches, E. Muratov, A. Tropsha
Laboratory for Molecular Modeling, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States; Laboratory of Theoretical Chemistry, Department of Molecular Structure, Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, Ukraine

Parallel screening of Natural Products (NPs) is a typical approach for identifying drug candidates and their targets. However, biomolecular targets of NPs are often discovered serendipitously. We report on the use of Chemotext, a database of assertions extracted from biomedical literature that link chemicals, targets, and diseases [J Biomed Inform 2010, 43:510-9] to rationalize the search for NP targets in the context of the Systems Chemical Biology paradigm [Nat Chem Biol 2007, 3:447-50]. We have identified similar biochemical pathways that NPs are known to interact with in both plants and humans. Through this analysis, we can deduce novel compound-target-disease associations as well as novel molecular targets for NP-derived compounds. Using Chemotext, we have collected and integrated cross-species NP-target associations. We present the case studies of Diabetes mellitus for predicting new compound-target interactions and Tacrolimus-Binding Proteins for detecting similar biochemical pathways in both plants and animals/humans.
2:30 33 In silico strategies in natural product research to combat inflammation and lifestyle diseases: Identification of FXR-inducing triterpenes from Ganoderma lucidum.
U. Grienke, J. Mihály-Bison, D. Schuster, D. Guo, B. R. Binder, G. Wolber, H. Stuppner, J. M. Rollinger
Institute of Pharmacy and Center for Molecular Biosciences, University of Innsbruck, Innsbruck, Austria; Center of Biomolecular Medicine and Pharmacology, Department of Vascular Biology and Thrombosis Research, Medical University of Vienna, Vienna, Austria; Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China

Farnesoid X receptor (FXR) is a ligand-activated transcription factor. The available structural information and the importance of FXR in controlling endogenous pathways related to inflammation and lifestyle diseases, like metabolic syndrome, dyslipidemia, atherosclerosis and type 2 diabetes, render FXR an attractive target for computational approaches. Virtual screenings of our in-house Chinese Herbal Medicine database with structure-based pharmacophore models revealed mainly triterpenes of the famous TCM fungus Ganoderma lucidum Karst. as putative FXR ligands. Ganoderma fruit body extracts verified the predicted FXR-inducing effect in a reporter gene assay, which prompted us to determine its bioactive constituents. Five out of 25 secondary metabolites from G. lucidum, i.e. ergosterol peroxide, lucidumol A, ganoderic acid TR, ganodermanontriol, and ganoderiol F, dose-dependently induced FXR in the low micromolar range. To rationalize the binding interactions, additional molecular docking studies were performed, which allowed establishing a first structure-activity relationship of the investigated triterpenes.
3:00   Intermission.
3:15 34 Discovery of natural product-derived 5HT-1A receptor binders by QSAR modeling of known inhibitors, virtual screening and experimental validation.
X. S. Wang
Department of Pharmaceutical Sciences, Howard University, Washington, DC, United States

The 5-Hydroxytryptamine receptor subtype 1A (5-HT1A) has been an attractive target to treat mood disorders such as anxiety and depression. In this study we have developed combinatorial Quantitative Structure-Activity Relationship (QSAR) models for 105 5-HT1A binders and 61 non-binders retrieved from the Psychoactive Drug Screening Program (PDSP) Ki database. Three advanced methods, k-Nearest Neighbor (kNN), Random Forest (RF) and Support Vector Machine (SVM), were employed for model building. The robust QSAR models of 5-HT1A binders were then used to mine major natural product libraries such as the TimTec Natural Product Library (NPL) and Natural Derivatives Library (NDL). Multiple potential hits were identified and are currently being examined by the PDSP for experimental validation. The success ratios, chemical diversities and structural novelties of the natural product libraries for the purpose of virtual screening were further explored in comparison with other types of screening libraries, i.e. drug-like libraries, targeted libraries and diversity libraries.
3:45 35 Traditional medicine patents lead to enhanced drug discovery derived from natural products.
J. Zabilski, R. Schenck
Content Planning, CAS, Columbus, OH, United States

Since ancient times, natural products have provided relief from numerous ailments. Hippocrates, the father of modern medicine, noted that powder derived from the bark of the willow tree helped heal pain and headaches. In the 1800s, chemists isolated the beneficial substance as salicylic acid and refined it by buffering sodium salicylate with acetyl chloride to create acetylsalicylic acid or aspirin. In more recent years, Traditional Medicine patents have increasingly delved into the rich vein of natural products for potential drug discovery. The CAS databases have mined this wealth by adding more than 50,000 new traditional patent records from several countries. This presentation will illustrate the vast content available and methods to easily explore it by using SciFinder or STN.

Section B
Anaheim Convention Center
204 A

Data Archiving, E-Science, and Primary Data
R. McFarland, N. Xiao, Organizers
L. Solla, Organizer, Presiding
1:30   Introductory Remarks.
1:40 36 Librarian2.0: Synthesizing data management and subject expertise.
B. Blanton-Kent, S. Lake, A. Sallans
University of Virginia Library, Charlottesville, VA, United States

The University of Virginia Library is working to support new data management requirements in science and engineering by developing a model that first draws upon close collaboration between data experts and subject librarians, and culminates in policy and infrastructure recommendations to the University's Office of the Vice President for Research (VPR) and the Office of the Vice President/Chief Information Officer (VP/CIO). This model begins with a data interview to assess the researcher's data management practices and needs and to establish a baseline awareness of current practice. After collecting this information, the results are furnished to the institutional repository team and NSF Data Management Plan working group to inform their processes. In aggregate form, this information is provided to the VPR and VP/CIO as policy and infrastructure recommendations. Ultimately, the entire process cycles back to the researcher. This presentation will offer a case study following a chemist/chemical engineer through this process.
2:05 37 Anatomy of a PubChem project.
S. Swamidass, B. Calhoun, M. Browning
Department of Pathology and Immunology, Washington University in St Louis, St Louis, MO, United States

More raw data from high-throughput screens is made available to the public every day, often through repositories like PubChem. This data, however, is often unorganized and incompletely annotated. Of particular interest, often several screens are components of a larger project. Each screen is a step in the project's workflow, its anatomy. Knowledge of the project's workflow includes non-obvious but valuable information, for instance, the scaffolds the project team chose to pursue and how exactly compounds were chosen for follow-up testing. Although these details are not well annotated in PubChem projects, it is possible to infer them from the raw screening data using a collection of statistical techniques. Moreover, inferred workflows can be used to automatically discover additional active molecules, inform useful views of screening data, and identify methodological errors.
2:30 38 Evolution of the University of Minnesota Libraries' approach to e-scholarship.
M. Lafferty, L. Johnston
Science and Engineering Library, University of Minnesota, Minneapolis, MN, United States

Libraries have struggled with how best to respond to the challenges of e-science since the middle of the last decade. The University of Minnesota Libraries' approach to e-science and other cyberinfrastructure issues has changed multiple times since our initial response in 2006; it has primarily taken the form of groups rather than a dedicated position. We have more recently expanded our focus beyond e-science to e-scholarship in order to include areas such as the digital humanities. The talk will address the evolution of group structures and their primary emphases over the past 5 years, the rationales for different changes, and potential future directions.
2:55   Intermission.
3:05 39 Hosting a compound centric community resource for chemistry data.
A. J. Williams, V. Tkachenko, R. Kidd
ChemSpider, Royal Society of Chemistry, Wake Forest, NC, United States; Informatics, Royal Society of Chemistry, Cambridge, United Kingdom

Laboratories around the world continue to generate immense amounts of data that are non-proprietary and of value to the community. If available, these data could dramatically reduce costs by minimizing rework and ultimately facilitating faster research. High quality reference data collections of chemical compound dictionaries, properties and spectra have been generated over many decades. With the advent of social networking tools and platforms such as Wikipedia, the community has an opportunity to contribute. The ChemSpider platform hosted by the Royal Society of Chemistry is a compound centric database with associated data. It is already populated with almost 25 million unique compounds, and the community can deposit and host their own data, and curate and annotate existing data, including those generated in Open Notebook Science efforts. This presentation will provide an overview of progress to date and outline the vision of this community platform for chemistry and for ensuring the longevity of chemistry reference data.
3:30 40 Library data services in the social sciences: Lessons for science?
K. Peter
University of Southern California Libraries, University of Southern California, Los Angeles, CA, United States

Social science data have a rich history within universities: aggregate statistical publications, such as Statistical Abstract of the United States, and even more detailed U.S. decennial census results, have long held a place within academic depository library collections. Following the development of Machine Readable Data Files, social science data archives were established within several universities across the United States, notably the Inter-university Consortium for Political and Social Research and the Roper Center for Public Opinion Research. Although differences between social science and science data are not insignificant (for example, average file size), as data librarians we face similar obstacles to outreach, access, archiving and management and, in general, to effectively creating a place within libraries for data and data services. This presentation will outline current library services and service models for social science data in hopes of launching a dialog and skill-share between social science and sciences data professionals.
3:55 41 Using Data Curation Profiles (DCPs) as a means of raising data management awareness.
J. R. Garritano
Purdue University, West Lafayette, IN, United States

While one can discuss data management plans in a general sense, there is no single solution for managing the diverse data generated by various disciplines and projects. Therefore one possible solution is to determine best practices for individual data management plans guided by a more general Data Curation Profile (DCP). The DCPs were created at Purdue University and the University of Illinois Urbana-Champaign through a grant from the Institute of Museum and Library Services. Using a DCP, librarians and/or researchers explore various data management issues. Once a profile has been completed, not only will the librarian have a richer understanding of the kind and quantity of data that might have to be curated and archived, but the researcher will have a better understanding of their data preferences related to sharing and intellectual property, regardless of where the data ultimately resides. Current applications of the DCP at Purdue will be discussed.

MONDAY EVENING

Section A
Anaheim Convention Center
Hall B

Sci-Mix
R. Bienstock, Organizer
8:00 - 10:00 16. See previous listings.
42 Synthesis of 3-halo-2-butanones.
J. Porter
Transylvania University, United States

This study is attempting to find out the effect of adding a halide group to a ketone. The main molecules I worked with were 3-halo-2-butanones. I used ether as a solvent and performed Grignard reactions under nitrogen, adding ethynyl Grignards as the nucleophiles. I measured diastereomeric ratios using GC-MS, 1H and 13C NMR, and GC. Unexpectedly, results showed that ratios were similar to those found using LiAlH4 as the nucleophile. Future experiments will work with larger nucleophiles as well as larger ketones.
43 Visualizing molecule similarity.
K. Boda
OpenEye Scientific Software, Santa Fe, New Mexico, United States

Similarity searching based on fingerprint similarity is one of the most common approaches for virtual screening. The main advantage of the method is that it provides a rapid calculation of similarity scores to identify molecules that are similar to the reference structure. However, most fingerprint methods do not provide any insight into molecule similarity beyond a single numerical score. The poster will present a method where molecular graphs are highlighted using a color gradient scheme that emphasizes shared fragments encoded into fingerprints. This representation not only makes molecular similarity immediately apparent but also reveals information about the underlying fingerprint method. The method is utilized to analyze the hit-lists using different fingerprint methods on datasets of previously published benchmarks. The 2D graphics are generated using OpenEye's Ogham package that provides a framework to construct molecular diagrams. The poster will also present various Ogham functionalities that allow the customization of molecule depiction.
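Editorial illustration (not OpenEye's API and not the author's implementation): the single numerical score mentioned above is often the Tanimoto coefficient, and the idea of highlighting shared fragments can be sketched by counting, for each atom, how many shared fingerprint bits it contributes to. The sketch assumes fingerprints are sets of "on" bit indices and that a hypothetical `bit_to_atoms` mapping (bit index to atom indices of the query molecule) is available from whatever fingerprint generator is used.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def atom_weights(bit_to_atoms, shared_bits, n_atoms):
    """Per-atom weight in [0, 1] for a color gradient.

    Each atom is scored by the number of shared bits whose fragment
    contains it, then scaled by the highest count in the molecule.
    `bit_to_atoms` is a hypothetical mapping supplied by the caller.
    """
    counts = [0] * n_atoms
    for bit in shared_bits:
        for atom in bit_to_atoms.get(bit, ()):
            counts[atom] += 1
    top = max(counts, default=0)
    return [c / top if top else 0.0 for c in counts]


if __name__ == "__main__":
    reference = {1, 2, 3, 4}
    query = {2, 3, 4, 5}
    # Toy mapping: which query atoms each bit's fragment covers.
    query_bit_to_atoms = {2: [0, 1], 3: [1, 2], 4: [1], 5: [3]}

    print("Tanimoto:", tanimoto(reference, query))
    print("Atom weights:",
          atom_weights(query_bit_to_atoms, reference & query, n_atoms=4))
```

Atoms with weights near 1 would be drawn at the strong end of the gradient and atoms with weight 0 at the weak end, so the depiction shows where the similarity score comes from rather than only its value.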

TUESDAY MORNING

Section A
Anaheim Convention Center
204 A

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
8:25   Introductory Remarks.
8:30 44 Collaborative agile Internet projects: The Green Chain Reaction.
P. Murray-Rust, S. E. Adams, L. Hawizy, D. M. Jessop
Department of Chemistry, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom

An Open Science project was designed, implemented and completed within a month to investigate whether chemical reactions were using "greener" solvents than formerly. 10 volunteers wrote or implemented code to extract recipes from European patents. The recipes were analysed by OSCAR and chemical Natural Language processing using medium-depth parsing to extract solvents, with high precision. The volunteers crawled the patent website, analysed over 100,000 recipes and posted the results to a communal, Open server, using the Lensfield "make/build" philosophy. The solvent information was then aggregated and presented for the years 2000 to 2010. There is no obvious trend showing that "green" solvents are becoming commoner.
9:10 45 Re-imagining scientific communication for the 21st century: Is chemistry low hanging fruit or the worst-case scenario?
C. Neylon
ISIS Neutron Source, Science and Technology Facilities Council, Didcot, United Kingdom

We are told that “the web changes everything” but scientific communication still owes more to the 17th century than to the 20th. The central problem with current practice is the view of “the paper” as a monolithic object, and the only form of communication that is rewarded. We need to both technically enable the publication of many different research objects and to create tools to aggregate these together into large narrative works that retain the structure and meaning of internal links. Along with this we need both technical and social infrastructure to help us filter and discover this large range of items. I will argue that chemistry, and in particular synthetic organic chemistry, is a special case with its own particular difficulties, but that the inherent structure and regularity of synthetic research makes it a good target for testing and demonstrating new approaches to scholarly communication.
9:50 46 Quixote: An Internet project to build a distributed Open Knowledgebase for quantum chemistry.
P. Murray-Rust, J. Thomas, P. Echenique, J. Estrada, M. D. Hanwell, S. E. Adams, W. Phadungsukanan, L. Westerhoff
Department of Chemistry, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom; Computational Science and Engineering Department, Science and Technology Facilities Council, Daresbury Laboratory, Daresbury, Cheshire, United Kingdom; Instituto de Química Física "Rocasolano", CSIC, Madrid, Spain; Department of Scientific Visualization, Kitware, Inc, Clifton Park, NY, United States; Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom; QuantumBio Inc., State College, PA, United States

Quixote is a distributed semantic knowledgebase for quantum chemistry deliberately prototyped within a month by distributed volunteers. It uses a wide range of existing Open Source tools, such as those from the Blue Obelisk collection, to translate conventional QC files (log, punch, archive, input) into semantic form. The semantics are controlled by per-program dictionaries which are created by program experts. The process is controlled by and rests heavily on modern Internet approaches such as Etherpad, Skype, Wiki, REST, HTTP, RDF and SPARQL. Parsing is through ANTLR and recursive descent. Semantics are provided by namespaced dictionaries, elements and attributes allowing lossless transmission of information. The system is completely Open/free and allows anyone to clone and run a node, on a peer-to-peer system with as much or little security as desired.
10:30   Intermission.
10:40 47 Catching the mobile wave.
S. M. Muskal
Eidogen-Sertanty, Oceanside, CA, United States

With the explosive growth of mobile computing environments, including the iPhone, Android-based devices, the iPad, and its fast-followers, it has become important for scientific software companies to enable technology and content access on these ubiquitous devices. Coupled with cloud computing environments (e.g. Amazon's EC2 and RDS environments), these platforms represent the new frontier for scientific computing. We will describe both technical and business challenges and lessons learned as we developed our mobile apps - iKinase, iKinasePro, iProtein, and MobileReagents.
11:20 48 Chemistry in your pocket: Shrinking cheminformatics applications for mobile devices.
A. M. Clark
Molecular Materials Informatics, Montreal, Quebec, Canada

Internet resources are now a routine part of the workflow of a research chemist, and in recent years many of these services have been made accessible from ultra-portable devices such as smartphones and tablet computers. Efforts have been hampered by the need to draw chemical structures to access certain functionality, e.g. searching databases by structure. To a large extent mobile devices have been limited to use for content consumption. Implementing a chemical structure sketching interface on a tiny device is difficult, because the traditional paradigm requires an accurate pointing device, such as a mouse. A finger on a touchscreen is simply too clumsy for standard structure drawing techniques, and many devices lack a pointing device entirely. This presentation will describe a new approach to drawing 2D chemical structures, which reevaluates the traditional drawing techniques in order to make them work well with input-constrained devices. This is accomplished by using a high degree of automation and inference, which is provided by newly developed algorithms. The end result is a mobile application which can be used to create publication quality 2D sketches with a small number of steps, which is convenient to use on a variety of current smartphones and tablets, including BlackBerry, iPhone and iPad devices. Also discussed will be some of the internet-based applications which are possible now that a viable structure editor is available. With this hurdle removed, a large number of desktop-based cheminformatics applications can be migrated to smaller devices by splitting the interface between a mobile client and web-based services. Mobile devices can now be used for creating, managing, viewing and sharing chemical information.

TUESDAY AFTERNOON

Section A
Anaheim Convention Center
204 A

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
1:30 49 chemicalize.org: Adding chemistry to Web pages and predicted data and links to structures.
A. Allardyce, A. Stracz, D. Bonniot, F. Csizmadia
ChemAxon, Budapest, Hungary

chemicalize.org is a new free online service developed by ChemAxon which adds chemistry to Web pages as well as data and Web pages to structures. The primary use is to parse chemical names from Web page text and serve an annotated Web page version which includes structure images hyper-linked from the chemical name source. By storing structures and Web page URLs we can search the database to find those Web pages containing any given structure query. For each structure users can also generate structure-based prediction results within a user-customizable report; predictions include logP, pKa, logD etc. Current developments center around user profiles, 'tracking' structures in newly chemicalized pages and presenting chemicalize.org user activity to give a snapshot of current Web pages and structures that are interesting to chemists online. This presentation will outline the aims of the development, describe the service and current developments, and give an overview of use and user feedback.
2:10 50 Using Campus Guides for leveraging Web 2.0 technologies and promoting the chemistry and life sciences information resources.
S. Baykoucheva
White Memorial Chemistry Library, University of Maryland, College Park, MD, United States

The introduction of Campus Guides and a “lighter” version of this program, Lib Guides, in the last few years has created many exciting opportunities for science librarians to promote the chemistry and life sciences information resources in a new way using multimedia and social networking tools. The flexibility and the wide range of solutions these programs provide have tempted librarians to use them in many innovative ways, which has not been possible to do in static web pages controlled by rigid rules and other external factors. This presentation will show how users have responded to the new dynamic information environment created with Campus Guides and what the statistical data show about their preferences toward particular information resources in chemistry and the life sciences.
2:50   Intermission.
3:00 51 How the web has weaved a web of interlinked chemistry data.
A. J. Williams
ChemSpider, Royal Society of Chemistry, Wake Forest, NC, United States

The internet has provided access to unprecedented quantities of data. In the domain of chemistry specifically over the past decade the web has become populated with tens of millions of chemical structures and related properties of assays together with tens of thousands of spectra and syntheses. The data have, to a large extent, remained disparate and disconnected. In recent years with the wave of Web 2.0 participation, any chemist can contribute to both the sharing and validation of chemistry-related data whether it be via Wikipedia, the online encyclopedia, or one of the multiple public compound databases. This presentation will offer a perspective of what is available today, our experiences of building a public compound database to link together the internet, and a suggested path forward for enabling even greater integration and connectivity for chemistry data for the masses to both use and participate in developing.
3:40 52 What is the Internet doing to chemistry and our brains?
S. Heller
NIST, Gaithersburg, Maryland, United States

The Internet, like any technology, has good, bad, and ugly sides to it. This lecture will attempt to talk about these aspects with examples in chemistry that should both enlighten and disturb.

WEDNESDAY MORNING

Section A
Anaheim Convention Center
204 B

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
8:30 53 Bridging the gap: Publishing and consuming the scientific literature in a digital, device-agnostic world.
D. P. Martinsen
American Chemical Society, Washington, DC, United States

Scientific publishing has seen a steady transition from the primarily paper-based model of the pre-2000 era to the digital world of the late 1990s and now the first decade of the 21st century. While usage analysis, as well as end-user studies, indicate that paper, or at least PDF files printed out on paper, are still the preferred way for most scientists to interact with the scholarly literature, there is a growing percentage of scientists who are asking for more. New data formats, new devices, and new applications present a challenge for publishers as well as authors and readers. Publishers try to keep up with the demands of authors and readers who want to push the technology, while at the same time addressing the more modest concerns of the majority of scientists who just want to get the article text and not be bothered with bells and whistles. While some call for a revolution in publishing, the reality is a much slower evolution. Publishers, authors, editors, reviewers, and readers all make inputs into the ecosystem, and each responds, sometimes in unexpected ways, to the changes that are made. As the journal of the future and the article of the future emerge from the old models, it is useful to consider the impact of those changes.
9:10 54 Open access in chemistry: Information wants to be free?
J. Kuras, B. Vickery, D. Kahn
Chemistry Central, London, United Kingdom

The open access (OA) publishing movement was motivated by a desire to increase visibility and dissemination of scientific information. Electronic publishing and the advent of the Internet helped establish and accelerate the growth of OA in the early 2000s. Acceptance and uptake were significant amongst e.g. the high-energy physics and biomedical research communities as demonstrated by the success of initiatives such as ArXiv, BioMed Central, and the Public Library of Science. In chemistry, the growth of OA has been more conservative. This presentation will review the development of OA in chemistry, examine the current situation with reference to recent studies, and look forward to future directions in particular with the emergence of other open data initiatives and Web technologies.
9:50   Intermission.
10:00 55 OpenTox: An open-source web-service platform for toxicity prediction.
D. A. Gallagher, B. Hardy, S. Chawla
CAChe Research LLC, Beaverton, Oregon, United States; Douglas Connect, Zeiningen, Switzerland; Seascape Learning LLC, Cupertino, California, United States

The new European Union (EU) REACH chemical legislation will require 3.9 million additional test animals, if no alternative methods for toxicity prediction are accepted. However, the number of test animals could be significantly reduced by utilizing existing experimental data in conjunction with (Quantitative) Structure Activity Relationship ((Q)SAR) models. To address the challenge, the European Commission has funded the OpenTox (www.OpenTox.org) project to develop an open source web-service-based framework that provides unified access to experimental toxicity data, in silico models (including (Q)SAR), and validation/reporting procedures. Now, in the final year of the initial three-year project, the current state of architecture, Open API, algorithms, ontologies, and approach to web services will be presented. Our experiences on current collaborative approaches aiming to combine OpenTox with other systems such as CERF, Bioclipse, CDK, and SYNERGY to create “super-interoperable K-infrastructure” will be discussed both in terms of conceptual promise and implementation reality.
10:40 56 CAS Registry: Maintaining the gold standard for chemical substance information.
R. Schenck, J. Zabilski
Department of Content Planning, Chemical Abstracts Service, Columbus, Ohio, United States

CAS has traditionally built its databases from the journal and patent literature. With the advent of the Internet, CAS now has another major source of chemical substance information. This presentation will discuss these internet resources and how CAS evaluates them for inclusion in CAS REGISTRY, while maintaining its quality standards. Since 1965, the scientific experts at CAS have identified more than 56 million organic and inorganic substances. This presentation will examine the sources of this growth and illustrate what CAS is doing to keep pace with this explosion in small molecule chemistry.
11:20 57 Evolution of the science journal and the chemical publication.
H. S. Rzepa
Department of Chemistry, Imperial College London, London, United Kingdom

The concept of a modern scientific journal becomes 346 years old in 2011 (DOI: 10.1098/rstl.1665.0001), although only since 1994 has the journal article been embedded in the Internet and Web era (DOI: 10.1039/C39940001907). Although the structure of the article itself morphed little during the first part of the Internet age, there are now signs that many aspects of its creation and dissemination are starting to evolve more rapidly. Here, several potential future enhancements are reviewed, including the role of the scientific blog in augmenting the effectiveness of the peer-review processes, the role of data-integrity within the article, integration of Web-enhanced and other data-rich and functional objects, the role of open digital repositories, article semantification, and delivery and re-functionalisation of the re-invented article via new generations of mobile personal devices.

Section B
Anaheim Convention Center
201C

General Papers
R. Bienstock, Organizer, Presiding
9:00   Introductory Remarks.
9:05 58 Collaborative QSAR analysis of Ames mutagenicity.
E. Muratov, D. Fourches, A. Artemenko, V. Kuz'min, G. Zhao, A. Golbraikh, P. Polischuk, E. Varlamova, I. Baskin, V. Palyulin, N. Zefirov, L. Jiazhong, P. Gramatica, T. Martin, F. Hormozdiari, P. Dao, C. Sahinalp, A. Cherkasov, T. Oberg, R. Todeschini, V. Poroikov, A. Zaharov, A. Lagunin, D. Filimonov, A. Varnek, D. Horvath, G. Marcou, C. Muller, L. Xi, H. Liu, X. Yao, K. Hansen, T. Schroeter, K. Muller, I. Tetko, I. Sushko, S. Novotarskyi, N. Baker, J. Reed, J. Barnes, A. Tropsha
University of North Carolina, Chapel Hill, NC, United States; A.V. Bogatsky Physical-Chemical Institute NAS of Ukraine, Odessa, Ukraine; Moscow State University, Moscow, Russian Federation; University of Insubria, Varese, Italy; US Environmental Protection Agency, Cincinnati, OH, United States; Simon Fraser University, Burnaby, Canada; University of British Columbia, Vancouver, Canada; University of Kalmar, Kalmar, Sweden; University of Milano-Bicocca, Milan, Italy; Institute of Biomedical Chemistry RAS, Moscow, Russian Federation; University of Strasbourg, Strasbourg, France; Lanzhou University, Lanzhou, China; Technical University of Berlin, Berlin, Germany; Institute for Bioinformatics, Nuremberg, Germany; BioWisdom Ltd, Cambridge, United Kingdom

We report the results of a collaborative QSAR modeling project between 15 teams to develop predictive computational QSAR models of in vitro Ames mutagenicity induced by organic compounds. The Ames dataset consisted of 6542 compounds (after curation). In total, 32 predictive classification QSAR models were developed using different combinations of chemical descriptors and machine learning approaches, representing the most extensive combinatorial QSAR modeling study ever done in the cheminformatics field in the public domain. The resulting consensus model had the highest external predictive power, nearly reaching the experimental reproducibility of 85% for the Ames test. In addition, we found published evidence indicating that 31 of 130 outliers (29 mutagens and 2 non-mutagens) were erroneously annotated in the original dataset. This work presents a model of collaboration that integrates the expertise of participating laboratories to establish the best practices and most reliable solutions for difficult problems in chemical and computational toxicology.
9:25 59 How (not) to build a toxicity model.
A. C. Lee, R. Clark, M. Waldman, J. Chung, R. Fraczkiewicz, W. S. Woltosz
Department of Life Sciences, Simulations Plus, Inc., Lancaster, CA, United States

When a seemingly well-curated chemical data set hits the press, a modeler's first impulse is to apply their preferred QSAR method to the data in hopes of building a model that exhibits superior statistics to other published models. Occasionally, the results appear too good to be true. Are these models useful? This work details a procedure for building a useful and well-validated model, using respiratory sensitization data. We highlight the do's and don'ts of data selection, pre- and post- data curation, QSAR methodologies, and validation strategies implemented from 1984 to present. The examples demonstrate how to identify a narrow sampling of chemical space by examining good-looking models, applying a model to (believable) real-world data in order to determine its usefulness both inside and outside the model's applicability domain, and techniques that modelers (should) use to validate as well as assess the robustness of a model.
9:45 60 Metabolic site prediction using artificial neural network ensembles.
M. Waldman, R. Fraczkiewicz, J. Zhang, R. D. Clark, W. S. Woltosz
Simulations Plus, Inc., Lancaster, CA, United States

Hepatic first-pass metabolism of drugs and prodrugs plays a key role in oral bioavailability, and the cytochrome P450 enzymes are responsible for metabolism of most drugs. Knowledge of likely sites of metabolic attack in a drug molecule can aid in designing out unwanted metabolic liabilities early on in the drug discovery process as well as in the design of prodrugs where metabolic transformation is desired. Using datasets constructed from literature compilations and commercially available databases, we have constructed models based on artificial neural network ensembles that predict one or more likely sites of metabolism for a given molecule for several CYP isoforms including 2C9, 2D6, and 3A4. The models employ atomic descriptors describing charge, reactivity, steric accessibility, and other properties of the candidate atom and its local environment. Model performance will be shown based on various statistical criteria as well as specific examples demonstrating scope and limitations.
10:05 61 Withdrawn.
10:25   Intermission.
10:35 62 Use and results of using an online chemistry laboratory package in a large general chemistry course.
R. L. Nafshun
Department of Chemistry, Oregon State University, Corvallis, Oregon, United States

In addition to traditional on-campus general chemistry courses, the Department of Chemistry at Oregon State University has been offering an online general chemistry sequence since 2003. We have struggled to identify a method of facilitating an appropriate distance laboratory program. We have investigated a "kitchen" chemistry kit and various online virtual toolboxes. We are currently using a virtual laboratory package (www.onlinechemlabs.com) which presents the user with a split screen: one side contains chemistry laboratory tools and the other is text. The tools include standard experimental equipment such as an analytical balance, flasks, pipettes, and reagents, as well as more complex analytical instruments or reaction equipment such as an absorbance spectrophotometer, calorimeter, NMR, and a combustion chamber. The logical progress (or flow) of these tools in experiments is analogous to that in classroom labs. The tools incorporate both random and systematic error, providing data simulations where detailed error analyses can be performed that are analogous to those in classroom laboratory experiments. Each of these features allows for a significant enhancement in instructional capabilities, and could integrate very well with the instructional modalities of models and argumentation that have been recently developed and outlined in more detail below. Results of the use of the online chemistry laboratory package in three different modes (fully online/hybrid/supplemental) and methods of use will be discussed.
10:55 63 Reaction prediction as ranking molecular orbital interactions.
M. A. Kayala, C. A. Azencott, J. H. Chen, P. Baldi
Department of Computer Science, University of California, Irvine, Irvine, CA, United States

Being able to predict the course of chemical reactions is essential to the practice of chemistry. While computational approaches to this problem have been extensively studied in the past, a fast, accurate, and scalable solution has yet to be described. Here, we propose a novel formulation of reaction prediction as a machine learning ranking problem: given a set of molecules and a description of conditions, learn a ranking over potential filled to unfilled molecular orbital (MO) interactions approximating the corresponding transition state energy ranking. Using an existing rule-based expert system (ReactionExplorer), we derive a restricted chemistry dataset consisting of 1300 full multi-step reactions with 2200 distinct starting materials and intermediates. This yields 3600 predicted MO interactions and 14 million unpredicted MO interactions. A two-stage machine learning scheme is used to learn the model. First, we train reactive site predictors using a combination of topological and real-valued global features to filter out 61% and 44% of non-predicted filled and unfilled MOs with a 0.0001% error rate. Then various ranking models are trained on the MO interactions using features engineered to approximate transition state entropy and enthalpy. Using cross-validation, current best models recover a perfect-ranking 61% of the time and recover a within-4-ranking 95% of the time.

WEDNESDAY AFTERNOON

Section A
Anaheim Convention Center
204 B

Internet and Chemistry: Social Networking - Cosponsored by YCC
H. Rzepa, Organizer
S. Bachrach, Organizer, Presiding
1:40 64 Automated semantic data embargo and publication by the CLARION project.
S. E. Adams, N. Day, J. Downing, B. Brooks, P. Murray-Rust
Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom

The CLARION project has created the infrastructure to enable research chemists to make selected data available as Open Data, shared over the Semantic Web, without requiring technical expertise themselves. Data is automatically collated from central services, such as the Departmental Crystallographic Service, and chemists' Electronic Lab Notebooks. An Embargo Manager application presents research groups with a view of the data they own, and allows them to set embargo conditions and add additional metadata. Once the embargo period expires, data is automatically semantified and deposited as Open Data in a public Chem# repository.
2:20 65 Chemical eCommerce.
K. Gubernator
eMolecules, Inc., Solana Beach, CA, United States

Chemists are late adopters of the internet. The main obstacle is that search engines and eCommerce systems are text-based and as such inherently inadequate to handle chemical structures. Also, chemical nomenclature and names are poorly standardized and inconsistently used by both suppliers and buyers of chemicals. Therefore, only the combination of a chemical search engine and a chemical eCommerce system can address the needs of the market. Such a system has to handle millions of chemical structures, return results in seconds, and provide tools to handle lists of thousands of molecules. In addition, user expectations are created by their experiences with Amazon and eBay: prices and availability should be online. The purchasing process is expected to be predictable: you get what you order on time. Implementing and operating a chemical eCommerce system therefore requires a paradigm shift in the quality of the entire purchasing process.
3:00   Intermission.
3:10 66 Waiting on the Chemical Internet.
S. M. Bachrach
Department of Chemistry, Trinity University, San Antonio, TX, United States

The chemical internet dates back roughly to 1994. Over that time the impact of the Internet and the web on society in general has been overwhelming. Businesses have come and gone, and communication has evolved from web sites to blogs to tweets. But for chemists, the impact has been of much less significance. The talk will present some of the causes of the slow uptake of the Internet by chemists and what the future might hold for us.
3:50 67 Rapid dissemination of chemical information for people and machines using Open Notebook Science.
J. Bradley, A. S. Lang
Department of Chemistry, Drexel University, Philadelphia, PA, United States; Department of Mathematics, Oral Roberts University, Tulsa, OK, United States

This presentation will cover methods and tools used to collect, record and disseminate chemical information using Open Notebook Science, the practice of making a laboratory notebook and all associated raw data available publicly in as close to real time as possible. Both solubility measurements and organic chemistry reactions are handled in this way. The recording of laboratory data is handled primarily using free and hosted services such as Wikispaces and Google Spreadsheets. The information is made discoverable using redundant communication channels, including Google, Google Scholar, Wikipedia and other vehicles. The abstraction of key elements from the solubility measurements and the chemical reactions allows for the use of live machine-readable feeds and web services. The implications for the future of the automation of the scientific process based on Open Data and Open Services will be discussed.

 

 

Committee Reports

The following committee reports have been submitted.

Communications & Publications

Report of the CINF Communications & Publications Committee

Transfer from the CINF Yahoo! Group to the ACS Network

The decision to close the CINF Yahoo! Group and transfer all CINF Division business to the ACS Network has been implemented. The CINF Yahoo! Group still exists but access is limited to the former group moderators and the group will not be closed until we can decide how to preserve the email archive.

The CINF Division group on the ACS Network has now grown to 127 members and CINF members have begun to use the group in a serious manner after some initial reluctance due to unfamiliarity with the new network. The discussion on the new CINF website has generated almost 900 views and a large number of postings. This group is open to all ACS Network members. There are also closed groups for the CINF Executive and also for this committee where private business can be conducted.

Switchover to the new CINF website

In January, Danielle Dennie, the new CINF webmaster, reported as follows:

"Ideally, I would have liked to survey or talk with members of CINF to ask you how you use the site and what you would like to see in a new site. This would have meant that I would have kept the old site while gathering data that could be used to create the new site. Unfortunately, because of my limited knowledge of the software that was used to create the old site, I could not make any edits to it. Which means that if there were any updates to be made to the old site before a new site could be built, I would not have been able to make them.

"Therefore, I quickly designed a new template that I could work with. To make it easy for myself, and for users of the site, I kept the same logical organization that was on the old site. This means that the menu on the left hand side is practically (with minor exceptions) the same as the old menu, as well as the organization of the secondary pages.

"That being said, I was not able to transfer over all content. Specifically, there are 3 sections that I could not transfer:

  • "Because the old site used a database to generate past meetings information, I have not, for the moment transferred over the tremendous amount of content that was in that section of the site. Therefore, I simply link to it from the new site. (http://acscinf.org/meetings/past.php).
  • "I did not transfer over volume 62 of the e-CIB newsletter (http://acscinf.org/publications/bulletin.php). However, for the upcoming e-CIB, I will create a new template. Perhaps once a new template is agreed upon, I will be able to transfer over volume 62.
  • "The CINF electronic newsletter has not, as yet, been transferred to the new template (again, because of the tremendous amount of content). A link was made to the individual newsletters on the old site. (http://acscinf.org/publications/enews.php).

"Furthermore, there are a couple of links in the left hand menu that I did not add to the new site. If these are needed, please let me know and I will add them:

  • Surveys
  • Disclaimers

"Otherwise, I went through every section of the old site, and recoded each page that I came across to fit the new template. If there were files on the old site that were orphan pages (i.e. not linked to from any other page), these files were unfortunately probably not transferred over. I hope I did not miss anything too important. If you notice anything glaring, please let me know.

"Overall, there are still some little tweaks that I need to bring to the site, but the bulk of the work is completed. I look forward to working with everyone to make the site as user-friendly as possible. If you have any questions or concerns, please don’t hesitate to contact me."

The new website has been well received by CINF members, but we still have some way to go toward achieving our vision of a website to which it will be easier for CINF members to post content themselves.

eCIB editorship for 2011

At the end of 2010, Svetla Baykoucheva retired as eCIB editor but will continue to contribute actively to future editions.

For Spring 2011, David Martinsen has agreed to be guest editor and will try to experiment with new workflows based on using the ACS Network. Svetlana Korolev will edit the Summer and Winter editions, which follow and report on National ACS meetings, and Judith Currano will edit the Fall edition.

Bill Town, Chair,
CINF Communications and Publications Committee

 

Awards and Scholarships

The following awards and scholarships have been announced for this issue.

Scientific Excellence

2011 CINF Scholarship for Scientific Excellence Sponsored by FIZ Chemie Berlin

Please note: The deadline to submit an abstract for the 2011 CINF Scholarship for Scientific Excellence has been extended to April 8, 2011.

The scholarship program of the Division of Chemical Information (CINF) of the American Chemical Society (ACS) funded by FIZ Chemie Berlin is designed to reward graduate and postdoctoral students in chemical information and related sciences for scientific excellence and to foster their involvement in CINF.

Up to three scholarships valued at $1,000 each will be presented at the 242nd ACS National Meeting in Denver, CO, August 28 – September 1, 2011. Applicants must be enrolled at a certified college or university, and they will present a poster during the Welcoming Reception of the division on Sunday evening at the National Meeting. Additionally, they will have the option to also show their poster at the Sci-Mix session on Monday night. Abstracts for the poster must be submitted electronically through PACS, the abstract submission system of ACS.

To apply, please inform the Chair of the selection committee, Guenter Grethe at ggrethe@att.net, that you are applying for a scholarship. Submit your abstract to http://abstracts.acs.org using your ACS ID. If you do not have an ACS ID, follow the registration instructions and submit your abstract for "CINF Scholarship for Scientific Excellence". The deadline for submitting an abstract to PACS is April 1, 2011. Additionally, please send a 2,000-word abstract describing the work to be presented in electronic form to the Chair of the selection committee by June 30, 2011. Any questions related to applying for one of the scholarships should be directed to the same e-mail address.

Winners will be chosen based on contents, presentation and relevance of the poster and they will be announced during the reception. The contents will reflect upon the student’s work and describe research in the field of cheminformatics and related sciences. Winning posters will be marked "Winner of FIZ Chemie-CINF Scholarship for Scientific Excellence" at the poster session.

Guenter Grethe

 

CSA Trust Grants

Applications Invited for CSA Trust Jacques-Émile Dubois Grants for 2012

The Chemical Structure Association (CSA) Trust is an internationally recognized organization established to promote the critical importance of chemical information to advances in chemical research. In support of its charter, the Trust has created a unique Grant Program, renamed in honor of Professor Jacques-Émile Dubois who made significant contributions to the field of cheminformatics. The Trust is currently inviting the submission of grant applications for 2012.

Purpose of the Grants:

The Grant Program has been created to provide funding for the career development of young researchers who have demonstrated excellence in their education, research or development activities that are related to the systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. A Grant will be awarded annually up to a maximum of five thousand U.S. dollars ($5,000). Grants are awarded for specific purposes, and within one year each grantee is required to submit a brief written report detailing how the grant funds were allocated. Grantees are also requested to recognize the support of the Trust in any paper or presentation that is given as a result of that support.

Who is Eligible?

Applicant(s), age 35 or younger, who have demonstrated excellence in their chemical information related research and who are developing careers that have the potential to have a positive impact on the utility of chemical information relevant to chemical structures, reactions and compounds, are invited to submit applications. While the primary focus of the Grant Program is the career development of young researchers, additional bursaries may be made available at the discretion of the Trust. All requests must follow the application procedures noted below and will be weighed against the same criteria.

Which Activities are Eligible?

Grants may be awarded to acquire the experience and education necessary to support research activities; e.g. for travel to collaborate with research groups, to attend a conference relevant to one’s area of research, to gain access to special computational facilities, or to acquire unique research techniques in support of one's research.

Application Requirements:

Applications must include the following documentation:

  1. A letter that details the work upon which the Grant application is to be evaluated as well as details on research recently completed by the applicant;
  2. The amount of Grant funds being requested and the details regarding the purpose for which the Grant will be used (e.g. cost of equipment, travel expenses if the request is for financial support of meeting attendance, etc.). The relevance of the above-stated purpose to the Trust’s objectives and the clarity of this statement are essential in the evaluation of the application;
  3. A brief biographical sketch, including a statement of academic qualifications;
  4. Two reference letters in support of the application.

Additional materials may be supplied at the discretion of the applicant only if relevant to the application and if such materials provide information not already included in items 1-4. Three copies of the complete application document must be supplied for distribution to the Grants Committee.

Deadline for Applications:

Applications must be received no later than March 14, 2012. Successful applicants will be notified no later than May 2, 2012.

Address for Submission of Applications:

Three copies of the application documentation should be forwarded to: Bonnie Lawlor, CSA Trust Grant Committee Chair, 276 Upper Gulph Road, Radnor, PA 19087, USA. If you wish to enter your application by e-mail, please contact Bonnie Lawlor at blawlor@nfais.org prior to submission so that she can contact you if the e-mail does not arrive.

Interviews

This month we have interviews with James L. Mullins and Michael Gordin.

James L. Mullins

What Do Libraries Have to Do with e-Science?

An Interview with James L. Mullins, Dean of Purdue University Libraries

By Svetla Baykoucheva

James L. Mullins has been Dean of Libraries and professor of library science at Purdue University since 2004. Before that he was associate director for administration of the Massachusetts Institute of Technology (MIT) Libraries. His career of more than thirty years includes administrative positions at Villanova University and Indiana University. He earned BA and MALS degrees from the University of Iowa and a PhD from Indiana University.

Dr. Mullins has served in leadership positions within the American Library Association (ALA) and the Association of Research Libraries (ARL) and presently is an elected member of the ARL board of directors and chair of the e-Science Working Group. Presently he serves on the editorial board of the journal College and Research Libraries. He is also on the board of directors of the International Association of Scientific and Technological University Libraries (IATUL), Center for Research Libraries (CRL), and is a delegate to the Science and Technology Section of the International Federation of Library Associations (IFLA). Last June, Purdue was host to the 2010 IATUL Conference, which focused on the role of libraries in e-science. He was a signatory to the formation in December 2009 of DataCite, an international consortium assigning digital object identifiers (DOI) to datasets for citation.

Dr. Mullins is a frequent contributor to the professional literature, speaks at national and international conferences, and consults with research libraries and universities internationally on challenges facing research communication and dissemination. He has served on National Science Foundation (NSF) panels, including one in 2006 recommending that data management plans be required for NSF research funding.

Svetla Baykoucheva: The new buzzword in academic libraries is "e-Science." It is also called "eScience." We are seeing job announcements for e-Science librarians, conferences on e-Science being organized, the Association of Research Libraries (ARL) publishing a white paper on it, and NSF introducing new requirements for data management. What is e-Science?

James L. Mullins: In 1999, John Taylor, the Director General of the United Kingdom's Office of Science and Technology, created the term to describe computationally intensive science that draws upon large data sets and, through modeling and algorithms, tests assumptions. In today's world, scientists rarely use the term e-Science since computational methodologies have become so embedded in the research process that it hardly warrants distinctive nomenclature.

SB: Last year you organized a conference on e-Science. What were the topics discussed at this conference? Could you point to some future conference on e-Science?

JM: Purdue was host to the 31st Annual Conference of IATUL (International Association of Scientific and Technological University Libraries); the theme of the program was: “The Evolving World of e-Science: Impact and Implications for Science and Technology Libraries.” The intent of the conference was to start with the broadest concept—what is e-science/computational science, what is the role of data in computational science and how are scientists coping (or not) with managing data? The keynote speaker was Dr. Dan Kleppner of MIT who co-chaired a task force for the National Academies on issues related to data. In addition, Dr. Arden Bement, who had stepped down as director of the NSF a few weeks before the conference, spoke about the interest the funding agencies have in ensuring that data generated through sponsored research would be available generally to researchers. Dr. Bement assumed the position of executive director of the Global Policy Research Institute at Purdue earlier that month, so his interest was twofold: the management of data and the need to create a global policy on data management to facilitate research. Most of the program was focused on how data can be managed and what the role can or should be for librarians; so it wasn’t just a theoretical discussion, as it provided an opportunity for librarians to gain knowledge of the processes that could assist them in developing e-science programs in their institutions. Rather than having me provide a complete summary of the program, it would be easy for readers who are interested in the topics to go to the website: http://blogs.lib.purdue.edu/iatul2010/program/.

There are many organizations that have a focus on e-science/data management within the international library community, especially the Digital Curation Centre (DCC) in the United Kingdom: http://www.dcc.ac.uk/events. In the United States, the Distributed Data Curation Center (D2C2) at Purdue is a research center focused on exploring and researching ways in which data can be accessed and archived. A fuller description is available at http://d2c2.lib.purdue.edu/index.php. The Coalition for Networked Information (CNI), at its twice-annual briefing sessions, often has papers focused on e-science and data management. Also, on the CNI website (http://www.cni.org/regconfs/) there is a list of upcoming conferences and workshops that includes ones on e-science/data management.

Finally, the Association of Research Libraries (ARL) and the Digital Library Federation (DLF) are in the early stages of developing an e-science institute planned for fall, 2011. Initially the Institute will be open to sponsoring libraries (ARL/DLF members), but the intent is that it will be repeated for the broader community in 2012.

SB: How do you see the role that librarians could play in this new area? What kind of expertise will be required from them?

JM: Working in the area of data management draws upon the principles of library and archival sciences. Our ability to see structure to overlay on a mass of disparate “parts,” as well as the ability to identify taxonomies to create a defined language for accessing and retrieving data is what is needed from us. The challenge will be for librarians to understand that we have collections that we cannot see and may not actually understand the importance of, but that we will have a responsibility to steward and preserve for researchers now and in the future. Archival science is important since there are requirements and expectations from investigators that there will be limited access to data that will require that an embargo be in place. Just as people can give their personal papers to archives with an expectation that access will be limited to specific researchers or closed for a period of time, researchers may similarly want to protect their intellectual property by creating an embargo. For librarians this would normally be unacceptable, while for archivists this is standard procedure. I also think it helps us to think about our present print archives as being raw bits of data, until a researcher (typically a humanist or social scientist) "mines" them to answer a research question, which is similar to scientists or engineers consulting digital data in their research.

SB: Will e-Science change the way academic libraries function? Will it change the infrastructure and the services libraries provide?

JM: Many of our librarians (even those working in scientific and engineering disciplines) often have humanities or social sciences backgrounds. However, the trepidation that many librarians may have about sitting down with researchers and discussing their data management needs shouldn’t be a controlling factor. Once a librarian has the experience of talking with researchers about their research and the challenges they have with managing data, it becomes clear that the most important factor is not our subject expertise (although some subject understanding is needed) but rather the librarian’s knowledge of metadata and taxonomies. In the old days we would have said that this is “cataloging and classification,” but today, to convey that we have morphed into a new role, it is best to use the more technical terminologies since it may help identify our “new” role as a cutting edge initiative and not be encumbered with past misperceptions. In fact, a few times I have seen researchers frustrated by librarians with significant subject expertise, who more or less intrude their subject knowledge into what the investigator is researching, while what investigators want is the library/archival science contribution to their team. We need to remember that and be proud of the special expertise that we as librarians bring to the research team.

The impact for libraries in the broadest sense is the recognition that we have an important role to ensure the archiving and preservation of important data sets that initially may not be apparent to the researchers or us. We need to be able to think of treating these data sets as important collections, which is not that dissimilar to how we have stewarded our print book and serial collections or our archives. Responsibility for digital data brings new challenges and cost models—ones that we will need to work through with our university administrations and develop further collaboration with our colleagues in research administration and information technology.

SB: What kind of problems do you see for librarians to be able to get involved in e-Science? Will faculty be willing to share raw data with outsiders and how could this potentially affect intellectual property rights?

JM: I have touched on some of the problems for librarians to become involved with e-Science; so I will focus on the second part of your question. And the simple answer, from my perspective, is, "it depends." The one thing we have learned from the work we have done so far with disciplinary faculty and their research is that no two disciplines have identical policies or principles guiding them about sharing data. When we at Purdue embarked on this work six years ago, we thought it was going to be simple to help researchers manage and share their data. However, that naïve assumption was soon disproved. Some disciplines share data through a central database available to all, while others keep their data "close to the vest" while the research is being undertaken and are willing to share it only when it is needed to document findings in a published research article.

The mandate by the NSF and the likelihood this will be adopted by other funding agencies will trump, possibly, the traditions of data sharing (or not) within a field. It will take some time before it becomes an accepted, required step of the process. The NSF mandate is a start, but ultimately it will gain acceptance when researchers themselves begin to see benefits of sharing data beyond what they have done in the past.

SB: How will e-Science affect the way research is performed and reported? What will be the consequences for the science and technology publishing field?

JM: Some of the effects have been discussed above; so I won’t go back over them here. But I will amplify some of the potential impact that may come from the availability of data and the requirements necessary to provide that access. During the past several years, the publishing industry has begun to assign digital object identifiers (DOIs) through the service provided by CrossRef. This has been very successful as it assigns a persistent identifier that will tag an article for retrieval, now and far into the future. The DOI serves somewhat like a barcode or ISBN, a unique tag that provides access to the article. So, with this ability to identify the article, there comes the concurrent need or desire to link relevant data to it. That initiative has been taken on by libraries around the world, through the development of the international organization called DataCite (http://datacite.org/). Its charge is to create a registry available to researchers throughout the world to permanently tag a data set, and provide enough description to allow for access and retrieval, if desired by a researcher. In the United States, the coordination and assignment of DOIs through DataCite is being undertaken by the California Digital Library (CDL), Purdue University Libraries and the Office of Scientific and Technical Information (OSTI) of the Department of Energy (DOE).

Creating DataCite and the assignment of DOIs is a major undertaking, not unlike what took place forty years ago with ISBN—the difference being that ISBN was a collaboration between publishers and national libraries, which had the reach and the clout to make it a standard in a short time and which were dealing with a finished product (a book). For DataCite, it is a few international libraries banding together to try to get this elephant headed in the right direction. At this time, the DOI assignment to a data set is not mandatory. There is a possibility, however, with OSTI recently joining DataCite, that the DOI assignment will become a requirement by funding agencies.

SB: I have done many interviews for the Chemical Information Bulletin, but this is the first time I am interviewing a dean of libraries. And I would like to ask you a question that all academic librarians are asking: how do you see academic libraries and the work librarians are doing changing in the next few years? As dean of libraries in such a prominent institution as Purdue, what changes are underway in your own libraries?

JM: There is a shift from the trend that was happening ten years ago, which was the reduction of the number of librarians and other professionals and the increase in the number of clerical and student staff. In the "post print" world, the effort necessary to acquire, check in, catalog, bind, and manage print collections has been significantly reduced. However, the work that needs to be done in collaboration with the faculty in the classroom and lab has increased.

At Purdue, librarians are full members of the professorial faculty, and with that comes an expectation that they not only ensure that the Purdue Libraries operate using sound library science principles, but that the latest initiatives be evaluated and integrated if deemed appropriate into the operations and services of the Libraries. However, in order to extend the work of the librarians, it is becoming clearer and clearer that we need to move much of the day-to-day management and such services as reference and cataloging/metadata operations to another tier of professional and clerical staff, trained and able to do these operations. This frees up the librarians to collaborate on information literacy instruction, research team collaboration, and research in the areas of changing scholarly communication models. If anyone came or is coming to librarianship thinking it would be a static, complacent, and quiet place to work, they may want to reconsider!

SB: On a personal note, could you tell us about something that interests you besides information science and librarianship?

JM: One of the great advantages of being a librarian is that we have the ability to explore so many aspects of knowledge and to follow the curiosity that I believe is an important trait that all librarians must have. Although I have a great love of travel and a commitment to international librarianship through participation in IFLA and IATUL, I don't consider that as my sideline interest, as it is still, for the most part, professional. I can give you an example of what I am reading for pleasure, pure enjoyment—and that is about the beginning of the Cold War, from the end of World War II and through the 1960's, into the Vietnam Era. Being a child during the 1950's, I remember so well our fear of the Chinese and the Soviets/Russians and the competition that was in place to out-achieve the Soviets in science and technology. We were aware that we could be destroyed any day by nuclear war, but as a child I really had no idea what the reason was. I remember watching as a boy in the 1950's an old WWII movie made during the War, where the sailors on an American ship began cheering when they realized that the planes they saw overhead were Russian and not Japanese. I remember asking my mother how could that be, and her answer was that they were our allies in the War. In the 1950s that seemed inconceivable. A little like today when we think of Iran. Therefore, I am reading about the beginning of the Cold War period and just finished an excellent book, The Lost Peace: Leadership in a Time of Horror and Hope, 1945-1953, by Robert Dallek.

SB: It is an interesting coincidence that for this issue I also interviewed Dr. Michael Gordin, who has done extensive research on the beginning of the Cold War and has published books on that period. Thank you, Dean Mullins, for discussing e-Science and for your personal insights.

Michael Gordin

Political, Cultural, and Technological Impacts on Chemistry

An Interview with Michael Gordin, Director of Graduate Studies of the Program in the History of Science, Princeton University

By Svetla Baykoucheva

Michael Gordin is the Director of Graduate Studies of the Program in the History of Science at Princeton University. He has done extensive research on the history of the modern physical sciences and Russian history. He earned his A.B. (1996) and his Ph.D. (2001) from Harvard University and served a term at the Harvard Society of Fellows. He has published articles on the introduction of science into Russia in the early 18th century, the history of biological warfare in the late Soviet period, the relations between Russian literature and science, and a series of studies on Dmitrii I. Mendeleev. His book on the life and chemistry of Mendeleev [1] is considered the most comprehensive and authoritative study published on the formulator of the periodic table of elements. Dr. Gordin has also worked extensively in the early history of nuclear weapons and is the author of Five Days in August: How World War II Became a Nuclear War (2007) [2], a history of the atomic bombings of Japan during World War II, and of an international history of nuclear intelligence, Red Cloud at Dawn: Truman, Stalin, and the End of the Atomic Monopoly (2009) [3]. He has also co-edited the four-volume Routledge History of the Modern Physical Sciences (2001), Intelligentsia Science: The Russian Century, 1860-1960 (2008) [4], and Utopia/Dystopia: Conditions of Historical Possibility (Princeton, 2010) [5]. He is now working on a history of the modern category of "pseudoscience" in postwar America, from the age of McCarthy to the counterculture, centering on the sensational career of Immanuel Velikovsky (1895-1979), whose 1950 best-seller, Worlds in Collision [6], sparked three decades of controversy over the boundaries of legitimate science. Professor Gordin teaches lecture courses in the history of modern science, technology and society, and translation in the history of science, as well as seminars on nuclear-weapons history, the history of pseudoscience, the Soviet science system, and biography.


Svetla Baykoucheva: The United Nations has designated 2011 as the International Year of Chemistry, and I am very pleased to be able to interview someone who has performed such extensive research in the field of history of chemistry. Your book on Dmitrii Mendeleev [1] shows deep understanding not only of chemistry, but also of the socio-political environment in Russia at the time. How does the cultural milieu of an epoch, a country, a region, or an organization influence the developments in science and the public attitude about it?

Michael Gordin: This is a great question, and in many ways it is the central concern of the history of science, and clearly there is no straightforward answer to it. There are many factors that influence the development of science at any particular time and place: the experimental equipment and resources available to the scientist, his or her level of education and preparation, access to communication from other scientists, and the general state of science at the time, to name just a few. Some of these factors are pretty tightly bound with intellectual matters, and some of them are more broadly social or cultural, and I think it would be an error to rule out any particular factor by fiat. In some cases, such as Mendeleev's, the need to reform the pedagogy of chemistry for students in St. Petersburg proved crucial to his creating a framework for organizing the elements which eventually grew into the periodic system we know today. The concerns were both social and political (how do you educate a large number of students who have inadequate preparation) and intellectual (the rapidly expanding knowledge of the properties of elements, especially their atomic weights, in the 1860s). That’s not to say we wouldn’t have a periodic table without educational reform in Russia — far from it, as we know by the existence of multiple competing systems. Rather, I mean to say that the form we received has a great deal to do with the specifics of that time and place; the content is a more nuanced philosophical matter. The purpose of the history of science is to elucidate all these various factors and point to their relative weights in specific episodes.

SB: Two of your books (Five Days in August: How World War II Became a Nuclear War [2] and Red Cloud at Dawn: Truman, Stalin, and the End of the Atomic Monopoly [3]) were devoted to nuclear proliferation in the context of the Cold War. How do these topics relate to the history of chemistry?

MG: My colleagues often ask me the same thing. Nuclear weapons in the early Cold War, after all, are indeed a long way from Mendeleev and Imperial St. Petersburg. Certainly as topics they are pretty different, but as ways of investigating the past they are not that far apart. One of the great challenges in writing the history of science is avoiding what we call "Whiggish" interpretations of history; that is, writing a history of the past which leads inevitably to the present, placing the end of the story right there in the beginning. This kind of presentist version of history is very tempting in the history of science, because science’s achievements are so obvious, and seem so inalterable. The important point, from the historical point of view, is that they were not obvious to the scientists engaged in making the discoveries. They were beset by uncertainties, alternatives, doubts, and vigorous arguments. It is the historian's task to capture those uncertainties and show the past as it unfolded, not tell a just-so story for the present. Well, after publishing the Mendeleev book, I found myself grabbed by a set of questions concerning the early nuclear arms race, and wanted to see if the same approach would yield results there, even if these weren’t, strictly speaking, classic "history of science" questions. For example, in Five Days in August, I focused on how American military officials, politicians, and scientists thought about the atomic bomb in the period before the surrender of the Japanese government in August 1945, and especially in the five days between the bombing of Nagasaki and that surrender. At that time, no one could say that the bomb "ended the war," because the war was not yet over; so how did they think about it? Was it a revolutionary weapon or not? And in Red Cloud at Dawn, I concentrated on the period between the end of World War II and the detonation of the first Soviet atomic device in August 1949, in order to explore how people on both the American and Soviet sides evaluated the arms race before, strictly speaking, any such race existed. The approach is heavily indebted to the history of chemistry, even if the topics aren’t. To be honest, I’m looking forward to returning to more chemical questions now that I have spent all this time with nuclear weapons.

SB: You are the co-editor of a monograph, Intelligentsia Science: The Russian Century, 1860-1960, for which you also wrote an essay on the Heidelberg Circle — a group of Russian chemists who specialized in Germany and who later founded the Russian Chemical Society. Who were these people and what impact did they have on the development of chemistry both in and outside Russia? What motivated them to choose chemistry as a career? What was the role of learned societies at that time?

MG: Russia entered the decade of the 1860s facing a series of severe challenges. In 1856 it had lost the Crimean War, a defeat which was interpreted by the elite and the intelligentsia as a sign that Russia was "backward" in significant ways with respect to the Western powers. They began to promote a series of military and fiscal reforms in an effort to modernize the state, the most famous of which was the abolition of serfdom in February, 1861. But the problem of technical modernization also occupied these decision makers, and they initiated a program to sponsor talented young scientists (and other scholars, like lawyers and physicians) to study abroad, absorb the very latest word in their specialties, and then return to Russia to help rebuild a self-sustaining community at home. And, to a great degree, it worked. Many of the leading lights of Russian chemistry, to pick the example I know best, and those behind the formation of the Russian Chemical Society in 1868, were part of this temporary emigration: Dmitrii Mendeleev, Aleksandr Borodin, Vladimir Markovnikov, and others. Each was drawn to chemistry for different personal reasons, but the choice was in a sense no surprise: chemistry was the most dynamic and exciting science at mid-century, and it was the science most well established in both St. Petersburg and Kazan, which trained these individuals to a level where they could take advantage of their sojourn abroad. As for learned societies, we see a proliferation of chemical societies all across Europe during this time period, and they served a crucial role in creating a national community of scholars who could communicate with each other, establish journals, and lobby their states and national industries for greater support of chemistry. As a step in the professionalization of chemistry, these societies were vital.

SB: In a chapter published in the same book, you characterized the Russian national style of scientific discourse as "theoretical, bold, impulsive, and stridently argumentative. It was the style of D. I. Mendeleev and V. V. Markovnikov. It was also the style of Emil Erlenmeyer." Are there national differences in the way scientists perform research and discuss scientific ideas and experimental results?

MG: Yes and no. At almost any point in the past two centuries (although, interestingly, not so much before then), you can find cases of scientists claiming that their work bears some specific "national style" in a laudatory sense, or that the manner of research of their competitors from another national context bears a deleterious national style. We can easily jot down a number of these crude stereotypes: Russians are impulsive and bold; Germans are nit-picking and meticulous; the French are abstract and conceptual; the Americans are pragmatic and application-oriented. I do not endorse any of these points of view as being accurate descriptions of how people really were or are. Instead, in the article you mentioned I point to how certain Russians chose to brand themselves as being bold and speculative; the irony being that the person they were patterning themselves on most was Erlenmeyer, a German. These assertions of "national styles" have been over the years very important aspects of how scientists have understood their own activity, and as such they are significant for the historian to analyze. Some of them — such as the high level of mathematics found in certain chemical communities — can be traced to national educational systems and thus are more likely to bear a relationship to deeper processes, but many of the others are rhetoric. But, at the risk of belaboring a point: just because something is rhetoric doesn't mean it is historically insignificant.

SB: Which events and discoveries in the history of chemistry have happened unexpectedly and have become turning points for the development of science?

MG: This is a great question, and one that opens up a number of very interesting issues about how science has evolved over time. No one would doubt that unexpected events happen in the laboratory all the time — Becquerel leaving his uranium salts on top of some film in a drawer, for example. But it is pretty rare for something completely unexpected to happen, since the chemist has a certain collection of equipment and reagents available and is usually trying to accomplish something particular in the laboratory that day. As anyone who has spent any time in a laboratory knows, you don't always get what you expected, but that doesn't mean that the choices you have made have no impact on the set of unexpected outcomes that result. And if something completely unexpected were to happen, one which would have no framework in the concepts available to chemists at the time, then it would surely meet with a lot of resistance, as one finds with the way established chemists objected to the discovery of noble gases. (Mendeleev initially thought argon had to be N3, since the notion of an element that was chemically inert made no sense to him.) Generally, when an unexpected finding comes along in the historical record, closer investigation reveals that a certain group of chemists made a concerted effort to claim that it was a revolution in the science, and argued for thinking about this "unexpected" discovery as a confirmation of their prior theoretical arguments. This interplay between the serendipity of discovery and the hopefulness of theoretical speculation is one of the wellsprings of scientific creativity.

SB: You have taught a course on pseudoscience. What did you cover in that course?

MG: I find the topic of pseudoscience fascinating, and when I’ve taught this course I’ve covered a wide variety of topics that have been variously classified (not without controversy) as pseudosciences: astrology, alchemy, phrenology, mesmerism, spiritualism, creationism, cold fusion, Lysenkoism, eugenics, and others. In the course, we emphasized what we can learn about how science works from these rejected domains of knowledge. After all, no one calls themselves a "pseudoscientist"; every single person so designated thinks that they are engaged in real scientific work. They don’t have to be right about that, but there is a lot of interest in trying to understand them in their own terms.

SB: Although scientific fraud is seen much less often in chemistry than in the life sciences, cases like that of Hendrik Schön, from Bell Labs, shook the chemical community several years ago. Schön had published numerous articles before it was discovered that he had submitted the same data repeatedly. Many of his papers had to be retracted, including ones that were published in reputable journals such as Science and Nature. How can scientific fraud be prevented or, once it has happened, punished? Do you consider peer review capable of filtering bad science?

MG: With regard to Schön, there is an important distinction to be made. On the one hand, we have the category of "pseudoscience," which can be roughly defined as something that is not science but tries very hard to look like science and adopt its methods and approaches. That is not quite the same thing as "fraud," which connotes a level of insincerity that one doesn’t find, for example, among seventeenth-century alchemists. (There is a third category, the hoax, which is something else again.) Now, as to what can be done about any of these things, I do not have any particular insights. Wherever you find science, you will find something that scientists label pseudoscience; the two always come together. Fraud, if one subscribes to a particular model of psychology, is a matter of incentives, and it is possible that with intensified safeguards, one can reduce its occurrence. But we almost certainly can’t eliminate it altogether. Peer review, as you mention, is often put forward as a solution to this problem, and it is likely better than having no safeguard at all — at least this guarantees that a few scientists read over the piece before it is published — but the evidence of recent years has shown that it is far from foolproof in catching fraud. But, as in the case of Schön, eventually the misdeeds come to light. Time seems to be our best tool in this matter.

SB: There are some historians who are very passionate about "The Kekulé Riddle" (7). To chemists, the notion that it was Archibald Scott Couper and not Kekulé who found out that carbon is tetravalent, and that it was Johann Josef Loschmidt who drew the benzene ring for the first time, is quite surprising. What do you think of the claims that Kekulé has received credit for concepts in structural organic chemistry that had actually already been developed by others, such as Couper, Loschmidt, Ladenburg, Frankland and Butlerov? And does it matter, from a science historian's point of view, who was the first to make the discovery?

MG: Being first in making a discovery certainly matters to the scientist! And, in that sense, it does matter for historians of science, since the passions and debates of the scientists are one of the most important things we investigate. Personally, I have spent a lot of time researching the priority dispute over the periodic system between Mendeleev and Julius Lothar Meyer. I am not interested in deciding who was "right" — I don't think historians are in the business of awarding prizes or credit — but the fact that this fight took place, and the kinds of arguments Mendeleev and Meyer used to argue for who was first, makes for a fascinating story to uncover. For better or worse, our system of assigning credit in the sciences centers on priority, and the historian is obligated to explore why that particular system emerged, and what its consequences have been. With respect to Couper, Butlerov, Kekulé, and others — I’m afraid I am a spectator in that historiography and am not going to weigh in on one side or the other, but I can tell you my own particular approach to this kind of question. The fact remains that Kekulé was awarded the credit by his peers. I am personally more interested in why they thought he should receive the credit, rather than in adjudicating whether they were correct or incorrect in doing so.

SB: How are the current conditions in academia (I have in mind such things as wider collaborations, struggling for grants, requirements for tenure that include publishing in high-impact journals, pre-prints, open-access, etc.) changing the way research is performed, reported and credited?

MG: It's generally a bad idea for a historian to speculate on the future, but there is no question that there have been significant transformations in the way of doing science both inside and outside academia that are bound to have important implications for how various disciplines develop in the future. One obvious factor has to do with funding. On the one hand, science is continually becoming more expensive, and there are more scientists competing for a fixed (or in some cases shrinking) pool of funds. On the other hand, the linkages between academia and industry are becoming tighter now than they have typically been (at least in the American context, with which I am most directly familiar), and this is shaping questions that are asked within universities as well as those asked in industrial laboratories. Conditions of publication are also changing in interesting ways. The problem of "information overload" has been with us as long as we have had journals (which is over three hundred years), and probably even longer than that. There is simply so much information for researchers to keep abreast of, so many venues where it appears, and not enough time in the world to track it. Managing this volume of information is a tremendous challenge, and the Internet has both provided tools for addressing this issue and in other ways also compounded the problem. We are seeing strains in the peer review system (exemplified in the use of the pre-print server among physicists, as well as other experiments in open access) and also mounting costs for libraries. Without sufficient funding for research and access to information, science will suffer, or at the very least be forced to adapt. But I am not a pessimist on these questions. One of the most inspiring things about the history of science is how flexible scientists have been in adjusting to different conditions, and I am confident that while science will look different in thirty years than it did thirty years ago, the developments are going to be quite exciting.

SB: What projects have you been working on recently? What are you going to work on in the near future?

MG: I'm now beginning a large research project that connects with your query about the changes in chemistry in recent years. One of the most significant transformations in science over the last two hundred years has been the replacement of a polyglot community with an increasingly monoglot one. To take the example of chemistry, which is the focus of my research, in 1850 a chemist would be expected to be able to read, and to a lesser degree speak, German, English, and French. Today, almost no PhD program in chemistry requires any foreign-language competence at all, as the global production in chemistry becomes increasingly Anglophone. This is an extremely important development, and I believe there has not been enough attention to it aside from a dedicated group of sociological linguists based mostly in Germany. I am planning to write a history that spans from the decline of Latin as a language of scientific communication in the early eighteenth century, through the rise of national languages (including Russian), experiments with artificial languages like Esperanto, the fate of German (almost certainly the most important language in chemistry in the early twentieth century), and the current ascendancy of English. Before embarking on that, however, I am finishing another project related to my interests in pseudoscience as a way of exploring the history of science, with a book on the debates over the theories of Immanuel Velikovsky in Cold War America.

SB: Thank you for promoting the history of chemistry to a broad audience and for agreeing to discuss these interesting topics.

References

 

  1. Gordin, M. D., A Well-ordered Thing: Dmitrii Mendeleev And The Shadow Of The Periodic Table. Basic Books: 2004; p 384.
  2. Gordin, M. D., Five Days in August: How World War II Became a Nuclear War. Princeton University Press: 2007; p 226.
  3. Gordin, M. D., Red Cloud at Dawn: Truman, Stalin, and the End of the Atomic Monopoly. Farrar, Straus and Giroux: 2009; p 416.
  4. Intelligentsia Science: The Russian Century, 1860-1960. Gordin, M.; Hall, K.; Kojevnikov, A., Eds. University of Chicago Press Journals: 2008; p 316.
  5. Utopia/Dystopia: Conditions of Historical Possibility. Princeton University Press: 2010; p 264.
  6. Velikovsky, I., Worlds in Collision. Paradigma Ltd: 2009; p 436.
  7. Wotiz, J. H., Ed. The Kekulé Riddle: A Challenge for Chemists and Psychologists. Cache River Press: Clearwater, FL, 1993; p 329.

Book Reviews

Book Reviews: Scientific Writing

Robert E. Buntrock
buntrock16@myfairpoint.net

For this issue, several books on scientific writing will be covered, either with brief reviews or by citation. Writing is not only fundamental to the dissemination of information but is also a viable alternative career path for chemists and other scientists.

The first is on scientific communication, for both written and oral presentations.

 

Harmon, Joseph E.; Gross, Alan G. The Craft of Scientific Communication; University of Chicago Press: Chicago, 2010. 240 pp. $55 (hardcover), ISBN 978-0-226-31661-1; $20 (paper), ISBN 978-0-226-31662-8; $7 rental or $20 (electronic), ISBN 978-0-226-31663-5.

Although it gives little information on writing for chemistry, this is a good text and reference for writing and presenting science to both scientific and public audiences. The crafting of a scientific article is described, followed by four examples. Research proposals and communications to a lay audience are described next, followed by a discussion of style based on how good scientists actually write. Not all sentences need to be short and in the active voice, nor do long sentences need to be split (there is no mention of the "Fog Index", recommended by some technical editors for "executive summaries"). Method descriptions are similar to Julia Child’s recipes. Exercises with answers follow each chapter. Another deficiency is the lack of any mention of poor, cluttered, low-contrast PowerPoint slides. (Previously reviewed by J. Kovac, J. Chem. Educ., 87(11), 1139-1140, 2010, doi: 10.1021/ed100882.)

 

The second review covers communication of scientific information to the public.

 

Introducing Scientific Communication: A Practical Guide; Brake, Mark L., Weitkamp, Emma, Eds.; Palgrave Macmillan: New York, 2010. $33.95. (Hardcover) 177 pp., ISBN 978-02305373864.

This is an excellent text or reference for scientific journalism and for the presentation of science to policy makers and the general public. Scientific journalism is a viable but underutilized alternative career path for chemists and other scientists, and a few universities are developing courses for it. (Previously reviewed by R. Buntrock, J. Chem. Educ., 87(11), 1138-1139, 2010, doi: 10.1021/ed100855.)

Also reviewed in that issue of J. Chem. Educ. (by L. Montes, J. Chem. Educ., 87(11), 1138, 2010, doi:10.1021/ed100864) is The Oxford Book of Modern Scientific Writing, by Richard Dawkins. Shown and discussed are more than 80 examples of writing by prominent scientists including some Nobel Prize winners.

For chemists, the benchmark reference remains The ACS Style Guide: Effective Communication of Scientific Information, 3rd edition, by A. M. Coghill and L. R. Garson. (Previously reviewed by R. Buntrock, J. Chem. Inf. Model., 47(2), 703-704, 2007, doi: 10.1021/ci600536.)

As before, we’re always open to suggestions for books to review, and we welcome volunteer reviewers. With the demise of book reviews in JCIM, it’s up to us to "carry the torch" for book reviews on chemical information and related topics.
