Vol. 68, No. 3: Fall 2016

Chemical Information Bulletin

A Publication of the Division of Chemical Information of the ACS
Fall 2016 — Vol. 68, No. 3

Philadelphia, PA

Teri M. Vogel, Editor,
UC San Diego Library, San Diego
tmvogel@ucsd.edu

ISSN: 0364-1910
Chemical Information Bulletin,
© Copyright 2016 by the Division of Chemical Information of the American Chemical Society.

Message from the Chair

I realize that I begin almost every CINF bulletin “Message from the Chair” by commenting on where we are relative to the national meeting. The message either becomes a recap of what happened at the past national meeting, or an advertisement of what to expect from the CINF program and social events in the meeting yet to come. A major topic of discussion at the CINF executive meeting during the national meetings is what value we can provide, as a division, for the majority of our division members who do not attend a national meeting. We can provide them with slides, summaries and videos of the talks at our division symposia that they missed because they did not attend the national meeting. But again this is “national meeting-centric.” What is the value of CINF Division Membership?

Networking is routinely listed as one of the major reasons people join an ACS division or attend a national meeting. Membership in CINF can provide interactive networking opportunities but most of the networking occurs during social events at national meetings.  Is there some way in which we can extend networking to “virtual” online networking throughout the year through topical discussions related to chemical information or through chat and social media or tools such as “trello” or “slack,” or other networking or virtual meeting tools? CINF is currently developing an effective member email list group so that we will be able to more directly interact with our membership and share information.  Hopefully with more direct interactions we will encourage more of our membership to be involved and share ideas and more effectively network outside of the national meeting venue.

Another way to provide value to CINF division members would be to reach membership locally by having symposia at regional meetings or having local “brown bag CINF” groups in areas where there is a sufficient group or cluster of CINF members. CINF also participates in the development of online educational tools, webinars and tutorials such as the OLCC (http://olcc.ccce.divched.org/) and our webinar series (http://www.acscinf.org/content/webinars), as well as open source tools and programs in the area of chemical information or cheminformatics. If you are interested in participating as a speaker in our webinar series, please let us know. We are looking for engaging speakers in the area of chemical information and cheminformatics or related topics.

This bulletin is one way in which we reach out to our membership. If you have any ideas about how CINF can serve you more effectively or what activities you feel CINF should engage in to be more responsive to membership, please let us know! The names and email addresses of all CINF officers are listed on our website (http://www.acscinf.org/content/executive-committee), so please reach out to us with your suggestions. Or, better yet, contact us and join a committee, run for office, or start a new committee based on an idea of a new service/function CINF can provide its members!

CINF members will be receiving ballots shortly, so please don’t forget to vote! Voting information and candidate statements appear on the CINF website (http://www.acscinf.org/content/cinf-2016-elections-candidate-statements).

In a few weeks we will be meeting in Philadelphia. We have an excellent program planned, including a session cosponsored by MEDI, “Effectively Harnessing the World’s Literature to Inform Rational Compound Design”; a session cosponsored with CHED, “Bringing Cheminformatics Into the College Chemistry Classroom”; and a session cosponsored with BIOT, COMP and MEDI, “Shedding Light on the Dark Genome”. CINF is also cosponsoring an ANYL symposium on “New Directions in Chemometrics: Making Sense of Big and Small Chemical Data Sets”.

On Tuesday, please come and listen to our Herman Skolnik awardees, Dr. Evan Bolton and Dr. Stephen Bryant, and Skolnik symposium speakers.  I want to invite all who will be attending the Philadelphia meeting to our welcoming reception and poster session on Sunday evening August 21st 6:30-8:30 in the Loews Philadelphia Hotel, Howe room.  If you are attending, please approach me with your ideas of how CINF can serve you, its members.

Rachelle J. Bienstock, Chair,
ACS Division of Chemical Information
Rachelleb1@gmail.com

Letter from the Editor

Please enjoy the fall—even though it is not quite fall yet—pre-conference issue of the Chemical Information Bulletin (CIB). This is my first turn as editor after several years of serving as assistant editor for the fall and spring issues that Vincent Scalfani edited. We have decided to split the duties, so Vincent will continue as editor for the spring CIB.

In this issue, you will find Rachelle’s “Message from the Chair,” a report from our CINF Councilors (and congratulations to Bonnie Lawlor for 25 years of service as a CINF Councilor), and Bob Buntrock’s review of You Could Look It Up: The Reference Shelf from Ancient Babylon to Wikipedia. There’s also another installment of the best articles and blog posts about chemistry, science and popular culture, a continuation of the columns I wrote for the last two fall issues. One thing you’ll notice about this issue is that we did not include a copy of the CINF technical program, but you can access the PDF and HTML versions (with and without abstracts) from the CINF website. It looks like another interesting and thought-provoking lineup of symposia and speakers, and I’m sorry that I’ll be missing the conference.

Along with thanking our contributors and everyone who assisted with the production and copyediting, I want to give a special thanks to our generous sponsors, many of whom provided news and updates about their recent chemical information products. So please check their product news and stop by their booths at the meeting to find out more.

At the end of this issue is the contact information for the four CIB editors. You can contact any of us if you are interested in contributing to a future issue, even if you just have an idea of what you might want to write and want to bounce ideas off one of us.

Have a great time in Philadelphia!

Teri M. Vogel, Editor
UC San Diego Library
tmvogel@ucsd.edu

CINF Social Networking Events at the Fall 2016 ACS Meeting

Please Join Us At These Division of Chemical Information Events!

The ACS Division of Chemical Information is pleased to host the following social networking events at the Fall 2016 ACS National Meeting in Philadelphia, PA.


Sunday Welcoming Reception & Scholarships for Scientific Excellence Posters
6:30-8:30 pm, Sunday, August 21st – Howe Room, Loews Philadelphia Hotel
Reception co-sponsored by: Journal of Chemical Information & Modeling, ACS Publications, EPA National Center for Computational Toxicology, Genentech, and PerkinElmer.

Scholarships for Scientific Excellence Sponsored exclusively by ACS Publications.


Tuesday Luncheon (Ticketed Event – Contact Division Chair, Rachelle Bienstock)
12:00-1:30 pm Tuesday, August 23rd – Howe Room, Loews Philadelphia Hotel
Sponsored exclusively by the Royal Society of Chemistry.

Speaker: Dr. James Voelkel,
Curator of Rare Books, Othmer Library of Chemical History, Resident Scholar, Beckman Center for the History of Chemistry, Chemical Heritage Foundation

Presentation: Isaac Newton’s Alchemy in XML: The Chymistry of Isaac Newton Project and the proposed online Chymical Encyclopedia, Database, and Repository.

Preparing an online edition of Isaac Newton’s alchemical papers presented a digital humanities challenge. The papers are replete with encoding challenges, such as alchemical symbols, paleography, abbreviations and citations, not to mention all the ordinary manuscript conventions of deletions, insertions, and rearrangement. This paper will present the work undertaken by the Chymistry of Isaac Newton team, and where the digital history of alchemy goes from here. It has become clear that true understanding of alchemical practice requires reenactment of alchemical laboratory procedures. The proposed online Chymical Encyclopedia, Database, and Repository will be an attempt to widen the field of study and involve practicing chemists as well as historians, conservators, and archaeologists, in the reconstruction of chemical practice before Lavoisier.


Herman Skolnik Award Symposium & Reception Honoring Dr. Evan Bolton & Dr. Steve Bryant

Symposium: 8 am-5 pm Tuesday, August 23rd – Room 112A/B - Pennsylvania Convention Center

Reception: 6:30-8:30 pm Tuesday, August 23rd – Howe Room, Loews Philadelphia Hotel

Co-sponsored by: Journal of Chemical Information & Modeling, Bio-Rad Laboratories, Chemical Abstracts Service, Thomson Reuters, and DrugPatentWatch.

CINF Business Meetings

Saturday, August 20: 1:00-3:00 PM

  • Education Committee – Pennsylvania Convention Center 104B
  • Awards Committee - Pennsylvania Convention Center 104A
  • Program Committee - Pennsylvania Convention Center 102A

Saturday, August 20: 3:00-6:00 PM

  • Executive Committee - Pennsylvania Convention Center 103A

Sunday, August 21: 12:00-2:00 PM

  • Chemical Structure Association Trust - Pennsylvania Convention Center 109A

Awards and Scholarships

Chemical Structure Association Trust

Applications Invited for CSA Trust Grant for 2017

The Chemical Structure Association (CSA) Trust is an internationally recognized organization established to promote the critical importance of chemical information to advances in chemical research. In support of its charter, the Trust has created a unique Grant Program and is now inviting the submission of grant applications for 2017. The deadline for receipt of proposals for the 2017 Grant is also being announced at this time.

Purpose of the Grants

The Grant Program has been created to provide funding for the career development of young researchers who have demonstrated excellence in their education, research or development activities that are related to the systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. One or more Grants will be awarded annually up to a total combined maximum of ten thousand U.S. dollars ($10,000).  Grantees have the option of payments being made in U.S. dollars or in British Pounds equivalent to the U.S. dollar amount. Grants are awarded for specific purposes, and within one year each grantee is required to submit a brief written report detailing how the grant funds were allocated. Grantees are also requested to recognize the support of the Trust in any paper or presentation that is given as a result of that support.

Who is Eligible?

Applicant(s), age 35 or younger, who have demonstrated excellence in their chemical information related research and who are developing careers that have the potential to have a positive impact on the utility of chemical information relevant to chemical structures, reactions and compounds, are invited to submit applications.  While the primary focus of the Grant Program is the career development of young researchers, additional bursaries may be made available at the discretion of the Trust. All requests must follow the application procedures noted below and will be weighed against the same criteria.

Which Activities are Eligible?

Grants may be awarded to acquire the experience and education necessary to support research activities, for example, for travel to collaborate with research groups, to attend a conference relevant to one’s area of research (including the presentation of an already accepted research paper), to gain access to special computational facilities, or to acquire unique research techniques in support of one’s research.

Application Requirements

Applications must include the following documentation:

  1. A letter that details the work upon which the Grant application is to be evaluated as well as details on research recently completed by the applicant;
  2. The amount of Grant funds being requested and the details regarding the purpose for which the Grant will be used (e.g. cost of equipment, travel expenses if the request is for financial support of meeting attendance, etc.). The relevance of the above-stated purpose to the Trust’s objectives and the clarity of this statement are essential in the evaluation of the application;
  3. A brief biographical sketch, including a statement of academic qualifications;
  4. Two reference letters in support of the application.

Additional materials may be supplied at the discretion of the applicant only if relevant to the application and if such materials provide information not already included in items 1-4. A copy of the completed application document must be supplied for distribution to the Grants Committee and can be submitted via regular mail or e-mail to the Committee Chair (see contact information below).

Deadline for Applications

The application deadline for the 2017 Grant is March 31, 2017. Successful applicants will be notified no later than May 9, 2017.

Address for Submission of Applications

The application documentation can be mailed via post or emailed to: Bonnie Lawlor, CSA Trust Grant Committee Chair, 276 Upper Gulph Road, Radnor, PA 19087, USA. If you wish to enter your application by e-mail, please contact Bonnie Lawlor at chescot@aol.com prior to submission so that she can contact you if the e-mail does not arrive.

Chemical Structure Association Trust: Recent Grant Awardees

2016 - Thomas Coudrat

Monash University, Australia, was awarded a Grant to cover travel to present his work at three meetings in the United States: the Open Eye Scientific CUP XVI, The American Chemical Society Spring Meeting, and the Molsoft ICM User Group Meeting. His work is in ligand directed modeling.

2016 - Clarisse Pean

Chimie Paris Tech, France, was awarded a Grant to cover travel to give an invited presentation at the 2016 Pacific Rim Meeting on Electrochemical and Solid State Science later this year.

2016 - Qian Peng

University of Oxford, England, was awarded a Grant to attend the 23rd IUPAC Conference on Physical Organic Chemistry. His research is in the development of new ligands for asymmetric catalysis.

2016 - Petteri Vainikka

University of Turku, Finland, was awarded a Grant to spend the summer developing and testing new methods for modeling organic solvents in organic solutions with Dr. David Palmer and his group at the University of Strathclyde, Glasgow, Scotland.

2016 - Qi Zhang

Fudan University, China, was awarded a Grant to attend a Gordon Conference on Enzymes, coenzymes and metabolic pathways. His research is in enzymatic reactions.

2015 – Dr. Marta Encisco

Molecular Modeling Group, Department of Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Australia. She was awarded a Grant to cover travel costs to visit collaborators at universities in Spain and Germany and to present her work at the European Biophysical Societies Association Conference in Dresden, Germany in July 2015.

2015 – Jack Evans

School of Physical Science, University of Adelaide, Australia. He was awarded a grant to spend two weeks collaborating with the research group of Dr. Francois-Xavier Coudert (CNRS, Chimie Paris Tech).

2015 – Dr. Olexandr Isayev

Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill. He was awarded a Grant to attend summer classes at the Deep Learning Summer School 2015 (University of Montreal) to expand his knowledge of machine learning to include Deep Learning (DL). His goal is to apply DL to chemical systems to improve predictive models of chemical bioactivity.

2015 – Aleix Gimeno Vives

Cheminformatics and Nutrition Research Group, Biochemistry and Biotechnology Dept., Universitat Rovira i Virgili. He was awarded a Grant to attend the Cresset European User Group Meeting in June 2015 in order to improve his knowledge of the software that he is using to determine what makes an inhibitor selective for PTP1B.

2014 – Dr. Adam Madarasz

Institute of Organic Chemistry, Research Centre for Natural Sciences, Hungarian Academy of Sciences. He was awarded a Grant for travel to study at the University of Oxford with Dr. Robert S. Paton, a 2013 CSA Trust Grant winner, in order to increase his experience in the development of computational methodology which is able to accurately model realistic and flexible transition states in chemical and biochemical reactions.

2014 – MJosé Ojeda Montes

Department of Biochemistry and Biotechnology, University Rovira i Virgili, Spain. She was awarded a Grant for travel expenses to study for four months at the Freie University of Berlin to enhance her experience and knowledge regarding virtual screening workflows for predicting therapeutic uses of natural molecules in the field of functional food design.

2014 – Dr. David Palmer

Department of Chemistry, University of Strathclyde, Scotland. He was awarded a Grant to present a paper at the fall 2014 meeting of the American Chemical Society on a new approach for representing molecular structures in computers based upon ideas from the Integral Equation Theory of Molecular Liquids.

2014 – Sona B. Warrier

Departments of Pharmaceutical Chemistry, Pharmaceutical Biotechnology, and Pharmaceutical Analysis, NMIMS University, Mumbai. She was awarded a Grant to attend the International Conference on Pure and Applied Chemistry to present a poster on her research on inverse virtual screening in drug repositioning.

2013 – Dr. Johannes Hachmann

Department of Chemistry and Chemical Biology at Harvard University, Cambridge, MA. He was awarded the Grant for travel to speak on “Structure-property relationships of molecular precursors to organic electronics” at a workshop sponsored by the Centre Européen de Calcul Atomique et Moléculaire (CECAM) that took place October 22 – 25, 2013 in Lausanne, Switzerland.

2013 – Dr. Robert S. Paton

University of Oxford, UK.  He was awarded the Grant to speak at the Sixth Asian Pacific Conference of Theoretical and Computational Chemistry in Korea on July 11, 2013. Receiving the invitation for this meeting provided Dr. Paton with an opportunity to further his career as a Principal Investigator.

2013 – Dr. Aaron Thornton

Material Science and Engineering at CSIRO in Victoria, Australia. He was awarded the Grant to attend the 2014 International Conference on Molecular and Materials Informatics at Iowa State University with the objective of expanding his knowledge of web semantics, chemical mark-up language, resource description frameworks and other online sharing tools. He also visited Dr. Maciej Haranczyk, a prior CSA Trust Grant recipient, who is one of the world leaders in virtual screening.

2012 – Tu Le

CSIRO Division of Materials Science & Engineering, Clayton, VIC, Australia. Tu C. was awarded the Grant for travel to attend a cheminformatics course at Sheffield University and to visit the Membrane Biophysics group of the Department of Chemistry at Imperial College London.

2011 – J. B. Brown

Kyoto University, Kyoto, Japan. J.B. was awarded the Grant for travel to work with Professor Ernst-Walter Knapp at the Freie University of Berlin and Professor Jean-Philippe Vert of Mines ParisTech to continue his work on the development of atomic partial charge kernels.

2010 – Noel O’Boyle

University College Cork, Ireland. Noel was awarded the grant to both network and present his work on open source software for pharmacophore discovery and searching at the 2010 German Conference on Cheminformatics.

2009 – Laura Guasch Pamies

University Rovira & Virgili, Catalonia, Spain.  Laura was awarded the Grant to do three months of research at the University of Innsbruck, Austria.

2008 – Maciej Haranczyk

University of Gdansk, Poland. Maciej was awarded the Grant to travel to Sheffield University, Sheffield, UK, for a 6-week visit for research purposes.

2007 – Rajarshi Guha

Indiana University, Bloomington, IN, USA. Rajarshi was awarded the Grant to attend the Gordon Research Conference on Computer-Aided Design in August 2007.

2006 – Krisztina Boda

University of Erlangen, Erlangen, Germany. Krisztina was awarded the Grant to attend the 2006 spring National Meeting of the American Chemical Society in Atlanta, GA, USA.

2005 – Dr. Val Gillet and Professor Peter Willett

University of Sheffield, Sheffield, UK.  They were awarded the Grant for student travel costs to the 2005 Chemical Structures Conference held in Noordwijkerhout, the Netherlands.

2004 – Dr. Sandra Saunders

University of Western Australia, Perth, Australia. Sandra was awarded the Grant to purchase equipment needed for her research.

2003 – Prashant S. Kharkar

Institute of Chemical Technology, University of Mumbai, Matunga, Mumbai. Prashant was awarded the Grant to attend the conference, Bioactive Discovery in the New Millennium, in Lorne, Victoria, Australia (February 2003) to present a paper, “The Docking Analysis of 5-Deazapteridine Inhibitors of Mycobacterium avium complex (MAC) Dihydrofolate reductase (DHFR).”

2001 – Georgios Gkoutos

Imperial College of Science, Technology and Medicine, Department of Chemistry, London, UK. Georgios was awarded the Grant to attend the conference, Computational Methods in Toxicology and Pharmacology Integrating Internet Resources (CMTPI-2001), in Bordeaux, France, to present part of his work on internet-based molecular resource discovery tools.

Committee Reports

Report on the Council Agenda for August 24, 2016

The Council of the American Chemical Society will meet in Philadelphia, PA on Wednesday, August 24, 2016 from 8:00 am until approximately 12:00 pm in the Grand Ballroom Salon E-H of the Philadelphia Marriott Hotel.  All ACS members are welcome to attend, although only Councilors are permitted to vote.  A continental breakfast is usually available at 7:00 am for all attendees.

There are eight items for Council Action and they are summarized below.

Nominations and Elections

Elections will be held to fill open slots on the following Committees:

Committee on Committees: Council will vote to fill five slots on the Committee on Committees.  There are ten nominees as follows: Gary D. Anderson, Sandra J. Bonetti, Dee Ann Casteel, D. Richard Cobb, Jaqueline A. Erickson, Emilio X. Esposito, Robert J. Hargrove, Martha G. Hollomon, Wayne E. Jones, Jr., and Stephanie J. Watson. The five candidates receiving the highest numbers of votes will be declared elected for the 2017-2019 term.

Council Policy Committee: Council will vote to fill seven slots on the Council Policy Committee. There are fourteen nominees as follows: Harmon B. Abrahamson, Karl S. Booksh, Dwight W. Chasar, Ell L. Davis, Lissa A. Dulany, Gregory M. Ferrence, John W. Finley, Doris I. Lewis, Kim M. Morehouse, Barbara E. Moriarty, Sally B. Peters, Martin D. Rudd, Julianne M. D. Smist, and Andrea B. Twiss-Brooks. The five candidates receiving the highest numbers of votes will be declared elected for the 2017-2019 term, and the two candidates receiving the sixth and seventh highest numbers of votes will be elected for a one-year term in 2017.

Committee on Nominations and Elections: Council will vote to fill six slots on the Committee on Nominations and Elections. There are twelve nominees as follows: Anthony W. Addison, Spiro D. Alexandratos, Lisa M. Balbes, Alan M. Ehrlich, Stan S. Hall, Alan A. Hazari, Amber S. Hinkle, Neil D. Jespersen, James M. Landis, Jr., Thomas H. Lane, Will E. Lynch, and V. Michael Mautino. The five candidates receiving the highest numbers of votes will be declared elected for the 2017-2019 term and the candidate receiving the sixth highest number of votes will serve a one-year term in 2017.

Committee on Committees

The Committee on Committees will seek Council approval for the continuation of the ACS Committees whose five-year reviews are completed and accepted prior to the Council meeting.

Committee on Local Section Activities

The Committee on Local Section Activities is seeking approval for changes to two local section territories.

The Permian Basin Section in Texas requests approval for annexation of two unassigned counties: Brewster and Pecos. Permian Basin would like to incorporate Brewster county, which is the home of Sul Ross State University, and Pecos, the contiguous territory, which includes Ft. Stockton. The six members residing in the two unaffiliated counties were contacted and asked to respond to the petition. Both of the responding members supported the annexation.

The Upper Peninsula Section in Michigan requests approval for annexation of seven unassigned counties: Alger, Delta, Dickinson, Schoolcraft, Luce, Mackinac, and Chippewa. The section also requests reassignment of Menominee County, Michigan, currently in the territory of the Northeast Wisconsin Local Section. The twenty members residing in the seven unaffiliated counties of the Upper Peninsula were contacted and asked to respond to the petition. All seven of the responding members supported the annexation. The officers of the Northeast Wisconsin Local Section and the three members on record as residing in Menominee County, Michigan were also contacted. One member was recently deceased. One member was in favor, and one member was opposed to the petition. All of the officers of the Northeast Wisconsin Local Section supported the petition. (It should be noted that all ACS members currently assigned to a local section may choose to change their local section affiliation upon request, but may only belong to one local section at a time.) Annexation of all eight counties will unite the territory into a single local section representing all of the Upper Peninsula of Michigan.

Committee on Membership Affairs

The Committee on Membership Affairs will seek approval for a petition to extend the unemployed dues waiver for ACS members (this was presented for consideration at the San Diego Council meeting held earlier this year).

The petitioners propose changes to the ACS’s Bylaws to allow unemployed members of the Society to remain as members without paying dues for a period of up to three years. Bylaw XIII, Sec. 3, k currently provides for the dues to be waived for an unemployed member for a period of up to two years. The Committee on Membership Affairs (MAC) has conducted a three-year test and has prepared a Market Data Status Report (as of 5-14-15), which resulted in the following data:

Consecutive Years Unemployed    Count
1-year unemployed               882
2-years unemployed              331
3-years unemployed              179

Expanding this benefit to a third year prevented 179 members from being removed from membership in the Society. The extension of the benefit by another year may have small cost implications that are difficult to determine, but are far outweighed by the preservation of membership status for those individuals who have been unemployed as chemists for up to three years.

The Society Committee on Budget and Finance has examined this petition and cannot assess with reasonable accuracy the range of potential costs. Therefore, the financial impact of the petition is unknown. The Committee on Constitution and Bylaws (C&B) received a revised version of the petition, signed by a majority of the petitioners, to address C&B's concerns. C&B made one slight editorial change to the Explanation, which also was approved by a majority of the petitioners. The Committee finds the revised petition to be legal and consistent with other provisions of the Society's documents. A two-thirds (2/3) vote of Council is required for approval of amendments to the Bylaws. If approved by Council, the amendments will become effective upon confirmation by the Board of Directors.

Committee on Economic and Professional Affairs

The Committee on Economic and Professional Affairs seeks Council approval of its revised Chemical Professional’s Code of Conduct (CPCC). The revisions were presented to Council for consideration in San Diego earlier this year. The Code was last approved in 2012. After a rigorous review of the document, an updated version of the CPCC is included in the Council Agenda Book on page 93. The Council Agenda Book can be accessed at: https://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-8-16.pdf.

Committee on Constitution and Bylaws

The Committee on Constitution and Bylaws (C&B) is putting forth two proposed revisions to their Charter Bylaws templates, one for Divisions in probationary status and one for new local sections.  These appear on pages 97 through 113 of the Council Agenda book (see: https://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-8-16.pdf).

The Committee on Economic and Professional Affairs (CEPA) has developed revisions to the Chemical Professional’s Code of Conduct (CPCC). This was last approved in 2012. After a rigorous review of the document, an updated version of the CPCC is also included in the Council Agenda Book page 72. Please send any suggestions for further revisions to careers@acs.org before April 30, 2016 so that they can be incorporated into the revised document which will be up for Council action at the meeting in Philadelphia later this year. The Council Agenda Book can be accessed at: http://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-3.16.pdf.

C&B also has two proposed amendments for the ACS Bylaws - one a petition for the removal of ACS Officers and Councilors and a second on the rights of ACS affiliates.

Petition for the Removal of ACS Officers and Councilors

Currently no explicit authorization for the removal of Officers exists in either the SOCIETY Constitution or Bylaws. C&B recommends that Local Sections and Divisions include a procedure in their documents that enables them to remove officers for neglect of duties and C&B provides model language for that purpose. Up to this point, Councilors and Alternate Councilors have not been covered by the model language because Councilors and Alternate Councilors, although elected by the Local Sections and Divisions, are officials of a national body. The proposed amendments do two separate things. The first part of these amendments (to Bylaw III) will authorize Local Sections and Divisions to petition the Council Policy Committee to remove Councilors and Alternate Councilors for neglect of duties, misconduct, or injurious conduct, thus filling the gap identified above. In addition, five members of the Council may also file such a petition. The Council Policy Committee (CPC) is developing due-process procedures to evaluate such petitions and make removal decisions. These new procedures are derived from those developed by C&B for removal of local section or division officers, and must be approved by Council before this provision takes effect. Future changes to the procedures must also be approved by Council. The second and third parts of these amendments add provisions to the SOCIETY Bylaws that explicitly authorize the removal of an elected official of a Local Section or Division. The existing model language only applies to neglect of duties. The amendments would expand that authorization to any misconduct or conduct which tends to injure the Local Section or Division or to adversely affect its reputation or which is contrary to or destructive of its objects. Local Sections and Divisions could then expand the scope of the removal procedures if they amend their documents accordingly. Expansion of the scope is permitted, but not required.

Petition on the Rights of Affiliates

BYLAW II, Secs. 1 and 2 specify the rights of Local Section Affiliates and Division Affiliates; however, C&B believes that Local Section Affiliates and Division Affiliates should be allowed to be nonvoting members of the Executive Committee (or equivalent policymaking body), and to be appointed as Committee Chairs, if allowed in the bylaws of the Local Section or Division. Society Affiliates should also have the right to vote in Local Section and Division elections, except for Councilors and Alternate Councilors, if so decided by each Local Section and Division and if included in their bylaws. These changes would align the voting and participation rights within Local Sections and Divisions for Society Affiliates of a Division, Local Section Affiliates, and Division Affiliates. This would not change the rights of Society Affiliates as stated elsewhere in the ACS Governing Documents.

International Activities Committee

The International Activities Committee (IAC) seeks Council approval of three petitions to charter new International Chemical Sciences Chapters. These are as follows:

China National Capital Area International Chemical Sciences Chapter

One legal application has been received for the formation of a new international chemical sciences chapter to be known as the China National Capital Area (JingJinJi) International Chemical Sciences Chapter. The JingJinJi (China) International Chemical Sciences Chapter will consist of the territory of the provinces of Beijing, Tianjin and Hebei, and is not part of any other Chapter or Local Section of the Society. The petition was initiated and signed by ACS members in good standing and residing in the territory. The application meets all of the requirements of Bylaw IX of the Society, and includes a statement that the applicants are familiar with and will abide by all governing documents of the Society including specifically Bylaw IX Section 2(c), which states that the Chapter and its officers as representatives of the Chapter shall not engage in political activity, shall avoid any activities that may adversely affect the interests and/or public and professional image of the Society, and shall assure that all activities of the Chapter shall be open to all members of the Society. The application includes a proposed budget for the operation of the Chapter, which includes no allotment of funds from the Society. The petition has been reviewed by the ACS Joint-Board Committee on International Activities (IAC). This action seeks the approval of the Council and is contingent on the approval from the ACS Board of Directors, after which, the Chapter will begin operation.

Iraq International Chemical Sciences Chapter

One legal application has been received for the formation of a new international chemical sciences chapter to be known as the Iraq International Chemical Sciences Chapter. The Iraq International Chemical Sciences Chapter will consist of the territory of Iraq, and is not part of any other Chapter or Local Section of the Society. The petition was initiated and signed by ACS members in good standing and residing in the territory. The application meets all of the requirements of Bylaw IX of the Society, and includes a statement that the applicants are familiar with and will abide by all governing documents of the Society including specifically Bylaw IX Section 2(c), which states that the Chapter and its officers as representatives of the Chapter shall not engage in political activity, shall avoid any activities that may adversely affect the interests and/or public and professional image of the Society, and shall assure that all activities of the Chapter shall be open to all members of the Society. The application includes a proposed budget for the operation of the Chapter, which includes no allotment of funds from the Society. The petition has been reviewed by the ACS Joint-Board Committee on International Activities (IAC). This action seeks the approval of the Council and is contingent on the approval from the ACS Board of Directors, after which, the Chapter will begin operation.

South Western China International Chemical Sciences Chapter

One legal application has been received for the formation of a new international chemical sciences chapter to be known as the South Western China International Chemical Sciences Chapter. The South Western China International Chemical Sciences Chapter will consist of the territory of the provinces of Sichuan, Chongqing, Guizhou, and Yunnan, and is not part of any other Chapter or Local Section of the Society. The petition was initiated and signed by ACS members in good standing and residing in the territory. The application meets all of the requirements of Bylaw IX of the Society, and includes a statement that the applicants are familiar with and will abide by all governing documents of the Society including specifically Bylaw IX Section 2(c), which states that the Chapter and its officers as representatives of the Chapter shall not engage in political activity, shall avoid any activities that may adversely affect the interests and/or public and professional image of the Society, and shall assure that all activities of the Chapter shall be open to all members of the Society. The application includes a proposed budget for the operation of the Chapter, which includes no allotment of funds from the Society. The petition has been reviewed by the ACS Joint-Board Committee on International Activities (IAC). This action seeks the approval of the Council and is contingent on the approval from the ACS Board of Directors, after which, the Chapter will begin operation.

Town Hall Meeting

A Town Hall meeting organized by the Committee on Nominations and Elections is scheduled for Sunday, August 21, 2016 in the Liberty Salon A/B of the Philadelphia Marriott Hotel from 4:30pm - 5:30pm. It will highlight a Q&A session with the candidates for Director-at-Large.  All ACS members are encouraged to attend.  It is a great way to gather first-hand information and decide for whom you might want to vote in the fall election.

Note: The Council Agenda Book can be accessed at: http://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-3.16.pdf.

Respectfully submitted July 23, 2016

CINF Councilors
Bonnie Lawlor
Andrea Twiss-Brooks
Svetlana N. Korolev

Book Review: You Could Look It Up

Lynch, J., You Could Look It Up, The Reference Shelf from Ancient Babylon to Wikipedia, Bloomsbury Press. New York, NY, 2016. 464 p. + x. ISBN 978-0-8027-7752-2, $64.

The title of this wonderful book takes me back to high school English class in the ‘50s where the motto for the inevitable library research project was “You can look it up.”  Interestingly, even though I have ended up using reference materials for over 60 years and making a living at it for 45 years, I don’t remember much about those library exercises, except that they often involved questions not of interest to me or looking up stuff that I already knew (I did have the nickname of “the walking encyclopedia”).

Nevertheless, the last 60 years have made this book even more interesting.  In the Prologue, Jack Lynch (Professor of English, Rutgers Newark) observes that reference works require writing or written words since it is impossible to record oral histories without them.  Reference books are by definition “big” and designed for “looking up” rather than reading all the way through (although some users have done so).  Books have readers, reference books have users.

This book is far from comprehensive but it does describe 50 reference “books”, by pairs, in 25 chapters, each pair on related topics.  Each chapter is designed to answer a number of questions including “what need is answered”, “who wrote it”, “what were their motives”, “what is in and what is not”, and reception and impact by and on society and culture.  A data box for each work describes the work’s (often the first edition) properties and measurements, dimensions, number of pages/volumes, even weights.  The works range from The Code of Hammurabi (1754 BCE, 7 feet by 2 feet, 4 tons) to the Guinness Book of Records (1955, 10 x 7.5 in., 4000 entries, 198 pages).

Possibly of even more interest than the primary 25 chapters are the secondary “half chapters” which further describe various aspects of reference material in general historical and cultural context.  Topics include alphabetization, personal organization of reference works, why works went out of print, and 4 pages of esoteric or trivial information sources (subjective and only a fraction of those available) in Chapter 22-1/2, “Unlikely Reference Books”.

Of special interest to me is Chapter 3-1/2, “The Rise and Fall of Alphabetization”.  I have long been fascinated by alphabets and alphabetization and have wondered how the various alphabetic orders were established.  Unfortunately, I still do not have an answer since Lynch knows of no explanation.  The first written media were not really alphabets but were symbols that stood for entire words or syllables.  Sumerian cuneiform was the first (3300 BCE) with 1000 characters, next came Egyptian hieroglyphics (3200 BCE), and then Chinese, currently with 50,000 characters.  Alphabets were derived by Semitic speakers in central Egypt about 2000 BCE and by the time they reached Phoenicia about 1050 BCE (the Phoenicians then distributed the concept through trade), the symbolism was lost and each letter represented a sound.  (However, one source (1) maintains that the Hebrew alphabet retains symbolism and numeracy.)  Alphabets brought efficiency to reading but alphabetic order did not arise until much later.  Reference works were organized thematically for millennia but alphabetic order appeared in European reference works about 1300 CE.  However, it was so unfamiliar that instructions had to be given for use (until recently, something mastered by most elementary school students).  The need for alphabetic order is decreasing as print resources morph into electronic.  I recently witnessed this when my 12-year-old grandson was helping me work a crossword puzzle and had difficulty looking up words in our collection of “essential” crossword dictionaries and references.  He and his fellow students have not had to use alphabetic order in the age of Google.

Another chapter of interest is 15-1/2, “Out of Print”.  Reference works, even those which have appeared in several editions, cease for a number of reasons, including political change (e.g., Soviet supplanting Czarist Russian), new technology (calculators supplanting log and sine tables), and the rise of the internet.  The 2010 Encyclopedia Britannica print edition will be the last and no plans are announced for the Oxford English Dictionary.  (For what it’s worth, I still have my books of 5-place log and trig tables as well as my 12- and 24-inch slide rules.)

Back to the topics of the reference works.  Library catalogues are described in Chapter 21.  The history and development of library catalogs is described, leading up to the featured works, Panizzi’s “Catalogue of Printed Books in the Library of the British Museum” and “The National Union Catalog, Pre-1956 Imprints…”.  These two are probably well known to library school graduates but were fascinating to this reviewer.  The immensity of these resources, still far from complete, as well as the development of LC 3x5 catalog cards and OCLC (and expansion beyond books to other materials) and progressions to online are described.  The quote by John Overholt (Harvard) is noteworthy: “Good cataloging is the foundation stone of librarianship.  If you have an item and can’t find it, you really don’t have it”.  The earliest library catalogs date from 660-300 BCE.  About 300 BCE, Callimachus prepared a list, on 120 scrolls, of ancient works by category.  Due to general illiteracy, numbers of libraries and books in the Western World lagged far behind those resources in the Arabic and Chinese cultures but the Gutenberg Revolution changed all that.

The controversial rise of “Index learning” is recounted in Chapter 21-1/2.  As far back as Pope and Swift, various authors condemned the increase in use of reference resources as undermining the full impact of the original and leading not only to reduced knowledge but to reduced wisdom as well.  Of course, Google is currently condemned not only for information overload but also for fostering shallow knowledge and an inability to attain wisdom.  Plus ça change …

Possibly of most interest to CINF members and readers of the CIB are Chapters 20 and 23.  The former covers medical resources and discusses the history and evolution of “Anatomy Descriptive and Surgical” (aka Gray’s Anatomy) and “Diagnostic and Statistical Manual of Mental Disorders” (aka DSM, 1958).  DSM-5, 2013, is much larger than the original, with 300+ conditions and 947 pages.  Chapter 23 describes The Merck Index (1889, $1) and the CRC Handbook of Chemistry and Physics (1913, $2).  The latest edition of The Merck Index is the 15th (2013) and has spawned two related reference resources.  Lynch’s description of The Merck Index is somewhat out of date since he acknowledges that much of the usage is online but does not mention the new website now administered by the Royal Society of Chemistry.  He also does not cite my reviews of various editions of The Merck Index, including the 15th, but cites a 2007 reference.  The Handbook of Chemistry and Physics has of course grown immensely and is in its 90th edition as well as online.  Described as a “franchise”, there are dozens of related CRC Handbooks.  Lynch closes the chapter by observing that these two resources are poorly acknowledged as essential parts of scientific advancements.

There are chapter notes at the rear of the book.  However, the citations are terse and each citation must be looked up in the Bibliography necessitating 2-3 bookmarks when reading.  Also included are a Glossary and Index.  Highly recommended to educators, historians, librarians of all stripes, and all of us who use reference materials for business or pleasure.

(1) Ouaknin, M.-A., Mysteries of the Alphabet, Abbeville Press, New York, 1999.

Robert E. (Bob) Buntrock
Buntrock Associates
Orono, ME

Editors’ Corner

Science and Popular Culture: Part 3

If you had asked me to guess which popular culture works would get covered in the media this last year for their use of science, Back to the Future 2 would not have been on the list. But October 21, 2015 is the date that Doc Brown and Marty McFly visit “in the future”. In honor of the day, Forbes ran a series of articles about the science and technology of BTTF2’s 2015 and how they compare with our reality. My favorite invention, introduced at the end of the first film, was the Mr. Fusion reactor that powered the upgraded DeLorean, so Carmen Drahl’s interview with a specialist in the control of fusion plasmas was a must-read. As Natalie Robehmed pointed out, some of the technology “predictions” of BTTF2 did come true, such as wearable technology and drones. Unfortunately, we will still have to wait for Mr. Fusion and the flux capacitor.

The movie that would have been at the top of that list was The Martian, not a surprise. There were a lot of articles written last fall and winter; even NASA took the opportunity to show how they are developing the technologies referenced in the film, like generating oxygen and recovering water. A good place to start is Mika McKinnon’s io9 article where she grades (good, bad, fascinating) the film’s science. One of my favorite scenes in the book is how Watney tries to make water for his crops by burning hydrazine. It was also in the film, and I was pleased it was mentioned in Caroline Framke’s Vox piece on the science behind five major plot points. As she did the previous year with Interstellar, astrophysicist Katie Mack also wrote about The Martian from the perspective of both scientist and “space enthusiast.”

There were not as many articles about Star Wars: The Force Awakens, though I enjoyed Rhett Allain’s article on three ways the new Starkiller base could actually work. But the best Star Wars-related work I read last year was actually about the economic costs to the Empire for those first two Death Stars. I followed Kelsey Atherton’s Popular Science post to Washington University professor Zachary Feinstein’s paper on arXiv. And just so I cannot be accused of bias, here is one about the less-than-appetizing appeal of Star Trek: The Next Generation’s food replicator.

On the tastier side of things, sixth season MasterChef contestant (and biochemistry major) Hetal Vasavada was interviewed by Jyllian Kemsley for C&EN. In it she talked about her love of chemistry and how that informed her culinary skills. I was amazed just watching Vasavada—a vegetarian—cook dishes on the show that she could not taste.

Now we move to DC and Marvel. Esther Inglis-Arkell reminded us why kryptonite cannot exist. And Wired posted a video about the science and special effects of Agent Carter’s “zero matter” and the devices designed to contain it.

Kyle Hill’s Beyond Science series at The Nerdist continues to investigate the science of your favorite fandoms. Among the more chemistry-focused questions he tried to answer this past year: “What poisons are inside the Joker’s laughing gas?” “What kind of poison is The Princess Bride’s iocane powder?” (I sense a theme here) and “How acidic is the Alien xenomorph’s blood?”

While I am pretty sure moon tea is not mentioned in the Game of Thrones television series, it does get a mention in the third novel, A Storm of Swords. As part of a recent “Science of Game of Thrones” blog carnival, Raychelle Burks wrote about the chemistry of the compounds behind moon tea.

I will wrap up with one of the more recent films where science plays an important role: Ghostbusters, and Joshua Sokol’s Wired article about the MIT physicists who contributed their expertise to the design of the proton packs and even one character’s campus office.

Teri M. Vogel
UC San Diego

Notes From Our Sponsors

Division of Chemical Information Sponsors Fall 2016

The American Chemical Society Division of Chemical Information is very fortunate to receive generous financial support from our sponsors. Their support allows us to maintain the high quality of the Division’s programming, to promote communication between members at social functions at the ACS Fall 2016 National Meeting in Philadelphia, PA, and to support other divisional activities during the year, including scholarships to graduate students in chemical information.

The Division gratefully acknowledges contributions from the following sponsors:

Gold
Journal of Chemical Information & Modeling (ACS Publications)

Silver
Bio-Rad Laboratories
Royal Society of Chemistry

Bronze
Chemical Abstracts Service
EPA National Center for Computational Toxicology
Genentech
PerkinElmer
Thomson Reuters
Thieme

Contributors
DrugPatentWatch

Opportunities are available to sponsor Division of Chemical Information events, speakers, and material. Our sponsors are acknowledged on the CINF web site, in the Chemical Information Bulletin, on printed meeting materials, and at any events for which we use their contribution. For more information please review the sponsorship brochure at http://www.acscinf.org/PDF/CINF_Sponsorship_Brochure.pdf. Please feel free to contact me if you would like more information about supporting CINF.

Graham Douglas
Chair pro tem, Fundraising Committee 2016
Email: sponsorship@acscinf.org
Tel: 510-407-0769

The ACS CINF Division is a non-profit tax-exempt organization with taxpayer ID no. 52-6054220.

Journal of Chemical Information and Modeling (ACS Publications)

The Journal of Chemical Information and Modeling is pleased to introduce a Special Issue focused on the “Community Structure-Activity Resource”. In collaboration with the Journal of Medicinal Chemistry, the journal also published a Virtual Issue on “Computational Methods for Drug Discovery and Design”. Additionally, the Journal of Chemical Information and Modeling has launched two new manuscript types: Application Note and Review. Application Note articles are informative peer-reviewed reports on novel software packages, databases, and web servers. Review articles are peer-reviewed topical overviews of general interest to the JCIM community. Please refer to the journal’s author guidelines for more details. For pre-submission inquiries, please email the journal at eic@jcim.acs.org.

ACS Publications

ACS Omega

Learn More about ACS Omega

ACS Omega, ACS’s newest open access journal, published its first articles earlier this summer. What makes ACS Omega unique? ACS Omega highlights quality research from all aspects of chemistry and has four prestigious co-editors from across the globe: Cornelia Bohne, Luis M. Liz-Marzán, Krishna N. Ganesh, and Deqing Zhang. Together they give the journal a unique global focus that promotes new ideas and research.

The journal also features expedited manuscript handling and simplified formatting requirements, allowing authors more time to focus on their research.

Publishing in ACS Omega is $2000, but membership and regional discounts are available. ACS Members receive a $500 discount, lowering the base price to $1500. Creative Commons CC-BY licenses are also available. Find out more by using our new ACS Omega Cost Calculator.

Manuscript Transfer

Last year, ACS Publications launched our Manuscript Transfer Service, allowing authors to easily have their manuscripts transferred to another journal based on recommendations from our editors. This service helps streamline the publications process.

Over the last few months, we have made improvements to this process. With the introduction of ACS Omega, authors have more options to find the best fit for their research.

Manuscript Transfer is an opt-in service. Authors choose to opt in during the submission process on ACS Paragon Plus.

If you have any questions, comments, or need any assistance, please do not hesitate to reach out to Michael Qiu, Library Relations Manager at M_Qiu@acs.org.

Royal Society of Chemistry

Celebrating 175 years of progress and people in the chemical sciences

As the oldest chemical society in the world, we’re proud to be celebrating our 175th anniversary this year.

Recent highlights:

From our origins as The Chemical Society in 1841, we have grown to become the professional body for more than 54,000 members in 125 countries.

In 2015, we published over 43,000 scholarly journal articles – a 20% increase on the year before.

We now invest around £4 million a year in chemistry education in the UK.

You can get involved with our 175th anniversary

As part of our anniversary celebrations, we’re asking our global community of members and supporters to dedicate 175 minutes to chemistry. From attending an event to supporting our campaigning activities to volunteering, there’s a whole variety of ways to get involved.

If you or your library users would like to take part, we’ve put together a list of activities to get you started. Or, for some additional inspiration, you can read about others who have already taken part.

We'd love to know how you’ve spent your 175 minutes, so please share your stories via our website or on Twitter using #time4chem.

In other news … RSC Advances is going open access

We’re making the world’s largest chemistry journal gold open access. From its first issue in January 2017, RSC Advances will move to a gold open access (OA) model, meaning more high-quality content available without subscription.

Since its launch in 2011, we've deliberately pushed the boundaries with RSC Advances, looking for new and unique ways to make the scientific developments we publish accessible to the widest possible audience.

The change will provide free access to a broader scope of high-quality work and offer new, affordable OA publishing options for authors around the world.

RSC Advances' article processing charge (APC) will be £750, one of the lowest in the industry. And for the first two years, this will be discounted to £500, with further discounts and waiver options also available.

Our aim is to support the scientific community and advance excellence in the chemical sciences. Converting a journal of this size cements our influence in OA publishing, putting us in a strong position to shape its future for the benefit of our community.

Bio-Rad Offers a Free ATR-IR & Raman Database

Bio-Rad KnowItAll ID Expert combines the world’s largest IR and Raman spectral database collections with the world’s fastest and most intelligent software for spectral identification.

  • As a member of CINF, you are eligible for a two-week trial of Bio-Rad’s KnowItAll ID Expert software along with their collection of 235,000 IR spectra and 13,000 Raman spectra.

  • At the end of the trial, you can continue to use the KnowItAll ID Expert software with a free 350 compound Sadtler database of ATR-IR and Raman spectra.

Register now at www.knowitall.com/trial and enter CODE 3CNZX

CAS, a Division of the American Chemical Society

CAS continues to provide unparalleled discoverability of chemistry with the recent launch of ChemZent.  ChemZent is a new solution available for purchase in SciFinder that delivers the complete collection of approximately three million abstracts from Chemische Zentralblatt, the oldest compendium of chemistry abstracts dating from 1830-1969.

If you are a chemistry faculty member teaching organic chemistry this fall, CAS invites you to participate in our voluntary beta of a new education solution that leverages SciFinder to enhance teaching and learning of organic chemistry. Organized by the topics taught in the classroom, Chemistry Class Advantage harnesses the power of SciFinder through carefully architected problems aimed at enhancing an undergraduate’s ability to use original literature to improve overall comprehension of organic chemistry fundamentals. The goal is to help students think more like researchers. Preview our short video, and visit the CAS booth in Philadelphia to learn more.

CAS also strengthened its commitment to science and innovation by selecting 26 outstanding international scientists to participate in the 2016 SciFinder Future Leaders Program.  The SciFinder Future Leaders program will be held August 15-20 in Columbus, OH and provides selected international Ph.D. students and postdoctoral researchers with opportunities to collaborate with CAS scientists, innovators and business leaders.  Visit the CAS website for dates and times of the various technical presentations given by past and present SciFinder Future Leaders.

About CAS

Dedicated to the ACS vision of improving people’s lives through the transforming power of chemistry, the CAS team of highly trained scientists finds, collects, and organizes all publicly disclosed substance information, creating the world’s most valuable collection of content that is vital to innovation worldwide. Scientific researchers, patent professionals and business leaders around the world rely on a suite of research solutions from CAS that enables discovery and facilitates workflows to fuel tomorrow’s innovation.

EPA National Center for Computational Toxicology: The iCSS Chemistry Dashboard – A New Online Resource for Environmental Scientists

A new web application, released by computational toxicology researchers within the Environmental Protection Agency (EPA), is a primary hub for navigating chemistry data related to environmental toxicology. The data and information available via the iCSS Chemistry Dashboard (https://comptox.epa.gov) support scientists in their research and investigations of environmental chemicals. The dashboard provides access to expertly curated experimental chemical property data and, when data are not available, uses modern computational modeling approaches to fill in the gaps. Such chemical property data are essential in a number of EPA research efforts, specifically for the development of environmental exposure and toxicity models. But now the data, the properties and a collection of additional resources have been brought together in one application to serve scientists around the world. How much data? Over 700,000 chemicals and approaching 10,000,000 experimental and predicted chemical properties are available via a web search, downloadable at the click of a button and even viewable on your favorite smartphone.

Figure 1: The Home Page for the Chemistry Dashboard

The dashboard is a new app in the armory of computational toxicologists developed by the EPA and is being developed as a central connection point to integrate across various other applications. This hub will be connected not only to other EPA tools that have been developed over the years, but also will link across many agency resources and to an array of public domain databases. The vast majority of EPA projects ultimately link to chemical substances and structures, whether they are identified by “Registry Numbers”, by systematic nomenclature or common names, or ideally in formats that are amenable to computational modeling for the purposes of prediction.

The new dashboard has been available for only a couple of months and is already garnering positive feedback from its users. New data, functionality and capabilities are already in development with the intention to seamlessly provide regular updates. Especially of interest to the users is a way for each user to inform the project team of any issues they see in the data, at a chemical by chemical level, so that this “crowdsourced feedback” can be used to improve the data for all users.

Software development is structured under an agile methodology and updates to both data and functionality will occur on a regular basis. Currently in testing is access to NCCT’s ToxCast in vitro data and to information on how specific chemicals are used within products. Of particular interest may be the detailed “Model Report” regarding a specific prediction model, its performance characteristics for a specific chemical, and applicability domain details for that chemical. A picture is worth a thousand words and the Figure below should illustrate the available details.

Figure 2: A Model Report for the prediction of logP for a particular chemical.
The report shows the overall performance of the model using 5-fold cross-validation
as well as five nearest neighbors from the training set

While the dashboard is being developed to support environmental chemistry in particular, we encourage all scientists to make use of the application and the available data. All chemistry data will be made available under Open Data licenses and an application programming interface is under development for third party integrations. We welcome your feedback and encourage you to check back regularly for updates to both data and functionality as the service expands.

Antony Williams
williams.antony@epa.gov

Thomson Reuters announces its newest innovation in scientific intelligence: Drug Research Advisor – Target Druggability, launching in Q1 2017

The three key decision-making phases of preclinical drug development have changed little over the decades:

  • Target Identification and validation

  • Drug Design & Hit screening

  • Lead optimization & DMPK review

However, the landscape in which these phases exist has changed dramatically in just the past 10 years. Contributing factors include the growth of NGS technologies, increased regulatory requirements around animal testing, Standard of Care benchmarking, and biomarkers, among others.

These new circumstances require researchers to deepen their understanding in the areas of integrated diseases; target mechanisms; drug design and pipeline; and patient population and variations. They also need accessible tools allowing them to apply common workflows and keep research moving efficiently.

Thomson Reuters’ IP & Science expert consultants are designing three application suites to meet the complex needs of drug development professionals:

  • Drug Research Advisor

  • Clinical Research Advisor

  • Disease Research Advisor

Thomson Reuters Figure 1

Figure 1: The Research Advisor suites from Thomson Reuters are workflow-based applications
with curated content designed to address the changing needs of the drug development community

The Disease Research Advisor and Clinical Research Advisor suites include several critical applications to support drug development, including Key Pathway Advisor (Disease Research Advisor), and Precision Medicine Intelligence and Cortellis Clinical Trials Intelligence (Clinical Research Advisor).

Thomson Reuters will launch Target Druggability, the first of the Drug Research Advisor suite’s three applications, in Q1 2017, with an early access program in 2016 for Integrity customers. Leveraging three collaborative applications, Drug Research Advisor will integrate the major steps of your preclinical research into a single cloud-based workflow. Using the latest visualizations and analytics, you will be able to interrogate all required manually curated content types in one space, enabling faster access to trusted decisions.

Target Druggability supports the first step of your drug discovery process: target identification, assessment and validation. By combining state-of-the-art visualizations with ranking and scoring algorithms, Target Druggability will help you to find and validate your next target ahead of the competition. Two additional applications, Drug Design and DMPK Reviewer, are anticipated in late 2017 and 2018, respectively.

For more information, to take part in the early access program, or to book a demonstration, please contact Leo Lafferty-Whyte (Leo.Lafferty-Whyte@ThomsonReuters.com) or Montse del Fresno (Montse.DelFresno@ThomsonReuters.com).

Thieme Chemistry Releases Science of Synthesis 4.4 Containing Domino Transformations and Latest Updates

With the release of Science of Synthesis 4.4, Thieme Chemistry expands its definitive knowledge base to include significant knowledge updates as well as two new reference library volumes on domino transformations in organic synthesis.

The latest addition to the Science of Synthesis reference library, Applications of Domino Transformations in Organic Synthesis (Vols. 1 and 2), edited by Scott A. Snyder, highlights the current state of the art in the rapidly changing field of domino/cascade-based transformations. The two volumes are organized by the core type of reaction used to initiate the domino event. Volume 1 covers polyene/cation-π cyclizations, the synthesis of polyether natural products by polyepoxide ring-opening, metathesis, radical, and metal-mediated reactions, and non-radical skeletal rearrangements. Volume 2 focuses on pericyclic reactions, for example sigmatropic shifts and ene reactions. Alkylative dearomatization reactions, additions to non-activated alkenes (e.g., halocyclizations), activated alkenes (e.g., Michael reactions, enamine/enol ether reactions), and to C=O and C=N bonds are also covered in this volume.

To ensure that the unique synthetic methodology tool continues to offer the most current and reliable information on chemical transformations, Science of Synthesis 4.4 includes knowledge updates comprising a total of 500 printed pages. Among the highlights of the latest knowledge update are new insights on the synthesis and applications of organometallic complexes of platinum and on the applications of organometallic complexes of iridium. A significant update on the synthesis of 1,2,3-triazoles focuses on the addition of azides to alkynes and alkenes in “click chemistry,” which has received much interest over the past few years. Furthermore, the recently expanded knowledge base includes updates on the synthesis of phthalazines and quinazolines. Both nitrogen heterocycle classes have been a recent focus of attention from the pharmaceutical industry. Highlights of the current update also include updates on the syntheses of various non-aromatic phosphorus-containing heterocycles, oxygen- and phosphorus-substituted alkynes, and nitrogen-substituted alkynes. The update on nitrosoalkenes covers new methods for the synthesis of these generally unstable compounds along with their applications, particularly in cycloadditions and 1,4-additions. A series of updates on the synthesis of haloalkanes rounds off the latest knowledge updates, covering synthesis by substitution of hydrogen atoms, metals or carbon functionalities, and other halogens or oxygen functionalities, as well as by addition across carbon-carbon multiple bonds. Contributors to the latest SoS Knowledge Update include A. Nomoto and A. Ogawa; H. Li and C. Mazet; A. C. Tomé; T. J. Hagen and T. R. Helgren; F.-A. Kang and S.-M. Yang; M. H. Larsen, M. Cacciarini, and M. Brøndsted Nielsen; K. Banert; H.-U. Reissig and R. Zimmer; G. Keglevich and A. Grün; J. Iskra and S. S. Murphree; M. C. Elliott and B. A. Saleh; F. V. Singh and T. Wirth; and U. Hennecke.

Science of Synthesis continues to be updated following established editorial processes with clearly defined criteria and discerning standards for method selection. New content will continually be added to the digital version, which prevails as the most up-to-date evaluated digital reference work available, reflecting the latest developments in synthetic methodology. All content is available with full text and graphics and can be searched by structure and reaction type.

To access Science of Synthesis 4.4 or get a free trial please visit: http://sos.thieme.com. For more information about Science of Synthesis please visit the website at www.thieme-chemistry.com/sos/.

DrugPatentWatch Announces New Features

In response to user requests, DrugPatentWatch now features email alerts. You can set up alerts for new drug approvals, addition or expiration of patents, Paragraph IV challenges, and other critical business information.

The new email alerts add to other value-added features such as data export and flexible flat-rate pricing. Data sets include US and International patent data, expired and active patents, Supplementary Protection Certificates, suppliers, formulation, and more.

For more information about DrugPatentWatch see http://www.DrugPatentWatch.com or email admin@DrugPatentWatch.com.

As a new CINF sponsor, DrugPatentWatch is offering CINF members a 25% bonus: get a free 3-month extension with the purchase of a one-year subscription. Contact admin@DrugPatentWatch.com and mention the CINF Fall 2016 offer to take advantage of this time-limited offer.

About DrugPatentWatch

DrugPatentWatch provides actionable business intelligence on small-molecule drugs and the 110,000 global patents covering them. Since its founding in 2002, DrugPatentWatch has been cited by CNN, NEJM, Nature Journals, and many other leading publications.

Use cases for the database include:

  • Branded pharmaceutical firms seeking competitive intelligence

  • Generic and API manufacturers seeking knowledge of which drugs to develop

  • Wholesalers seeking advance notice of patent expiry to avoid over-stocking off-patent drugs

  • Healthcare payers seeking to project and manage future budgets

Technical Program Listing

ACS Chemical Information Division (CINF)
252nd ACS National Meeting, Fall 2016
Philadelphia, PA (August 21-25, 2016)

CINF Symposia

Elsa Alvaro, Program Chair

[Created Fri Jul 29 2016, Subject to Change; Check ACS Online Program for Latest Changes]

CINF: Effectively Harnessing the World's Literature to Inform Rational Compound Design 8:25am - 11:45am
Sunday, August 21
Room 112A - Pennsylvania Convention Center
Daniel Ortwine, Organizing
Daniel Ortwine, Presiding
Cosponsored by MEDI
Financially supported by: Genentech
8:25am-8:30am Introductory Remarks
8:30am-9:05am CINF 1: PubChem’s literature and patent information for drug discovery
Sunghwan Kim, kimsungh@ncbi.nlm.nih.gov, Paul Thiessen, Tiejun Cheng, Bo Yu, Benjamin Shoemaker, Jiyao Wang, Evan Bolton, Yanli Wang, Steve Bryant

National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States

Abstract

9:05am-9:40am CINF 2: Harnessing the world’s literature to provide a crystallographic perspective on compound design: federated pharmacophore searching as example
Erin Davis, Ian Bruno, Paul Sanschagrin, sanschagrin@ccdc.cam.ac.uk

Cambridge Crystallographic Data Centre, Piscataway, New Jersey, United States

Abstract

9:40am-10:15am CINF 3: GOSTAR and ChEMBL comparison – commercial vs. open chemogenomics databases

Johannes Voigt, johannes.voigt@gilead.com, Uli Schmitz

Gilead Sciences, Foster City, California, United States

Abstract

10:15am-10:30am Intermission
10:30am-11:05am CINF 4: Exploring available compound data with the open PHACTS discovery platform and KNIME

Daniela Digles, daniela.digles@univie.ac.at, Gerhard Ecker

University of Vienna, Vienna, Austria

Abstract

11:05am-11:40am CINF 5: NDEx, the Network Data Exchange: a resource for biological networks with application in informed compound design
Dexter Pratt, depratt@ucsd.edu

School of Medicine, UCSD, La Jolla, California, United States

Abstract

11:40am-11:45am Concluding Remarks
CINF: Bringing Cheminformatics into the College Chemistry Classroom 8:15am - 12:05pm
Sunday, August 21
Room 112B - Pennsylvania Convention Center
Robert Belford, Sunghwan Kim, Organizing
Robert Belford, Sunghwan Kim, Presiding
Cosponsored by CHED
8:15am-8:20am Introductory Remarks
8:20am-8:40am CINF 6: Learning to find the right information: A survey of chemistry information literacy in the undergraduate classroom
Thibault Geoui, t.geoui@elsevier.com

Marketing, Elsevier, Frankfurt, Hesse, Germany

Abstract

8:40am-9:00am CINF 7: Co-developing chemical information management and laboratory safety skills
Ralph Stuart2, secretary@dchas.org, Leah McEwen1

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States

Abstract

9:00am-9:20am CINF 8: Introducing SIVVU, a web-based program for modeling spectrophotometric titration data

Douglas Vander Griend, dav4@calvin.edu

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States

Abstract

9:20am-9:30am Intermission
9:30am-9:50am CINF 9: Integration of cheminformatics material into the STEMWiki hyperlibrary

Robert Belford3, rebelford@ualr.edu, Delmar Larsen2, Andrew Cornell1

1 Department of Chemistry, University of Arkansas at Little Rock, Little Rock, Arkansas, United States; 2 Department of Chemistry, Univ California Davis, Davis, California, United States; 3 Department of Chemistry, Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States

Abstract

9:50am-10:10am CINF 10: Holistic approach to cheminformatics in a liberal arts environment

Philip Adler, padler1@haverford.edu

Chemistry, Haverford College, Haverford, Pennsylvania, United States

Abstract

10:10am-10:30am CINF 11: Cheminformatics education and research at home: the best way to teach graduate chemistry in the professional community
Hao Zhu, hao.zhu99@rutgers.edu

Chemistry Department, Rutgers University, Camden, New Jersey, United States

Abstract

10:30am-10:40am Intermission
10:40am-11:00am CINF 12: Fall 2015 cheminformatics OLCC project based learning: Validation of Wikipedia Chembox hazard information
Robert Belford, Brian Murphy, my59vw@gmail.com

Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States

Abstract

11:00am-11:20am CINF 13: Cheminformatics in the chemistry classroom
Denis Fourches, dfourch@ncsu.edu

Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States

Abstract

11:20am-11:40am CINF 14: Withdrawn
11:40am-12:00pm CINF 15: Modern cheminformatics tools in the teaching laboratory: A practical exercise simulating a drug discovery project

Chase Smith2, chase.smith@mcphs.edu, Tamsin Mansley1, tamsin.mansley@optibrium.com

1 Optibrium Ltd, Cambridge, Massachusetts, United States; 2 MCPHS University, Worcester, Massachusetts, United States

Abstract

12:00pm-12:05pm Concluding Remarks
CINF: Effectively Harnessing the World's Literature to Inform Rational Compound Design 1:25pm - 4:45pm
Sunday, August 21
Room 112A - Pennsylvania Convention Center
Daniel Ortwine, Organizing
Daniel Ortwine, Presiding
Cosponsored by MEDI
Financially supported by: Genentech
1:25pm-1:30pm Introductory Remarks
1:30pm-2:05pm CINF 16: Extracting and exploiting medicinal chemistry ADMET knowledge automatically from public and large pharma data

Alexander Dossetter1, al.dossetter@medchemica.com, Edward Griffen2, Andrew Leach2,3, Shane Montague2

1 MedChemica Limited, Macclesfield, United Kingdom; 2 Medchemica Ltd, Macclesfield, United Kingdom; 3 Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, Liverpool, United Kingdom

Abstract

2:05pm-2:40pm CINF 17: Extracting knowledge from large in-vitro metabolic stability data sets using matched molecular pair analysis (MMPA)
Hao Zheng, zheng.hao@gene.com

Discovery Chemistry, Genentech, South San Francisco, California, United States

Abstract

2:40pm-3:15pm CINF 18: Gravitational waves shaking the chemical universe: virtual chemistry 2.0
Carsten Detering, detering@biosolveit.com

BioSolveIT Inc, Bellevue, Washington, United States

Abstract

3:15pm-3:30pm Intermission
3:30pm-4:05pm CINF 19: Network analytics of structured and unstructured data: an evolutionary solution
Olivier Lichtarge, lichtarge@bcm.edu

Baylor College of Medicine, Houston, Texas, United States

Abstract

4:05pm-4:40pm CINF 20: Integrative data science, semantics, knowledge graphs, and evidence paths in the service of molecular discovery
Jeremy Yang2,1, jeremyjyang@gmail.com, Tudor Oprea2, David Wild1

1 School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States; 2 School of Medicine, University of New Mexico, Albuquerque, New Mexico, United States

Abstract

4:40pm-4:45pm Concluding Remarks
CINF: Beyond Citations: Challenges & Opportunities in Altmetrics 1:30pm - 4:55pm
Sunday, August 21
Room 112B - Pennsylvania Convention Center
Elsa Alvaro, Rachel Borchardt, Organizing
Elsa Alvaro, Rachel Borchardt, Matthew Hartings, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-1:55pm CINF 21: Altmetrics in the library
Anne Rauh, aerauh@syr.edu

Syracuse University, Syracuse, New York, United States

Abstract

1:55pm-2:15pm CINF 22: Trusting altmetrics: updates from NISO's recommended practices
Todd Carpenter, tcarpenter@niso.org

National Information Standards Organization (NISO), Baltimore, Maryland, United States

Abstract

2:15pm-2:35pm CINF 23: Tell the full story of your research with altmetrics
William Gunn, william.gunn@mendeley.com

Mendeley, Mountain View, California, United States

Abstract

2:35pm-2:55pm CINF 24: Is that a wart or a beauty mark? An altmetrics analysis of an assistant professor’s scholarly activity
Matthew Hartings1, hartings@american.edu, Rachel Borchardt2

1 Chemistry, American University, Gaithersburg, Maryland, United States; 2 American University, Washington, District of Columbia, United States

Abstract

2:55pm-3:15pm CINF 25: Imperfect impact
Stuart Cantrill, stuartcantrill@gmail.com

Nature Chemistry, Cottenham, United Kingdom

Abstract

3:15pm-3:30pm Intermission
3:30pm-3:50pm CINF 26: Advanced Research Projects Agency – Energy (ARPA-E): The mechanism and metrics of funding transformational technology for energy innovation
Daniel Cunningham, Daniel.Cunningham@hq.doe.gov

US Department of Energy, Advanced Research Projects Agency – Energy (ARPA-E), Washington, District of Columbia, United States

Abstract

3:50pm-4:10pm CINF 27: Responsible usage of diverse research metrics
Lisa Colledge, L.Colledge@elsevier.com

Elsevier, Amsterdam, Netherlands

Abstract

4:10pm-4:30pm CINF 28: Investigating impact metrics for performance for the US-EPA National Center for Computational Toxicology
Antony Williams, tony27587@gmail.com, Monica Linnenbrink, Kevin Crofton, Russell Thomas

National Center for Computational Toxicology, U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina, United States

Abstract

4:30pm-4:50pm CINF 29: Altmetrics: What has been the impact on ACS Publications?
Jeff Lang, j_lang@acs.org

ACS, Washington, District of Columbia, United States

Abstract

4:50pm-4:55pm Concluding Remarks
CINF: CINF Scholarships for Scientific Excellence 6:30pm - 8:30pm
Sunday, August 21
Howe - Loews Philadelphia Hotel
6:30pm-8:30pm CINF 30: Virtual nanoparticles

Wenyi Wang1, wwyi6@hotmail.com, Alexander Sedykh3, Linlin Zhao1, Bing Yan2, Hao Zhu3,1

1 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 2 Shandong University, Jinan, China; 3 Department of Chemistry, Rutgers University, Camden, New Jersey, United States

Abstract

6:30pm-8:30pm CINF 31: Experimental errors in QSAR modeling sets: What we can do and what we cannot do

Linlin Zhao2, zhaolin9142@gmail.com, Wenyi Wang1, Alexander Sedykh4, Hao Zhu3

1 Rutgers University, Camden, New Jersey, United States; 2 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 3 Chemistry Department, Rutgers University, Camden, New Jersey, United States; 4 Multicase Inc., Beachwood, Ohio, United States

Abstract

6:30pm-8:30pm CINF 32: Combining proprietary and published data in synthesis planning and reaction mining using Wiley ChemPlanner
Orr Ravitz, David Flannagan, Joyce Theisen, jtheisen@wiley.com

Research Informatics, John Wiley & Sons, Hoboken, New Jersey, United States

Abstract

6:30pm-8:30pm CINF 33: Modeling spectrophotometric titration data: tracking error from the measurement, through the model, and to the targeted output parameters
Nathanael Kazmierczak, kazmierczak314@gmail.com, Douglas Vander Griend

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States

Abstract

6:30pm-8:30pm CINF 34: Dark reactions project: A cheminformatics approach to hydrothermal syntheses
Philip Adler2, padler1@haverford.edu, Joshua Schrier2, Alex Norquist1, Sorelle Friedler1

1 Haverford College, Bryn Mawr, Pennsylvania, United States; 2 Chemistry, Haverford College, Haverford, Pennsylvania, United States

Abstract

6:30pm-8:30pm CINF 35: Adverse drug reactions triggered by the common HLA-B*57:01 variant: A molecular docking study

George Van Den Driessche, gavanden@ncsu.edu, Denis Fourches

Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States

Abstract

6:30pm-8:30pm CINF 36: ChemML: A machine learning and informatics program suite for the chemical and materials sciences

Mojtaba Haghighatlari1, mojtabah@buffalo.edu, Johannes Hachmann1,2

1 Chemical and Biological Engineering, University at Buffalo, Buffalo, New York, United States; 2 New York State Center of Excellence in Materials Informatics, Buffalo, New York, United States

Abstract

CINF: Chemistry Data for the People: From Policy to Practice 8:05am - 12:00pm
Monday, August 22
Room 112A - Pennsylvania Convention Center
Evan Bolton, Ian Bruno, Darla Henderson, Leah McEwen, Organizing
Darla Henderson, Leah McEwen, Presiding
Cosponsored by MPPG
8:05am-8:15am Introductory Remarks
8:15am-8:25am CINF 37: Viewpoint on open access by an editor, author, reviewer, and reader
Jonathan Sweedler, jsweedle@illinois.edu

Chemistry, University of Illinois, Urbana, Illinois, United States

Abstract

8:25am-8:35am CINF 38: Data generation, publication and sharing
Richard Kidd, kiddr@rsc.org

Royal Soc of Chem T Graham Hse, Cambridge, United Kingdom

Abstract

8:35am-8:45am CINF 39: Implementing a data sharing policy: A publisher perspective
Raymond Boucher, rboucher@wiley.com, Kathryn Sharples

John Wiley and Sons Ltd, Chichester, United Kingdom

Abstract

8:45am-8:55am CINF 40: Ten habits of happy data: An exploration of Elsevier’s research data management program
Anita De Waard1, A.dewaard@elsevier.com, William Gunn2, william.gunn@mendeley.com

1 Elsevier RDMS, Elsevier Inc., Jericho, Vermont, United States; 2 Mendeley, Elsevier Inc., Mountain View, California, United States

Abstract

8:55am-9:05am CINF 41: Changing workflows and mindsets
Martin Hicks, mhicks@beilstein-institut.de

Beilstein Institut, Frankfurt, Germany

Abstract

9:05am-9:35am Panel Discussion
9:35am-9:45am Discussion
9:45am-10:00am Intermission
10:00am-10:10am CINF 42: NSF MPS Open Data workshop series: Taking the pulse of the research community on open data issues
Mike Hildreth2, mhildret@nd.edu, Leah McEwen1

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Department of Physics, University of Notre Dame, Notre Dame, Indiana, United States

Abstract

10:10am-10:20am CINF 43: Open Data: What the reader wants to know rather than what the author wants to present
Robin Rogers, robin.rogers@mcgill.ca

Department of Chemistry, McGill University, Montreal, Quebec, Canada

Abstract

10:20am-10:30am CINF 44: Role of disciplinary data repositories in data publishing
Ian Bruno, bruno@ccdc.cam.ac.uk, Amy Sarjeant, Erin Davis

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom

Abstract

10:30am-10:40am CINF 45: Figshare data repository
Dan Valen, dan@figshare.com

Figshare, Brooklyn, New York, United States

Abstract

10:40am-10:50am CINF 46: Importance of open raw data in chemistry research
Santiago Dominguez Vivero, sdominguez@mestrelab.com, Carlos Cobas, Agustin Barba, Felipe Seoane, Santiago Fraga

Mestrelab Research SL, Feliciano Barrera, Santiago de Compostela, Spain

Abstract

10:50am-11:00am CINF 47: Practical issues in chemistry data sharing in PubChem
Sunghwan Kim, Evan Bolton, Steve Bryant, Yanli Wang, ywang@ncbi.nlm.nih.gov

National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States

Abstract

11:00am-11:30am Panel Discussion
11:30am-11:40am Discussion
11:40am-12:00pm CINF 48: Value of open data for chemists: Summary and perspectives
Judith Currano, currano@pobox.upenn.edu

Chemistry Library, University of Pennsylvania, Philadelphia, Pennsylvania, United States

Abstract

CINF: Shedding Light on the Dark Genome: Methods, Tools & Case Studies 8:15am - 12:00pm
Monday, August 22
Room 112B - Pennsylvania Convention Center
Rajarshi Guha, Tudor Oprea, Organizing
Rajarshi Guha, Tudor Oprea, Presiding
Cosponsored by BIOT, COMP and MEDI
8:15am-8:40am CINF 49: Illuminating the druggable genome: Linking diseases, targets and drugs
Tudor Oprea, toprea@salud.unm.edu

University of New Mexico, Albuquerque, New Mexico, United States

Abstract

8:40am-9:05am CINF 50: Tracking biological targets in drug discovery using the ChEMBL and SureChEMBL databases
Prudence Mutowo, prudence@ebi.ac.uk

CHEMBL, EMBL-EBI, CAMBRIDGE, HINXTON, United Kingdom

Abstract

9:05am-9:30am CINF 51: Formal ontologies and software tools to facilitate integration, classification and modeling of drug discovery data

Stephan Schürer1,2, stephan.schurer@gmail.com, Asiyah Yu Lin1, Hande McGinty1, Qiong Cheng1, Amar Koleti1, Nooshin Zadeh1, Dusica Vidovic1

1 Center for Computational Science, University of Miami, Miami, Florida, United States; 2 Department of Pharmacology, University of Miami, Miami, Florida, United States

Abstract

9:30am-9:40am Intermission
9:40am-10:05am CINF 52: KEA2: Multiple views of the human kinome

Nicolas Fernandez2, nicolas.fernandez@mssm.edu, Andrew Rouillard2, Klarisa Rikova1, Peter Hornbeck1, Avi Ma'ayan2

1 Cell Signaling Technology, Danvers, Massachusetts, United States; 2 Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York, United States

Abstract

10:05am-10:30am CINF 53: Pharos - shining light on the druggable genome
Dac Trung Nguyen, Timothy Sheils, Geetha Mandava, Ajit Jadhav, Noel Southall, Rajarshi Guha, rajarshi.guha@gmail.com

NCATS, Manchester, Connecticut, United States

Abstract

10:30am-10:55am CINF 54: From dark chemical matter to shedding light on the dark genome: How can chemistry and informatics enable biology?
Meir Glick, meir.glick@merck.com

Merck Research Laboratories, Boston, Massachusetts, United States

Abstract

10:55am-11:05am Intermission
11:05am-11:30am CINF 55: KinomeNet: accurate prediction of protein kinase inhibitors with deep convolutional neural networks

Olexandr Isayev, olexandr@olexandrisayev.com, Alexander Tropsha

UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States

Abstract

11:30am-11:55am CINF 56: Analogous phylogenetic analysis using protein length and protein disorder
Haobo Guo1, guohaobo@gmail.com, Gerald Tuskan2, Xiaohan Yang2, Hong Guo1

1 BCMB, University of Tennessee Knoxville, Oak Ridge, Tennessee, United States; 2 Biology, Oak Ridge National Laboratory, Oak Ridge, South Dakota, United States

Abstract

11:55am-12:00pm Concluding Remarks
CINF: Chemistry Data for the People: From Policy to Practice 1:30pm - 4:30pm
Monday, August 22
Room 112A - Pennsylvania Convention Center
Evan Bolton, Ian Bruno, Darla Henderson, Leah McEwen, Organizing
Evan Bolton, Presiding
Cosponsored by MPPG
1:30pm-1:35pm Introductory Remarks
1:35pm-1:55pm CINF 57: Community forum for chemistry data and information
Ian Bruno1, bruno@ccdc.cam.ac.uk, Leah McEwen2, Stuart Chalk3

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 Clark Library, Cornell University, Ithaca, New York, United States; 3 Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

Abstract

1:55pm-4:10pm CINF 58: Chemistry data pain points: distilled, analyzed, and next steps
Evan Bolton2, evan.e.bolton@gmail.com, Leah McEwen3, Ian Bruno1

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 National Center for Biotechnology Information, Bethesda, Maryland, United States; 3 Clark Library, Cornell University, Ithaca, New York, United States

Abstract

4:10pm-4:30pm Concluding Remarks
CINF: Using New Media to Communicate Chemistry to the Public 1:30pm - 4:15pm
Monday, August 22
Room 112B - Pennsylvania Convention Center
Susan Morrissey, Lauren Wolf, Organizing
Matt Davenport, Lauren Wolf, Presiding
Cosponsored by MPPG and PRES
1:30pm-1:40pm Introductory Remarks
1:40pm-2:00pm CINF 59: Communicating chemistry on YouTube
Adam Dylewski, a_dylewski@acs.org

American Chemical Society, Washington, District of Columbia, United States

Abstract

2:00pm-2:20pm CINF 60: Sound of science (and history and culture)
Mariel Carr, MCarr@chemheritage.org

Chemical Heritage Foundation, Philadelphia, Pennsylvania, United States

Abstract

2:20pm-2:40pm CINF 61: Got something to say? Engaging with social media in the time you have
David Oppenheimer, oppenhe@ufl.edu, Paris Grey

University of Florida, Gainesville, Florida, United States

Abstract

2:40pm-2:55pm Intermission
2:55pm-3:15pm CINF 62: Compound interest: Communicating chemistry using infographics
Andy Brunning, ndbrning@gmail.com

Compound Interest, Cambridge, United Kingdom

Abstract

3:15pm-3:35pm CINF 63: Pop culture chemistry
Raychelle Burks, rmburks@gmail.com

St. Edward's University, Austin, Texas, United States

Abstract

3:35pm-4:15pm Panel Discussion
CINF: Sci-Mix 8:00pm - 10:00pm
Monday, August 22
Halls D/E - Pennsylvania Convention Center
8:00pm-10:00pm CINF 10: Holistic approach to cheminformatics in a liberal arts environment

Philip Adler, padler1@haverford.edu

Chemistry, Haverford College, Haverford, Pennsylvania, United States

8:00pm-10:00pm CINF 15: Modern cheminformatics tools in the teaching laboratory: A practical exercise simulating a drug discovery project

Chase Smith2, chase.smith@mcphs.edu, Tamsin Mansley1, tamsin.mansley@optibrium.com

1 Optibrium Ltd, Cambridge, Massachusetts, United States; 2 MCPHS University, Worcester, Massachusetts, United States

8:00pm-10:00pm CINF 16: Extracting and exploiting medicinal chemistry ADMET knowledge automatically from public and large pharma data

Alexander Dossetter1, al.dossetter@medchemica.com, Edward Griffen2, Andrew Leach2,3, Shane Montague2

1 MedChemica Limited, Macclesfield, United Kingdom; 2 Medchemica Ltd, Macclesfield, United Kingdom; 3 Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, Liverpool, United Kingdom

8:00pm-10:00pm CINF 30: Virtual nanoparticles

Wenyi Wang1, wwyi6@hotmail.com, Alexander Sedykh3, Linlin Zhao1, Bing Yan2, Hao Zhu3,1

1 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 2 Shandong University, Jinan, China; 3 Department of Chemistry, Rutgers University, Camden, New Jersey, United States

8:00pm-10:00pm CINF 31: Experimental errors in QSAR modeling sets: What we can do and what we cannot do

Linlin Zhao2, zhaolin9142@gmail.com, Wenyi Wang1, Alexander Sedykh4, Hao Zhu3

1 Rutgers University, Camden, New Jersey, United States; 2 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 3 Chemistry Department, Rutgers University, Camden, New Jersey, United States; 4 Multicase Inc., Beachwood, Ohio, United States

8:00pm-10:00pm CINF 35: Adverse drug reactions triggered by the common HLA-B*57:01 variant: A molecular docking study

George Van Den Driessche, gavanden@ncsu.edu, Denis Fourches

Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States

8:00pm-10:00pm CINF 36: ChemML: A machine learning and informatics program suite for the chemical and materials sciences

Mojtaba Haghighatlari1, mojtabah@buffalo.edu, Johannes Hachmann1,2

1 Chemical and Biological Engineering, University at Buffalo, Buffalo, New York, United States; 2 New York State Center of Excellence in Materials Informatics, Buffalo, New York, United States

8:00pm-10:00pm CINF 3: GOSTAR and ChEMBL comparison – commercial vs. open chemogenomics databases

Johannes Voigt, johannes.voigt@gilead.com, Uli Schmitz

Gilead Sciences, Foster City, California, United States

8:00pm-10:00pm CINF 4: Exploring available compound data with the open PHACTS discovery platform and KNIME

Daniela Digles, daniela.digles@univie.ac.at, Gerhard Ecker

University of Vienna, Vienna, Austria

8:00pm-10:00pm CINF 51: Formal ontologies and software tools to facilitate integration, classification and modeling of drug discovery data

Stephan Schürer1,2, stephan.schurer@gmail.com, Asiyah Yu Lin1, Hande McGinty1, Qiong Cheng1, Amar Koleti1, Nooshin Zadeh1, Dusica Vidovic1

1 Center for Computational Science, University of Miami, Miami, Florida, United States; 2 Department of Pharmacology, University of Miami, Miami, Florida, United States

8:00pm-10:00pm CINF 52: KEA2: Multiple views of the human kinome

Nicolas Fernandez2, nicolas.fernandez@mssm.edu, Andrew Rouillard2, Klarisa Rikova1, Peter Hornbeck1, Avi Ma'ayan2

1 Cell Signaling Technology, Danvers, Massachusetts, United States; 2 Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York, United States

8:00pm-10:00pm CINF 55: KinomeNet: accurate prediction of protein kinase inhibitors with deep convolutional neural networks

Olexandr Isayev, olexandr@olexandrisayev.com, Alexander Tropsha

UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States

8:00pm-10:00pm CINF 85: Active machine learning perspective on hit identification and optimization

Daniel Reker, danielreker@googlemail.com, Gisbert Schneider

Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

8:00pm-10:00pm CINF 86: Binding affinity prediction using frequency of protein-ligand interactions: method validation and application to bromodomain inhibitors

Jamel Meslamani, j.meslamani@gmail.com, Adam S Vincek, Elena Russinova, Alexander N. Plotnikov, Roberto Sanchez, Ming-Ming Zhou

Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States

8:00pm-10:00pm CINF 8: Introducing SIVVU, a web-based program for modeling spectrophotometric titration data

Douglas Vander Griend, dav4@calvin.edu

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States

8:00pm-10:00pm CINF 92: VSViewer3D: An open source tool for interactive data mining of 3D virtual screening data

David Diller1, djrdiller@gmail.com, Kyle Diller2

1 Computational Chemistry, CMDBioscience, East Windsor, New Jersey, United States; 2 Rochester Institute of Technology, Rochester, New York, United States

8:00pm-10:00pm CINF 93: Strategies to improve PubChem data quality and search effectiveness through data analysis

Leonid Zaslavsky, zaslavsk@ncbi.nlm.nih.gov, Gang Fu, Asta Gindulyte, Paul Thiessen, Sunghwan Kim, Evan Bolton

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States

8:00pm-10:00pm CINF 94: Sketchy sketches: Hiding chemistry in plain sight

Daniel Lowe, daniel@nextmovesoftware.com, John May, Roger Sayle

NextMove Software, Cambridge, United Kingdom

8:00pm-10:00pm CINF 95: Hybrid search engine for chemical information in PubChem

Jie Chen1, chenj@ncbi.nlm.nih.gov, Siqian He1, Asta Gindulyte2, Evan Bolton3, Steve Bryant1

1 NCBI, NLM, NIH, Bethesda, Maryland, United States; 2 NCBI/CBB, NIH, Bethesda, Maryland, United States; 3 National Center for Biotechnology Information, Bethesda, Maryland, United States

8:00pm-10:00pm CINF 9: Integration of cheminformatics material into the STEMWiki hyperlibrary

Robert Belford3, rebelford@ualr.edu, Delmar Larsen2, Andrew Cornell1

1 Department of Chemistry, University of Arkansas at Little Rock, Little Rock, Arkansas, United States; 2 Department of Chemistry, Univ California Davis, Davis, California, United States; 3 Department of Chemistry, Univ of Arkansas at Little Rock, Little Rock, Arkansas, United States

CINF: Herman Skolnik Award Symposium 8:45am - 12:00pm
Tuesday, August 23
Room 112A/B - Pennsylvania Convention Center
Elsa Alvaro, Evan Bolton, Leah McEwen, Organizing
Evan Bolton, Presiding
8:45am-8:50am Introductory Remarks
8:50am-9:15am CINF 64: Developing databases and standards in chemistry
Stephen Heller, steve@hellers.com

Retired, Silver Spring, Maryland, United States

Abstract

9:15am-9:40am CINF 65: Two decades of open chemical data at DTP/NCI
Daniel Zaharevitz, ZaharevD@mail.nih.gov

Information Technology Branch, Developmental Therapeutics Program, National Cancer Institute, Bethesda, Maryland, United States

Abstract

9:40am-10:05am CINF 66: Using InChI to manage data
Peter Linstrom, peter.linstrom@nist.gov

NIST, Gaithersburg, Maryland, United States

Abstract

10:05am-10:30am CINF 67: Open chemistry resources provided by the NCI CADD group
Marc Nicklaus, mn1@helix.nih.gov

Nci Frederick Bldg 376 RM 207, Natl Inst Health Ft Detrick, Frederick, Maryland, United States

Abstract

10:30am-10:45am Intermission
10:45am-11:10am CINF 68: Evolution of open chemical information
Valery Tkachenko, tkachenkov@rsc.org

Royal Society of Chemistry, Rockville, Maryland, United States

Abstract

11:10am-11:35am CINF 69: Open chemical information at the European Bioinformatics Institute
Christoph Steinbeck, steinbeck@ebi.ac.uk

Cheminformatics and Metabolism, European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, Cambridge, United Kingdom

Abstract

11:35am-12:00pm CINF 70: History and the future of tools and software components for working with public chemistry data
Wolf-Dietrich Ihlenfeldt, wdi@xemistry.com

Xemistry GmbH, Konigstein, Germany

Abstract

CINF: Herman Skolnik Award Symposium 2:00pm - 5:05pm
Tuesday, August 23
Room 112A/B - Pennsylvania Convention Center
Elsa Alvaro, Evan Bolton, Leah McEwen, Organizing
Evan Bolton, Presiding
2:00pm-2:05pm Introductory Remarks
2:05pm-2:30pm CINF 71: PubChem a resource for cognitive computing
Stephen Boyer, sboyer@us.ibm.com

IBM Almaden Research Center, IBM, San Jose, California, United States

Abstract

2:30pm-2:55pm CINF 72: SPL and openFDA resources of open substance data
Yulia Borodina, yulia.borodina@fda.hhs.gov

FDA, Silver Spring, Maryland, United States

Abstract

2:55pm-3:20pm CINF 73: Building a network of interoperable and independently produced linked and open biomedical data
Michel Dumontier, michel.dumontier@gmail.com

Medicine, Stanford University, Stanford, California, United States

Abstract

3:20pm-3:35pm Intermission
3:35pm-4:00pm CINF 74: Chemical structure representation in PubChem
Roger Sayle, roger@nextmovesoftware.com

NextMove Software, Cambridge, United Kingdom

Abstract

4:00pm-4:25pm CINF 75: iRAMP & PubChem: Of the people, for the people
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States

Abstract

4:25pm-4:50pm CINF 76: Open chemical information: Where now and how?
Evan Bolton, bolton@ncbi.nlm.nih.gov

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States

Abstract

4:50pm-4:55pm Concluding Remarks
4:55pm-5:05pm Award Presentation
CINF*: Using Public Information to Support a Chemical Safety Culture 8:25am - 12:05pm
Wednesday, August 24
Room 112A - Pennsylvania Convention Center
Evan Bolton, Leah McEwen, Ralph Stuart, Organizing
Evan Bolton, Leah McEwen, Ralph Stuart, Presiding
Cosponsored by CHAS
8:25am-8:30am Introductory Remarks
8:30am-8:45am CINF 77: Users roundtable: Laboratory use cases for chemical safety information
Ralph Stuart2, secretary@dchas.org, Leah McEwen1, Evan Bolton3

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States; 3 National Center for Biotechnology Information, Bethesda, Maryland, United States

Abstract

8:45am-9:10am CINF 78: Risk assessment and crisis management in the research laboratory using online resources: An EH&S perspective
Shailendra Singh2, Neelam Bharti1, neelambh@ufl.edu

1 Marston Science Library, University of Florida, Gainesville, Florida, United States; 2 EH&S, University of Delaware, Newark, Delaware, United States

Abstract

9:10am-9:35am CINF 79: Institutional use of chemical safety data streams
Chris Jakober, jakecattleco@yahoo.com

Davis E&HS, University of California, Woodland, California, United States

Abstract

9:35am-10:00am CINF 80: Chemical safety and hazard information in PubChem
Jian Zhang3, jiazhang@ncbi.nlm.nih.gov, Paul Thiessen3, Asta Gindulyte3, Leah McEwen1, Ralph Stuart2, Evan Bolton3, Steve Bryant3

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States; 3 NLM/NCBI, National Institutes of Health, Bethesda, Maryland, United States

Abstract

10:00am-10:15am Intermission
10:15am-10:40am CINF 81: Semantic annotation of the laboratory chemical safety summary in PubChem
Gang Fu2, gangfu1982@gmail.com, Jian Zhang2, Evan Bolton2, Jeremy Frey4, Stuart Chalk3, Mark Borkum5, Leah McEwen1

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 NLM/NCBI, National Institutes of Health, Bethesda, Maryland, United States; 3 Department of Chemistry, University of North Florida, Jacksonville, Florida, United States; 4 University of Southampton, Southampton, United Kingdom; 5 Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington, United States

Abstract

10:40am-11:05am CINF 82: GHS and NFPA diamonds: Where they come from and how they can be useful
Roger Sayle, roger@nextmovesoftware.com

NextMove Software, Cambridge, United Kingdom

Abstract

11:05am-11:30am CINF 83: Critical cases for information identifiers in chemical asset management
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States

Abstract

11:30am-11:45am CINF 84: Surveying the academic laboratory population: Project updates from the iRAMP collaboration
Leah McEwen1, Ralph Stuart2, secretary@dchas.org

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States

Abstract

11:45am-12:05pm Concluding Remarks
CINF: General Papers 1:30pm - 4:50pm
Wednesday, August 24
Room 112A - Pennsylvania Convention Center
Elsa Alvaro, Organizing
Elsa Alvaro, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 85: Active machine learning perspective on hit identification and optimization

Daniel Reker, danielreker@googlemail.com, Gisbert Schneider

Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

Abstract

2:00pm-2:25pm CINF 86: Binding affinity prediction using frequency of protein-ligand interactions: method validation and application to bromodomain inhibitors

Jamel Meslamani, j.meslamani@gmail.com, Adam S Vincek, Elena Russinova, Alexander N. Plotnikov, Roberto Sanchez, Ming-Ming Zhou

Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States

Abstract

2:25pm-2:50pm CINF 87: MOARF, an integrated workflow for multiobjective optimization: Implementation, synthesis, and biological evaluation
Nathan Brown, nathan.brown@icr.ac.uk

The Institute of Cancer Research, Sutton, United Kingdom

Abstract

2:50pm-3:15pm CINF 88: Systematic generation of analog relationships of bioactive compounds and promiscuity analysis
Dagmar Stumpfe1, stumpfe@bit.uni-bonn.de, Dilyana Dimova2, Jürgen Bajorath3

1 B-IT, University of Bonn, Bonn, Germany; 2 Department of Life Science Informatics, University of Bonn, Bonn, Germany; 3 Life Science Informatics, University of Bonn, B-IT, Bonn, Germany

Abstract

3:15pm-3:30pm Intermission
3:30pm-3:55pm CINF 89: SAR characteristics of matching molecular series and exploration of structural relationships
Dilyana Dimova1, dimova@bit.uni-bonn.de, Jürgen Bajorath2

1 Department of Life Science Informatics, University of Bonn, Bonn, Germany; 2 Life Science Informatics, University of Bonn, B-IT, Bonn, Germany

Abstract

3:55pm-4:20pm CINF 90: How frequent are your clusters in hierarchical cluster analysis? Quantifying their frequencies considering ties in proximity
Guillermo Restrepo1,3, guillermorestrepo@gmail.com, Wilmer Leal2,3, Eugenio Llanos2,4, Carlos Suarez2,5, Manuel Patarroyo2,6

1 University of Leipzig, Leipzig, Saxony, Germany; 2 Fundación Instituto de Inmunología de Colombia (FIDIC), Bogota, Colombia; 3 Universidad de Pamplona, Pamplona, Colombia; 4 SCIO Corporacion colombiana del saber cientifico, Bogota, Colombia; 5 Universidad del Rosario, Bogota, Colombia; 6 Universidad Nacional de Colombia, Bogota, Colombia

Abstract

4:20pm-4:45pm CINF 91: Line notations for nucleic acids (both natural and therapeutic)
Roger Sayle, roger@nextmovesoftware.com

NextMove Software, Cambridge, United Kingdom

Abstract

4:45pm-4:50pm Concluding Remarks
CINF: General Papers 8:45am - 11:40am
Thursday, August 25
Room 112A - Pennsylvania Convention Center
Elsa Alvaro, Organizing
Elsa Alvaro, Presiding
8:45am-8:50am Introductory Remarks
8:50am-9:15am CINF 92: VSViewer3D: An open source tool for interactive data mining of 3D virtual screening data

David Diller1, djrdiller@gmail.com, Kyle Diller2

1 Computational Chemistry, CMDBioscience, East Windsor, New Jersey, United States; 2 Rochester Institute of Technology, Rochester, New York, United States

Abstract

9:15am-9:40am CINF 93: Strategies to improve PubChem data quality and search effectiveness through data analysis

Leonid Zaslavsky, zaslavsk@ncbi.nlm.nih.gov, Gang Fu, Asta Gindulyte, Paul Thiessen, Sunghwan Kim, Evan Bolton

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States

Abstract

9:40am-10:05am CINF 94: Sketchy sketches: Hiding chemistry in plain sight

Daniel Lowe, daniel@nextmovesoftware.com, John May, Roger Sayle

NextMove Software, Cambridge, United Kingdom

Abstract

10:05am-10:20am Intermission
10:20am-10:45am CINF 95: Hybrid search engine for chemical information in PubChem

Jie Chen1, chenj@ncbi.nlm.nih.gov, Siqian He1, Asta Gindulyte2, Evan Bolton3, Steve Bryant1

1 NCBI, NLM, NIH, Bethesda, Maryland, United States; 2 NCBI/CBB, NIH, Bethesda, Maryland, United States; 3 National Center for Biotechnology Information, Bethesda, Maryland, United States

Abstract

10:45am-11:10am CINF 96: Amoeba-inspired heuristic search dynamics for semi-quantitative estimation of unknown chemical kinetics
Masashi Aono1,2, masashi.aono@elsi.jp

1 Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan; 2 PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama, Japan

Abstract

11:10am-11:35am CINF 97: Database searching and rediscovering the wheel in scientific research
Christina Gilpin1, crgilpin@selectosep.com, Roger Gilpin2

1 Select-O-Sep, Freeport, Ohio, United States; 2 Wright State University, Dayton, Ohio, United States

Abstract

11:35am-11:40am Concluding Remarks

Cosponsored Symposia

ANYL: Kavli Symposium on Chemical Neurotransmission: What Are We Thinking? 1:00pm - 5:00pm
Monday, August 22
Room 105B - Pennsylvania Convention Center
Anne Andrews, Diane Schmidt, Paul Weiss, Organizing
Anne Andrews, Paul Weiss, Presiding
Cosponsored by BIOL, BMGT, CHED, CINF, MEDI, MPPG, PMSE and SCHB
Financially supported by: ACS Nano
Kavli Foundation
The White House BRAIN Initiative
1:00pm-1:15pm Introductory Remarks
1:15pm-1:45pm ANYL 198: 21st century neuroscience: A chemist’s perspective
Luke Lavis, lavisl@janelia.hhmi.org

HHMI/Janelia Farm, Ashburn, Virginia, United States

1:45pm-2:15pm ANYL 199: Watching neural activity in the dish and in the brain
Adam Cohen, cohen@chemistry.harvard.edu

Chemistry Chemical Biology, Harvard University, Cambridge, Massachusetts, United States

2:15pm-2:45pm ANYL 200: Realization of cell-based optical tools for measuring changes in volume transmission of neuromodulators in vivo
Paul Slesinger, Paul.Slesinger@mssm.edu

Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, New York, United States

2:45pm-3:00pm Intermission
3:00pm-3:30pm ANYL 201: In vivo electronic neurotransmitter sensing
Anne Andrews, anne.andrews@ucla.edu

Departments of Psychiatry & Biobehavioral Science and Chemistry & Biochemistry, UCLA - California NanoSystems Institute, and Hatos Center for Neuropharmacology, Los Angeles, California, United States

3:30pm-4:00pm ANYL 202: Novel neurotechnologies
Rafael Yuste, rmy5@columbia.edu

NeuroTechnology Center, Columbia University, New York, New York, United States

4:00pm-4:30pm ANYL 203: Brain chemistry for the people
Walter Koroshetz, koroshetzw@ninds.nih.gov

National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, United States

4:30pm-5:00pm ANYL 392: Chemists to a chemical synapse: imaging and repairing synapses with chemical tools
Dalibor Sames, sames@chem.columbia.edu

Department of Chemistry and NeuroTechnology Center, Columbia University, New York, New York, United States

ORGN: Connectivity & the Global Reach of Chemistry: Honoring the Life & Scientific Contributions of Ernest L. Eliel 1:30pm - 5:10pm
Tuesday, August 23
Room 124 - Pennsylvania Convention Center
Cynthia Maryanoff, Organizing
Cynthia Maryanoff, Presiding
Cosponsored by ANYL, BIOT, BMGT, CARB, CHED, CINF, HIST, INOR, MEDI, MPPG, PMSE and SCHB
1:30pm-1:35pm Introductory Remarks
1:35pm-2:05pm ORGN 323: Ernest L. Eliel: A professional’s professional
Jeffrey Seeman, jiseeman@yahoo.com

University of Richmond, Richmond, Virginia, United States

2:05pm-2:35pm ORGN 324: Importance of electrostatic interactions on the conformational behavior of substituted 1,3-dioxanes
William Bailey, William.Bailey@uconn.edu

Univ of Connecticut, Storrs Mansfield, Connecticut, United States

2:35pm-3:05pm ORGN 325: Interplay between organocatalysis and multicomponent reactions in stereoselective synthesis
Daniel Garcia Rivera, dgr@fq.uh.cu

Center for Natural Products Research, University of Havana, Havana, Cuba

3:05pm-3:35pm ORGN 326: Asymmetric autocatalysis and the origin of homochirality
Kenso Soai, soai@rs.kagu.tus.ac.jp

Tokyo Univ of SCI, Shinjuku-Ku Tokyo, Japan

3:35pm-4:05pm ORGN 327: Stereodivergent synthesis of chiral fullerenes
Margarita Suarez, msuarez@fq.uh.cu

Laboratory of Organic Synthesis, University of Havana, Havana, Cuba

4:05pm-4:35pm ORGN 328: Theoretical evidence for the relevance of n(F) → σ*(C-X) (X = H, C, O, S) stereoelectronic interactions
Eusebio Juaristi, ejuarist@cinvestav.mx

Cinvestav-Ipn/Dept of Chem, Mexico D F, Mexico

4:35pm-5:05pm ORGN 329: Saccharide structure and mechanism: Walking in the footsteps of Ernest Eliel
Anthony Serianni, Anthony.S.Serianni.1@nd.edu

Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States

5:05pm-5:10pm Concluding Remarks
ANYL: New Directions in Chemometrics: Making Sense of Big & Small Chemical Data Sets 8:30am - 10:50am
Thursday, August 25
Room 104A - Pennsylvania Convention Center
Rachelle Bienstock, Karl Booksh, Steven Brown, Organizing
Steven Brown, Presiding
Cosponsored by CINF
8:30am-8:50am ANYL 346: Investigation of the urinary steroidal profile by non-targeted metabolomics
Amelia Palermo2,1, palermo@imsb.biol.ethz.ch, Francesco Botre2, francesco.botre@uniroma1.it, Xavier de la Torre3, xavier.delatorre@gmail.com, Nicola Zamboni1, zamboni@imsb.biol.ethz.ch

1 Institute of Molecular Systems Biology, ETH Zürich, Zürich, Switzerland; 2 'Sapienza' University of Rome, Rome, Italy and Laboratorio Antidoping FMSI, Rome, Italy; 3 Laboratorio Antidoping FMSI, Rome, Italy

8:50am-9:10am ANYL 347: Elastic variable selection approach for calibration
Cannon Giglio, decannon@udel.edu, Steven Brown

Chemistry and Biochemistry, University of Delaware, Newark, Delaware, United States

9:10am-9:30am ANYL 348: Adaptive regression by subspace elimination. Towards a modeling strategy that is robust to spectral interferents
Karl Booksh, kbooksh@udel.edu, Joshua Ottaway

University of Delaware, Newark, Delaware, United States

9:30am-9:50am Intermission
9:50am-10:10am ANYL 349: Chemometric model development for high precision real-time PAT applications
Alice Tang, Iswandi Jarto, J Johnson, cjohnson@hilmarcheese.com

R&D, Hilmar Ingredients, Hilmar, California, United States

10:10am-10:30am ANYL 350: Materials assurance through orthogonal materials measurements
Curtis Mowry, cdmowry@sandia.gov, Mark Van Benthem, Donald Susan, Mark Rodriguez, James Griego, Pin Yang, David Enos, Katherine Simonson

Sandia National Laboratories, Albuquerque, New Mexico, United States

10:30am-10:50am ANYL 351: Modeling spectrophotometric titration data: A detailed look at optimal methodology and transparent reporting
Douglas Vander Griend, dav4@calvin.edu, Nathanael Kazmierczak

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States

ANYL: New Directions in Chemometrics: Making Sense of Big & Small Chemical Data Sets 1:00pm - 3:40pm
Thursday, August 25
Room 104A - Pennsylvania Convention Center
Rachelle Bienstock, Karl Booksh, Steven Brown, Organizing
Rachelle Bienstock, Presiding
Cosponsored by CINF
1:00pm-1:20pm ANYL 371: Methodological limits for the determination of binding constants via equilibrium-restricted factor analysis of spectrophotometric data
Douglas Vander Griend1, dav4@calvin.edu, Anna Michmerhuizen1, Andrew Rylaarsdam1, SeongEun Kim1, Lucas Van Laar1, Zachary Drees2, Tasha Thong3

1 Chemistry & Biochemistry, Calvin College, Holland, Michigan, United States; 2 Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States

1:20pm-1:40pm ANYL 372: Multivariate exploratory methods applied to Raman microspectroscopic mapping for the classification and geospatial estimation of titanium dioxide polymorphs
Joseph Smith1, joesmith@udel.edu, Frank Smith2, Billy Glass2, Karl Booksh1

1 Chemistry & Biochemistry, University of Delaware, Newark, Delaware, United States; 2 Geological Sciences, University of Delaware, Newark, Delaware, United States

1:40pm-2:00pm ANYL 373: Variable selection to improve biomarker identification and infrared spectral library matching
Barry Lavine, bklab@chem.okstate.edu, Collin White, Tao Ding

Oklahoma State University, Stillwater, Oklahoma, United States

2:00pm-2:20pm Intermission
2:20pm-2:40pm ANYL 374: USP up-to-date quality standards for excipients: Using infrared spectroscopy as a critical tool to determine identity of microcrystalline cellulose

Lucy Botros, Tong (Jenny) Liu, Catherine Sheehan, Kevin Moore, ktm@usp.org

United States Pharmacopeial Convention, Vienna, Washington, United States

2:40pm-3:00pm ANYL 375: New methodology for finding optimal spectral matches in reference databases
Gregory Banik, gregory_banik@bio-rad.com, Ty Abshear, Karl Nedwed

Bio Rad Informatics, Philadelphia, Pennsylvania, United States

3:00pm-3:20pm ANYL 376: EPA iCSS Chemistry dashboard to support compound identification using high resolution mass spectrometry data
Antony Williams1, tony27587@gmail.com, Jon Sobus2, Kamel Mansouri3, Mark Strynar2, Elin Ulrich2, Christopher Grulke1

1 National Center for Computational Toxicology, US Environmental Protection Agency, Durham, North Carolina, United States; 2 National Exposure Research Laboratory, US Environmental Protection Agency, Durham, North Carolina, United States; 3 ORISE Fellow, U.S. Environmental Protection Agency, Durham, North Carolina, United States

3:20pm-3:40pm ANYL 377: Demystify substance identity with clues from the CAS Registry
Allison Dick, allisondick@cas.org, Pillhun Son

CAS, Columbus, Ohio, United States

Technical Program with Abstracts

ACS Chemical Information Division (CINF)
252nd ACS National Meeting, Fall 2016
Philadelphia, PA (August 21-25, 2016)

CINF Symposia

Elsa Alvaro, Program Chair

[Created Fri Jul 29 2016, Subject to Change; Check ACS Online Program for Latest Changes]

CINF: Effectively Harnessing the World's Literature to Inform Rational Compound Design 8:25am - 11:45am
Sunday, August 21
Room 112A - Pennsylvania Convention Center
Daniel Ortwine, Organizing
Daniel Ortwine, Presiding
Cosponsored by MEDI
Financially supported by: Genentech
8:25am-8:30am Introductory Remarks
8:30am-9:05am CINF 1: PubChem’s literature and patent information for drug discovery
Sunghwan Kim, kimsungh@ncbi.nlm.nih.gov, Paul Thiessen, Tiejun Cheng, Bo Yu, Benjamin Shoemaker, Jiyao Wang, Evan Bolton, Yanli Wang, Steve Bryant

National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States
PubChem is a public repository for information on chemical substances and their biological activities. Since its launch in 2004, PubChem has been a key chemical information resource that serves scientific communities in many areas, including cheminformatics, chemical biology, and medicinal chemistry. Currently, PubChem contains more than 219 million depositor-provided chemical substance descriptions, 88 million unique chemical structures, and 229 million biological activity test results provided from over one million biological assay records.
Many PubChem records include depositor-provided cross-references to scientific articles in PubMed. Some PubChem contributors provide bioactivity data extracted from scientific articles, which complement high-throughput screening (HTS) data from the concluded NIH Molecular Libraries Program (MLP) and other HTS projects. Some journals provide PubChem with information on chemicals that appear in their newly published articles, enabling concurrent publication of scientific articles in journals and associated data in public databases. In addition, PubChem provides links to patent information for chemicals, thanks to data contribution from a growing number of organizations, including IBM and SureChEMBL (formerly known as SureChem). Currently, PubChem offers links between about 6 million patent documents and more than 17 million unique chemical structures, with over 345 million chemical substance-patent links covering U.S., European, Japanese, and World Intellectual Property Organization patent documents published since 1800.
Literature and patent information in PubChem can be accessed using PubChem’s web interfaces, allowing users to explore information related to PubChem records beyond typical web search results. Moreover, this information can also be accessed programmatically, enabling one to build a drug discovery pipeline that automatically checks existing literature and patent information for compounds of interest.
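As a rough illustration of the programmatic access described above, the Python sketch below assumes PubChem's PUG REST interface and its cross-reference (xrefs) operation, neither of which is named in the abstract. The function names and the shape of the sample payload are illustrative assumptions, not the presenters' pipeline.

```python
from urllib.parse import quote

# Assumption: PubChem's public PUG REST base URL (not stated in the abstract).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def xrefs_url(cid, xref_types=("PubMedID", "PatentID")):
    """Build a URL asking for literature and patent cross-references of one CID."""
    types = quote(",".join(xref_types), safe=",")
    return f"{PUG_REST_BASE}/compound/cid/{int(cid)}/xrefs/{types}/JSON"


def summarize_xrefs(payload):
    """Count cross-references per type in a decoded JSON response.

    Assumes the response nests records under InformationList -> Information,
    each record holding a CID plus one list per cross-reference type.
    """
    counts = {}
    records = payload.get("InformationList", {}).get("Information", [])
    for record in records:
        for key, value in record.items():
            if key != "CID" and isinstance(value, list):
                counts[key] = counts.get(key, 0) + len(value)
    return counts
```

A pipeline step could fetch `xrefs_url(cid)` with any HTTP client, decode the JSON, and pass it to `summarize_xrefs` to flag compounds that already have literature or patent coverage.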

9:05am-9:40am CINF 2: Harnessing the world’s literature to provide a crystallographic perspective on compound design: federated pharmacophore searching as example
Erin Davis, Ian Bruno, Paul Sanschagrin, sanschagrin@ccdc.cam.ac.uk

Cambridge Crystallographic Data Centre, Piscataway, New Jersey, United States
Crystal structures provide insights into molecular shape and interactions that are relevant to a range of scientific domains. Some 50 years ago, the Cambridge Crystallographic Data Centre (CCDC) began abstracting published crystal structure sets from scientific literature into the Cambridge Structural Database (CSD) and developing software that enabled this to be searched and analyzed. Over time the software around the CSD has evolved into a platform of software applications and services that enables the knowledge embedded in over 820,000 crystal structures to be applied to compound and materials design.

Today, processes for collating and curating data differ somewhat from those of the early days, owing to the crystallographic community's adoption of robust digital data publishing workflows. Crystallography and the Cambridge Structural Database demonstrate the value of communities coming together to maximize the digital availability and accessibility of the data underpinning the scientific literature. Data access, however, is only as good as the ease with which the data can be queried. In this talk we present Cross-Miner, a new interactive 3D pharmacophore querying tool that searches across federated crystallographic databases (CSD, in-house databases, PDB) to find not only matches within individual molecules, but also matching protein-ligand complexes.

9:40am-10:15am CINF 3: GOSTAR and ChEMBL comparison – commercial vs. open chemogenomics databases

Johannes Voigt, johannes.voigt@gilead.com, Uli Schmitz

Gilead Sciences, Foster City, California, United States
Chemogenomics databases capture chemical structures and activity values reported in the medicinal chemistry and patent literature.
While services like Reaxys and Thomson-Reuters Integrity/Pharma offer the full content only online, databases like ChEMBL/SureChEMBL from EBI and GOSTAR from GVKBio offer the entire content as database dumps, SDF-files etc. The latter are valuable for applications like comprehensive structure activity analysis (SAR) for an entire target/target class, single/multi target activity model building, and idea generation exercises like fragment/scaffold extraction (“privileged structures”).
GOSTAR covers both literature and patents, which are manually curated by experts. It is limited to 11 major target classes and requires a commercial license. ChEMBL is freely available and manually curated, but covers only literature and other sources such as PubChem bioassays. Since its free release in 2010, it has been the most widely used chemogenomics database. More recently, SureChEMBL (formerly SureChem) has become freely available as a collection of automatically extracted patent chemical structures.
In this study we compare GOSTAR and ChEMBL/SureChEMBL in terms of coverage, overlap, and accuracy. What is the value added by the commercial database versus the publicly offered content? How are targets and target classes annotated, how well is the data normalized, and how much clean-up work is required for comprehensive model building and analysis?
Finally, an in-house query and annotation system was developed to access the GOSTAR and ChEMBL data.
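To make the notions of coverage and overlap concrete, here is a minimal Python sketch. It assumes each database has already been reduced to a set of normalized structure identifiers (for example InChIKeys); that reduction, the function name, and the placeholder identifiers are all hypothetical and are not taken from the study.

```python
def compare_coverage(ids_a, ids_b):
    """Summarize how two collections of compound identifiers overlap."""
    a, b = set(ids_a), set(ids_b)
    shared = a & b
    union = a | b
    return {
        "only_a": len(a - b),      # records unique to the first source
        "only_b": len(b - a),      # records unique to the second source
        "shared": len(shared),     # records present in both sources
        # Jaccard index: shared fraction of all distinct records
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers, purely for illustration.
commercial = {"KEY-A", "KEY-B", "KEY-C"}
open_access = {"KEY-B", "KEY-C", "KEY-D", "KEY-E"}
summary = compare_coverage(commercial, open_access)
```

Accuracy, the third criterion in the abstract, cannot be derived from identifier sets alone and would need a comparison of the reported activity values.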

10:15am-10:30am Intermission
10:30am-11:05am CINF 4: Exploring available compound data with the open PHACTS discovery platform and KNIME

Daniela Digles, daniela.digles@univie.ac.at, Gerhard Ecker

University of Vienna, Vienna, Austria
The Open PHACTS project [1] integrates several public databases, making it possible to answer questions relevant to research in the drug-discovery process [2]. The data collected in the project can be accessed with web tools, such as the Open PHACTS Explorer (www.openphacts.org/explorer); in a drug discovery project, however, this might be just another database to search. If the aim is to combine the data from the Open PHACTS Discovery Platform with in-house data or other specialised data sources, workflow tools represent a very convenient option. An example of combining public data with a manually curated dataset was published recently [3].
Here, we present possibilities to access the Open PHACTS Discovery Platform from within a KNIME workflow that can be used at the beginning of a drug-discovery project. It returns available data for a list of compounds and similar molecules, which can be used to prioritize molecules for follow-up.
Information collected for the molecule includes function and toxicity annotations (from DrugBank), the role of the molecule (from ChEBI), biological pathways containing this molecule (from WikiPathways), and patents mentioning the compound (from SureChEMBL). In the next step, proteins against which the molecule is reported to be active in ChEMBL are returned, and the connections of these proteins to biological pathways (from WikiPathways) and to diseases (from DisGeNET) are shown. The data from all these sources is retrieved via the Open PHACTS API, easily connecting the identifiers used in the different databases. Links to the original data sources are retained, allowing manual curation of the collected associations. Additionally, external data sources or in-house data can be added.

11:05am-11:40am CINF 5: NDEx, the Network Data Exchange: a resource for biological networks with application in informed compound design
Dexter Pratt, depratt@ucsd.edu

School of Medicine, UCSD, La Jolla, California, United States
NDEx, the Network Data Exchange (www.ndexbio.org), is an open-source project to enable scientists and organizations to share, store, manipulate, and publish biological network knowledge. NDEx can aid in informed compound design as a channel for programmatic access to knowledge about biological mechanisms and their interactions with compounds. It also provides a software framework to enable compound design applications using or producing networks. The NDEx public site is already serving as a publication channel for both compound-protein interaction networks and biological networks describing both molecular and phenotypic level information. For practical application in drug development, NDEx was developed with features that promote bridging between the academic and commercial communities, presenting a layered strategy in which users can control access to the networks they store and where organizations may run private NDEx Servers. In this presentation, we will explore examples in which NDEx content and software are used to link compounds to mechanism and phenotype, assembling information relevant to the compound design process.

11:40am-11:45am Concluding Remarks
CINF: Bringing Cheminformatics into the College Chemistry Classroom 8:15am - 12:05pm
Sunday, August 21
Room 112B - Pennsylvania Convention Center
Robert Belford, Sunghwan Kim, Organizing
Robert Belford, Sunghwan Kim, Presiding
Cosponsored by CHED
8:15am-8:20am Introductory Remarks
8:20am-8:40am CINF 6: Learning to find the right information: A survey of chemistry information literacy in the undergraduate classroom
Thibault Geoui, t.geoui@elsevier.com

Marketing, Elsevier, Frankfurt, Hesse, Germany
As part of today's undergraduate training, chemistry students are asked to radically change the way in which they interact with information. Search and use strategies that have served them well in the past are a poor match for the structure of scientific information, and pose a hindrance to their development as scientists. Over the course of 3 months, we informally discussed needs in information literacy training with librarians, faculty and teaching staff at various undergraduate institutions. We encountered a range of approaches to building information literacy into department curricula and just as broad a range of opinions about what makes it so difficult to successfully teach information retrieval and use skills. From our learnings, we have constructed an initial list of best practices, which we aim to improve as we collect more input from successful and unsuccessful experiences in the classroom.

8:40am-9:00am CINF 7: Co-developing chemical information management and laboratory safety skills
Ralph Stuart2, secretary@dchas.org, Leah McEwen1

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States
The 2015 edition of the American Chemical Society’s Guidelines and Evaluation Procedures for Bachelor’s Degree Programs identifies six skill sets that undergraduate chemistry programs should instill in their students. In our roles as support staff for chemistry departments at two different institutions, we have been collaboratively studying these requirements and have found significant synergies between two in particular: “Chemical Literature and Information Management Skills” and “Laboratory Safety Skills”. We believe that by integrating emerging tools in the laboratory safety field into information literacy frameworks, a strong foundation can be established for the development of all the skills called out by the ACS. This presentation describes this strategy and provides examples of how these concepts can be implemented in both the chemistry teaching and research laboratory settings.


 

9:00am-9:20am CINF 8: Introducing SIVVU, a web-based program for modeling spectrophotometric titration data

Douglas Vander Griend, dav4@calvin.edu

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States
Spectrophotometric titrations are a simple and powerful way to thermodynamically characterize multicomponent systems. The data is relatively easy to obtain, but proper analysis requires appropriate computer programs along with the requisite training. SIVVU is one of several programs capable of such analyses and, after years of development, has now been made available through the internet. Designed from a chemist’s perspective, it employs singular value decomposition to analyze the mathematical factor structure of pertinent datasets. More importantly, it can then model the data according to user-provided chemical reactions in order to determine the spectroscopic signatures and binding constants for the system. Data is all uploaded through a single Microsoft Excel spreadsheet, and outputs are written back to the same. Most functionalities are available free of cost, making it ideal for implementation in undergraduate chemistry laboratories.


'The name says it all',

 


Sivvu takes a spectrophotometric dataset and deconvolutes it into a set of molar absorptivity curves and equilibrium concentration profiles.
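As a rough illustration of the kind of chemical model such an analysis fits (this is not SIVVU's actual implementation), the sketch below computes equilibrium concentrations for a hypothetical 1:1 binding reaction, host + guest forming a complex, and the resulting Beer-Lambert absorbance at a single wavelength. The binding constant, molar absorptivities, concentrations, and function names are all assumptions chosen for illustration.

```python
import math


def complex_concentration(h_total, g_total, k_bind):
    """Equilibrium [HG] for H + G <=> HG, where K = [HG] / ([H][G])."""
    # Mass balance turns the equilibrium expression into a quadratic in [HG];
    # the smaller root is the physically meaningful one.
    b = h_total + g_total + 1.0 / k_bind
    return (b - math.sqrt(b * b - 4.0 * h_total * g_total)) / 2.0


def absorbance(h_total, g_total, k_bind, eps_h, eps_g, eps_hg, path_cm=1.0):
    """Beer-Lambert absorbance at one wavelength: A = l * sum(eps_i * c_i)."""
    hg = complex_concentration(h_total, g_total, k_bind)
    h_free = h_total - hg
    g_free = g_total - hg
    return path_cm * (eps_h * h_free + eps_g * g_free + eps_hg * hg)


if __name__ == "__main__":
    # Hypothetical titration: fixed host concentration, increasing guest.
    K = 1.0e4  # assumed binding constant, M^-1
    for g_total in (0.0, 5e-5, 1e-4, 2e-4):
        a = absorbance(1e-4, g_total, K, eps_h=5000.0, eps_g=0.0, eps_hg=12000.0)
        print(f"{g_total:.1e} M guest -> A = {a:.4f}")
```

Fitting would run this forward model inside an optimizer that adjusts the binding constant and absorptivities until the computed absorbances match the measured ones across all wavelengths.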

9:20am-9:30am Intermission
9:30am-9:50am CINF 9: Integration of cheminformatics material into the STEMWiki hyperlibrary

Robert Belford3, rebelford@ualr.edu, Delmar Larsen2, Andrew Cornell1

1 Department of Chemistry, University of Arkansas at Little Rock, Little Rock, Arkansas, United States; 2 Department of Chemistry, Univ California Davis, Davis, California, United States; 3 Department of Chemistry, Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States
This presentation will describe a project to contribute cheminformatics educational material to the STEMWiki Hyperlibrary. The Hyperlibrary consists of multiple interconnected and independently operating STEMWiki hypertext applications (ChemWiki, BioWiki, MathWiki, StatWiki, GeoWiki, PhysWiki), and is a collaborative platform that enables dissemination and evaluation of new education developments and approaches, with an emphasis on data-driven assessment of student learning and performance. The contents of these STEMWikis are both horizontally (across multiple fields) and vertically (across multiple levels of complexity) integrated within a massively interconnected network that provides, not just single textbooks, but an infinitely large Hyperlibrary through which interconnected STEM textbooks can be built.

This enables reuse or repurposing of material across the curriculum, and an objective of this project is to create a cheminformatics hypertextbook using material generated in the Fall 2015 Cheminformatics OLCC as the initial nucleus for cheminformatics educational content generation. Much of this material is in the form of Teaching and Learning Objects (TLOs), like YouTube Videos and modular assignments, which by their nature can be directly integrated into other STEM textbooks within the STEMWiki Hyperlibrary. The objective is not only to create a place for a community to contribute to a cheminformatics hypertextbook, but to do so in a way that enables integration of cheminformatics TLOs into other hypertextbooks of the chemistry curriculum. This presentation will describe the Cheminformatics OLCC and the integration of OLCC content into the STEMWiki Hyperlibrary.


 

9:50am-10:10am CINF 10: Holistic approach to cheminformatics in a liberal arts environment

Philip Adler, padler1@haverford.edu

Chemistry, Haverford College, Haverford, Pennsylvania, United States
Cheminformatics is an intrinsically interdisciplinary field, and most “cheminformaticians” began as either computer scientists or chemists. The Liberal Arts college environment presents unique opportunities to teach cheminformatics to undergraduate students in a way that mimics the evolution of this discipline. As such, we must also question what is meant by the ‘classroom’; with the inclusion of Undergraduate Thesis research projects carried out from both computer science and chemical perspectives, which are often notionally components of the same over-arching cheminformatics research, and ‘self-study’ modules, the line of what constitutes a ‘classroom’ and more specifically a ‘chemistry classroom’ becomes ever more blurred. To illustrate these different aspects, I will discuss different approaches to teaching cheminformatics, from the inclusion of aspects of Cheminformatics in computer science courses (one, for instance, centered around relational database schema design and implementation), through joint supervision of dual major students, and the employment of self-study courses to aid students in visiting this vibrant interdisciplinary field, as well as the use of cheminformatics as a component of more ‘traditional’ chemistry thesis students. A discussion of the experiences of a sample of the involved students and staff will be included to highlight the experience of cheminformatics in a Liberal Arts environment for undergraduates, and to highlight potential areas of improvement moving forward.

10:10am-10:30am CINF 11: Cheminformatics education and research at home: the best way to teach graduate chemistry in the professional community
Hao Zhu, hao.zhu99@rutgers.edu

Chemistry Department, Rutgers University, Camden, New Jersey, United States
One of the major goals of regional universities is to offer education to a significant number of part-time students. For example, at Rutgers-Camden, 50% of graduate students (764 out of 1,509) were part-time students at the end of 2015. Most of these graduate students have full-time jobs in the daytime and can only use their free time to take the necessary courses and even perform research projects. These students urgently need not only a flexible course schedule (e.g., courses in the evenings) but also the ability to finish most of their research work off campus. In the past decades, many cheminformatics tools have been developed, and most of them are publicly available through the internet. Since I started the new cheminformatics class in the graduate school of Rutgers-Camden, over 40 graduate students have enrolled in this class in the past four years. Although these students are still required to attend the cheminformatics lectures on campus, they are able to finish most of the assignments at home. Furthermore, five students chose to perform research in the cheminformatics area to earn their master's degrees. They finished almost all of the research work at home using publicly available cheminformatics tools in their free time. These efforts resulted in four research papers in peer-reviewed scientific journals. The cheminformatics studies, performed mainly at home, greatly advanced their careers and also strengthened the newly developed graduate program at Rutgers-Camden.

10:30am-10:40am Intermission
10:40am-11:00am CINF 12: Fall 2015 cheminformatics OLCC project based learning: Validation of Wikipedia Chembox hazard information
Robert Belford, Brian Murphy, my59vw@gmail.com

Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States
In the Fall of 2015, four campuses participated in a hybrid (face to face and online) intercollegiate course in cheminformatics, the Cheminformatics OLCC. The purpose of this course, which was collaboratively taught with online guest lecturers and onsite faculty facilitators, was to enable the presentation of chemical subjects that are typically not available in the undergraduate curriculum due to their specialized nature. The course was structured around a series of modules covering different topics in cheminformatics, and in addition to module-specific assignments, each student developed their own project. Many of these projects were presented during a symposium of the Spring 2016 ACS National Meeting. This presentation will be by a member of the 2015 class who currently manages an analytical lab and has extended work on his class project into a subject of graduate study. The first part of the presentation will deal with the student’s perspective of the Cheminformatics OLCC as a distributed, collaboratively-taught hybrid course, and the second will focus on the student’s project of using databases to validate information within Wikipedia, specifically, information related to chemical hazards.

Wikipedia is a 21st Century Information Source of the People, by the People and for the People, and is globally the 7th top visited site on the Internet. It includes a variety of information about chemicals and chemical processes. However, the open-access crowdsourced nature of Wikipedia leads to new types of information literacy challenges; addressing these challenges fits into the theme of the meeting of “Chemistry of the People, by the People and for the People”. How can the People trust the chemical information within Wikipedia?

This presentation will describe the student’s work in developing electronic systems to validate chemical safety information using the structure of the Wikipedia Chembox, which models an RDF triple, to compare that information to values within more authoritative databases, specifically those collected by PubChem. Currently, it appears that only chemical identifiers are validated and this work seeks to assess the practicality and value of extending Chembox validation to other high value parts of the Chembox, such as the safety and hazard information found there.
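The comparison described above can be sketched as follows. This is a minimal illustration, not the student's system: each Chembox field is treated as a (subject, predicate, object) triple and checked against a reference set keyed the same way. The compound identifiers, field names, and values are made-up placeholders, and no real Wikipedia or PubChem API is called.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # compound identifier (placeholder)
    predicate: str  # Chembox field name (placeholder)
    obj: str        # field value


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def validate(chembox, reference):
    """Label each Chembox triple as match, mismatch, or lacking a reference value."""
    ref = {(t.subject, t.predicate): t.obj for t in reference}
    report = {}
    for t in chembox:
        key = (t.subject, t.predicate)
        if key not in ref:
            report[key] = "no reference value"
        elif normalize(ref[key]) == normalize(t.obj):
            report[key] = "match"
        else:
            report[key] = "mismatch"
    return report


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    chembox = [
        Triple("compound-A", "signal_word", " Danger "),
        Triple("compound-A", "hazard_code", "H225"),
        Triple("compound-B", "signal_word", "Warning"),
    ]
    reference = [
        Triple("compound-A", "signal_word", "danger"),
        Triple("compound-A", "hazard_code", "H226"),
    ]
    for key, status in validate(chembox, reference).items():
        print(key, "->", status)
```

A real validator would need field-specific normalization (units, ranges, multiple hazard codes), which this sketch leaves out.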

11:00am-11:20am CINF 13: Cheminformatics in the chemistry classroom
Denis Fourches, dfourch@ncsu.edu

Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States
Learning the nomenclature of organic molecules and the chemical reactions they undergo is not a trivial task. Nor is teaching molecular modeling concepts such as structure-based virtual screening and quantitative structure-activity relationships. Meanwhile, cheminformatics software tools that help students understand and accomplish all of the aforementioned tasks have never been so easily accessible. Therefore, chemistry majors should be exposed to and trained in these concepts and software as early as possible. This presentation will start with a brief review of several cheminformatics software programs that are used in the Organic Chemistry (CH221) classroom at NC State to help students (i) draw molecules and chemical reactions in 2D, (ii) visualize complex molecules (e.g., stereoisomers, conformers of cyclohexane) in 3D, and (iii) practice their knowledge of molecular nomenclature and chemical reactions. Publicly available tools used in CH221 will be put in perspective with commercial educational platforms such as Sapling and Connect, and their numerous complementarities will be highlighted. Second, we will present the computer-aided molecular design class (CH795), which aims to familiarize the graduate students in the chemistry PhD program at NC State with the concepts of molecular descriptors, QSAR modeling, structure-based docking, virtual screening, molecular dynamics simulations, and HTS data analysis. Software tools such as Knime, AutoDock Vina, PyMol, Chimera, and NAMD used in CH795 will be briefly reviewed. Finally, new technological trends (e.g., mobile devices, virtual reality) will be introduced as a perspective for improving the way we use and teach cheminformatics in the Chemistry classroom.

11:20am-11:40am CINF 14: Withdrawn
11:40am-12:00pm CINF 15: Modern cheminformatics tools in the teaching laboratory: A practical exercise simulating a drug discovery project

Chase Smith2, chase.smith@mcphs.edu, Tamsin Mansley1, tamsin.mansley@optibrium.com

1 Optibrium Ltd, Cambridge, Massachusetts, United States; 2 MCPHS University, Worcester, Massachusetts, United States
A 5-week long laboratory exercise has been incorporated into the Pharmaceutical Sciences graduate program syllabus at MCPHS University to simulate an early stage hit-to-lead and lead optimization in a drug discovery program. Students use the StarDrop™ cheminformatics software package from Optibrium Ltd., to guide the selection and design of compounds with an optimal balance of properties, together with publicly available datasets downloaded from the European Molecular Biology Laboratory (EMBL) Neglected Tropical Disease website. The laboratory simulation exercise provides a much-needed hands-on experience related to complex topics normally only discussed in theory, including mining primary screening data, predictive modelling and drug metabolism, and provides the students with practical experience utilizing modern cheminformatics software.

12:00pm-12:05pm Concluding Remarks
CINF: Effectively Harnessing the World's Literature to Inform Rational Compound Design 1:25pm - 4:45pm
Sunday, August 21
Room 112A - Pennsylvania Convention Center
Daniel Ortwine, Organizing
Daniel Ortwine, Presiding
Cosponsored by MEDI
Financially supported by: Genentech
1:25pm-1:30pm Introductory Remarks
1:30pm-2:05pm CINF 16: Extracting and exploiting medicinal chemistry ADMET knowledge automatically from public and large pharma data

Alexander Dossetter1, al.dossetter@medchemica.com, Edward Griffen2, Andrew Leach2,3, Shane Montague2

1 MedChemica Limited, Macclesfield, United Kingdom; 2 Medchemica Ltd, Macclesfield, United Kingdom; 3 Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, Liverpool, United Kingdom
The remorseless increase in the cost of drug discovery requires medicinal chemists to generate compounds with properties acceptable for in vivo testing as efficiently as possible. An approach to this problem is to extract and record medicinal chemistry knowledge from measured data. The vast size of medicinal chemistry space, the global research efforts in compound design, and the intrinsically complex nature of drug-sized molecules make the manual capture of such knowledge increasingly challenging. An automated approach based on advanced matched molecular pair analysis combining two algorithms and capturing the local chemical environment will be presented. Case studies showing how such knowledge has been used to solve problems will be shared.

2:05pm-2:40pm CINF 17: Extracting knowledge from large in-vitro metabolic stability data sets using matched molecular pair analysis (MMPA)
Hao Zheng, zheng.hao@gene.com

Discovery Chemistry, Genentech, South San Francisco, California, United States
Through many years of drug discovery effort, pharmaceutical companies have accumulated large sets of in-vitro ADME data. Useful knowledge can be extracted from the data using matched molecular pair (MMP) analysis and statistical testing.
This talk will describe how different pharmaceutical companies share the knowledge without disclosing molecular structures, and how we extracted the knowledge from shared data sets using matched molecular pair analysis (MMPA).
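A minimal sketch of the statistical side of MMPA, under stated assumptions: pairs of compounds that differ by a single structural transformation have already been identified, and each pair arrives as a transformation label plus the measured property before and after. The transformation labels, property values, and the `min_pairs` cutoff are hypothetical, and the one-sample t statistic stands in for whatever test the speakers actually used.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def summarize_transformations(pairs, min_pairs=3):
    """Aggregate property changes per transformation.

    pairs: iterable of (transformation, value_before, value_after).
    Returns, for each transformation with enough pairs, the pair count,
    the mean change, and a one-sample t statistic against zero change.
    """
    deltas = defaultdict(list)
    for transformation, before, after in pairs:
        deltas[transformation].append(after - before)

    summary = {}
    for transformation, values in deltas.items():
        n = len(values)
        if n < min_pairs:
            continue  # too few pairs to say anything
        m = mean(values)
        s = stdev(values)
        # t is undefined when every pair shows the identical change.
        t_stat = m / (s / math.sqrt(n)) if s > 0 else None
        summary[transformation] = {"n": n, "mean_delta": m, "t": t_stat}
    return summary


if __name__ == "__main__":
    # Hypothetical stability measurements (arbitrary units).
    pairs = [
        ("H>>F", 1.0, 1.5),
        ("H>>F", 2.0, 2.4),
        ("H>>F", 0.5, 1.1),
        ("H>>Cl", 1.0, 2.0),
    ]
    print(summarize_transformations(pairs))
```

Note that the summary holds only transformation labels and property changes, never whole molecules, which is one plausible way such knowledge could be pooled without exposing full structures.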

2:40pm-3:15pm CINF 18: Gravitational waves shaking the chemical universe: virtual chemistry 2.0
Carsten Detering, detering@biosolveit.com

BioSolveIT Inc, Bellevue, Washington, United States

We have recently started to make the chemical universe much more maneuverable by generating a new virtual chemistry space: 58 robust chemical reactions (Hartenfeller et al., 2011), 42 from a previous fragment space, and 21 textbook chemistry reactions were collated and, together with building blocks from trusted vendors, used to generate a virtual chemistry space containing 16,314,207,184,647,693 molecules (more than 16 quadrillion compounds!). All these virtual molecules have a high likelihood of straightforward synthetic access. Together with this literature-derived collection of compounds, we provide a unique search method that is capable of handling such vast numbers of molecules relatively easily.
Imagine de novo design of (a) hit expansion libraries, (b) follow-up series and (c) fragment evolution design from within an all accessible, gigantic compound space. We will demonstrate how this can become reality.

3:15pm-3:30pm Intermission
3:30pm-4:05pm CINF 19: Network analytics of structured and unstructured data: an evolutionary solution
Olivier Lichtarge, lichtarge@bcm.edu

Baylor College of Medicine, Houston, Texas, United States
Data integration is essential to overcome the logjam of data and publications. Here, we show a set of approaches that twin networks with evolution.

A first example integrates gene interaction networks in hundreds of species by eliminating redundant evolutionary relationships. This enables novel functional predictions, including for the essential but functionally uncharacterized malarial antigen EXP1. We find that EXP1 is a GST that efficiently degrades cytotoxic hematin and which is potently inhibited by artesunate. Thus, EXP1 is a possible molecular target to a frontline antimalarial drug.

A different example focuses on reasoning over the literature. A Knowledge Integration Toolkit (KnIT) automatically and scalably mines 25 million public PubMed abstracts to suggest novel protein kinases that phosphorylate the tumor suppressor p53. Focusing on a top candidate of pharmaceutical interest, we found that this protein phosphorylates p53 at Ser315, in vitro and in vivo, and functionally inhibits p53. This study demonstrates that automated reasoning over the entire literature generates falsifiable, novel and useful molecular hypotheses that test true and accelerate scientific discovery.

The last example aims to personalize networks by quantifying the impact of individual mutational variations. This impact depends on the unique context of each mutation, which is complex and often cryptic. Modeling evolution as a continuous and differentiable mapping from genotype to phenotype yields a formal equation for the Evolutionary Action (EA) of coding mutations on fitness, the terms of which are readily computable. Mutational, clinical, and population genetic evidence show this Evolutionary Action equation predicts the effect of point mutations in vivo and in vitro in diverse proteins, correlates disease-causing gene mutations with morbidity, and determines the frequency of human coding polymorphisms, respectively.
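The step from a differentiable genotype-to-phenotype mapping to an action equation can be sketched with a first-order Taylor expansion. The notation here is chosen for illustration and is an assumption, since the abstract does not give the equation: f is the mapping, γ the genotype, φ the fitness phenotype, and r_i the residue at sequence position i.

```latex
\varphi = f(\gamma)
\quad\Longrightarrow\quad
\mathrm{d}\varphi = \nabla f \cdot \mathrm{d}\gamma ,
\qquad
\Delta\varphi \;\approx\; \frac{\partial f}{\partial r_i}\,\Delta r_{i,\,X \to Y}
```

Read this way, the fitness effect of substituting amino acid X with Y at position i is approximately the sensitivity of fitness to that position multiplied by the magnitude of the substitution, which are two terms that can each be estimated separately.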

Together, these early studies suggest a broad integrative network formalism that unifies structured and unstructured data, and which can be personalized to an individual's relevant mutational variations.

4:05pm-4:40pm CINF 20: Integrative data science, semantics, knowledge graphs, and evidence paths in the service of molecular discovery
Jeremy Yang2,1, jeremyjyang@gmail.com, Tudor Oprea2, David Wild1

1 School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States; 2 School of Medicine, University of New Mexico, Albuquerque, New Mexico, United States
Science is illuminated first and foremost by knowing the knowns, traditionally from peer reviewed scientific literature. However, this model has been strained and recast with the advent and evolution of the WWW. Informatics and data science have emerged as a hybrid discipline combining library science, computer science, and domain knowledge as in cheminformatics and bioinformatics. Scale-out of traditional publication has exceeded capacity for human consumption, while alternate online publication modes grow, improve and surpass old ways. Automated text mining, programmable web standards such as REST APIs, machine learning, community semantics through ontologies and vocabularies, and knowledge graph-based systems are some of the emerging technologies. In this talk we discuss projects that illustrate such emerging methods for knowledge processing and discovery. Chem2Bio2RDF, from Indiana University, is an integrated system of public datasets converted to RDF, all relevant to chemical biology and drug discovery. Since its release in 2010, several applications have been developed using Chem2Bio2RDF, including SLAP (Semantically Linked Association Prediction) for missing-link prediction of compound-target activity. Chem2Bio2RDF combines 25 datasets from 16 sources, employing formal semantic formats and novel linkages based on domain expertise. An ongoing major upgrade reflects the profound advances of methods and resources in the last few years. Target Central Research Database (TCRD), developed at the University of New Mexico, is the main repository for the Illuminating the Druggable Genome Knowledge Management Center (IDG-KMC), and serves as the primary source for the related web portal, Pharos (pharos.nih.gov).
TCRD integrates diverse datasets relevant to the druggable genome in a platform for data integration and analytics, with data types including: proteins, compounds, text mined bibliometric associations, gene expression, disease, phenotype and pathway associations, bioactivity and drug interactions. Chem2Bio2RDF and TCRD are exemplars of a new genre of knowledge resource, harnessing emerging methods and resources to offer unique discovery opportunities in drug discovery.
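To make the idea of an evidence path concrete, here is a minimal sketch that finds a shortest chain of associations between two entities in a small knowledge graph using breadth-first search. It does not reflect how Chem2Bio2RDF, SLAP, or TCRD are implemented; the entity names, predicates, and the choice to treat associations as undirected are assumptions for illustration.

```python
from collections import deque


def evidence_path(edges, start, goal):
    """Return a shortest list of (node, predicate, neighbor) hops, or None."""
    graph = {}
    for s, p, o in edges:
        graph.setdefault(s, []).append((p, o))
        graph.setdefault(o, []).append((p, s))  # treat associations as undirected

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(node, predicate, neighbor)]))
    return None  # no chain of associations connects the two entities


if __name__ == "__main__":
    # Hypothetical triples; real systems would load these from RDF stores.
    edges = [
        ("compound:C1", "binds", "protein:P1"),
        ("protein:P1", "participates_in", "pathway:W1"),
        ("pathway:W1", "associated_with", "disease:D1"),
        ("compound:C2", "binds", "protein:P2"),
    ]
    for hop in evidence_path(edges, "compound:C1", "disease:D1"):
        print(hop)
```

Missing-link prediction methods typically go further and score many such paths statistically; this sketch only retrieves one.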

4:40pm-4:45pm Concluding Remarks
CINF: Beyond Citations: Challenges & Opportunities in Altmetrics 1:30pm - 4:55pm
Sunday, August 21
Room 112B - Pennsylvania Convention Center
Elsa Alvaro, Rachel Borchardt, Organizing
Elsa Alvaro, Rachel Borchardt, Matthew Hartings, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-1:55pm CINF 21: Altmetrics in the library
Anne Rauh, aerauh@syr.edu

Syracuse University, Syracuse, New York, United States
Research libraries assist scholars in demonstrating the value of their scholarly output through citation metrics and other measures. As the forms of scholarly communication change, so do the metrics used for assessing them. The services libraries offer must evolve in concert with these changes. This talk will provide a general overview of the ways in which altmetrics complement traditional citation metrics and will explore how libraries can benefit from engaging with a broader set of metrics to reach a wide range of users. The talk will cover the roles librarians can play in helping researchers and institutions understand the benefits and limitations of these metrics and will discuss how altmetrics are being used in library discovery services, how they are represented in institutional repositories, and how they can drive collection development and other library decisions.

1:55pm-2:15pm CINF 22: Trusting altmetrics: updates from NISO's recommended practices
Todd Carpenter, tcarpenter@niso.org

National Information Standards Organization (NISO), Baltimore, Maryland, United States
For decades, the coin of the realm in scholarly assessment has been citations. But as content has moved to digital distribution, the variety of ways in which activity with and related to scholarship can be tracked and reported on has grown rapidly. Understanding what these data streams are, how they can be aggregated, and how they correlate to traditional metrics are all important elements to a network of assessment that can be trusted. Over the past three years, the National Information Standards Organization has been exploring these issues and is putting forward a set of recommended practices related to new forms of assessment. This talk will cover the ways in which the community is beginning to use non-traditional metrics and how the NISO recommendations will support a network of trust around these metrics.

2:15pm-2:35pm CINF 23: Tell the full story of your research with altmetrics
William Gunn, william.gunn@mendeley.com

Mendeley, Mountain View, California, United States
Managing attention only becomes more and more important as the rate at which information is produced grows. One way to do this is to focus on a few well-curated sources, but another way which puts the reader and the author more in control is to use metrics to help you filter a broader range of sources. This presentation will cover the kinds of metrics that are useful for discovery of scholarly content, using chemistry-focused examples from Mendeley and Scopus, and will also discuss how the scholarly community is dealing with challenges such as trust and reliability of data sources.

2:35pm-2:55pm CINF 24: Is that a wart or a beauty mark? An altmetrics analysis of an assistant professor’s scholarly activity
Matthew Hartings1, hartings@american.edu, Rachel Borchardt2

1 Chemistry, American University, Gaithersburg, Maryland, United States; 2 American University, Washington, District of Columbia, United States
In an effort to better quantify the research impact of practicing scientists a number of alternative metrics (altmetrics) have been developed to complement traditional metrics such as h-index and journal impact factor. The problem for chemical researchers (which is magnified for early career independent chemists) is understanding how and when to employ altmetrics. Specifically, a chemist must know what each metric is supposed to analyze and must understand what this analysis can and cannot tell. Only then will a researcher be able to decide if a particular altmetric can and should have sway over decisions that impact their career (made by deans, provosts, journal editors, and funding agencies). For this talk, a scholar of alternative metrics (Rachel Borchardt) has performed an analysis of the scholarly activities of a pre-tenure faculty member (Matthew Hartings). Matthew will discuss Rachel’s findings and try to put each metric into a proper and strategic perspective, discussing along the way whether the analysis notices a beauty mark or a blemish (or perhaps a little of both) on his résumé.
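For readers less familiar with the traditional metrics mentioned above, the h-index is simple to state in code: it is the largest h such that at least h papers have h or more citations each. The citation counts below are made up for illustration.

```python
def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    h = 0
    # Rank papers from most to least cited; h is the last rank whose
    # paper still has at least as many citations as its rank.
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


if __name__ == "__main__":
    print(h_index([10, 8, 5, 4, 3]))   # five papers
    print(h_index([25, 8, 5, 3, 3]))   # one highly cited paper changes little
```

The second example hints at what a single number can hide: one heavily cited paper barely moves the h-index, which is part of the motivation for looking at complementary metrics.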

2:55pm-3:15pm CINF 25: Imperfect impact
Stuart Cantrill, stuartcantrill@gmail.com

Nature Chemistry, Cottenham, United Kingdom
Journals such as Nature Chemistry (and many others) are based on a model in which a large proportion of the submissions that they receive are ultimately declined for publication. The role of an editor working on one of these selective titles is to try to gauge just how interesting or significant (or useful) the papers that it receives are potentially going to be. This is not an easy task (and certainly not always a thankful one) especially when you consider that it is often still difficult to measure what the impact of any particular paper is in the years *after* it has been published. Do editors select papers based on citation potential or the past performance of papers on that topic (or indeed the same set of authors)? Is it even possible to predict if a paper will be a citation blockbuster? And exactly what do citations measure anyway? As editors, we do also look at other metrics to see how much attention papers are getting, but even then, if a paper is all over Twitter, is that a good thing? (Hint: maybe, but maybe not). And surely a large number of page views suggests that a paper is quite popular? (Hint: yes, but have you considered why?). This will be an editor's take on metrics, what their value is and what they do or don't mean.

3:15pm-3:30pm Intermission
3:30pm-3:50pm CINF 26: Advanced Research Projects Agency – Energy (ARPA-E): The mechanism and metrics of funding transformational technology for energy innovation
Daniel Cunningham, Daniel.Cunningham@hq.doe.gov

US Department of Energy, Advanced Research Projects Agency – Energy (ARPA-E), Washington, District of Columbia, United States
The U.S. Department of Energy’s Advanced Research Projects Agency – Energy (ARPA-E) was established in 2009 to fund early stage transformational energy technologies that are too risky for private-sector investment alone. ARPA-E’s investment portfolio aims to generate options to address specific energy challenges that could provide dramatic benefits for the nation. The Agency is beginning to see real commercial impact in the areas that contribute to ARPA-E’s mission of promoting a more secure, affordable and sustainable American energy future.
ARPA-E invests in a range of different technologies across the energy spectrum such as renewable energy production and storage, energy efficiency, and bio-mass. The investment topics vary by year as ARPA-E identifies high potential and high impact technology “white spaces”. Selection metrics include the transformative nature of the technology, the potential impact of the technology on ARPA-E’s energy goals, potential environmental impact, and the potential for the project to yield commercial applications that benefit U.S. economic and energy security.
Within ARPA-E, the role of the Technology to Market group is to maximize the deployment, and ultimately impact, of funded projects. This includes supporting project teams with their commercialization efforts and positioning them to be attractive to private or public investment to carry them through to deployment.
Ideally, metrics of success would be directly related to ARPA-E goals, such as a quantified reduction of greenhouse gas emissions or a reduced amount of imported oil. However, since these metrics are slow to evolve, ARPA-E is required to use intermediary metrics such as private follow-on funding, new companies formed, and post-project government partnerships to quantify success, with the assumption that they will predict the impact on ARPA-E's mission.

3:50pm-4:10pm CINF 27: Responsible usage of diverse research metrics
Lisa Colledge, L.Colledge@elsevier.com

Elsevier, Amsterdam, Netherlands
There continues to be much discussion about the responsible use of research metrics, and the value that they can offer. This presentation will cover current best practice, and discuss how a basket of carefully chosen metrics can be used to support human judgment in decision making. We will look at the importance of combining novel metrics, such as altmetrics, with more familiar metrics to enable benchmarking that provides input into a wide range of activities.

4:10pm-4:30pm CINF 28: Investigating impact metrics for performance for the US-EPA National Center for Computational Toxicology
Antony Williams, tony27587@gmail.com, Monica Linnenbrink, Kevin Crofton, Russell Thomas

National Center for Computational Toxicology, U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina, United States
The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data-driven approaches that integrate chemistry, exposure and biological data. We have delivered public access to terabytes of open data, as well as to a large number of publicly accessible databases and applications, to support the research efforts of a large community of scientists. Many of our contributions to science are summarily described in research papers, but to date we have not optimized our contributions to inform altmetrics statistics associated with our work. Critically missing from altmetrics is access to our numerous software applications and web service accesses, as well as the growing importance of our experimental data and models (e.g., ToxCast, ExpoCast, DSSTox and others) to the scientific and regulatory communities. This presentation will provide an overview of our efforts to more fully understand, and quantify, our impact on the environmental sciences using a combination of our measurement approaches and available altmetrics tools. This abstract does not reflect U.S. EPA policy.

4:30pm-4:50pm CINF 29: Altmetrics: What has been the impact on ACS Publications?
Jeff Lang, j_lang@acs.org

ACS, Washington, District of Columbia, United States
In Spring of 2016, ACS Publications placed Altmetric badges on individual articles. This talk examines how readers and authors have used and reacted to this new source of information about an article's impact. We will analyze the data and take feedback from the audience on the value of these features on the ACS Publications website.

4:50pm-4:55pm Concluding Remarks
CINF: CINF Scholarships for Scientific Excellence 6:30pm - 8:30pm
Sunday, August 21
Howe - Loews Philadelphia Hotel
6:30pm-8:30pm CINF 30: Virtual nanoparticles

Wenyi Wang1, wwyi6@hotmail.com, Alexander Sedykh3, Linlin Zhao1, Bing Yan2, Hao Zhu3,1

1 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 2 Shandong University, Jinan, China; 3 Department of Chemistry, Rutgers University, Camden, New Jersey, United States
Drug delivery using nanomaterials (e.g., nanoparticles) is a promising method to achieve cell recognition. Folate receptors (FRs), which are overexpressed in many human cancer cells, have been used as ideal targets for the treatment of cancer and inflammatory diseases over several decades. In this study, we compiled a dataset consisting of 30 mono-ligand gold nanoparticles (GNPs) and 30 dual-ligand GNPs on cell recognition and uptake against four human cancer cell lines that express different levels of FR. Quantitative nanostructure toxicity relationship (QNTR) models were developed using this dataset. Specifically, we simulated the surface chemistry of GNPs by constructing virtual nanoparticles with various surface ligands. The receptor-binding affinities correlate with important surface properties (e.g., surface shape, electron density, etc.), which can be calculated from the virtual nanoparticles. Various modeling approaches (e.g., random forest, support vector machine, etc.) were applied to the resulting surface chemical descriptor set using a ten-fold cross-validation procedure. The modeling results clearly indicate the relationship between nanostructure (i.e., GNPs with different ligands) and cell recognition. The validated models can be used to design new GNPs with desired cell recognition, and the developed virtual nanoparticle model can be used to evaluate other nanotoxicity endpoints for new nanomaterials.


The core of this study is to simulate the surface chemistry of GNPs (Figure 1) during the modeling procedure. Ligands attached to the Au core are less accessible than free ligands. A novel “expose parameter”, ranging from 0 to 1, was designed by calculating the distance of each ligand atom from the Au core and the ligand density.
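
The ten-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the folds were built, so the helper names (`k_fold_indices`, `train_test_splits`), the shuffling, and the seed below are assumptions made only for illustration; the sole detail taken from the abstract is the dataset size of 60 GNPs.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives fold sizes differing by at most one.
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 60 nanoparticles (30 mono-ligand + 30 dual-ligand), as in the abstract.
splits = list(train_test_splits(60, k=10))
```

Each of the ten splits holds out six nanoparticles for testing and trains on the remaining 54, so every nanoparticle is predicted exactly once by a model that never saw it.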

6:30pm-8:30pm CINF 31: Experimental errors in QSAR modeling sets: What we can do and what we cannot do

Linlin Zhao2, zhaolin9142@gmail.com, Wenyi Wang1, Alexander Sedykh4, Hao Zhu3

1 Rutgers University, Camden, New Jersey, United States; 2 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 3 Chemistry Department, Rutgers University, Camden, New Jersey, United States; 4 Multicase Inc., Beachwood, Ohio, United States
Numerous data sources have become available for quantitative structure–activity relationship (QSAR) modeling studies. However, the quality of these data sources may differ depending on the nature of the experimental protocols. In this study, we explored the relationship between QSAR modeling performance and the ratio of questionable data in the modeling sets, which was obtained by simulating experimental errors. To this end, we used eight datasets (four continuous endpoints and four binary endpoints) that have been extensively curated in our lab to create over 1,800 QSAR models. Each dataset was duplicated into seven new modeling sets with different ratios of simulated experimental errors (i.e., randomizing the activities of part of the compounds). Five-fold cross-validation was used to assess model performance, which worsened as the ratio of experimental errors increased. All the resulting models were also used to predict external sets of new compounds, which were excluded at the beginning of the modeling process. The modeling results showed that the compounds with relatively large prediction errors in cross-validation are more likely to be those with simulated experimental errors. However, after removing a certain number of compounds with large prediction errors in cross-validation, the external predictions of new compounds did not improve. Our conclusion is that QSAR predictions, especially consensus predictions, are able to indicate compounds with potential experimental errors. However, removing those compounds does not improve model performance, owing to overfitting. Apparently, additional experimental testing is necessary for compounds found to be questionable by QSAR predictions.
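
The error-simulation step described above (randomizing the activities of a fraction of the compounds) might look like the sketch below. The abstract does not give the exact randomization scheme, so this version makes an assumption: the activities of the selected compounds are permuted among themselves, which works for both binary and continuous endpoints. The function name, the toy endpoint, and the ratios are hypothetical.

```python
import random


def simulate_experimental_errors(activities, ratio, seed=0):
    """Return (noisy_activities, chosen_indices) for a given error ratio.

    Assumed scheme: a `ratio` fraction of compounds is picked at random and
    their activities are permuted among themselves, so the overall activity
    distribution is preserved. Some picked compounds may keep their original
    value by chance, especially for binary endpoints.
    """
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_noisy = round(ratio * len(activities))
    chosen = rng.sample(range(len(activities)), n_noisy)
    shuffled_values = [activities[i] for i in chosen]
    rng.shuffle(shuffled_values)
    noisy = list(activities)
    for idx, value in zip(chosen, shuffled_values):
        noisy[idx] = value
    return noisy, sorted(chosen)


# Toy binary endpoint for 100 compounds (made-up data, not from the study).
activities = [i % 2 for i in range(100)]
noisy_sets = {
    ratio: simulate_experimental_errors(activities, ratio, seed=1)
    for ratio in (0.0, 0.1, 0.2, 0.4)
}
```

Keeping the list of chosen indices makes it possible to check afterwards whether the compounds with the largest cross-validation errors are indeed the ones whose activities were randomized.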

6:30pm-8:30pm CINF 32: Combining proprietary and published data in synthesis planning and reaction mining using Wiley ChemPlanner
Orr Ravitz, David Flannagan, Joyce Theisen, jtheisen@wiley.com

Research Informatics, John Wiley & Sons, Hoboken, New Jersey, United States
Organic synthesis is a vital component of the drug discovery process, essential to achieving high productivity, efficiency and novelty along the entire development pipeline. The knowledge accumulated in pharmaceutical companies in this domain is one of the most valuable assets of those organizations, and as such requires both protection and means of dissemination. Yet more often than not, the accessibility and utilization of this knowledge is limited. Data are frequently scattered across several systems in non-standardized formats, and their discoverability is wanting. ChemPlanner, the state of the art in computer-aided synthesis design (CASD), typically offered as an online service, will see the release in the next few weeks of a version that can be hosted locally, behind the organization’s firewall. This platform can host the users’ proprietary data along with databases of published reactions, and thus enables federated queries of reaction retrieval and retrosynthesis within a secured environment.

ChemPlanner is a synthesis planning tool that allows chemists to consider broader sets of synthetic approaches to make their target molecules by carrying out retrosynthetic analysis based on large reaction databases and reaction rules that are derived from them. ChemPlanner also has integrated “traditional” searching capabilities such as structure, substructure and similarity searches, with metadata fields offering means of refining results sets. The retrosynthetic search can identify a large spectrum of synthetic routes leading to the target from commercially available starting materials, with literature examples supporting each reaction step and giving essential experimental information. Users’ reactions and starting material collections can supplement the provided data sources and offer to discovery and process chemists even broader coverage of synthetic know-how.

In the poster we give a general overview of ChemPlanner via test cases. We discuss the main principles of extracting synthetic knowledge from reaction databases, including rules, stereoselectivity and functional group tolerance, and the means to segregate data sources and extracted knowledge to guarantee the integrity of intellectual property. The requirements for the data source and the available search options will also be shown.

6:30pm-8:30pm CINF 33: Modeling spectrophotometric titration data: tracking error from the measurement, through the model, and to the targeted output parameters
Nathanael Kazmierczak, kazmierczak314@gmail.com, Douglas Vander Griend

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States
Spectrophotometric titrations are a simple and powerful way to thermodynamically characterize multicomponent systems. While an increasing number of researchers rely on factor analysis to deconvolute the data in order to determine binding constants, the non-linear relationship between binding constants and spectroscopic data makes it non-trivial to ascertain the error on the former when modeling the latter. An appraisal of direct methods of error analysis is presented alongside Monte Carlo simulations. Data sources include both real examples and artificially generated spectrophotometric titration data.


Spectrophotometric data (left), upon being deconvoluted into molar absorptivity curves and equilibrium concentration traces (center), still exhibit a residual (right) due to various forms of error in the measurement and the model.
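
The idea of Monte Carlo error propagation from measurement to binding constant can be sketched as follows. This is not the authors' factor-analysis workflow: it assumes a deliberately simple single-wavelength 1:1 binding isotherm with the titrant in large excess, a known limiting absorbance, Gaussian measurement noise, and a one-parameter least-squares fit by golden-section search. All names and numerical values (`true_k`, `sigma`, the concentration grid) are hypothetical.

```python
import math
import random
import statistics


def absorbance(k_bind, guest_conc, a_max=1.0):
    """Assumed 1:1 isotherm: A = A_max * K[G] / (1 + K[G])."""
    return a_max * k_bind * guest_conc / (1.0 + k_bind * guest_conc)


def fit_log_k(guest, observed, lo=1.0, hi=6.0, tol=1e-6):
    """Least-squares estimate of log10(K) by golden-section search on [lo, hi]."""
    def sse(log_k):
        k = 10.0 ** log_k
        return sum((a - absorbance(k, g)) ** 2 for g, a in zip(guest, observed))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if sse(c) < sse(d):
            hi, d = d, c          # minimum lies in [lo, d]
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d          # minimum lies in [c, hi]
            d = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2.0


def monte_carlo_k(guest, true_k=1000.0, sigma=0.005, n_trials=200, seed=0):
    """Refit K on noisy synthetic titrations; return mean and std of the fits."""
    rng = random.Random(seed)
    clean = [absorbance(true_k, g) for g in guest]
    estimates = []
    for _ in range(n_trials):
        noisy = [a + rng.gauss(0.0, sigma) for a in clean]
        estimates.append(10.0 ** fit_log_k(guest, noisy))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Ten titration points from 0.5 mM to 5 mM titrant (made-up values).
guest = [0.0005 * i for i in range(1, 11)]
mean_k, sd_k = monte_carlo_k(guest)
```

The spread of the refitted constants (`sd_k`) is the Monte Carlo estimate of the uncertainty in K that results from the assumed measurement noise; a direct method would instead derive it from the curvature of the least-squares surface at the optimum.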

6:30pm-8:30pm CINF 34: Dark reactions project: A cheminformatics approach to hydrothermal syntheses
Philip Adler2, padler1@haverford.edu, Joshua Schrier2, Alex Norquist1, Sorelle Friedler1

1 Haverford College, Bryn Mawr, Pennsylvania, United States; 2 Chemistry, Haverford College, Haverford, Pennsylvania, United States
Cheminformatics covers a wide remit of problems, including, but by no means limited to, statistical and machine learning models and their validation, consistent descriptions of systems under examination and, frequently, software engineering. This poster highlights my work in each of these spheres as aspects of the Dark Reactions Project[1] at Haverford College, an open-source and publicly available hydrothermal synthesis database with a web-based interface and associated software, which has been used to build models of hydrothermal synthetic reactions and to make predictions and hypotheses about those systems by harnessing big data approaches.
[1] http://darkreactions.haverford.edu

6:30pm-8:30pm CINF 35: Adverse drug reactions triggered by the common HLA-B*57:01 variant: A molecular docking study

George Van Den Driessche, gavanden@ncsu.edu, Denis Fourches

Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States
Human leukocyte antigen (HLA) genes encode cell surface proteins involved in key signaling mechanisms of the immune system.1 Recently, these proteins have been shown to be directly responsible for idiosyncratic adverse drug reactions (ADR).1–3 Herein, building upon our first proof-of-concept docking study with clozapine2, we present the analysis of the common HLA-B*57:01 variant that is notably responsible for the abacavir hypersensitivity syndrome. First, we analyzed three crystal structures (PDB codes: 3VRI, 3VRJ, and 3UPR) involving the HLA-B*57:01 protein variant as well as the anti-HIV drug abacavir and different endogenous peptides co-bound in the antigen-binding cleft.3,4 We superimposed the three structures and showed that abacavir had no significant conformational variation regardless of the co-bound peptide (Figure 1). Second, we used Schrödinger’s Glide software to evaluate the abacavir-HLA binding affinity. The docking scores for abacavir without peptide in the cleft were as low as -8.27 and -7.99 kcal/mol using SP and XP scoring functions, respectively. In the presence of an endogenous co-binding peptide, we found a significant increase (~2 kcal/mol) of the docking scores and a key abacavir-peptide hydrogen bond indicating that the peptide plays a role in stabilizing the HLA-drug complex. Third, we docked a small set of drugs with known ADRs (e.g., allopurinol, fenofibrate, simvastatin) and analyzed their binding affinities toward HLA-B*57:01. Our presentation will focus on these drug-specific interactions with the B*57:01 variant and their matching with known HLA-mediated ADRs. This study demonstrates the appropriateness of molecular docking for evaluating HLA-drug interactions of high importance for precision medicine.


Figure 1. Three-dimensional superimposition of 3VRI, 3VRJ, and 3UPR containing the HLA-B*57:01 protein variant and the anti-HIV drug abacavir

6:30pm-8:30pm CINF 36: ChemML: A machine learning and informatics program suite for the chemical and materials sciences

Mojtaba Haghighatlari1, mojtabah@buffalo.edu, Johannes Hachmann1,2

1 Chemical and Biological Engineering, University at Buffalo, Buffalo, New York, United States; 2 New York State Center of Excellence in Materials Informatics, Buffalo, New York, United States
The idea of utilizing modern data science in chemical and materials research has recently gained considerable attention. However, tools and techniques that could facilitate this work have oftentimes not yet been developed or are still in their infancy. Existing expertise tends to be in-house, specialized, or otherwise unavailable to the community at large. Data science is thus in practice beyond the scope and reach of most researchers in the field. Our work aims to address this situation by creating ChemML, a program suite and software toolbox designed to fill the prevalent infrastructure gap and thus make the application of big data analytics in the chemical and materials context – e.g., via machine learning and informatics – a viable and widely accessible proposition. ChemML can be employed for the validation, analysis, mining, and modeling of large-scale data sets. Its primary purpose is to uncover hidden structure-property relationships that govern the behavior of chemical and materials systems. These insights are a prerequisite for rational design and inverse engineering capability as outlined in the White House Materials Genome Initiative.

A key consideration of our work is to make ChemML as comprehensive, black-box, and user-friendly as possible, so that it can be readily employed by interested researchers without the need for excessive expert knowledge. Our presentation will detail the code design and modular structure of ChemML, its capabilities, methodological advances, and initial proof-of-principle applications.

CINF: Chemistry Data for the People: From Policy to Practice 8:05am - 12:00pm
Monday, August 22
Room 112A - Pennsylvania Convention Center
Evan Bolton, Ian Bruno, Darla Henderson, Leah McEwen, Organizing
Darla Henderson, Leah McEwen, Presiding
Cosponsored by MPPG
8:05am-8:15am Introductory Remarks
8:15am-8:25am CINF 37: Viewpoint on open access by an editor, author, reviewer, and reader
Jonathan Sweedler, jsweedle@illinois.edu

Chemistry, University of Illinois, Urbana, Illinois, United States
Open access to published articles and to the associated research data supporting the article is a requirement of many funders around the globe. Having ready access offers value not only to medical and life sciences, but also to chemistry as a discipline. As Editor-in-Chief of Analytical Chemistry, and as a frequent open access author, reviewer, and reader, I present my perspectives on the value that open brings to research, to education, and to public awareness of the sciences.

8:25am-8:35am CINF 38: Data generation, publication and sharing
Richard Kidd, kiddr@rsc.org

Royal Society of Chemistry, Thomas Graham House, Cambridge, United Kingdom
A view from the Royal Society of Chemistry on the evolving landscape around data within the chemical sciences.

8:35am-8:45am CINF 39: Implementing a data sharing policy: A publisher perspective
Raymond Boucher, rboucher@wiley.com, Kathryn Sharples

John Wiley and Sons Ltd, Chichester, United Kingdom
The data revolution is underway. Opening access to the world's research data offers huge potential to improve the transparency of research, accelerate the pace of discovery, improve return on investment, and lead to a future in which more research can be independently verified or made reproducible. Publishers, funders, and researchers have a shared responsibility to create an ecosystem that supports the sharing of data. To help address this, Wiley has been implementing a Data Sharing Service that enables authors to transfer or link to data in approved repositories. This service is designed to increase discoverability, encourage innovation, and help authors comply with journal or funder mandates. Topics covered in the presentation will include the implementation of a data sharing policy, data accessibility statements, the licenses associated with data, and guidelines to authors on how to share data. The talk will illustrate how the publisher supports and promotes data sharing in the community and how it provides guidance to enable effective data sharing.

8:45am-8:55am CINF 40: Ten habits of happy data: An exploration of Elsevier’s research data management program
Anita De Waard1, A.dewaard@elsevier.com, William Gunn2, william.gunn@mendeley.com

1 Elsevier RDMS, Elsevier Inc., Jericho, Vermont, United States; 2 Mendeley, Elsevier Inc., Mountain View, California, United States
The main tenet of data science is that new science can be done on old data. To make this possible, however, the data needs to be collected and stored in a way that allows downstream scrutiny, validation, and use. This calls for a connected infrastructure of research data management tools, in which the inputs and outputs of the many tools and parties currently involved in data creation, storage, and access work together. In this talk, we will discuss how different components of a research data ecosystem can work together to address data preservation, curation, archiving, access, comprehension, reproducibility, discoverability, trust, citation, and re-use. We will present a series of initiatives in which Elsevier is partnering with research institutions to improve such ecosystems, and in particular consider the role of chemical data within this framework.

8:55am-9:05am CINF 41: Changing workflows and mindsets
Martin Hicks, mhicks@beilstein-institut.de

Beilstein Institut, Frankfurt, Germany
We are in the middle of a large technological revolution. The internet of things is knocking on the door – household appliances, electricity grids, medical monitoring and automobiles, for example, are becoming interconnected. Chemistry is also changing, and machines and intelligent control systems are starting to have an impact. To date, one of the technologies most resilient to change has been scientific publishing. The basic functions of sharing research results need to be re-engineered. The current workflows need to be changed to accommodate better data reporting, validation and sharing – and they will change, as will the mindset of the practitioners when the theoretical advantages become practical ones.

9:05am-9:35am Panel Discussion
9:35am-9:45am Discussion
9:45am-10:00am Intermission
10:00am-10:10am CINF 42: NSF MPS Open Data workshop series: Taking the pulse of the research community on open data issues
Mike Hildreth2, mhildret@nd.edu, Leah McEwen1

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Department of Physics, University of Notre Dame, Notre Dame, Indiana, United States
We have begun to coordinate a series of two NSF-funded workshops aimed at gauging the needs of the community of NSF MPS researchers in terms of data preservation, the infrastructure required to make research data public and useful, and a response to potential guidelines that might be implemented to ensure the preservation of scientific knowledge without undue burden on the researchers, while making the data available in a useful way to the public. The first workshop was held in Arlington, VA, in November 2015 and has resulted in a preliminary report framing the opinions of the MPS community. A second workshop will be held in fall 2016 to adapt the report in order to incorporate feedback from the research community “at large”, collected at venues like this meeting. The presentation will include an overview of the conclusions presented in the report. The report can be accessed at https://mpsopendata.crc.nd.edu/

10:10am-10:20am CINF 43: Open Data: What the reader wants to know rather than what the author wants to present
Robin Rogers, robin.rogers@mcgill.ca

Department of Chemistry, McGill University, Montreal, Quebec, Canada
The ACS journal Crystal Growth & Design has promoted the concept of ‘open data’ since its inception in 2000. Crystallographic data, with the current repositories and software to view and analyse the data, offer a glimpse of a future in which all data is available for analysis and use from the perspective of what the reader wants to know or study, rather than what the author wants to present. In this presentation some of the advantages and disadvantages of open data will be discussed in view of the idealistic goals and the pragmatic realities.

10:20am-10:30am CINF 44: Role of disciplinary data repositories in data publishing
Ian Bruno, bruno@ccdc.cam.ac.uk, Amy Sarjeant, Erin Davis

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom
When it comes to open sharing of data, disciplinary repositories have been enabling this for many years. Today, the Cambridge Structural Database (CSD) enables access to over 820,000 crystal structures which may be associated with journal articles but are increasingly being published independently as CSD Communications. Disciplinary repositories can provide streamlined deposition mechanisms that offer value to the researcher who generates the data and lower the barriers to efficient and effective deposition. Crucially, disciplinary repositories provide the domain expertise necessary to make the data discoverable and reusable across a range of subject areas. In the case of crystallography, the CSD enables chemists to discover and use crystallographic data and knowledge in ways that are best suited to their specific application areas and operating environment. This presentation will explain the role that the CSD has in enabling meaningful and applicable publication of research data.

10:30am-10:40am CINF 45: Figshare data repository
Dan Valen, dan@figshare.com

Figshare, Brooklyn, New York, United States
Good data management and infrastructure are at the foundation of reproducible research. This talk will touch on the evidence and challenges for reproducibility we’ve seen at Figshare and will delve deeper into the funder policies and incentives to motivate different stakeholders and communities toward best practices and workflows to achieve transparency in scientific research.

10:40am-10:50am CINF 46: Importance of open raw data in chemistry research
Santiago Dominguez Vivero, sdominguez@mestrelab.com, Carlos Cobas, Agustin Barba, Felipe Seoane, Santiago Fraga

Mestrelab Research SL, Feliciano Barrera, Santiago de Compostela, Spain
Primary raw data is fundamental to the integrity and quality of the publication process in chemistry. All stakeholders in publication (publishers, readers, authors and reviewers) would derive very significant benefits from the inclusion of primary raw data with publications.

Raw data is critical to ensure the integrity of the published materials, but has many other benefits, such as:
- Reproducibility (other scientists can reproduce the same analyses and the same analysis results)
- Interactivity (allows other scientists to interact with the data generated from the relevant experiments)
- Knowledge building: the community as a whole can build knowledge based on the primary research data and leverage this knowledge for future research.

Mestrelab is presenting here the results of our efforts to support the inclusion of chemistry analytical primary raw data in the publication process. We present results from the following initiatives:
- Automatic generation of formatted primary raw data for inclusion in publications
- Published analytical raw data review tools freely available to the chemistry community
- The role of ELNs in facilitating the sharing of research data, and an example of an ELN with a readily exportable data structure which significantly facilitates the sharing of chemistry data.

10:50am-11:00am CINF 47: Practical issues in chemistry data sharing in PubChem
Sunghwan Kim, Evan Bolton, Steve Bryant, Yanli Wang, ywang@ncbi.nlm.nih.gov

National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States
PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public archive that contains information on a broad range of chemical entities, including small molecules, lipids, carbohydrates, and (chemically-modified) amino acid and nucleic acid sequences (including siRNA and miRNA). Currently, PubChem contains more than 219 million substance descriptions, 88 million unique chemical structures, and 229 million bioactivity test results from one million bioassays, covering about ten thousand target protein sequences. This vast amount of chemical information is contributed from more than 400 data sources, including government agencies, academic institutions, pharmaceutical companies, chemical vendors, publishers, and other databases.
Chemical data sharing through a public database like PubChem presents some unique challenges. Although funding agencies can mandate sharing of data generated in studies they support, many organizations in the private sector, like publishers and chemical vendors, are not required to submit their data to PubChem or other public databases. Then, why would they share their data with other people? What kind of data would they want to share? What benefits can data sharing provide for these private entities? In addition to these questions, data sharing among more than 400 data sources raises many technical issues to consider. When multiple data sources provide redundant data on the same chemical, how can we extract unique information? What should we do if discrepancies exist in data from different sources? In this presentation, we will discuss how these issues are handled in PubChem.
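
The two technical questions raised above (extracting unique information from redundant depositions, and spotting discrepancies between sources) can be illustrated with a toy sketch. It does not describe PubChem's actual processing: the record layout, the `structure_key` field, the source names, and every value below are made up for illustration, and a real pipeline would first need to standardize structures before records can be grouped at all.

```python
from collections import defaultdict


def merge_depositions(records):
    """Group depositor records by structure key and flag conflicting values."""
    # structure_key -> property -> {source: value}
    grouped = defaultdict(lambda: defaultdict(dict))
    for rec in records:
        grouped[rec["structure_key"]][rec["property"]][rec["source"]] = rec["value"]

    merged = {}
    for key, props in grouped.items():
        merged[key] = {
            prop: {
                "values_by_source": dict(by_source),
                # A discrepancy exists when sources report different values.
                "discrepancy": len(set(by_source.values())) > 1,
            }
            for prop, by_source in props.items()
        }
    return merged


# Hypothetical depositions; keys, sources and values are illustrative only.
records = [
    {"structure_key": "STRUCT-A", "source": "vendor_1", "property": "formula", "value": "C9H8O4"},
    {"structure_key": "STRUCT-A", "source": "journal_1", "property": "formula", "value": "C9H8O4"},
    {"structure_key": "STRUCT-A", "source": "vendor_1", "property": "melting_point_c", "value": 135.0},
    {"structure_key": "STRUCT-A", "source": "agency_1", "property": "melting_point_c", "value": 138.0},
    {"structure_key": "STRUCT-B", "source": "agency_1", "property": "formula", "value": "C8H10N4O2"},
]
merged = merge_depositions(records)
```

Keeping the per-source values alongside the discrepancy flag preserves provenance, so a user can see which depositor reported which value rather than receiving a silently averaged or overwritten number.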

11:00am-11:30am Panel Discussion
11:30am-11:40am Discussion
11:40am-12:00pm CINF 48: Value of open data for chemists: Summary and perspectives
Judith Currano, currano@pobox.upenn.edu

Chemistry Library, University of Pennsylvania, Philadelphia, Pennsylvania, United States
The expansion of data sharing and archiving requirements by funding agencies has forced the issue of open data in the chemistry community, causing researchers to think more about methods by which they can make their data available to others and providing them with additional data streams for their research. Librarians and other information and data professionals are working to support these new open data initiatives, to help researchers develop and enhance their own best practices for managing and presenting data, and to help researchers integrate good data practices into the research life-cycle, using many of the same techniques that they have historically used to promote the integration of sound information-seeking practices in research. This summary presentation will offer a chemistry librarian's practical reflection on the key issues and takeaways from the preceding panel discussion.

CINF: Shedding Light on the Dark Genome: Methods, Tools & Case Studies 8:15am - 12:00pm
Monday, August 22
Room 112B - Pennsylvania Convention Center
Rajarshi Guha, Tudor Oprea, Organizing
Rajarshi Guha, Tudor Oprea, Presiding
Cosponsored by BIOT, COMP and MEDI
8:15am-8:40am CINF 49: Illuminating the druggable genome: Linking diseases, targets and drugs
Tudor Oprea, toprea@salud.unm.edu

University of New Mexico, Albuquerque, New Mexico, United States
The Illuminating the Druggable Genome Knowledge Management Center (IDGKMC) evaluates, organizes and distils more than 80 protein-centric and over 20 gene-centric resources, currently focused on G-Protein Coupled Receptors, Nuclear Receptors, Ion Channels and Kinases. Data wrangling, coupled with algorithmic processing, text mining of drug labels, patents, medical literature, as well as human curation and drug-target ontology development, yield emergent properties and knowledge for target-disease associations. Tissue expression data from GTEx and other sources, disease-centric text mining and other resources are integrated using a number of specialized ontologies, e.g., the Disease Ontology. Using metrics derived from text mining and gene reference into function, as well as the number of antibodies, IDGKMC catalogs proteins into four categories: “Tdark”, i.e., proteins that lack functional information and disease relevance; “Tclin”, proteins with a confirmed drug mechanism of action; “Tchem”, proteins for which potent small molecules are known; and “Tbio”, proteins for which literature, functional and disease annotation data are available (see Figure 1). Data can be mined via the user interface portal, Pharos.
This integrative effort led to the following observations: i) there appears to be a knowledge deficit, i.e., we lack understanding of protein function for 38% of the human proteome; ii) less than 3% of the human proteome is therapeutically addressed by drugs; iii) given current understanding of disease (~8,800 disease concepts), as well as all diseases addressed via on-label (~2,000) and off-label (~400) indications, we currently address at most a quarter of all diseases via therapeutic agents.


Figure 1: Target Development Levels for the Human Proteome, including four 'druggable' protein families.

8:40am-9:05am CINF 50: Tracking biological targets in drug discovery using the ChEMBL and SureChEMBL databases
Prudence Mutowo, prudence@ebi.ac.uk

ChEMBL, EMBL-EBI, Hinxton, Cambridge, United Kingdom
The identification of potential drug targets and up-to-date knowledge about the extent to which these targets have been studied are essential in giving impetus to the next wave of drug discovery efforts.
This talk describes how, as part of the Illuminating the Druggable Genome Program, we used data curation efforts to retrieve drug discovery-relevant information for understudied targets in the main drug target families. We will detail our use of publicly available data sources such as the ChEMBL database, the SureChEMBL patent resource and clinical trials information to collate compound and disease information for understudied protein targets, and present some of the challenges we encountered while doing so. We present highlights of how this information extraction and curation has proved useful in providing insights into proteins occupying the dark part of the ‘druggable genome space’.

9:05am-9:30am CINF 51: Formal ontologies and software tools to facilitate integration, classification and modeling of drug discovery data

Stephan Schürer1,2, stephan.schurer@gmail.com, Asiyah Yu Lin1, Hande McGinty1, Qiong Cheng1, Amar Koleti1, Nooshin Zadeh1, Dusica Vidovic1

1 Center for Computational Science, University of Miami, Miami, Florida, United States; 2 Department of Pharmacology, University of Miami, Miami, Florida, United States
Several research consortia and countless projects in pharmaceutical companies generate, organize, and analyze small molecule drug screening data. Such consortia supported by the NIH Common Fund include the (now past) Molecular Libraries Program (MLP), and currently the Illuminating the Druggable Genome (IDG) and the Library of Integrated Network-based Cellular Signatures (LINCS) projects. A large component of the MLP was the development of chemical probes to study a wide variety of biological questions. This program generated new assay technologies, huge amounts of chemical biology screening data and over 350 chemical probes. The observation of an apparent strong bias of drug discovery research and development efforts towards targets that are already well studied motivated the IDG program to prioritize novel drug targets and catalyze the development of chemical entities to target understudied proteins, with a focus on four protein families: kinases, GPCRs, nuclear receptors and ion channels. The LINCS program has a systems biology focus. The project creates a reference 'library' of molecular signatures, such as changes in gene expression and other cellular phenotypes that occur when cells are exposed to a variety of perturbing agents, and computational tools for data integration, access, and analysis. Dimensions of LINCS signatures include the biological model system (cell type), the perturbation (e.g. small molecules) and the assays that generate diverse phenotypic profiles.
Data integration is a common and critical challenge in these and other projects, and data integration requires common metadata standards and conventions for data representation and exchange. Towards the goal of creating common data standards to represent data in these and other projects that produce data relevant for drug discovery, and to support software tools that we and others have been building as part of these projects, we have been developing ontologies including the BioAssay Ontology (BAO) and the Drug Target Ontology (DTO). The goal of these ontologies is to enable the knowledge-based classification of diverse and large datasets into categories that facilitate re-use and context-specific integration and querying, for example to develop predictive models or to quickly explore and correlate different datasets.
BAO, DTO and other ontologies provide a robust framework to represent, integrate, model, and query diverse drug discovery data generated in different projects.

9:30am-9:40am Intermission
9:40am-10:05am CINF 52: KEA2: Multiple views of the human kinome

Nicolas Fernandez2, nicolas.fernandez@mssm.edu, Andrew Rouillard2, Klarisa Rikova1, Peter Hornbeck1, Avi Ma'ayan2

1 Cell Signaling Technology, Danvers, Massachusetts, United States; 2 Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York, United States
Kinases are a class of cell signaling proteins that control diverse cellular functions through protein phosphorylation. Dysregulation of kinase activity is common in many cancers, and kinases are effective therapeutic drug targets. However, we still have only a partial view of the human kinome in normal physiology and disease. To address this challenge we constructed mammalian kinome networks from over twenty diverse public resources, and developed a web-based tool and database called kinase enrichment analysis version 2 (KEA2). KEA2 can be used to predict kinase activity given proteomics, phosphoproteomics and genomic data. The different views of the human kinome connect kinases based on their known binding partners, known substrates, co-expression, effects on cancer cell lines when knocked down, effects on gene expression when knocked down, and similar roles in disease. As a case study, we applied KEA2 to the analysis of an original, unpublished phosphoproteomics dataset collected from 31 non-small cell lung cancer cell lines. The analysis generated unique kinome signatures for the cell lines that agree with previous knowledge and point to potential new drivers in lung cancer. In conclusion, KEA2 is a useful resource to advance our understanding of the human kinome.
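
The abstract does not say which statistic KEA2 uses to score enrichment. As a minimal sketch of the general idea, the example below ranks kinases by a one-sided hypergeometric test on the overlap between an input protein list and each kinase's known substrates. The function names, the kinase labels, and the background size are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(overlap, set_size, input_size, background):
    """P(X >= overlap) when input_size proteins are drawn from a background
    of which set_size are known substrates of one kinase."""
    max_k = min(set_size, input_size)
    total = comb(background, input_size)
    tail = sum(
        comb(set_size, k) * comb(background - set_size, input_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


def rank_kinases(input_proteins, kinase_substrates, background):
    """Return (kinase, overlap, p-value) tuples, most enriched first."""
    input_set = set(input_proteins)
    results = []
    for kinase, substrates in kinase_substrates.items():
        overlap = len(input_set & substrates)
        p = hypergeom_enrichment_p(
            overlap, len(substrates), len(input_set), background
        )
        results.append((kinase, overlap, p))
    return sorted(results, key=lambda r: r[2])


if __name__ == "__main__":
    # Hypothetical kinase-substrate library and input list.
    library = {
        "KIN_A": {"P1", "P2", "P3"},
        "KIN_B": {"P7", "P8", "P9"},
    }
    for row in rank_kinases(["P1", "P2", "P3", "P4"], library, background=20):
        print(row)
```

The same ranking could be repeated once per "view" of the kinome (binding partners, substrates, co-expression and so on) by swapping in a different library dictionary.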

10:05am-10:30am CINF 53: Pharos - shining light on the druggable genome
Dac Trung Nguyen, Timothy Sheils, Geetha Mandava, Ajit Jadhav, Noel Southall, Rajarshi Guha, rajarshi.guha@gmail.com

NCATS, Manchester, Connecticut, United States

The druggable genome corresponds to the set of protein targets that are amenable to small molecule perturbation. While this set of targets has enormous potential in terms of understanding and treating many disease conditions, the bulk of them are understudied or not studied at all. To address this, the NIH initiated the 'Illuminating the Druggable Genome' program to characterize the dark regions of the druggable genome. As part of this program, a Knowledge Management Center (KMC) was created to aggregate and integrate heterogeneous data sources and data types, creating a centralized location for information about all protein targets identified as part of the druggable genome. In this presentation we describe the design and deployment of Pharos, the user interface for the KMC. Based on modern web design principles, the interface provides facile access to all data types collected by the KMC. We provide an overview of the data sources and types made available via Pharos and then describe the architecture of the system and its integration with KMC & external resources. Given the complexity of the data surrounding any target, efficient and intuitive visualization has been a high priority, to enable users to quickly navigate and summarize search results and rapidly identify patterns. We highlight the approaches we have taken to address this requirement. A critical feature of the interface is the ability to perform flexible search and subsequent drill down of search results. We describe the design of a faceted search interface coupled to the Drug-Target Ontology (DTO) that supports these activities. Underlying the interface is a RESTful API that provides programmatic access to all KMC data, allowing for easy consumption in user applications. We conclude by highlighting some workflows on targets of interest to the IDG program.

10:30am-10:55am CINF 54: From dark chemical matter to shedding light on the dark genome: How can chemistry and informatics enable biology?
Meir Glick, meir.glick@merck.com

Merck Research Laboratories, Boston, Massachusetts, United States
This presentation describes a forward-looking strategy on how integration of screening, chemical synthesis and informatics can enable target identification and validation. More specifically, it covers the design of high quality molecular probes that target novel targets and phenotypes, the creation of the most biologically relevant assay systems, the progression of the right perturbagen into in vivo studies, and eventually the transformation of the organization from data creators into decision makers.

10:55am-11:05am Intermission
11:05am-11:30am CINF 55: KinomeNet: accurate prediction of protein kinase inhibitors with deep convolutional neural networks

Olexandr Isayev, olexandr@olexandrisayev.com, Alexander Tropsha

UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
The human genome encodes 518 protein kinases that are collectively referred to as the human kinome. Kinases are among the most important targets for drug discovery and development in the pharmaceutical industry. A large number of protein kinase inhibitors are either in clinical development or have been approved to treat a wide variety of diseases including cancer, inflammation, diabetes, immunodeficiency and neurological disorders.
Traditionally, QSAR models were developed for every individual target separately. Therefore, accurate prediction of the full kinome profile for a molecule is a great challenge for computational drug discovery. Here, we address this challenge using an approach based on recent advances in machine learning – deep convolutional neural networks (CNNs). These neural networks allow for multi-task learning, a procedure for learning several tasks at the same time with the aim of mutual benefit. Their architecture allows sharing of learnt information across sub-tasks, and therefore joint training for every endpoint. We have applied this approach (termed KinomeNet) to a large dataset of kinase inhibitors extracted from databases (PubChem, ChEMBL), articles and patents. The dataset includes over 250,000 compounds and 369 kinases. We showed that KinomeNet had high average specificity (0.90) and sensitivity (0.87) of prediction, which compared favorably to similar metrics for the state-of-the-art Random Forest models (SP 0.83 and SE 0.82) across over 200 kinases. We posit that the success of KinomeNet is due to information sharing, which is especially beneficial for endpoints with small or highly imbalanced datasets, those that are traditionally most challenging for QSAR methods.
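
For readers less familiar with the two metrics quoted above, the following sketch shows how sensitivity and specificity are computed for one kinase endpoint and then averaged across endpoints. It is not the authors' evaluation code; the function names and the toy labels are assumptions for illustration, with 1 meaning "inhibitor" and 0 meaning "non-inhibitor".

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against endpoints with no positives or no negatives.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def average_over_endpoints(per_endpoint):
    """Unweighted mean of (SE, SP) over a dict: endpoint -> (y_true, y_pred)."""
    pairs = [sensitivity_specificity(t, p) for t, p in per_endpoint.values()]
    mean_se = sum(se for se, _ in pairs) / len(pairs)
    mean_sp = sum(sp for _, sp in pairs) / len(pairs)
    return mean_se, mean_sp


if __name__ == "__main__":
    # Toy, imbalanced endpoint: 3 actives and 5 inactives.
    truth = [1, 1, 1, 0, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(truth, preds))
```

Reporting both numbers matters for imbalanced endpoints, where a model that predicts "inactive" for everything would reach high specificity with zero sensitivity.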

11:30am-11:55am CINF 56: Analogous phylogenetic analysis using protein length and protein disorder
Haobo Guo1, guohaobo@gmail.com, Gerald Tuskan2, Xiaohan Yang2, Hong Guo1

1 BCMB, University of Tennessee Knoxville, Oak Ridge, Tennessee, United States; 2 Biology, Oak Ridge National Laboratory, Oak Ridge, South Dakota, United States
In annotated proteomes from different organisms, there are significant numbers of proteins that do not have homology to other proteins with known structures and/or functions, and hence are termed the “dark matter” of the proteomes. Many intrinsically disordered proteins (IDP), or proteins with large contents of residues in intrinsically disordered regions (IDR), are part of the proteomic “dark matter”. Recognition and understanding of the IDP/IDR have even urged the retirement of Anfinsen’s Dogma in molecular biology, i.e., sequence determines structure determines function. Here, we propose a phylogenetic method covering both protein amino acid sequence length (L) and protein intrinsic disorder (D) of the whole proteomes. The phylogeny reconstructed in the two-dimensional (2D) L-D space sets up an intriguing landscape of the evolutionary dynamics of organisms. This approach can clearly distinguish eukaryotes from prokaryotes. The viral and plasmid gene pools, the giant DNA viruses (Giruses), and the Archezoa (mitochondrion-free eukaryotes) are all located in the eukaryotic basal zone, or the prokaryote-to-eukaryote transition zone. Moreover, the plants and animals, and even plant monocots and eudicots, exhibit different patterns in this L-D space. This method covers all proteins of the proteomes, including IDPs and other proteins in the dark regions. In addition, since the intrinsic disorder of proteins is predictable and the analogous phylogeny reconstructed based on the protein length and protein intrinsic disorder could clearly identify the evolutionary status of the organisms, we argue that Anfinsen’s Dogma should stand for now, as disorder itself could be treated as a special case of structure, or order.
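
To make the idea of an L-D space concrete, the sketch below places a proteome at a single point given per-protein lengths and counts of residues predicted to be disordered. The abstract does not define how L and D are summarized per proteome, so the choice of mean length and overall disordered-residue fraction, as well as the function name, are assumptions for illustration only.

```python
def proteome_ld_point(proteins):
    """Map a proteome to (mean protein length, fraction of disordered residues).

    proteins: iterable of (length, n_disordered_residues) pairs, where the
    disorder counts would come from some disorder predictor (not shown).
    """
    proteins = list(proteins)
    if not proteins:
        raise ValueError("proteome must contain at least one protein")
    total_length = sum(length for length, _ in proteins)
    total_disordered = sum(n_dis for _, n_dis in proteins)
    mean_length = total_length / len(proteins)
    disorder_fraction = total_disordered / total_length
    return mean_length, disorder_fraction


if __name__ == "__main__":
    # Toy proteome with two proteins.
    print(proteome_ld_point([(100, 10), (300, 90)]))
```

Points computed this way for many organisms could then be compared or clustered in two dimensions, which is the kind of landscape the abstract describes.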

11:55am-12:00pm Concluding Remarks
CINF: Chemistry Data for the People: From Policy to Practice 1:30pm - 4:30pm
Monday, August 22
Room 112A - Pennsylvania Convention Center
Evan Bolton, Ian Bruno, Darla Henderson, Leah McEwen, Organizing
Evan Bolton, Presiding
Cosponsored by MPPG
1:30pm-1:35pm Introductory Remarks
1:35pm-1:55pm CINF 57: Community forum for chemistry data and information
Ian Bruno1, bruno@ccdc.cam.ac.uk, Leah McEwen2, Stuart Chalk3

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 Clark Library, Cornell University, Ithaca, New York, United States; 3 Department of Chemistry, University of North Florida, Jacksonville, Florida, United States
DIG Chemistry is a global conversation emerging in the chemical information professional community to map the challenges and opportunities for chemistry data across the enterprise. Quoting from the group presence in the open Research Data Alliance: “There is a wealth of chemical data in various heterogeneous formats, distributed across a myriad of systems with endless potential for reuse in chemistry research and many related domains. However, many social, technical and administrative factors have limited the opportunities for open sharing and interoperable exchange. The high reuse value of chemical information has sparked decades of innovative technologies addressing various challenges in handling chemical specific data, but very few approaches have persisted, are extensible beyond specific data types and/or are operable at scale. There is demonstrable need for coordinated development of updated and scaled infrastructures, hard and soft, for enabling chemical data exchange and connecting data providers with data users across sources and applications.” This discussion session will provide the opportunity for those most active in the field to identify the highest priorities and target productive community collaborations.

1:55pm-4:10pm CINF 58: Chemistry data pain points: distilled, analyzed, and next steps
Evan Bolton2, evan.e.bolton@gmail.com, Leah McEwen3, Ian Bruno1

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 National Center for Biotechnology Information, Bethesda, Maryland, United States; 3 Clark Library, Cornell University, Ithaca, New York, United States
At the last ACS meeting in San Diego, the Division of Chemical Information (CINF) held its first ever Data Summit over the course of five days. As a follow-up, this panel discussion will provide an overview of the challenges surrounding chemistry data representation. From nearly one hundred 'pain points' expressed by chemical information experts, several crucial themes emerged for managing and working with chemistry information: data access, data quality, chemical structure representation, data description and metadata, curation and management tools, and audience and community engagement. The panel will summarize these points, engage the audience in active discussion to further vet the issues, and surface potential solutions and approaches to improve the state of the art. Outcomes will be published in a form that helps to further programs and publications to reach broader audiences interested in chemistry data.

4:10pm-4:30pm Concluding Remarks
CINF: Using New Media to Communicate Chemistry to the Public 1:30pm - 4:15pm
Monday, August 22
Room 112B - Pennsylvania Convention Center
Susan Morrissey, Lauren Wolf, Organizing
Matt Davenport, Lauren Wolf, Presiding
Cosponsored by MPPG and PRES
1:30pm-1:40pm Introductory Remarks
1:40pm-2:00pm CINF 59: Communicating chemistry on YouTube
Adam Dylewski, a_dylewski@acs.org

American Chemical Society, Washington, District of Columbia, United States
Bill Nye has said that “if you want to teach something, you have to entertain people.” This entertaining, educational approach is at the heart of Reactions, a weekly ACS YouTube series highlighting the chemistry of everyday life. Over the course of more than 140 episodes, the series has explained the chemistry of pizza, wet dog smell, tattoos, cookies, bacon, moisturizer and much more. Since its launch in early 2014, Reactions episodes have received more than 20 million views and have been featured on the Today Show, NPR, Washington Post and more than 100 other media outlets. In this session, the series' creator Adam Dylewski

2:00pm-2:20pm CINF 60: Sound of science (and history and culture)
Mariel Carr, MCarr@chemheritage.org

Chemical Heritage Foundation, Philadelphia, Pennsylvania, United States
Distillations podcast explores the human stories behind science and technology, tracing a path through history in order to better understand the present. With thoughtfulness and humor we’ve traveled into the heart of a Silicon Valley asteroid mining company, explored the anti-GMO movement, and examined how new feats in “bloodless medicine” have been propelled in part by Jehovah’s Witnesses. We’ve even explained how DDT is the Britney Spears of chemicals. Distillations uncovers the many ways our daily lives intersect with science, while linking the present to the past and giving us a better grasp of the now.

2:20pm-2:40pm CINF 61: Got something to say? Engaging with social media in the time you have
David Oppenheimer, oppenhe@ufl.edu, Paris Grey

University of Florida, Gainesville, Florida, United States
After creating the blog, Undergrad in the Lab (undergradinthelab.com), to help undergraduates be successful and make meaningful connections with their research, we quickly realized that many of our strategies applied to researchers at all levels. To engage with this wider audience we employ a variety of social media, namely Facebook, Twitter, and Instagram, as @youinthelab. This talk will cover what's working for us and why we chose these platforms.

2:40pm-2:55pm Intermission
2:55pm-3:15pm CINF 62: Compound interest: Communicating chemistry using infographics
Andy Brunning, ndbrning@gmail.com

Compound Interest, Cambridge, United Kingdom
Online engagement is increasingly driven by images and multimedia. This session will look at how Compound Interest has taken advantage of this by using images and design to communicate chemistry concepts. It will also provide suggestions on how other chemists and science communicators can take advantage of graphic design to communicate chemistry ideas and research.

3:15pm-3:35pm CINF 63: Pop culture chemistry
Raychelle Burks, rmburks@gmail.com

St. Edward's University, Austin, Texas, United States
Dodging zombies, killing kings, battling aliens, fact-checking cartoons, and sussing out stunt videos? Sounds silly, but it can be a serious science communication opportunity. In this talk, I'll share my adventures and strategies as a pop culture chemist on TV, in podcasts, and at genre cons.

3:35pm-4:15pm Panel Discussion
CINF: Sci-Mix 8:00pm - 10:00pm
Monday, August 22
Halls D/E - Pennsylvania Convention Center
8:00pm-10:00pm CINF 10: Holistic approach to cheminformatics in a liberal arts environment

Philip Adler, padler1@haverford.edu

Chemistry, Haverford College, Haverford, Pennsylvania, United States

8:00pm-10:00pm CINF 15: Modern cheminformatics tools in the teaching laboratory: A practical exercise simulating a drug discovery project

Chase Smith2, chase.smith@mcphs.edu, Tamsin Mansley1, tamsin.mansley@optibrium.com

1 Optibrium Ltd, Cambridge, Massachusetts, United States; 2 MCPHS University, Worcester, Massachusetts, United States

8:00pm-10:00pm CINF 16: Extracting and exploiting medicinal chemistry ADMET knowledge automatically from public and large pharma data

Alexander Dossetter1, al.dossetter@medchemica.com, Edward Griffen2, Andrew Leach2,3, Shane Montague2

1 MedChemica Limited, Macclesfield, United Kingdom; 2 Medchemica Ltd, Macclesfield, United Kingdom; 3 Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, Liverpool, United Kingdom

8:00pm-10:00pm CINF 30: Virtual nanoparticles

Wenyi Wang1, wwyi6@hotmail.com, Alexander Sedykh3, Linlin Zhao1, Bing Yan2, Hao Zhu3,1

1 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 2 Shandong University, Jinan, China; 3 Department of Chemistry, Rutgers University, Camden, New Jersey, United States

8:00pm-10:00pm CINF 31: Experimental errors in QSAR modeling sets: What we can do and what we cannot do

Linlin Zhao2, zhaolin9142@gmail.com, Wenyi Wang1, Alexander Sedykh4, Hao Zhu3

1 Rutgers University, Camden, New Jersey, United States; 2 Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, United States; 3 Chemistry Department, Rutgers University, Camden, New Jersey, United States; 4 Multicase Inc., Beachwood, Ohio, United States

8:00pm-10:00pm CINF 35: Adverse drug reactions triggered by the common HLA-B*57:01 variant: A molecular docking study

George Van Den Driessche, gavanden@ncsu.edu, Denis Fourches

Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States

8:00pm-10:00pm CINF 36: ChemML: A machine learning and informatics program suite for the chemical and materials sciences

Mojtaba Haghighatlari1, mojtabah@buffalo.edu, Johannes Hachmann1,2

1 Chemical and Biological Engineering, University at Buffalo, Buffalo, New York, United States; 2 New York State Center of Excellence in Materials Informatics, Buffalo, New York, United States

8:00pm-10:00pm CINF 3: GOSTAR and ChEMBL comparison – commercial vs. open chemogenomics databases

Johannes Voigt, johannes.voigt@gilead.com, Uli Schmitz

Gilead Sciences, Foster City, California, United States

8:00pm-10:00pm CINF 4: Exploring available compound data with the open PHACTS discovery platform and KNIME

Daniela Digles, daniela.digles@univie.ac.at, Gerhard Ecker

University of Vienna, Vienna, Austria

8:00pm-10:00pm CINF 51: Formal ontologies and software tools to facilitate integration, classification and modeling of drug discovery data

Stephan Schürer1,2, stephan.schurer@gmail.com, Asiyah Yu Lin1, Hande McGinty1, Qiong Cheng1, Amar Koleti1, Nooshin Zadeh1, Dusica Vidovic1

1 Center for Computational Science, University of Miami, Miami, Florida, United States; 2 Department of Pharmacology, University of Miami, Miami, Florida, United States

8:00pm-10:00pm CINF 52: KEA2: Multiple views of the human kinome

Nicolas Fernandez2, nicolas.fernandez@mssm.edu, Andrew Rouillard2, Klarisa Rikova1, Peter Hornbeck1, Avi Ma'ayan2

1 Cell Signaling Technology, Danvers, Massachusetts, United States; 2 Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai, New York, New York, United States

8:00pm-10:00pm CINF 55: KinomeNet: accurate prediction of protein kinase inhibitors with deep convolutional neural networks

Olexandr Isayev, olexandr@olexandrisayev.com, Alexander Tropsha

UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States

8:00pm-10:00pm CINF 85: Active machine learning perspective on hit identification and optimization

Daniel Reker, danielreker@googlemail.com, Gisbert Schneider

Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

8:00pm-10:00pm CINF 86: Binding affinity prediction using frequency of protein-ligand interactions: method validation and application to bromodomain inhibitors

Jamel Meslamani, j.meslamani@gmail.com, Adam S Vincek, Elena Russinova, Alexander N. Plotnikov, Roberto Sanchez, Ming-Ming Zhou

Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States

8:00pm-10:00pm CINF 8: Introducing SIVVU, a web-based program for modeling spectrophotometric titration data

Douglas Vander Griend, dav4@calvin.edu

Chemistry & Biochemistry, Calvin College, Grand Rapids, Michigan, United States

8:00pm-10:00pm CINF 92: VSViewer3D: An open source tool for interactive data mining of 3D virtual screening data

David Diller1, djrdiller@gmail.com, Kyle Diller2

1 Computational Chemistry, CMDBioscience, East Windsor, New Jersey, United States; 2 Rochester Institute of Technology, Rochester, New York, United States

8:00pm-10:00pm CINF 93: Strategies to improve PubChem data quality and search effectiveness through data analysis

Leonid Zaslavsky, zaslavsk@ncbi.nlm.nih.gov, Gang Fu, Asta Gindulyte, Paul Thiessen, Sunghwan Kim, Evan Bolton

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States

8:00pm-10:00pm CINF 94: Sketchy sketches: Hiding chemistry in plain sight

Daniel Lowe, daniel@nextmovesoftware.com, John May, Roger Sayle

NextMove Software, Cambridge, United Kingdom

8:00pm-10:00pm CINF 95: Hybrid search engine for chemical information in PubChem

Jie Chen1, chenj@ncbi.nlm.nih.gov, Siqian He1, Asta Gindulyte2, Evan Bolton3, Steve Bryant1

1 NCBI, NLM, NIH, Bethesda, Maryland, United States; 2 NCBI/CBB, NIH, Bethesda, Maryland, United States; 3 National Center for Biotechnology Information, Bethesda, Maryland, United States

8:00pm-10:00pm CINF 9: Integration of cheminformatics material into the STEMWiki hyperlibrary

Robert Belford3, rebelford@ualr.edu, Delmar Larsen2, Andrew Cornell1

1 Department of Chemistry, University of Arkansas at Little Rock, Little Rock, Arkansas, United States; 2 Department of Chemistry, Univ California Davis, Davis, California, United States; 3 Department of Chemistry, Univ of Arkansas at Little Rock, Little Rock, Arkansas, United States

CINF: Herman Skolnik Award Symposium 8:45am - 12:00pm
Tuesday, August 23
Room 112A/B - Pennsylvania Convention Center
Elsa Alvaro, Evan Bolton, Leah McEwen, Organizing
Evan Bolton, Presiding
8:45am-8:50am Introductory Remarks
8:50am-9:15am CINF 64: Developing databases and standards in chemistry
Stephen Heller, steve@hellers.com

Retired, Silver Spring, Maryland, United States
This presentation will describe how the author has been involved in developing databases and standards in chemistry over the past 45 years, using the NIH/EPA/NIST mass spectrometry database, the NIH/EPA Chemical Information System, the IUPAC InChI chemical structure standard, and others as examples.

9:15am-9:40am CINF 65: Two decades of open chemical data at DTP/NCI
Daniel Zaharevitz, ZaharevD@mail.nih.gov

Information Technology Branch, Developmental Therapeutics Program, National Cancer Institute, Bethesda, Maryland, United States
The National Cancer Institute has been accepting compounds for testing in anti-cancer assays since 1955, a service run by the Developmental Therapeutics Program (DTP). The systems for keeping track of the structural information follow the history of structural information representation, from ink drawings on 3 X 5 cards to modern computers. The development of the internet and World Wide Web opened the possibility of sharing this information with the research community. In the mid-1990s, the first downloadable structure files were made available, followed by the first web pages, including structure search, NCI-60 growth inhibition data, and COMPARE calculations. These tools are useful, and a proof of concept that data can be made available with only modest resources, but there was a need for a more comprehensive way to turn 'available data' into 'useful data'. A major step forward in this regard was the creation of PubChem as part of the Molecular Library Initiative. DTP was a major contributor to the early development of PubChem. When PubChem first went live, about a third of the chemical structures and all of the assay data were from DTP. The impressive growth of PubChem in the years that followed has established open chemical data as an important part of chemical research. Over two decades of providing open chemical data has given DTP perspective on how the field has developed and where it might go. This talk will discuss the positive and negative experiences related to open chemical data and the challenges that need to be addressed in order to make open chemical data not just a useful part of chemical research, but an integral and necessary component of such research.

9:40am-10:05am CINF 66: Using InChI to manage data
Peter Linstrom, peter.linstrom@nist.gov

NIST, Gaithersburg, Maryland, United States
The IUPAC International Chemical Identifier (InChI) is a molecular identifier based on the structure of the molecular species. It and its standardized hash (InChIKey) have several properties which make them useful for database management and generating links to databases which contain data relevant to the chemical species. This talk will outline some of these advantages along with some of the challenges in the use of these identifiers.

The NIST Chemistry WebBook is a collection of data for molecular species from various sources. InChI has been shown to be useful for reliably merging such data. Its modular nature allows straightforward identification of geometric and stereo isomers along with isotopologues. This is particularly useful in the case of the Chemistry WebBook, which contains legacy data collections that often do not specify stereogenic bonds.

InChI and InChIKey can also be used to link across data collections from different providers. Some of the features of the InChI string make it difficult (but certainly not impossible) to construct pre-defined web based queries or links. InChIKey, however, was designed to be compatible with Internet search engines and can be readily used for this purpose. While not as modular as InChI, InChIKey does store the hash of the molecule’s connectivity in its first component. This allows identification of similar species and the possible construction of links to “near misses.”
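
The "near miss" idea described above can be sketched in a few lines: records whose InChIKeys share the same first block have the same connectivity hash and are candidates for linking even when stereochemistry or isotopic labeling differs. The function names are hypothetical, and the keys in the usage example should be treated as illustrative strings rather than verified identifiers.

```python
from collections import defaultdict


def split_inchikey(key):
    """Split a standard-length InChIKey into its three hyphen-separated blocks.

    The first 14-character block is the hash of the connectivity layer.
    """
    parts = key.strip().upper().split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return parts[0], parts[1], parts[2]


def group_by_connectivity(keys):
    """Group InChIKeys that share the same connectivity block."""
    groups = defaultdict(list)
    for key in keys:
        connectivity, _, _ = split_inchikey(key)
        groups[connectivity].append(key)
    return dict(groups)


if __name__ == "__main__":
    example_keys = [
        "QNAYBMKLOCPYGJ-REOHTMHVSA-N",  # illustrative: one stereoisomer
        "QNAYBMKLOCPYGJ-UWTATZPHSA-N",  # illustrative: its mirror image
        "LFQSCWFLJHTTHZ-UHFFFAOYSA-N",  # illustrative: an unrelated compound
    ]
    for block, members in group_by_connectivity(example_keys).items():
        print(block, members)
```

Because the key is a hash, a shared first block indicates matching connectivity with high probability rather than with certainty, so links built this way are best labeled as related records, not exact matches.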

PubChem has become a very useful tool for accessing a wealth of information about chemical species along with links to additional resources for the species. InChI and InChIKey make it easy to link to this resource.
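
As a small illustration of linking to PubChem by InChIKey, the sketch below builds a lookup URL for PubChem's PUG REST service, which accepts an InChIKey as a compound identifier. It only constructs the URL and makes no network request. The helper name is hypothetical, and the exact path layout should be checked against the current PUG REST documentation before being relied on.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_cids_url(inchikey):
    """Build a URL asking PubChem for the compound IDs matching an InChIKey."""
    key = quote(inchikey.strip().upper())
    return f"{PUG_REST}/compound/inchikey/{key}/cids/JSON"


if __name__ == "__main__":
    # Illustrative key; any well-formed InChIKey can be substituted.
    print(pubchem_cids_url("LFQSCWFLJHTTHZ-UHFFFAOYSA-N"))
```

Since an InChIKey contains only uppercase letters and hyphens, it needs no special escaping, which is part of why it works well in predefined web links.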

10:05am-10:30am CINF 67: Open chemistry resources provided by the NCI CADD group
Marc Nicklaus, mn1@helix.nih.gov

Nci Frederick Bldg 376 RM 207, Natl Inst Health Ft Detrick, Frederick, Maryland, United States
We will touch on the nearly two decades of web-based, freely accessible, small-molecule related resources the National Cancer Institute (NCI) Computer-Aided Drug Design (CADD) Group has made available to the scientific public in the fields of CADD and chemoinformatics. These resources build on an even longer history of chemoinformatics work at the NCI over the past nearly 60 years. The NCI CADD Group Chemoinformatics Tools and User Services comprise services such as the Enhanced NCI Database Browser, the Optical Structure Recognition Application, and the Chemical Identifier Resolver. We will present our efforts in the context of, and how they intersect with, the history, current status, and future of other large chemoinformatics projects at NIH such as PubChem.

10:30am-10:45am Intermission
10:45am-11:10am CINF 68: Evolution of open chemical information
Valery Tkachenko, tkachenkov@rsc.org

Royal Society of Chemistry, Rockville, Maryland, United States
Open chemical information is at an exciting juncture. Scientists are beginning to understand their role in providing this content. Publishers are beginning to improve their capture of these data streams. Archives exist for scientists to put their information. There are still challenges. This talk will provide a brief overview of open chemical information and the role the Royal Society of Chemistry is playing to foster it. The impact of PubChem and other resources, like ChemSpider, will be considered.

11:10am-11:35am CINF 69: Open chemical information at the European Bioinformatics Institute
Christoph Steinbeck, steinbeck@ebi.ac.uk

Cheminformatics and Metabolism, European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, Cambridge, United Kingdom
The European Bioinformatics Institute has contributed to developing the open chemical information space over more than ten years. It started with the ChEBI ontology and database and was later extended with the ChEMBL database, UniChem and MetaboLights. This talk will highlight some of the primary contributions EBI has made to open chemical information and how they intersect with the PubChem project.

11:35am-12:00pm CINF 70: History and the future of tools and software components for working with public chemistry data
Wolf-Dietrich Ihlenfeldt, wdi@xemistry.com

Xemistry GmbH, Konigstein, Germany
Over the last 15 years, I have been involved, in an active developer or component supplier role, in the development of several important Internet-based chemistry information portals. My company has provided software both for the user interfaces of these sites and for their behind-the-scenes operation. For consumers of Internet-based chemistry information interested in more than looking at Web pages, we have developed tools for accessing, matching and mixing data from such sources. It has been an interesting journey. I will recount experiences related to several past, current and future projects, their specific challenges, which were and are often linked to the state of the art of general Internet and computer technology at their time, and our approaches in addressing these.

CINF: Herman Skolnik Award Symposium 2:00pm - 5:05pm
Tuesday, August 23
Room 112A/B - Pennsylvania Convention Center
Elsa Alvaro, Evan Bolton, Leah McEwen, Organizing
Evan Bolton, Presiding
2:00pm-2:05pm Introductory Remarks
2:05pm-2:30pm CINF 71: PubChem a resource for cognitive computing
Stephen Boyer, sboyer@us.ibm.com

IBM Almaden Research Center, IBM, San Jose, California, United States
Without order, a collection is just a heap of stuff. For centuries, mankind has been working on bringing order to collections in the earliest libraries and archives. In the same way, without order, databases are just random bits and bytes. This talk will take a look at how we continuously seek to improve order, in databases and in searching. We are continuously challenged by factors that disturb the order, such as the growing amount of data, globalization, and obfuscation. Integration of big data demands new search technologies. We’ll take a look at promising developments in finding 'dark data' with examples from different technical fields. We'll discuss the role of computer curation to enhance the ability to organize, to search content, and ultimately to lead to predictive analysis. These developments will be viewed in the context of the value of scientific data provided to the scientific community through the efforts of NIH and PubChem.

2:30pm-2:55pm CINF 72: SPL and openFDA resources of open substance data
Yulia Borodina, yulia.borodina@fda.hhs.gov

FDA, Silver Spring, Maryland, United States
Structured Product Labeling (SPL) is an open document markup standard approved by Health Level Seven (HL7) and adopted by FDA as a mechanism for exchanging product and facility information. The SPL standard has also been used by FDA for indexing data on chemical and biological substances used as ingredients in medicinal products. Data available in the form of SPL index files and their integration with openFDA and PubChem will be discussed.

2:55pm-3:20pm CINF 73: Building a network of interoperable and independently produced linked and open biomedical data
Michel Dumontier, michel.dumontier@gmail.com

Medicine, Stanford University, Stanford, California, United States
Over 15 years ago, Sir Tim Berners-Lee proclaimed the founding of an exciting new future involving intelligent agents operating over smarter data in order to perform complex tasks at the behest of their human controllers. At the heart of this vision lies an uneasy alliance between tedious formal knowledge representations and powerful analytics over big, but often messy data. Bio2RDF, our decade-old open source project to create Linked Data for the life sciences, has woven together emergent Semantic Web technologies such as ontologies and Linked Data to generate FAIR - Findable, Accessible, Interoperable, and Reusable - data in the form of billions of machine accessible statements for use in downstream biomedical discovery.
This revolution in data publication has been strengthened by action from global bioinformatics institutions such as the NCBI, NCBO, EBI, and DBCLS. Notably, NCBI's PubChem has successfully coupled large scale data integration with community-based standards to offer a remarkable biochemical knowledge resource amenable to data hungry discovery tools. Yet, in the face of increasing pressure from researchers, funders, and publishers, will these approaches be sufficient for growing and maintaining a comprehensive knowledge graph that is inclusive of all biomedical research?

3:20pm-3:35pm Intermission
3:35pm-4:00pm CINF 74: Chemical structure representation in PubChem
Roger Sayle, roger@nextmovesoftware.com

NextMove Software, Cambridge, United Kingdom
For all of the grief that I give Evan, often over corner cases of chemical semantics that only one or two people care about, it is fair to say that PubChem represents the current state-of-the-art in chemical structure representation. Nobody does it better. Under the surface, unseen to most users, are a large number of technical and scientific innovations that have enabled PubChem to scale over the past decade and a half to now contain approaching 100 million compounds. From simple design decisions such as the substance vs. compound distinction [that allows PubChem to avoid the early mistakes of CAS] to breakthroughs such as canonical Kekule SMILES [to avoid the early mistakes of Daylight Chemical Information Systems], the architecture of PubChem contains a treasure trove of cheminformatics innovations, covering normalization, tautomers, mixtures, 2D fingerprints and similarity, substructure search, biopolymers, text mining and much more. During this presentation I hope to share some of the cool insights that the remarkable staff at the NCBI often forget to mention or are too modest to point out.
Congratulations Evan and Steve.

4:00pm-4:25pm CINF 75: iRAMP & PubChem: Of the people, for the people
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States

Chemistry and the need for chemical information are ubiquitous to the success of every wet lab. Infrastructure involved in supporting the ecosystem of chemistry data and information, from data quality to description, is impressive in scale and functionality to keep everything running smoothly under the hood. Narrowly focussed innovation is comparatively easy; broad-reaching, publicly accessible, semantically enabled infrastructure is a miracle. PubChem represents that miracle in chemical information daily and further enables other laboratory and research infrastructures such as chemical safety management to close the information gap. Setting the stage for these impact-driven collaborations taps chemistry librarian expertise to connect infrastructures, user communities, and technology teams. This talk will discuss the multiplying effect of the PubChem infrastructure in Recognizing, Assessing, Managing and Preparing disparate nodes of information and stakeholders for real world chemical information problems.

4:25pm-4:50pm CINF 76: Open chemical information: Where now and how?
Evan Bolton, bolton@ncbi.nlm.nih.gov

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States
Open chemical information, once a dream, is now commonplace and necessary. A primary achievement involved bridging chemical resources on the internet, whether they are open or closed. The ability to find information about chemicals was greatly enhanced. However, aggregation of chemical information has revealed a number of challenges. Some of these include data quality/corruption, data representation, and the need for improved software algorithm and knowledge representation harmonization. This talk will highlight where improvements can be made in the open chemical information space and how, as a community, we might be able to make a real impact in improving the state of the art in chemistry knowledge representation.

4:50pm-4:55pm Concluding Remarks
4:55pm-5:05pm Award Presentation
CINF*: Using Public Information to Support a Chemical Safety Culture 8:25am - 12:05pm
Wednesday, August 24
Room 112A - Pennsylvania Convention Center
Evan Bolton, Leah McEwen, Ralph Stuart, Organizing
Evan Bolton, Leah McEwen, Ralph Stuart, Presiding
Cosponsored by CHAS
8:25am-8:30am Introductory Remarks
8:30am-8:45am CINF 77: Users roundtable: Laboratory use cases for chemical safety information
Ralph Stuart2, secretary@dchas.org, Leah McEwen1, Evan Bolton3

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States; 3 National Center for Biotechnology Information, Bethesda, Maryland, United States
In response to recommendations from a variety of federal organizations, the ACS Committee on Chemical Safety has been developing resources to support risk assessment of chemical processes in the small-scale laboratory. Risk assessment is a scientific information-centric approach to identifying, prioritizing and managing laboratory hazards that goes beyond the traditional 'fume hood, eye protection and gloves' approach to laboratory safety. Developing information tools that effectively support this process requires clear definitions of use case scenarios for this information.

This roundtable discussion will kick off with summary perspectives from a variety of stakeholders to identify characteristics of key use cases, including research, teaching and service laboratories. Researchers and lab supervisors, Environmental Health and Safety staff, chemistry librarians and information professionals will present their requirements for data sources, assessment of information quality and user interfaces. Cross-sector discussion of these aspects of chemical safety information will help inform development of tools made available both in public resources and institution-specific contexts.

8:45am-9:10am CINF 78: Risk assessment and crisis management in the research laboratory using online resources: An EH&S perspective
Shailendra Singh2, Neelam Bharti1, neelambh@ufl.edu

1 Marston Science Library, University of Florida, Gainesville, Florida, United States; 2 EH&S, University of Delaware, Newark, Delaware, United States
In the last few years, accidents in academic research laboratories have raised international concern about safe research practices and the safety of lab researchers in academic institutions. When planning research projects or experiments, risk assessment and crisis management are among the most crucial issues that a researcher needs to focus on before conducting any experiment. Apart from the hazards of the chemical/instrument being used, the risks should be outlined by the exposure and potential damage resulting from those hazards.
Risk assessment and emergency plans help lab researchers to understand the risks involved in their activities and to prepare for unwanted situations. Crisis communication and management is the process delivered at times of high trauma during or after an accident. In this presentation, we will discuss risk assessment in a research lab and recognition of potential hazards using online information available through different resources, and the formulation of emergency plans along with crisis management, which could be life-saving in case of an emergency.

9:10am-9:35am CINF 79: Institutional use of chemical safety data streams
Chris Jakober, jakecattleco@yahoo.com

Davis E&HS, University of California, Woodland, California, United States
Institutional use of chemical safety data streams.

9:35am-10:00am CINF 80: Chemical safety and hazard information in PubChem
Jian Zhang3, jiazhang@ncbi.nlm.nih.gov, Paul Thiessen3, Asta Gindulyte3, Leah McEwen1, Ralph Stuart2, Evan Bolton3, Steve Bryant3

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States; 3 NLM/NCBI, National Institutes of Health, Bethesda, Maryland, United States
PubChem, a public chemical information database, has been serving the scientific community for more than a decade. PubChem provides chemical information in many aspects. In addition to serving the generic chemical information such as chemical structure, chemical and physical property data, PubChem also provides drug information, chemical safety and hazard information, patent information, and more. Chemical safety is a very important topic in chemical industry, chemistry labs, and in our daily scientific lives. PubChem has integrated safety and hazard information from various public domains such as ILO International Chemical Safety Cards, NIOSH Pocket Guide to Chemical Hazards, OSHA Occupational Chemical Database, HSDB, Cameo Chemicals, and more. In the chemical safety and hazard section, PubChem has added the GHS (the Globally Harmonized System of Classification and Labeling of Chemicals) classification [1]. In this presentation, we will discuss the PubChem safety information collection, data integration, and data access.

10:00am-10:15am Intermission
10:15am-10:40am CINF 81: Semantic annotation of the laboratory chemical safety summary in PubChem
Gang Fu2, gangfu1982@gmail.com, Jian Zhang2, Evan Bolton2, Jeremy Frey4, Stuart Chalk3, Mark Borkum5, Leah McEwen1

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 NLM/NCBI, National Institutes of Health, Bethesda, Maryland, United States; 3 Department of Chemistry, University of North Florida, Jacksonville, Florida, United States; 4 University of Southampton, Southampton, United Kingdom; 5 Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, Washington, United States

Semantic Web standards and technologies have been emerging as an increasingly important approach to distribute and integrate scientific data. Resource Description Framework (RDF) constitutes a family of World Wide Web Consortium specifications for data exchange, and it is a core part of Semantic Web standards. PubChemRDF integrates the knowledge base across PubChem databases and other biological and biomedical databases at the National Center for Biotechnology Information. The schemaless RDF data content can be queried and analyzed using readily available Semantic Web technologies (namely, SPARQL query language and logic-based inference). PubChem has crowdsourced chemical health and safety information from multiple national regulatory agencies that aligned their hazard regulation standards with the Globally Harmonized System (GHS) of Classification and Labeling of Chemicals established by the United Nations. The available GHS safety and hazard information, in addition to the chemical and physical data from the regulatory agencies in Europe, Australia, Japan, and the United States, has been integrated in the Laboratory Chemical Safety Summary (LCSS) for PubChem compounds. In the present work, we want to demonstrate the semantic annotations of the chemical health and safety information available in LCSS, and how to facilitate data exchange and information retrieval using Semantic Web standards and technologies.

10:40am-11:05am CINF 82: GHS and NFPA diamonds: Where they come from and how they can be useful
Roger Sayle, roger@nextmovesoftware.com

NextMove Software, Cambridge, United Kingdom
When handling chemicals in a laboratory, the pictograms and labels found on containers are an immediate and obvious indication of the potential risks associated with their contents. Indeed, it is not uncommon to be misled into believing that access to reactant MSDS or SDS data sheets is the single solution required to eliminate all accidents from laboratories. However, few people pause to think where do such data sheets and labels come from, if not handed down on stone tablets from Sigma Aldrich and other chemical vendors. In theory, hazard classification is actually based on a rigorously defined set of physical experiments that are legislated by the United Nations and therefore consistent between vendors. Unfortunately, not only are these tests performed by relatively few organizations (i.e. rarely in academia), but many vendors often skip expensive testing by simply erring on the side of caution; labels and data sheets, after all, are legal not factual documents. Fortunately, the increasing availability of both public experimental data databases and of predictive models built on them enable the estimation of hazard classifications for the many millions of compounds that don't have an SDS, such as any of the novel reaction products made in academia.

11:05am-11:30am CINF 83: Critical cases for information identifiers in chemical asset management
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States
There are several information transfer points critical to managing chemical assets in research. Several questions arise concerning the importance of tracking information provenance as well as chemical description relevant to practical use and experimental planning. What information is needed for communication between stakeholders and systems? Is a digital object identifier type approach applicable to tracking vendor SDSs as documents of record? What definitions are necessary to develop a 'chemistry' identifier that resolves to chemical components, 2D structure, mixture composition, and states and forms designations commonly encountered in chemical sample inventories? What systematic description is needed for richer, experimental chemical property and reactivity information in the systems linked behind these identifiers? What information elements are required for a QR code to facilitate communication and information transfer? This discussion will consider these questions from both scientific and community practice perspectives, and how various existing standards and projects under the auspices of IUPAC and other standards organizations can be used to address these needs.

11:30am-11:45am CINF 84: Surveying the academic laboratory population: Project updates from the iRAMP collaboration
Leah McEwen1, Ralph Stuart2, secretary@dchas.org

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States
iRAMP updates

11:45am-12:05pm Concluding Remarks
CINF: General Papers 1:30pm - 4:50pm
Wednesday, August 24
Room 112A - Pennsylvania Convention Center
Elsa Alvaro, Organizing
Elsa Alvaro, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 85: Active machine learning perspective on hit identification and optimization

Daniel Reker, danielreker@googlemail.com, Gisbert Schneider

Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
Feedback-driven hypothesis refinement is a central pillar of medicinal chemistry. Active machine learning has been successfully transferred from computer science to drug discovery research, putting artificial intelligence in charge of compound selection and allowing for fully-automated design-and-test processes. As an initial test, we used random forest prediction technology to retrospectively reproduce explorative and exploitive behaviours using various selection functions. Then, we pursued different active machine learning strategies in a virtual screening framework against the cancer- and HIV-relevant GPCR CXCR4. The compound selection strategy determines the outcome of each iteration in terms of model architecture change and improvement of the predictive performance: exploitive strategies retrieved active compounds that did not necessarily provide strong feedback while the explorative selection found inactive compounds that had a large impact on the model architecture. We prospectively validated a balanced approach that identifies informative active compounds to focus model-improvement on active regions of chemical space while providing desirable hits.
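
As a rough illustration of the difference between exploitive, explorative, and balanced selection, the sketch below scores candidate compounds from the spread of per-tree predictions of an ensemble model. The abstract does not specify its selection functions, so the three scoring rules, the compound names, and the prediction values here are all assumptions chosen for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical per-tree activity predictions for three candidate compounds
# (values are made up for illustration; they are not from the talk).
tree_predictions = {
    "A": [0.90, 0.88, 0.92],  # confidently predicted active
    "B": [0.20, 0.90, 0.40],  # trees disagree strongly
    "C": [0.80, 0.60, 1.00],  # likely active, moderately uncertain
}

def exploit(preds):
    """Exploitive score: predicted activity only (ensemble mean)."""
    return mean(preds)

def explore(preds):
    """Explorative score: model disagreement only (ensemble spread)."""
    return pstdev(preds)

def balanced(preds, weight=1.0):
    """Balanced score (assumed form): activity plus weighted uncertainty."""
    return mean(preds) + weight * pstdev(preds)

def select(candidates, score):
    """Return the name of the candidate with the highest score."""
    return max(candidates, key=lambda name: score(candidates[name]))

picks = {
    "exploit": select(tree_predictions, exploit),
    "explore": select(tree_predictions, explore),
    "balanced": select(tree_predictions, balanced),
}
print(picks)
```

With these toy numbers, each strategy picks a different compound, which mirrors the trade-off described above: the exploitive pick adds little new information to the model, while the explorative pick is informative but not necessarily active.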

2:00pm-2:25pm CINF 86: Binding affinity prediction using frequency of protein-ligand interactions: method validation and application to bromodomain inhibitors

Jamel Meslamani, j.meslamani@gmail.com, Adam S Vincek, Elena Russinova, Alexander N. Plotnikov, Roberto Sanchez, Ming-Ming Zhou

Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States
Computational technologies are fundamental components in early drug discovery projects and their application is more notable in the hit identification phase. However, scoring is ultimately a challenge in the lead optimization phase. Structure-based drug design is a crucial step in this phase, where medicinal chemists can make hundreds and even thousands of compounds to generate leads with desirable, drug-like physicochemical properties. Computational approaches are used hand in hand with medicinal chemistry to rank small molecules before they are synthesized. Recently, many advances were made in predicting the free energy of binding of small molecules to proteins. But estimating this affinity using physics-based methods is a tedious, slow process that usually requires manual intervention by experts. Thus cheminformatics approaches are valuable, fast tools to address this issue. Protein-ligand interactions are a simplified cheminformatics representation of the enthalpic effect between small molecules and proteins. In this study, we introduce the use of frequency of protein-ligand interactions as descriptors in QSAR SVM models to predict ligand-binding affinity. SAR datasets of eight different targets were used to validate the methods. The overall cross validation performance was comparable to the Free Energy Perturbation (FEP) protocol, with average Rp = 0.54 (vs Rp(FEP) = 0.56) and RMSE = 0.65 (vs RMSE(FEP) = 0.86). Predictive models were generated in a concrete structure-based design project involving novel bromodomain inhibitors. Binding affinities for multiple ligands were predicted and validated for the first and second domain of BRD4 protein, with an average accuracy of 70% in a ligand-ranking scenario for chemical synthesis. Our method is useful for ligand optimization and selectivity assessment with fast predictions, which can lead to fewer molecules requiring synthesis for binding affinity validation.

2:25pm-2:50pm CINF 87: MOARF, an integrated workflow for multiobjective optimization: Implementation, synthesis, and biological evaluation
Nathan Brown, nathan.brown@icr.ac.uk

The Institute of Cancer Research, Sutton, United Kingdom
Multiobjective molecular design methods have come of age and are now being considered actively in many drug design programs. However, the challenge of optimizing ligands in silico against multiple targets given extant data remains and concerted effort is being undertaken to address these challenges. The effective multiobjective de novo design is enabled by effective search of the feasible chemistry space, synthetically-appropriate designs that can be realized, and appropriate and predictive models for important biological and physicochemical endpoints.

Here, we describe the development and application of an integrated, multiobjective optimization workflow (MOARF: Multiobjective Automated Replacement of Fragments) for controlled and effective medicinal chemistry design. This new workflow couples a rule-based molecular fragmentation scheme (SynDiR: Synthetic Disconnection Rules) with a pharmacophore fingerprint-based fragment replacement algorithm (RATS: Rapid Alignment of Topological Structures) to broaden the scope of reconnection options considered in the generation of potential solution structures. Solutions are ranked by a multiobjective scoring algorithm comprising ligand-based (shape similarity) biochemical activity predictions as well as physicochemical property calculations. Application of this iterative workflow to the optimization of the CDK2 inhibitor Seliciclib (CYC202, R-roscovitine) generated solution molecules in the desired physicochemical property space. Synthesis and experimental evaluation of optimal solution molecules demonstrate CDK2 biochemical activity and improved human metabolic stability. MOARF has been experimentally demonstrated to effectively explore and exploit relevant medicinal chemistry space and successfully satisfy pressing molecular design challenges.

2:50pm-3:15pm CINF 88: Systematic generation of analog relationships of bioactive compounds and promiscuity analysis
Dagmar Stumpfe1, stumpfe@bit.uni-bonn.de, Dilyana Dimova2, Jürgen Bajorath3

1 B-it, University of Bonn, Bonn, Germany; 2 Department of Life Science Informatics, University of Bonn, Bonn, Germany; 3 Life Science Informatics, University of Bonn, B-IT, Bonn, Germany
The generation of analogs of active compounds dominates hit-to-lead and lead optimization projects in medicinal chemistry. Most computational approaches applied in the course of chemical optimization attempt to aid in the design of better analogs and/or the exploration of SAR information associated with compound series.
A computational framework is introduced to systematically detect all synthetically accessible analogs of bioactive compounds in databases and determine how their chemical exploration might influence compound promiscuity (i.e., the ability of a compound to interact with multiple targets). For more than a third of all active compounds across 90% of activity classes, no analogs were detected. For the majority of compounds with analogs, chemical exploration had no detectable influence on promiscuity. However, for a subset of ∼26% of active compounds with analog sets, notable increases in promiscuity were observed, which were mostly due to the presence of single analogs with high degrees of promiscuity.

3:15pm-3:30pm Intermission
3:30pm-3:55pm CINF 89: SAR characteristics of matching molecular series and exploration of structural relationships
Dilyana Dimova1, dimova@bit.uni-bonn.de, Jürgen Bajorath2

1 Department of Life Science Informatics, University of Bonn, Bonn, Germany; 2 Life Science Informatics, University of Bonn, B-IT, Bonn, Germany
The concept of matched molecular pairs (MMPs) has experienced increasing interest in medicinal chemistry. An MMP is defined as a pair of compounds that only differ by a structural change at a single site. MMPs are often used to associate specific structural modifications with changes in molecular properties. The matching molecular series (MMS) was introduced as an extension of the MMP concept and defined as a set of compounds with pairwise MMP relationships. Thus, an MMS represents a series of analogs with modifications at a single site.
We have systematically identified all publicly available MMSs, classified their SAR characteristics, and explored structural relationships between them. The combination of SAR and structural relationship information enabled the identification of structurally related MMSs with similar or distinct SAR characteristics. Such MMSs combine series of analogs with different substitution sites and reveal how structural modifications influence SARs. They can also be used to explore analog pathways that change SAR characteristics and provide additional SAR information.
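
The MMP and MMS definitions above can be sketched in a few lines. This is only an illustration under a strong assumption: the compounds are already fragmented into a shared core and a single-site substituent, whereas real MMP identification requires fragmenting structures with a cheminformatics toolkit. The compound names and SMILES-like strings are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical, pre-fragmented compounds: name -> (core, substituent at one site).
# The strings are illustrative placeholders, not data from the talk.
compounds = {
    "cpd1": ("c1ccccc1[*]", "F"),
    "cpd2": ("c1ccccc1[*]", "Cl"),
    "cpd3": ("c1ccccc1[*]", "OC"),
    "cpd4": ("c1ccncc1[*]", "F"),
}

def matching_series(compounds):
    """Group compounds that share a core; each group of two or more
    compounds differs only at the single substitution site."""
    by_core = defaultdict(list)
    for name, (core, _substituent) in compounds.items():
        by_core[core].append(name)
    return {core: sorted(names) for core, names in by_core.items() if len(names) >= 2}

series = matching_series(compounds)

# Every pair of compounds within a series is a matched molecular pair.
pairs = [pair for names in series.values() for pair in combinations(names, 2)]

print(series)
print(pairs)
```

Here cpd1, cpd2, and cpd3 form one series and yield three pairwise MMPs, while cpd4 has no partner with the same core and belongs to no series.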

3:55pm-4:20pm CINF 90: How frequent are your clusters in hierarchical cluster analysis? Quantifying their frequencies considering ties in proximity
Guillermo Restrepo1,3, guillermorestrepo@gmail.com, Wilmer Leal2,3, Eugenio Llanos2,4, Carlos Suarez2,5, Manuel Patarroyo2,6

1 University of Leipzig, Leipzig, Saxony, Germany; 2 Fundación Instituto de Inmunología de Colombia (FIDIC), Bogota, Colombia; 3 Universidad de Pamplona, Pamplona, Colombia; 4 SCIO Corporacion colombiana del saber cientifico, Bogota, Colombia; 5 Universidad del Rosario, Bogota, Colombia; 6 Universidad Nacional de Colombia, Bogota, Colombia

A known problem of hierarchical cluster analysis (HCA), which is widely used in chemoinformatics, results from ties in proximity, which come into play once equidistances appear in the distance matrix of the objects to classify. It has been shown to be a very likely problem that leads to non-unique classification results (dendrograms). We have shown how big the problem is even if the HCA algorithm is run over a fixed data set, with a fixed grouping methodology and similarity measure. We call attention to the widespread disregard of the problem, where HCA results are taken as unique and conclusions are based upon them. This is, for example, the case in QSAR studies where HCA is used to select descriptors for the models.

We have introduced four methodologies to quantify cluster frequencies considering ties in proximity; two of them treat clusters and dendrograms as sets and the other two as graphs. We use a toy example of well separated clusters and a set of 1,666 molecular descriptors calculated for a group of molecules having hepatotoxic activity. The four methodologies can be used to derive cluster stability measurements on arbitrary sets of dendrograms having the same set of objects.

It was found that ties occurred frequently, some yielding tens of thousands of dendrograms, even for small data sets. Our results highlight the need for evaluating the effect of ties on clustering patterns before classification results can be used accurately.
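
To make the tie problem concrete, the following sketch enumerates every dendrogram that complete-linkage clustering can produce when tied minimum distances are resolved in all possible ways, and then reports how often each cluster appears. It is a brute-force toy, not one of the four methodologies from the talk; the choice of complete linkage, the function names, and the three-point data set are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def complete_linkage(a, b, dist):
    """Complete-linkage distance: largest pairwise distance between clusters."""
    return max(dist[frozenset((x, y))] for x in a for y in b)

def enumerate_dendrograms(clusters, dist):
    """Return the set of all dendrograms reachable by branching on ties.
    A dendrogram is represented as a frozenset of its non-singleton clusters."""
    if len(clusters) == 1:
        return {frozenset()}
    pairs = list(combinations(clusters, 2))
    heights = [complete_linkage(a, b, dist) for a, b in pairs]
    best = min(heights)
    results = set()
    for (a, b), height in zip(pairs, heights):
        if height == best:  # follow every tied merge, not just the first
            merged = a | b
            rest = [c for c in clusters if c != a and c != b]
            for sub in enumerate_dendrograms(rest + [merged], dist):
                results.add(sub | {merged})
    return results

# Toy data: three equally spaced points on a line, so d(A,B) = d(B,C).
dist = {frozenset("AB"): 1, frozenset("BC"): 1, frozenset("AC"): 2}
start = [frozenset("A"), frozenset("B"), frozenset("C")]

dendrograms = enumerate_dendrograms(start, dist)
counts = Counter(cluster for d in dendrograms for cluster in d)
frequency = {cluster: n / len(dendrograms) for cluster, n in counts.items()}

print(len(dendrograms))
for cluster, f in frequency.items():
    print(sorted(cluster), f)
```

Even this three-object example yields two distinct dendrograms: the cluster {A, B} appears in only half of them, as does {B, C}. The enumeration grows combinatorially with the number of ties, so it is only practical for very small inputs.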

4:20pm-4:45pm CINF 91: Line notations for nucleic acids (both natural and therapeutic)
Roger Sayle, roger@nextmovesoftware.com

NextMove Software, Cambridge, United Kingdom
Despite the huge advances made in nucleic acid synthetic chemistry over the last four decades, such as antisense siRNAs, the IUPAC/IUBMB recommendations on nucleic acid symbols and abbreviations haven't been updated since their original publication in 1970. This creates technical challenges for biopharmaceutical companies and efforts such as the Pistoia Alliance's HELM which attempt to encode the unusual backbone and bases found in current nucleic acid drug candidates as line notations. In this talk, we review several of the technical challenges in representing/encoding modern nucleic acid molecules, and present possible solutions to several of them. Case studies will include the FDA approved thiophosphate-linked antisense therapies fomivirsen and mipomersen. Hopefully, these proposals will help address the 'RNA informatics' gap between biological (activity) databases and the continually expanding chemical space of tractable nucleic acid analogs.

4:45pm-4:50pm Concluding Remarks
CINF: General Papers 8:45am - 11:40am
Thursday, August 25
Room 112A - Pennsylvania Convention Center
Elsa Alvaro, Organizing
Elsa Alvaro, Presiding
8:45am-8:50am Introductory Remarks
8:50am-9:15am CINF 92: VSViewer3D: An open source tool for interactive data mining of 3D virtual screening data

David Diller1, djrdiller@gmail.com, Kyle Diller2

1 Computational Chemistry, CMDBioscience, East Windsor, New Jersey, United States; 2 Rochester Institute of Technology, Rochester, New York, United States
The future of virtual screening is to search through large areas of virtual chemical space. To do this efficiently, one needs to include experimentalists much earlier in the project to better focus on molecules that are more likely to be synthetically accessible. The VSviewer3D is a simple open source Java tool for visual exploration of 3D virtual screening data. The VSviewer3D brings together the ability to explore numerical data, such as calculated properties and virtual screening scores, structure depiction, interactive topological and 3D similarity searching, and 3D visualization. By doing so the user is better able to quickly identify outliers, assess tractability of large numbers of compounds, visualize hits of interest, annotate hits, and mix and match interesting scaffolds. We demonstrate the utility of the VSviewer3D by describing a use case in a docking based virtual screen.

9:15am-9:40am CINF 93: Strategies to improve PubChem data quality and search effectiveness through data analysis

Leonid Zaslavsky, zaslavsk@ncbi.nlm.nih.gov, Gang Fu, Asta Gindulyte, Paul Thiessen, Sunghwan Kim, Evan Bolton

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States
PubChem is an NIH public repository of chemical and biological data providing, along with other NIH public databases, extensive resources for biomedical discovery. However, the tremendous growth in the amount of data, its increasing heterogeneity, and its widening variation in quality demand novel approaches that allow fast retrieval of relevant, non-redundant, and reliable information. Search results should be provided in a meaningful aggregated form, with links to important information in other NIH databases and outside resources.

The problem of biomedical big data management and data mining has been recognized as a major strategic challenge that requires collaborative efforts of biomedical and computer scientists, engineers, programmers and other specialists to innovate and develop new data models and organizational approaches. With the data deluge, we cannot treat all the data equally: the information must be organized hierarchically around the most useful and well annotated data and indexed in a way that allows fast retrieval of the most useful and reliable information. At the same time, the user should be provided with an option to conduct deep refined searches.

In this presentation, we will discuss our recent efforts to improve the reliability of chemical names (also called synonyms) in PubChem. Chemical names in the PubChem Compound database have been compared with those included in Medical Subject Headings (MeSH) as well as those extracted from PubMed abstracts using text-mining programs such as LeadMine and PubTator. This cross-validation between synonyms from multiple sources has resulted in an improved scoring scheme for the synonyms of a given compound, which promotes the most useful and reliable names in our search and retrieval procedures. It also allows us to correct or disallow some PubChem names when inconsistencies between different sources are found.
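
As a rough illustration of the cross-validation idea above, the following sketch scores each (name, compound) pair by the number of sources that agree on it and flags names that different sources attach to different compounds. It is not the PubChem scoring scheme, which the abstract does not specify; the function name, the source labels, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def score_synonyms(sources):
    """Score (name, compound) pairs by how many sources agree on them.

    `sources` maps a source label to a dict of {name: compound_id}.
    Returns (scores, conflicts): `scores` maps (name, compound_id) to the
    number of supporting sources; `conflicts` lists names that different
    sources attach to different compounds.
    """
    support = defaultdict(set)            # (name, compound) -> supporting sources
    compounds_by_name = defaultdict(set)  # name -> compounds it points to
    for source, names in sources.items():
        for name, compound_id in names.items():
            key = name.strip().lower()    # crude normalization (assumption)
            support[(key, compound_id)].add(source)
            compounds_by_name[key].add(compound_id)
    scores = {pair: len(srcs) for pair, srcs in support.items()}
    conflicts = sorted(n for n, c in compounds_by_name.items() if len(c) > 1)
    return scores, conflicts


# Toy data: source labels and identifiers are illustrative only.
example = {
    "depositor": {"Aspirin": 2244, "ASA": 2244, "Tylenol": 2244},
    "mesh": {"aspirin": 2244, "Tylenol": 1983},
    "text_mining": {"aspirin": 2244, "ASA": 2244, "tylenol": 1983},
}
scores, conflicts = score_synonyms(example)
```

In this toy data, the name supported by all three sources gets the highest score, while the name that the sources assign to two different compounds is reported as a conflict and could be corrected or disallowed.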

We plan to use the metadata-based relations between data from multiple sources as well as 2-D and 3-D similarity to provide advisory annotation information and web links. Further analysis of associations between records in multiple databases will help improve chemical annotation quality and more reliably link PubChem compounds to information about biological processes, hazards, genes and diseases.

9:40am-10:05am CINF 94: Sketchy sketches: Hiding chemistry in plain sight

Daniel Lowe, daniel@nextmovesoftware.com, John May, Roger Sayle

NextMove Software, Cambridge, United Kingdom
Chemical sketches are ubiquitous in the published literature. Unlike connection table formats that precisely capture chemistry for database entry, the primary purpose of a sketch format is to produce a high quality image for conveying information to other chemists. Chemical sketches can be presented in a variety of chemistry-specific formats as well as image formats, with the latter presenting additional challenges to interpretation. Since 2001 the United States Patent Office has redrawn all chemical sketches in ChemDraw, yielding to date over 25 million freely available CDX files.
Correctly extracting chemistry from these files required tackling many areas, including disambiguation of ambiguous labels (e.g. B, D, P, V, Ac), interpretation of labels (e.g. COOH), interpretation of free text overlaid on the structure (e.g. brackets for a repeated group), and assignment of reaction roles.
We report our work on extracting chemical structures and reactions from sketches and demonstrate the improvements in quality that tackling the intricacies of sketches provides over more naïve approaches. One notable improvement is the ability to better distinguish between specific compounds, fragments, generic structures and reaction schemes. We compare the chemistry extracted from sketches with the results from text-mining, and show that a large amount of chemistry is only available from one medium or the other. We also explore cases where combining the output from sketches and text enables extraction of data that neither method could extract in isolation, e.g. Markush structures and reactions where the product is given as a sketch.

10:05am-10:20am Intermission
10:20am-10:45am CINF 95: Hybrid search engine for chemical information in PubChem

Jie Chen1, chenj@ncbi.nlm.nih.gov, Siqian He1, Asta Gindulyte2, Evan Bolton3, Steve Bryant1

1 NCBI, NLM, NIH, Bethesda, Maryland, United States; 2 NCBI/CBB, NIH, Bethesda, Maryland, United States; 3 National Center for Biotechnology Information, Bethesda, Maryland, United States
PubChem is a free chemical database and an open archive of the biological activities of millions of substances. PubChem has input data from more than 350 data sources worldwide, with millions of unique compounds, deposited substance records, and bioactivities. Scientific data in PubChem are rich and refined, including calculated properties, deposited annotations, and cross-links between resources. The need to disseminate the scientific knowledge contained in these research data is pressing. However, neither accessing so many data records (numbering in the millions and billions) nor summarizing the countless pieces of information they contain is straightforward. Strategies for effective chemical information searching on the internet have to consider both the efficiency of a search job and the significance of a search result. This includes the infrastructure and databases that resource providers can use to finish a search job in a reasonable period of time, and the search results that are collected and organized to reflect the scientific knowledge and information relevant to the questions and interests of users.
This talk is about the novel PubChem search system, which benefits from the speed of the text search of the Sphinx search engine and the flexibility of SQL databases in retrieving various data. This Sphinx-SQL search system has been applied to PubChem widgets and PubChem Search. The talk will discuss the system from two perspectives: the infrastructure and the applications. For researchers who are interested in searching for chemical information, it will also share ideas on user experience and show how to build their own custom search engines and web pages with queries to this Sphinx-SQL system.
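
The two-stage pattern described above (a fast text index narrows the candidates, then SQL retrieves the full records) can be sketched as follows. This is only an illustration of the general idea, not the PubChem system: a small in-memory inverted index stands in for Sphinx, SQLite stands in for the SQL back end, and the table name, column names, and function names are assumptions.

```python
import sqlite3
from collections import defaultdict


def build_text_index(rows):
    """Build a token -> {cid} inverted index (a stand-in for a text engine)."""
    index = defaultdict(set)
    for cid, name in rows:
        for token in name.lower().split():
            index[token].add(cid)
    return index


def hybrid_search(query, index, conn):
    """Stage 1: text lookup for matching ids. Stage 2: SQL fetch of records."""
    tokens = query.lower().split()
    if not tokens:
        return []
    # Keep only the ids whose names contain every query token.
    ids = set.intersection(*(index.get(t, set()) for t in tokens))
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    sql = (
        "SELECT cid, name, mol_weight FROM compound "
        f"WHERE cid IN ({placeholders}) ORDER BY cid"
    )
    return conn.execute(sql, sorted(ids)).fetchall()


# Hypothetical three-row table used only for this example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compound (cid INTEGER PRIMARY KEY, name TEXT, mol_weight REAL)"
)
rows = [
    (1, "acetic acid", 60.05),
    (2, "salicylic acid", 138.12),
    (3, "acetyl chloride", 78.50),
]
conn.executemany("INSERT INTO compound VALUES (?, ?, ?)", rows)
index = build_text_index((cid, name) for cid, name, _ in rows)
results = hybrid_search("acid", index, conn)
```

Splitting the work this way lets the text index answer the question of which records match, while the relational store answers the question of what to show for each match.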

10:45am-11:10am CINF 96: Amoeba-inspired heuristic search dynamics for semi-quantitative estimation of unknown chemical kinetics
Masashi Aono1,2, masashi.aono@elsi.jp

1 Earth-Life Science Institute, Tokyo Institute of Technology, Meguro, Tokyo, Japan; 2 PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama, Japan
When simulated for longer than a few microseconds, huge computational costs are required to undertake ab-initio approaches for the quantitative estimation of the chemical kinetics of unknown reactions, while that of known reactions can be readily achieved by solving rate equations. Here we introduce a heuristic approach for modeling a reaction as probabilistic dynamics to explore the optimal combinations of the bonding states of atoms, which allows us to estimate the unknown kinetics in a semi-quantitative manner that saves computational resources. We extend our previously proposed heuristic algorithm for solving an NP-complete constraint satisfaction problem (SAT) [Aono et al., Langmuir 2013], which was inspired by spatiotemporal dynamics of a unicellular amoeba that exhibits sophisticated computing capabilities. The extended model consists of numerous fluctuating units, each of which abstracts a pseudopod of the amoeba and represents bonded or unbonded states between two atoms [Aono et al., Orig. Life Evol. Biosph. 2015]. All units evolve concurrently, while unfavorable evolutions are prohibited probabilistically according to physicochemical constraints, which are defined by reflecting empirical laws such as Lewis's octet rule and electronic theory of organic chemistry. The model discovers unknown bonding combinations when stabilized as all the constraints are satisfied, indicating that stable molecules have formed. The dynamics can be viewed as traveling around metastable combinations that are identified as reactants and/or products. By properly tuning the prohibition probabilities, the difficulties of traveling across the transition states will be adjusted to achieve the semi-quantitative kinetics estimation. Our future goal is to simulate the emergence of protometabolic networks in the early Earth environment, leading to the understanding of the origin of life.


Electrophilic addition of carbon dioxide to enol pyruvic acid facilitated by probabilistic prohibition rules.
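
To make the notion of probabilistic prohibition in the abstract above more concrete, here is a deliberately simplified toy search. It is not the authors' model: units are updated one at a time rather than concurrently, only single bonds and a fixed target bond count per atom are considered, and the function name and all parameter names (`p_prohibit`, `p_break`) are assumptions.

```python
import itertools
import random


def search_bonding(valences, p_prohibit=0.95, p_break=0.2,
                   max_steps=200_000, seed=0):
    """Toy stochastic search for a bonding pattern that satisfies valences.

    `valences` lists the target bond count of each atom (single bonds only).
    Returns the set of bonded (i, j) pairs, or None if no stable state
    was reached within `max_steps`.
    """
    rng = random.Random(seed)
    targets = list(valences)
    pairs = list(itertools.combinations(range(len(targets)), 2))
    bonded = set()                  # units currently in the bonded state
    degree = [0] * len(targets)     # current bond count per atom

    for _ in range(max_steps):
        if degree == targets:       # all constraints satisfied: stable
            return bonded
        i, j = rng.choice(pairs)    # pick one fluctuating unit
        if (i, j) in bonded:
            if rng.random() < p_break:      # bonds occasionally break
                bonded.remove((i, j))
                degree[i] -= 1
                degree[j] -= 1
        else:
            violates = degree[i] >= targets[i] or degree[j] >= targets[j]
            # A constraint-violating bond is prohibited with probability
            # p_prohibit; otherwise the fluctuation is allowed through.
            if violates and rng.random() < p_prohibit:
                continue
            bonded.add((i, j))
            degree[i] += 1
            degree[j] += 1
    return bonded if degree == targets else None


# Methane-like toy system: atom 0 wants 4 bonds, atoms 1-4 want 1 each.
result = search_bonding([4, 1, 1, 1, 1])
```

In this sketch, raising `p_prohibit` makes it harder to pass through constraint-violating states, which loosely mirrors the abstract's remark that tuning the prohibition probabilities adjusts how difficult it is to travel across transition states.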

11:10am-11:35am CINF 97: Database searching and rediscovering the wheel in scientific research
Christina Gilpin1, crgilpin@selectosep.com, Roger Gilpin2

1 Select-O-Sep, Freeport, Ohio, United States; 2 Wright State University, Dayton, Ohio, United States
Although there are a number of highly sophisticated electronic search services in many areas of science, rediscovering/reinventing concepts that were described years in the past continues to occur. This problem seems to occur more often when the initial idea/concept dates back 30 to 50 years or more. To illustrate this point, today’s talk will present four examples of broad-based analytical chemistry techniques/methods where this problem has occurred. Although analytical chemistry was selected, it is not because it has more rediscoveries but because the authors are most familiar with this area of chemistry and the important developments that have occurred over the last half century or more.

There are a number of reasons why this happens, which fit into broad categories: 1) the databases have problems identifying early work, 2) researchers are not adequately using the databases, or 3) a combination of both. In order to consider each of these, the topics to be discussed will be: a) keywords and the use of keywords, b) terminology and changes in terminology, c) impatience with database searching, d) the ordering of items in search engines, e) lack of historical perspective, f) overuse of the Internet as the primary search tool, and g) default settings and direction of the search (e.g., from new to old or from old to new).

Although arguments can be made for all of the above causes, if a good search is carried out, past discoveries should not be missed and hence should not be rediscovered. Likewise, if a reference is missed once and goes uncited for several years, the chances of finding it are diminished, though finding it is not impossible.

11:35am-11:40am Concluding Remarks

2016 CINF Officers and Functionaries

Chair
Rachelle Bienstock,
RJB Computational Modeling LLC
rachelleb1@gmail.com

Chair-Elect
Erin Davis,
Cambridge Crystallographic Data Centre
erinsdavis@gmail.com

Past-Chair
Rachelle Bienstock,
RJB Computational Modeling LLC
rachelleb1@gmail.com

Secretary
Tina Qin,
Michigan State University
ginna@mail.lib.msu.edu

Treasurer
Rob McFarland,
Washington University
rmcfarland@wustl.edu

CINF Councilors
Bonnie Lawlor,
chescot@aol.com
Andrea Twiss-Brooks,
University of Chicago
atbrooks@uchicago.edu
Svetlana N. Korolev,
University of Wisconsin, Milwaukee
skorolev@uwm.edu

CINF Alternate Councilors
Carmen Nitsche,
carmen@cinformaconsulting.com
Charles Huber,
University of California, Santa Barbara
huber@library.ucsb.edu
Jeremy Ross Garritano,
University of Virginia
jg9jh@virginia.edu

Archivist/Historian
Bonnie Lawlor,
chescot@aol.com

Audit Committee Chair
TBD

Awards Committee Chair
David Evans,
david.evans@relx.ch

Careers Committee Co-Chairs
Pamela Scott,
Pfizer
pamela.j.scott@pfizer.com
Sue Cardinal,
University of Rochester
scardinal@library.rochester.edu

Communications and Publications Committee Chair
Graham Douglas,
communications at acscinf.org

Procedures Chair
Bonnie Lawlor,
chescot@aol.com

Education Committee Chair
Grace Baysinger,
Stanford University
graceb@stanford.edu

Finance Committee Chair
Rob McFarland,
Washington University
rmcfarland@wustl.edu

Fundraising Interim Committee Chair
Graham Douglas,
communications at acscinf.org

Membership Committee Chair
Donna Wrublewski,
Caltech Library
dtwrub@caltech.edu

Nominating Committee Chair
Rachelle Bienstock,
RJB Computational Modeling LLC
rachelleb1@gmail.com

2016–2017 Program Committee Chair
Elsa Alvaro,
Northwestern University
elsa.alvaro@northwestern.edu

2015–2016 Program Committee Chair
Erin Davis,
Cambridge Crystallographic Data Centre
erindavis@gmail.com

Tellers Committee Chair
Sue Cardinal,
University of Rochester
scardinal@library.rochester.edu

Chemical Information Bulletin Editor Spring
Vincent F. Scalfani,
The University of Alabama
vfscalfani@ua.edu

Chemical Information Bulletin Editor Summer
Judith Currano,
University of Pennsylvania
currano@pobox.upenn.edu

Chemical Information Bulletin Editor Fall
Teri Vogel,
UC San Diego Library
tmvogel@ucsd.edu

Chemical Information Bulletin Editor Winter
David Shobe,
Patent Information Agent
avidshobe@yahoo.com

Webmaster
Stuart Chalk,
University of North Florida
schalk@unf.edu

Fall 2016 CINF Bulletin Contributors

Articles and Features
Rachelle Bienstock
Bonnie Lawlor
Robert E. Buntrock
Teri Vogel

Sponsor Information
Graham Douglas

Production
Teri Vogel
Vincent F. Scalfani
Stuart Chalk
Bonnie Lawlor
Wendy A. Warr

Schedule of Future ACS National Meetings

Meeting   Dates             Year   Location            Theme
253rd     Apr. 2–6          2017   San Francisco, CA   TBD
254th     Aug. 20–24        2017   Washington, DC      TBD
255th     Mar. 18–22        2018   New Orleans, LA     TBD
256th     Aug. 19–23        2018   Boston, MA          TBD
257th     Mar. 31–Apr. 4    2019   Orlando, FL         TBD
258th     Aug. 25–29        2019   San Diego, CA       TBD
259th     Mar. 22–26        2020   Philadelphia, PA    TBD