Vol. 68, No. 1: Spring 2016

Chemical Information Bulletin

A Publication of the Division of Chemical Information of the ACS
Spring 2016 — Vol. 68, No. 1

Vincent F. Scalfani, Editor,
The University of Alabama, Tuscaloosa
vfscalfani@ua.edu

ISSN: 0364–1910
Chemical Information Bulletin,
© Copyright 2016 by the Division of Chemical Information of the American Chemical Society.

Message From the Chair

The upcoming meeting, with its theme “Computers in Chemistry,” is a real opportunity for our division, as a computationally oriented division, to shine in the San Diego sun. In keeping with the “all digital” theme, it is appropriate to point out that this will be the first national meeting without a “free” printed program book readily available. So make sure to download the PDF version or have the app on your phone or tablet!

Looking towards the CINF program in San Diego, at this meeting we are trying something a bit new, different and extravagant: a three-day symposium, “Chemistry, Data and The Semantic Web,” on Tuesday, Wednesday and Thursday. Hopefully you will have the opportunity to catch at least some of the presentations in this symposium, as Evan Bolton and Stuart Chalk have put together an outstanding ensemble of speakers.

The entire CINF program covers diverse areas in the application of computational studies to chemistry and chemical information. David Deng has organized a symposium dealing with the challenges involving incompatibilities in data formats. Jason Cole’s symposium discusses the use of 3D structural data for computational predictive studies. Leah McEwen and Ian Bruno’s symposium deals with global databases and data sharing. On the business-oriented side, in conjunction with the Small Business Division, we have Edlyn Simmons’ symposium on access to chemical information for small businesses and startups. Art Cho has organized a symposium on the application of more fundamental DFT computational methods to materials and pharmaceuticals. Elsa Alvaro and Andrea Twiss-Brooks deal with issues of Open Access publishing and funding agency requirements. I am organizing a symposium concerned with databases providing a bridge between chemical and biological pathway data, while Ye Li and Vincent Scalfani round out the CINF program with a symposium on reimagining the new digital library.

I also hope you will join us for our Sunday evening welcoming reception where you will have the opportunity not only to network and chat informally with CINF members, but also to see the posters and meet the students competing in the CINF Scholarships for Scientific Excellence: Student Poster Competition.

I want to thank all the CINF volunteers and officers: Erin Davis, Leah McEwen, Guenter Grethe, Andrea Twiss-Brooks, Dave Martinsen, Belinda Hurley, and Carmen Nitsche, who are stepping down from various positions or, in some cases, preparing to transition to new positions. Of course, we are always open to new volunteers, so if you are interested in becoming involved with CINF, please approach me or any of our volunteers or officers.

Believe it or not, the call for papers for the Fall 2016 Philadelphia meeting has already opened (http://www.acs.org/content/acs/en/meetings/abstract-submissions/acsnm252/division-of-chemical-information.html), so please check out the program and begin sending in submissions. The CINF submission deadline is March 23.

I am looking forward to seeing you in San Diego or Philadelphia and please email me if you have any suggestions regarding how CINF can better serve its members.

Rachelle J. Bienstock, Chair,
ACS Division of Chemical Information
Rachelleb1@gmail.com

Letter From the Editor

Thanks for reading this issue of the ACS Chemical Information Bulletin (CIB). I would first like to thank all of our wonderful authors, editors, and sponsors for contributing to the CIB.

In this issue we have two feature articles. The first is a tribute to Phil Heller from his colleagues at Thieme. Phil contributed most recently in the CINF division as Fundraising Chair. It was a pleasure to work with Phil and he is greatly missed. The second feature article is from our expert Book Reviewer, Bob Buntrock. In Bob’s piece “Are Libraries Obsolete?”, he reviewed Are Libraries Obsolete? An Argument for Relevance in the Digital Age by Herring, M. Y., BiblioTech: Why Libraries Matter More Than Ever in the Age of Google by Palfrey, J., and Fool’s Gold: Why the Internet is No Substitute for a Library by Herring, M. Y.

Lately I have been having a bit of fun with text and data mining, so I thought it would be appropriate to run the CINF abstracts through my Mathematica scripts. Below is a word cloud of the CINF abstracts for the 251st ACS Meeting in San Diego, CA. I removed some stop words and non-descriptive words. Enjoy! I look forward to seeing you in San Diego.
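For readers curious about the preprocessing behind a word cloud, here is a minimal Python sketch of counting words after removing stop words. It is not the Mathematica script used for the cloud: the function name, the stop-word list, and the sample sentence are all hypothetical stand-ins, since the actual word list and abstract text are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the words actually removed for the cloud are not given.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on", "with"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase alphabetic words in text, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)

# Hypothetical sample text standing in for the meeting abstracts.
sample = "Chemical data and the semantic web: sharing chemical data in open data formats."
freqs = word_frequencies(sample)

# The most frequent remaining words would be drawn largest in a word cloud.
print(freqs.most_common(3))
```

A word cloud tool then simply scales each word by its count.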

Image: word cloud of the CINF abstracts for the 251st ACS Meeting

Vincent F. Scalfani, Editor
The University of Alabama

vfscalfani@ua.edu

CINF Social Networking Events at the Spring 2016 ACS Meeting

It is Our Pleasure to Invite You to Attend These Division of Chemical Information Events!

The ACS Division of Chemical Information is pleased to host the following social networking events at the Spring 2016 ACS National Meeting in San Diego, CA.


Sunday Welcoming Reception & Scholarships for Scientific Excellence Posters
6:30-8:30 pm, Sunday, March 13 – Room 3, San Diego Convention Center
Reception co-sponsored by:
Journal of Chemical Information & Modeling (ACS Publications), Journal of Cheminformatics (Springer), PerkinElmer, Thieme Chemistry and Wiley ChemPlanner.

Scholarships for Scientific Excellence sponsored exclusively by InfoChem.


Tuesday Luncheon (Ticketed Event – Contact Michael Qiu at our Symposia)
12:00-1:30 pm, Tuesday, March 15 – Room 20D, San Diego Convention Center
Sponsored exclusively by the Royal Society of Chemistry.

Speaker: Dr. Christopher Tubbs
Conservation Education Division at the San Diego Zoo Institute for Conservation Research

Presentation: “Dietary phytoestrogens and reproduction in southern white rhinoceros.”


ACS Division of Chemical Information Data Summit Reception
6:00-8:30 pm Wednesday, March 16 – Awesome Location To Be Announced

Co-sponsored by: Journal of Chemical Information & Modeling (ACS Publications), Chemical Semantics, Dotmatics, MestReLab Research and tranSMART Foundation.

Tuesday Luncheon Talk

Dr. Christopher Tubbs
Dietary phytoestrogens and reproduction in southern white rhinoceros

Dr. Christopher Tubbs is a Scientist in the Reproductive Physiology Division, Conservation Education Division, San Diego Zoo Institute for Conservation Research. He received his B.S. from the University of Florida and his Ph.D. from the University of Texas at Austin Marine Science Institute. His research at the San Diego Zoo Institute for Conservation Research focuses on interactions between environmental chemicals and endocrine systems of endangered species.

 

CINF Business Meetings

Saturday, March 12: 1:00-3:00 PM

  • Education Committee - San Diego Convention Center Room 32B
  • Awards Committee - San Diego Convention Center Room 33C
  • Program Committee - San Diego Convention Center Room 33B

Saturday, March 12: 3:00-6:00 PM

  • Executive Committee - San Diego Convention Center Room 33A

Sunday, March 13: 12:00-2:00 PM

  • Chemical Structure Association Trust - San Diego Convention Center Room 6D

Awards and Scholarships

Chemical Structure Association Trust

Applications Invited for CSA Trust Grant for 2016 and 2017

The Chemical Structure Association (CSA) Trust is an internationally recognized organization established to promote the critical importance of chemical information to advances in chemical research. In support of its charter, the Trust has created a unique Grant Program and is now inviting the submission of grant applications for 2016. The deadline for receipt of proposals for the 2017 Grant is also being announced at this time.

Purpose of the Grants

The Grant Program has been created to provide funding for the career development of young researchers who have demonstrated excellence in their education, research or development activities that are related to the systems and methods used to store, process and retrieve information about chemical structures, reactions and compounds. One or more Grants will be awarded annually up to a total combined maximum of ten thousand U.S. dollars ($10,000).  Grantees have the option of payments being made in U.S. dollars or in British Pounds equivalent to the U.S. dollar amount. Grants are awarded for specific purposes, and within one year each grantee is required to submit a brief written report detailing how the grant funds were allocated. Grantees are also requested to recognize the support of the Trust in any paper or presentation that is given as a result of that support.

Who is Eligible?

Applicant(s), age 35 or younger, who have demonstrated excellence in their chemical information related research and who are developing careers that have the potential to have a positive impact on the utility of chemical information relevant to chemical structures, reactions and compounds, are invited to submit applications.  While the primary focus of the Grant Program is the career development of young researchers, additional bursaries may be made available at the discretion of the Trust. All requests must follow the application procedures noted below and will be weighed against the same criteria.

Which Activities are Eligible?

Grants may be awarded to acquire the experience and education necessary to support research activities, for example, for travel to collaborate with research groups, to attend a conference relevant to one’s area of research (including the presentation of an already accepted research paper), to gain access to special computational facilities, or to acquire unique research techniques in support of one’s research.

Application Requirements

Applications must include the following documentation:

  1. A letter that details the work upon which the Grant application is to be evaluated as well as details on research recently completed by the applicant;
  2. The amount of Grant funds being requested and the details regarding the purpose for which the Grant will be used (e.g. cost of equipment, travel expenses if the request is for financial support of meeting attendance, etc.). The relevance of the above-stated purpose to the Trust’s objectives and the clarity of this statement are essential in the evaluation of the application;
  3. A brief biographical sketch, including a statement of academic qualifications;
  4. Two reference letters in support of the application.

Additional materials may be supplied at the discretion of the applicant only if relevant to the application and if such materials provide information not already included in items 1-4. A copy of the completed application document must be supplied for distribution to the Grants Committee and can be submitted via regular mail or e-mail to the Committee Chair (see contact information below).

Deadline for Applications

Application deadline for the 2016 Grant is March 25, 2016. Successful applicants will be notified no later than May 2, 2016. Application deadline for the 2017 Grant is March 31, 2017. Successful applicants will be notified no later than May 9, 2017.

Address for Submission of Applications

The application documentation can be mailed via post or emailed to: Bonnie Lawlor, CSA Trust Grant Committee Chair, 276 Upper Gulph Road, Radnor, PA 19087, USA. If you wish to enter your application by e-mail, please contact Bonnie Lawlor at chescot@aol.com prior to submission so that she can contact you if the e-mail does not arrive.

Chemical Structure Association Trust: Recent Grant Awardees

2015 – Dr. Marta Enciso

Molecular Modeling Group, Department of Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Australia. She was awarded a Grant to cover travel costs to visit collaborators at universities in Spain and Germany and to present her work at the European Biophysical Societies Association Conference in Dresden, Germany in July 2015.

2015 – Jack Evans

School of Physical Science, University of Adelaide, Australia. He was awarded a grant to spend two weeks collaborating with the research group of Dr. Francois-Xavier Coudert (CNRS, Chimie Paris Tech).

2015 – Dr. Olexandr Isayev

Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill. He was awarded a Grant to attend summer classes at the Deep Learning Summer School 2015 (University of Montreal) to expand his knowledge of machine learning to include Deep Learning (DL). His goal is to apply DL to chemical systems to improve predictive models of chemical bioactivity.

2015 – Aleix Gimeno Vives

Cheminformatics and Nutrition Research Group, Biochemistry and Biotechnology Dept., Universitat Rovira i Virgili. He was awarded a Grant to attend the Cresset European User Group Meeting in June 2015 in order to improve his knowledge of the software that he is using to determine what makes an inhibitor selective for PTP1B.

2014 – Dr. Adam Madarasz

Institute of Organic Chemistry, Research Centre for Natural Sciences, Hungarian Academy of Sciences. He was awarded a Grant for travel to study at the University of Oxford with Dr. Robert S. Paton, a 2013 CSA Trust Grant winner, in order to increase his experience in the development of computational methodology which is able to accurately model realistic and flexible transition states in chemical and biochemical reactions.

2014 – Mª José Ojeda Montes

Department of Biochemistry and Biotechnology, University Rovira i Virgili, Spain. She was awarded a Grant for travel expenses to study for four months at the Freie University of Berlin to enhance her experience and knowledge regarding virtual screening workflows for predicting therapeutic uses of natural molecules in the field of functional food design.

2014 – Dr. David Palmer

Department of Chemistry, University of Strathclyde, Scotland. He was awarded a Grant to present a paper at the fall 2014 meeting of the American Chemical Society on a new approach for representing molecular structures in computers based upon ideas from the Integral Equation Theory of Molecular Liquids.

2014 – Sona B. Warrier

Departments of Pharmaceutical Chemistry, Pharmaceutical Biotechnology, and Pharmaceutical Analysis, NMIMS University, Mumbai. She was awarded a Grant to attend the International Conference on Pure and Applied Chemistry to present a poster on her research on inverse virtual screening in drug repositioning.

2013 – Dr. Johannes Hachmann

Department of Chemistry and Chemical Biology at Harvard University, Cambridge, MA. He was awarded the Grant for travel to speak on “Structure-property relationships of molecular precursors to organic electronics” at a workshop sponsored by the Centre Européen de Calcul Atomique et Moléculaire (CECAM) that took place October 22 – 25, 2013 in Lausanne, Switzerland.

2013 – Dr. Robert S. Paton

University of Oxford, UK.  He was awarded the Grant to speak at the Sixth Asian Pacific Conference of Theoretical and Computational Chemistry in Korea on July 11, 2013. Receiving the invitation for this meeting provided Dr. Paton with an opportunity to further his career as a Principal Investigator.

2013 – Dr. Aaron Thornton

Material Science and Engineering at CSIRO in Victoria, Australia. He was awarded the Grant to attend the 2014 International Conference on Molecular and Materials Informatics at Iowa State University with the objective of expanding his knowledge of web semantics, chemical mark-up language, resource description frameworks and other online sharing tools. He also visited Dr. Maciej Haranczyk, a prior CSA Trust Grant recipient, who is one of the world leaders in virtual screening.

2012 – Tu Le

CSIRO Division of Materials Science & Engineering, Clayton, VIC, Australia. Tu C. was awarded the Grant for travel to attend a cheminformatics course at Sheffield University and to visit the Membrane Biophysics group of the Department of Chemistry at Imperial College London.

2011 – J. B. Brown

Kyoto University, Kyoto, Japan. J.B. was awarded the Grant for travel to work with Professor Ernst-Walter Knapp at the Freie University of Berlin and Professor Jean-Philippe Vert of Mines ParisTech to continue his work on the development of atomic partial charge kernels.

2010 – Noel O’Boyle

University College Cork, Ireland. Noel was awarded the grant to both network and present his work on open source software for pharmacophore discovery and searching at the 2010 German Conference on Cheminformatics.

2009 – Laura Guasch Pamies

University Rovira & Virgili, Catalonia, Spain.  Laura was awarded the Grant to do three months of research at the University of Innsbruck, Austria.

2008 – Maciej Haranczyk

University of Gdansk, Poland. Maciej was awarded the Grant to travel to Sheffield University, Sheffield, UK, for a 6-week visit for research purposes.

2007 – Rajarshi Guha

Indiana University, Bloomington, IN, USA. Rajarshi was awarded the Grant to attend the Gordon Research Conference on Computer-Aided Design in August 2007.

2006 – Krisztina Boda

University of Erlangen, Erlangen, Germany. Krisztina was awarded the Grant to attend the 2006 spring National Meeting of the American Chemical Society in Atlanta, GA, USA.

2005 – Dr. Val Gillet and Professor Peter Willett

University of Sheffield, Sheffield, UK.  They were awarded the Grant for student travel costs to the 2005 Chemical Structures Conference held in Noordwijkerhout, the Netherlands.

2004 – Dr. Sandra Saunders

University of Western Australia, Perth, Australia. Sandra was awarded the Grant to purchase equipment needed for her research.

2003 – Prashant S. Kharkar

Institute of Chemical Technology, University of Mumbai, Matunga, Mumbai. Prashant was awarded the Grant to attend the conference, Bioactive Discovery in the New Millennium, in Lorne, Victoria, Australia (February 2003) to present a paper, “The Docking Analysis of 5-Deazapteridine Inhibitors of Mycobacterium avium complex (MAC) Dihydrofolate reductase (DHFR).”

2001 – Georgios Gkoutos

Imperial College of Science, Technology and Medicine, Department of Chemistry, London, UK. Georgios was awarded the Grant to attend the conference, Computational Methods in Toxicology and Pharmacology Integrating Internet Resources (CMTPI-2001), in Bordeaux, France, to present part of his work on internet-based molecular resource discovery tools.

Committee Reports

Report on the Council Agenda for March 16, 2016

The Council of the American Chemical Society will meet in San Diego, CA on Wednesday, March 16, 2016 from 8:00am until approximately 12:00pm in the Sapphire Ballroom of the Hilton San Diego Bayfront Hotel. All ACS members are welcome to attend, although only Councilors are permitted to vote. A continental breakfast is usually available at 7:00am for all attendees.  There are only three items for Council Action and all are routine. The action items are summarized below.

Nominations and Elections

President-Elect: The Committee on Nominations & Elections (N&E) has identified four nominees for the office of 2017 ACS President-Elect. They are as follows: Peter K. Dorhout, Thomas R. Gilbert, C. Bradley Moore, and Gregory H. Robinson. The four nominees will answer questions at the Town Hall meeting that will be held on Sunday, March 13th, at 4:30pm in the Indigo Ballroom of the Hilton San Diego Bayfront Hotel. Questions may be submitted in advance at: nomelect@acs.org. On March 16th the Council will select the final two candidates whose names will appear on the fall ballot.

Other Elections

The Committee on Nominations and Elections has announced the list of nominees to represent District II and District IV on the Board of Directors for the term 2017-2019. Nominees for District II are George M. Bodner, Christina C. Bodurow, Isai T. Urasa, and Ruth Ann Woodall. Nominees for District IV are Rigoberto Hernandez, Larry K.R. Ritchie, and Barry J. Streusand. Ballots have been emailed to the voting councilors in the two districts and the results will be announced at the Council Meeting in San Diego and in Chemical & Engineering News. On or before October 10, 2016, ballots listing the two candidates selected by the Councilors for each District will be mailed to all members of District II and District IV for the election of a Director from each District.

N&E also announced the election of Directors-at-Large that will be conducted in the fall. The candidates for a 2017-2019 term are Joseph A. Heppert, Kristin M. Omberg, Dorothy J. Phillips, and Kathleen M. Schulz.

ACS Dues for 2017

Council will vote on the recommendation from the Committee on Budget and Finance with regard to the 2017 membership dues (an increase of $4.00 - from $162 to $166). The increases to ACS dues are based upon an escalator defined in the ACS Bylaws (Bylaw XIII, Section 3,a). The dues are calculated by multiplying the base (current) rate “by a factor which is the ratio of the revised Consumer Price Index for Urban Wage Earners and Clerical Workers (Service Category) for the second year previous to the dues year to the value of the index for the third year previous to the dues year, as published by the United States Department of Labor, with the fractional dollar amounts rounded to the nearest whole dollar”.

Base rate 2016: $162.00

Change in the Consumer Price Index, Urban Wage Earners, Services Category:

  • December 2015 CPI-W (Services): 288.663
  • December 2014 CPI-W (Services): 281.800
  • Change in CPI-W Index: 2.44%

2017 Dues, Fully Escalated: $162.00 x 1.0244 = $165.96
2017 Dues, Rounded: $166.00
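The escalator arithmetic above can be checked with a short Python sketch. This is only an illustration of the Bylaw formula as quoted in this report, not an official ACS calculation: the function name is hypothetical, and rounding halves upward is an assumption, since the Bylaw text says only "rounded to the nearest whole dollar".

```python
from decimal import Decimal, ROUND_HALF_UP

def escalated_dues(base_rate, cpi_second_prior, cpi_third_prior):
    """Multiply the base rate by the ratio of the two index values,
    then round to the nearest whole dollar (half-up is an assumption)."""
    ratio = Decimal(cpi_second_prior) / Decimal(cpi_third_prior)
    unrounded = Decimal(base_rate) * ratio
    return unrounded.quantize(Decimal("1"), rounding=ROUND_HALF_UP)

# Figures from this report: 2016 base rate and the December 2015 and
# December 2014 CPI-W (Services) index values.
dues_2017 = escalated_dues("162.00", "288.663", "281.800")
print(dues_2017)
```

Because the result is rounded to a whole dollar, small differences in how the ratio itself is rounded do not change the proposed 2017 figure.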

Petitions for Vote

Approval of the “Academic Professional Guidelines”

The Committee on Economic and Professional Affairs (CEPA) presented the revised “Academic and Professional Guidelines” for consideration at the Council meeting in Boston on August 13, 2015. No revision comments were received and the document will be up for Council approval in San Diego. For a look at the proposed changes see page 75 of the Council Agenda Book at: http://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-3.16.pdf.

Petitions for Consideration

The Chemical Professional’s Code of Conduct

The Committee on Economic and Professional Affairs (CEPA) has developed revisions to the Chemical Professional’s Code of Conduct (CPCC). This was last approved in 2012. After a rigorous review of the document, an updated version of the CPCC is also included in the Council Agenda Book, page 72. Please send any suggestions for further revisions to careers@acs.org before April 30, 2016 so that they can be incorporated into the revised document, which will be up for Council action at the meeting in Philadelphia later this year. The Council Agenda Book can be accessed at: http://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-3.16.pdf.

Petition to Extend the Unemployed Members’ Dues Waiver (Bylaw XIII, Sec. 3)

The petitioners propose changes to the ACS’s Bylaws to allow unemployed members of the Society to remain as members without paying dues for a period of up to three years. Bylaw XIII, Sec. 3, k currently provides for the dues to be waived for an unemployed member for a period of up to two years. The Committee on Membership Affairs (MAC), prepared a Market Data Status Report (as of 5-14-15), which resulted in the following data.

Consecutive Years Unemployed: Count

  • 1 year unemployed: 882
  • 2 years unemployed: 331
  • 3 years unemployed: 179

Expanding this benefit to a third year prevented 179 members from being removed from membership in the Society. The extension of the benefit by another year would result in virtually no cost to the Society, yet would preserve membership status for those individuals who have been unemployed as chemists for up to three years (see page 93 of the Council Agenda Book).

The Committee on Constitution and Bylaws has reviewed the petition and finds it to be legal and not inconsistent with the constitution of the society. The proposed Bylaw amendment accomplishes the petitioners’ goal of expanding the unemployed members’ dues waiver from two to three years. It recommends changing “a waiver” to “an annual waiver” in line 3 of the Bylaw so that it is consistent with the sentence that follows. The Committee on Constitution and Bylaws concurs with the addition of reaffirming the status each year.

C&B is concerned with the assumption in the Explanation that without the waiver extension, the Society would lose those members who would benefit from the waiver extension. The financial implications of this petition are still being assessed.

Town Hall Meeting

A Town Hall meeting organized by the Committee on Nominations and Elections is scheduled for Sunday, March 13, 2016 in the Indigo Ballroom of the Hilton San Diego Bayfront Hotel from 4:30pm-5:30pm. It will highlight a Q&A session with the candidates for President-Elect. All ACS members are encouraged to attend. It is a great way to gather first-hand information and decide for whom you might want to vote in the fall election.

Note: The Council Agenda Book can be accessed at: http://www.acs.org/content/dam/acsorg/about/governance/councilors/council-agenda-3.16.pdf.

Respectfully submitted February 10, 2016

CINF Councilors
Bonnie Lawlor
Andrea Twiss-Brooks
Svetlana N. Korolev

Philip F. Heller, 1956-2015

Every member of the chemical information community is aware of Phil’s numerous contributions, first as a long-standing employee of the Institute for Scientific Information, and then for many years at Thieme. It would take too long to list them all here. Some members of the community also had the privilege to know Phil as an “industry colleague.” A few of us knew him as a coworker, a friend, a mentor, or, as Alex jokingly called him, her “work husband.” This note is about Phil, the person.

We miss Phil on Mondays and Thursdays the most – or on the days when he used to work at the New York office. It took us weeks to gather the strength to clean his desk and dispose of the VfB Stuttgart (a soccer team in Germany) coffee cup that had always been full of steaming black coffee. We weren’t quite ready to let the small mementos go yet. We wrote hundreds of emails to his customers, many of whom were his friends, and the response was always the same, “what a terrible loss, he was such a great person, such a wonderful representative, a true friend…” even from people who knew him only via email and phone.  “He was unfailingly kind,” someone said about Phil.

As difficult as those messages were, they also helped us form a lasting memory of Phil, of the colleague who was always ready to share his vast industry knowledge and help us whenever help was needed. Of the unassuming mentor whose experience we often relied upon in our work. Of the travel companion, who seemed, in his business travels, to have been everywhere and had suggestions and advice for places to which we had never been. Of an extraordinary sales executive, who personified “relationship management” and “consultative selling” in his work. And of the friend, whose memory will always be with us.

Adam Bernacki and Alexandra Williams
Phil’s closest coworkers at Thieme

Book Reviews: by Robert E. Buntrock

I was prepared to write reviews of two books for this issue of the CIB, but current developments on the title subject caused me to postpone those until the next issue. The first development was a review of “BiblioTech” in CHOICE (1). Rather than wait for a review copy from the publisher, I checked the book out from nearby Fogler Library at the University of Maine (my card from Bangor Public Library is good there). To illustrate the value of a bricks and mortar library and shelf browsing, I found another related book, “Are Libraries Obsolete?” next to it and also checked it out. A week later, while reading the Wall Street Journal, I came across a related article making the subject even more timely (2). So here goes.

__________________________________

Herring, M. Y., Are Libraries Obsolete? An Argument for Relevance in the Digital Age, McFarland, Jefferson, NC, 2014. 258 p. + vii. ISBN 978-0-7864-7356-4 Softcover, $25, Ebook ISBN: 978-1-4766-1591-2.

The author has more than 35 years of experience working in libraries and has previously addressed the subject of libraries vs. the Internet. The book, “Fool’s Gold” (3), was not reviewed in CHOICE nor was it in the UMaine Library, so I had to order it from another library. It is interesting that CHOICE, published by the Association of College & Research Libraries (a division of ALA), did not review this book. The article “10 Reasons” (4) was, fortunately, available open access.

This book, “Obsolete,” is an update of both. The approach is middle-of-the-road, bound to offend more extreme views on both sides of the issues, but necessary. Of course, the implications involve both libraries and librarians.

Updates of the “10 Reasons,” elaborated in succeeding chapters are:

  1. Everything is not on the Internet.  Improving, but still not everything.
  2. Searching the Web: not easy.  Again, improving, but still not great.
  3. Quality control or lack thereof.  With a few exceptions, lacking.
  4. What you don’t know really does hurt you. Temporal rather than permanent, incomplete information, etc.
  5. Mass digitization and wide distribution. Not always desirable or available, campus wide, statewide, etc.

At this point digressions are made to newer, but very important aspects of the issues including copyright, open access, e-books and sharing, depth, and ubiquity.

Chapters in part two cover reading and literacy, privacy, and piracy. Part three has a chapter on current trends in libraries and librarianship and interactions with technology. The final chapter presents two scenarios: 1) Yes, libraries are obsolete or soon will be, and 2)  No, libraries are not obsolete and never will be. The chapter concludes with a discussion of what will produce either scenario, allowing the readers to answer the title question for themselves. An epilogue, chapter notes, a selected bibliography, and an index conclude the book.

Key thoughts include how information leads to knowledge leads to wisdom (one of my favorite maxims).  Instead, doubling of the amount of information (including the good and the bad) leads to a halving of knowledge and a quartering of wisdom. Spoiler alert: libraries are not obsolete, but maybe we’re making them so.

Recommended to anyone who uses libraries of any kind or is concerned with their use and fate, not just those who work in libraries. Besides friends and supporters, libraries and librarians have plenty of enemies, including possibly themselves.

References:

  1. CHOICE, 2015, 53-1061.
  2. Barker, S., In Age of Google, Librarians Get Shelved, Wall Street Journal, Jan. 11, 2016, p. A13.
  3. Herring, M. Y., Fool’s Gold: Why the Internet Is No Substitute for a Library, McFarland, Jefferson, NC, 2007.
  4. 10 Reasons Why the Internet Is No Substitute for a Library, American Libraries, 2001, April, p. 76-78; modified in American Libraries Magazine, January 20, 2010. http://americanlibrariesmagazine.org/2010/01/20/10-reasons-why-the-internet-is-no-substitute-for-a-library/ (accessed Jan. 2016).

__________________________________

Palfrey, J., BiblioTech: Why Libraries Matter More Than Ever in the Age of Google, Basic Books, New York, 2015, 280 p. + vii. ISBN 978-0-465-04299-9 Hardcover, $26.99.

The author, currently the Principal of Phillips Andover Academy, is a former law professor and former director of the Harvard Law Library who has written an excellent book championing the continued existence of libraries of all types. Faced with the increasingly common canard that all information is digital and on Google (and therefore libraries are non-essential), Palfrey makes cogent arguments why this is not so and will not be so. He lists plans for libraries and librarians to create and preserve the alliance of digital and analog or print information and resources, as well as the constructive evolution of physical libraries. Beginning with the 160-year history of the founding, spread, and evolution of public libraries, free to all holders of a library card, Palfrey recounts the stresses, financial and otherwise, on libraries of all kinds. Public libraries are covered the most, but all libraries are involved. Libraries, especially public libraries, are the cornerstone sources of information to provide for and nourish healthy processes of democracy.

Chapter 1, “Perfect Storm,” sets the stage for evidence proving the title. The public’s nostalgic perception that libraries are needed less or not needed at all must be met and surmounted. Libraries should be stewards and not just collectors of knowledge.  Problems with digital archiving include data rot and temporal formats. Chapter 2 discusses customers, especially kids and students, and how they use libraries. Chapter 3 discusses the evolving physical spaces of libraries.

Chapter 4, “Platforms”, uses DPLA, the Digital Public Library of America, http://dp.la, as an example of a platform currently providing digital access to more than 11 million items from libraries, archives, and museums. Several states have such platforms, but a national one is even better.

Chapter 5 discusses how to build the future by constructive “hacking.” Chapter 6 discusses networks, including the human networks of librarians. Preservation via collaboration, rather than competition, is the topic of Chapter 7. School libraries, facilitating connected learners, are covered in Chapter 8.

The 500-pound gorilla in the room, copyright, and its effect on lending and users, especially of e-resources, is discussed in Chapter 9. The conclusion, Chapter 10, summarizes the discussions of the preceding chapters and outlines a 10-step program to reform libraries to make them essential:

  1. Define libraries as platforms
  2. Make libraries networked institutions
  3. Redefinition is demand driven: provide the services the customers need
  4. Redefinition must account for the physical aspects and the analog/print resources
  5. Do those things that need doing
  6. There should be a “common cause” established between librarians (and their customers) and authors, agents, editors, and publishers
  7. Library spaces functioning like labs
  8. Teamwork of librarians and IT experts on creating an open, shared infrastructure and procedures
  9. Preservation should become increasingly collaborative
  10. Funding, both public and private, must be ramped up

Chapter notes, an annotated bibliography, and an index conclude the book.

Exemplifying the plight of many, as an active user of information resources without an institution other than my two public library cards, I’m hindered by the restrictive, even Draconian, use and lending policies of e-resources.

Highly recommended to the same audiences as the previous book, and perhaps even more so to librarians and funders. As a former member of the group of those who work in libraries but are not librarians, I find Palfrey’s term for us, “feral” (applied by librarians), particularly apt. (My previous motto was “I’m a chemist, I work in a library, and I’m not a librarian.” This is not a putdown of librarians, but rather a statement of the varieties of expertise.)

__________________________________

This is one of the few times I’ve reviewed books that I’ve borrowed from libraries (fittingly enough) rather than owned. Since I was able to obtain a loan of the previously cited Fool’s Gold, I’ll provide a brief review.  In for a penny, in for a pound …

__________________________________

Herring, M. Y., Fool’s Gold: Why the Internet Is No Substitute for a Library, McFarland, Jefferson, NC, 2007. 199 p. + vii. ISBN 978-0-7864-5393-1 Softcover, $29.95.

This book is intermediate between Herring’s original “10 Reasons” 2001 article (ref. 4) and his 2014 book. Fool’s Gold is somewhat dated, but is one of the better presentations of the inadequacies of the Internet. Although the Internet has improved somewhat (and also degraded), the conclusion “no substitute for libraries” still stands.

The author insists he is not a Luddite, but he constructively criticizes both the Internet and libraries. In this vein, often humorous, he posits that the Internet has become an object of worship, and the text contains several one-liners, for example, “Google uber alles,” “e-books not ready for prime time,” and “forget the needle (your research), just tell me which haystack.” He defines information as “random data” as opposed to knowledge (I wouldn’t go that far; to me, information resides between data and knowledge).

Chapter 1 covers the information on the Web: disinformation, misinformation, and fraud, and Chapter 2 covers the presentation of so-called information on the Web. Chapter 3 covers Web porn, the funder of much of Web material. Chapter 4 covers “link rot” (the half-life of links is on the order of 18-36 months), and the excessive power of Google to control content is discussed in Chapter 5. A critique of the rise and proliferation of e-books and their perceived value is in Chapter 6. Chapter 7 covers the fiction of the “Paperless Society,” and the shallowness of the Web is covered in Chapter 8, “a mile wide and a mind-numbing inch deep.” Web content is mostly modern, from the previous 10-15 years. Chapter 9 is a summary and conclusion including “the Web is no panacea,” “the wow Factor does not equal knowledge,” “mental junk food,” “our information swamp,” and “the material on the Web is not free.” These observations are contrasted with “Libraries: Treasure Troves of Information” along with strategies for their improvement. Chapter notes and an index conclude the book.

Similarly recommended as for Herring’s other book reviewed above.

__________________________________

In summary, these three books and associated material cover topics well known to librarians and information specialists but not recognized by much of the public, including our customers and clients. Given that both misleading paeans to Google and the Web and gloom-and-doom requiems for libraries and librarians proliferate, the general subjects are quite timely and worthy of attention from us, the public, funders, and legislators.

When it comes to disrespect of librarians and information specialists, I am reminded of the discussions on “disintermediation” publicized more than 20 years ago which I and others countered in presentations and publications. Plus ça change, plus c’est la même chose.

Robert E. (Bob) Buntrock
Buntrock Associates
Orono, ME

Notes From Our Sponsors

Division of Chemical Information Sponsors Spring 2016

The American Chemical Society Division of Chemical Information is very fortunate to receive generous financial support from our sponsors. Their support allows us to maintain the high quality of the Division’s programming, to promote communication between members at social functions at the ACS Spring 2016 National Meeting in San Diego, CA, and to support other divisional activities during the year, including scholarships to graduate students in chemical information.

The Division gratefully acknowledges contributions from the following sponsors:

Gold
Journal of Chemical Information & Modeling (ACS Publications)

Silver
InfoChem
Royal Society of Chemistry

Bronze
Journal of Cheminformatics (Springer)
Chemical Semantics
MestReLab Research
PerkinElmer
Thieme Chemistry
tranSMART Foundation
Wiley ChemPlanner

Contributors
Bio-Rad Laboratories
Dotmatics

Opportunities are available to sponsor Division of Chemical Information events, speakers, and material. Our sponsors are acknowledged on the CINF web site, in the Chemical Information Bulletin, on printed meeting materials, and at any events for which we use their contribution. For more information please review the sponsorship brochure at http://www.acscinf.org/PDF/CINF_Sponsorship_Brochure.pdf. Please feel free to contact me if you would like more information about supporting CINF.

Graham Douglas
Chair pro tem, Fundraising Committee 2016
Email: sponsorship@acscinf.org
Tel: 510-407-0769

The ACS CINF Division is a non-profit tax-exempt organization with taxpayer ID no. 52-6054220.

Journal of Chemical Information and Modeling (ACS Publications)

The Journal of Chemical Information and Modeling and the Journal of Medicinal Chemistry are pleased to introduce a joint Virtual Issue on computational methods for drug discovery and design. This Virtual Issue showcases the synergy and complementary nature of these journals by highlighting publications that delineate the entire spectrum of computational methods and applications that are relevant for drug discovery. The Virtual Issue continues shared initiatives of the two journals including, for instance, the development of a joint editorial policy on QSAR/QSPR and proprietary data. View the issue at: bit.ly/1PRiaih.

InfoChem adds chemical search capabilities to the WIPO PATENTSCOPE system

Munich, Germany (October 19, 2015) – InfoChem GmbH (www.infochem.de), leading provider in chemical structure and reaction technology as well as data mining in chemical scientific and patent documents, announced today that they have been designated by the World Intellectual Property Organization (WIPO) in Geneva to implement the project “Addition of chemical search capabilities to the WIPO PATENTSCOPE search system”.

The goal of this project is to identify, tag and index chemical entities such as IUPAC Names, trade and brand names and trivial names in the PATENTSCOPE full-text documents using InfoChem’s highly acknowledged named entity recognition technology ICANNOTATOR.  Additionally, structure search capabilities for chemical compounds will be added to the PATENTSCOPE search user interface. The cooperation is planned for at least three years, during which time various enhancements to the PATENTSCOPE search system will be implemented with the aim of improving the discoverability of the PATENTSCOPE patent full-text collections.

About InfoChem

InfoChem GmbH (www.infochem.de), based in Munich, Germany, is a market leader in structure and reaction handling and retrieval. Founded in 1989, InfoChem focuses on the production and marketing of new chemical information products, including structural and reaction databases, and the development of software tools required for these applications. The main software tools provided are the InfoChem Fast Search Engine (ICFSE), the InfoChem Chemistry Cartridge for Oracle (ICCARTRIDGE), and the widely-used InfoChem reaction classification algorithm CLASSIFY. InfoChem distributes one of the largest structural and reaction files worldwide, currently containing 5.2 million organic compounds and facts and 4.5 million reactions covering the chemical literature published since 1974 (SPRESI). In addition, InfoChem provides tools for the automatic recognition and extraction of chemical entities and their conversion into chemical structures as well as the semantic enrichment of chemical science documents. Springer GmbH (Berlin) has held a majority interest in InfoChem since 1991. For more information go to www.infochem.de.

 

Media Contacts
InfoChem GmbH
E-Mail: info@infochem.de Tel. +49 (0) 89583002
Please feel free to contact us for more information about InfoChem, our current research projects, and our products.

 

Royal Society of Chemistry

Unlock 500 years of scientific history

Founded in 1841 as the Chemical Society, the Royal Society of Chemistry is one of the oldest and most eminent chemical societies in the world. In our collections, thousands of books, journals, letters, notes and pamphlets contain a valuable and fascinating chronicle of chemical science from the 16th century to the present day.

The Historical Collection is a digital archive designed to make these significant scientific records widely available. Featuring over 380,000 pages of scientific history, it allows easy access to documents that have shaped our understanding of chemistry, and helps to pinpoint the moments when key ideas first began to develop.

Highlights include:

  • The Roscoe Collection, featuring items on alchemy and early chemistry. The oldest is ‘De Secretis Mulierum,’ a compendium of medicinal knowledge from 1505.
  • The Davy Bookcase was donated in 1919 by George Holloway, a former society member. It contains items that were formerly the property of Sir Humphry Davy.
  • Society publications, including copies of Chemistry in Britain and Education in Chemistry dating back to the 1960s, sit alongside annual reports and proceedings from the separate societies that merged to form the Royal Society of Chemistry.

It’s a unique resource, and a significant addition to any science library.

Developing the Frontiers project

Wholly society and institute owned, the Frontiers project aims to publish a series of high-impact, quality chemistry journals that showcase the very best research from China, Asia and the rest of the world to an international audience. Each journal has a top Chinese institute in the relevant field as a partner in the collaboration.

Currently available:

  • Inorganic Chemistry Frontiers publishes research articles, reviews, notes, comments and methods covering all areas of inorganic chemistry. Because the journal’s purpose is to report high-quality work of exceptional novelty (work of significant interest to a wide readership), it has strong interdisciplinary relevance. It is developed by The Chinese Chemical Society and Peking University.
  • Organic Chemistry Frontiers is our home for research from across organic chemistry. Its emphasis is placed on studies that make significant contributions to the field of organic chemistry by reporting either new or significantly improved protocols or methodologies. It is developed by The Chinese Chemical Society and Shanghai Institute of Organic Chemistry.
  • New to the series: Materials Chemistry Frontiers focuses on the synthesis and chemistry of exciting new materials, and the development of improved fabrication techniques. Announced last year, the journal is free to access until the end of 2018. It is developed by The Chinese Chemical Society and the Institute of Chemistry, Chinese Academy of Sciences.

If you are interested in gaining access to the Historical Collection or any of the Frontiers journals, please contact sales@rsc.org for more information.

 

Springer Chemistry News

Springer launches new platforms for open access portal and journals.

As part of the ongoing developments to improve the stability and flexibility of our systems, Springer has been working on a project to redesign and redevelop the open access SpringerOpen portal and journal websites. We are pleased to announce that the new SpringerOpen portal is now live with journal website migration in process. Take a look at an example of our new journal websites at http://jcheminf.springeropen.com.

The new websites are designed to offer a better experience for users while reading and to provide more effective community engagement. The benefits of the new journal websites include:

  • New global, open and transparent design, ensuring journals are identified as part of the SpringerOpen stable of reputable and trustworthy journals
  • Faster updates using a new content management system
  • Improved performance, accessibility and standardization of technology enabling journals to grow and evolve

As part of the migration of journals to the new website platform, we have also changed the URL structure for SpringerOpen journals. The new structure ({journal URL}.springeropen.com) will ensure improved website availability globally, and improved search engine optimization, making it easier to discover content via search engines like Google. As the new websites and URLs go live, current URLs will continue to work via a redirect, so saved links and promotional activity will not be affected.

Our commitment to permanent accessibility remains unchanged – all articles published by SpringerOpen are deposited with a number of safe open access archives and are registered, with their URLs, with the International DOI Foundation (IDF).

Take a look at the new SpringerOpen portal and explore the journals at http://www.springeropen.com

Charlotte Hollingworth, Editor SpringerOpen

New Co-Editor-in-Chief for Chemistry Central Journal

We are delighted to welcome our new Co-Editor-in-Chief of Chemistry Central Journal, Dr King Kuok (Mimi) Hii from Imperial College London (http://www.ch.ic.ac.uk/mimi/).

Mimi completed her Ph.D. at the University of Leeds under the supervision of Prof. B. L. Shaw, FRS, before moving to the University of Oxford to carry out research on the Heck arylation reaction in the group of Dr. John M. Brown, FRS. She started her independent career back at the University of Leeds, later moving to King’s College, and then to Imperial College London in 2003, where she was promoted to a Readership position in 2009. When we asked Mimi how she would describe her research, she commented, “My key research interest is in the development of catalytic methodologies for atom- and step-efficient synthesis. In my research group at Imperial College London, we adopt a highly collaborative approach to problem solving; our projects are highly interdisciplinary, particularly with engineering and state-of-the-art spectroscopy.”

Mimi also shared her thoughts on how she feels chemistry will be a key area for future scientific developments. She explained, “Compared to other subjects such as physics and biology, synthetic chemistry is often perceived as a less glamorous discipline. However, the ability to make any molecule at will, ‘on demand,’ and on a meaningful timescale will unlock hitherto unimagined opportunities for future scientific advances; for example, in the development of pharmaceuticals, agrochemicals, and other functional molecules and materials.”

As a broad-scope, open access chemistry journal, Chemistry Central Journal presents novel research from all fields of chemistry and the interdisciplinary areas that converge with chemistry. We are therefore very excited to have Mimi join our Editorial Board, since her own work is a perfect example of collaborative research. For all the latest articles visit the Chemistry Central Journal website: http://ccj.springeropen.com/.

Charlotte Hollingworth, Editor SpringerOpen

 

First journal articles of Topics in Current Chemistry online

 

In the last issue of the Chemical Information Bulletin we announced that the book series Topics in Current Chemistry would be relaunched as a journal beginning 2016. The first journal articles are now available online to subscribers. The respective topical collections focus on “Analytical Chemistry for Cultural Heritage” and “Cycloadditions in Bioorthogonal Chemistry.” Once completed, topical collections will also be available as hardcover editions in the series Topics in Current Chemistry Collections.

Further information is available on the journal homepage at www.springer.com/41061 and on the series homepage at www.springer.com/series/14181.

 

Steffen Pauly,
Editorial Director Chemistry www.springer.com/gp/chemistry
Twitter: @Springer_Chem

 

Mestrelab Research – Chemistry Software Solution

Mestrelab Research specializes in the development of software for the processing and analysis of analytical chemistry data and chemical information. Our main product, Mnova, is a multiplatform (Windows, Mac, Linux) and multivendor software suite designed for combined NMR and LC/GC/MS techniques. Our new product, Mbook, is an ELN designed for the synthetic chemist.

 

R&D is the primary focus and heart of our company, with next-generation reprocessing and analysis algorithms developed in-house. This is all accompanied by our customer support, which has been rated by users as excellent.

 

BASIC PLUGINS

NMR: NMR processing, analysis, simulation and reporting at your fingertips.

MS: Processing & analyzing LC/GC/MS data made simple.

Mnova acts as an interface for all our specific plugins. This shared interface and its automation abilities allow our users to minimize their learning curve and optimize workflows by combining data from different techniques in the same application.

ADVANCED NMR PLUGINS

 

NMRP: Prediction of NMR spectra from molecular structure; allows auto-assignments if combined with Mnova NMR.

qNMR: Arrive at optimal concentration or purity values.

RM: Simple, facilitated extraction of spectroscopic and chemical kinetic concentration data.

Verify: Automatic structure verification that really works.

SMA: An open architecture solution to analyze simple mixtures by NMR.

Screen: A state-of-the-art automatic analysis tool for ligand screening NMR data.

DB: A new concept for the shared storage of molecules, NMR and LC/GC/MS analytical data, chemical and metadata.

PhysChem: Next generation algorithms for the prediction of physicochemical properties.

Other products:

 

Mspin: A new multiplatform software tool for the computation of NMR-related molecular properties, starting from 3D molecular structures.

Mbook: The Electronic Lab Notebook designed by chemists for chemists.

Mnova tablet app: This app has been designed to increase your NMR data analysis productivity and flexibility anywhere.

 

 

 

Visit our website: www.mestrelab.com and find out more about our software solutions or contact us for any further assistance.

Thieme Chemistry

Science of Synthesis and Pharmaceutical Substances releases

Thieme Chemistry is happy to announce new versions of their electronic reference works to be released in March 2016.

Science of Synthesis 4.3

The latest version of the unique synthetic methodology tool for the most reliable chemical transformations includes approximately 1,500 pages of new content. Highlights include over 1,000 pages on “Catalytic Transformations via C–H Activation”, edited by Jin-Quan Yu and written by 40 authors. Also included are updated reviews on topics such as the synthesis of silyl hydrides, vinylsilanes, fluoroarenes, chloroarenes, and bromoarenes, the chemistry of hypervalent iodoarenes and aryliodonium salts, as well as arylphosphine oxides and heteroatom derivatives.

A short video introduction shows in an entertaining way how researchers can benefit from using Science of Synthesis: https://www.youtube.com/watch?v=rzDru_VLuaQ.

 

To get access to Science of Synthesis 4.3 or a free trial please visit http://sos.thieme.com.

For more information about Science of Synthesis please visit the website at www.thieme-chemistry.com/sos.

Pharmaceutical Substances 4.0

The new version of Pharmaceutical Substances will be released at the end of March 2016 and will feature a completely redesigned, user-friendly interface. The intuitive and powerful search functionality will give easy access to the syntheses, patents and applications of over 2,600 APIs (active pharmaceutical ingredients), with 20 new APIs being included in this release. Enhanced print and export options and other new features will help users to save time when doing a literature search.

For further information about Pharmaceutical Substances please visit https://www.thieme.de/en/thieme-chemistry/pharmaceutical-substances-54819.htm. Details about version 4.0 will be included upon release.

 

ChemPlanner 1.0.4

Unique predictive software tool helps chemists to plan synthetic routes

Wiley ChemPlanner is a first-of-its-kind tool that integrates computer-aided synthesis design capabilities with reaction mining approaches to offer productivity gains in synthesis planning. Designed for organic chemists in discovery and process development, ChemPlanner is an idea generator that will assist chemists in pharma, fine chemicals, agrochem and other sectors of the life sciences and chemical industries in discovering novel approaches for synthesizing their target molecules, and in identifying the optimal synthetic routes.

ChemPlanner can make creating routes faster and easier. With its combination of predictive reactions and curated information, the tool delivers the best of both worlds: computer-aided synthetic design backed up by millions of empirical reactions.

ChemPlanner, radically, not only gives the chemist existing experimental results, but also predicts reactions that should and do exist but have not yet been captured in the literature. For hundreds of molecules that chemists need to create to get to one drug, the software returns thousands of routes for them to consider.

For the chemist, ChemPlanner delivers increased productivity, time savings, and solutions they may not have thought of. With an ability to select for cost and environmentally cleaner processes, the software provides an all-around additional tool for both the drug discovery workflow and other parts of the chemical industry. For instance, ChemPlanner has the potential to cut time in the drug discovery workflow, with the possibility of increasing throughput and getting new life-saving drugs to patients faster.

Go to www.chemplanner.com to learn more.

Technical Program Listing

ACS Chemical Information Division (CINF)
251st ACS National Meeting, Spring 2016
San Diego, CA (March 13-17, 2016)

CINF Symposia

Elsa Alvaro, Erin Davis, Program Chairs

[Created Fri Feb 19 2016, Subject to Change; Check ACS Online Program for Latest Changes]

CINF: Tomayto vs. Tomahto: Overcoming Incompatibilities in Scientific Data 8:30am - 12:00pm
Sunday, March 13
Room 25B - San Diego Convention Center
David Deng, Organizing
David Deng, Presiding
8:30am-8:35am Introductory Remarks
8:35am-9:05am CINF 1: Relational database file can take us beyond the plain text file format
T O'Donnell, tjo@acm.org

gNova, San Diego, California, United States

9:05am-9:35am CINF 2: Standard JSON molecule, a solution to a cross-vendor molecule file format?

Brian Cole, coleb@eyesopen.com

OpenEye Scientific Software, Santa Fe, New Mexico, United States

9:35am-10:05am CINF 3: Rule-based capture/storage of scientific data from PDF files and export using a generic scientific data model
Stuart Chalk, schalk@unf.edu, Audrey Bartholomew, Bashar Baraz, John Turner

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

10:05am-10:25am Intermission
10:25am-10:55am CINF 4: Building linked-data, large-scale chemistry platform: Challenges, lessons, and solutions
Valery Tkachenko, tkachenkov@rsc.org, Alexey Pshenichnov, Aileen Day, Colin Batchelor, Peter Corbett

Royal Society of Chemistry, Rockville, Maryland, United States

10:55am-11:25am CINF 5: Towards a functional database for enzyme data: STRENDA DB
Carsten Kettner, ckettner@beilstein-institut.de, Martin Hicks

Beilstein Institut, Frankfurt/Main, Germany

11:25am-11:55am CINF 6: Virtues and vicissitudes of curatorial data wrangling: The guide to pharmacology experience
Christopher Southan, cdsouthan@gmail.com

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden

11:55am-12:00pm Concluding Remarks

CINF: From Data to Prediction: Applying Structural Knowledge in Drug Discovery & Development 8:40am - 12:00pm
Sunday, March 13
Room 25A - San Diego Convention Center
Jason Cole, Organizing
Jason Cole, Presiding
8:40am-8:45am Introductory Remarks
8:45am-9:15am CINF 7: Finding better aim at a moving target by exploiting structural data
Marcel Verdonk, marcel.verdonk@astx.com

Astex Pharmaceuticals, Cambridge, United Kingdom

9:15am-9:45am CINF 8: Bridging the dimensions: Seamless integration of 3D structure-based design and 2D structure-activity relationships to guide medicinal chemistry
Marcus Gastreich1, Matthew Segall3, matthew.d.segall@gmail.com, Carsten Detering2, Edmund Champness3, Christian Lemmen1

1 BioSolveIT, Sankt Augustin, Germany; 2 BioSolveIT Inc, Bellevue, Washington, United States; 3 Optibrium Ltd, Cambridge, United Kingdom

9:45am-10:15am CINF 9: Predicting binding affinity doesn't work, or does it?
Christian Lemmen, christian.lemmen@biosolveit.de

BioSolveIT, Sankt Augustin, Germany

10:15am-10:30am Intermission
10:30am-11:00am CINF 10: Structural knowledge by prediction: Crystal structure prediction tests and progress
Colin Groom, groom@ccdc.cam.ac.uk, Jason Cole, Anthony Reilly

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom

11:00am-11:30am CINF 11: Using physicochemical data and predictions in the risk assessment of mutagenic impurities
Susanne Stalford, susanne.stalford@lhasalimited.org

Lhasa Limited, Leeds, United Kingdom

11:30am-12:00pm CINF 12: Profile-QSAR generation 2: Perfection, the enemy of the good?
Valery Polyakov1, valery.polyakov@gmail.com, Eric Martin2, Li Tian1

1 GDC, NIBR, Lafayette, California, United States; 2 Computational Chemistry, Novartis, El Cerrito, California, United States

CINF: Data Mining: Searching Non-covalent Interactions in Chemical Databases 1:00pm - 4:45pm
Sunday, March 13
Room 24C - San Diego Convention Center
Suman Sirimulla, Organizing
Suman Sirimulla, Presiding
Cosponsored by: COMP
1:00pm-1:05pm Introductory Remarks
1:05pm-1:30pm CINF 20: Sigma-hole interactions for rational drug design
Suman Sirimulla, suman.sirimulla@stlcop.edu

Basic Sciences, St. Louis College of Pharmacy, St. Louis, Missouri, United States

1:30pm-1:55pm CINF 21: Deep convolutional neural networks for autonomous discovery of molecular interactions

Abraham Heifets, Izhar Wallach, Michael Dzamba, misko@atomwise.com

Atomwise, Inc., San Francisco, California, United States

1:55pm-2:20pm CINF 22: Crystallographic informatics: Similarity and statistics
Simon Coles2, s.j.coles@soton.ac.uk, Graham Tizzard2, Philip Adler1

1 Chemistry, Haverford College, Haverford, Pennsylvania, United States; 2 University of Southampton, Hampshire, United Kingdom

2:20pm-2:45pm CINF 23: Chemical fragment analysis of halogen bonds in protein binding sites
AhWing Chan, edith.chan@ucl.ac.uk

UCL, London, United Kingdom

2:45pm-3:00pm Intermission
3:00pm-3:25pm CINF 24: Mining interaction data in the Cambridge Structural Database: Getting the rewards and removing the risks!
Jason Cole, cole@ccdc.cam.ac.uk, Peter Wood, Neil Feeder, Robin Taylor, Colin Groom

CCDC, Cambridge, United Kingdom

3:25pm-3:50pm CINF 25: Fast mining of adaptable interaction patterns in protein-ligand interface
Therese Inhester2, inhester@zbh.uni-hamburg.de, Matthias Rarey1

1 University of Hamburg, Hamburg, Germany; 2 Center for Bioinformatics, University of Hamburg, Hamburg, Germany

3:50pm-4:15pm CINF 26: Dual nature of a halogen atom
Mahesh Narayan, mnarayan@utep.edu

Chemistry, University of Texas at El Paso, El Paso, Texas, United States

4:15pm-4:40pm CINF 27: Crystal clear: Using statistical descriptions and analysis to understand crystallisation
Philip Adler2, padler1@haverford.edu, Simon Coles4, Alex Norquist1, Joshua Schrier2, Dave Woods4, Sorelle Friedler1, Lucy Mapp3

1 Haverford College, Bryn Mawr, Pennsylvania, United States; 2 Chemistry, Haverford College, Haverford, Pennsylvania, United States; 3 Chemistry, University of Southampton, Southampton, United Kingdom; 4 University of Southampton, Hampshire, United Kingdom

4:40pm-4:45pm Concluding Remarks

CINF: Global Initiatives in Research Data Management & Discovery 1:00pm - 5:00pm
Sunday, March 13
Room 25B - San Diego Convention Center
Ian Bruno, Leah McEwen, Organizing
Ian Bruno, Presiding
Cosponsored by: ANYL, COMP, MEDI and PHYS
1:00pm-1:15pm Introductory Remarks
1:15pm-1:45pm CINF 13: Open data is not enough: A look at the Research Data Alliance

Mark Parsons, parsom3@rpi.edu

Research Data Alliance, Boulder, Colorado, United States

1:45pm-2:15pm CINF 14: Responses to the data revolution: CODATA on policy, data science, and capacity building
Simon Hodson1, John Rumble2, jumbleusa@earthlink.net

1 CODATA, Paris, France; 2 R&R Data Services, Gaithersburg, Maryland, United States

2:15pm-2:45pm CINF 15: Moving research forward with persistent identifiers and services
Patricia Cruse, patricia.cruse@datacite.org

DataCite, Berkeley, California, United States

Abstract

2:45pm-3:15pm CINF 16: Discoverability and reusability of FAIR chemistry research data as a key outcome of registering persistent identifiers and standardised metadata with DataCite
Henry Rzepa1, rzepa@ic.ac.uk, Matthew Harvey2, Andrew Mclean3

1 Chemistry, Imperial College London, London, United Kingdom; 2 HPC division, Imperial College London, London, United Kingdom; 3 ICT Division, Imperial College London, London, United Kingdom

Abstract

3:15pm-3:30pm Intermission
3:30pm-4:00pm CINF 17: Surveying and tracking the biomedical data landscape
Maryann Martone, mmartone@ucsd.edu

Neurosciences, University of California, San Diego, San Diego, California, United States

Abstract

4:00pm-4:30pm CINF 18: Data Observation Network for Earth: Earth and environmental science data management and discovery
Amber Budden1, aebudden@dataone.unm.edu, William Michener1, Dave Vieglais2, Rebecca Koskela1, Heather Soyka1

1 University of New Mexico, Albuquerque, New Mexico, United States; 2 University of Kansas, Lawrence, Kansas, United States

Abstract

4:30pm-5:00pm CINF 19: California Digital Library: Advancing the digital transition of scholarly information
John Chodacki, John.Chodacki@ucop.edu

California Digital Library, University of California, Oakland, California, United States

Abstract

CINF: From Data to Prediction: Applying Structural Knowledge in Drug Discovery & Development 1:30pm - 4:50pm
Sunday, March 13
Room 25A - San Diego Convention Center
Jason Cole, Organizing
Jason Cole, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:05pm CINF 28: Towards a fully automated creation of large protein structure ensembles
Stefan Bietz, Matthias Rarey, rarey@zbh.uni-hamburg.de

University of Hamburg, Hamburg, Germany

Abstract

2:05pm-2:35pm CINF 29: On our way to the automated search for ligand-sensing cores

Tobias Brinkjost1,2, tobias.brinkjost@tu-dortmund.de, Christiane Ehrt2, Petra Mutzel1, Oliver Koch2

1 Faculty of computer science, TU Dortmund University, Dortmund, Germany; 2 Faculty of chemistry and chemical biology, TU Dortmund University, Dortmund, Germany

Abstract

2:35pm-3:05pm CINF 30: Deep learning in the 3rd dimension: Structure-based bioactivity prediction on novel targets
Abraham Heifets, abe@atomwise.com, Izhar Wallach, Michael Dzamba

Atomwise, Inc., San Francisco, California, United States

Abstract

3:05pm-3:20pm Intermission
3:20pm-3:50pm CINF 31: CDD vision: Advanced analytics, calculations, and visualization live in CDD vault
Barry Bunin, bbunin@hotmail.com

CDD, Belmont, California, United States

Abstract

3:50pm-4:20pm CINF 32: Advances in data provisioning

Marian Brodney1, marian.d.brodney@pfizer.com, Jacquelyn Klug-McLeod2, Gregory Bakken2, Robert Stanton1

1 Computational Sciences Center of Excellence, Pfizer, Cambridge, Massachusetts, United States; 2 Computational Sciences Center of Excellence, Pfizer, Groton, Connecticut, United States

Abstract

4:20pm-4:50pm CINF 33: Chemical information on the web: Find and be found

Asta Gindulyte, mandroji@yahoo.com

National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, Maryland, United States

Abstract

CINF: CINF Scholarships for Scientific Excellence: Student Poster Competition 6:30pm - 8:30pm
Sunday, March 13
Room 3 - San Diego Convention Center
6:30pm-8:30pm CINF 34: Quantifying the effect that chemical environment exerts upon changes in property in matched molecular pairs analysis
Iva Lukac1, i.lukac@2013.ljmu.ac.uk, Andrew Leach1,3, Edward Griffen3, Alexander Dossetter2

1 School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, United Kingdom; 2 MedChemica Limited, Macclesfield, United Kingdom; 3 Medchemica Ltd, Macclesfield, United Kingdom

Abstract

6:30pm-8:30pm CINF 35: CSNAP: A new chemoinformatics approach for target identification using chemical similarity networks
Yu-Chen Lo1, bennylo@ucla.edu, Silvia Senese1, Chien-Ming Li3, Qiyang Hu2, Yong Huang3, Robert Damoiseaux4, Jorge Torres1

1 Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, United States; 2 Institute for Digital Research and Education, University of California, Los Angeles, Los Angeles, California, United States; 3 Drug Study Units, University of California, San Francisco, San Francisco, California, United States; 4 Molecular Shared Screening Resource, University of California, Los Angeles, Los Angeles, California, United States

Abstract

6:30pm-8:30pm CINF 36: Prediction and quantification of cation-π interactions in ligand-bromodomain binding: Using quantum chemistry to capture electronic effects
Wilian Augusto Cortopassi, wilian.cortopassi@chem.ox.ac.uk, Robert Paton

Chemistry Research Laboratory, University of Oxford, Oxford, United Kingdom

Abstract

6:30pm-8:30pm CINF 37: 3Dmol.js: Chemical structure visualization for the modern web
Jasmine Collins1, jlc206@pitt.edu, Matthew Ragoza3, Justin Jensen4, David Koes2

1 Computer Science/Neuroscience, University of Pittsburgh, Pittsburgh, Pennsylvania, United States; 2 Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States; 3 University of Pittsburgh, Pittsburgh, Pennsylvania, United States; 4 Pittsburgh Science & Technology Academy, Pittsburgh, Pennsylvania, United States

Abstract

6:30pm-8:30pm CINF 38: General purpose 2D and 3D similarity approach to identify hERG blockers
Patric Schyman, pschyman@bhsai.org, Ruifeng Liu, Anders Wallqvist

DoD Biotechnology High Performance Computing Software Applications Institute, Frederick, Maryland, United States

Abstract

6:30pm-8:30pm CINF 39: Indexing techniques and algorithms to efficiently mine interaction patterns in large sets of protein-ligand-complexes
Therese Inhester2, inhester@zbh.uni-hamburg.de, Matthias Rarey1

1 University of Hamburg, Hamburg, Germany; 2 Center for Bioinformatics, University of Hamburg, Hamburg, Germany

Abstract

6:30pm-8:30pm CINF 40: Development and application of multiclass QSAR models for predicting human skin sensitization
Vinicius Alves3,2, viniciusm.alves@gmail.com, Alexey Zakharov1, Eugene Muratov3, Denis Fourches5, Nicole Kleinstreuer4, Judy Strickland4, Carolina Andrade2, Alexander Tropsha3

1 CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States; 2 Faculty of Pharmacy, Federal University of Goias, Goiania, Goias, Brazil; 3 UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States; 4 Contractor supporting the NTP Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), ILS, Inc., Research Triangle Park, North Carolina, United States; 5 Department of Chemistry and Bioinformatics Research Center, North Carolina State University, Chapel Hill, North Carolina, United States

Abstract

6:30pm-8:30pm CINF 41: Virtual screening in the cloud computing environment
Aaron Cooper1, aaron.cooper@stlcop.edu, Mathew Koebel3, Grant Schmadeke1, Suman Sirimulla2

1 Basic Sciences, St. Louis College of Pharmacy, St. Louis, Missouri, United States; 2 Basic Sciences, St. Louis College of Pharmacy, St. Louis, Missouri, United States

Abstract

6:30pm-8:30pm CINF 42: Structural evolution of Tcn (n = 4–20) clusters from first-principles global minimization
Chad Priest1, cprie003@ucr.edu, De-en Jiang2

1 Chemistry, University of California, Riverside, Riverside, California, United States; 2 Department of Chemistry, University of California, Riverside, Riverside, California, United States

Abstract

CINF: Beyond Digitized Paper: The Next Generation of ELNs 8:15am - 12:00pm
Monday, March 14
Room 24C - San Diego Convention Center
Erin Davis, David Deng, Organizing
Erin Davis, David Deng, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 50: Toward semantic representation of science in electronic laboratory notebooks (ELNs)
Stuart Chalk, schalk@unf.edu

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

Abstract

8:45am-9:10am CINF 51: New cloud-based ELN with built-in raw analytical data support and automatic structure confirmation capabilities
Santiago Dominguez Vivero1, sdominguez@mestrelab.com, Juan Cobas Gomez1, Santiago Fraga Castro1, Francisco Javier Sardina2

1 Mestrelab Research SL, Hereford, Herefordshire, United Kingdom; 2 Chemistry, University of Santiago de Compostela, Santiago De Compostela, A Coruña, Spain

Abstract

9:10am-9:35am CINF 52: Mobile interfaces for a digital research notebook
Jeremy Frey2, j.g.frey@soton.ac.uk, Cerys Willoughby2, Simon Coles1, Richard Whitby3, Colin Bird2

1 University of Southampton, Hampshire, United Kingdom; 2 University of Southampton, Southampton, United Kingdom; 3 University of Southampton, Southampton, Hants, United Kingdom

Abstract

9:35am-10:00am CINF 53: Not just another reaction database
Aileen Day2, Valery Tkachenko2, tkachenkov@rsc.org, Alexey Pshenichnov2, Leah McEwen1, Simon Coles3, Richard Whitby3

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Royal Society of Chemistry, Rockville, Maryland, United States; 3 University of Southampton, Hampshire, United Kingdom

Abstract

10:00am-10:15am Intermission
10:15am-10:40am CINF 54: Directly upload data from an ELN into PubChem
Ben Shoemaker, shoemake@mail.nih.gov, Asta Gindulyte, Evan Bolton, Steve Bryant

NCBI / NLM / NIH, Bethesda, Maryland, United States

Abstract

10:40am-11:05am CINF 55: Intuitive collaboration platform: A Scilligence story
Rajeev Hotchandani1, hotchandani@yahoo.com, Jinbo Lee2

1 Scilligence, Watertown, Massachusetts, United States; 2 Scilligence Corporation, Burlington, Massachusetts, United States

Abstract

11:05am-11:30am CINF 56: ACAS LIMS simplifies diverse data loading, management, and querying
John McNeil, john@mcneilco.com, Guy Oshiro, guy@mcneilco.com, Brian Fielder, bfielder@hmc.edu, Eva Gao, Samuel Meyer, Brian Bolt, Fiona McNeil, Matthew Shaw, Kelley Carr

John McNeil & Co., San Diego, California, United States

Abstract

11:30am-11:55am CINF 57: ChemEngine: An automated chemical data harvesting tool for molecular inventory and chemical computing from scientific literature

Muthukumarasamy Karthikeyan1, karthincl@gmail.com, Renu Vyas2

1 Digital Information Resource Centre, CSIR National Chemical Laboratory, Pune, India; 2 Chemical Engineering and Process Development, CSIR-National Chemical Laboratory, Pune, MH, India

Abstract

11:55am-12:00pm Concluding Remarks
CINF: Global Initiatives in Research Data Management & Discovery 8:15am - 11:55am
Monday, March 14
Room 25B - San Diego Convention Center
Ian Bruno, Leah McEwen, Organizing
Leah McEwen, Presiding
Cosponsored by: ANYL, COMP, MEDI and PHYS
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 43: PubChem BioAssay: A decade’s practice for managing chemistry research data
Yanli Wang, ywang@ncbi.nlm.nih.gov

NCBI, NLM, NIH, Building 38A, Room 5S506, 8600 Rockville Pike, Bethesda, Maryland, United States

Abstract

8:45am-9:15am CINF 44: Data infrastructural design for informing critical evaluation
Kenneth Kroenlein, kenneth.kroenlein@nist.gov

Thermodynamics Research Center, National Institute of Standards and Technology, Boulder, Colorado, United States

Abstract

9:15am-9:40am CINF 45: Community-driven disciplinary data repositories: A case study
Ian Bruno, bruno@ccdc.cam.ac.uk, Colin Groom

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom

Abstract

9:40am-10:10am CINF 46: ICSU World Data System: Trusted data services for global science
Mustapha Mokrane1, Jean-Bernard Minster2, jbminster@ucsd.edu, Rorie Edmunds1

1 International Programme Office, ICSU World Data System, Koganei, Tokyo, Japan; 2 Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, La Jolla, California, United States

Abstract

10:10am-10:25am Intermission
10:25am-10:55am CINF 47: STRENDA and MIRAGE: Examples of community-based data reporting standardization initiatives
Martin Hicks, mhicks@beilstein-institut.de, Carsten Kettner, ckettner@beilstein-institut.de

Beilstein Institut, Frankfurt, Germany

Abstract

10:55am-11:25am CINF 48: Standardizing the description of nanomaterials: The CODATA uniform description system
John Rumble1, jumbleusa@earthlink.net, Steven Freiman2, Clayton Teague3

1 R&R Data Services, Gaithersburg, Maryland, United States; 2 Freiman Consulting, Potomac, Maryland, United States; 3 Teague Consulting, Gaithersburg, Maryland, United States

Abstract

11:25am-11:55am CINF 49: Scientific units in the electronic age
Stuart Chalk, schalk@unf.edu

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

Abstract

CINF: Informatics & Quantum Mechanics: Combining Big Data & DFT in Pharma & Materials 8:40am - 12:00pm
Monday, March 14
Room 25A - San Diego Convention Center
Art Cho, Organizing
Art Cho, Presiding
8:40am-8:45am Introductory Remarks
8:45am-9:15am CINF 58: Screening of materials for energy applications based on transport properties: Methods and data automation tools

Boris Kozinsky, bkoz37@gmail.com

Bosch Research, Waban, Massachusetts, United States

Abstract

9:15am-9:45am CINF 59: High-throughput chemical simulations and virtual screening for materials discovery
Mathew Halls, mhalls@mhalls.com, David Giesen, Thomas Hughes, Shaun Kwak, Thomas Mustard, Jacob Gavartin, Alexander Goldberg, Yixiang Cao

Schrodinger Inc., San Diego, California, United States

Abstract

9:45am-10:15am CINF 60: Machine learning and high-throughput quantum chemistry methods for the discovery of organic materials
Alan Aspuru-Guzik, aspuru@chemistry.harvard.edu

Harvard University, Cambridge, Massachusetts, United States

Abstract

10:15am-10:30am Intermission
10:30am-11:00am CINF 61: Using drug discovery methods to accelerate the search for better battery materials
Joshua Schrier, jschrier@haverford.edu

Chemistry, Haverford College, Haverford, Pennsylvania, United States

Abstract

11:00am-11:30am CINF 62: Combining density functional theory with cheminformatics for development of a new-paradigm ligand screening method in computational drug discovery
Art Cho1,2, artcho@korea.ac.kr

1 Korea University, Seoul, Korea (the Republic of); 2 Quantum Bio Solutions, Seoul, Korea (the Republic of)

Abstract

11:30am-12:00pm CINF 63: Discovery through deterministic optimization: Navigating chemical space for effective material design

Jennifer Elward, jen.elward@gmail.com, Christopher Rinderspacher

Army Research Laboratory, Aberdeen Proving Ground, Maryland, United States

Abstract

CINF: Chemical Information for Small Businesses & Startups 1:00pm - 4:55pm
Monday, March 14
Room 24C - San Diego Convention Center
Edlyn Simmons, Organizing
Edlyn Simmons, Presiding
Cosponsored by: CPRM and SCHB
1:00pm-1:15pm Introductory Remarks
1:15pm-1:40pm CINF 72: Building a business with and without scientific computing: The five W's and one H
Steven Muskal, smuskal@eidogen-sertanty.com

Suite 103-475, Eidogen, Oceanside, California, United States

Abstract

1:40pm-2:05pm CINF 73: Interactive cheminformatics for occasional use in SMEs
Therese Inhester1, inhester@zbh.uni-hamburg.de, Matthias Hilbig3, Matthias Rarey2

1 Center for Bioinformatics, University of Hamburg, Hamburg, Germany; 2 University of Hamburg, Hamburg, Germany

Abstract

2:05pm-2:30pm CINF 74: Playing by the rules: Knowing what applies and what information you have to maintain regarding your chemical inventory
Frankie Wood-Black, fwblack@cableone.net

Ag., Science and Engineering, Northern Oklahoma College, Ponca City, Oklahoma, United States

Abstract

2:30pm-2:55pm CINF 75: ChemSpider: Search and share chemistry… for free
Serin Dabb, dabbs@rsc.org

The Royal Society of Chemistry, Cambridge, United Kingdom

Abstract

2:55pm-3:10pm Intermission
3:10pm-3:35pm CINF 76: What chemists and other scientists need to know about their duty of disclosure under the new law governing the patenting process in the US
Xavier Pillai, xpillai@leydig.com

Leydig Voit Mayer Ltd, Chicago, Illinois, United States

Abstract

3:35pm-4:00pm CINF 77: Monitoring the minnows: Using IP information to understand what small businesses are doing
Stephen Adams, stephen.adams@magister.co.uk

Magister Ltd, Roche, Cornwall, United Kingdom

Abstract

4:00pm-4:25pm CINF 78: Patent information in PubChem for small businesses and startups
Sunghwan Kim, kimsungh@ncbi.nlm.nih.gov, Paul Thiessen, Evan Bolton, Steve Bryant

National Library of Medicine, National Institutes of Health, Rockville, Maryland, United States

Abstract

4:25pm-4:50pm CINF 79: Open patent chemistry “big bang” presents large opportunities for small enterprises
Christopher Southan, cdsouthan@gmail.com

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden

Abstract

4:50pm-4:55pm Concluding Remarks
CINF: Global Initiatives in Research Data Management & Discovery 1:00pm - 5:00pm
Monday, March 14
Room 25B - San Diego Convention Center
Ian Bruno, Leah McEwen, Organizing
Ian Bruno, Leah McEwen, Presiding
Cosponsored by: ANYL, COMP, MEDI and PHYS
1:00pm-1:05pm Introductory Remarks
1:05pm-1:35pm CINF 64: Authoring tools to automate data sharing in scientific publishing
John Kitchin, jkitchin@andrew.cmu.edu

Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States

Abstract

1:35pm-2:00pm CINF 65: Facilitating the inclusion of analytical raw data in the submission and review process
Santiago Dominguez Vivero1, sdominguez@mestrelab.com, Juan Cobas Gomez1, Felipe Seoane1, Jose Garcia Pulido1, Agustin Barba1, Jesus Varela Carrete2

1 Mestrelab Research SL, Hereford, Herefordshire, United Kingdom; 2 Chemistry, University of Santiago de Compostela, Santiago de Compostela, A Coruña, Spain

Abstract

2:00pm-2:30pm CINF 66: Crystallography: A domain exemplar for chemistry data management
Simon Coles, s.j.coles@soton.ac.uk

University of Southampton, Hampshire, United Kingdom

Abstract

2:30pm-2:55pm CINF 67: Are data management solutions developed for commercial organizations suitable for academic research?
Mariana Vaschetto, mariana.vaschetto@dotmatics.com, Tom Oldfield, Michael Hartshorn

Dotmatics, Bishops Stortford, United Kingdom

Abstract

2:55pm-3:10pm Intermission
3:10pm-3:30pm CINF 68: Data sharing in life sciences R&D: Pre-competitive collaboration through the Pistoia Alliance
Carmen Nitsche, cnitsche@swbell.net

Pistoia Alliance, San Antonio, Texas, United States

Abstract

3:30pm-3:50pm CINF 69: The Royal Society of Chemistry and the data publication landscape
Serin Dabb, dabbs@rsc.org

The Royal Society of Chemistry, Cambridge, United Kingdom

Abstract

3:50pm-4:10pm CINF 70: Digital IUPAC: The need for global representation of chemistry and chemical information in the digital age
Jeremy Frey, j.g.frey@soton.ac.uk

University of Southampton, Southampton, United Kingdom

Abstract

4:10pm-4:30pm CINF 71: DIG chemistry: Establishing a research data interest group to address the many faces of chemical data management
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States

Abstract

4:30pm-5:00pm Panel Discussion
CINF: Informatics & Quantum Mechanics: Combining Big Data & DFT in Pharma & Materials 1:30pm - 4:45pm
Monday, March 14
Room 25A - San Diego Convention Center
Art Cho, Organizing
Art Cho, Presiding
1:30pm-2:00pm CINF 80: In silico, high-throughput screening of non-fullerene acceptor materials for applications of organic photovoltaic devices: A Harvard clean energy project study
Steven Lopez, stevenlopez0209@gmail.com, Edward Pyzer-Knapp, Alan Aspuru-Guzik

Harvard University, Cambridge, Massachusetts, United States

Abstract

2:00pm-2:30pm CINF 81: Regioselectivity prediction of metabolic reactions based on ab initio derived descriptors

Arndt Finkelmann2, arndt.finkelmann@pharma.ethz.ch, Andreas Göller1, Gisbert Schneider2

1 Global Drug Discovery, Bayer Pharma AG, Wuppertal, Germany; 2 Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

Abstract

2:30pm-3:00pm CINF 82: COSMO-based approach for the design of solvents to optimize reaction rates
Nicholas Austin1, nick.austin111@gmail.com, Nikolaos Sahinidis2, Daniel Trahan3

1 Chemical Engineering, Carnegie Mellon University, Bowling Green, Kentucky, United States; 2 Dept Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States; 3 The Dow Chemical Company, Freeport, Texas, United States

Abstract

3:00pm-3:15pm Intermission
3:15pm-3:45pm CINF 83: Efficient, first-principles-based screening for high-charge carrier mobility in organic crystals
Christoph Schober, christoph.schober@ch.tum.de, Karsten Reuter, Harald Oberhofer

Chair of Theoretical Chemistry, Technical University Munich, Garching, Germany

Abstract

3:45pm-4:15pm CINF 84: Data-driven chemistry: From small molecules to discovery of new functional materials
Olexandr Isayev2, olexandr@olexandrisayev.com, Alexander Tropsha1

1 Univ of North Carolina, Chapel Hill, North Carolina, United States; 2 UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States

Abstract

4:15pm-4:45pm CINF 85: Multi-agent approach for molecular modeling in chemical vapor deposition
Luke Achenie, achenie@vt.edu

Virginia Tech, Blacksburg, Virginia, United States

Abstract

CINF: Sci-Mix 8:00pm - 10:00pm
Monday, March 14
Hall D/E - San Diego Convention Center
8:00pm-10:00pm CINF 105: Supporting openness and reproducibility in scientific research: The Center for Open Science

Sara Bowman, sed8n@virginia.edu

Center for Open Science, Charlottesville, Virginia, United States

8:00pm-10:00pm CINF 110: Building a better materials science database: Challenges and opportunities

Robin Padilla, robin.padilla@springer.com, Michael Klinge, michael.klinge@springer.com

Corporate Markets & Databases, Springer Nature, Heidelberg, Germany

8:00pm-10:00pm CINF 116: Competitive intelligence workbench: Getting access to information for decision making

Huijun Wang, huijun.wang@merck.com

Merck, Kenilworth, New Jersey, United States

8:00pm-10:00pm CINF 117: Using systems biology in computational drug design workflows

George Nicola, george.nicola@outlook.com, Bruce Kovacs

Afecta Pharmaceuticals, Irvine, California, United States

8:00pm-10:00pm CINF 131: Comparative toxicogenomics database: Advancing understanding of molecular connections among chemicals, genes, and diseases

Cynthia Grondin, cjgrondin@ncsu.edu, Allan Davis, Thomas Weigers, Carolyn Mattingly

Biology, North Carolina State University, Raleigh, North Carolina, United States

8:00pm-10:00pm CINF 139: Enhanced chemical understanding through 3D-printed models

Amy Sarjeant1, sarjeant@ccdc.cam.ac.uk, Peter Wood4, Ian Bruno1, Ye Li2, Vincent Scalfani3, Shawn O'Grady2

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 University of Michigan, Ann Arbor, Michigan, United States; 3 University Libraries, University of Alabama, Tuscaloosa, Alabama, United States; 4 CCDC, Cambridge, United Kingdom

8:00pm-10:00pm CINF 13: Open data is not enough: A look at the Research Data Alliance

Mark Parsons, parsom3@rpi.edu

Research Data Alliance, Boulder, Colorado, United States

8:00pm-10:00pm CINF 143: Chemical knowledge representation and access in Wolfram|Alpha and Mathematica

Eric Weisstein, eww@wolfram.com

Scientific Content, Wolfram|Alpha, Champaign, Illinois, United States

8:00pm-10:00pm CINF 147: Leveraging the VIVO research networking system to facilitate collaboration and data visualization

Michaeleen Trimarchi, Danielle Bodrero Hoggan, danielle@scripps.edu

Kresge Library, The Scripps Research Institute, La Jolla, California, United States

8:00pm-10:00pm CINF 165: Predicting drug-induced hepatic systems' toxicity by integrating transporter interaction profiles

Eleni Kotsampasakou, eleni.kotsampasakou@univie.ac.at, Gerhard Ecker

Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria

8:00pm-10:00pm CINF 21: Deep convolutional neural networks for autonomous discovery of molecular interactions

Abraham Heifets, Izhar Wallach, Michael Dzamba, misko@atomwise.com

Atomwise, Inc., San Francisco, California, United States

8:00pm-10:00pm CINF 29: On our way to the automated search for ligand-sensing cores

Tobias Brinkjost1,2, tobias.brinkjost@tu-dortmund.de, Christiane Ehrt2, Petra Mutzel1, Oliver Koch2

1 Faculty of computer science, TU Dortmund University, Dortmund, Germany; 2 Faculty of chemistry and chemical biology, TU Dortmund University, Dortmund, Germany

8:00pm-10:00pm CINF 2: Standard JSON molecule, a solution to a cross-vendor molecule file format?

Brian Cole, coleb@eyesopen.com

OpenEye Scientific Software, Santa Fe, New Mexico, United States

8:00pm-10:00pm CINF 32: Advances in data provisioning

Marian Brodney1, marian.d.brodney@pfizer.com, Jacquelyn Klug-McLeod2, Gregory Bakken2, Robert Stanton1

1 Computational Sciences Center of Excellence, Pfizer, Cambridge, Massachusetts, United States; 2 Computational Sciences Center of Excellence, Pfizer, Groton, Connecticut, United States

8:00pm-10:00pm CINF 33: Chemical information on the web: Find and be found

Asta Gindulyte, mandroji@yahoo.com

National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, Maryland, United States

8:00pm-10:00pm CINF 57: ChemEngine: An automated chemical data harvesting tool for molecular inventory and chemical computing from scientific literature

Muthukumarasamy Karthikeyan1, karthincl@gmail.com, Renu Vyas2

1 Digital Information Resource Centre, CSIR National Chemical Laboratory, Pune, India; 2 Chemical Engineering and Process Development, CSIR-National Chemical Laboratory, Pune, MH, India

8:00pm-10:00pm CINF 58: Screening of materials for energy applications based on transport properties: Methods and data automation tools

Boris Kozinsky, bkoz37@gmail.com

Bosch Research, Waban, Massachusetts, United States

8:00pm-10:00pm CINF 63: Discovery through deterministic optimization: Navigating chemical space for effective material design

Jennifer Elward, jen.elward@gmail.com, Christopher Rinderspacher

Army Research Laboratory, Aberdeen Proving Ground, Maryland, United States

8:00pm-10:00pm CINF 81: Regioselectivity prediction of metabolic reactions based on ab initio derived descriptors

Arndt Finkelmann2, arndt.finkelmann@pharma.ethz.ch, Andreas Göller1, Gisbert Schneider2

1 Global Drug Discovery, Bayer Pharma AG, Wuppertal, Germany; 2 Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

8:00pm-10:00pm CINF 99: Applications of drug-target data in translating genomic variation into drug discovery opportunities

Anna Gaulton, agaulton@ebi.ac.uk

Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 8:15am - 11:55am
Tuesday, March 15
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 86: Towards knowledge representation improvements in chemistry
Evan Bolton, evan.e.bolton@gmail.com

NCBI / NLM / NIH, Warrenton, Virginia, United States

Abstract

8:45am-9:10am CINF 87: Chemical classifications for biology and medicine
Minoru Kanehisa, kanehisa@kuicr.kyoto-u.ac.jp

Institute for Chemical Research, Kyoto University, Uji Kyoto, Japan

Abstract

9:10am-9:35am CINF 88: Withdrawn
9:35am-10:00am CINF 89: ChEBI database and ontology: A key resource for chemical biology and metabolomics
Gareth Owen, gowen@ebi.ac.uk

EMBL-EBI, Ely, United Kingdom

Abstract

10:00am-10:15am Intermission
10:15am-10:40am CINF 90: Classifying chemistry: Current efforts in Canada
David Wishart, dwishart@ualberta.ca

Biological Sciences, University of Alberta, Edmonton, Alberta, Canada

Abstract

10:40am-11:05am CINF 91: Classifying compounds in public databases
Lutz Weber, lutz.weber@ontochem.com

IT, OntoChem, Germering, Germany

Abstract

11:05am-11:30am CINF 92: Automated structural and functional annotation of small molecules using integrated chemical ontologies: ClassyFire, ChemOnt, and downstream applications
Yannick Djoumbou Feunang, djoumbou@ualberta.ca

Biological Sciences, University of Alberta, Edmonton, Alberta, Canada

Abstract

11:30am-11:55am CINF 93: Evaluation of machine-generated chemical ontologies for molecular information
Stephen Boyer, skboyer@gmail.com, Thomas Griffin, Eric Louie

IBM Research, San Jose, California, United States

Abstract

CINF: Driving Change: Impact of Funders on the Research Data & Publications Landscape 8:35am - 12:00pm
Tuesday, March 15
Room 25A - San Diego Convention Center
Elsa Alvaro, Andrea Twiss-Brooks, Organizing
Elsa Alvaro, Presiding
Cosponsored by: MEDI and ORGN
8:35am-8:40am Introductory Remarks
8:40am-8:50am Update on NSF MPS Open Data Policies
8:50am-9:15am CINF 100: NIH public access policy
Neil Thakur, thakurn@od.nih.gov

NIH, Rockville, Maryland, United States

Abstract

9:15am-9:40am CINF 101: U.S. Department of Energy public access plan
Laura Biven, laura.biven@science.doe.gov

US Department of Energy, Washington, D.C., District of Columbia, United States

Abstract

9:40am-10:05am CINF 102: Helping authors and funders achieve open access goals at ACS Publications
Darla Henderson, D_Henderson@acs.org

Publications Division, American Chemical Society, Washington, District of Columbia, United States

Abstract

10:05am-10:30am CINF 103: Libraries at the hub as the federally funded research wheel turns to open
Shannon Kipphut-Smith1, sk60@rice.edu, Betty Rozum2, betty.rozum@usu.edu, Becky Thoms3, becky.thoms@usu.edu

1 Rice University, Houston, Texas, United States; 2 Utah State University, Logan, Utah, United States

Abstract

10:30am-10:45am Intermission
10:45am-11:10am CINF 104: SHARE phase II: Enhancing the dataset and engaging the community
Judy Ruttenberg, judy@arl.org

Association of Research Libraries, Washington, District of Columbia, United States

Abstract

11:10am-11:35am CINF 105: Supporting openness and reproducibility in scientific research: The Center for Open Science

Sara Bowman, sed8n@virginia.edu

Center for Open Science, Charlottesville, Virginia, United States

Abstract

11:35am-12:00pm CINF 106: Impact of open publishing: Scalability, sustainability, and success
Ann Gabriel, a.gabriel@elsevier.com

Elsevier, New York, New York, United States

Abstract

CINF: Linking Big Data with Chemistry: Databases Connecting Genomics, Biological Pathways & Targets to Chemistry 9:30am - 11:50am
Tuesday, March 15
Room 24C - San Diego Convention Center
Rachelle Bienstock, Organizing
Rachelle Bienstock, Presiding
9:30am-9:35am Introductory Remarks
9:35am-9:55am CINF 94: Connecting 3D chemical data with biological information
Ian Bruno, bruno@ccdc.cam.ac.uk, Suzanna Ward, Elizabeth Thomas, Colin Groom

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom

Abstract

9:55am-10:15am CINF 95: PubChem BioAssay: Link chemical research to GenBank and beyond
Yanli Wang, ywang@ncbi.nlm.nih.gov

Building 38a, Room 5s506, Bethesda, Maryland, United States

Abstract

10:15am-10:35am CINF 96: Withdrawn
10:35am-10:50am Intermission
10:50am-11:10am CINF 97: Predicting adverse drug events using literature-based pathway analysis
James Rinker, j.rinker@elsevier.com, Timothy Hoctor

R & D Solutions, Elsevier Inc., Philadelphia, Pennsylvania, United States

Abstract

11:10am-11:30am CINF 98: Intersecting different databases to define the inner and outer limits of the data-supported druggable proteome
Christopher Southan, cdsouthan@gmail.com

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden

Abstract

11:30am-11:50am CINF 99: Applications of drug-target data in translating genomic variation into drug discovery opportunities

Anna Gaulton, agaulton@ebi.ac.uk

Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom

Abstract

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 1:30pm - 4:45pm
Tuesday, March 15
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 107: Representing the chemistry of 800,000 crystal structures
Suzanna Ward, ward@ccdc.cam.ac.uk, Ian Bruno, Colin Groom

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom

Abstract

2:00pm-2:25pm CINF 108: CHEMnetBASE and beyond: CRC handbooks and dictionaries in today's world
Fiona Macdonald1, fiona.macdonald@taylorandfrancis.com, Megan Eisenbraun2

1 Taylor and Francis, Boca Raton, Florida, United States; 2 Taylor & Francis, London, United Kingdom

Abstract

2:25pm-2:50pm CINF 109: Collection, curation, and communication of thermophysical and thermochemical property data at the NIST Thermodynamics Research Center
Andrei Kazakov1, andrei.kazakov@nist.gov, Robert Chirico3, Chris Muzny4, Vladimir Diky5, Eugene Paulechka1, Ala Bazyleva1, Joseph Magee2, Scott Townsend1, Kenneth Kroenlein2

1 NIST, Boulder, Colorado, United States; 2 Thermodynamics Research Center, National Institute of Standards and Technology, Boulder, Colorado, United States; 3 National Institute of Standards and Technology, Boulder, Colorado, United States

Abstract

2:50pm-3:15pm CINF 110: Building a better materials science database: Challenges and opportunities

Robin Padilla, robin.padilla@springer.com, Michael Klinge, michael.klinge@springer.com

Corporate Markets & Databases, Springer Nature, Heidelberg, Germany

Abstract

3:15pm-3:30pm Intermission
3:30pm-3:55pm CINF 111: TCI’s approaches to chemical information for researchers
Haruhiko Taguchi1, Tracey Barber2, Tracey.Barber@tcichemicals.com

1 RD (Information Management) Department, Tokyo Chemical Industry Co Ltd, Chuo-ku Tokyo, Japan; 2 Marketing, TCI America, Cambridge, Massachusetts, United States

Abstract

3:55pm-4:20pm CINF 112: Presenting the latest scientific knowledge on an e-commerce website
Jonathan Stephan, jon.stephan@sial.com

Sigma Aldrich, Saint Louis, Missouri, United States

Abstract

4:20pm-4:45pm CINF 113: Beyond chemistry: Collect, organize, and visualize scientific data on the web
David Deng, dengw2@gmail.com, Rajeev Hotchandani, Jinbo Lee

Scilligence, Burlington, Massachusetts, United States

Abstract

CINF: Driving Change: Impact of Funders on the Research Data & Publications Landscape 2:00pm - 4:50pm
Tuesday, March 15
Room 25A - San Diego Convention Center
Elsa Alvaro, Andrea Twiss-Brooks, Organizing
Andrea Twiss-Brooks, Presiding
Cosponsored by: MEDI and ORGN
2:00pm-2:25pm CINF 119: Are we ready to define the scholarly commons?
Maryann Martone1,2, mmartone@ucsd.edu

1 Neurosciences, University of California, San Diego, San Diego, California, United States; 2 Hypothes.is, San Francisco, California, United States

Abstract

2:25pm-2:50pm CINF 120: Research data curation services at UC San Diego library
Ho Jung Yoo, hjsyoo@ucsd.edu, David Minor

Library, UC San Diego, San Diego, California, United States

Abstract

2:50pm-3:15pm CINF 121: Is open science an inevitable outcome of e-science?
Jeremy Frey, j.g.frey@soton.ac.uk

University of Southampton, Southampton, United Kingdom

Abstract

3:15pm-3:40pm CINF 122: Navigating the research data ecosystem
Dan Valen, dan@figshare.com

figshare, Brooklyn, New York, United States

Abstract

3:40pm-3:55pm Intermission
3:55pm-4:20pm CINF 123: Funding mandates and policies: A database provider's response
Ian Bruno1, Colin Groom2, Amy Sarjeant1, sarjeant@ccdc.cam.ac.uk

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 CCDC, Cambridge, United Kingdom

Abstract

4:20pm-4:45pm CINF 124: Quest to find 'broader impact': How funding bodies are using altmetrics to evaluate funded research and grant applications
Sara Rouhi, sara@altmetric.com

Altmetric, Washington, DC, District of Columbia, United States

Abstract

4:45pm-4:50pm Concluding Remarks
CINF: Linking Big Data with Chemistry: Databases Connecting Genomics, Biological Pathways & Targets to Chemistry 2:00pm - 4:05pm
Tuesday, March 15
Room 24C - San Diego Convention Center
Rachelle Bienstock, Organizing
Rachelle Bienstock, Presiding
2:00pm-2:05pm Introductory Remarks
2:05pm-2:25pm CINF 114: How can genomic databases be linked to chemical structural information?
Rachelle Bienstock, rachelleb1@gmail.com

RJB Computational Modeling LLC, Chapel Hill, North Carolina, United States

Abstract

2:25pm-2:45pm CINF 115: Reactome pathway knowledgebase: Connecting pathways, networks, and disease
Robin Haw, robin.haw@oicr.on.ca

Informatics and Bio-computing, OICR, Toronto, Ontario, Canada

Abstract

2:45pm-3:05pm CINF 116: Competitive intelligence workbench: Getting access to information for decision making

Huijun Wang, huijun.wang@merck.com

Merck, Kenilworth, New Jersey, United States

Abstract

3:05pm-3:15pm Intermission
3:15pm-3:35pm CINF 117: Using systems biology in computational drug design workflows

George Nicola, george.nicola@outlook.com, Bruce Kovacs

Afecta Pharmaceuticals, Irvine, California, United States

Abstract

3:35pm-3:55pm CINF 118: Combining semantic triples across domains to identify new and novel relationships and knowledge
Matthew Clark, m.clark@elsevier.com, Frederik van den Broek, Anton Yuryev, Maria Shkrob, Sherri Matis-Mitchell, Timothy Hoctor

R & D Solutions, Elsevier Inc., Philadelphia, Pennsylvania, United States

Abstract

3:55pm-4:05pm Concluding Remarks
CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 8:15am - 11:55am
Wednesday, March 16
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 125: Analytical data, the web, and standards for unified laboratory informatics databases
Graham McGibbon1, scitechmaven@gmail.com, Patrick Wheeler2, pwheeler@yahoo.com

1 Advanced Chemistry Development (ACD/Labs), Toronto, Ontario, Canada; 2 Product Development, Advanced Chemistry Development, Encinitas, California, United States

Abstract

8:45am-9:10am CINF 126: From molecular formulas to Markush structures: Different levels of knowledge representation in chemistry
Michael Braden, mbraden@chemaxon.com

ChemAxon, Cambridge, Massachusetts, United States

Abstract

9:10am-9:35am CINF 127: Strategies for creating knowledge from chemistry and text data
Tom Oldfield1, tom.oldfield@dotmatics.com, Mariana Vaschetto1, mariana.vaschetto@dotmatics.com, Jeff Nauss2, jeff.nauss@linguamatics.com

1 Dotmatics, Bishops Stortford, United Kingdom; 2 Linguamatics, San Diego, California, United States

Abstract

9:35am-10:00am CINF 128: Combined structure and reaction retrieval in scientific content: What satisfied users in the past and what they demand for the future
Guido Herrmann1, guido.herrmann@thieme.de, Josef Eiblmaier1, Valentina Eigner-Pitto1

1 Georg Thieme Verlag Kg, Stuttgart, Germany; 1 InfoChem GmbH, Munich, Germany

Abstract

10:00am-10:15am Intermission
10:15am-10:40am CINF 129: Harnessing chemical and toxicological data for the evaluation of food ingredients and packaging
Diane Schmit, dschmit@alumni.ucla.edu, Tammy Page, Kirk Arvidson, Patra Volarath, Leighna Holt

US Food and Drug Administration, College Park, Maryland, United States

Abstract

10:40am-11:05am CINF 130: Expansion of DSSTox: Leveraging public data to create a semantic cheminformatics resource with quality annotations for support of U.S. EPA applications
Christopher Grulke2, Inthirany Thillainadarajah1, Antony Williams1, David Lyons1, Jeff Edwards1, Ann Richard1, richard.ann@epa.gov

1 National Center for Computational Toxicology, US EPA, Research Triangle Park, North Carolina, United States; 2 Zachary Piper Solutions, New Hill, North Carolina, United States

Abstract

11:05am-11:30am CINF 131: Comparative toxicogenomics database: Advancing understanding of molecular connections among chemicals, genes, and diseases

Cynthia Grondin, cjgrondin@ncsu.edu, Allan Davis, Thomas Wiegers, Carolyn Mattingly

Biology, North Carolina State University, Raleigh, North Carolina, United States

Abstract

11:30am-11:55am CINF 132: Wikidata: Advancing science through semantic integration of genes, diseases, and drugs
Benjamin Good1, bgood@scripps.edu, Elvira Mitraka2, Andra Waagmeester1,3, Sebastian Burgstaller-Muehlbacher1, Timothy Putman1, Andrew Su1, Lynn Schriml4

1 Department of Molecular and Experimental Medicine, Scripps Research Institute, La Jolla, California, United States; 2 Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States; 3 Micelio, Antwerp, Belgium; 4 Epidemiology and Public Health, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States

Abstract

CINF: Reimagining Libraries as Innovation Centers: Enabling, Facilitating & Collaborating throughout the Research Life Cycle 8:45am - 12:00pm
Wednesday, March 16
Room 24C - San Diego Convention Center
Ye Li, Vincent Scalfani, Organizing
Ye Li, Presiding
8:45am-8:50am Introductory Remarks
8:50am-9:15am CINF 133: From dusty stacks to an information hub: Reimagining the UF libraries
Neelam Bharti1, neelambh@ufl.edu, Sara Gonzalez2

1 Marston Science Library, University of Florida, Gainesville, Florida, United States; 2 Marston Science Library, Gainesville, Florida, United States

Abstract

9:15am-9:40am CINF 134: Expanding the research commons model into disciplinary instances
Jeremy Garritano, jgarrita@umd.edu

University Libraries, University of Maryland, College Park, Maryland, United States

Abstract

9:40am-10:05am CINF 135: Libraries for the future: A digital economy perspective
Jeremy Frey, j.g.frey@soton.ac.uk, Steven Brewer

University of Southampton, Southampton, United Kingdom

Abstract

10:05am-10:20am Intermission
10:20am-10:45am CINF 136: Leveraging the interdisciplinarity of chemistry: Building interdisciplinary collaborations
Kiyomi Deards, kiyomideards@gmail.com

Research and Instructional Services, University of Nebraska-Lincoln, Lincoln, Nebraska, United States

Abstract

10:45am-11:10am CINF 137: Predicting local trends in scholarly communication for decision-making in collection development: An exploration beyond citation analysis
Ye Li, liye@umich.edu

University of Michigan, Ann Arbor, Michigan, United States

Abstract

11:10am-11:35am CINF 138: Academic technologies: A new library service to offer advanced software training
Vincent Scalfani, vincent.scalfani@gmail.com, Melissa Green

University Libraries, University of Alabama, Tuscaloosa, Alabama, United States

Abstract

11:35am-12:00pm CINF 139: Enhanced chemical understanding through 3D-printed models

Amy Sarjeant1, sarjeant@ccdc.cam.ac.uk, Peter Wood4, Ian Bruno1, Ye Li2, Vincent Scalfani3, Shawn O'Grady2

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 University of Michigan, Ann Arbor, Michigan, United States; 3 University Libraries, University of Alabama, Tuscaloosa, Alabama, United States; 4 CCDC, Cambridge, United Kingdom

Abstract

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 1:30pm - 4:45pm
Wednesday, March 16
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 140: IUPHAR/BPS guide to pharmacology (GtoPdb): Concise mapping for the triples of chemistry, data, and protein target classifications
Christopher Southan, cdsouthan@gmail.com, Joanna Sharman, Adam Pawson, Elena Faccenda, Jamie Davies

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden

Abstract

2:00pm-2:25pm CINF 141: Open PHACTS: Semantic interoperability for drug discovery
Herman Van Vlijmen1, hvvlijme@its.jnj.com, Open PHACTS Consortium2

1 Computational Chemistry, Discovery Sciences EU, Janssen, Beerse, Belgium; 2 http://www.openphacts.org, Vienna, Austria

Abstract

2:25pm-2:50pm CINF 142: Representation of drug discovery knowledge in the ChEMBL and SureChEMBL databases
Anna Gaulton, agaulton@ebi.ac.uk

Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom

Abstract

2:50pm-3:15pm CINF 143: Chemical knowledge representation and access in Wolfram|Alpha and Mathematica

Eric Weisstein, eww@wolfram.com

Scientific Content, Wolfram|Alpha, Champaign, Illinois, United States

Abstract

3:15pm-3:30pm Intermission
3:30pm-3:55pm CINF 144: Helping people navigate the changing seas of scientific information
David Evans1, david.evans@relx.ch, Pieder Caduff1, Thibault Geoui2, Juergen Swienty-Busch2

1 Reed Elsevier Properties SA, Neuchatel, Switzerland; 2 Elsevier Information Systems, GmbH, Frankfurt, Germany

Abstract

3:55pm-4:20pm CINF 145: Characterization and categorization of novel knowns, unknowns, and the interface between physical and digital
Graeme Whitley1, gwhitley@wiley.com, Bernd Berger2, Timothy Adams2

1 Wiley, Hoboken, New Jersey, United States; 2 Wiley-VCH, Weinheim, Germany

Abstract

4:20pm-4:45pm CINF 146: Semantic approaches for biochemical knowledge discovery
Michel Dumontier, michel.dumontier@gmail.com

Medicine, Stanford University, Stanford, California, United States

Abstract

CINF: Reimagining Libraries as Innovation Centers: Enabling, Facilitating & Collaborating throughout the Research Life Cycle 1:30pm - 4:45pm
Wednesday, March 16
Room 24C - San Diego Convention Center
Ye Li, Vincent Scalfani, Organizing
Vincent Scalfani, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 147: Leveraging the VIVO research networking system to facilitate collaboration and data visualization

Michaeleen Trimarchi, Danielle Bodrero Hoggan, danielle@scripps.edu

Kresge Library, The Scripps Research Institute, La Jolla, California, United States

Abstract

2:00pm-2:25pm CINF 148: Stanford profiles created to support the university’s scholarly community
Grace Baysinger, graceb@stanford.edu

Swain Chem & Chem Eng Library, Stanford University Libraries, San Jose, California, United States

Abstract

2:25pm-2:50pm CINF 149: Managing researchers' reputations throughout the research life cycle
Linda Galloway, galloway@syr.edu, Anne Rauh

Syracuse University Libraries, Syracuse, New York, United States

Abstract

2:50pm-3:05pm Intermission
3:05pm-3:30pm CINF 150: Anatomy of the chemistry research enterprise in the academic sector: Serving the underserved in a large research institution
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States

Abstract

3:30pm-3:55pm CINF 151: Safety use case for chemical safety information
Ralph Stuart, secretary@dchas.org

Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States

Abstract

3:55pm-4:20pm CINF 152: PubChem BioAssay: Grow with the community
Yanli Wang, ywang@ncbi.nlm.nih.gov

Building 38a, Room 5s506, Bethesda, Maryland, United States

Abstract

4:20pm-4:40pm Discussion
4:40pm-4:45pm Concluding Remarks
CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 8:15am - 11:55am
Thursday, March 17
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 153: Linking chemical and non-chemical data in structured product labeling
Yulia Borodina, yulia.borodina@fda.hhs.gov, Bill Hess, CoCo Tsai, Pete Phong, Lonnie Smith

FDA, Catonsville, Maryland, United States

Abstract

8:45am-9:10am CINF 154: Ginas: A global effort to define and index substances in medical products
Tyler Peryea1, tylerperyea@gmail.com, Lawrence Callahan2

1 Informatics, NIH NCATS, North Bethesda, Maryland, United States; 2 FDA, Silver Spring, Maryland, United States

Abstract

9:10am-9:35am CINF 155: TranSMART Foundation: An open-data and open-science platform to integrate molecular and clinical data in translational research and precision medicine
Rudolph Potenzone, rudypoten@me.com

tranSMART Foundation, Redmond, Washington, United States

Abstract

9:35am-10:00am CINF 156: Leveraging RxNorm and drug classifications for analyzing prescription datasets
Olivier Bodenreider, obodenreider@mail.nih.gov

Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland, United States

Abstract

10:00am-10:15am Intermission
10:15am-10:40am CINF 157: Evolution of digital and semantic chemistry at Southampton
Jeremy Frey1, j.g.frey@soton.ac.uk, Simon Coles2, Colin Bird1

1 University of Southampton, Southampton, United Kingdom; 2 University of Southampton, Hampshire, United Kingdom

Abstract

10:40am-11:05am CINF 158: Implementing chemistry platform for OpenPHACTS: Lessons learned
Colin Batchelor, Alexey Pshenichnov, Jon Steele, Valery Tkachenko, tkachenkov@rsc.org

Royal Society of Chemistry, Rockville, Maryland, United States

Abstract

11:05am-11:30am CINF 159: Representation of molecular structures and related computations on the semantic web: A universal data model and its ontology
Mirek Sopek2, sopek@makolab.com, Stuart Chalk1, Neil Ostlund2, Jacob Bloom2

1 Department of Chemistry, University of North Florida, Jacksonville, Florida, United States; 2 Chemical Semantics, Inc., Gainesville, Florida, United States

Abstract

11:30am-11:55am CINF 160: GlyTouCan international glycan structure repository using semantic web technologies
Issaku Yamada1, issaku@noguchi.or.jp, Kiyoko Aoki-Kinoshita2,3, Nobuyuki Aoki2, Daisuke Shinmachi2, Masaaki Matsubara1, Akihiro Fujita2, Shinichiro Tsuchiya2, Shujiro Okuda4, Noriaki Fujita3, Hisashi Narimatsu3

1 The Noguchi Institute, Tokyo, Japan; 2 Graduate School of Engineering, Soka University, Tokyo, Japan; 3 Research Center for Medical Glycoscience, AIST, Tsukuba, Japan; 4 Graduate School of Medical and Dental Sciences, Niigata University, Niigata, Japan

Abstract

CINF: General Papers 9:00am - 11:50am
Thursday, March 17
Room 24C - San Diego Convention Center
Elsa Alvaro, Erin Davis, Organizing
Elsa Alvaro, Erin Davis, Presiding
9:00am-9:05am Introductory Remarks
9:05am-9:35am CINF 161: Progress toward a conformational database for sesquiterpene reaction pathways
Jordan Zehr2, jordan.zehr001@albright.edu, Dean Tantillo1, Christian Hamann3, chamann@albright.edu

1 Dept Chemistry, UC Davis, Davis, California, United States; 2 Chemistry & Biochemistry, Albright College, Reading, Pennsylvania, United States

Abstract

9:35am-10:05am CINF 162: OMPOL: Visualization of large chemical spaces
Peter Corbett, Colin Batchelor, Alexey Pshenichnov, Valery Tkachenko, tkachenkov@rsc.org

Royal Society of Chemistry, Rockville, Maryland, United States

Abstract

10:05am-10:35am CINF 163: Comparison of machine learning algorithms for the prediction of critical values and acentric factors for pure compounds
Wendy Carande, wendy.carande@nist.gov, Andrei Kazakov, Kenneth Kroenlein

NIST, Boulder, Colorado, United States

Abstract

10:35am-10:50am Intermission
10:50am-11:20am CINF 164: Optimal superposition of arbitrarily ordered molecules using the Kuhn-Munkres algorithm
Berhane Temelso1, berhane.temelso@bucknell.edu, Joel Mabey1, Toshiro Kubota3, George Shields2

1 701 Moore Avenue, Bucknell University, Lewisburg, Pennsylvania, United States; 2 Deans Office, 113 Marts Hall, Bucknell University, Lewisburg, Pennsylvania, United States; 3 Mathematical Sciences, Susquehanna University, Selinsgrove, Pennsylvania, United States

Abstract

11:20am-11:50am CINF 165: Predicting drug-induced hepatic systems' toxicity by integrating transporter interaction profiles

Eleni Kotsampasakou, eleni.kotsampasakou@univie.ac.at, Gerhard Ecker

Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria

Abstract

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 1:30pm - 4:20pm
Thursday, March 17
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 166: Ontology for biomedical investigations (OBI)
Bjoern Peters, bpeters@lji.org, James Overton, Randi Vita, OBI consortium

Division of Vaccine Discovery, La Jolla Institute for Allergy & Immunology, La Jolla, California, United States

Abstract

2:00pm-2:25pm CINF 167: Protein ontology: Fostering connections in chemical biology
Darren Natale1,2, dan5@georgetown.edu

1 Georgetown University Medical Center, Washington, District of Columbia, United States; 2 PRO Consortium, Washington, District of Columbia, United States

Abstract

2:25pm-2:50pm CINF 168: Ontologies for classifying and modeling drug discovery data
Stephan Schuerer1,3, stephan.schurer@gmail.com, Asiyah Yu Lin1, Saurabh Mehta1, Hande Kücük McGinty2, Qiong Cheng3, Amar Koleti3, Nooshin Zadeh1, Dusica Vidovic1,3

1 Pharmacology, University of Miami, Miami, Florida, United States; 2 Computer Science, University of Miami, Miami, Florida, United States; 3 Center for Computational Science, University of Miami, Miami, Florida, United States

Abstract

2:50pm-3:05pm Intermission
3:05pm-3:30pm CINF 169: Immune Epitope Database (IEDB) and its use of formal ontologies
Randi Vita, rvita@liai.org, James Overton, Bjoern Peters

Division of Vaccine Discovery, La Jolla Institute for Allergy & Immunology, La Jolla, California, United States

Abstract

3:30pm-3:55pm CINF 170: PubChemRDF: Semantic annotation and search
Gang Fu1, gangfu1982@gmail.com, Evan Bolton2

1 NCBI, NIH, Rockville, Maryland, United States; 2 NCBI, NIH, Bethesda, Maryland, United States

Abstract

3:55pm-4:20pm CINF 171: Generic scientific data model and ontology for representation of chemical data
Stuart Chalk, schalk@unf.edu

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

Abstract

Cosponsored Symposia

CHED: Fall 2015 InterCollegiate Cheminformatics Course 8:30am - 11:50am
Sunday, March 13
Mission Beach A/B - Manchester Grand Hyatt San Diego
Robert Belford, Stuart Chalk, Leah McEwen, Organizing
Robert Belford, Presiding
Cosponsored by: CINF and MPPG
8:30am-8:40am Introductory Remarks
8:40am-8:55am CHED 8: Using cheminformatics to develop the next aspirin
John Langenstein, jslangenstein@mix.wvu.edu, John Penn

Chem Dept, West Virginia Univ, Morgantown, West Virginia, United States

8:55am-9:10am CHED 9: Correlation of anti-cancer drug structure to efficacy
John Turner, whinis@whinis.com, Stuart Chalk

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

9:10am-9:25am CHED 10: Performing variable substituent chemical structure searches
John House1, jxhouse@ualr.edu, Robert Belford1, Sunghwan Kim2

1 University of Arkansas at Little Rock, Little Rock, Arkansas, United States; 2 National Library of Medicine, Bethesda, Maryland, United States

9:25am-9:40am CHED 11: Advanced database search
Sarah House1, sxhouse@ualr.edu, Robert Belford2, Sunghwan Kim3

1 UALR, Little Rock, Arkansas, United States; 2 Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States; 3 National Library of Medicine, Bethesda, Maryland, United States

9:40am-9:50am Intermission
9:50am-10:05am CHED 12: pH and acid-base equilibria with cheminformatics
Benjamin Brown, ben.brown@centre.edu, Jennifer Muzyka

Chemistry Dept, Centre College, Danville, Kentucky, United States

10:05am-10:20am CHED 13: Aggregation of solubility data for quick access
Parijat Sharma1, parijat.sharma@centre.edu, Brandon Davis2,3, Robert Belford2, Jennifer Muzyka1, Andrew Lang4, Jordi Cuadros5

1 Chemistry Dept, Centre College, Danville, Kentucky, United States; 2 Chemistry, Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States; 3 Forensic Chemistry, Arkansas State Crime Laboratory, Little Rock, Arkansas, United States; 4 Oral Roberts University, Tulsa, Oklahoma, United States; 5 IQS Universitat Ramon Llull, Barcelona, Spain

10:20am-10:35am CHED 14: Cross-walking metadata from the IUPAC-NIST solubility database to a new scientific data model
Natalia Gutierrez, nataliagb23@gmail.com, Stuart Chalk

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

10:35am-10:50am CHED 15: Semantic annotation of thermochemical data from the NIST-JANAF dataset
Nilab Azim, nilabazim@gmail.com, Stuart Chalk

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States

10:50am-11:00am Intermission
11:00am-11:15am CHED 16: Integration of a spectral viewer for data stored in an open source electronic laboratory notebook
Andrew Cornell2, apcornell@ualr.edu, Robert Belford1, Daniel Berleant1, Michael Bauer3, Ottis Rothenberger4, Herman Bergwerf5

1 Univ of Arkansas at Little Rck, Little Rock, Arkansas, United States; 2 Chemistry/Biology, University of Arkansas at Little Rock, Little Rock, Arkansas, United States; 3 Myeloma Institute, University of Arkansas for Medical Science, Little Rock, Arkansas, United States; 4 Illinois State University, Normal, Illinois, United States; 5 TU Delft / Erasmus University, Rotterdam, Netherlands

11:15am-11:30am CHED 17: Automated spectrum resolver with InChI enhanced lookup
Alexander Williams, alexander.williams@centre.edu, Jennifer Muzyka

Chemistry Dept, Centre College, Danville, Kentucky, United States

11:30am-11:45am CHED 18: LabPal: Chemical information for Android
Daniel Graham, daniel.graham@centre.edu, Jennifer Muzyka

Chemistry Dept, Centre College, Danville, Kentucky, United States

11:45am-11:50am Concluding Remarks
COMP: From Synthesis to Design: Modeling Tools for Medicinal Chemists 8:30am - 11:50am
Sunday, March 13
Room 26A - San Diego Convention Center
Melissa Landon, Organizing
Melissa Landon, Presiding
Cosponsored by: CINF and MEDI
8:30am-8:35am Introductory Remarks
8:35am-9:05am COMP 23: Advancing compound design with structure-liability models
Shana Posy2, slposy@gmail.com, Malcolm Davis1, Brian Claus1

1 Bristol Myers Squibb, Princeton, New Jersey, United States; 2 Bristol-Myers Squibb Co, Princeton, New Jersey, United States

9:05am-9:35am COMP 24: Closing the loop between synthesis and design: Helping chemists to use all the information in compound optimization
Tamsin Mansley2, tmansley@optibrium.com, Edmund Champness1, Peter Hunt1, James Chisholm1, Chris Leeding1, Alex Elliott1, Samuel Dowling1, Fayzan Ahmed1, Matthew Segall1

1 Optibrium Ltd, Cambridge, United Kingdom; 2 Optibrium Ltd, Cambridge, Massachusetts, United States

9:35am-10:05am COMP 25: Shifting medchem tasks in 21st century drug discovery: The importance of syncing 2D and 3D
Carsten Detering, detering@biosolveit.com

BioSolveIT Inc, Bellevue, Washington, United States

10:05am-10:20am Intermission
10:20am-10:50am COMP 26: Putting modeling in the non-modelers' hands using LiveDesign

Michelle Hall, michelle.lynn.hall@gmail.com

Schrodinger, Inc, Somerville, Massachusetts, United States

10:50am-11:20am COMP 27: From structural chemistry to medicinal chemistry
Jason Cole2, cole@ccdc.cam.ac.uk, Colin Groom1, Erin Davis2

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 Cambridge Crystallographic Data Centre, Piscataway, New Jersey, United States

11:20am-11:50am COMP 28: Structure- and knowledge-driven interactive design
Matthias Rarey, rarey@zbh.uni-hamburg.de

University of Hamburg, Hamburg, Germany

11:50am-11:50am Discussion
PROF: Ethics 101 11:00am - 11:50am
Sunday, March 13
Balboa - Marriott Marquis San Diego Marina
Karlo Lopez, Leah McEwen, Susan Schelble, Organizing
Karlo Lopez, Leah McEwen, Susan Schelble, Presiding
Cosponsored by: CHED, CINF and ETHC
11:00am-11:25am PROF 1: Ethics education resources
Susan Schelble, sschelbl@msudenver.edu

Campus Box 52, Metropolitan State University of Denver, Denver, Colorado, United States

11:25am-11:50am PROF 2: Academic research ethics in the 21st century
Karlo Lopez, klopez@csub.edu

Department of Chemistry and Biochemistry, California State University, Bakersfield, Bakersfield, California, United States

PRES: Discussions with the President's Task Force on Employment 1:30pm - 4:00pm
Sunday, March 13
Room 2 - San Diego Convention Center
Debbie Crans, Donna Nelson, Organizing
Donna Nelson, Attila Pavlath, Presiding
Cosponsored by: BIOL, BMGT, CARB, CELL, CHED, CINF, COLL, COMSCI, DAC, GEOC, I&EC, IAC, INOR, MEDI, ORGN, PHYS, PMSE, POLY, PROF, SCHB and WCC
1:30pm-1:45pm PRES 1: Purpose of Task Force and future plans
Donna Nelson1, djnelson@ou.edu, Attila Pavlath1, attila@pavlath.org

1 University of Oklahoma, Norman, Oklahoma, United States; 1 USDA Ars Western Reg Rsrch Lab, Albany, California, United States

1:45pm-2:00pm PRES 2: Evolving nature of supply and demand factors in the chemical workforce
Tiffany Hoerter1, thoerter@gmail.com, Bryan Balazs2

1 DuPont, Wilmington, Delaware, United States; 2 Lawrence Livermore National Laboratory, Livermore, California, United States

2:00pm-2:15pm PRES 3: It's not in the job title. Realities of the chemical industries: Career opportunities for undergraduate professionals
Mary Engelman2, mkengelman@eastman.com, Susan Butts1

1 The Dow Chemical Company, Retired, Midland, Michigan, United States; 2 Eastman Chemical Company, Jonesborough, Tennessee, United States

2:15pm-2:30pm PRES 4: Can professional certificates enhance your career opportunities? Case studies and lessons learned
Allison Campbell2, allison.campbell@pnnl.gov, Paul Jagodzinski1, Paul.Jagodzinski@nau.edu

1 Office of the Dean, NAU Clg of Eng Forestry Nat Sci, Flagstaff, Arizona, United States; 2 MS J4-02, Pacific Northwest National Laboratory, Richland, Washington, United States

2:30pm-2:45pm PRES 5: Do we prepare our graduates for the jobs offered by the industry?
Karl Haider1, karl.haider@bayer.com, Debbie Crans2, Debbie.Crans@ColoState.edu

1 Bldg 8, Bayer MaterialScience, Pittsburgh, Pennsylvania, United States; 2 Colorado State University, Fort Collins, Colorado, United States

2:45pm-3:00pm PRES 6: Addressing the challenges of unemployment of young graduates and mid-career chemical professionals
Peter Dorhout1, dorhout@ksu.edu, William Ewing2, william.ewing@bms.com

1 College of Arts Sciences, Kansas State University, Manhattan, Kansas, United States; 2 Bristol-Myers Squibb, Yardley, Pennsylvania, United States

3:00pm-3:15pm PRES 7: Global factors influencing employment in the U.S.
Wayne Jones2, wjones@binghamton.edu, Marinda Wu1, marindawu@gmail.com

1 Science is Fun, Orinda, California, United States; 2 Department of Chemistry, State University of New York at Binghamton, Binghamton, New York, United States

3:15pm-4:00pm Panel Discussion
PRES: My Comments to the President's Task Force on Employment 8:00pm - 10:00pm
Sunday, March 13
Hall D - San Diego Convention Center
8:00pm-10:00pm PRES 10: What are the benefits and handicaps of possible certification, licensing, and registration of chemical professionals?

Allison Campbell1, Allison.campbell@pnnl.gov, Paul Jagodzinski2, Paul.Jagodzinski@nau.edu, Donna Nelson4, djnelson@ou.edu, Attila Pavlath3, attila@pavlath.org

1 MS K8-84, Battelle Pacific NW Natl Lab, Richland, Washington, United States; 2 Office of the Dean, NAU Clg of Eng Forestry Nat Sci, Flagstaff, Arizona, United States; 3 USDA Ars Western Reg Rsrch Lab, Albany, California, United States; 4 University of Oklahoma, Norman, Oklahoma, United States

8:00pm-10:00pm PRES 11: Do we prepare our graduates for jobs offered by industry?

Debbie Crans2, Debbie.Crans@ColoState.edu, Karl Haider1, karl.haider@bayer.com, Donna Nelson3, djnelson@ou.edu, Attila Pavlath4, attilapavlath@yahoo.com

1 Bldg 8, Bayer MaterialScience, Pittsburgh, Pennsylvania, United States; 2 Colorado State University, Fort Collins, Colorado, United States; 3 University of Oklahoma, Norman, Oklahoma, United States; 4 None, None, California, United States

8:00pm-10:00pm PRES 12: What causes unemployment among young graduate and mid-career chemical professionals, and how can we help?

Peter Dorhout1, dorhout@ksu.edu, William Ewing4, william.ewing@bms.com, Donna Nelson3, djnelson@ou.edu, Attila Pavlath2, attila@pavlath.org

1 College of Arts & Sciences, Kansas State University, Manhattan, Kansas, United States; 2 USDA Ars Western Reg Rsrch Lab, Albany, California, United States; 3 University of Oklahoma, Norman, Oklahoma, United States; 4 None, Yardley, Pennsylvania, United States

8:00pm-10:00pm PRES 13: What is needed to increase underrepresented groups in the workforce?

Donna Nelson2, djnelson@ou.edu, Attila Pavlath1, attila@pavlath.org

1 USDA Ars Western Reg Rsrch Lab, Albany, California, United States; 2 University of Oklahoma, Norman, Oklahoma, United States

8:00pm-10:00pm PRES 14: What global factors influence the U.S. employment situation, and how do outsourcing and immigration contribute to this situation?

Marinda Wu1, marindawu@gmail.com, Wayne Jones2, wjones@binghamton.edu, Donna Nelson4, djnelson@ou.edu, Attila Pavlath3, attila@pavlath.org

1 Science is Fun, Orinda, California, United States; 2 Department of Chemistry, State University of New York at Binghamton, Binghamton, New York, United States; 3 USDA Ars Western Reg Rsrch Lab, Albany, California, United States; 4 University of Oklahoma, Norman, Oklahoma, United States

8:00pm-10:00pm PRES 15: AGFD Division of Agricultural and Food Chemistry: Opportunities and advances in future chemistry
Michael Appell2, michael.appell@ars.usda.gov, Bosoon Park1

1 USNPRC, USDA, ARS, Athens, Georgia, United States; 2 NCAUR-MPM, USDA, ARS, Peoria, Illinois, United States

8:00pm-10:00pm PRES 16: SCHB experience helps you meet the challenges of employment in the chemical sciences sector

Jennifer Maclachlan, pidgirl@gmail.com, Anis Rahman, chair@acs-schb.org, Joseph Sabol, jsabol@chem-consult.com, Mukund Chorghade, chorghade@comcast.net

ACS Division of Small Chemical Businesses, Harrisburg, Pennsylvania, United States

8:00pm-10:00pm PRES 17: Who are COMP members and where have they gone? Demographics and national meeting attendance

Emilio Esposito, emilio.esposito@gmail.com

exeResearch LLC, East Lansing, Michigan, United States

8:00pm-10:00pm PRES 18: Women Chemists Committee (WCC) efforts to support chemists in the workforce

Kimberly Woznack1, woznack@calu.edu, Amber Charlebois5, Laura Sremaniak6, Amy Nicely7, Christine Chow4, Amy Debaillie8, Michelle Rogers8, Mary Shultz3, Lisa Kemp2

1 Box 56, California University of Pennsylvania, California, Pennsylvania, United States; 2 Mississippi Polymer Institute, Hattiesburg, Mississippi, United States; 3 Tufts Univ, Medford, Massachusetts, United States; 4 Dept of Chem Rm 479, Wayne State University, Detroit, Michigan, United States; 5 Department of Chemistry, State University of New York-Geneseo, Geneseo, New York, United States; 6 Department of Chemistry, North Carolina State University, Raleigh, North Carolina, United States; 7 Department of Natural Sciences, Parkland College, Champaign, Illinois, United States; 8 Women Chemists Committee, Washington, District of Columbia, United States

8:00pm-10:00pm PRES 19: Chemical Innovation and Entrepreneurship Council (CIEC): Working to enhance and highlight the impact of women in STEM worldwide

Janet Bryant5,6, janetlbryant@pnnl.gov, Judith Giordan6, Elizabeth Nalley1, Jennifer Maclachlan4, Lisa Kemp3, Natalie LaFranzo2

1 Physical Science Department, Cameron University, Lawton, Oklahoma, United States; 2 Horizon Discovery, Ltd., Saint Louis, Missouri, United States; 3 Mississippi Polymer Institute, Hattiesburg, Mississippi, United States; 4 PID Analyzers, LLC, Centerville, Massachusetts, United States; 5 PNNL, Richland, Washington, United States; 6 ecosVC, Amherst, Massachusetts, United States

8:00pm-10:00pm PRES 20: Help me get a job: The Portland Section's approach to helping new graduates and working chemists find employment in chemistry

James Tung1, jimtung@gmail.com, Marilyn Mackiewicz2

1 Lacamas Laboratories, Portland, Oregon, United States; 2 Chemistry, Portland State University, Beaverton, Oregon, United States

8:00pm-10:00pm PRES 21: Perspectives on the landscape of chemistry-related employment in the ACS Puget Sound Section
Craig Fryhle2, fryhle@chem.plu.edu, Gary Christian1, Gregory Milligan3, Mark Wicholas4

1 Department of Chemistry, University of Washington, Seattle, Washington, United States; 2 Department of Chemistry, Pacific Lutheran University, Tacoma, Washington, United States; 3 Department of Chemistry, Saint Martin's University, Olympia, Washington, United States; 4 Department of Chemistry, Western Washington University, Bellingham, Washington, United States

8:00pm-10:00pm PRES 22: Welcoming work environments and broadening participation for LGBTQ+ chemists
Barbara Belmont1,3, BBelmont@noglstp.org, Mary Crawford2,3

1 Chemistry & Biochemistry, California State University Dominguez Hills, Carson, California, United States; 2 Chemistry, Knox College, Galesburg, Illinois, United States; 3 NOGLSTP, Pasadena, California, United States

8:00pm-10:00pm PRES 23: Current career challenges in the chemical sciences – A younger chemist's perspective
Wasiu Lawal, wasiulawal79@yahoo.co.uk

Earth and Environmental Science, University of Texas at Arlington, Arlington, Texas, United States

8:00pm-10:00pm PRES 24: How do changes in public higher education affect career opportunities in chemistry?

Manfred Philipp, manfred.philipp@gmail.com

Chemistry and Biochemistry, Lehman College & Graduate Center, CUNY, Scarsdale, New York, United States

8:00pm-10:00pm PRES 25: Benefits of two-year institutions for employment and employers

Frankie Wood-Black, fwblack@cableone.net

Ag., Science and Engineering, Northern Oklahoma College, Ponca City, Oklahoma, United States

8:00pm-10:00pm PRES 26: Focus on career preparation within the requirements of the ACS certified bachelor’s degree in chemistry

Thomas Wenzel1, twenzel@bates.edu, Laura Kosbar2, kosbar@gmail.com

1 Bates College, Lewiston, Maine, United States; 2 IBM, Mohegan Lake, New York, United States

8:00pm-10:00pm PRES 27: Professional master program in chemistry and biochemistry technology as a tool to improve professional qualification

Denise Petri, dfsp@usp.br

University of Sao Paulo, Sao Paulo, Brazil

8:00pm-10:00pm PRES 28: Increasing unemployment among Ph.D. graduates: A problem to solve or a solution to a problem?

Sofya Kostina, sofyaberezin@gmail.com

not affiliated with any, Montreal, Quebec, Canada

8:00pm-10:00pm PRES 29: Finding your way in computational electronic structure

Rudolph Magyar, rjmagya@sandia.gov

Org. 1444 Multiscale Physics, Sandia National Laboratories, Albuquerque, New Mexico, United States

8:00pm-10:00pm PRES 30: Branching out from the central science

Linda Schultz1, schultz@tarleton.edu, Michele McAfee2, mmcafee@tarleton.edu

1 Tarleton State Univ, Stephenville, Texas, United States; 2 Medical Lab Science, Tarleton State University, Fort Worth, Texas, United States

8:00pm-10:00pm PRES 31: Promoting STEM disciplines in industry through hands-on applications using the biochemical excellence in science and technology (BEST) NSF grant at Milwaukee Area Technical College (MATC)

Scott Schlipp, schlipps@matc.edu

Natural Science, Milwaukee Area Technical College, Milwaukee, Wisconsin, United States

8:00pm-10:00pm PRES 32: New reality of the chemical enterprise: Traditional and non-traditional career paths
Mary Engelman1, Edward Rosenberg2, edward.rosenberg@mso.umt.edu

1 Eastman Chemical Company, Jonesborough, Tennessee, United States; 2 Univ of Montana, Missoula, Montana, United States

8:00pm-10:00pm PRES 33: Innovation ecosystems: Technology-based economic development and workforce development
Joseph Curtis, jcbioteck@yahoo.com

Cascade Biotherapeutics, Inc, Bethesda, Maryland, United States

8:00pm-10:00pm PRES 34: Demand, regulation, and experience: The hindrance factors involved in American industry employment

Julia Pischek1, jdp1107@jagmail.southalabama.edu, Matthew Reichert3, Larry Yet2

1 Chemistry, University of South Alabama, Mobile, Alabama, United States; 2 Department of Chemistry, University of South Alabama, Mobile, Alabama, United States; 3 Chemistry, University of South Alabama, Mobile, Alabama, United States

8:00pm-10:00pm PRES 35: Recognition of – and adaptation to – the changing career landscape for chemists

Matthew Windsor, mawindsor1@gmail.com

Association for Research in Vision and Ophthalmology, Rockville, Maryland, United States

8:00pm-10:00pm PRES 36: Inside track on getting a better return on your job search investment

Jess Stinson, jess@centuryassociates.com

Century Global Executive Search, LLC, Philadelphia, Pennsylvania, United States

8:00pm-10:00pm PRES 37: Restoring an Ethical Balance in Research, Training and Career for Chemists (in absentia suggestions from genericchemist2015@gmail.com on a roadmap to restoring chemistry as the central scientific occupation)
Generic Chemist, genericchemist2015@gmail.com

Hard, Knocks, Unemployed, California, United States

8:00pm-10:00pm PRES 38: Engaging the global chemistry community through partnerships and opportunity

Christopher LaPrade2, c_laprade@acs.org, Lori Brown1

1 American Chemical Society, Washington, District of Columbia, United States; 2 Office of International Activities, American Chemical Society, Washington, District of Columbia, United States

8:00pm-10:00pm PRES 39: Solving humanitarian problems leads to innovations and jobs
Satinder Ahuja, sutahuja@atmc.net

Ahuja Academy of Water Quality, Calabash, North Carolina, United States

8:00pm-10:00pm PRES 40: Global factors and trends influencing U.S. employment, outsourcing, and immigration as related to the science industry

Nestor Maceda-Johnson1, nm724@nova.edu, Nic Ledra1, ledra@nova.edu, Jennifer Corwin1, jc2687@nova.edu, Terrance McCaffrey2

1 Chemistry, Nova Southeastern University, Fort Lauderdale, Florida, United States; 2 Chemistry, Nova Southeastern University, Davie, Florida, United States

8:00pm-10:00pm PRES 41: Education and employment of chemists in Germany – Activities of the Gesellschaft Deutscher Chemiker (German Chemical Society, GDCh)
Hans-Georg Weinig, h.weinig@gdch.de, Karin Schmitz

Gesellschaft Deutscher Chemiker (GDCh), Frankfurt am Main, Germany

8:00pm-10:00pm PRES 8: What factors determine the balance between supply and demand?

Donna Nelson1, djnelson@ou.edu, Attila Pavlath2, attilapavlath@yahoo.com

1 University of Oklahoma, Norman, Oklahoma, United States; 2 None, None, California, United States

8:00pm-10:00pm PRES 9: What is the employment situation for technicians?

Mary Engelman1, mkengelman@eastman.com, Susan Butts2, sbbuttsdc@gmail.com, Donna Nelson4, djnelson@ou.edu, Attila Pavlath3, attila@pavlath.org

1 Eastman Chemical Company, Jonesborough, Tennessee, United States; 2 Susan Butts Consulting, Midland, Michigan, United States; 3 USDA Ars Western Reg Rsrch Lab, Albany, California, United States; 4 University of Oklahoma, Norman, Oklahoma, United States

PRES: My Experience with & Advice for Improving Diversity in Chemistry 8:00pm - 10:00pm
Sunday, March 13
Hall D - San Diego Convention Center
8:00pm-10:00pm PRES 42: Social networking and other 21st century tools to promote the diverse job seeker in an all inclusive chemical industry

Cary Supalo, cas380@gmail.com

Independence Science, West Lafayette, Indiana, United States

8:00pm-10:00pm PRES 43: Text-to-speech enabled organic chemistry drawing tool opens new opportunities for the blind in chemistry

Cary Supalo, cas380@gmail.com

Independence Science, West Lafayette, Indiana, United States

8:00pm-10:00pm PRES 44: Minority student pipeline math science partnership: Recruiting underrepresented minorities into science fields

Dewayne Morgan, dmorgan@usmd.edu

Office of Academic Affairs, University System of Maryland, Adelphi, Maryland, United States

PRES: My Experiences in & Advice for Organic Chemistry Courses 8:00pm - 10:00pm
Sunday, March 13
Hall D - San Diego Convention Center
8:00pm-10:00pm PRES 45: A new milestone in chemical education at the secondary level

A.K. Fazlur Rahman, frahman@ossm.edu

Chemistry, School of Science and Mathematics, Oklahoma City, Oklahoma, United States

8:00pm-10:00pm PRES 46: Learner-centered approach to teaching undergraduate organic chemistry

Amy Brown, amylynn136@gmail.com

Chemistry, Neumann University, Havertown, Pennsylvania, United States

8:00pm-10:00pm PRES 47: Advancing graduate education in the chemical sciences with a modular curriculum

Ronald Halterman2, RHalterman@ou.edu, Michael Ashby1

1 Dept of Chem Biochem, Norman, Oklahoma, United States; 2 Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma, United States

8:00pm-10:00pm PRES 48: Identifying areas of need for the learning of organic chemistry in prerequisite classes

Olivia Kinney, Debbie Crans, Debbie.Crans@ColoState.edu

Colorado State University, Fort Collins, Colorado, United States

8:00pm-10:00pm PRES 49: Organic chemistry, life, the universe & everything (OCLUE)

Melanie Cooper1, mmc@msu.edu, Michael Klymkowsky2

1 Michigan State University, East Lansing, Michigan, United States; 2 MCD Biology, UC Boulder, Boulder, Colorado, United States

MPPG: Preparing for the Real World: Challenges Faced by Young Investigators 8:30am - 12:00pm
Monday, March 14
Room 4 - San Diego Convention Center
Whitney Kellett, Benjamin Levine, Kenneth Merz, Sereina Riniker, Dominika Zgid, Organizing
Sereina Riniker, Dominika Zgid, Presiding
Cosponsored by: CHED, CINF, COMP, PHYS and YCC
8:30am-8:45am MPPG 21: Choosing your research adviser wisely
T. Daniel Crawford, crawdad@vt.edu

Virginia Tech, Blacksburg, Virginia, United States

8:45am-9:00am MPPG 22: How to choose an academic advisor: Do's and don'ts
Anna Krylov, krylov@usc.edu

Univ of Southern California, Los Angeles, California, United States

9:00am-9:15am MPPG 23: Do what you like, like what you do: Navigating the academic world after college
Francesco Paesani, 30004521@acs.org

Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, United States

9:15am-9:30am MPPG 24: Finding advisors whose research you like the best
Toru Shiozaki, shiozaki@northwestern.edu

Department of Chemistry, Northwestern University, Evanston, Illinois, United States

9:30am-10:00am Panel Discussion: 'Choosing a Graduate and Postgraduate Advisor'
10:00am-10:30am Intermission – Café con Ordenadores
10:30am-10:45am MPPG 25: Research career in industry: A glass filled with life
Christopher Bayly, bayly@eyesopen.com

OpenEye Scientific Software, Santa Fe, New Mexico, United States

10:45am-11:00am MPPG 26: Finding the perfect job: Careers for organic chemists in pharma and academia
Amy Dounay, amy.dounay@coloradocollege.edu

Chemistry and Biochemistry, Colorado College, Colorado Springs, Colorado, United States

11:00am-11:15am MPPG 27: From academia, to startup, to big pharma, and back again?
Gregory Landrum, gregory.landrum@novartis.com

NIBR, Basel, Switzerland

11:15am-11:30am MPPG 28: Down the rabbit hole: from B3LYP to x86
Jeff Hammond, jeff_hammond@acm.org

Parallel Computing Lab, Intel Corporation, Portland, Oregon, United States

11:30am-12:00pm Panel discussion: 'Choosing Between Careers in Academia vs. Industry'
PRES: Is There a Crisis in Organic Chemistry Education? 9:00am - 11:50am
Monday, March 14
Room 2 - San Diego Convention Center
Debbie Crans, Donna Nelson, Organizing
Melanie Cooper, Donna Nelson, Presiding
Cosponsored by: BIOL, CELL, CHED, CINF, DAC, GEOC, I&EC, INOR, MEDI, ORGN, POLY and PROF
9:00am-9:15am PRES 50: Introduction: Evaluating organic chemistry textbooks
Donna Nelson, djnelson@ou.edu

University of Oklahoma, Norman, Oklahoma, United States

9:15am-9:30am PRES 51: Cengage: Is the organic chemistry course changing in reaction to the new MCAT?
Maureen Rosener, maureen.rosener@cengage.com

Cengage Learning, Boston, Massachusetts, United States

9:30am-9:45am PRES 52: Elsevier: Is there a crisis in organic chemistry education?
Kathleen Birtcher, k.birtcher@elsevier.com

Science & Technology Books, Elsevier, Cambridge, Massachusetts, United States

9:45am-10:00am PRES 53: McGraw-Hill: Adapting to the modern organic chemistry student
Andrea Pellerito, andrea_pellerito@mcgraw-hill.com

McGraw-Hill, Dubuque, Iowa, United States

10:00am-10:15am PRES 54: Macmillan: How can a publisher partner with and support faculty in times of curriculum change in organic chemistry.
Lauren Schultz, lauren.schultz@macmillan.com

Life and physical sciences, Macmillan Publishers, New York, New York, United States

10:15am-10:30am PRES 55: Pearson: Future of teaching organic chemistry
Jeanne Zalesky, jeanne.zalesky@ablongman.com

Pearson Education, Newton, Massachusetts, United States

10:30am-10:45am PRES 56: Wiley: How will/does technology change the classroom
Sean Hickey, shickey@wiley.com

Chemistry, John Wiley & Sons, Hoboken, New Jersey, United States

10:45am-10:50am Remarks and Structure - Donna Nelson
10:50am-11:20am Panel Discussion
MPPG: Preparing for the Real World: Challenges Faced by Young Investigators 1:00pm - 2:30pm
Monday, March 14
Room 4 - San Diego Convention Center
Whitney Kellett, Benjamin Levine, Kenneth Merz, Sereina Riniker, Dominika Zgid, Organizing
Benjamin Levine, Presiding
Cosponsored by: CHED, CINF, COMP, PHYS and YCC
1:00pm-1:15pm MPPG 32: Doing theory with undergraduates and having a great time
Robert Cave, robert_cave@hmc.edu

Department of Chemistry, Claremont, California, United States

1:15pm-1:30pm MPPG 33: Building an undergraduate research program at a large, comprehensive university
Maria-Clelia Milletti, mmilletti@emich.edu

Chemistry, Eastern Michigan University, Detroit, Michigan, United States

1:30pm-1:45pm MPPG 34: Running a productive lab where students are transformed and you actually publish
George Shields, george.shields@bucknell.edu

Deans Office, 113 Marts Hall, Bucknell University, Lewisburg, Pennsylvania, United States

1:45pm-2:00pm MPPG 35: Building a new research program in medicinal chemistry at a small liberal arts college
Amy Dounay, amy.dounay@coloradocollege.edu

Chemistry and Biochemistry, Colorado College, Colorado Springs, Colorado, United States

2:00pm-2:30pm Panel Discussion: 'Building a Research Program at a Primarily Undergraduate Institution'
SCHB: Computers in Chemistry: Bridging the Gap between Clients & Software 1:05pm - 5:00pm
Monday, March 14
Santa Rosa - Marriott Marquis San Diego Marina
M.C. Johnson, Organizing
M.C. Johnson, Presiding
Cosponsored by: CINF and ORGN
1:05pm-1:10pm Introductory Remarks
1:10pm-1:40pm SCHB 8: Connecting the needs of the customer with what a small chemical software company has to offer
M Catherine Johnson, mcjohnson@inchemdesign.com, John Clark, Cliff Cannon

Integrated Chemistry Design, San Diego, California, United States

1:40pm-2:10pm SCHB 9: Perspectives on selling custom software development services to R&D scientists in large organizations
Eric Milgram, Eric.Milgram@pepsico.com

Applied Scientific Consulting, Sandy Hook, Connecticut, United States

2:10pm-2:40pm SCHB 10: Sometimes the mountain has to move…but you cannot let it realise it’s happening
Edmund Champness2, ed.champness@optibrium.com, Matthew Segall1

1 R&D, Optibrium Limited, Cambridgeshire, United Kingdom; 2 Optibrium Ltd, Cambridge, United Kingdom

2:40pm-3:10pm SCHB 11: Vendors are from Venus, clients are from Mars: How to build a successful partnership
Christopher Waller, chris_l_waller@hotmail.com

MRLIT, Merck and Co. (MSD), Boston, Massachusetts, United States

3:10pm-3:25pm Intermission
3:25pm-3:55pm SCHB 12: Creative market solutions from customer requests: Simple ideas can lead to big products
Tim Cheeseright1, tim@cresset-bmd.com, Robert Scoffin2

1 Cresset, Cambridgeshire, United Kingdom; 2 Cresset BMD Ltd, Cambridgeshire, United Kingdom

3:55pm-4:25pm SCHB 13: Enabling large-scale ligand discovery on the cloud
Paul Hawkins, phawkins@eyesopen.com

OpenEye Scientific Software, Santa Fe, New Mexico, United States

4:25pm-4:55pm SCHB 14: From CDD vault, CDD vision to CDD models: Software for biologists and chemists doing drug discovery
Sean Ekins, ekinssean@yahoo.com, Barry Bunin

Collaborative Drug Discovery, Inc, Burlingame, California, United States

4:55pm-5:00pm Concluding Remarks
PRES: Diversity-Quantification-Success? 1:30pm - 4:00pm
Monday, March 14
Room 2 - San Diego Convention Center
Debbie Crans, Donna Nelson, Organizing
Elizabeth Nalley, Presiding
Cosponsored by: BIOL, CELL, CHED, CINF, COLL, COMSCI, DAC, GEOC, I&EC, INOR, MEDI, ORGN, PHYS, POLY, PROF and WCC
1:30pm-1:45pm PRES 65: Introduction: Diversity strengthening STEM education
Elizabeth Nalley1, annn@cameron.edu, Donna Nelson2

1 Chemistry, Physics & Engineering, Cameron University, Lawton, Oklahoma, United States; 2 University of Oklahoma, Norman, Oklahoma, United States

1:45pm-2:00pm PRES 66: A decade of tracking demographics in the Top 50 Chemistry Departments via the Nelson Diversity Surveys
Donna Nelson, djnelson@ou.edu

University of Oklahoma, Norman, Oklahoma, United States

2:00pm-2:15pm PRES 67: Accelerating change: #DiversitySolutions on social media
Dontarie Stallings2,1, DontarieStallings@gmail.com, Rigoberto Hernandez2,1

1 Chemistry, Georgia Institute of Technology, Atlanta, Georgia, United States; 2 OXIDE, Atlanta, Georgia, United States

2:15pm-2:30pm PRES 68: Progress made in smashing the glass ceiling
Valerie Kuck, vjkuck@yahoo.com

Polymeric Materials, Bell Labs, Lucent Technologies (ret.), Poway, California, United States

2:30pm-2:45pm PRES 69: Critical mass takes courage: Diversity in the chemical sciences
Sibrina Collins, sibrina.collins@gmail.com

The Charles H. Wright Museum of African American History, Detroit, Michigan, United States

2:45pm-3:00pm PRES 70: The challenges facing women in chemistry and other scientific and engineering fields
Madeleine Jacobs, madeleine.susan.jacobs@gmail.com

Council of Scientific Society Presidents, North Potomac, Maryland, United States

3:00pm-3:15pm PRES 71: Demographics of research-active chemistry departments
Rigoberto Hernandez, hernandez@gatech.edu, Dontarie Stallings, Srikant Iyer

School of Chemistry & Biochemistry, MC0400, Georgia Institute of Technology, Atlanta, Georgia, United States

3:15pm-4:00pm Panel Discussion
MPPG: Computer-Aided Drug Design 8:00am - 12:00pm
Tuesday, March 15
Room 4 - San Diego Convention Center
Rommie Amaro, M. Holloway, Johanna Jansen, Organizing
Clara Christ, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
8:00am-8:30am MPPG 49: Binding affinity prediction from molecular simulations – a new standard method in structure-based drug design?
Clara Christ, c.christ@gmx.de

Bayer Pharma AG, Berlin, Germany

8:30am-9:00am MPPG 50: Improving and applying alchemical binding free energy calculations
David Mobley, dmobley@gmail.com

Pharmaceutical Sciences, University of California, Irvine, Irvine, California, United States

9:00am-9:40am MPPG 51: Incorporating changes in protein-ligand hydration in free energy calculations
Jonathan Essex, jwe1@soton.ac.uk, Gregory Ross

Chemistry, University of Southampton, Southampton, United Kingdom

9:40am-10:20am MPPG 52: Real-world impact of free energy perturbation
Mark Murcko, mark_murcko@comcast.net

Disruptive Biomedical, LLC, Holliston, Massachusetts, United States

10:20am-10:40am Intermission – Café con Ordenadores
10:40am-11:20am MPPG 53: Attempts to improve free-energy simulation binding-affinity estimates by quantum-mechanical methods
Ulf Ryde, Ulf.Ryde@teokem.lu.se

Lund University, Lund, Sweden

11:20am-12:00pm MPPG 54: Improving kinase inhibitor selectivity with free energy perturbation molecular dynamics simulations
Benoit Roux2, roux@uchicago.edu, Yilin Meng1

1 Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois, United States; 2 Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, United States

MPPG: Computer-Aided Drug Design 8:20am - 11:50am
Tuesday, March 15
Room 5A - San Diego Convention Center
Rommie Amaro, M. Holloway, Johanna Jansen, Organizing
M. Holloway, Charles Reynolds, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
8:20am-8:50am MPPG 55: Computer-aided drug design: Successes and opportunities
M. Holloway2, kate_holloway@merck.com, Charles Reynolds1, creynolds@gfreebio.com

1 Gfree Bio, LLC, Lansdale, Pennsylvania, United States; 2 Structural Chemistry, Merck & Co, Lansdale, Pennsylvania, United States

8:50am-9:30am MPPG 56: Cheminformatics: Past, present, future
Frank Brown1, frank.brown@merck.com, Huijun Wang2

1 Structural Chemistry, Merck & Co., West Point, Pennsylvania, United States; 2 Structural Chemistry, Merck & Co., Kenilworth, New Jersey, United States

9:30am-10:10am MPPG 57: Docking and scoring: A perspective on exploiting protein structures for CADD
Ajay Jain, ajain@jainlab.org

Bioengineering and Therapeutic Sciences, UCSF, San Francisco, California, United States

10:10am-10:30am Intermission – Café con Ordenadores
10:30am-11:10am MPPG 58: Current issues with computer-aided lead optimization
William Jorgensen, william.jorgensen@yale.edu

Dept of Chemistry, Yale University, New Haven, Connecticut, United States

11:10am-11:50am MPPG 59: Computer-aided drug design: Looking forward
Catherine Peishoff, catherine.e.peishoff@gsk.com

Chemical Sciences, GlaxoSmithKline, Collegeville, Pennsylvania, United States

MPPG: Computer-Aided Drug Design 1:00pm - 4:40pm
Tuesday, March 15
Room 5A - San Diego Convention Center
Rommie Amaro, M. Holloway, Johanna Jansen, Organizing
Veerabahu Shanmugasundaram, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
1:00pm-1:40pm MPPG 71: Enthalpy good, entropy bad? What can we learn from protein-ligand binding thermodynamic signatures?
David Hepworth, david.hepworth@pfizer.com

Worldwide Medicinal Chemistry, Pfizer Worldwide Research and Development, Cambridge, Massachusetts, United States

1:40pm-2:20pm MPPG 72: Plumbing the depths of entropy and enthalpy in molecular recognition
Michael Gilson1, mgilson@ucsd.edu, Andrew Fenley1, Samuel Kantonen1, Hari Muddana2, Michael Potter3, Simon Webb4

1 School of Pharmacy and Pharmaceutical Sci., U. C. San Diego, La Jolla, California, United States; 1 School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, California, United States; 2 UCSD, La Jolla, California, United States; 3 VeraChem LLC, Germantown, Maryland, United States

2:20pm-3:00pm MPPG 73: Ins and outs of binding: Why dynamic drug-target occupancy relationships matter in the in vivo setting
Jose Duca1, jose.duca@novartis.com, Robert Pearlstein2

1 Novartis Institutes for BioMedical Research, Hopkinton, Massachusetts, United States; 2 Novartis, Cambridge, Massachusetts, United States

3:00pm-3:20pm Intermission
3:20pm-4:00pm MPPG 74: Kinetic stability of protein-ligand complexes: Applications in virtual screening
4:00pm-4:40pm MPPG 75: Water: A small but revolutionary molecule that together with GPCR X-ray structures enables new design approaches for kinetics, selectivity and potency
Jonathan Mason, jonathan.mason@heptares.com, Andrea Bortolato, Dahlia Weiss, Francesca Deflorian

Heptares Therapeutics Ltd, Welwyn Garden City, United Kingdom

MPPG: Computer-Aided Drug Design 8:00am - 11:35am
Wednesday, March 16
Room 5A - San Diego Convention Center
Rommie Amaro, M. Holloway, Johanna Jansen, Organizing
Vijay Pande, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
8:00am-8:40am MPPG 86: In silico fragment based drug discovery by molecular simulations
Gianni De Fabritiis, g.defabritiis@gmail.com

UPF, Barcelona, Spain

8:40am-9:20am MPPG 87: Redesigning drug design
John Chodera, john.chodera@choderalab.org

Computational Biology Program, Memorial Sloan Kettering Cancer Center, New York, New York, United States

9:20am-10:00am MPPG 88: Can molecular dynamics simulations cure what ails ya?
David Shaw1,2, david@DEShawResearch.com

1 D. E. Shaw Research, New York, New York, United States; 2 Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States

10:00am-10:15am Intermission – Café con Ordenadores
10:15am-10:55am MPPG 89: Future of molecular dynamics simulation
Vijay Pande, pande@stanford.edu

Stanford University, Stanford, California, United States

10:55am-11:35am MPPG 90: Allostery through the computational microscope: Conformational selection in a canonical signaling domain
Rommie Amaro2, ramaro@ucsd.edu, Robert Malmstrom3, Alexandr Kornev4, Susan Taylor1

1 Leichtag 412 MC 0654, Univ of California, La Jolla, California, United States; 2 Chemistry and Biochemistry, University of California, San Diego, San Clemente, California, United States; 3 Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, United States; 4 UC San Diego, La Jolla, California, United States

ANYL: Big Data & Small Data 8:10am - 11:50am
Wednesday, March 16
East Coast - Wyndham San Diego Bayside
Barry Lavine, Organizing
Barry Lavine, Presiding
Cosponsored by: CINF and MPPG
8:10am-8:15am Introductory Remarks
8:15am-8:40am ANYL 281: Classification and geospatial estimation of titanium dioxide polymorphs using multivariate exploratory methods
Joseph Smith1, Frank Smith2, Billy Glass2, Karl Booksh1, kbooksh@udel.edu

1 Chemistry and Biochemistry, The University of Delaware, Newark, Delaware, United States; 2 Geological Sciences, The University of Delaware, Newark, Delaware, United States

8:40am-9:05am ANYL 282: Infrared imaging and multivariate curve resolution for the forensic examination of automotive paint chips
Barry Lavine1, Matthew Allen1, matthew.d.allen@okstate.edu, Nuwan Perera2, Koichi Nishikida3

1 Oklahoma State University, Stillwater, Oklahoma, United States; 2 Chemistry, Oklahoma State University, Stillwater, Oklahoma, United States; 3 Materials Science Center, College of Engineering, University of Wisconsin, Madison, Wisconsin, United States

9:05am-9:30am ANYL 283: Quality assessments for organically-complex botanical extracts
Brian Rohrback2, brian_rohrback@infometrix.com, Scott Ramos1, Peter Gibson3

1 Infometrix Inc, Bothell, Washington, United States; 2 Infometrix, Inc., Bothell, Washington, United States; 3 GW Pharmaceuticals, Sittingbourne, Kent, United Kingdom

9:30am-10:10am ANYL 284: Pattern recognition assisted infrared library searching of automotive paints for forensic analysis
Barry Lavine1, bklab@chem.okstate.edu, Matthew Allen2, Collin White1, Ayuba Fasasi3

1 Oklahoma State University, Stillwater, Oklahoma, United States; 2 Chemistry, Oklahoma State University, Stillwater, Oklahoma, United States

10:10am-10:30am Intermission
10:30am-10:55am ANYL 285: Withdrawn
10:55am-11:20am ANYL 286: Withdrawn
11:20am-11:45am ANYL 287: Data processing challenges in single neuron whole genome sequencing
Suzanne Rohrback1, suzanne.rohrback@gmail.com, Jerold Chun2

1 Biomedical Sciences, UC San Diego, San Diego, California, United States; 2 Molecular Neuroscience, The Scripps Research Institute, San Diego, California, United States

11:45am-11:50am Concluding Remarks
ANYL: Chemical Imaging: Applications, Advances & Challenges 8:10am - 11:50am
Wednesday, March 16
Bay Room - Wyndham San Diego Bayside
Raychelle Burks, Jeffrey Terry, Organizing
Jeffrey Terry, Presiding
Cosponsored by: CINF and MPPG
8:10am-8:50am ANYL 298: Nanoscience approaches to heterogeneity in biological systems
Paul Weiss, psw@cnsi.ucla.edu

MC 722710, California NanoSystems Inst. UCLA, Los Angeles, California, United States

8:50am-9:10am ANYL 299: Rapid-target bio-imaging of tumors through specific biosynthesis of fluorescent probes
Jing Ye, jing832808@163.com, Jianling Wang, Shengping Gao, Xuemei Wang

School of Biological Science and Medical Engineering, Southeast University, Nanjing, China

9:10am-9:30am Intermission
9:30am-9:50am ANYL 300: High-throughput screening method for creating and assessing ionic liquid/porous silicon microarrays
Shruti Trivedi, shrutitrivedi.bhu@gmail.com, Frank Bright

Chemistry, SUNY-Buffalo, Buffalo, New York, United States

9:50am-10:10am ANYL 301: Micro-Raman analysis of crayfish exoskeleton mineralization using a newly released spectroscopic imaging software
Seth Brittle2, brittle.3@wright.edu, Daniel Foose2, Kevin O'Neil1, Zofia Gagnon4, Ioana Pavel Sizemore3

2 Chemistry, Wright State University, Dayton, Ohio, United States; 4 Environmental Science and Policy, Marist College, Poughkeepsie, New York, United States

10:10am-10:30am ANYL 302: Raman microspectroscopic mapping with multivariate curve resolution-alternating least squares (MCR-ALS) applied to a high-pressure polymorph of titanium dioxide, TiO2-II
Joseph Smith1, joesmith@udel.edu, Frank Smith2, Billy Glass2, Karl Booksh1

1 Chemistry and Biochemistry, The University of Delaware, Newark, Delaware, United States; 2 Geological Sciences, The University of Delaware, Newark, Delaware, United States

10:30am-10:50am Intermission
10:50am-11:30am ANYL 303: Multidimensional imaging and computational approaches to understanding tissue morphogenesis
Kristen Kwan1, kmkwan@genetics.utah.edu, Yong Wan2, Charles Hansen2, Hannah Gordon1, Sydney Stringham1, Brooke Froelich1

1 Human Genetics, University of Utah, Salt Lake City, Utah, United States; 2 Computer Science, University of Utah, Salt Lake City, Utah, United States

11:30am-11:50am ANYL 304: Nanospectral imaging and nanospectroscopy via photo-induced force
Derek Nowak, derek.b.nowak@gmail.com, William Morrison, Sung Park

Molecular Vista, San Jose, California, United States

CHAS: Chemical, Sample & Asset Management Tools 9:00am - 12:05pm
Wednesday, March 16
Marina Room - Hilton Gaslamp San Diego
Leah McEwen, Joseph Pickel, Ralph Stuart, Organizing
Leah McEwen, Joseph Pickel, Ralph Stuart, Presiding
Cosponsored by: CCS and CINF
9:00am-9:10am Introductory Remarks
9:10am-9:35am CHAS 30: Chemical inventories: What are they good for?
Ralph Stuart, secretary@dchas.org

Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States

9:35am-10:00am CHAS 31: How UNHCEMS® has evolved from a chemical inventory tracking system to an environmental management tool
Karrie Myer, Karrie.Myer@unh.edu, Phillip Collins, Philip.Collins@unh.edu, Andy Glode, andy.glode@unh.edu

University of New Hampshire, Durham, New Hampshire, United States

10:00am-10:25am CHAS 32: Use of RFID and scanning technologies for managing large chemical Inventories
Joseph Pickel, pickeljm@ornl.gov

Oak Ridge National Laboratory, Knoxville, Tennessee, United States

10:25am-10:50am Intermission
10:50am-11:15am CHAS 33: Developing a cloud based chemical inventory application for the University of California system (UC Chemicals)
Haim Weizman, hweizman@ucsd.edu

Dept Chem Biochem, Univ of California San Diego, La Jolla, California, United States

11:15am-11:40am CHAS 34: Using a chemical inventory system to optimize safe laboratory research
Grace Baysinger2, graceb@stanford.edu, R. Kevin Creed3, Lawrence Gibbs1

1 Stanford Univ, Palo Alto, California, United States; 2 Swain Chem & Chem Eng Library, Stanford University Libraries, San Jose, California, United States; 3 Stanford University, Stanford, California, United States

11:40am-12:05pm CHAS 35: Chemical stockroom management: Lessons learned ten years in
Samuella Sigmann, sigmannsb@appstate.edu

Chemistry, Appalachian State University, Boone, North Carolina, United States

ANYL: Chemical Imaging: Applications, Advances & Challenges 1:00pm - 5:00pm
Wednesday, March 16
Bay Room - Wyndham San Diego Bayside
Raychelle Burks, Jeffrey Terry, Organizing
Raychelle Burks, Presiding
Cosponsored by: CINF and MPPG
1:00pm-1:40pm ANYL 323: Emergent structure and dynamics of patchy coarse-grained nanoparticles
Rigoberto Hernandez, hernandez@gatech.edu

School of Chemistry Biochemistry, MC0400, Georgia Institute of Technology, Atlanta, Georgia, United States

1:40pm-2:00pm ANYL 324: Highly sensitive detection and bio-imaging of cancers based on new supramolecular probes and multifunctional nano-interface
Xuemei Wang, xuewang@seu.edu.cn

Southeast University, Nanjing, China

2:00pm-2:20pm ANYL 325: Investigation of surface morphology and conductance of multi-acid side chain membranes by atomic force microscopy
Austin Barnes, abarnes@chem.ucsb.edu, Nicholas Economou, Steven Buratto

Chemistry, University of California, Santa Barbara, Santa Barbara, California, United States

2:20pm-2:40pm Intermission
2:40pm-3:00pm ANYL 326: Withdrawn
3:00pm-3:20pm ANYL 327: Plasmonic nanofocusing NSOM-Raman tip for high resolution chemical imaging
Ruoxue Yan, rxyan@engr.ucr.edu

Department of Chemical and Environmental Engineering, UC Riverside, Riverside, California, United States

3:20pm-3:40pm ANYL 328: Tip enhanced Raman scattering: New nanoscale chemical imaging method
Andrey Krayev1, akrayev@aist-nt.com, Marc Chaigneau2

1 AIST-NT Inc, Novato, California, United States; 2 Horiba Scientific, Palaiseau, France

3:40pm-4:00pm Intermission
4:00pm-4:20pm ANYL 329: Evaluating small molecule histone inhibitors with high resolution mass spectrometry and 3D cell cultures
Amanda Hummon1, ahummon@nd.edu, Benjamin Garcia3, Simone Sidoli3, Monica Schroll1, Xin Liu2, Peter Feist1

1 Univ of Notre Dame, Notre Dame, Indiana, United States; 2 University of Notre Dame, Notre Dame, Indiana, United States; 3 Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, Pennsylvania, United States

4:20pm-4:40pm ANYL 330: Human islet amyloid polypeptide N-terminus fragment self-assembly: Effect of conserved disulfide bond on aggregation propensity
Maxwell Giammona4, mgiammona@chem.ucsb.edu, Alexandre Ilitchev4, Thanh Do6, Joan Shea2, Daniel Raleigh1, Michael Bowers3, Steven Buratto5

1 SUNY Stony Brook, Stony Brook, New York, United States; 2 U of Cal Santa Barbara, Santa Barbara, California, United States; 3 Univ of Californ Santa Barbara, Santa Barbara, California, United States; 4 Chemistry and Biochemistry, University of California Santa Barbara, Pleasanton, California, United States; 6 Chemistry and Biochemistry, University of California Santa Barbara, Santa Barbara, California, United States

4:40pm-5:00pm ANYL 331: Nanoscale chemical mapping of polymer matrix composites
Dhriti Nepal, dhriti.nepal@gmail.com

Materials and Manufacturing Directorate, Air Force Research Lab, Wright Patterson AFB, Ohio, United States

MPPG: Computer-Aided Drug Design 1:00pm - 4:45pm
Wednesday, March 16
Room 5A - San Diego Convention Center
Rommie Amaro, M. Holloway, Johanna Jansen, Organizing
Darrin York, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
1:00pm-1:40pm MPPG 101: Structure guided design of nucleic acid modifications for antisense drug discovery
Punit Seth, pseth100@yahoo.com

Medicinal Chemistry, Isis Pharmaceuticals, Carlsbad, California, United States

1:40pm-2:20pm MPPG 102: High-throughput platform assay technology for the discovery of pre-microRNA-selective small molecule probes
Amanda Garner, algarner@umich.edu

Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan, United States

2:20pm-3:00pm MPPG 103: Exploiting the ribosome and RNA: Small-molecule interactions for a pipeline of new antibiotics
Erin Duffy, eduffy@rib-x.com

Melinta Therapeutics, Inc, New Haven, Connecticut, United States

3:00pm-3:15pm Intermission
3:15pm-4:00pm MPPG 104: DNA and RNA in multi-target drug design for the microsatellite disease myotonic dystrophy
Steven Zimmerman, sczimmer@illinois.edu, Long Luu, Lien Nguyen, Julio Serrano, Juyeon Lee

Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States

4:00pm-4:45pm MPPG 105: Light at the end of the tunnel in modeling RNA structure, dynamics and interactions
Thomas Cheatham, tec3@utah.edu

Dept of Med Chem Skaggs 307, University of Utah, Salt Lake City, Utah, United States

ANYL: Big Data & Small Data 1:10pm - 4:30pm
Wednesday, March 16
East Coast - Wyndham San Diego Bayside
Barry Lavine, Organizing
Barry Lavine, Presiding
Cosponsored by: CINF and MPPG
1:10pm-1:15pm Introductory Remarks
1:15pm-1:40pm ANYL 305: Ranking multivariate calibration models formed from multiple tuning parameters: Model penalties
John Kalivas, kalijohn@isu.edu, Alister Tencate

Idaho State Univ, Pocatello, Idaho, United States

1:40pm-2:05pm ANYL 306: Adaptive regression via subspace elimination: A novel algorithm for predicting in the presence of uncalibrated interferents
Joshua Ottaway2, Jottaway@udel.edu, Karl Booksh1

1 University of Delaware, Newark, Delaware, United States; 2 Chemistry and Biochemistry, University of Delaware, Newark, Delaware, United States

2:05pm-2:30pm ANYL 307: Compensating for the effects of unusual samples and variables in data for multivariate calibrations
Steven Brown, sub@udel.edu, Cannon Giglio

Chemistry and Biochemistry, University of Delaware, Newark, Delaware, United States

2:30pm-2:50pm Intermission
2:50pm-3:15pm ANYL 308: Development of a predictive screening method for selection of two-dimensional liquid chromatography column pair combinations
Rebecca Lindsey1, r.lindsey78@gmail.com, Dwight Stoll2, Peter Carr3, Joern Siepmann1,4

1 Chemistry and the Chemical Theory Center, University of Minnesota, Minneapolis, Minnesota, United States; 2 Chemistry, Gustavus Adolphus College, Saint Peter, Minnesota, United States; 3 Chemistry, University of Minnesota, Minneapolis, Minnesota, United States; 4 Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota, United States

3:15pm-3:40pm ANYL 309: Discovery-based analysis of GC x GC - TOFMS data using tile-based Fisher ratio software and combinatorial threshold determination
Robert Synovec, synovec@chem.washington.edu, Brendon Parsons, Nathanial Watson, Brooke Reaser, Christopher Freye, David Pinkerton

Department of Chemistry, University of Washington, Seattle, Washington, United States

3:40pm-4:05pm ANYL 310: Using multidimensional data to simplify the analysis of individual lipoprotein and cholesterol distributions
Michael Eagleburger, Jason Cooley, Renee Jiji, jijir@missouri.edu

Chemistry, University of Missouri, Columbia, Missouri, United States

4:05pm-4:30pm ANYL 311: Surface-enhanced Raman spectroscopy study of the interaction between colloidal silver nanoparticles and Dengue virus virions: Unsupervised automated peak detection and quantification using a newly released spectroscopic imaging software
Daniel Foose1, dpfoose@gmail.com, Sesha Paluri1, Kelley Williams3, Kevin Dorney1, Catherine Anders1, Nancy Bigely2, Ioana Pavel Sizemore1

1 Chemistry, Wright State University, Dayton, Ohio, United States; 2 Neuroscience, Cell Biology and Physiology, Wright State University, Dayton, Ohio, United States; 3 Pharmacology & Toxicology, Wright State University, Dayton, Ohio, United States

CHAS: Chemical, Sample & Asset Management Tools 1:30pm - 4:35pm
Wednesday, March 16
Marina Room - Hilton Gaslamp San Diego
Leah McEwen, Joseph Pickel, Ralph Stuart, Organizing
Leah McEwen, Joseph Pickel, Ralph Stuart, Presiding
Cosponsored by: CCS and CINF
1:30pm-1:40pm Introductory Remarks
1:40pm-2:05pm CHAS 36: UC safety: An integrated approach to your chemical management needs
Safa Hussain1, smhussain@ucdavis.edu, Ken Smith2, ken.smith@ucop.edu

1 UC Risk and Safety Solutions, University of California, Davis, California, United States; 2 EH&S, University of California, Carlsbad, California, United States

2:05pm-2:30pm CHAS 37: Targeted safety assessments through technology
James Crandall, james_crandall@hotmail.com

Environmental Health and Safety, Weill Cornell Medicine, New York, New York, United States

2:30pm-2:55pm CHAS 38: Withdrawn
2:55pm-3:20pm Intermission
3:20pm-3:45pm CHAS 39: PubChem’s laboratory chemical safety summary (LCSS)
Sunghwan Kim1, kimsungh@ncbi.nlm.nih.gov, Jian Zhang1, Asta Gindulyte1, Paul Thiessen1, Leah McEwen2, Ralph Stuart3, Evan Bolton1, Steve Bryant1

1 National Library of Medicine, National Institutes of Health, Rockville, Maryland, United States; 2 Clark Physical Sciences Library, Cornell University, Ithaca, New York, United States; 3 Keene State College, Keene, New Hampshire, United States

3:45pm-4:10pm CHAS 40: Socio-legal issues in the application of semantic web technology to chemical safety
Jeremy Frey, j.g.frey@soton.ac.uk, Mark Borkum

University of Southampton, Southampton, United Kingdom

4:10pm-4:35pm CHAS 41: Pre-competitive collaboration to advance laboratory safety
Carmen Nitsche, cnitsche@swbell.net

Pistoia Alliance, San Antonio, Texas, United States

MPPG: Computer-Aided Drug Design 8:00am - 12:00pm
Thursday, March 17
Room 5A - San Diego Convention Center
Rommie Amaro, M. Holloway, Johanna Jansen, Organizing
Jose Duca, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
8:00am-8:40am MPPG 116: Withdrawn
8:40am-9:10am MPPG 117: Structure-based design of inhibitors of the riboflavin pathway targeting the bacterial FMN riboswitch
Thierry Fischmann, thierry.fischmann@merck.com

Merck Research Laboratories, Kenilworth, New Jersey, United States

9:10am-9:40am MPPG 118: Boosting antibody developability through computational protein design
Qing Chai, qingchai@lilly.com

Eli Lilly, San Diego, California, United States

9:40am-9:50am Intermission – Café con Ordenadores
9:50am-10:30am MPPG 119: Peptide drug hunter: Exploring intracellular target space and druggability
Tomi Sawyer, tomi.sawyer@merck.com

Merck Research Laboratories, Boston, Massachusetts, United States

10:30am-11:00am MPPG 120: Enhanced sampling methods in drug design
Adrian Roitberg, roitberg@ufl.edu

Chemistry, University of Florida, Gainesville, Florida, United States

11:00am-11:30am MPPG 121: Withdrawn
11:30am-12:00pm MPPG 122: Unnatural DNA aptamers and the potential to generate unique macromolecular targeting modalities
Glen Spraggon1, gspraggon@yahoo.com, Lori Jennings2, Darbi Witmer1, Andreas Kreusch1, Badry Bursulaya1, Jennifer Shaffer1, David Jones1, Susanne Swalley2, Scott Clarkson2, Mark Knuth1, Scott Lesley1

1 Protein Science and Biotherapeutics, GNF, La Jolla, California, United States; 2 Developmental and Molecular Pathways, Novartis Institute for Biomedical Research, Cambridge, Massachusetts, United States

ANYL: Chemical Imaging: Applications, Advances & Challenges 8:30am - 11:50am
Thursday, March 17
West Coast - Wyndham San Diego Bayside
Raychelle Burks, Jeffrey Terry, Organizing
Raychelle Burks, Presiding
Cosponsored by: CINF and MPPG
8:30am-9:10am ANYL 340: Imaging the mobility of Ag films encapsulated in 3C-SiC as a function of annealing temperature
Daniel Velazquez, dvelazqu@hawk.iit.edu

Department of Physics, Illinois Institute of Technology, Chicago, Illinois, United States

9:10am-9:30am ANYL 341: Open plans of a low cost fluorescence and imaging ellipsometry microscope
Victoria Nguyen1, victoria.nguyen300@gmail.com, John Rizzo2, Jacquelyn Zehner3, Walter Cook4, Babak Sanii5

1 Keck Science Department, Scripps College, Claremont, California, United States; 2 Keck Science Department, Claremont McKenna College, Claremont, California, United States; 4 Keck Science Department, Claremont, California, United States; 5 Keck Science Department, Chemistry, Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, United States

9:30am-9:50am ANYL 342: In-cell fluorogenic tag-probe system for protein localization and dynamics imaging
Wataru Nomura, nomura.mr@tmd.ac.jp, Nami Ohashi, Hirokazu Tamamura

Institute of Biomaterials Bioengineering, Tokyo Medical Dental University, Tokyo, Japan

9:50am-10:10am Intermission
10:10am-10:50am ANYL 343: 3D imaging of cells with soft x-rays
Carolyn Larabell1,2, carolyn.larabell@ucsf.edu, Gerry McDermott1,2, Mark LeGros1,2

1 Department of Anatomy, University of California San Francisco, San Francisco, California, United States; 2 Molecular Biophysics & Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California, United States

10:50am-11:10am ANYL 344: PCA-based method for identifying spectra of different wood cell wall layers in Raman imaging data set and its applications
Xun Zhang1,2, zhangxunyy@bjfu.edu.cn, Feng Xu1,2

1 College of Material Science and Technology, Beijing Forestry University, Beijing, China; 2 Beijing Key Laboratory of Lignocellulosic Chemistry, Beijing Forestry University, Beijing, China

11:10am-11:30am ANYL 345: Early brain tumor detection by chemical imaging of deoxyhemoglobin
Chencai Wang, wangchencai@gmail.com, Chao-Hsiung Hsu, Zhao Li, Yung-Ya Lin

Chemistry and Biochemistry, UCLA, Los Angeles, California, United States

11:30am-11:50am ANYL 346: Deep and high-resolution three-dimensional tracking of single particles using nonlinear and multiplexed illumination
Evan Perillo, Yen-Liang Liu, Cong Liu, Andrew Dunn, Tim Yeh, tim.yeh@austin.utexas.edu

Biomedical Engineering Department, University of Texas at Austin, Austin, Texas, United States

MPPG: Big Data Science 8:30am - 12:00pm
Thursday, March 17
Room 3 - San Diego Convention Center
Victoria Feher, John Irwin, Brian Shoichet, Alexander Tropsha, Organizing
Victoria Feher, John Irwin, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
8:30am-9:00am MPPG 106: Mining the chemical universe database GDB-17 for drug discovery
Jean-Louis Reymond, jean-louis.reymond@ioc.unibe.ch

Chem Dept Univ of Bern, Bern, Switzerland

9:00am-9:30am MPPG 107: Enamine REAL DataBase – an instrumental and practical vehicle for charting new regions of the relevant drug discovery chemical space
Yurii Moroz2, mys@univ.kiev.ua, Alexander Chuprina1, Dmytro Mykytenko1

1 Enamine Ltd., Kyiv, Ukraine; 2 National T. Shevchenko University of Kiev, Kiev, Ukraine

9:30am-10:00am MPPG 108: Ligand discovery using big data with ZINC
John Irwin, jji@cgl.ucsf.edu

Pharmaceutical Chemistry, University of California San Francisco, San Rafael, California, United States

10:00am-10:30am Intermission – Café con Ordenadores
10:30am-11:00am MPPG 109: How to use 797,834 small molecule crystal structures
Erin Davis2, erinsdavis@gmail.com, Colin Groom1, Suzanna Ward1, Ian Bruno1, Amy Sarjeant2

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 Cambridge Crystallographic Data Centre, Piscataway, New Jersey, United States

11:00am-11:30am MPPG 110: Small-molecule ligand/drug representation and validation in the Protein Data Bank
Stephen Burley1,2, sburley@proteomics.rutgers.edu

1 RCSB Protein Data Bank, Rutgers University, New Brunswick, New Jersey, United States; 2 San Diego Supercomputer Center, University of California-San Diego, San Diego, California, United States

11:30am-12:00pm MPPG 111: Drug design data resource: Leveraging blinded datasets for improved docking methodologies and workflows
Victoria Feher2,3, vickiafeher@yahoo.com, Rommie Amaro2,3, Michael Gilson1,3

1 School of Pharmacy and Pharmaceutical Sci., U. C. San Diego, La Jolla, California, United States; 2 Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States; 3 Drug Design Data Resource, UC San Diego, La Jolla, California, United States

MPPG: Big Data Science 1:30pm - 4:30pm
Thursday, March 17
Room 3 - San Diego Convention Center
Victoria Feher, John Irwin, Brian Shoichet, Alexander Tropsha, Organizing
Victoria Feher, John Irwin, Presiding
Cosponsored by: BIOL, CINF, COMP, MEDI and PHYS
1:30pm-4:30pm MPPG 123: Withdrawn
1:30pm-2:00pm MPPG 124: Influence of data curation on QSAR Modeling – examining issues of quality versus quantity of data
Kamel Mansouri3, Christopher Grulke2, Ann Richard1, richard.ann@epa.gov, Antony Williams1

2 Zachary Piper Solutions, New Hill, North Carolina, United States; 3 National Center for Computational Toxicology, US EPA, Research Triangle Park, North Carolina, United States

2:00pm-2:30pm MPPG 125: What do open databases have to offer drug discovery?
Anne Hersey, ahersey@ebi.ac.uk

European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom

2:30pm-3:00pm MPPG 126: Using machine learning models based on phenotypic data to discover new molecules for neglected diseases
Sean Ekins2,1, ekinssean@yahoo.com

1 Collaborations Pharmaceuticals, Fuquay Varina, North Carolina, United States; 2 Collaborative Drug Discovery, Inc, Burlingame, California, United States

3:00pm-3:30pm Intermission
3:30pm-4:00pm MPPG 127: Extracting actionable knowledge from large scale in vitro pharmacology data
Edward Griffen1, ed.griffen@medchemica.com, Andrew Leach1,2, Alexander Dossetter1, Lauren Reid1

1 Medchemica Ltd, Macclesfield, United Kingdom; 2 Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, Merseyside, United Kingdom

4:00pm-4:30pm MPPG 128: PubChem - A chemical information hub
Jian Zhang3, jiazhang@ncbi.nlm.nih.gov, Paul Thiessen3, Asta Gindulyte2, Evan Bolton1, Steve Bryant3

1 NCBI / NLM / NIH, Warrenton, Virginia, United States; 2 NCBI/CBB, NIH, Bethesda, Maryland, United States; 3 NCBI/NLM, National Institutes of Health, Bethesda, Maryland, United States

Technical Program with Abstracts

ACS Chemical Information Division (CINF)
251st ACS National Meeting, Spring 2016
San Diego, CA (March 13-17, 2016)

CINF Symposia

Elsa Alvaro, Erin Davis, Program Chairs

[Created Fri Feb 19 2016, Subject to Change; Check ACS Online Program for Latest Changes]

CINF: Tomayto vs. Tomahto: Overcoming Incompatibilities in Scientific Data 8:30am - 12:00pm
Sunday, March 13
Room 25B - San Diego Convention Center
David Deng, Organizing
David Deng, Presiding
8:30am-8:35am Introductory Remarks
8:35am-9:05am CINF 1: Relational database file can take us beyond the plain text file format
T O'Donnell, tjo@acm.org

gNova, San Diego, California, United States

I propose a relational database file and associated table schema that can replace plain text chemical file formats for sharing chemical structures and data. This proposal uses the open-source relational database engine SQLite. I will argue that a computer program reading such a file should use the Structured Query Language (SQL) to select data from database tables as needed, rather than reading it all at once into the program as is typically done when reading formatted text files. I will show a diagram of the database table schema. Example computer code for selecting from, and inserting into the tables from scratch or from existing files will be presented. I will discuss issues of file size and access speed.

Using a relational database allows extensions without alterations to a text file format. There are over 110 file formats used to store chemical structures, properties and data. SDF and PDB formats are very common and could be considered a standard. However, ad-hoc variants of these formats, intended to implement new features, cause errors in programs that rely on a standard format. The proposed SQLite database file could replace SDF, PDB and perhaps all those 110 file formats. I will discuss how a relational database maintains data integrity in a way that is more robust than a text-based file format. The relational database approach will be compared to the use of XML/CML formats and the Resource Description Framework (RDF) in the semantic web.

The database schema contains several basic tables describing molecules, atoms, bonds and properties. These tables reside within a single file that may contain multiple molecules. The tables may be fully or sparsely populated, depending on how much information is present or relevant for each molecule. Updates to these tables can be made as more data becomes available. Extensions to the database schema are possible in order to accommodate new types of data, for example internal/z-matrix coordinates, spectroscopic data or biological assay data. These are implemented not as new or modified basic table columns, but as additional tables with foreign key references to the basic tables.
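
The schema described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table and column names, and the `assay_result` extension table, are hypothetical and are not taken from the umdb project. It shows basic tables for molecules, atoms, bonds and properties, an extension table attached by a foreign key, and a query that selects only the rows it needs.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the table definitions published by the umdb project.
SCHEMA = """
CREATE TABLE molecule (
    molecule_id INTEGER PRIMARY KEY,
    name        TEXT
);
CREATE TABLE atom (
    molecule_id INTEGER NOT NULL REFERENCES molecule(molecule_id),
    atom_number INTEGER NOT NULL,
    symbol      TEXT NOT NULL,
    x REAL, y REAL, z REAL,  -- coordinates may stay NULL (sparsely populated)
    PRIMARY KEY (molecule_id, atom_number)
);
CREATE TABLE bond (
    molecule_id INTEGER NOT NULL REFERENCES molecule(molecule_id),
    from_atom   INTEGER NOT NULL,
    to_atom     INTEGER NOT NULL,
    bond_order  INTEGER NOT NULL
);
CREATE TABLE property (
    molecule_id INTEGER NOT NULL REFERENCES molecule(molecule_id),
    name        TEXT NOT NULL,
    value       TEXT
);
-- Extension: a new kind of data gets its own table with a foreign key,
-- so the basic tables above are not altered.
CREATE TABLE assay_result (
    molecule_id INTEGER NOT NULL REFERENCES molecule(molecule_id),
    assay_name  TEXT NOT NULL,
    ic50_nm     REAL
);
"""


def build_example_db():
    """Create an in-memory database holding one molecule (water)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    with conn:  # commit all inserts together
        conn.execute("INSERT INTO molecule VALUES (1, 'water')")
        conn.executemany(
            "INSERT INTO atom (molecule_id, atom_number, symbol) VALUES (?, ?, ?)",
            [(1, 1, "O"), (1, 2, "H"), (1, 3, "H")],
        )
        conn.executemany(
            "INSERT INTO bond VALUES (?, ?, ?, ?)",
            [(1, 1, 2, 1), (1, 1, 3, 1)],
        )
        conn.execute("INSERT INTO property VALUES (1, 'formula', 'H2O')")
    return conn


def atom_symbols(conn, molecule_id):
    """Select only the atom symbols of one molecule, in atom order."""
    rows = conn.execute(
        "SELECT symbol FROM atom WHERE molecule_id = ? ORDER BY atom_number",
        (molecule_id,),
    )
    return [symbol for (symbol,) in rows]
```

With foreign keys enabled, an attempt to insert an atom for a molecule that does not exist is rejected by the database engine itself, which is one concrete form of the data integrity that a plain text format cannot enforce.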

The cross-platform format of SQLite, with its portability across 32-bit, 64-bit, big-endian and little-endian architectures, obviates any data incompatibilities caused by differences in hardware and software.

An open-source project at github.com/tjod/umdb provides a wiki, table definitions and example computer code.

9:05am-9:35am CINF 2: Standard JSON molecule, a solution to a cross-vendor molecule file format?

Brian Cole, coleb@eyesopen.com

OpenEye Scientific Software, Santa Fe, New Mexico, United States
Sharing information in cheminformatics and molecular modeling is still trickier than it needs to be. There are numerous file formats to navigate, each with its own pros and cons that take years to master. And with the advent of cloud computing and service-based architectures, interoperability between various software packages is becoming more important.

OpenEye has developed a JSON representation of molecules that integrates seamlessly with web technologies. We have also written a specification of that format and would like to work with the community to gather feedback and make it a standard others can rely on. That said, we are not blind to history, file formats, and standardization processes, so we are proposing the following guidelines for the process:

- Minimal: no cheminformatics necessary to produce/consume, and thus also human editable. Storing that information is nevertheless encouraged, to enable the next item.
- Interoperability: the extensibility of JSON should encourage its use for painful modeling tasks like sharing charges and radii between packages.
- Tested: reference implementations are necessary to prove utility, but are a bad idea as a specification. A proper document and, most importantly, a test suite are how successful standards are created. Any implementation that claims to be compliant must pass the test suite.

We are seeking more feedback on who would be interested in contributing to such a process.
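
To make the guidelines above concrete, here is a minimal sketch of what a human-editable JSON molecule and a test-suite style check might look like. The key names (`atoms`, `bonds`, `element`, `begin`, `end`, `order`, `partial_charge`) are assumptions made for illustration; they do not reproduce the OpenEye specification. Hydrogens are left implicit to keep the example short.

```python
import json

# Hypothetical layout: the key names are assumptions for illustration only
# and do not reproduce the OpenEye specification.
ETHANOL_JSON = """
{
  "name": "ethanol",
  "atoms": [
    {"element": "C"},
    {"element": "C"},
    {"element": "O", "partial_charge": -0.68}
  ],
  "bonds": [
    {"begin": 0, "end": 1, "order": 1},
    {"begin": 1, "end": 2, "order": 1}
  ]
}
"""


def check_molecule(mol):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    atoms = mol.get("atoms", [])
    for i, atom in enumerate(atoms):
        if "element" not in atom:
            problems.append(f"atom {i} has no element")
    for i, bond in enumerate(mol.get("bonds", [])):
        for key in ("begin", "end"):
            idx = bond.get(key)
            # Each bond end must be an index into the atom list.
            if not isinstance(idx, int) or not 0 <= idx < len(atoms):
                problems.append(f"bond {i} has invalid {key} index")
    return problems


def round_trip(text):
    """Parse, serialize and parse again, as a consumer and producer would."""
    return json.loads(json.dumps(json.loads(text)))
```

The optional `partial_charge` field illustrates the extensibility point: a reader that does not understand it can ignore it, while the structural checks still apply.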

9:35am-10:05am CINF 3: Rule-based capture/storage of scientific data from PDF files and export using a generic scientific data model
Stuart Chalk, schalk@unf.edu, Audrey Bartholomew, Bashar Baraz, John Turner

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States
Recently, the US government has mandated that publicly funded scientific research data be made freely available in a usable form, allowing integration of the data in other systems. While this mandate has been articulated, existing publications and new papers (PDF) still do not provide accessible data, meaning that their usefulness is limited without human intervention.

This presentation outlines our efforts to extract scientific data from PDF files, using the PDFToText software and regular expressions (regex), and process it into a form that structures the data and its context (metadata). Extracted data is processed (cleaned, normalized), organized, and inserted into a contextually developed MySQL database. The data and metadata can then be output using a generic JSON-LD based scientific data model (SDM) under development in our laboratory.
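
The regex step of such a pipeline can be sketched as follows. This is an assumption-laden illustration, not the authors' code: the sample text stands in for PDF-to-text output and its values are made up, the regular expressions fit only this toy table, and the output keys and the `@context` URL are placeholders rather than the scientific data model mentioned above.

```python
import json
import re

# Stand-in for text produced by a PDF-to-text step; the values are made up.
PAGE_TEXT = """
Table 2. Solubility of compound A in solvent B
T/K      x1
298.15   0.00052
308.15   0.00071
"""

# Rule for the table caption, e.g. "Table 2. <caption>".
TITLE = re.compile(r"^Table[ \t]+(?P<number>\d+)\.[ \t]+(?P<caption>.+)$", re.MULTILINE)
# Rule for a data row: two decimal numbers separated by spaces or tabs.
ROW = re.compile(
    r"^[ \t]*(?P<temp>\d+\.\d+)[ \t]+(?P<value>\d+\.\d+)[ \t]*$", re.MULTILINE
)


def extract(text):
    """Turn the matched caption and rows into a JSON-LD style dictionary."""
    title = TITLE.search(text)
    points = [
        {
            "temperature_K": float(m.group("temp")),
            "mole_fraction": float(m.group("value")),
        }
        for m in ROW.finditer(text)
    ]
    return {
        "@context": "https://example.org/sdm-context",  # placeholder, not a real context
        "@type": "Dataset",
        "caption": title.group("caption").strip() if title else None,
        "points": points,
    }


def to_json(text):
    """Serialize the extracted data and its context as a JSON string."""
    return json.dumps(extract(text), indent=2)
```

The header line `T/K      x1` matches neither rule and is skipped, which shows why real documents need a rule set tuned to each table layout.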

10:05am-10:25am Intermission
10:25am-10:55am CINF 4: Building linked-data, large-scale chemistry platform: Challenges, lessons, and solutions
Valery Tkachenko, tkachenkov@rsc.org, Alexey Pshenichnov, Aileen Day, Colin Batchelor, Peter Corbett

Royal Society of Chemistry, Rockville, Maryland, United States
Chemical databases have been around for decades, but in recent years we have observed a qualitative change from rather small, in-house, proprietary databases to large-scale, open and increasingly complex chemistry knowledgebases. This tectonic shift has imposed new requirements for database design and system architecture, as well as the implementation of completely new components and workflows which did not exist in chemical databases before. Probably the most profound change is caused by the linked nature of modern resources: individual databases become nodes and hubs of a huge and truly distributed web of knowledge. This change brings to the fore such important aspects as data and format standards, interoperability, provenance, security, quality control and metainformation standards.

ChemSpider at the Royal Society of Chemistry was the first public chemical database to incorporate rigorous quality control, introducing both community curation and automated quality checks at the scale of tens of millions of records. Yet we have come to the realization that this approach may now be incomplete in the quickly changing world of linked data. In this presentation we will talk about the challenges associated with building modern public and private chemical databases, as well as lessons that we have learned from our past and present experience. We will also talk about solutions for some common problems.

10:55am-11:25am CINF 5: Towards a functional database for enzyme data: STRENDA DB
Carsten Kettner, ckettner@beilstein-institut.de, Martin Hicks

Beilstein Institut, Frankfurt/Main, Germany
Scientific research has reached a stage in which the rapid improvement of technologies and methodologies has contributed to the accumulation of a vast amount of data in the published literature. However, since scientific publishing serves an important role in communicating new data, both journal editors and readers are challenged to identify novel findings. In addition, mainstream publication practices often have a number of deficiencies in the way that data are reported, resulting in the publication of incomplete, irreproducible and even unusable data sets. Several years ago, a group of biochemists working in the field of enzymology opened the debate that reliable data are a basic requirement for subsequent research and knowledge generation for all “-omics” sciences, in particular for systems biology. Under the auspices of the Beilstein-Institut, this group formed the STRENDA Commission and developed community-based recommendations for authors reporting enzymological data – the STRENDA guidelines.
The submission of this data to a public database is essential to ensure maximal accuracy and accessibility of the experimental kinetic enzymatic data. The direct, electronic submission by authors prior to publication has proven to be essential for comprehensive data acquisition in macromolecular sequencing and structural biology, for example, the Protein Data Bank (PDB). The development of robust, web-based software tools and the implementation of experimental and informatics standards to assess the experimental data in a manuscript with respect to compliance with the STRENDA guidelines resulted in STRENDA DB. This open access database is intended to provide a knowledge base for researchers and publishers; the former can use this data for reproduction, interpretation and additional experiments and the latter for a quick assessment of the degree of innovation and novelty of the data submitted. Here, with the presentation of STRENDA DB, we propose a change in the current publication workflow: from manuscript to database to publication, rather than from manuscript to publication to database.

11:25am-11:55am CINF 6: Virtues and vicissitudes of curatorial data wrangling: The guide to pharmacology experience
Christopher Southan, cdsouthan@gmail.com

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden

A wide range of valuable databases, both academic and commercial, use the curation model to extract and standardise selected result sets from the literature. This is the classic unstructured-to-structured transformation, predominantly of target binding data (e.g. IC50, Ki or Kd) between ligands and targets. Since 2009 the Guide to PHARMACOLOGY (GtoPdb) has curated quantitative interactions between 1300 protein targets and 6000 ligands, covering a substantial proportion of the druggable proteome. The team thus has considerable expertise in the challenges of standardisation. Standardisation needs to be applied not only to the primary literature but also to the other databases that have extracted data. The wide range of compatibility and other issues associated with selecting new content for GtoPdb will be outlined. These will include the problem of equivocal measurement units as well as the non-standard chemical IDs and protein IDs used by authors. The issue of data gaps will also be expanded on. The presentation will conclude with an assessment of new initiatives from several publishers for authors to mark up their chemical compounds before publication.

11:55am-12:00pm Concluding Remarks
CINF: From Data to Prediction: Applying Structural Knowledge in Drug Discovery & Development 8:40am - 12:00pm
Sunday, March 13
Room 25A - San Diego Convention Center
Jason Cole, Organizing
Jason Cole, Presiding
8:40am-8:45am Introductory Remarks
8:45am-9:15am CINF 7: Finding better aim at a moving target by exploiting structural data
Marcel Verdonk, marcel.verdonk@astx.com

Astex Pharmaceuticals, Cambridge, United Kingdom
Structural databases like the Protein Data Bank (PDB) contain a wealth of information that is widely used in the structure-based drug discovery community. A range of applications in this area have been reported, including knowledge-based scoring functions, interaction fields and pocket similarity searching. In general, when PDB data is used for such applications, the structures are treated as static, and all atoms are considered “equal”. However, flexible, solvent exposed protein atoms are significantly less likely to be involved in ligand binding than more buried, tightly packed atoms. Here, we show that this effect can be clearly observed in a statistical analysis of protein-ligand interactions in the PDB. Furthermore, we will illustrate how such analyses can be used to improve structure-based design applications like pocket finding algorithms, interaction fields and knowledge-based scoring.

9:15am-9:45am CINF 8: Bridging the dimensions: Seamless integration of 3D structure-based design and 2D structure-activity relationships to guide medicinal chemistry
Marcus Gastreich1, Matthew Segall3, matthew.d.segall@gmail.com, Carsten Detering2, Edmund Champness3, Christian Lemmen1

1 BioSolveIT, Sankt Augustin, Germany; 2 BioSolveIT Inc, Bellevue, Washington, United States; 3 Optibrium Ltd, Cambridge, United Kingdom
The effective use of software can have a major impact on timelines and innovation in drug discovery. However, the traditional split between computational modellers and synthetic chemists has been blurred and software must be accessible across disciplines to quickly understand and predict structure-activity relationships (SAR). There has been a similar divide between tools for three-dimensional (3D) structure-based design and those for analysis of SAR based on a two-dimensional (2D) compound structure. Seamless integration between these approaches would enable all of the available structural knowledge to be used to guide the efficient design of high quality, active compounds.

In this talk we will illustrate how information from 2D models of key physicochemical and absorption, distribution, metabolism, elimination and toxicity (ADMET) properties can be superimposed on 3D views of protein-ligand complexes. The influence of each atom or functional group on these properties can be highlighted and combined with visualization of the atomistic contributions to binding affinity, enabling development of optimization strategies that balance potency with the ADMET properties required in a safe and efficacious drug.

Furthermore, 2D analyses, such as activity cliff detection and matched molecular pair analyses, are commonly used to explore compound data sets and quickly identify important SAR within a chemical series or library. We will demonstrate how a seamless, highly visual link between the results of these analyses and related 3D structural information helps to understand and rationalize this SAR. This enables the efficient design of compounds with improved target affinity in a truly multi-parameter optimization environment.


Linking activity cliffs with 3D structural information to rationalise SAR

9:45am-10:15am CINF 9: Predicting binding affinity doesn't work, or does it?
Christian Lemmen, christian.lemmen@biosolveit.de

BioSolveIT, Sankt Augustin, Germany
Predicting binding affinity remains the holy grail in computational drug discovery. It may work on some targets but not on others. Even extremely compute-intensive approaches do not seem to work consistently well. One reason is that we are over-estimating the quality of the data (the PDB structures) we are working with. Some flaws in this data are well known: low resolution or high temperature factors could be taken into account, but on checking the electron density we find many more issues. After all, the crystal structure is also just a model that more or less fits the primary experimental data. Next, the path from a raw PDB file to the input necessary for detailed molecular modelling is not always obvious. We've analyzed this carefully and provide a novel solution. Finally, water poses a particular problem in the modeling process. Some water molecules are crucial, others should be replaced to gain affinity. However, careful analysis again shows that the water molecules in the crystal structure model are not all 'well defined' either. Therefore we will have a look at water molecules and how we can measure their experimental support. In summary, in many cases a quantitative assessment is next to impossible, but it is good to 'know your enemies' and a qualitative assessment may still be very helpful. For example, the prioritization of compounds does not need absolute predictions, only structure-activity relations and trends in the affinity data. We will show how the Hyde scoring pinpoints issues in the data and provides at least a qualitative assessment that is extremely helpful for the everyday tasks of SAR analysis and compound prioritization.

10:15am-10:30am Intermission
10:30am-11:00am CINF 10: Structural knowledge by prediction: Crystal structure prediction tests and progress
Colin Groom, groom@ccdc.cam.ac.uk, Jason Cole, Anthony Reilly

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom
Wouldn’t it be nice to be able to predict the crystal structure of organic molecules?

To encourage progress towards this goal, the CCDC set up a blind test of crystal structure prediction methods back in 1999. By the time of this symposium, the sixth blind test will have concluded. Over twenty research groups will have applied their methodology to rigid molecules, flexible molecules, salts and other multicomponent systems.

This presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database, will review progress in the field. Moreover, it will discuss whether purely informatics-based approaches, trained using over 800,000 known structures can successfully predict the structures of unknown systems.

11:00am-11:30am CINF 11: Using physicochemical data and predictions in the risk assessment of mutagenic impurities
Susanne Stalford, susanne.stalford@lhasalimited.org

Lhasa Limited, Leeds, United Kingdom
ICH M7 guidance on mutagenic impurities (MIs) supports the control of potential MIs based on a sufficient understanding of the manufacturing process. This strategy would reduce the need to perform testing on MIs predicted to be purged during synthesis of an active ingredient.
A concept was brought forward in which semi-quantitative “purge factors” are calculated, based on physicochemical properties such as reactivity, solubility and volatility, to give confidence that an MI is likely to be absent in an end-product. This approach is used by several organisations within the pharmaceutical industry to support regulatory submissions, e.g. for late phase development. Our goal is to expand its use and standardise the approach through a consortium in order to establish a framework which would estimate “purge factors” and provide sufficient support for regulatory submissions. The key aims are to 1) standardise how calculations are performed throughout industry, 2) collate existing data and promote cross-industry data sharing to facilitate supported and accurate decision making, and 3) provide an automated in silico system which predicts purge factors based on experimental data and expert knowledge.
A successful international collaboration has been established, with a number of pharmaceutical companies guiding the development of the software and models for the prediction of physicochemical properties. This work has the potential to save both time and money in regards to analytical testing and also to ensure effort is focussed correctly on those impurities that present a substantive risk. This communication describes our scientific approach and recent progress.
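As a rough illustration of the semi-quantitative purge-factor idea described in this abstract, the following Python sketch multiplies per-property scores within each synthetic stage and then across stages. The multiplicative combination, the score values and the stage names are assumptions made for illustration only; the abstract does not specify them, and this is not the consortium's software.

```python
from math import prod
from typing import Dict, List

# Hypothetical per-stage scores for one impurity. A higher score means the
# stage is expected to remove more of the impurity. All values are
# illustrative and are not taken from the abstract.
stages: List[Dict[str, int]] = [
    {"reactivity": 100, "solubility": 1, "volatility": 1},  # e.g. a reaction step
    {"reactivity": 1, "solubility": 10, "volatility": 3},   # e.g. a work-up step
]


def stage_purge(scores: Dict[str, int]) -> int:
    """Purge factor for one stage: product of its property scores (assumed rule)."""
    return prod(scores.values())


def overall_purge(all_stages: List[Dict[str, int]]) -> int:
    """Overall purge factor: product of the per-stage purge factors (assumed rule)."""
    return prod(stage_purge(s) for s in all_stages)


print(overall_purge(stages))  # 100 * 30 = 3000
```

A larger overall value would suggest a greater predicted reduction of the impurity between its introduction and the final product, under the assumptions stated above.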

11:30am-12:00pm CINF 12: Profile-QSAR generation 2: Perfection, the enemy of the good?
Valery Polyakov1, valery.polyakov@gmail.com, Eric Martin2, Li Tian1

1 GDC, NIBR, Lafayette, California, United States; 2 Computational Chemistry, Novartis, El Cerrito, California, United States
Profile-QSAR achieves unprecedented accuracy and broad domain of application by augmenting a few hundred IC50s for a target of interest with millions of additional IC50s from 100,000s of compounds tested across many hundreds of historical assays from the same protein family. The accuracy and domain of application have now been dramatically further improved by replacing the original Bayesian formulation with a random forest formulation. Highlights of this presentation include
- A new random forest-based algorithm yields unprecedented accuracy and domain of application,
- A new challenging “Novelty” test set for evaluating virtual screening models, which mirrors the great diversity of actual historical compound selections,
- A demonstration that linear PLS models can extrapolate to novel chemistry better than non-linear random forests,
- A surprising demonstration that excluding the seemingly most relevant training data can prevent overfitting and greatly improve extrapolation, and
- A comparison and analysis of models built on internal and public data sets.


 

CINF: Data Mining: Searching Non-covalent Interactions in Chemical Databases 1:00pm - 4:45pm
Sunday, March 13
Room 24C - San Diego Convention Center
Suman Sirimulla, Organizing
Suman Sirimulla, Presiding
Cosponsored by: COMP
1:00pm-1:05pm Introductory Remarks
1:05pm-1:30pm CINF 20: Sigma-hole interactions for rational drug design
Suman Sirimulla, suman.sirimulla@stlcop.edu

Basic Sciences, St. Louis College of Pharmacy, St. Louis, Missouri, United States
Sigma hole interactions are gaining increased attention in medicinal chemistry. The halogen bond is now widely accepted by the medicinal chemistry community as an important non-covalent interaction for rational drug design. Bivalent sulfur atoms are also known to have an electron deficiency in their outer lobes, exhibiting a sigma hole. In this study we discuss the importance of sigma hole interactions in protein-ligand complexes and present scoring functions to score sigma hole interactions for molecular docking purposes. Insights from analyses of sigma hole interactions obtained by data mining of the Protein Data Bank are also presented.

1:30pm-1:55pm CINF 21: Deep convolutional neural networks for autonomous discovery of molecular interactions

Abraham Heifets, Izhar Wallach, Michael Dzamba, misko@atomwise.com

Atomwise, Inc., San Francisco, California, United States
Deep convolutional neural networks (neural nets with a constrained architecture that leverages the spatial and temporal structure of the domain they model) achieve the best predictive performance in areas such as speech and image recognition. Such deep convolutional neural networks autonomously discover and hierarchically compose simple local features into complex models. We demonstrate that biochemical interactions, being similarly local, are amenable to automatic discovery and modeling by similarly-constrained machine learning architectures. We describe the training of AtomNet, the first structure-based, deep convolutional neural network designed to predict the bioactivity of small molecules for drug discovery applications, on millions of training examples derived from ChEMBL and the PDB. We visualize the automatically-derived convolutional filters and demonstrate that the system is discovering chemically sensible interactions. Finally, we demonstrate the utility of autonomously-discovered filters by outperforming previous docking approaches and achieving an AUC greater than 0.9 on 57.8% of the targets in the DUDE benchmark.
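The headline figure in this abstract, the fraction of benchmark targets with an AUC above 0.9, can be computed along the following lines. This is a minimal Python sketch using the pairwise (Mann-Whitney) form of the ROC AUC; the function names, the toy scores and the target names are hypothetical and have nothing to do with the AtomNet code or the benchmark data.

```python
from typing import Dict, List, Tuple


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """ROC AUC as the fraction of (active, decoy) pairs ranked correctly.

    Labels are 1 for actives and 0 for decoys; tied scores count as half.
    """
    actives = [s for s, y in zip(scores, labels) if y == 1]
    decoys = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


def fraction_above(per_target: Dict[str, Tuple[List[float], List[int]]],
                   threshold: float = 0.9) -> float:
    """Fraction of targets whose AUC exceeds the threshold."""
    aucs = [roc_auc(scores, labels) for scores, labels in per_target.values()]
    return sum(a > threshold for a in aucs) / len(aucs)


# Toy data: (predicted scores, activity labels) per hypothetical target.
toy = {
    "target_a": ([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]),  # every active outranks every decoy
    "target_b": ([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]),  # one decoy outranks one active
}
print(fraction_above(toy))
```

The pairwise loop is quadratic in the number of compounds, which is acceptable for a sketch; a rank-based implementation would be preferable for benchmark-sized data.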


Sulfonyl detection with autonomously-trained convolutional filters.

1:55pm-2:20pm CINF 22: Crystallographic informatics: Similarity and statistics
Simon Coles2, s.j.coles@soton.ac.uk, Graham Tizzard2, Philip Adler1

1 Chemistry, Haverford College, Haverford, Pennsylvania, United States; 2 University of Southampton, Hampshire, United Kingdom
For several years we have been systematically synthesizing and crystallizing families of related compounds, providing insights into polymorphism, similarity and crystal structure formation. These families can get rather large, which has necessitated developing computerized approaches to searching for patterns and packing similarities. Whilst many of the properties under investigation can be driven by conventional or primary intermolecular interactions there is an increasing awareness that weak or secondary interactions (in addition of course to shape and steric factors) can have considerable influence on the arrangement of the solid state. Accordingly we use two approaches that are not reliant purely on characterizing and understanding primary, hydrogen bonded, interactions.
Quantification of similarity in terms of dimensionality is important in understanding polymorphism, phase transitions and crystal growth. The XPac program disregards the standard descriptors of intermolecular interactions and is a geometrical analysis that assigns vectors between equivalent points in adjacent molecules in order to try to capture the effects of diffuse Van der Waals interactions. Pairs, or indeed large families, of vector representations can then be used to index the degree of similarity between all the members of a collection. This has the advantage of not only being agnostic to intermolecular interactions but also to an extent molecular similarity – it is indeed possible to compare oranges and lemons!
Our second method takes quite the opposite approach. Using molecular, interaction and topological statistical descriptors gleaned from all possible characteristics of a crystal structure (we have used around 4000!) it is possible to build statistical models. Again when looking at related sets of structures, one can compare models and also look for correlations between descriptors. In this way it is not only possible to characterize similarity, but also relate physical properties to structural characteristics and answer questions such as “will it crystallize?” or “what coformers will produce a cocrystal?”.
We are lacking tools to mine really big and diverse collections in this way – our methods will be illustrated through studies on our own compound/structure libraries, however we demonstrate the potential, and shortcomings, of mining much larger databases by these approaches.

2:20pm-2:45pm CINF 23: Chemical fragment analysis of halogen bonds in protein binding sites
AhWing Chan, edith.chan@ucl.ac.uk

UCL, London, United Kingdom
We present an analysis of chemical fragments from halogen-containing, lead-like molecules in the Protein Data Bank (PDB) forming halogen bonds to protein main chain or side chain atoms. A fragment is defined as the largest ring assembly containing the halogen atoms involved in the halogen bond(s). Linear chains with halogens are also examined. Although there are about 2000 halogen-containing ligands in the PDB, only around 250 form halogen bonds with the protein. The halogen atoms are most often attached to an aromatic ring, with only very few linear motifs found. Our findings are useful for drug design, especially in the areas of fragment-based design, chemical library design, and selection of screening compounds.

2:45pm-3:00pm Intermission
3:00pm-3:25pm CINF 24: Mining interaction data in the Cambridge structural database: Getting the rewards and removing the risks!
Jason Cole, cole@ccdc.cam.ac.uk, Peter Wood, Neil Feeder, Robin Taylor, Colin Groom

CCDC, Cambridge, United Kingdom
The Cambridge Structural Database is the biggest single resource describing non-covalent interactions. Think of every atom in 800,000 molecules interacting with all the molecules surrounding it in a lattice. These tens of millions of interactions are astonishingly informative – but only if we have tools to analyse them.

This presentation will describe the new Full Interaction Mapping functionality developed as part of the CSD System – a technology that reveals the energy yielding interactions holding molecules together and highlights the compromises that are sometimes made.

We will also see what a purely statistical analysis of every atom-atom interaction in over 800,000 crystals reveals. Just which interactions do appear more often than random? Are some interactions seen just because ‘everything has to be somewhere’? Are some of the interactions claimed to stabilise protein-ligand complexes observed less than one might expect and actually unfavourable? What really makes a halogen bond favourable?

In this presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database, we will showcase examples of applying Full Interaction Maps to real-life problems, such as understanding the relative stability of structural polymorphs, and look at how interaction propensity gives us insight into the true worth of non-covalent interactions.

3:25pm-3:50pm CINF 25: Fast mining of adaptable interaction patterns in protein-ligand interface
Therese Inhester2, inhester@zbh.uni-hamburg.de, Matthias Rarey1

1 University of Hamburg, Hamburg, Germany; 2 Center for Bioinformatics, University of Hamburg, Hamburg, Germany

Improving the specificity or the affinity of a small molecule to its target protein are just two typical tasks in structure-based drug design and structure-driven developments in biotechnology. In both scenarios, a profound understanding of interatomic interactions on a geometric and an energetic level is required. The ever increasing number of high-quality protein-ligand structures provides the opportunity of gathering detailed knowledge about preferred geometrical arrangements of atoms (interaction patterns). Yet, advanced data mining-methods are needed to deduce such spatial information from the large amount of data. Ideally, these methods have to be highly adaptable with regard to the possibilities of the query design. Moreover, they need to be efficient in order to enable near-interactive use. Equally important is a result browser which presents the resulting hits and the query in a comprehensive way.
In an attempt to address all these requirements we developed a new retrieval system to mine large sets of protein-ligand complexes for specific interaction patterns. Highly efficient indexing techniques and different graph-based algorithms are used to rapidly detect all occurrences of a spatial query. The queries represent a 3D constellation of atom descriptions within a protein-ligand interface. Additionally, constraints for physico-chemical properties of the protein and the ligand, e.g. the resolution, are combined with the spatial query.
We are going to present an application of our new approach in a detailed analysis of different interaction patterns including specific interactions and molecular substructures demonstrating its value for various molecular design tasks.

3:50pm-4:15pm CINF 26: Dual nature of a halogen atom
Mahesh Narayan, mnarayan@utep.edu

Chemistry, University of Texas at El Paso, El Paso, Texas, United States
Halogen bonding interactions between halogenated ligands and proteins were examined using the crystal structures deposited to date in the PDB. The data was analyzed as a function of halogen bonding to main chain Lewis bases, viz. oxygen of backbone carbonyl and backbone amide nitrogen. This analysis also examined halogen bonding to side-chain Lewis bases (O, N, and S) and to the electron-rich aromatic amino acids. The data reveals that while fluorine and chlorine have strong tendencies favoring interactions with the backbone Lewis bases at glycine, the trend is not restricted to the achiral amino acid backbone for larger halogens. Halogen side-chain interactions are not restricted to amino acids containing O, N, and S as Lewis bases. Electron-rich aromatic amino acids host a high frequency of halogen bonds as does Leu. A closer examination of the latter hydrophobic side chain reveals that the 'propensity of interactions' of halogen ligands at this oily residue is an outcome of strong classical halogen bonds with Lewis bases in the vicinity. Furthermore, an examination of Theta 1 (C, X, O and C, X, N) and Theta 2 (X, O, Z and X, N, Z) angles reveals that very few ligands adopt classical halogen bonding angles, suggesting that steric and other factors may influence these angles.
Outcomes from understanding halogen bonding trends in the PDB were applied towards rational drug design. Currently there are no specific pharmacological therapies (drugs) to treat post-traumatic stress disorder (PTSD). Emerging data reveal that the opioid receptor-like 1 gene (Oprl1), encoding the nociceptin (NOP)/orphanin FQ receptor, is involved in stress-mediated enhancement of amygdala-dependent fear in PTSD syndrome. Here, we attempted to design and develop specific novel nociceptin receptor agonists with the objective of achieving better receptor specificity. A structure-based drug design method was used, availing of the crystal structure of the nociceptin receptor in the PDB. Halogen bonding was fully explored in designing the drugs. The results presented through our study illustrate the application of halogen bonding in rational drug design.
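The Theta 1 (C, X, O) and Theta 2 (X, O, Z) angles mentioned in this abstract are ordinary three-atom angles measured at the middle atom, so they can be computed directly from atomic coordinates. The Python sketch below shows one way to do this; the coordinates are made up for illustration and the function name is hypothetical, not part of the authors' workflow.

```python
import math
from typing import Sequence


def angle_deg(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Angle in degrees at vertex b formed by the points a, b and c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("coincident atoms: angle is undefined")
    cos_theta = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up coordinates (in angstroms) for a ligand C-X bond pointing at a
# carbonyl oxygen O that is bonded to an atom Z.
C = (0.0, 0.0, 0.0)
X = (1.9, 0.0, 0.0)
O = (4.9, 0.0, 0.0)
Z = (4.9, 1.2, 0.0)

theta1 = angle_deg(C, X, O)  # C-X...O, measured at the halogen
theta2 = angle_deg(X, O, Z)  # X...O-Z, measured at the acceptor
print(round(theta1, 1), round(theta2, 1))
```

In this toy geometry Theta 1 is linear and Theta 2 is a right angle; real structures would be read from PDB coordinate records rather than typed in by hand.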

4:15pm-4:40pm CINF 27: Crystal clear: Using statistical descriptions and analysis to understand crystallisation
Philip Adler2, padler1@haverford.edu, Simon Coles4, Alex Norquist1, Joshua Schrier2, Dave Woods4, Sorelle Friedler1, Lucy Mapp3

1 Haverford College, Bryn Mawr, Pennsylvania, United States; 2 Chemistry, Haverford College, Haverford, Pennsylvania, United States; 3 Chemistry, University of Southampton, Southampton, United Kingdom; 4 University of Southampton, Hampshire, United Kingdom
Statistical methods can be applied to understanding and predicting crystallisation. Methods have been deduced to apply these techniques to problematic systems such as co-crystals and inorganic-organic hybrid materials.
To use statistical approaches rigorously generally requires quite strictly defined experimental designs. While these designs are well understood, they are not necessarily practicable for chemical properties of which we do not already have a solid understanding, for instance the problem of predicting outcomes in crystallisation experiments. One of the largest problems in this field is parameterising the chemical space, that is, describing the systems such that statistical methods can apply, and in particular keeping such descriptors invariant and mutually orthogonal. In addition, choosing the correct aspects of the system, that is those that demonstrate potential causal relationships, can prove to be challenging. This is because there is a plethora of means available to chemists to describe their systems. The application of statistical techniques such as decision trees and support vector machines has permitted the derivation of hypotheses and helped enhance understanding of crystallisation experiments, and this forms the presented work.

4:40pm-4:45pm Concluding Remarks
CINF: Global Initiatives in Research Data Management & Discovery 1:00pm - 5:00pm
Sunday, March 13
Room 25B - San Diego Convention Center
Ian Bruno, Leah McEwen, Organizing
Ian Bruno, Presiding
Cosponsored by: ANYL, COMP, MEDI and PHYS
1:00pm-1:15pm Introductory Remarks
1:15pm-1:45pm CINF 13: Open data is not enough: A look at the Research Data Alliance

Mark Parsons, parsom3@rpi.edu

Research Data Alliance, Boulder, Colorado, United States
In recent years governments and research institutions have emphasized the need for open data as a fundamental component of open science. But we need much more than the data themselves for them to be reusable and useful. We need descriptive and machine-readable metadata, of course, but we also need the software and the algorithms necessary to fully understand the data. We need the standards and protocols that allow us to easily read and analyze the data with the tools of our choice. We need to be able to trust the source and derivation of the data. In short, we need an interoperable data infrastructure, but it must be a flexible infrastructure able to work across myriad cultures, scales, and technologies. This talk will present a concept of infrastructure as a body of human, organisational, and machine relationships built around data. It will illustrate how a new organization, the Research Data Alliance, is working to build those relationships to enable functional data sharing and reuse.

1:45pm-2:15pm CINF 14: Responses to the data revolution: CODATA on policy, data science, and capacity building
Simon Hodson1, John Rumble2, jumbleusa@earthlink.net

1 CODATA, Paris, France; 2 R&R Data Services, Gaithersburg, Maryland, United States
Talk of a data revolution is not hyperbole. The ease with which data relating to human behaviour and transactions can be gathered has led to new industries driven by their ability to elicit predictive and commercially advantageous information from masses of data. In academia, new data science courses and multidisciplinary centres are springing up to feed the demand for these analytical and data management skills. The data revolution, the phenomenon of Big Data and advances in data science are everywhere impacting both scientific research and industry.

Exploiting the opportunities and addressing the challenges afforded by the data revolution and using them to generate wider societal benefit will fundamentally depend on the creation of a complementary ‘Open Data’ environment. Open data is crucial to the maintenance of scientific ‘self-correction’ whereby the data underlying published concepts are open to scrutiny, replication or invalidation. The rapid growth of data makes this crucial principle of research ever more difficult to sustain, and increasingly requires both the data and the code used in data analysis to be open, accessible and useable.

Our response must include new technical solutions for presenting, sharing and analysing data; capacity building in “data science”; and changes to the habits and norms of researchers and their institutions to create a culture of openness and data sharing. Science is an international activity, done in a national cultural setting, thereby requiring national strategies to fit within a common international frame. The role of international bodies such as CODATA and the International Council for Science is to facilitate the fit between national priorities and processes and rapidly developing international norms.

To help address these issues, CODATA promotes Open Data and Open Science through three strategic priorities:

1) Supporting implementation of data principles, policies and practices
2) Addressing the frontiers of data science and its adaptation to scientific research.
3) Capacity building for data science (particularly in low and middle income countries - LMICs)

This presentation will examine the context of the ‘Data Revolution’ and provide an introduction to CODATA’s analysis of these issues and activities in the priority areas identified. Particular emphasis will be placed on the policy environment, a holistic approach to capacity building and the research data skills that benefit various roles in science systems.

2:15pm-2:45pm CINF 15: Moving research forward with persistent identifiers and services
Patricia Cruse, patricia.cruse@datacite.org

DataCite, Berkeley, California, United States
The presenter will provide information on DataCite, an international consortium which aims to increase the acceptance of research data as a legitimate, citable contribution to the corpus of scholarly communication. DataCite has developed many services that directly support data sharing, management and attribution. To enable these activities DataCite assigns persistent identifiers to research datasets and manages the infrastructure that supports simple and effective methods of data citation, discovery and access. DataCite leverages the DOI infrastructure, which is already well-established. DOI names are the most widely used identifier for scientific journal articles, so researchers, authors, and publishers are familiar with their use. DataCite actively works with other identifier services such as ORCID and Crossref to deliver services that support researchers. In addition, the presenter will discuss the THOR project, a 30-month project funded by the European Commission under the Horizon 2020 program. DataCite is an active participant in THOR and is working collaboratively to establish seamless integration between articles, data, and researchers across the research lifecycle. Please join the conversation and learn how your research can benefit from DataCite and THOR.

2:45pm-3:15pm CINF 16: Discoverability and reusability of FAIR chemistry research data as a key outcome of registering persistent identifiers and standardised metadata with DataCite
Henry Rzepa1, rzepa@ic.ac.uk, Matthew Harvey2, Andrew Mclean3

1 Chemistry, Imperial College London, London, United Kingdom; 2 HPC division, Imperial College London, London, United Kingdom; 3 ICT Division, Imperial College London, London, United Kingdom
A research data management (RDM) system for computational and other chemical data is described using DataCite for registration of persistent (DOI) identifiers along with standardised metadata, including ORCID identifiers. Examples of the benefits of using such a FAIR model (Findable, Accessible, Inter-operable and Re-usable) will include automated repository retrieval and display based only on the assigned DOI and the media type required (doi.org/73z), the standards-based curation of a ten-year old repository dataset (doi.org/73x) and illustrations of how the metadata associated with the assigned DOIs can be used to enhance research data discoverability and impact.
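Retrieval "based only on the assigned DOI and the media type required" is typically done with HTTP content negotiation: the client asks the DOI resolver for a given representation through the Accept header. The Python sketch below only builds such a request and does not send it. The DOI shown is a hypothetical placeholder, and the DataCite XML media type is an assumption about what a client might ask for, not a statement of what the authors' system uses.

```python
import urllib.request


def build_doi_request(doi_url: str, media_type: str) -> urllib.request.Request:
    """Build (but do not send) a content-negotiation request for a DOI."""
    return urllib.request.Request(doi_url, headers={"Accept": media_type})


# Hypothetical DOI, used purely for illustration.
EXAMPLE_DOI_URL = "https://doi.org/10.1234/example-dataset"
# Assumed media type: DataCite metadata as XML.
DATACITE_XML = "application/vnd.datacite.datacite+xml"

req = build_doi_request(EXAMPLE_DOI_URL, DATACITE_XML)
print(req.full_url, req.get_header("Accept"))

# Sending the request needs network access and a registered DOI:
# with urllib.request.urlopen(req) as response:
#     metadata = response.read()
```

Changing only the media type string would request a different representation of the same DOI, which is the design point the abstract highlights.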


 

3:15pm-3:30pm Intermission
3:30pm-4:00pm CINF 17: Surveying and tracking the biomedical data landscape
Maryann Martone, mmartone@ucsd.edu

Neurosciences, University of California, San Diego, San Diego, California, United States
In the past few years, data and data science have exploded into the academic and public consciousness, as “Big data” and “Data science” have taken hold. In this presentation, I will present an overview of projects and initiatives across biomedicine that are working to make data FAIR: Findable, Accessible, Interoperable and Re-usable in the age of global search. I will present projects such as the Neuroscience Information Framework, the NIDDK Information Network (dkNET) and bioCADDIE, the NIH Data Discovery Index, which are working to index data available across distributed resources. Such projects are confronting first hand the heterogeneity and dynamism of the current biomedical data landscape. I will also highlight some of the community efforts underway to update and expand our current citation system to make it easier to track usage of research resources such as reagents, organisms and data. In particular, I will highlight the Resource Identification Initiative and Data Citation projects underway at FORCE11: the Future of Research Communications and e-Scholarship. FORCE11 is a grass roots organization dedicated to transforming scholarship through technology. The Resource Identification Initiative is working with authors and journals on a simple method for supplying unique identifiers for key research resources in the literature. Through efforts such as the Joint Declaration of Data Citation Principles and follow-on projects to implement the principles in a formal system of data citation, organizations around the world are working to develop a machine-actionable system for data citation as well, so that data is properly handled and cited as a primary product of scholarship. These projects are starting to point to key infrastructure requirements to transition our current paper-based system into a system designed for networks and global search.

4:00pm-4:30pm CINF 18: Data Observation Network for Earth: Earth and environmental science data management and discovery
Amber Budden1, aebudden@dataone.unm.edu, William Michener1, Dave Vieglais2, Rebecca Koskela1, Heather Soyka1

1 University of New Mexico, Albuquerque, New Mexico, United States; 2 University of Kansas, Lawrence, Kansas, United States
Data Observation Network for Earth (DataONE) is the foundation of new innovative environmental science through a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data. In this overview we will introduce the guiding principles of DataONE and the primary components of the DataONE cyberinfrastructure, provide a brief demonstration of DataONE search and discovery, and discuss community perspectives and outreach opportunities.


Data Observation Network for Earth (DataONE)

4:30pm-5:00pm CINF 19: California Digital Library: Advancing the digital transition of scholarly information
John Chodacki, John.Chodacki@ucop.edu

California Digital Library, University of California, Oakland, California, United States
Researchers are increasingly being asked to ensure that all products of research activity – not just traditional publications – are preserved and made widely available. Adoption of good data curation practices is critical to open scientific inquiry, discourse, and advancement. With their long history in the management and dissemination of multifarious information resources, libraries play a key role in providing scholars with the tools and services necessary for the effective long-term curation of research data, encompassing data lifecycle management, preservation, sharing, dissemination, and reuse. This presentation will provide an overview of efforts the California Digital Library’s University of California Curation Center (UC3) group and their partners have undertaken to develop services that help researchers improve their handling, sharing, and archiving of datasets. Services include the DMPTool (Data Management Planning Tool), the Merritt Repository, the Dash Repository, and other services that help researchers manage and get credit for their data.

CINF: From Data to Prediction: Applying Structural Knowledge in Drug Discovery & Development 1:30pm - 4:50pm
Sunday, March 13
Room 25A - San Diego Convention Center
Jason Cole, Organizing
Jason Cole, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:05pm CINF 28: Towards a fully automated creation of large protein structure ensembles
Stefan Bietz, Matthias Rarey, rarey@zbh.uni-hamburg.de

University of Hamburg, Hamburg, Germany
With the continuously growing number of available crystal structures, a great wealth of information related to molecular interactions and macromolecular conformational flexibility becomes available for drug discovery. Collecting and aligning all structures available for a certain protein target of interest is therefore a central problem to be addressed prior to structure analysis and knowledge exploitation. Classical methods include sequence and structure alignments followed by a superimposition of the extracted structures. Since these methods are mostly developed for protein structure analysis, they do not focus on the protein’s binding site, resulting in major deficiencies. Not surprisingly, the construction of protein structure ensembles is mostly hand-curated work. While this is acceptable for individual cases, the construction of protein structure ensembles for large-scale analysis or for the evaluation of computational techniques relying on ensembles, like molecular docking, is prohibitively work-intensive.

In this talk, we present a series of algorithms especially tailored for a binding-site focused, fully-automated construction of protein structure ensembles. The key element is a novel alignment algorithm for binding sites named ASCONA[1]. By taking sequence and structure information into account, ASCONA is able to calculate correct amino acid alignments of binding sites even in complicated scenarios like homo-dimer interfaces and binding sites consisting of patches from multiple protein domains. ASCONA allows specific control of the variability in the binding site, such as the gap and mutation rate. The new alignment approach is embedded into a workflow for an automatic construction of protein structure ensembles named SIENA[2]. The process includes the extraction of binding sites from the PDB, their alignment, a reasonable reduction of structures, and the superimposition. In summary, SIENA enables the construction of arbitrary protein structure ensembles with typical computing times of 5-20 seconds.

[1] Bietz, S.; Rarey, M. (2015). ASCONA: Rapid Detection and Alignment of Protein Binding Site Conformations. Journal of Chemical Information and Modeling, 55(8):1747–1756.
[2] Bietz, S.; Rarey, M. (2015). SIENA: Efficient Compilation of Selective Protein Binding Site Ensembles. Journal of Chemical Information and Modeling, submitted for publication

2:05pm-2:35pm CINF 29: On our way to the automated search for ligand-sensing cores

Tobias Brinkjost1,2, tobias.brinkjost@tu-dortmund.de, Christiane Ehrt2, Petra Mutzel1, Oliver Koch2

1 Faculty of computer science, TU Dortmund University, Dortmund, Germany; 2 Faculty of chemistry and chemical biology, TU Dortmund University, Dortmund, Germany
The investigation of protein-ligand interactions is one of the prerequisites for structure-based design of small molecule modulators of protein function. These interactions can be analyzed in terms of the structural similarity of secondary structure elements, with impact on rational drug design [1]. The basic idea of the presented approach is that a similar spatial arrangement of secondary structure elements around the binding site (‘ligand-sensing cores’) can recognize similar scaffolds independent of the overall fold [2]. The discovery of Namoline as a lysine-specific demethylase inhibitor, which impairs the growth of prostate cancer cells, by Willmann et al. demonstrated the pharmaceutical relevance of this concept [3]. However, to date there is no automated procedure available to compare 'ligand-sensing cores' of various proteins.

We will present the results of our ongoing work to develop an automated computational method to identify similar 'ligand-sensing cores' in binding pockets of otherwise unrelated proteins for all known protein structures. Our approach is based on labelled graphs generated from the secondary structure information provided by Secbase [4], an extension module of Relibase. Calculations on available test datasets reveal a robust and very fast implementation, so that an all-against-all comparison of the whole PDB should be possible within one or two months of calculation time on a recent workstation. We are currently optimizing our approach on several datasets and have also recently generated very promising results on ATP and other target-specific datasets. We will also present the results of the all-against-all comparison.

In the end, this information of all similar ligand-sensing cores within all known protein structures will provide access to previously unused data to predict polypharmacology and to identify new lead structures. Therefore, this development leads to a valuable tool for rational structure-based drug design.

[1] Koch O; In Future Medicinal Chemistry; 2011:699-708.
[2] Koch M A, Waldmann H; In Drug Discovery Today; 2005:471-483.
[3] Willmann D, Lim S, Wetzel S, Metzger E, Jandrausch A, Wilk W, Jung M, Forne I, Imhof A, Janzer A, Kirfel J, Waldmann H, Schüle R, Buettner R; In International Journal of Cancer; 2012:2704-2709
[4] Koch O, Cole J, Block P, Klebe G; J. Chem. Inf. Model; 2009:2388-2402

2:35pm-3:05pm CINF 30: Deep learning in the 3rd dimension: Structure-based bioactivity prediction on novel targets
Abraham Heifets, abe@atomwise.com, Izhar Wallach, Michael Dzamba

Atomwise, Inc., San Francisco, California, United States
Existing deep learning techniques for bioactivity prediction require significant prior knowledge of known active molecules for each protein target against which they predict, limiting their applicability in practice. We describe how to incorporate information about the structure of target proteins into the predictions made by deep learning neural networks. We discuss data cleaning, collation, and scaling techniques that are necessary to integrate large structural databases, such as the PDB, with large bioactivity databases, such as ChEMBL and PubChem. Finally, we present case studies where structural target information allowed the successful prediction of hits for targets with no known binders.

3:05pm-3:20pm Intermission
3:20pm-3:50pm CINF 31: CDD vision: Advanced analytics, calculations, and visualization live in CDD vault
Barry Bunin, bbunin@hotmail.com

CDD, Belmont, California, United States
Drug Discovery Collaborations have been securely hosted in the CDD Vault for over a decade. We present a new web-based data mining and visualization module for high-throughput drug discovery data that makes use of a novel technology stack following modern reactive web design principles. CDD Vision allows researchers to simultaneously visualize, manipulate, and create publication-quality graphics for hundreds of thousands of data points. Scientists can now perform complex, multidimensional analysis of experimental, calculated, and predicted properties to optimize activity, selectivity, and drug-like properties. The advanced analytics, calculations, and visualization suite is conveniently integrated within the CDD Vault for facile registration, structure activity relationships and secure collaboration. The synergy between these two systems allows users to quickly and graphically move through, sift and test stages of the drug discovery process to re-focus on advancing the science via intuitive data manipulation. Innovative capabilities for bio-computational across data sets leveraging the bioassay ontology, as well as newly shared open source technologies highlight the conversion of collaborative innovation into professional, useful products.

3:50pm-4:20pm CINF 32: Advances in data provisioning

Marian Brodney1, marian.d.brodney@pfizer.com, Jacquelyn Klug-McLeod2, Gregory Bakken2, Robert Stanton1

1 Computational Sciences Center of Excellence, Pfizer, Cambridge, Massachusetts, United States; 2 Computational Sciences Center of Excellence, Pfizer, Groton, Connecticut, United States
Discovery project teams depend heavily on data to drive decision making. This information usually comes from various data sources (external, internal, in silico, etc.) in numerous formats (chemistry, in vitro, in vivo, pharmacology, ADME). It can be a difficult and time consuming process to access, collate and manage the amount and the variety of information needed to support teams, and to then keep this data current and presented in a reusable format for multiple disciplines. Our Computational Sciences (CSCoE) group has developed a data provisioning platform to aggregate project information, allowing teams to focus on analyzing and visualizing their data to advance towards their goal(s).

4:20pm-4:50pm CINF 33: Chemical information on the web: Find and be found

Asta Gindulyte, mandroji@yahoo.com

National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, Maryland, United States
While numerous chemical data repositories are available to chemists these days, the task of finding and collating all the data relevant to the problem of interest can be arduous and time consuming. A lot of effort may be required in compiling a list of trusted resources and learning how to use them, their scope, quirks, etc. And even then, it is a laborious process to use multiple data sources – whether manually searching each database or writing custom code to perform the task. Thus, it is not surprising that many chemists are turning to generic web search engines such as Google and Bing for their preliminary research. However, just because the database is “on the web”, it does not necessarily mean that it can be searched effectively using Google. For instance, data on the melting point of aspirin might be in the database, but it won’t necessarily show up in Google search results. That can happen if the resource requires a login to access the data. But, it can also happen if the web pages being served by the resource are not optimized for web search engines.

This talk will discuss the strategies for effective chemical information searching on the web from two perspectives: the owners of the chemical information databases and the users of such databases. This includes the steps that resource providers can take to “expose” their data to web search engines whether they are behind a login wall or not, and will give examples of PubChem’s experience in this endeavor. For the researchers using the web to search for chemical information, this talk will share ideas on how to streamline the user experience and show how to build their own custom Google chemical search engine (no programming experience required).

CINF: CINF Scholarships for Scientific Excellence: Student Poster Competition 6:30pm - 8:30pm
Sunday, March 13
Room 3 - San Diego Convention Center
6:30pm-8:30pm CINF 34: Quantifying the effect that chemical environment exerts upon changes in property in matched molecular pairs analysis
Iva Lukac1, i.lukac@2013.ljmu.ac.uk, Andrew Leach1,3, Edward Griffen3, Alexander Dossetter2

1 School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, United Kingdom; 2 MedChemica Limited, Macclesfield, United Kingdom; 3 Medchemica Ltd, Macclesfield, United Kingdom
Matched Molecular Pairs Analysis (MMPA) is a technique that links differences in molecular structure with changes in properties and so allows useful information to be extracted and shared without disclosing full structures. By encoding the output as SMIRKS, it can then be used to suggest new molecules. It also provides the probability of each property changing in the desired direction. Like picking the winning horse by studying its previous results and current race conditions, the decision can be logical and data driven but success cannot be guaranteed.

MMPA has recently been applied to the ADMET databases of Roche, AstraZeneca and Genentech and the output combined to create what we believe is the world’s largest repository of medicinal chemistry knowledge, akin to a textbook. A by-product of this merging is that a large set of MMPA data are available that can answer fundamental questions about the technique: i) how many pairs are needed in order to be confident that a particular change in structure actually causes an increase (or decrease) in property? ii) for small sets of molecules, how large must the average change in property be in order to be confident that a particular change in structure actually causes an increase (or decrease) in property? iii) is there an upper limit to the number of pairs needed? iv) do chemically specific changes need less data than general changes? Each of these questions will be addressed.

A starting point for the analysis is that each pair can be viewed as a coin-flip experiment where an increase in a property corresponds to “heads” and a decrease to “tails”. This reduces the challenge to detecting biased coins. The coin flip analogy reduces the likelihood of misassigning the direction of change, but greatly reduces the number of structural transformations that are identified as having a significant effect upon properties. The large database available has permitted us to probe the link between the coin flip approach and the mean changes in property. The encoding of the chemical environment has further permitted us to analyse its impact which is to reduce the number of pairs required as the chemical environment becomes more specific.
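The coin-flip view described above can be illustrated with an exact binomial test. The following is a minimal sketch, not the authors' implementation: the function names, the two-sided test, and the 0.05 significance threshold are all assumptions chosen for illustration. It asks how unlikely an observed count of property increases would be if the transformation had no directional effect (a fair coin).

```python
from math import comb


def binomial_two_sided_p(increases: int, n_pairs: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value under the null hypothesis of a fair coin.

    Sums the probabilities of all outcomes that are at most as likely as
    the observed number of increases ("heads").
    """
    def pmf(k: int) -> float:
        return comb(n_pairs, k) * p_null**k * (1 - p_null) ** (n_pairs - k)

    observed = pmf(increases)
    total = sum(pmf(k) for k in range(n_pairs + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


def min_pairs_for_significance(alpha: float = 0.05) -> int:
    """Smallest number of pairs for which a unanimous result is significant.

    alpha is an illustrative threshold, not a value taken from the abstract.
    """
    n = 1
    while binomial_two_sided_p(n, n) >= alpha:
        n += 1
    return n


# Hypothetical example: 6 matched pairs, all showing an increase in the property.
print(binomial_two_sided_p(6, 6))
print(min_pairs_for_significance())
```

Under these assumptions, even a transformation that always moves the property in the same direction needs several pairs before the coin can be called biased, which is one way to read question (i) in the abstract.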


 

6:30pm-8:30pm CINF 35: CSNAP: A new chemoinformatics approach for target identification using chemical similarity networks
Yu-Chen Lo1, bennylo@ucla.edu, Silvia Senese1, Chien-Ming Li3, Qiyang Hu2, Yong Huang3, Robert Damoiseaux4, Jorge Torres1

1 Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California, United States; 2 Institute for Digital Research and Education, University of California, Los Angeles, Los Angeles, California, United States; 3 Drug Study Units, University of California, San Francisco, San Francisco, California, United States; 4 Molecular Shared Screening Resource, University of California, Los Angeles, Los Angeles, California, United States

Target identification is one of the most critical steps following cell-based phenotypic chemical screens to determine the molecular mechanism of drugs and remains the major bottleneck of many drug discovery programs for developing novel therapies. Traditional in silico target identification methods, including chemical similarity database searches, predict drug targets by single or sequential ligand similarity comparison, which have limited capabilities for accurate deconvolution of a large number of hits with diverse chemical structures. Here, we present CSNAP (Chemical Similarity Network Analysis Pulldown), a new computational target identification method that utilizes chemical similarity networks for large-scale chemotype recognition and drug target profiling from chemical screens. CSNAP orders query and annotated compounds into chemical similarity networks for rapid chemotype classification followed by consensus target scoring to identify the most probable target for query compounds in the network. Our benchmark study showed that CSNAP achieved higher target prediction accuracy than traditional target identification approaches. Additionally, CSNAP is capable of integrating with biological knowledge-based databases and high-throughput biology platforms for system-wide drug target validation. To demonstrate the utility of the CSNAP approach, we combined CSNAP's target prediction with experimental ligand evaluation to identify the major mitotic targets of hit compounds from a cell-based chemical screen and we highlight novel compounds targeting microtubules, an important cancer therapeutic target. The CSNAP method is freely available and can be accessed from the CSNAP web server (http://services.mbi.ucla.edu/CSNAP/).
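The general idea of a chemical similarity network with consensus target voting can be sketched in a few lines. This is only an illustration under stated assumptions and is not the CSNAP scoring scheme: fingerprints are represented as plain sets of integers, the 0.5 similarity threshold is arbitrary, and the compound names and target annotations are hypothetical.

```python
from collections import Counter


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_network(fingerprints: dict, threshold: float) -> dict:
    """Link every pair of compounds whose similarity meets the threshold."""
    names = list(fingerprints)
    edges = {name: set() for name in names}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if tanimoto(fingerprints[u], fingerprints[v]) >= threshold:
                edges[u].add(v)
                edges[v].add(u)
    return edges


def consensus_target(query: str, edges: dict, annotations: dict):
    """Return the target most often annotated among the query's neighbors."""
    votes = Counter()
    for neighbor in edges[query]:
        votes.update(annotations.get(neighbor, ()))
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy data: one query hit and three annotated reference compounds.
fingerprints = {
    "query": {1, 2, 3, 4},
    "ref_a": {1, 2, 3, 5},
    "ref_b": {2, 3, 4, 6},
    "ref_c": {7, 8, 9},
}
annotations = {
    "ref_a": ["tubulin"],
    "ref_b": ["tubulin", "kinase_x"],
    "ref_c": ["kinase_x"],
}
network = build_network(fingerprints, threshold=0.5)
print(consensus_target("query", network, annotations))
```

A real workflow would compute fingerprints with a cheminformatics toolkit and weight votes by similarity or network topology; the simple majority vote here is a stand-in.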

6:30pm-8:30pm CINF 36: Prediction and quantification of cation-π interactions in ligand-bromodomain binding: Using quantum chemistry to capture electronic effects
Wilian Augusto Cortopassi, wilian.cortopassi@chem.ox.ac.uk, Robert Paton

Chemistry Research Laboratory, University of Oxford, Oxford, United Kingdom
CREBBP bromodomains are protein modules that recognize acetylated lysine residues, and their selective inhibition shows potential in the design of more effective molecules to treat cancer. Recently, combining experimental and computational studies [1], we have discovered that the ability of a series of dihydroquinoxalinone (DHQ) derivatives to bind to the CREBBP receptor is strongly influenced by the potential to form a cation-π interaction with an active site arginine residue. To gain more insight into the importance of the cation-π interaction for the design of other CREBBP inhibitors, we took the systematic modification of aromatic substituents for a series of 5-isoxazolyl-benzimidazole CREBBP inhibitors and constructed a quantitative structure-activity relationship (QSAR) based on the computed electrostatic potential of each π-system. Our model has been trained against literature data (R2 = 0.88, n=15) of binding affinities of 5-isoxazolyl-benzimidazoles and shows promise in testing against newly synthesized DHQ derivatives. Our work shows that a quantum chemical consideration of the electrostatic potential at a point remote from the ring center is a necessary condition to obtain good quantitative agreement, and leads to an improved qualitative understanding of binding and recognition involving inhibitors in the CREBBP active site.

[1] Rooney, T. P. C. et al. Angew. Chem. Int. Ed. 2014, 126(24), 6240-6244.

6:30pm-8:30pm CINF 37: 3Dmol.js: Chemical structure visualization for the modern web
Jasmine Collins1, jlc206@pitt.edu, Matthew Ragoza3, Justin Jensen4, David Koes2

1 Computer Science/Neuroscience, University Of Pittsburgh, Pittsburgh, Pennsylvania, United States; 2 Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania, United States; 3 University Of Pittsburgh, Pittsburgh, Pennsylvania, United States; 4 Pittsburgh Science & Technology Academy, Pittsburgh, Pennsylvania, United States

3Dmol.js is an object-oriented JavaScript library for visualizing 3D molecular data in the browser that does not require Java or plugins and provides interactive performance comparable to desktop applications. We will review the essential features of 3Dmol.js as well as describe several recent improvements, including improved rendering styles, support for animation, and improved crystallographic features.


Outlined cartoon style

6:30pm-8:30pm CINF 38: General purpose 2D and 3D similarity approach to identify hERG blockers
Patric Schyman, pschyman@bhsai.org, Ruifeng Liu, Anders Wallqvist

DoD Biotechnology High Performance Computing Software Applications Institute, Frederick, Maryland, United States
Screening compounds for human ether-à-go-go-related gene (hERG) channel inhibition is an important component of early-stage drug development and assessment. In this study, we developed a high-confidence hERG prediction model based on a combined two-dimensional (2D) and three-dimensional (3D) modeling approach. We developed a 3D Similarity Conformation Approach (SCA) based on examining a limited fixed number of pairwise 3D similarity scores between a query molecule and a set of known hERG blockers. By combining 3D SCA with 2D Similarity Ensemble Approach (SEA) methods, we achieved a maximum sensitivity in hERG inhibition prediction with an accuracy not achieved by either method separately. The combined model achieved 69% sensitivity and 95% specificity on an independent external data set. Further validation showed that the model correctly picked up documented hERG inhibition or interactions among the Food and Drug Administration-approved drugs with the highest similarity scores, with 18 of 20 correctly identified. The combination of ascertaining 2D and 3D similarity of compounds allowed us to synergistically use 2D fingerprint matching with 3D shape and chemical complementarity matching.
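For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity are computed and how two binary classifiers might be combined. The abstract does not state how the 2D and 3D predictions were merged, so the logical OR used here is purely an assumption, and all labels and predictions are made-up toy data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for boolean labels and predictions.

    Assumes at least one positive and one negative label are present.
    """
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fn = sum(t and not p for t, p in zip(y_true, y_pred))
    tn = sum(not t and not p for t, p in zip(y_true, y_pred))
    fp = sum(not t and p for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: True means "hERG blocker".
is_blocker = [True, True, True, True, False, False, False, False]
flagged_2d = [True, True, False, False, False, False, False, False]
flagged_3d = [False, True, True, False, True, False, False, False]

# Assumed combination rule: flag a compound if either method flags it.
flagged_combined = [a or b for a, b in zip(flagged_2d, flagged_3d)]

print(sensitivity_specificity(is_blocker, flagged_2d))
print(sensitivity_specificity(is_blocker, flagged_combined))
```

With an OR rule, sensitivity can only stay the same or rise relative to either method alone, while specificity can only stay the same or fall, which is the trade-off any combined model has to balance.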

6:30pm-8:30pm CINF 39: Indexing techniques and algorithms to efficiently mine interaction patterns in large sets of protein-ligand-complexes
Therese Inhester2, inhester@zbh.uni-hamburg.de, Matthias Rarey1

1 University of Hamburg, Hamburg, Germany; 2 Center for Bioinformatics, University of Hamburg, Hamburg, Germany
The number of high-quality protein-ligand structures is increasing every year and opens the route for a detailed knowledge discovery needed for various structure-based molecular design tasks. In particular, the analysis and comparison of spatial arrangements of atoms (interaction patterns) in large sets of protein-ligand interfaces can help deepen the understanding of molecular recognition and improve the design of high-affinity ligands. The deduction of such information, however, requires efficient database mining systems that are capable of dealing with the complexity of varying spatial queries in reasonable time.
To address this need we developed a new database and connected retrieval system which is able to mine large sets of protein-ligand complexes for specific interaction patterns. Highly efficient indexing techniques and different graph-based algorithms are used to rapidly detect all occurrences of a spatial query. The query consists of several atom descriptions which are connected by distance intervals or precompiled interactions. An atom description can be further refined by a molecule type and a substructure it belongs to. Additionally, constraints for physico-chemical properties can be combined with the spatial query. A graphical user interface allows an intuitive definition of queries starting from scratch or from a known binding site. Results can easily be analyzed and compared in a 3D viewer showing the structures aligned to the query.
In this contribution we will present the structure of the spatial queries used in our approach and explain in detail which indexing techniques and algorithms we used to efficiently mine large sets of protein-ligand structures.
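The query structure described above (atom descriptions connected by distance intervals) can be illustrated with a brute-force matcher. This sketch deliberately omits the indexing techniques that are the subject of the talk; the data layout, function name, and coordinates are hypothetical, and atom descriptions are reduced to element symbols.

```python
from math import dist


def find_matches(atoms, query_atoms, constraints):
    """Find all assignments of structure atoms to query atoms.

    atoms:        list of (element, (x, y, z)) tuples
    query_atoms:  list of element symbols, one per query atom
    constraints:  {(i, j): (d_min, d_max)} distance intervals between
                  query atoms i and j (indices into query_atoms)
    Returns a list of tuples of atom indices, one index per query atom.
    """
    matches = []

    def extend(assignment):
        k = len(assignment)
        if k == len(query_atoms):
            matches.append(tuple(assignment))
            return
        for idx, (element, xyz) in enumerate(atoms):
            if idx in assignment or element != query_atoms[k]:
                continue
            # Check every constraint that becomes fully assigned by placing atom k.
            satisfied = True
            for (i, j), (d_min, d_max) in constraints.items():
                if max(i, j) == k:
                    partner = atoms[assignment[min(i, j)]][1]
                    if not d_min <= dist(xyz, partner) <= d_max:
                        satisfied = False
                        break
            if satisfied:
                extend(assignment + [idx])

    extend([])
    return matches


# Hypothetical toy structure (coordinates in angstroms).
atoms = [
    ("N", (0.0, 0.0, 0.0)),
    ("O", (2.9, 0.0, 0.0)),
    ("O", (6.0, 0.0, 0.0)),
    ("C", (0.0, 1.5, 0.0)),
]
# Query: an N within 2.5-3.5 of an O, and the same N within 1.0-2.0 of a C.
print(find_matches(atoms, ["N", "O", "C"], {(0, 1): (2.5, 3.5), (0, 2): (1.0, 2.0)}))
```

The backtracking search grows exponentially with query size on large structures, which is the kind of cost that a purpose-built index is meant to avoid.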

6:30pm-8:30pm CINF 40: Development and application of multiclass QSAR models for predicting human skin sensitization
Vinicius Alves3,2, viniciusm.alves@gmail.com, Alexey Zakharov1, Eugene Muratov3, Denis Fourches5, Nicole Kleinstreuer4, Judy Strickland4, Carolina Andrade2, Alexander Tropsha3

1 CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, Maryland, United States; 2 Faculty of Pharmacy, Federal University of Goias, Goiania, Goias, Brazil; 3 UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States; 4 Contractor supporting the NTP Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), ILS, Inc., Research Triangle Park, North Carolina, United States; 5 Department of Chemistry and Bioinformatics Research Center, North Carolina State University, Chapel Hill, North Carolina, United States
We have developed classification QSAR models to predict chemical skin sensitization potential (non-sensitizers, weak, moderate, and strong/extreme sensitizers). We have (i) compiled, curated, and integrated the largest publicly-available dataset comprising 515 chemicals tested in the LLNA assay; (ii) used this data to generate and validate multi-class QSAR models; (iii) established SAR rules for skin sensitizers; and (iv) employed QSAR models for virtual screening of the COSMOS database of cosmetics inventory. Models developed with SiRMS and PASS/QNA descriptors using random forest and deep learning techniques showed total average accuracy of 63% and coverage of 59% as applied to several external validation sets. Virtual screening of the COSMOS inventory yielded 250 putative strong skin sensitizers. Structural fragments promoting or interfering with skin sensitization potential as well as structural modifications required to decrease chemical toxicity were identified. Models developed herein could be used to guide the rational design of safe cosmetic products.

6:30pm-8:30pm CINF 41: Virtual screening in the cloud computing environment
Aaron Cooper1, aaron.cooper@stlcop.edu, Mathew Koebel3, Grant Schmadeke1, Suman Sirimulla2

1 Basic Sciences, St. Louis College of Pharmacy, St. Louis, Missouri, United States; 2 Basic Sciences, St. Louis College of Pharmacy, St. Louis, Missouri, United States
Cloud computing is becoming increasingly popular because of its ease, convenience and on-demand access to a shared pool of configurable computing resources. Several industries are turning to cloud technology as an efficient way to improve service quality, because it can reduce overhead costs and downtime and automate infrastructure development. Here we present an application that utilizes the Amazon Web Services platform to run virtual screening in a cloud environment. Virtual screening (VS) is a widely used computational technique in the drug discovery process. It is used to search libraries of small molecules in order to identify the compounds that are most likely to bind the drug target (usually a protein or DNA). It involves running molecular docking calculations on a defined macromolecule against a library of small molecules. Currently there are several publicly available and downloadable chemical libraries (Chembridge, ZINC, etc.). These chemical libraries range from a few million to a couple billion ligand structures. Screening such big libraries is a computationally expensive task. This process can be expedited by running these calculations in a massively parallel fashion on clustered computers containing several nodes. Here we present “VSCloud,” an application that is optimized to run virtual screening on Amazon Web Services (AWS) and is available to users on the AWS Marketplace. It would be an on-demand service for which users pay per use. Users do not have to worry about maintaining the computer hardware, cloud infrastructure or the software.
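The split, score, and merge pattern behind parallel virtual screening can be sketched as follows. This is not VSCloud code: the scoring function is a hypothetical placeholder for a real docking calculation, a local thread pool stands in for cluster nodes, and the ligand strings are arbitrary examples.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def mock_docking_score(ligand: str) -> float:
    """Hypothetical stand-in for a docking run; lower scores are better."""
    return -float(len(set(ligand)))


def chunked(items, size):
    """Split a library into fixed-size chunks, one unit of work per chunk."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def screen_chunk(ligands):
    """Score every ligand in one chunk (the job a single node would run)."""
    return [(mock_docking_score(ligand), ligand) for ligand in ligands]


def virtual_screen(library, chunk_size=2, workers=4, top_n=3):
    """Distribute chunks over workers and merge the best-scoring ligands."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunk_results = list(pool.map(screen_chunk, chunked(library, chunk_size)))
    scored = [pair for chunk in chunk_results for pair in chunk]
    return heapq.nsmallest(top_n, scored)


# Hypothetical five-compound library written as SMILES strings.
library = ["CCO", "c1ccccc1", "CC(=O)O", "CCN", "O"]
print(virtual_screen(library, top_n=1))
```

Because each ligand is scored independently, the work divides cleanly across nodes; in a cloud deployment the thread pool would be replaced by separate machines and the merge step by a shared results store.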

6:30pm-8:30pm CINF 42: Structural evolution of Tcn (n = 4–20) clusters from first-principles global minimization
Chad Priest1, cprie003@ucr.edu, De-en Jiang2

1 Chemistry, University of California, Riverside, Riverside, California, United States; 2 Department of Chemistry, University of California, Riverside, Riverside, California, United States
We investigate the structural evolution of Tcn (n = 4–20) clusters using a first-principles global minimization technique, namely, basin-hopping from density functional theory geometry optimization (BH-DFT). BH-DFT permits the exploration of a large configurational landscape for the evolution of Tc nanostructures. The method yielded significantly more stable structures than previous models, indicating the power of DFT-based basin hopping in finding new structures for clusters. The growth sequence and pattern for n from 4 to 20 are analyzed from the perspective of geometric shell formation. The binding energy per atom, relative stability, and magnetic moments are examined as a function of the cluster size. Several magic sizes of higher stability and symmetry are discovered for cluster sizes 6, 10, 12, 14, and 18. In particular, we find that Tc19 prefers an Oh symmetry structure, resembling a piece of a face-centered-cubic metal, and its electrostatic potential map shows interesting features that indicate special reactivity of the corner atoms. Furthermore, when comparing the structural evolution with neighboring elements manganese and ruthenium, a strikingly similar cubic-like structural pathway is observed with ruthenium while a disparate icosahedral growth pattern is noticed for the same group element manganese.
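The basin-hopping loop itself (perturb, locally minimize, accept or reject) can be shown on a toy problem. In this sketch a one-dimensional test function replaces the DFT energy and plain gradient descent replaces DFT geometry optimization; the step size, temperature, hop count, and seed are arbitrary assumptions, not values from the study.

```python
import math
import random


def energy(x: float) -> float:
    """Toy energy surface with many local minima (stand-in for a DFT energy)."""
    return math.cos(14.5 * x - 0.3) + (x + 0.2) * x


def gradient(x: float) -> float:
    """Analytic derivative of the toy energy."""
    return -14.5 * math.sin(14.5 * x - 0.3) + 2.0 * x + 0.2


def local_minimize(x: float, lr: float = 0.002, steps: int = 2000) -> float:
    """Plain gradient descent (stand-in for a geometry optimization)."""
    for _ in range(steps):
        x -= lr * gradient(x)
    return x


def basin_hopping(x0, n_hops=200, step=0.5, temperature=1.0, seed=0):
    """Monte Carlo walk over local minima with Metropolis acceptance."""
    rng = random.Random(seed)
    current = local_minimize(x0)
    current_e = energy(current)
    best, best_e = current, current_e
    for _ in range(n_hops):
        # Random perturbation followed by relaxation into the nearest minimum.
        trial = local_minimize(current + rng.uniform(-step, step))
        trial_e = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if trial_e < current_e or rng.random() < math.exp(-(trial_e - current_e) / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = current, current_e
    return best, best_e


start_energy = energy(local_minimize(1.0))
best_x, best_energy = basin_hopping(1.0)
print(start_energy, best_x, best_energy)
```

For atomic clusters the perturbation would displace atomic coordinates in three dimensions and each relaxation would be a full electronic-structure calculation, which is why the number of hops is the dominant cost.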

CINF: Beyond Digitized Paper: The Next Generation of ELNs 8:15am - 12:00pm
Monday, March 14
Room 24C - San Diego Convention Center
Erin Davis, David Deng, Organizing
Erin Davis, David Deng, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 50: Toward semantic representation of science in electronic laboratory notebooks (ELNs)
Stuart Chalk, schalk@unf.edu

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States
An electronic laboratory notebook (ELN) can be characterized as a system that allows scientists to capture the data and resources used in performing scientific experiments. This allows users to easily organize and find their data; however, little information about the scientific process is recorded.

In this paper we present initial attempts to integrate an Electronic Notebook Ontology (ENO) into the Eureka! Research Workbench, an open source semantic ELN. A discussion of the ENO, integration into the backend data store of Eureka!, and possibilities using this approach will be presented.

8:45am-9:10am CINF 51: New cloud-based ELN with built-in raw analytical data support and automatic structure confirmation capabilities
Santiago Dominguez Vivero1, sdominguez@mestrelab.com, Juan Cobas Gomez1, Santiago Fraga Castro1, Francisco Javier Sardina2

1 Mestrelab Research SL, Hereford, Herefordshire, United Kingdom; 2 Chemistry, University of Santiago de Compostela, Santiago De Compostela, A Coruña, Spain
Laboratory notebooks represent a critical component of the research and development workflow of many companies and academic and research groups. Whilst many research organizations are still using handwritten recording procedures, Electronic Laboratory Notebooks (ELNs) are progressively replacing traditional paper books in both commercial research establishments and academic institutions.
A plethora of new ELN products have been rolled out across the pharmaceutical and chemical industry by a rising legion of vendors in recent years. The result is a motley crew of ELNs, ranging from generic authoring tools to custom solutions dedicated to very specific scientific disciplines.

In this talk we present Mbook, a new cloud-based ELN (in-house or client-server versions are also supported) specially designed for the field of organic synthesis, which implements unique features for analytical data handling that make it stand apart from other ELN solutions. These include:

- Ability to automatically process, analyse, store and report raw NMR data recorded on instruments from all NMR manufacturers (both high-field and benchtop NMR instruments are supported), as well as LC/GC/MS data acquired in many different formats.
- A powerful new fully automatic structure verification (ASV) system for small molecules using NMR, LC/GC/MS or both jointly. More specifically, this ASV module can be used with raw samples before purification to get a quick assessment of whether or not the expected product has been successfully produced by a reaction. In addition, it can be used with purified samples to gain a higher degree of confidence in the proposed molecular structure, for example for registration, which is also supported in a single click through the integration between Mbook and a cloud-based Mestrelab registration system.


NMR data uploaded to Mbook and viewed within its web interface

9:10am-9:35am CINF 52: Mobile interfaces for a digital research notebook
Jeremy Frey2, j.g.frey@soton.ac.uk, Cerys Willoughby2, Simon Coles1, Richard Whitby3, Colin Bird2

1 University of Southampton, Hampshire, United Kingdom; 2 University of Southampton, Southampton, United Kingdom; 3 University of Southampton, Southampton, Hants, United Kingdom
Despite the clear advantages of digital notebooks over the traditional paper log, laboratory researchers continue to show reluctance to make the transition. Some of the rationale for their inhibitions arises from the ease with which they can note their observations on paper, whereas their desktop computer will almost certainly be located outside the lab. It is an undeniable fact that capturing experiment metadata at source will produce a more reliable record than relying on subsequent recall.

Using mobile devices to curate experimental observations in the laboratory can overcome this difficulty. In this paper we describe the evolution of mobile experiment recording at Southampton, from an early prototype developed during the CombeChem project through to Notelus, a native iPad application that can interface directly to an ELN and also has a linked experiment planning application. We project forward to the vision of a generic Digital Research Notebook (DRN) that has all the required functionality, implemented with an Application Programming Interface (API) that enables the DRN to be interface-agnostic. Experiments would be recorded in the lab using a mobile (e.g., tablet, phone, camera) interface, using cues to encourage metadata capture at source. Subsequent data analysis would be done out of the lab, using a desktop or web interface to the same API as used by the mobile device.

9:35am-10:00am CINF 53: Not just another reaction database
Aileen Day2, Valery Tkachenko2, tkachenkov@rsc.org, Alexey Pshenichnov2, Leah McEwen1, Simon Coles3, Richard Whitby3

1 Clark Library, Cornell University, Ithaca, New York, United States; 2 Royal Society of Chemistry, Rockville, Maryland, United States; 3 University of Southampton, Hampshire, United Kingdom
The need for a high quality reaction database underpins synthetic reaction planning, as highlighted by the roadmap of the Dial-a-Molecule grand challenge [1] (the aims of which are to be able to predict the outcome of a reaction a priori and therefore generate products on demand, and also to optimise a reaction).
A number of reaction databases are available [2]; most of these focus on storing basic reaction schemas and details, and link to publications for more details. However, their main limitation is that, because their major source is the abstraction of published literature, insufficient structured reaction detail is recorded:
- for someone else to reproduce the reaction
- to fully record all reaction products (not just the target product)
- to capture previous attempts to reach the optimised reaction route, so that this 'work-up' can be correlated to allow better prediction of reaction outcomes.
As a result, the reactions domain of the chemical data repository that the Royal Society of Chemistry is developing will capture:
- reactions and processes directly from Electronic Lab Notebooks
- reactions which gave low yields or unintended products
- processes, parameters and equipment in S88 process recipe [3] style for maximum reproducibility
- multistep reactions
- reactants, products etc. not just as small organic molecules
- raw characterisation data linked to products
We will demonstrate a first version, populated with reactions text-mined from RSC articles and examples of notebook reactions and processes as recorded by an academic research group at Cornell University.
[1] Dial a Molecule Grand Challenge, http://generic.wordpress.soton.ac.uk/dial-a-molecule/ (accessed Oct 8, 2015)
[2] Organic Chemistry Resources Worldwide, http://www.organicworldwide.net/content/reaction-databases (accessed Oct 8, 2015)
[3] ISA, 'Batch Control Part 1: Model and Terminology,' The International Society for Measurement and Control, ISA Press, ISA - S88.01-1995

10:00am-10:15am Intermission
10:15am-10:40am CINF 54: Directly upload data from an ELN into PubChem
Ben Shoemaker, shoemake@mail.nih.gov, Asta Gindulyte, Evan Bolton, Steve Bryant

NCBI / NLM / NIH, Bethesda, Maryland, United States
Managing data in a laboratory across multiple projects, instruments, collaborators and people coming and going requires the organization and flexibility that an electronic laboratory notebook (ELN) can provide. Such automation at publication time, however, falls short when it comes to submitting data to a public database. Public accessibility of data is often a required step in journal publication or grant renewal. Public repositories offer a stable, well-publicized platform in which to house such data. During the time crunch of publishing or an administrative deadline, an additional data upload procedure must be initiated to load and often reformat the data to maximize its exposure and usability.
PubChem is providing the PubChem Upload RESTful programming interface (API) to allow ELN providers to directly incorporate data publishing into PubChem as a built-in tool or button. A suite of features lets a submitter upload data while scheduling a delayed release. Such flexibility provides the submitter with immediate, uniquely-generated PubChem data identifiers to prove public submission while timing the actual release with a publication or other future date. In addition, semi-private view codes can be generated to share access with collaborators and administrators. Simultaneous access to the PubChem Upload web interface is available throughout the process.
The PubChem Upload API offers seamless integration of public data reporting via existing ELN systems.

10:40am-11:05am CINF 55: Intuitive collaboration platform: A Scilligence story
Rajeev Hotchandani1, hotchandani@yahoo.com, Jinbo Lee2

1 Scilligence, Watertown, Massachusetts, United States; 2 Scilligence Corporation, Burlington, Massachusetts, United States
Scilligence is a leading innovator of cross-platform, mobile cheminformatics and bioinformatics solutions. Scilligence's proprietary technologies address three main areas of R&D informatics needs: knowledge management and collaboration; knowledge mining of unstructured data; project and material management.
Scilligence’s enterprise platforms, such as its ELN, enhance knowledge sharing and the productivity of researchers in the discovery and development of small molecule and biologics therapeutics. Scilligence ELN is a cross-platform Electronic Laboratory Notebook for research and education. It has been widely adopted across industry and academic institutions handling a multitude of internal and external collaborations. What’s unique about Scilligence ELN is its powerful data mining technology and advanced informatics for biologics and conjugates such as ADCs (antibody-drug conjugates). Scilligence ELN supports all disciplines of research including medicinal chemistry, process chemistry, bioassays, HTS, in vivo pharmacology and toxicology.
Scilligence’s applications are specifically designed to minimize IT footprints and require no client-side installation.

11:05am-11:30am CINF 56: ACAS LIMS simplifies diverse data loading, management, and querying
John McNeil, john@mcneilco.com, Guy Oshiro, guy@mcneilco.com, Brian Fielder, bfielder@hmc.edu, Eva Gao, Samuel Meyer, Brian Bolt, Fiona McNeil, Matthew Shaw, Kelley Carr

John McNeil & Co., San Diego, California, United States
Traditional ELNs have historically struggled with the integration of disparate experimental data into the digital notebook. Limitations on what types of data can be loaded and the complexity of reformatting data are barriers to effective usage. Traditional ELNs capture the information about the experiment but do not facilitate querying and reporting of the experimental data. We developed ACAS LIMS (Assay Capture & Analysis System) to provide a streamlined approach to load disparate assay types, and built a powerful querying and reporting engine to transform the data into knowledge. The cornerstone of ACAS is the Experiment Loader module, which enables the rapid entry of experimental and computational data into the system while capturing context about the protocol and experiment. ACAS LIMS also includes a Compound Registration module that tracks parent compounds, lots, and synthesis reactions. The ACAS Data Viewer module facilitates easy retrieval of all of the linked data by structure search or simple text searches. ACAS is an integral part of the experimental data processing workflow.

11:30am-11:55am CINF 57: ChemEngine: An automated chemical data harvesting tool for molecular inventory and chemical computing from scientific literature

Muthukumarasamy Karthikeyan1, karthincl@gmail.com, Renu Vyas2

1 Digital Information Resource Centre, CSIR National Chemical Laboratory, Pune, India; 2 Chemical Engineering and Process Development, CSIR-National Chemical Laboratory, Pune, MH, India
There is an urgent need for scientific data sharing and data standardization in the public domain, as required for research supported by government grants. Usually the chemical data published in the scientific literature are very hard to retrieve, recover and re-use. Chemoinformatics tools are mature enough to harvest chemical data to a certain extent, especially for converting names to structures and vice versa. Chemoinformatics tools have also been developed to convert molecular data in image format to the corresponding connection tables and chemical names, with moderate success. Recently we developed a chemoinformatics application for harvesting chemical data from scientific literature in plain text and PDF formats that goes beyond simple name-to-structure conversion. The methodology involves parsing raw data from scientific literature in PDF format, recognizing the chemical data, and converting them efficiently into molecular data for reuse in inventory (ELN applications) and chemical computing (using QM/QC tools). The truly computable molecules generated using this approach were directly subjected to further atomic-level energy calculations. The challenges of selectively recognizing chemical data among large amounts of non-chemical data and applying them to inventory applications and further computing will be discussed. ChemEngine, a chemoinformatics tool developed for this approach, will be presented with case studies, especially on performance and optimization.

11:55am-12:00pm Concluding Remarks

CINF: Global Initiatives in Research Data Management & Discovery 8:15am - 11:55am
Monday, March 14
Room 25B - San Diego Convention Center
Ian Bruno, Leah McEwen, Organizing
Leah McEwen, Presiding
Cosponsored by: ANYL, COMP, MEDI and PHYS
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 43: PubChem BioAssay: A decade’s practice for managing chemistry research data
Yanli Wang, ywang@ncbi.nlm.nih.gov

NCBI, NLM, NIH, Building 38A, Room 5S506, 8600 Rockville Pike, Bethesda, Maryland, United States
The PubChem BioAssay database was created in 2004 by the National Center for Biotechnology Information (NCBI) as a public repository for biochemical biology and medicinal chemistry data on small molecules. The database now contains over 1,000,000 bioassay records (BioAssay accession, AID), 200 million bioactivity outcomes, and tens of thousands of protein and gene targets. Building this public information system has been an effort with challenges on many fronts. This presentation will describe the project history, the ten-year, multiple-cycle, and still continuing development, and the current functionalities provided at PubChem. BioAssay data may be freely accessed and downloaded using the NCBI information retrieval system Entrez at http://www.ncbi.nlm.nih.gov/pcassay/. A suite of services provided by PubChem is available at http://pubchem.ncbi.nlm.nih.gov, including the most recent development for the BioAssay record page at https://pubchem.ncbi.nlm.nih.gov/bioassay/myAID. Chemical structures and assay results may be deposited via the submission system at: http://pubchem.ncbi.nlm.nih.gov/upload. PubChem welcomes feedback and contributions from the community.

8:45am-9:15am CINF 44: Data infrastructural design for informing critical evaluation
Kenneth Kroenlein, kenneth.kroenlein@nist.gov

Thermodynamics Research Center, National Institute of Standards and Technology, Boulder, Colorado, United States
Exponential growth rates in data generation combined with non-negligible error rates in the scientific literature [1] have conspired to make critical evaluations impracticable in many scenarios for the average scientist. Data volumes have grown to such a degree that many traditional data collection and interpretation approaches cannot scale sufficiently to remain comprehensive and current, or to effectively track shifting interests within research and industrial communities. It is thus necessary to strongly rely on a substantially increased role for digital archives, automated analysis and machine learning approaches.

The approach adopted at the Thermodynamics Research Center (TRC) at the National Institute of Standards and Technology (NIST) is dynamic data evaluation, whereby a reliable and comprehensive data archive is used in conjunction with an algorithmically-encoded expert analysis in order to generate up-to-date property recommendations. In developing an infrastructure to support this high-throughput critical analysis for thermophysical properties, TRC staff have created technological solutions for data collection, curation and communication to quickly meet the challenges encountered under real-world conditions. These include user experience-driven data entry tools, and international data standards and ad hoc derivatives of them. The particulars of these tools will be discussed, as well as the importance of being both proactive and reactive in the information technology development process.

[1] Chirico, R.D.; Frenkel, M.; Magee, J.W.; Diky, V.; Muzny, C.D.; Kazakov, A.; Kroenlein, K.; Abdulagatov, I.; Hardin, G.; Acree, W.E.; Brennecke, J.F.; Brown, P.L.; Cummings, P.T.; de Loos, T.W.; Friend, D.G.; Goodwin, A.R.H.; Hansen, L.D.; Haynes, W.M.; Koga, N.; Mandelis, A.; Marsh, K.N.; Mathias, P.M.; McCabe, C.; O’Connell, J.P.; Pádua, A.; Rives, V.; Schick, C.; Trusler, J.P.M.; Vyazovkin, S.; Weir, R.D.; Wu, J. “Improvement of Quality in Publication of Experimental Thermophysical Property Data: Challenges, Assessment Tools, Global Implementation, and Online Support” J. Chem. Eng. Data 2013, 58, 2699-2716.

9:15am-9:40am CINF 45: Community-driven disciplinary data repositories: A case study
Ian Bruno, bruno@ccdc.cam.ac.uk, Colin Groom

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom
In 2015 we celebrated the fiftieth anniversary of the Cambridge Structural Database, the world’s primary repository for small molecule crystal structure data. The database has come a long way since its origins as a series of printed indices first published in 1965. Today crystallographers across the globe deposit over 60,000 datasets per year and researchers across disciplines apply the knowledge derived from over 800,000 structures to a range of scientific challenges.

A key driver in the development of the Cambridge Structural Database has been the input and support of the wider research community. This has come from academia and industry, publishers and researchers, scientific unions and individuals. This presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database, will provide a brief reflection on the last half century. We’ll look at the key moments where community engagement was pivotal to the establishment and evolution of a widely respected disciplinary data repository.

9:40am-10:10am CINF 46: ICSU World Data System: Trusted data services for global science
Mustapha Mokrane1, Jean-Bernard Minster2, jbminster@ucsd.edu, Rorie Edmunds1

1 International Programme Office, ICSU World Data System, Koganei, Tokyo, Japan; 2 Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, La Jolla, California, United States
Today’s research is international, transdisciplinary, and data-enabled, which requires scrupulous data stewardship, full and open access to data, and efficient collaboration and coordination. New expectations on researchers based on policies from governments and funders to share data fully, openly, and in a timely manner present significant challenges but are also opportunities to improve the quality and efficiency of research and its accountability to society. Researchers should be able to archive and disseminate data as required by many institutions or funders, and civil society should be able to scrutinize datasets underlying public policies. Thus, the trustworthiness of data services must be verifiable. In addition, the need to integrate large and complex datasets across disciplines and domains with variable levels of maturity calls for greater coordination to achieve sufficient interoperability and sustainability.

The World Data System (WDS) of the International Council for Science (ICSU) promotes long-term stewardship of, and universal and equitable access to, quality-assured scientific data and services across a range of disciplines in the natural and social sciences. WDS aims at coordinating and supporting trusted scientific data services for the provision, use, and preservation of relevant datasets to facilitate scientific research, in particular under the ICSU umbrella, while strengthening their links with the research community. WDS certifies its Members, holders and providers of data or data products, using internationally recognized standards, thus providing the building blocks of a searchable common infrastructure, from which a data system that is both interoperable and distributed can be formed.

This presentation will describe the coordination role of WDS and more specifically activities developed by its Scientific Committee to:
– Improve and stimulate basic level Certification for Scientific Data Services, in particular through collaboration with the Data Seal of Approval.
– Identify and define best practices for Publishing Data and to test their implementation by involving the core stakeholders, namely, researchers, institutions, data centres, scholarly publishers, and funders.
– Establish an open WDS Metadata Catalogue, Knowledge Network, and Global Registry of Trusted Data Services.

10:10am-10:25am Intermission
10:25am-10:55am CINF 47: STRENDA and MIRAGE: Examples of community-based data reporting standardization initiatives
Martin Hicks, mhicks@beilstein-institut.de, Carsten Kettner, ckettner@beilstein-institut.de

Beilstein Institut, Frankfurt, Germany
An essential requirement for scientific progress is unrestricted access to research results in a form that is directly usable by researchers. However, there are many deficiencies in the way that data are currently reported, resulting often in incomplete and even unusable data sets that are not suitable for subsequent research and knowledge generation. There are various reasons for this, ranging from the lack of a framework for structured and standardized data reporting to the largely outdated infrastructure for reporting and publishing scientific research results. The diverse data requirements in scientific research mean that no one-size-fits-all solution would be applicable; thus domain-specific guidelines and infrastructure for reporting and data management are required.

The Beilstein-Institut has initiated and runs two data standards projects: STRENDA, which is concerned with the standardization of reporting enzymology data, and MIRAGE, with the reporting of glycomics experimental results. Each project is made possible and advanced by a commission of experts in these fields, who work together in defining the reporting standards. The STRENDA reporting guidelines were published in 2010 and have been recently implemented as a web-based front-end to the STRENDA-DB, enabling an open access database of validated enzymatic experimental data to be built up. This talk will address the issues that have had to be overcome in setting up an effective bottom-up mechanism to create workable solutions – from scientists for scientists – in these projects.

10:55am-11:25am CINF 48: Standardizing the description of nanomaterials: The CODATA uniform description system
John Rumble1, jumbleusa@earthlink.net, Steven Freiman2, Clayton Teague3

1 R&R Data Services, Gaithersburg, Maryland, United States; 2 Freiman Consulting, Potomac, Maryland, United States; 3 Teague Consulting, Gaithersburg, Maryland, United States
The complexity and newness of nanomaterials have made describing them accurately a challenge. Traditional nomenclature systems for chemicals and bulk engineering materials do not capture the nanoscale details and features that give nanomaterials their interesting properties. CODATA has established an international, multi-disciplinary working group to develop a uniform description system for materials on the nanoscale (UDS). The UDS has been designed to meet the needs of diverse user communities, including researchers, regulators, and database managers in disciplines ranging from chemistry and materials science to food science, nutrition science, and toxicology. The UDS has identified the information categories and descriptors useful for describing individual nano-objects and collections of nano-objects, including those in various media such as biological and environmental fluids, as well as nano-objects embedded in bulk materials. The CODATA UDS is freely available at www.codata.org/nanomaterials for use in designing database schemas, developing ontologies for nanomaterials applications, reporting experimental and computational results in the literature, and depositing results into nanomaterials repositories. This presentation will briefly review the history of the CODATA UDS as well as describe the UDS itself.

11:25am-11:55am CINF 49: Scientific units in the electronic age
Stuart Chalk, schalk@unf.edu

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States
Scientists have standardized on the SI unit system since the late 1700s. While much work has been done over the years to refine and redefine the system, little has formally been done to standardize the representation of the SI units in electronic systems.

This paper will present a summary of current efforts toward electronic representation of scientific units, an analysis of needs for current computer/network systems, and an outline of future work.
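
As one illustration of what an electronic representation of units might look like, the sketch below stores a unit as a tuple of integer exponents over the seven SI base units, so that derived units follow from arithmetic on the exponents. This is only an assumed toy data structure, not the representation proposed in the talk; the class name `Unit`, the helper `base`, and the text rendering are hypothetical.

```python
from dataclasses import dataclass

# The seven SI base units, in a fixed order.
BASE_UNITS = ("m", "kg", "s", "A", "K", "mol", "cd")


@dataclass(frozen=True)
class Unit:
    # One integer exponent per entry in BASE_UNITS (hypothetical layout).
    exponents: tuple

    def __mul__(self, other):
        return Unit(tuple(a + b for a, b in zip(self.exponents, other.exponents)))

    def __truediv__(self, other):
        return Unit(tuple(a - b for a, b in zip(self.exponents, other.exponents)))

    def __pow__(self, n):
        return Unit(tuple(a * n for a in self.exponents))

    def symbol(self):
        # Render as plain text, e.g. "m^2 kg s^-2"; "1" if dimensionless.
        parts = []
        for name, exp in zip(BASE_UNITS, self.exponents):
            if exp == 1:
                parts.append(name)
            elif exp != 0:
                parts.append(f"{name}^{exp}")
        return " ".join(parts) or "1"


def base(name):
    # Build the Unit for a single base unit such as "m" or "kg".
    return Unit(tuple(1 if n == name else 0 for n in BASE_UNITS))


metre, kilogram, second = base("m"), base("kg"), base("s")
newton = kilogram * metre / second ** 2
joule = newton * metre

if __name__ == "__main__":
    print(joule.symbol())
```

Because two units compare equal exactly when their exponent tuples match, a check such as `joule == kilogram * metre ** 2 / second ** 2` is machine-verifiable, which is the kind of property a plain text label cannot offer.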

CINF: Informatics & Quantum Mechanics: Combining Big Data & DFT in Pharma & Materials 8:40am - 12:00pm
Monday, March 14
Room 25A - San Diego Convention Center
Art Cho, Organizing
Art Cho, Presiding
8:40am-8:45am Introductory Remarks
8:45am-9:15am CINF 58: Screening of materials for energy applications based on transport properties: Methods and data automation tools

Boris Kozinsky, bkoz37@gmail.com

Bosch Research, Waban, Massachusetts, United States
Design of new functional materials relying on transport phenomena is complicated by the highly nonlinear sensitivity of conductivity to structural and composition changes. This makes brute-force computational screening impossible and requires the development of descriptors and efficient approximations to narrow down the space of possibilities. I will briefly present our recent efforts on developing practical methods and data-driven approaches for the discovery and design of battery and thermoelectric materials. In each case there is a need to automate computational tools, organize and analyze the data, and preserve a full record of data flow for reproducibility, while allowing for data sharing. The resulting workflows and data formats are heterogeneous, and an automation platform is needed that is flexible enough to cover the common requirements and to leave the APIs open for implementation of specific scientific plug-ins by the users. The necessary features include tight coupling of data capture with automation, connecting computational engines in a high-level working environment, recording complete provenance information, and organizing data in an efficiently query-able form. Finally, data science tools are also needed for analyzing transport data and for extracting and validating trends to be used in iterative screening. I will highlight our current efforts to implement an open-source platform aimed at satisfying these requirements.

9:15am-9:45am CINF 59: High-throughput chemical simulations and virtual screening for materials discovery
Mathew Halls, mhalls@mhalls.com, David Giesen, Thomas Hughes, Shaun Kwak, Thomas Mustard, Jacob Gavartin, Alexander Goldberg, Yixiang Cao

Schrodinger Inc., San Diego, California, United States
Virtual screening is an approach first developed and applied in the pharmaceutical industry to identify leads in the drug discovery process. The process involves the automated computational analysis and subsequent filtering of chemical structure libraries based on predicted properties to identify promising systems for further investigation. Virtual screening for materials solutions is a promising new development. Advances in the efficiency of simulation codes and the significantly improved performance of commodity computing resources have dramatically reduced the time required for analysis, pushing the applicability from small molecules to extensive surface models and bulk systems. Moreover, electronic structure and molecular dynamics packages are extremely robust for routine analysis and property prediction, usually requiring no user intervention once the chemical models and parameters have been decided. This makes automated property prediction possible for candidate systems with varying structure and composition. The structure library can then be sorted and ranked to identify lead systems and estimate critical structure-property limits across a target chemical design space. An alternative approach to exhaustive screening involves the automated evolution of a set of input structures toward target property characteristics using a simulation-informed genetic algorithm. An evolutionary approach dramatically reduces the number of simulations needed to identify chemical systems having the desired property profile, and samples chemical space not covered by deterministic library generation. In this presentation, examples of the use of high-throughput chemical simulation for materials discovery are presented.
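
The evolutionary alternative to exhaustive screening can be sketched as a small genetic algorithm. The example below is an illustration under stated assumptions, not the presenters' software: each candidate is a tuple of integer "substituent codes", and the hypothetical `toy_fitness` function stands in for a simulated property, rewarding candidates whose codes sum to a target value.

```python
import random


def toy_fitness(genome, target=7):
    # Assumption: stands in for a simulated property; higher is better,
    # with 0 meaning the target property value is matched exactly.
    return -abs(sum(genome) - target)


def evolve(fitness, genome_length, alphabet, pop_size=20,
           generations=30, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [tuple(rng.choice(alphabet) for _ in range(genome_length))
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]             # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_length)  # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(genome_length):
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(alphabet)
            children.append(tuple(child))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    print(evolve(toy_fitness, genome_length=6, alphabet=(0, 1, 2, 3)))
```

Because the best candidate is always carried into the next generation, the recorded best fitness never decreases, and only `pop_size` evaluations per generation are needed rather than one per library member.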

9:45am-10:15am CINF 60: Machine learning and high-throughput quantum chemistry methods for the discovery of organic materials
Alan Aspuru-Guzik, aspuru@chemistry.harvard.edu

Harvard University, Cambridge, Massachusetts, United States
In this talk, I will give an overview of my group's efforts towards the discovery of organic materials. I will focus on the methods that we employ, such as neural fingerprints, deep neural networks, Gaussian processes and even simple linear regressions, to correlate theory and experiment. I will describe ways of accelerating the exploration of chemical space and strategies for close collaboration with experimental partners. I will briefly review the functionality of our software stack. Applications include organic light-emitting diodes, organic flow battery materials and organic photovoltaics. I will talk about at least one of these, again focusing on the computational methodologies and lessons learned.

10:15am-10:30am Intermission
10:30am-11:00am CINF 61: Using drug discovery methods to accelerate the search for better battery materials
Joshua Schrier, jschrier@haverford.edu

Chemistry, Haverford College, Haverford, Pennsylvania, United States
Redox flow batteries (RFB) using water-soluble organic redox couples are a new strategy for low-cost, eco-friendly, and durable stationary electrical energy storage. To be useful, these molecules must have extreme (either high or low) oxidation/reduction potentials and high aqueous solubility. Like many molecular design problems, the search space of possible functional derivatives is too large to be completely explored with ab initio calculation, but cheminformatics methods can make the search tractable.

In this talk, I'll describe our exploration of 10^5 possible thiophenoquinone derivatives. By using existing cheminformatics tools from the drug-discovery community to eliminate insufficiently soluble compounds from our pipeline, we reduced the space to a more tractable 10^3 compounds. Using exhaustive B3LYP/6-311+G(d,p) thermochemical calculations with the SMD solvation model (which reproduce experimental reduction potentials to within ±0.04), we computed redox voltages and free energies of solvation for all of these compounds, resulting in 51 new candidates with the high solubility and wide voltage range needed for high performance aqueous RFB applications.

This ab initio data-set provided us with the opportunity to develop and test cheminformatics and lead-screening strategies for finding high-performing battery materials. A group-additivity model predicts the redox potential to within ±0.09 V, and can be trained with as few as 200 examples. Surprisingly, the 'quantum-free' group-additivity model was more accurate than models that used the ab initio LUMO energy or information from semiempirical Hückel calculations as descriptors. Using these models to perform simulated screening experiments, we found that 'active' (high voltage or low voltage species) candidates could be identified with an enrichment factor of 2-6, depending on the model and framework type. Having validated this drug-discovery inspired approach with an exhaustive dataset (where we know the 'right' answer), we are now applying this to the much larger space of phenazine derivatives for aqueous redox batteries, and will describe preliminary results on that effort.
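
The enrichment factor mentioned above can be made concrete with a short sketch. It uses the definition common in virtual screening (the hit rate among the top-ranked fraction divided by the hit rate in the whole set); the abstract does not state which definition was used, so this is an assumption, and the scores and labels in the usage lines are synthetic.

```python
def enrichment_factor(scores, is_active, top_fraction):
    # scores: model predictions, higher means "more likely active".
    # is_active: 1/0 labels taken from the exhaustive reference data.
    n = len(scores)
    total_actives = sum(is_active)
    if n == 0 or total_actives == 0:
        raise ValueError("need at least one compound and one active")
    n_top = max(1, int(round(n * top_fraction)))
    # Rank compound indices by predicted score, best first.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(is_active[i] for i in ranked[:n_top])
    # Hit rate in the selection relative to the overall hit rate.
    return (hits / n_top) / (total_actives / n)


if __name__ == "__main__":
    # Synthetic example: 10 compounds, 2 actives, both ranked on top.
    scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.4, 0.5, 0.6, 0.7, 0.05]
    labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    print(enrichment_factor(scores, labels, 0.2))
```

An enrichment factor of 1 corresponds to random selection, so values of 2-6 mean that the model-selected subset contains two to six times as many actives as a random subset of the same size.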

11:00am-11:30am CINF 62: Combining density functional theory with cheminformatics for development of a new-paradigm ligand screening method in computational drug discovery
Art Cho1,2, artcho@korea.ac.kr

1 Korea University, Seoul, Korea (the Republic of); 2 Quantum Bio Solutions, Seoul, Korea (the Republic of)
Density functional theory (DFT) has been successfully applied to many fields for its efficiency and versatility. Materials science and quantum chemistry are examples of fields in which DFT methods are inseparable from current research. On the other hand, it has been rare to utilize DFT for biological problems until recently. Computational drug discovery, in which protein docking is the central methodology, is one such field. This is due to two main reasons: 1. biomolecules are large compared to other chemical molecules, and therefore it would be prohibitively time-consuming to run DFT calculations on the whole systems, and 2. most interactions within biomolecular systems are non-quantum in nature. However, it turns out that there are cases in computational drug discovery for which a molecular mechanical level of description is not enough. For this reason, we have developed a series of methods incorporating DFT calculations for computational drug discovery, which proved to be effective for a number of different classes of problems. In order to take advantage of the power of DFT in virtual ligand screening, however, something must be done on the other side of the process because of the time-consuming nature of quantum-level calculations. For this, we have envisioned a paradigm-shifting ligand screening method, in which the ligand libraries to be screened contain far fewer compounds than current drug-like libraries, yet effectively span a much larger compound space.
In this talk, I will briefly summarize our previous efforts in development of protein docking methods using DFT calculations and then present a new cheminformatic method that will enable use of them in industrial pharmaceutical environment.

11:30am-12:00pm CINF 63: Discovery through deterministic optimization: Navigating chemical space for effective material design

Jennifer Elward, jen.elward@gmail.com, Christopher Rinderspacher

Army Research Laboratory, Aberdeen Proving Ground, Maryland, United States
Computational molecular design and optimization increasingly play a critical role in the design of novel materials. Part of the appeal of optimization lies in the ability to traverse and explore the largely untapped potential of chemical space in a manner that will most benefit the materials discovery process. In the present work, we have developed a deterministic, constrained optimization method that combines the satisfaction of multiple constraints with efficient navigation of a large optimization space. Density functional theory provides the computational backbone of the optimization and was chosen for its tradeoff between speed and accuracy and its wide application base. We compared a variety of breadth-first search and gradient-analogous local search algorithms for the optimization process. Each of the algorithms has been benchmarked with respect to efficiency and performance. One of the key benefits of utilizing deterministic techniques in this work is the ability to retain chemical and structural information at each step of the optimization procedure. This feature has allowed for detailed visual analysis of the algorithmic path from input structure to final candidate material and has been beneficial in both the structure search and further algorithm development. In addition, large data libraries can be created and examined to produce qualitative structure-property relationships based on the optimization constraints. At present, this method has been successfully applied to a number of materials science systems of interest, including high-hyperpolarizability materials for optics applications and energetic materials. In each case, only a small fraction of chemical space (< 1%) had to be explored to find high-performing candidates that satisfy the optimization constraints.

CINF: Chemical Information for Small Businesses & Startups 1:00pm - 4:55pm
Monday, March 14
Room 24C - San Diego Convention Center
Edlyn Simmons, Organizing
Edlyn Simmons, Presiding
Cosponsored by: CPRM and SCHB
1:00pm-1:15pm Introductory Remarks
1:15pm-1:40pm CINF 72: Building a business with and without scientific computing: The five W's and one H
Steven Muskal, smuskal@eidogen-sertanty.com

Suite 103-475, Eidogen, Oceanside, California, United States
Startups and small businesses today have a clear and distinct advantage over their predecessors. Coupling very experienced resource pools of contracted and/or outsourced staff with utility (i.e., cloud-based) and mobile computing can equip such businesses with an unprecedented level of capability. Open source and multiple options for lower-cost technologies and content also represent exciting new opportunities, or a quagmire, depending on your background and experience. Having lived through both the sell- and buy-sides of scientific research computing over the last 25 years, we will discuss the whys, whens, whos, whats, wheres and hows of building a scientific research computing effort.

1:40pm-2:05pm CINF 73: Interactive cheminformatics for occasional use in SMEs
Therese Inhester1, inhester@zbh.uni-hamburg.de, Matthias Hilbig3, Matthias Rarey2

1 Center for Bioinformatics, University of Hamburg, Hamburg, Germany; 2 University of Hamburg, Hamburg, Germany

In the past decade, more and more data sets of chemical compounds have become freely available, thanks to large chemical databases such as ChEMBL and PubChem as well as vendors that provide their catalogs electronically. This large number of freely available small-molecule data sets opens the way for a large community of life scientists, including small businesses and start-ups, to profit from this wealth of data. For this purpose, easy-to-use but precise cheminformatics software tools are required to perform elementary tasks. These tasks range from chemical library browsing to removal of duplicates and filtering by physicochemical properties, which is often required prior to virtual screening. Moreover, compound collections need to be compared and merged according to different annotations, e.g. to find all compounds that are in a vendor catalog and also listed in ChEMBL.
We propose a new intuitive approach to interactively manipulate compound collections. Our software MONA [1] is able to rapidly perform different set operations on molecule sets, as well as filtering and visualizing large compound collections. Using the recent second release of MONA [2], arbitrary molecule properties can be added to the molecules and used for filtering, too. Furthermore, similarity clustering and structure depiction alignments strongly improve the visual comparison of molecules. With the help of MONA, standard processes on annotated molecule collections can easily be performed on a regular personal computer by scientists with occasional cheminformatics needs, such as those that arise in biotechnology SMEs. We are going to present different application scenarios in order to demonstrate the utility of our approach.
[1] Hilbig M. et al., MONA – Interactive manipulation of molecule collections. Journal of Cheminformatics 2013, 5:38.
[2] Hilbig M. and Rarey M., MONA 2: A light Cheminformatics Platform for Interactive Compound Library Processing. J. Chem. Inf. Model. 2015.
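The kind of set operations described in this abstract can be sketched in a few lines. This is not MONA's interface or implementation; it is a minimal illustration that assumes each molecule has already been reduced to a canonical structure key, and every key, property name and value below is a hypothetical placeholder.

```python
def deduplicate(records):
    """Keep the first record seen for each canonical structure key."""
    unique = {}
    for key, props in records:
        unique.setdefault(key, props)
    return unique


# Hypothetical vendor catalog: (structure key, properties) pairs with one duplicate.
vendor_records = [
    ("KEY-A", {"mol_weight": 180.2}),
    ("KEY-B", {"mol_weight": 512.6}),
    ("KEY-A", {"mol_weight": 180.2}),
    ("KEY-C", {"mol_weight": 301.4}),
]
# Hypothetical keys of compounds listed in a reference database.
reference_keys = {"KEY-A", "KEY-C", "KEY-D"}

vendor = deduplicate(vendor_records)

# Intersection: compounds in the vendor catalog that are also in the reference set.
in_both = sorted(vendor.keys() & reference_keys)
# Difference: compounds only the vendor offers.
vendor_only = sorted(vendor.keys() - reference_keys)
# Property filter: keep compounds at or below an assumed weight cutoff of 500.
light = sorted(k for k, p in vendor.items() if p["mol_weight"] <= 500)
```

Keying on a canonical identifier is what makes duplicate removal and cross-collection comparison reduce to ordinary set algebra; producing that identifier reliably is the hard part and is left out here.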

2:05pm-2:30pm CINF 74: Playing by the rules: Knowing what applies and what information you have to maintain regarding your chemical inventory
Frankie Wood-Black, fwblack@cableone.net

Ag., Science and Engineering, Northern Oklahoma College, Ponca City, Oklahoma, United States
Think about your business. Do you know exactly what chemicals you have on hand? Do you know if you are subject to various regulations such as SARA, TSCA, or DHS? How can you tell if they apply to you? When working in a small chemical business or starting up a business, you have a number of things that you have to manage. Knowing and understanding what you have, what rules apply and what documentation you need to maintain is just one more thing that you need to consider in how you manage your business. This paper focuses on some of the key environmental regulations that you may or may not be aware of when dealing with a business that uses and maintains a chemical inventory.

2:30pm-2:55pm CINF 75: ChemSpider: Search and share chemistry… for free
Serin Dabb, dabbs@rsc.org

The Royal Society of Chemistry, Cambridge, United Kingdom
ChemSpider is a free chemical structure database providing fast text and structure search access to over 35 million structures from hundreds of data sources. This presentation will demonstrate the content of ChemSpider, where we get our data from and how we aggregate it into one interface. ChemSpider is provided for the chemistry community, and we encourage researchers to curate and add more content. It provides chemical structural information and physical and chemical properties, in addition to literature references including patents. The Royal Society of Chemistry is committed to working with the community to improve access to free tools and services around data; other examples of these will be discussed.

2:55pm-3:10pm Intermission
3:10pm-3:35pm CINF 76: What chemists and other scientists need to know about their duty of disclosure under the new law governing the patenting process in the US
Xavier Pillai, xpillai@leydig.com

Leydig Voit Mayer Ltd, Chicago, Illinois, United States
Scientists and inventors are aware that most patent applications filed after March 16, 2013 will be examined under the Leahy-Smith America Invents Act (AIA), which was enacted into law on September 16, 2011. That is, patent applications will be examined on the basis of the first inventor to file, as opposed to the first to invent. The law brought about many changes to patent practice. In particular, the law expanded the scope of prior patents and publications (prior art) that can be applied by an examiner against a patent application, which in turn expanded the scope of the duty of disclosure by patent applicants. This talk will address the expanded scope of the applicable prior art and the attendant duty of disclosure.

3:35pm-4:00pm CINF 77: Monitoring the minnows: Using IP information to understand what small businesses are doing
Stephen Adams, stephen.adams@magister.co.uk

Magister Ltd, Roche, Cornwall, United Kingdom
The intellectual property system can sometimes be regarded as a tool which is optimised for the multi-national corporation, exclusively for protecting blockbuster inventions, and not very relevant to small businesses. However, many developed economies, and almost all developing countries, rely upon the small to medium-sized enterprise (SME) sector for a substantial proportion of their innovative capacity. Paradoxically, it can be more difficult to identify research originating from these small companies than from larger ones. This in turn can hinder other negotiations such as establishing joint ventures, valuing intangible assets before a formal takeover bid, or head-hunting key personnel. This session will consider some of the challenges inherent in locating and using records of the IP generated and owned by small businesses.

4:00pm-4:25pm CINF 78: Patent information in PubChem for small businesses and startups
Sunghwan Kim, kimsungh@ncbi.nlm.nih.gov, Paul Thiessen, Evan Bolton, Steve Bryant

National Library of Medicine, National Institutes of Health, Rockville, Maryland, United States
PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public chemical information resource, developed and maintained by the U.S. National Institutes of Health (NIH). It contains more than 157 million chemical substance descriptions, 60 million unique compounds, and 229 million bioactivities determined from one million assay experiments. Importantly, data contribution from a growing number of organizations, including IBM and SureChEMBL (formerly known as SureChem), allows PubChem to provide links to patent information for chemicals. Currently, PubChem offers links between about 6 million patent documents and more than 16 million unique chemical structures, with over 336 million chemical substance-patent links covering U.S., European, and World Intellectual Property Organization patent documents published since 1800. This presentation will provide an overview of the patent information in PubChem as well as the best practice for using it.

4:25pm-4:50pm CINF 79: Open patent chemistry “big bang” presents large opportunities for small enterprises
Christopher Southan, cdsouthan@gmail.com

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden
In 2012, after the first IBM open deposition of 2.5 million structures, few would have predicted that PubChem compounds that include patent-extracted submissions would approach 20 million by 2015 (PMID 26194581). The current major open patent chemistry feeds (in size order) are NextMove, SCRIPDB, Thomson Pharma, IBM and SureChEMBL. The comparative statistics of sources, and the arguments that the coverage probability of lead compound prior-art structures is now very high, will be presented. The consequences are that the academic community and small companies can now patent-mine extensively in PubChem and SureChEMBL, possibly even without needing commercial sources to support their own filings. Other recent major enabling aspects for small institutions include a) the open availability of patent full-text for querying, b) a range of free tools for DIY chemistry extraction (PMID 23618056) and c) automatic bioentity mark-up in patent text (e.g. protein names) from the SureChEMBL/SciBite collaboration. Examples of DIY analysis of newly published patents will be shown. Even for small enterprises not filing directly, open patent chemistry presents a big expansion in accessible SAR space, and aspects of mining this will be exemplified. However, open chemistry extraction does bring in a variety of artefacts that add confounding structural “noise”. These include a) permutations of mixtures and chiral exemplifications, b) virtual structures, c) the inability of extractions from documents to directly indicate IP status and d) “common chemistry” swamping. These problems and some partial solutions using PubChem filters will be discussed.

4:50pm-4:55pm Concluding Remarks
CINF: Global Initiatives in Research Data Management & Discovery 1:00pm - 5:00pm
Monday, March 14
Room 25B - San Diego Convention Center
Ian Bruno, Leah McEwen, Organizing
Ian Bruno, Leah McEwen, Presiding
Cosponsored by: ANYL, COMP, MEDI and PHYS
1:00pm-1:05pm Introductory Remarks
1:05pm-1:35pm CINF 64: Authoring tools to automate data sharing in scientific publishing
John Kitchin, jkitchin@andrew.cmu.edu

Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States
Data sharing and reproducibility of research are increasingly important issues. Funding agencies are mandating data sharing in calls for proposals, and journals are increasingly requiring data sharing as a condition of publication. Scientists are increasingly interested in open access to data. A requirement, or even desire to share is not sufficient, however, if sharing is difficult or tedious. We believe that new authoring tools are needed that will integrate data and analysis into the research and publishing processes. These tools will reduce the difficulty of sharing and reusing data.

We have developed a new approach to writing scientific documents that enables the direct inclusion of human-readable, and machine-addressable data and code. In this talk we will illustrate the approach by example from papers we have recently published using the approach. We show that the combination of an extensible editor (Emacs) with a lightweight markup language (org-mode) provides a remarkable solution to data sharing and research reproducibility issues. This combination enables the documentation of experimental setup, data generation, and analysis in a single document, and subsequent export of a scientific manuscript that is suitable for submission to most journals. When coupled with external data repositories, the approach enables sharing of large or complex data sets that cannot easily be captured in a manuscript. We will conclude with an outlook on the approach and where we see it going.

1:35pm-2:00pm CINF 65: Facilitating the inclusion of analytical raw data in the submission and review process
Santiago Dominguez Vivero1, sdominguez@mestrelab.com, Juan Cobas Gomez1, Felipe Seoane1, Jose Garcia Pulido1, Agustin Barba1, Jesus Varela Carrete2

1 Mestrelab Research SL, Hereford, Herefordshire, United Kingdom; 2 Chemistry, University of Santiago de Compostela, Santiago de Compostela, A Coruña, Spain
Scientific articles have traditionally been supported by the inclusion of supplementary data, also referred to as supporting information, designed to provide additional information necessary for understanding the principal points of the publication and support the conclusions presented in the article but which cannot be included in the paper itself due to space reasons or technical limitations.
The inclusion and review of supplementary data is not without challenges, particularly when papers are supported by a large volume of analytical information. Preparing and formatting the data for upload is a time-consuming, error-prone process. Also, the data uploaded has traditionally been ‘dead’ data (PDF, images or Word), which allows visualization but no interaction, interrogation or re-processing. Validation of the results would therefore require repetition of the processes described in the paper and reacquisition of the analytical data. Whilst this would be highly desirable, it is not often feasible.
The last few years have seen a significant increase in calls for the inclusion, by the authors, of raw analytical data which support the paper’s conclusion, to allow reviewers and readers to verify the work of the authors, prevent fraud and identify errors.
This talk aims to present an integrated system developed by the authors to facilitate the submission and review of supplementary information. The system includes:
- Software automations which prepare and format supplementary information commonly provided.
- Software automations which prepare the raw data for one-click submission.
- A software tool, freely accessible to all reviewers and readers, which allows the reprocessing and reanalysis of raw data and includes the visualization, reprocessing and analysis capabilities implemented in the Mnova software.
- Searching capabilities to leverage, in future research, the body of knowledge built by the submission of publications and supporting information.

2:00pm-2:30pm CINF 66: Crystallography: A domain exemplar for chemistry data management
Simon Coles, s.j.coles@soton.ac.uk

University of Southampton, Hampshire, United Kingdom
Crystallographers have been generating data for over a century. It wasn’t until about halfway along this timeline that the case was made for a coherent community approach to managing the data outputs. This resulted in the Cambridge Structural Database (http://www.ccdc.cam.ac.uk/Solutions/CSDSystem/Pages/CSD.aspx), with the main driver for collecting data being that it could be reused and that new science could be driven by what we can learn from having a collection. Data management was an added bonus, but the community quickly seized on this opportunity and aligned the database with publishing procedures.

The great leap forward, however, was the introduction of the Crystallographic Information Framework (CIF) in the early 1990s (http://www.iucr.org/resources/cif). The power behind CIF is that it is driven by a structured dictionary of terms managed by a committee convened by a learned society (IUCr). With such a comprehensive dictionary many things are possible: it can be expressed as a common file format, and from this numerous applications can be driven. Not only does it provide a curation format that stands the test of time, but it can be rendered in many ways, it drives all data exchange, it can be automatically validated and it even underpins several forms of publication.

Having built such a culture and toolset, the arrival of the internet and the ability to relatively easily build software systems provides a rich environment for development. This is timely because these technological innovations have also resulted in an exponential increase in the amount of data generated. This talk will outline not only what we can achieve as a global body, but also how individuals or organisations can now act globally. Originally CIF only covered results data, but this was soon extended to raw data and now the community, through the Diffraction Data Deposition Working Group (http://www.iucr.org/resources/data/dddwg), is moving forward with archival and publication of ALL of its data. Furthermore, crystallographers are now integrating with the wider chemistry community through the development of common standards. On a more local level much is also possible – the UK National Crystallography Service operates an archival policy that must be in line with numerous funder mandates but at the same time it has been possible to set up an open data repository that integrates with community processes (http://ecrystals.chem.soton.ac.uk/).

2:30pm-2:55pm CINF 67: Are data management solutions developed for commercial organizations suitable for academic research?
Mariana Vaschetto, mariana.vaschetto@dotmatics.com, Tom Oldfield, Michael Hartshorn

Dotmatics, Bishops Stortford, United Kingdom
Research data management solutions such as Electronic Laboratory Notebooks (ELNs) have been standard in Pharma and Biotech for quite some time. The widespread acceptance of web-based technologies has also facilitated the adoption of hosted solutions in cloud environments by commercial organizations. In addition, commercial organizations are shifting their focus towards collaborative research with academic groups and not-for-profit institutes. This leads to the questions: Are cloud data management solutions developed with commercial R&D groups in mind transferable to Academia? And can these solutions facilitate collaborative research across both worlds?
In this talk we will discuss how Dotmatics has worked with its customers to develop a cloud-based configurable infrastructure that is equally useful in commercial or academic environments. This web-based solution enables scientists to customize read and write access to their data, making the real-time transfer and sharing of information across different teams seamless. The tools also provide secure access for all partners, with restricted views that allow sharing of knowledge while protecting IP for the commercial sector. The ideas presented include not only spreadsheets built on database technology but also interactive dashboards to share documents and applications and facilitate the collaboration of ideas globally.

2:55pm-3:10pm Intermission
3:10pm-3:30pm CINF 68: Data sharing in life sciences R&D: Pre-competitive collaboration through the Pistoia Alliance
Carmen Nitsche, cnitsche@swbell.net

Pistoia Alliance, San Antonio, Texas, United States
The Pistoia Alliance (http://www.pistoiaalliance.org) is a group of life sciences industry experts. We use pre-competitive collaboration to address issues around aggregating, accessing, and sharing data that are essential to innovation, but provide little competitive advantage. We have a strong track record in delivering value from our projects, providing our membership with perspective on current problems, and being a source of impartial opinion. We were established in 2009 by representatives of AstraZeneca, GSK, Novartis and Pfizer who met at a conference in Pistoia, Italy.

3:30pm-3:50pm CINF 69: The Royal Society of Chemistry and the data publication landscape
Serin Dabb, dabbs@rsc.org

The Royal Society of Chemistry, Cambridge, United Kingdom
Like many global funding agencies, the UK’s Engineering and Physical Sciences Research Council (EPSRC) has mandated research data preservation and put the responsibility on institutions to comply. A key component of this preservation is ensuring accessibility and discoverability. The Royal Society of Chemistry is building a research data repository to hold different types of derived chemical information, as part of our EPSRC-funded National Chemical Database Service, to address some of these needs.
We believe a number of steps need to take place before the chemistry community incorporates data management into their routine workflow. One is encouraging the development of community data standards. Another is showing the utility of wider information sharing, and our Compound Collection pilot for extracting compounds from theses has had great buy-in from academia, libraries and pharma companies. As a publisher we are investigating the overlap between research data availability and our journal publication processes. As a learned society we feel our role is to aid our community by encouraging skills development and a wider understanding of the issues involved, and by appreciating the potential opportunities.

3:50pm-4:10pm CINF 70: Digital IUPAC: The need for global representation of chemistry and chemical information in the digital age
Jeremy Frey, j.g.frey@soton.ac.uk

University of Southampton, Southampton, United Kingdom
The growing importance of chemical information challenges all international bodies, and especially IUPAC, to fulfill their mission to address global issues involving the chemical sciences, given that in the modern digital world all manufacture, research, teaching and learning is now assisted by computer systems. I wish to argue the case for “Digital IUPAC”, as in this increasingly digital age IUPAC will and must take a lead in providing machine-readable (i.e. computable and understandable) representations of chemical information as well as structure, using standards that IUPAC defines and standards that other international authorities agree to use.
Looking at the wider data and information agenda, the Royal Society (of London) report “Science as an open enterprise” argues the absolute necessity for intelligent access to the data on which scientific conclusions are based. Intelligent openness is fundamental to the whole progress of science. In the modern digital world intelligent access really requires that this access can be mediated by computers. The comprehensive conversion of IUPAC’s knowledge base of standards and definitions from human-readable to computer-readable form is essential. It is vital that this conversion be done now as a matter of extreme urgency, if IUPAC is to maintain its role as the international authority for the chemical sciences. If computers cannot find and use the information provided by IUPAC, that information will effectively cease to exist for the “Wikipedia generation”.

4:10pm-4:30pm CINF 71: DIG chemistry: Establishing a research data interest group to address the many faces of chemical data management
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States
Chemistry is a central science with a long history of rich information and data resources traditionally compiled from articles. Is this really the best way of managing research data to ensure reproducibility, reuse, efficiency, and application? What are the compelling use cases for chemical data and what practices are missing to support these? The international chemical information community is examining current practices and connecting with other disciplines and data initiatives to explore how we can collectively and effectively fill the gaps. Experimental and theoretical researchers, educators, data and information scientists, librarians, publishers, database providers, and colleagues across the academic, industrial, private and public sectors are forming a Chemistry Interest Group within the Research Data Alliance (RDA). This presentation will highlight specific challenges we could address now with a view to discussing how these can best be tackled.

4:30pm-5:00pm Panel Discussion
CINF: Informatics & Quantum Mechanics: Combining Big Data & DFT in Pharma & Materials 1:30pm - 4:45pm
Monday, March 14
Room 25A - San Diego Convention Center
Art Cho, Organizing
Art Cho, Presiding
1:30pm-2:00pm CINF 80: In silico, high-throughput screening of non-fullerene acceptor materials for applications of organic photovoltaic devices: A Harvard clean energy project study
Steven Lopez, stevenlopez0209@gmail.com, Edward Pyzer-Knapp, Alan Aspuru-Guzik

Harvard University, Cambridge, Massachusetts, United States
Organic photovoltaics (OPVs) have shown steady growth in efficiency since the 1980s, and power conversion efficiencies (PCEs) of up to 12% have been reported in multi-junction cells. OPVs are lightweight, easy to produce, and feature chemically diverse components. While PCBM is the standard fullerene n-type (acceptor) material, it is not without limitations, which include limited spectral breadth, a small range of LUMO energies, and relatively high costs of industrial production. We have undertaken an in silico high-throughput screening utilizing the Harvard Clean Energy Project to explore the chemical space associated with non-fullerene acceptor materials. Our library contains 100,000 n-type materials, including perylene diimides, tetraazabenzodifluoranthenes, diketopyrrolopyrroles, and fluoranthene-fused imides. This work is carried out through a tight feedback loop with experimental colleagues who synthesize target materials and create OPV devices.

2:00pm-2:30pm CINF 81: Regioselectivity prediction of metabolic reactions based on ab initio derived descriptors

Arndt Finkelmann2, arndt.finkelmann@pharma.ethz.ch, Andreas Göller1, Gisbert Schneider2

1 Global Drug Discovery, Bayer Pharma AG, Wuppertal, Germany; 2 Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
The complexity and diversity of chemical transformations involved in drug metabolism impose a hurdle on computational models for Site of Metabolism (SoM) prediction [1]. Ligand-based machine learning models are general and potentially useful tools for SoM prediction. However, they require suitable atom descriptors that ideally capture the reactivity-determining features. We approached the SoM prediction problem by developing descriptors that characterize an atom's steric and electronic environment. The electronic environment is approximated by the partial charge distribution in the atom's proximity. The partial charges are obtained from quantum chemistry calculations and represent the electron distribution in a molecule. To identify the partial charge scheme that is best suited for descriptor construction, we investigated the dependence of different partial charges on molecular conformation and calculation method. NPA and CM5 charges turned out to have a low dependence on conformation and allow for a one-conformer approach. We demonstrate that our descriptors enable the construction of accurate and robust cytochrome SoM prediction models.

[1] Kirchmair J.; Goeller A.H.; Lang D.; Kunze J.; Testa B.; Wilson I.D.; Glen R.C.; Schneider G. Predicting drug metabolism: experiment and/or computation? Nat. Rev. Drug Discov. 2015, 14, 387-404.

2:30pm-3:00pm CINF 82: COSMO-based approach for the design of solvents to optimize reaction rates
Nicholas Austin1, nick.austin111@gmail.com, Nikolaos Sahinidis2, Daniel Trahan3

1 Chemical Engineering, Carnegie Mellon University, Bowling Green, Kentucky, United States; 2 Dept Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States; 3 The Dow Chemical Company, Freeport, Texas, United States
The reaction medium plays a critical role in determining the success of a particular reaction, the rate at which it proceeds, and whether any undesirable side-products are formed. The selection of a solvent is often empirical or based on somewhat rudimentary properties (H-bond donor/acceptor abilities, dielectric constant, solubility parameters, etc.). Furthermore, there is limited customization in solvent choice as many reactions are performed in one of a handful of common laboratory solvents or a simple blend of these solvents. For this reason, choosing an optimal (or simply improved) solvent for a particular reaction has significant application potential in liquid-phase chemistry.
From an optimization point of view, this problem is challenging for two main reasons: (1) there is a virtually infinite number of potential structures to choose from and (2) any blend of solvents requires the additional determination of mole fractions. In this work, we propose the use of the COSMO solvation model and its –RS and –SAC post-processing steps to calculate solvation free energies and thereby estimate reaction rates. The use of COSMO presents a distinct advantage over other approaches, as COSMO-RS and –SAC estimates for any composition require only a single calculation for each component of a mixture. Incorporating these methods into an efficient optimization framework necessitates the development of semi-empirical methods (specifically, group contribution methods). These provide estimates of sigma profiles, which are averaged representations of charge density versus surface area and are key in calculating mixture properties.
The design space of the optimization (molecular structures and mole fractions) is projected into a much lower-dimensional space, specifically that of the statistical moments of the sigma profiles of each component of the solvent mixture. This enables the use of efficient derivative-free optimization methods to quickly design solvents for specific reaction-rate applications. This approach will be discussed in detail and applied to several solvent design problems.
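The projection onto statistical moments can be illustrated with a short sketch. It assumes a sigma profile stored as a discrete histogram (surface area per charge-density bin) and uses plain raw moments; the abstract does not give the authors' exact moment definitions or normalization, and the function names and toy profile are hypothetical.

```python
def sigma_moment(sigmas, areas, order):
    """Raw moment of a discretized sigma profile: sum of area_i * sigma_i**order."""
    if len(sigmas) != len(areas):
        raise ValueError("sigmas and areas must have the same length")
    return sum(area * sigma ** order for sigma, area in zip(sigmas, areas))


def moment_descriptor(sigmas, areas, max_order=3):
    """Compress a whole profile into the moments of order 0 through max_order."""
    return [sigma_moment(sigmas, areas, k) for k in range(max_order + 1)]


# Hypothetical three-bin profile: charge-density bin centers and area per bin.
sigmas = [-0.01, 0.0, 0.01]
areas = [10.0, 30.0, 10.0]
descriptor = moment_descriptor(sigmas, areas, max_order=2)
```

The zeroth moment is the total surface area, and a symmetric profile such as this one has a first moment of zero. A handful of such numbers per component gives the optimizer a small continuous space to search instead of the full set of discrete structures.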

3:00pm-3:15pm Intermission
3:15pm-3:45pm CINF 83: Efficient, first-principles-based screening for high-charge carrier mobility in organic crystals
Christoph Schober, christoph.schober@ch.tum.de, Karsten Reuter, Harald Oberhofer

Chair of Theoretical Chemistry, Technical University Munich, Garching, Germany
In organic electronics, charge carrier mobility is a key performance parameter. Due to the complex manufacturing processes of, e.g., organic field effect transistors (OFETs), measured mobilities are often heavily affected by the device preparation. This masks the intrinsic materials properties and thereby hampers the decision whether further device optimization for a given organic molecule is worthwhile or not. We developed a fast and efficient protocol with a descriptor based on electronic coupling values to assess the expected performance of organic materials for application in organic electronic devices. Applying this protocol to experimental structures of organic crystals obtained from the Cambridge Structural Database (CSD), we screen about 40,000 structures employing only first-principles methods. Out of the 28,000 successfully calculated structures we select 2,000 candidates with above-average electronic couplings for additional calculations and in-depth analysis using statistical methods and automated classification based on chemical structure. This allows us not only to identify a number of specific crystals with exceptionally high electronic coupling values and therefore promising properties, but also possible lead structures which can be the basis for in-depth theoretical and experimental studies of new classes of materials for organic electronics.

3:45pm-4:15pm CINF 84: Data-driven chemistry: From small molecules to discovery of new functional materials
Olexandr Isayev2, olexandr@olexandrisayev.com, Alexander Tropsha1

1 Univ of North Carolina, Chapel Hill, North Carolina, United States; 2 UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States
The Materials Genome Initiative is transforming Materials Science into a data-rich discipline. These developments open exciting opportunities for knowledge discovery in materials databases using informatics approaches to inform the rational design of novel materials with the desired physical and chemical properties. Statistical and data mining approaches have been successfully employed in both chemistry and biology leading to the development of cheminformatics and bioinformatics, respectively. However, until recently their application in materials science has been limited due to the lack of a sufficient body of data.
In this work we showcase a pilot materials informatics application capable of (i) instantaneously querying and retrieving the necessary material information in the desired form, (ii) identifying, visualizing and studying important data patterns, and (iii) generating experimentally-testable hypotheses by building predictive Machine Learning (ML) models based on materials’ characteristics. Specifically, we posit that materials with similar structural, topological, and electronic characteristics are expected to have similar physical and chemical properties irrespective of their formal composition. To enable uniform comparison of materials by their intrinsic properties, we will represent all materials uniquely by multiple numerical descriptors, or fingerprints. This representation will enable the use of classical cheminformatics and ML approaches to mine, visualize, and model any set of materials as we demonstrated in our recent pioneering studies on Materials Cartography [1].
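The similarity principle stated above (similar fingerprints, similar properties) can be sketched with a generic similarity search over numerical descriptor vectors. The continuous Tanimoto coefficient, the function names, and the two toy fingerprints are assumptions chosen for illustration; they are not the descriptors or similarity measure used in the Materials Cartography work.

```python
from typing import Dict, List, Sequence, Tuple


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Continuous Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # The denominator is zero only when both vectors are all zeros;
    # two empty fingerprints are treated as identical.
    return dot / denom if denom else 1.0


def nearest_neighbors(query: Sequence[float],
                      library: Dict[str, Sequence[float]],
                      k: int = 1) -> List[Tuple[str, float]]:
    """Rank library entries by similarity to the query, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical three-component fingerprints for two made-up materials.
library = {
    "material_A": [1.0, 1.0, 0.0],
    "material_B": [0.0, 0.0, 1.0],
}
best_name, best_score = nearest_neighbors([1.0, 1.0, 0.1], library, k=1)[0]
```

Under the similarity hypothesis, a property of the query material could then be estimated from the known properties of its nearest neighbors.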

Reference:
[1] Isayev, O., Fourches, D., Muratov, E.N., Oses, C., Rasch, K.M., Tropsha, A., and Curtarolo, S. Chem. Mater. 2015, 27, 735–743. DOI: 10.1021/cm503507h

4:15pm-4:45pm CINF 85: Multi-agent approach for molecular modeling in chemical vapor deposition
Luke Achenie, achenie@vt.edu

Virginia Tech, Blacksburg, Virginia, United States
Zinc sulfide continues to generate interest because, compared to other semiconductors, it has a large direct band gap, making it desirable in optical applications. In these applications only high quality zinc sulfide films can be employed. The default approach for producing zinc sulfide films is chemical vapor deposition (CVD). However, there is a large potential for defects in the deposited film, primarily because the morphology of adducts in the gas phase differs from that in the solid phase (i.e. deposited film). The impact of adducts depends on the size distribution of clusters, which create large distorted grains on the substrate that do not normally have the same morphology as the deposited film. The main issue is how big these clusters grow; larger clusters make defects in the deposited film more likely.

In our previous research we employed a computational approach to predict the size distribution and morphology of the clusters. With this information we were able to explain the link between the cluster size and the morphological defects on the deposited film. In our approach we coupled macro-scale computational fluid dynamics with molecular-scale molecular dynamics and nano-scale ab-initio calculations in order to estimate the nucleation, growth, dynamics, and size distribution of the particles inside the CVD reactor. This presentation shows advances we have made using new modeling modalities; specifically, we will discuss a Multi-Agent Approach for coupling molecular modeling (specifically, molecular dynamics in which the force fields are periodically updated with DFT calculations) with macro-level (i.e. continuum-based) computational fluid dynamics and/or general particle dynamics. In summary, our multi-agent approach bridges the time and space scales of molecular-level calculations and continuum-scale modeling.

CINF: Sci-Mix 8:00pm - 10:00pm
Monday, March 14
Hall D/E - San Diego Convention Center
8:00pm-10:00pm CINF 105: Supporting openness and reproducibility in scientific research: The Center for Open Science

Sara Bowman, sed8n@virginia.edu

Center for Open Science, Charlottesville, Virginia, United States

8:00pm-10:00pm CINF 110: Building a better materials science database: Challenges and opportunities

Robin Padilla, robin.padilla@springer.com, Michael Klinge, michael.klinge@springer.com

Corporate Markets & Databases, Springer Nature, Heidelberg, Germany

8:00pm-10:00pm CINF 116: Competitive intelligence workbench: Getting access to information for decision making

Huijun Wang, huijun.wang@merck.com

Merck, Kenilworth, New Jersey, United States

8:00pm-10:00pm CINF 117: Using systems biology in computational drug design workflows

George Nicola, george.nicola@outlook.com, Bruce Kovacs

Afecta Pharmaceuticals, Irvine, California, United States

8:00pm-10:00pm CINF 131: Comparative toxicogenomics database: Advancing understanding of molecular connections among chemicals, genes, and diseases

Cynthia Grondin, cjgrondin@ncsu.edu, Allan Davis, Thomas Weigers, Carolyn Mattingly

Biology, North Carolina State University, Raleigh, North Carolina, United States

8:00pm-10:00pm CINF 139: Enhanced chemical understanding through 3D-printed models

Amy Sarjeant1, sarjeant@ccdc.cam.ac.uk, Peter Wood4, Ian Bruno1, Ye Li2, Vincent Scalfani3, Shawn O'Grady2

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 University of Michigan, Ann Arbor, Michigan, United States; 3 University Libraries, University of Alabama, Tuscaloosa, Alabama, United States; 4 CCDC, Cambridge, United Kingdom

8:00pm-10:00pm CINF 13: Open data is not enough: A look at the Research Data Alliance

Mark Parsons, parsom3@rpi.edu

Research Data Alliance, Boulder, Colorado, United States

8:00pm-10:00pm CINF 143: Chemical knowledge representation and access in Wolfram|Alpha and Mathematica

Eric Weisstein, eww@wolfram.com

Scientific Content, Wolfram|Alpha, Champaign, Illinois, United States

8:00pm-10:00pm CINF 147: Leveraging the VIVO research networking system to facilitate collaboration and data visualization

Michaeleen Trimarchi, Danielle Bodrero Hoggan, danielle@scripps.edu

Kresge Library, The Scripps Research Institute, La Jolla, California, United States

8:00pm-10:00pm CINF 165: Predicting drug-induced hepatic systems' toxicity by integrating transporter interaction profiles

Eleni Kotsampasakou, eleni.kotsampasakou@univie.ac.at, Gerhard Ecker

Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria

8:00pm-10:00pm CINF 21: Deep convolutional neural networks for autonomous discovery of molecular interactions

Abraham Heifets, Izhar Wallach, Michael Dzamba, misko@atomwise.com

Atomwise, Inc., San Francisco, California, United States

8:00pm-10:00pm CINF 29: On our way to the automated search for ligand-sensing cores

Tobias Brinkjost1,2, tobias.brinkjost@tu-dortmund.de, Christiane Ehrt2, Petra Mutzel1, Oliver Koch2

1 Faculty of computer science, TU Dortmund University, Dortmund, Germany; 2 Faculty of chemistry and chemical biology, TU Dortmund University, Dortmund, Germany

8:00pm-10:00pm CINF 2: Standard JSON molecule, a solution to a cross-vendor molecule file format?

Brian Cole, coleb@eyesopen.com

OpenEye Scientific Software, Santa Fe, New Mexico, United States

8:00pm-10:00pm CINF 32: Advances in data provisioning

Marian Brodney1, marian.d.brodney@pfizer.com, Jacquelyn Klug-McLeod2, Gregory Bakken2, Robert Stanton1

1 Computational Sciences Center of Excellence, Pfizer, Cambridge, Massachusetts, United States; 2 Computational Sciences Center of Excellence, Pfizer, Groton, Connecticut, United States

8:00pm-10:00pm CINF 33: Chemical information on the web: Find and be found

Asta Gindulyte, mandroji@yahoo.com

National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, Maryland, United States

8:00pm-10:00pm CINF 57: ChemEngine: An automated chemical data harvesting tool for molecular inventory and chemical computing from scientific literature

Muthukumarasamy Karthikeyan1, karthincl@gmail.com, Renu Vyas2

1 Digital Information Resource Centre, CSIR National Chemical Laboratory, Pune, India; 2 Chemical Engineering and Process Development, CSIR-National Chemical Laboratory, Pune, MH, India

8:00pm-10:00pm CINF 58: Screening of materials for energy applications based on transport properties: Methods and data automation tools

Boris Kozinsky, bkoz37@gmail.com

Bosch Research, Waban, Massachusetts, United States

8:00pm-10:00pm CINF 63: Discovery through deterministic optimization: Navigating chemical space for effective material design

Jennifer Elward, jen.elward@gmail.com, Christopher Rinderspacher

Army Research Laboratory, Aberdeen Proving Ground, Maryland, United States

8:00pm-10:00pm CINF 81: Regioselectivity prediction of metabolic reactions based on ab initio derived descriptors

Arndt Finkelmann2, arndt.finkelmann@pharma.ethz.ch, Andreas Göller1, Gisbert Schneider2

1 Global Drug Discovery, Bayer Pharma AG, Wuppertal, Germany; 2 Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland

8:00pm-10:00pm CINF 99: Applications of drug-target data in translating genomic variation into drug discovery opportunities

Anna Gaulton, agaulton@ebi.ac.uk

Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 8:15am - 11:55am
Tuesday, March 15
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 86: Towards knowledge representation improvements in chemistry
Evan Bolton, evan.e.bolton@gmail.com

NCBI / NLM / NIH, Warrenton, Virginia, United States
Scientific knowledge is vast and nuanced. Summarizing countless pieces of information (numbering in the billions and trillions) is not straightforward. There are many opportunities to improve the quality and navigability of data. This talk will highlight efforts to handle the open scientific corpus as it pertains to chemical information.

8:45am-9:10am CINF 87: Chemical classifications for biology and medicine
Minoru Kanehisa, kanehisa@kuicr.kyoto-u.ac.jp

Institute for Chemical Research, Kyoto University, Uji Kyoto, Japan
Life would not exist without chemical substances. For the purpose of developing bioinformatics methods, they are divided into two categories: metabolic substances and regulatory substances. Metabolic substances are interconverted as substrates and products of enzyme-catalyzed reactions. Regulatory substances interact with proteins, DNA, RNA, and other endogenous molecules in many ways. The chemical space of metabolic substances is determined by the universe of enzyme-catalyzed reactions, which in turn is determined by the genomic space of enzyme genes [1]. Here we focus on regulatory substances, including xenobiotic compounds and drugs, and present how knowledge is organized in KEGG [2], which is an integrated resource of sixteen main databases. There are two relevant databases. One is KEGG BRITE, which contains hierarchical classifications of various biological objects that are linked to both internal and outside databases. The other is KEGG DGROUP for drug grouping where individual instances of drugs are grouped into classes of functionally identical or similar drugs. KEGG DGROUP can be compared to KEGG ORTHOLOGY (KO) for genes and proteins, where generalization of instances to classes is the basis for interpretation and prediction of molecular interaction networks and associated high-level functions.

[1] Kanehisa, M.; Chemical and genomic evolution of enzyme-catalyzed reaction networks. FEBS Lett. 587, 2731-2737 (2013).
[2] Kanehisa, M., Sato, Y., Kawashima, M., Furumichi, M., and Tanabe, M.; KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 44, in press (2016).

9:10am-9:35am CINF 88: Withdrawn
9:35am-10:00am CINF 89: ChEBI database and ontology: A key resource for chemical biology and metabolomics
Gareth Owen, gowen@ebi.ac.uk

EMBL-EBI, Ely, United Kingdom
ChEBI (http://www.ebi.ac.uk/chebi), a manually curated database and ontology of chemical entities of biological interest, is widely recognized as a key player in the chemical ontology arena. The ChEBI Ontology includes both chemical and functional descriptors of the molecules and groups in the database. This is of particular importance in enabling the incorporation of chemical concepts of importance to chemical biology and drug discovery into resources that have different foci. Thus the ChEBI Ontology is widely used by other biomedical ontologies - most notably the Gene Ontology - for handling their chemistry-containing terms.

After reviewing the motivation behind the development of the ChEBI Ontology and its three component sub-ontologies, the various relationships used to link entities within the ontology and the methods used to classify new entries in the ChEBI database will be described. Finally, plans for future developments of the ChEBI Ontology will be discussed, and possible applications and uses of the ontology by other resources will be outlined.

10:00am-10:15am Intermission
10:15am-10:40am CINF 90: Classifying chemistry: Current efforts in Canada
David Wishart, dwishart@ualberta.ca

Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
Our group has been actively involved in developing databases for metabolomics (HMDB), exposomics (T3DB), food chemistry (FooDB) and medicinal chemistry (DrugBank). This work has given us a unique perspective on chemical information and how it can connect to biological or biomedical information. It has also highlighted the need to develop better methods to “harvest” chemical and biochemical data for our databases as well as better methods to “structure” the data within these databases. Over the past 3 years we have developed several publicly accessible tools to facilitate chemical data harvesting, text mining and feature prediction. We have also started developing novel tools for classifying chemical structures, storing biochemical processes and providing more structured ontologies to describe the chemical and biological data in these databases. In this presentation I will describe some of these tools in more detail and highlight some useful applications that are enabled by these tools.

10:40am-11:05am CINF 91: Classifying compounds in public databases
Lutz Weber, lutz.weber@ontochem.com

IT, OntoChem, Germering, Germany
OntoChem is engaged in developing novel tools and algorithms that enable the interconnection of chemical and biological ontologies with use cases focused on knowledge generation in drug discovery. For example, OntoChem ontologies (http://www.ontochem.de/it-solutions/what-we-offer/ontologies.html) provide not only a computational classification of chemical compounds but also chemical fragments, substituents, scaffolds and other chemical terms directed towards substances and materials like drugs, vitamins, polymers or alloys.

OntoChem integrates semantic text mining and annotation tools to extract knowledge and factual data of compound properties. A modular UIMA pipeline is used to annotate any document type with a range of Life Science technologies. These technologies have been optimized and are ideally suited for the high-speed, high-quality annotation necessary to handle searching large data volumes.

As an example, we will demonstrate the classification of PubChem compounds using an open access chemistry classification system derived from ChEBI, authoritative chemistry textbooks and other sources. Such chemistry classifications can be used to enhance search engines (e.g. www.ocminer.com), improve knowledge extraction technologies and support higher level abstraction in Life Sciences. Using PubChem compound classifications we will demonstrate their utility as a basis for judging novelty and for trend analytics.

 

11:05am-11:30am CINF 92: Automated structural and functional annotation of small molecules using integrated chemical ontologies: ClassyFire, ChemOnt, and downstream applications
Yannick Djoumbou Feunang, djoumbou@ualberta.ca

Biological Sciences, University Of Alberta, Edmonton, Alberta, Canada
Centuries of discoveries in chemistry and biology have produced a large amount of knowledge about chemicals and their interactions with the environment. Recent efforts to harvest and store this data in electronic warehouses have emphasized two issues: 1) the scarcity and incompleteness of the data, and 2) the need to organize it into more comprehensible and exchangeable formats. Moreover, organizing chemical information in a structured way could not only improve our understanding of chemistry and related sciences, but also facilitate new discoveries. In this spirit, we have developed ClassyFire and ChemOnt. ClassyFire is a computational tool for rapid, consistent, dataset-independent, automated structure-based classification of chemical compounds. It relies on the structure-based sub-ontology of ChemOnt, a well-defined chemical ontology, which covers a wide spectrum of compounds and roles (applications, health effects, etc.). In a joint effort, ChemOnt has been mapped to other ontologies for the sake of interoperability. ClassyFire was used to classify major databases, including PubChem, DrugBank, HMDB, and ChEBI. Additionally, we have developed other tools and frameworks to integrate more concepts (proteins, pathways, phenotypes) in order to represent or study their interactions with small molecules. In this presentation, we will describe these tools, some of their applications, and how they could be combined with semantic technologies to infer knowledge, suggest new hypotheses, and make new discoveries.

11:30am-11:55am CINF 93: Evaluation of machine-generated chemical ontologies for molecular information
Stephen Boyer, skboyer@gmail.com, Thomas Griffin, Eric Louie

IBM Research, San Jose, California, United States
With today's proliferation of the scientific literature and the massive databases resulting from computer curation, it is imperative to automate the classification processes. Several programs have been developed to ingest machine-readable forms of molecules and to generate a set of molecular attributes (descriptors). One example is ingesting a SMILES string and correlating it with the context in which it occurred. We have evaluated several of these programs for classification purposes and for input into downstream operations such as knowledge graphs, data mining, regulatory compliance, and cognitive computing.

 

CINF: Driving Change: Impact of Funders on the Research Data & Publications Landscape 8:35am - 12:00pm
Tuesday, March 15
Room 25A - San Diego Convention Center
Elsa Alvaro, Andrea Twiss-Brooks, Organizing
Elsa Alvaro, Presiding
Cosponsored by: MEDI and ORGN
8:35am-8:40am Introductory Remarks
8:40am-8:50am Update on NSF MPS Open Data Policies
8:50am-9:15am CINF 100: NIH public access policy
Neil Thakur, thakurn@od.nih.gov

NIH, Rockville, Maryland, United States
The NIH public access policy has been in place since 2005, and mandatory since 2008. It requires all papers arising from NIH funds to be made public on PubMed Central within 12 months of publication. Since PubMed Central is an XML archive, papers need to be in XML format before they can be posted. There are four ways in which papers can be posted on PubMed Central, and they vary in the level of effort that an author must undertake. The submission method is determined by author and publisher preference. We will describe these methods and their implications for authors. We will also discuss the various strategies NIH uses to monitor compliance. The American Chemical Society has been using different submission methods, and we will explore how these various approaches have impacted compliance.

9:15am-9:40am CINF 101: U.S. Department of Energy public access plan
Laura Biven, laura.biven@science.doe.gov

US Department of Energy, Washington, D.C., District of Columbia, United States
The Department of Energy’s (DOE) Public Access Plan aims to increase access to data and publications resulting from DOE-funded research. As part of the implementation of the Public Access Plan, the Department has developed PAGES, the Public Access Gateway for Energy & Science, to provide public access to full text versions of peer reviewed publications, and now requires data management plans for DOE-funded research. This presentation will discuss the history and philosophy of the new activities and requirements as well as some thoughts for future work.

9:40am-10:05am CINF 102: Helping authors and funders achieve open access goals at ACS Publications
Darla Henderson, D_Henderson@acs.org

Publications Division, American Chemical Society, Washington, District of Columbia, United States
During 2014-2015, in response to increasing funder mandates imposed on authors and several far-reaching trends in scholarly publishing and open access, ACS Publications implemented a significant expansion of its open access publishing program. This expansion included:

- The launch of ACS Central Science, the Society’s first fully open access journal (with no author publishing charges) aiming to publish the most impactful multidisciplinary research and showcasing the centrality of chemistry;
- A new ACS-sponsored program making one noteworthy new article from an ACS journal open access each day (ACS Editors’ Choice);
- New license types for authors choosing to publish open access (ACS AuthorChoice options); and
- A $60-million stimulus program to support authors selecting to publish their work as open access across ACS journals (ACS Author Rewards).

ACS works closely with funding organizations to comply with various new mandates and to communicate these policy requirements to authors. Engaging directly with funders is one way we are seeking to simplify the process for authors. In addition to leveraging direct engagement with funders to simplify author compliance, ACS also serves as a founding member of CHORUS, a suite of services and best practices for sustainable public access to published articles reporting on funded research in the US.

This session will address how the American Chemical Society’s Publications Division is working with funders as they develop new mandates for publications.

10:05am-10:30am CINF 103: Libraries at the hub as the federally funded research wheel turns to open
Shannon Kipphut-Smith1, sk60@rice.edu, Betty Rozum2, betty.rozum@usu.edu, Becky Thoms2, becky.thoms@usu.edu

1 Rice University, Houston, Texas, United States; 2 Utah State University, Logan, Utah, United States
Academic libraries are strong partners in supporting researcher compliance with both funder public access policies and institutional open access policies, and are increasingly involved in research data management activities. The 2008 National Institutes of Health (NIH) Public Access Policy, requiring researchers to deposit copies of all NIH-funded publications in PubMed Central, provided an opportunity for academic librarians to use their expertise in education and training, copyright, and author rights issues to assist with policy compliance. At the same time, many institutions began conversations about management of research data and adoption of institutional open access (OA) policies, requiring faculty to place copies of their scholarship in institutional repositories (IRs). Academic libraries play an important role in these policies, promoting the benefits of OA, managing IRs, and facilitating article deposit.

Naturally, many of those already engaged with services and resources related to public access, OA policies, and research data welcomed the 2013 White House Office of Science and Technology Policy (OSTP) memo, calling for increased public access to the results of federally-funded publications and research data. This presentation shares the results of a study conducted to better understand how academic libraries are leveraging existing services and resources when addressing the new public access policies. The researchers will survey libraries and research offices at the Carnegie very high and high research activity universities regarding OA policies, and services and collaborations that have been developed to assist faculty in meeting the new federal mandates. Using the results of the survey, and case studies from Rice University and Utah State University, we will offer a detailed snapshot of the role of academic libraries and research offices in addressing these funder policies as well as identify opportunities for more collaborative efforts.

10:30am-10:45am Intermission
10:45am-11:10am CINF 104: SHARE phase II: Enhancing the dataset and engaging the community
Judy Ruttenberg, judy@arl.org

Association of Research Libraries, Washington, District of Columbia, United States
SHARE is building a free, open data set about scholarly research activities across their lifecycle. Stakeholders across the scholarly research ecosystem - funders, institutions, researchers, libraries - can both participate in and benefit from an open data set about research activity, especially with an increasing trend toward public and open access to the results of that activity. This session will update the community on the progress of SHARE Notify, currently processing and freely distributing millions of research release events from sources including arXiv, Figshare, PLOS, PubMed Central, and a number of institutional repositories. We will share the objectives and progress of Phase II of SHARE - expanding the number of data providers, enhancing the aggregated metadata, and looking for opportunities for institutional integration of SHARE's dataset. The expansion, enhancement, and integration of the dataset will ensure that SHARE is a timely and reliable source of data for universities about their own research output, and for funders about their investments. SHARE is an open source development project led by three higher education associations (ARL, AAU, and APLU) in partnership with the Center for Open Science, a nonprofit technology start-up.

11:10am-11:35am CINF 105: Supporting openness and reproducibility in scientific research: The Center for Open Science

Sara Bowman, sed8n@virginia.edu

Center for Open Science, Charlottesville, Virginia, United States
New policies by funding agencies require researchers to make their data and other research outputs publicly available. Evolving journal policies increasingly require more data and materials sharing by authors. Researchers must learn to navigate these ever-changing policies, often with little infrastructure support. The non-profit Center for Open Science (COS) seeks to provide researchers with both the infrastructure tools and training to meet these needs.
COS builds The Open Science Framework (OSF), a free and open-source web application designed to manage the entire research lifecycle, from project inception and planning through data archiving and dissemination. The OSF connects tools researchers already use to increase efficiency and streamline workflows. Features like automatic file versioning and logging of actions make the research process more transparent. The OSF can be used privately, among collaborators, or opened to the general public with just the click of a button. Every resource, project, and contributor is given a persistent, unique identifier, which allows work to be cited and researchers to earn credit for contributions. The OSF represents a technical solution for researchers wishing to increase the openness of their work and meet funder mandates regarding data access.
The COS Community team focuses its efforts on building communities of researchers, funders, librarians, journal editors, and other stakeholders around open and reproducible practices in science. Two full-time staff members support researchers with free statistical and methodological consulting services, providing guidance to help researchers both meet funder mandates and make their work more open and reproducible. In another major initiative, the Community team seeks to support journal editors and funders with templates of guidelines that can be adopted to increase transparency of the research process and product. In collaboration with the Berkeley Initiative for Transparency in the Social Sciences and SCIENCE magazine, COS convened a meeting of stakeholders to write the Transparency and Openness Promotion (TOP) Guidelines. This talk will provide an overview of the guidelines, an update on the adopting journals, and more information on how journals in the chemical sciences can participate to enhance their own transparency standards.
This talk will highlight initiatives COS has undertaken to improve the openness, integrity and reproducibility of science.

11:35am-12:00pm CINF 106: Impact of open publishing: Scalability, sustainability, and success
Ann Gabriel, a.gabriel@elsevier.com

Elsevier, New York, New York, United States
New policies concerning dissemination of funded research are influencing traditional modes of scholarly communication. This segment will explore how publishers are working to comply with and enhance a range of mandates from global interests, as well as streamline publication workflow for both institutions and end users. We will examine paths to compliance, including new business models and content types. We will also discuss sharing across scholarly collaboration networks, with a specific focus on Open Data.

CINF: Linking Big Data with Chemistry: Databases Connecting Genomics, Biological Pathways & Targets to Chemistry 9:30am - 11:50am
Tuesday, March 15
Room 24C - San Diego Convention Center
Rachelle Bienstock, Organizing
Rachelle Bienstock, Presiding
9:30am-9:35am Introductory Remarks
9:35am-9:55am CINF 94: Connecting 3D chemical data with biological information
Ian Bruno, bruno@ccdc.cam.ac.uk, Suzanna Ward, Elizabeth Thomas, Colin Groom

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom
Understanding the 3D structure of molecules and their interactions with biological systems is a crucial element of successful drug design. A vital resource in this is the world’s collection of over 800,000 crystal structures of organic and metal organic compounds. Many of these are directly biologically relevant. Even those that aren’t contain conformational and interaction data explaining molecular properties and interactions.

Sophisticated software is available in the Cambridge Structural Database System to release this knowledge, but until now, it has been designed for human consumption. This presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database (CSD), will describe the development of Application Programming Interfaces (APIs) that enable the linking of the CSD to other resources as well as interoperability with other suites of software.

We will see how the 3D structures of small molecules can be linked to the 3D structures of equivalent protein ligands, how the search and analysis tools previously the domain of expert structural chemists can be accessed through Pipeline Pilot and KNIME, and how we might generate streamlined workflows to link structural information in the CSD with target, pathway and disease information in resources such as Open PHACTS. Finally, we will look at the insights we can gain from linking 3D structural chemistry to biological data and the challenges involved in bridging these domains.

9:55am-10:15am CINF 95: PubChem BioAssay: Link chemical research to GenBank and beyond
Yanli Wang, ywang@ncbi.nlm.nih.gov

Building 38a, Room 5s506, Bethesda, Maryland, United States
The PubChem BioAssay database hosted by the National Center for Biotechnology Information (NCBI) at NIH serves as a public repository for biological results from chemogenomic research and RNAi screenings, with the former conducting systematic screening of small molecule libraries against disease targets and pathways, and the latter aiming to gain insights into biological processes and facilitate therapeutic target discovery. In particular, advanced technology in RNAi research enables genome-wide functional screens, and that in small molecule high-throughput screening (HTS) enables testing of large compound libraries across wide assay target panels. PubChem BioAssay has grown rapidly in the past ten years, with over 200 million bioactivity outcomes currently in its database. It devises multiple mechanisms in its data model for recording molecular information for the corresponding protein and nucleotide assay targets, and represents an important information resource for mining chemical modulators for over nine thousand protein targets that are associated with small molecule data, and for mining significance of biological relevance for over 30,000 genes provided by RNAi research. PubChem BioAssay links chemical research data to GenBank and related genomic resources through multiple tools and annotations. This integration helps to close the gap between genomic and chemical biology research, and provides a unique annotation service for the genomic information, which enables the retrieval of drug and chemical modulators for a particular protein in GenBank, as well as searching for the biological and therapeutic relevance suggested by RNAi research for many gene records.

10:15am-10:35am CINF 96: Withdrawn
10:35am-10:50am Intermission
10:50am-11:10am CINF 97: Predicting adverse drug events using literature-based pathway analysis
James Rinker, j.rinker@elsevier.com, Timothy Hoctor

R & D Solutions, Elsevier Inc., Philadelphia, Pennsylvania, United States
Unexpected drug safety issues in clinical development can lead to suspending or ending the development of a clinical candidate. The cost of failed drug candidates in both time and money can greatly hinder the development of other promising candidates due to lost development resources. The ability to more accurately predict potential adverse drug events for pre-clinical candidates would greatly help in the process of deciding to move forward or suspend the development of candidates. One potential method for the prediction of adverse drug events would employ pathway analysis of adverse event regulators. Evidence of regulators implicated in specific adverse events can be mined from the literature and mapped to known drugs or potential drug candidates based on their target profile. The target profile for a drug candidate can then be used to mine through hundreds of potential adverse events and their regulators. Statistical, pathway, and subnetwork analysis can then be used to score and predict the likelihood of a specific adverse event for a drug based on either direct or indirect target modulation.

11:10am-11:30am CINF 98: Intersecting different databases to define the inner and outer limits of the data-supported druggable proteome
Christopher Southan, cdsouthan@gmail.com

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden
Hopkins and Groom coined the term “druggable genome” in 2002 for the extrapolated total of ~ 10% of the human proteome likely to bind small molecules with lead-like chemical properties and sufficient binding affinity for activity modulation. Fast-forward to 2015 and the UniProtKB website now includes four database cross-references in the new Chemistry section. These provide a more detailed picture, based largely on chemistry-to-protein mapping data curated from the literature. They are thus evidence-supported statistics rather than homology-based transitive estimates. These included (Sept 2015) human protein links to 2927 target entries from ChEMBL, 2191 from BindingDB, 1563 from DrugBank and 1340 from the IUPHAR/BPS Guide to PHARMACOLOGY (GtoPdb). Statistical comparisons between these will be presented here, defining different levels of evidence support and following their continued expansion. The union of all four sets, 3603, encompasses ~ 18% of the proteome. However, the proportion that would match the most stringently curated of these, GtoPdb, for chemistry-to-protein mapping is lower, and comparisons indicate that curation strategies and source selections for each database diverge considerably (PMID 24533037). This is manifest in the relatively high unique content of 1147 (31% of the union) for the sources. However, they converge as a 4-way intersect for 490 proteins (13% of the union). Concordance between at least two independent sources (i.e. the non-unique proportion) expands to 2456 or 12% of the proteome. This represents the most precise data-supported druggable proteome snapshot for each UniProtKB release. Orthogonal comparative analyses of these intersecting sets will be presented, including by Gene Ontology functional categories, target class content, secreted vs. non-secreted, and disease gene links.
The utility of this druggable proteome assessment is very high in pharmacology and drug discovery, especially in terms of being able to data mine leads as chemical starting points for target validation experiments.
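
The overlap statistics quoted in this abstract come down to simple set counting. The following is a minimal sketch in Python, assuming each source is available as a set of UniProtKB accessions. The function name `overlap_summary` and the toy identifiers are hypothetical illustrations and are not taken from the talk; only the two reported totals (3603 and 1147) come from the abstract.

```python
from collections import Counter


def overlap_summary(sources):
    """Summarize overlap between sources.

    sources: dict mapping a source name to a set of protein identifiers.
    Returns counts for the union, identifiers unique to one source,
    identifiers shared by at least two sources, and identifiers in all sources.
    """
    # Count in how many sources each identifier appears.
    hits = Counter(p for ids in sources.values() for p in ids)
    n_sources = len(sources)
    return {
        "union": len(hits),
        "unique": sum(1 for c in hits.values() if c == 1),
        "at_least_two": sum(1 for c in hits.values() if c >= 2),
        "all_sources": sum(1 for c in hits.values() if c == n_sources),
    }


# Hypothetical toy data: the identifiers P1..P6 are placeholders, not real accessions.
toy = {
    "ChEMBL": {"P1", "P2", "P3", "P4"},
    "BindingDB": {"P1", "P2", "P5"},
    "DrugBank": {"P1", "P3"},
    "GtoPdb": {"P1", "P6"},
}
summary = overlap_summary(toy)
print(summary)

# Consistency check on the totals reported in the abstract:
# union minus single-source content gives the "at least two sources" figure.
REPORTED_UNION = 3603
REPORTED_UNIQUE = 1147
print(REPORTED_UNION - REPORTED_UNIQUE)  # 2456
```

Counting source membership per identifier, rather than computing every pairwise intersection, keeps the sketch short and extends to any number of sources.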

11:30am-11:50am CINF 99: Applications of drug-target data in translating genomic variation into drug discovery opportunities

Anna Gaulton, agaulton@ebi.ac.uk

Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom
Advances in sequencing and genotyping technologies offer opportunities for large-scale target identification and validation through genetic association studies [1,2]. However, successfully translating genotype-phenotype relationships into new therapeutics necessitates understanding of the associated biological pathways and the chemical tractability of the implicated proteins.

The ChEMBL [3] database collates and organizes drug, target and bioactivity data, with the aim of tracking the drug discovery process from target and lead identification through to drug approval. This talk will present examples of the integration of ChEMBL druggability and drug-target data with results of genome-wide association studies to facilitate the identification of novel drug discovery and drug repurposing opportunities.

References

1. Hingorani, A. & Humphries, S. Nature’s randomised trials. Lancet 366, 1906–8 (2005).
2. Plenge, R. M., Scolnick, E. M. & Altshuler, D. Validating therapeutic targets through human genetics. Nature Reviews Drug Discovery 12, 581–94 (2013).
3. Bento, A. P., Gaulton, A., Hersey, A., Bellis, L. J., Chambers, J., Davies, M., Krüger, F. A., Light, Y., Mak, L., McGlinchey, S., Nowotka, M., Papadatos, G., Santos, S. & Overington, J. P. The ChEMBL bioactivity database: an update. Nucleic Acids Research 42, D1083–D1090 (2014).

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 1:30pm - 4:45pm
Tuesday, March 15
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 107: Representing the chemistry of 800,000 crystal structures
Suzanna Ward, ward@ccdc.cam.ac.uk, Ian Bruno, Colin Groom

Cambridge Crystallographic Data Centre, Cambridge, United Kingdom
For over 50 years the crystallographic community has used the Cambridge Structural Database (CSD) as the worldwide repository to share over 800,000 experimentally determined 3D crystal structures with the broader chemistry community. But these structures are typically represented as ‘just’ the coordinates of atoms in space. In order to be of use to other scientists this data must be enriched with both a chemical representation and the associated metadata necessary to contextualize an entry. Moreover, the structures must also be understandable by computer software.
This presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database, will look at how the existing chemical knowledge in 800,000 crystal structures can be used to generate representations of new structures. It will look at how these representations are used in validation and standardization and in linking crystal data with other resources.
We will look at how we can make structures more discoverable and more useful, before addressing what the broader chemistry and informatics communities can do to improve scientific knowledge representation.

2:00pm-2:25pm CINF 108: CHEMnetBASE and beyond: CRC handbooks and dictionaries in today's world
Fiona Macdonald1, fiona.macdonald@taylorandfrancis.com, Megan Eisenbraun2

1 Taylor and Francis, Boca Raton, Florida, United States; 2 Taylor & Francis, London, United Kingdom
While the CRC Handbook of Chemistry & Physics has been a mainstay for scientists since 1913, its utility is no longer restricted to the printed page. Since 1999 it has been available online in one form or another, and in the summer of 2016 the next incarnation will make its debut.

Along with the Chapman & Hall Chemical Dictionaries (Combined Chemical Dictionary, Dictionary of Natural Products, Dictionary of Organic Compounds) it makes up CHEMnetBASE, a suite of fully searchable databases containing physical properties, structures and chemical names. All of these products will be redesigned to align with the new and improved online Handbook, providing consistent search functionality, indexing protocols, and display of search results.

We will present the motivation behind the development of these resources, outline plans for integrating the search systems and showcase our vision for the future of CHEMnetBASE. Previews of the new online Handbook will also be presented.

2:25pm-2:50pm CINF 109: Collection, curation, and communication of thermophysical and thermochemical property data at the NIST Thermodynamics Research Center
Andrei Kazakov1, andrei.kazakov@nist.gov, Robert Chirico3, Chris Muzny4, Vladimir Diky5, Eugene Paulechka1, Ala Bazyleva1, Joseph Magee2, Scott Townsend1, Kenneth Kroenlein2

1 NIST, Boulder, Colorado, United States; 2 Thermodynamics Research Center, National Institute of Standards and Technology, Boulder, Colorado, United States; 3 National Institute of Standards and Technology, Boulder, Colorado, United States
Exponential growth in publication rates and data generation has yielded tremendous challenges as well as potential rewards for data analysis groups. Data volumes have grown to such a degree that many traditional data collection and interpretation approaches cannot scale sufficiently to remain comprehensive and current, or to effectively track shifting interests within research and industrial communities. It is thus necessary to strongly rely on a substantially increased role for digital archives, automated analysis, and machine learning approaches.

The Thermodynamics Research Center (TRC) at the National Institute of Standards and Technology (NIST) maintains an extensive database of published experimental thermophysical and thermochemical properties for pure compounds, binary and ternary mixtures, and chemical reactions. All stored experimental data are associated with estimated combined experimental uncertainties. The large-scale data collection effort is complemented by the Guided Data Capture (GDC) software developed at TRC. GDC is designed to enforce the completeness of the information extracted, validate the information through data definition, range checks, etc., and guide the uncertainty assessment to ensure consistency between compilers with diverse levels of experience. The resulting database, in combination with expert system software (ThermoData Engine, TDE), allows on-demand (i.e., dynamic) critical evaluation of thermophysical and thermochemical property data.

While the challenges in implementing such a system are significant, the potential benefits are quite noteworthy. The large, well-vetted data sets generated in this way can then be used as inputs for large-scale efforts in chemical modeling, such as chemical candidate screening or development and optimization of property estimation methods. Dynamic access to large validated data sets such as these can also be used to very quickly compare data in submitted manuscripts to a nearly comprehensive set of existing published data, as well as facilitate robust, property-based literature searches, improving the quality of published information and preventing the propagation of erroneous data. These efforts have facilitated a decade-long collaboration with key journals in the field, where reported data are vetted for consistency by TRC before publication. The published data are disseminated in a free and open context via ThermoML, an XML-based file format and IUPAC standard.

2:50pm-3:15pm CINF 110: Building a better materials science database: Challenges and opportunities

Robin Padilla, robin.padilla@springer.com, Michael Klinge, michael.klinge@springer.com

Corporate Markets & Databases, Springer Nature, Heidelberg, Germany
SpringerMaterials presents large amounts of data from materials science, chemistry, and physics. The database draws on the Landolt-Börnstein Series and other specialized databases. Recent development is focused on adding new data sources, digitizing and enriching existing data, enhancing search algorithms, linking diverse content collections, and optimizing user experience design.

3:15pm-3:30pm Intermission
3:30pm-3:55pm CINF 111: TCI’s approaches to chemical information for researchers
Haruhiko Taguchi1, Tracey Barber2, Tracey.Barber@tcichemicals.com

1 RD (Information Management) Department, Tokyo Chemical Industry Co Ltd, Chuo-ku Tokyo, Japan; 2 Marketing, TCI America, Cambridge, Massachusetts, United States
TCI manufactures and provides organic reagents to researchers around the world to support the advancement of chemistry. TCI also supplies chemical information to its customers in various ways, including its website, www.TCIchemicals.com, on which each product has its own dedicated page. Each product page contains links to Reaxys, PubChem, and the Spectral Database for Organic Compounds (SDBS) to help researchers quickly collect chemical information. In addition, TCI’s website product pages provide reagent applications and links to related academic journals and articles. TCI provides original chemical information too, including MSDSs available in multiple languages for safe use, physical properties and regulations for each product. To further aid researchers in finding the reagents they need quickly, TCI offers several ways to search, including by CAS number, keywords, category, structure and more.

Providing researchers the reagents they need when they need them, with the information required to keep their research moving forward quickly, is the challenge of all chemical suppliers today. TCI must ensure that it offers all of the technical information needed to support the research. TCI will show how it supplies chemical information through its website.

3:55pm-4:20pm CINF 112: Presenting the latest scientific knowledge on an e-commerce website
Jonathan Stephan, jon.stephan@sial.com

Sigma Aldrich, Saint Louis, Missouri, United States
Sigma-Aldrich has always strived to deliver the latest information to scientists. As chemical, biological, and overall scientific information has increased, Sigma-Aldrich has built a strong content backbone using automation and informatics. The process starts at product attributes and descriptions and moves to the more complex safety data sheets, technical bulletins and peer-reviewed papers. This presentation will describe how a catalog-based company has used automation to successfully transition to a leading provider of chemical and biological information to the scientific community.

4:20pm-4:45pm CINF 113: Beyond chemistry: Collect, organize, and visualize scientific data on the web
David Deng, dengw2@gmail.com, Rajeev Hotchandani, Jinbo Lee

Scilligence, Burlington, Massachusetts, United States
We live in a time when technological advancement makes the amount of scientific data grow exponentially. For instance, improvements in laboratory technologies allow us to explore new chemical spaces and expedite data generation, and scientific literature is being digitized for easier access. All these developments have resulted in greater scientific data availability. However, how to collect, organize, analyze and visualize this large amount of scientific data remains challenging.

In this presentation, a case study of managing chemical and biologic data within Scilligence’s web-based systems will be introduced. A typical workflow runs from synthesis planning through product registration and assay data analysis to sample management. The information related to small molecules or biologics can be scattered around in the document repository system. It is, however, fully recorded and searchable with Scilligence’s knowledge-mining tools.

CINF: Driving Change: Impact of Funders on the Research Data & Publications Landscape 2:00pm - 4:50pm
Tuesday, March 15
Room 25A - San Diego Convention Center
Elsa Alvaro, Andrea Twiss-Brooks, Organizing
Andrea Twiss-Brooks, Presiding
Cosponsored by: MEDI and ORGN
2:00pm-2:25pm CINF 119: Are we ready to define the scholarly commons?
Maryann Martone1,2, mmartone@ucsd.edu

1 Neurosciences, University of California, San Diego, San Diego, California, United States; 2 Hypothes.is, San Francisco, California, United States
The question of open access must be considered through the duality of modern scholarship: access to research products involves both human and machine. FORCE11, the Future of Research Communications and e-Scholarship, is a grassroots community that arose to address the question of how scholarship needs to adapt to maximize machine-based access in the age of networks and global search. On the flip side, technology must adapt to the requirements and reality of scholarship and its need for persistence and chains of evidence.

FORCE11 is a broad tent, welcoming those across all scholarly disciplines within academia, industry, government and at large. These diverse stakeholder groups allow insight into different practices and cultures and also efforts underway around the globe to provide new platforms and services for scholarly communication. It is clear that even within a single domain, e.g., biomedicine, access to scholarship is fragmented for machines and humans alike. It is also clear that different communities, even within the same domain, are at vastly different stages in transitioning their scholarship to e-scholarship platforms.

Through projects like the Neuroscience Information Framework and the NIDDK Information Network, I have been involved firsthand in cataloging the thousands of databases, tools, and materials produced by the biomedical community. There has been a huge investment in the creation of these resources, but less in long-term sustainability or interoperability. Part of the reason for this is that we really didn’t know how to do either. Sustainability is still challenging, but I believe we are making headway on the latter.

What is emerging from discussions around the globe is a better sense of the principles, best practices, interfaces and minimal standards that should govern information flow across the scholarly ecosystem to maximize machine and human access. At FORCE11, we are calling this the Scholarly Commons. We are considering not just what practices govern digital objects, but how researchers must handle physical and conceptual entities as they transition into the digital realm.

FORCE11 will be hosting a series of workshops that will explore defining the scholarly commons. The outcomes of these workshops will not be an endorsement of a particular platform or technology, but rather a statement of what any stakeholder in modern scholarship should aim to achieve to create a vibrant, dynamic ecosystem that maximizes access for both machine and human.

2:25pm-2:50pm CINF 120: Research data curation services at UC San Diego library
Ho Jung Yoo, hjsyoo@ucsd.edu, David Minor

Library, UC San Diego, San Diego, California, United States
In 2008, the heads of the major campus service providers at UC San Diego recognized the need to streamline and enhance access to technology services on campus. With strong resource support from the Chancellor’s office, the team of service providers formed the Research Cyberinfrastructure Initiative, a campus program designed to centralize access for faculty to the abundance of technology services on campus, which included storage, networking, and high performance computing. One of the major new thrusts of this initiative was to commission the Library to develop a Research Data Curation Program (RDCP). The RDCP was formed at the end of 2013 to support the data management, publishing, and preservation needs that faculty would imminently need to address as a part of their research activities. The program now has a staff of 10 librarians and analysts, in partnership with other Library programs, to support data curation services on campus for faculty, staff, and students. These services include administration of online tools for writing data management plans and minting persistent identifiers, management of a data repository for sharing research data publicly, long term digital preservation, training classes, and consultation services.

2:50pm-3:15pm CINF 121: Is open science an inevitable outcome of e-science?
Jeremy Frey, j.g.frey@soton.ac.uk

University of Southampton, Southampton, United Kingdom
The advent of e-Science, building on the digital revolution in information production, exchange, and consumption, has created new ways of interacting with colleagues and disseminating discoveries. It has also opened up radically new possibilities for regulation and governance of the research process and has therefore, unsurprisingly, attracted the interest of the funders of science and the professional bodies as guardians of professional research practice. The players in the research life-cycle are still exploring and exploiting these opportunities, which are having major consequences for the securing of funding and the obligations placed on researchers, but are also creating new opportunities for different types of research. I will attempt to address some of these aspects in the context of the research landscape in the UK.

3:15pm-3:40pm CINF 122: Navigating the research data ecosystem
Dan Valen, dan@figshare.com

figshare, Brooklyn, New York, United States
Financial, social, and ethical pressures are increasingly requiring grantees to make their research results accessible in order to validate findings and spur scientific discovery. Collaboration around research data and the development of scholarly communication initiatives is fast becoming a requirement at institutions as more and more funding bodies mandate research data sharing. With the rise in funder mandates and public access policies around funded research, researchers, as well as publishers and institutions, are faced with a compliance puzzle.

This puzzle is one of the main drivers for the continuing evolution of figshare.com. At Figshare, we build tools that help researchers, publishers, and institutions store, share, and discover both positive and negative research outputs. Our ultimate goal is to aid in the reproducibility, replication, and reuse of research data and to help the research community realize this goal.

Good data management and infrastructure are at the foundation of reproducible research. This talk will touch on the evidence and challenges for reproducibility we’ve seen at Figshare and will delve deeper into incentives to motivate different stakeholders and communities toward best practices and workflows to achieve transparency in scientific research.

3:40pm-3:55pm Intermission
3:55pm-4:20pm CINF 123: Funding mandates and policies: A database provider's response
Ian Bruno1, Colin Groom2, Amy Sarjeant1, sarjeant@ccdc.cam.ac.uk

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 CCDC, Cambridge, United Kingdom
From the very start of the Cambridge Structural Database (CSD) to its current state as the repository for the world’s crystal structures, those who have curated these data have strived to make them available to all researchers, everywhere. After all, what’s the point of having 800,000 crystal structures if no one can make use of them? The mandates from research funding agencies that all scientific results should be publicly available dovetail with the mission of the Cambridge Crystallographic Data Centre (CCDC) to provide access to crystal structure data for anyone who requires it. How do the services that the CCDC provides match up to funder expectations and how have they evolved in response to these? What can a database provider do to ensure the quality of data is maintained while public access is guaranteed not just today but for future generations? How should it be paid for?

This presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database, explores the influence of funding agencies on data providers and the services they provide. It will also take a look at what remains to be done in order to meaningfully realise the benefits that funding agencies seek to achieve.

4:20pm-4:45pm CINF 124: Quest to find 'broader impact': How funding bodies are using altmetrics to evaluate funded research and grant applications
Sara Rouhi, sara@altmetric.com

Altmetric, Washington, DC, District of Columbia, United States
As funding bodies both public and private evolve to accommodate a soaring number of applicants and diminishing pools of funds, they are increasingly looking beyond traditional metrics to evaluate new applicants and past award recipients. While traditional metrics like H-index, citations, journal impact factor, and journal prestige all speak to the scholarly impact of an applicant, they cannot indicate impact across broader audiences like practitioners (educators, doctors, lawyers, legislators -- non-scholars who use peer-reviewed research in their work), the general public, interested parties, and research communicators (like journalists). Traditional metrics also take months or years to accrue, making them lagging indicators of impact in the scholarly space. They also do not serve early career researchers or researchers working in niche fields with non-traditional research outputs. Altmetrics begin to solve some of these issues by serving as qualitative, attention and immediacy indicators. Private and public funders alike are increasingly using these indices not only to measure grants they have funded -- are they in keeping with the funder mission? Are they reaching key audiences? Are they engaging new/emerging communities of interest? -- but also to evaluate potential grant applicants and existing applications. This presentation will walk through changes in the grant funding process at public and private funders, a case study outlining why funders are using altmetrics in this way, why they have pivoted to add these new metrics in their evaluation process, and what tools you can bring to your libraries to help support your researchers' grant application efforts.

4:45pm-4:50pm Concluding Remarks
CINF: Linking Big Data with Chemistry: Databases Connecting Genomics, Biological Pathways & Targets to Chemistry 2:00pm - 4:05pm
Tuesday, March 15
Room 24C - San Diego Convention Center
Rachelle Bienstock, Organizing
Rachelle Bienstock, Presiding
2:00pm-2:05pm Introductory Remarks
2:05pm-2:25pm CINF 114: How can genomic databases be linked to chemical structural information?
Rachelle Bienstock, rachelleb1@gmail.com

RJB Computational Modeling LLC, Chapel Hill, North Carolina, United States
There are more and more databases containing genomic, biological assay and pathway data. The new Nucleic Acids Research Database issue (NAR, 2015, 43, D1-D5) contains 177 databases including genomic, RNA, protein structure, toxicity and metabolic information. However, small ligand and chemical structure compound data is not linked in an efficient way to biological assay, biological pathway and protein target information. How can ligand and structural information successfully be combined and used with biological pathway, toxicity and target pathway information in the most efficient and coherent way for drug discovery? Methods for connecting and linking disparate database information will be discussed.

2:25pm-2:45pm CINF 115: Reactome pathway knowledgebase: Connecting pathways, networks, and disease
Robin Haw, robin.haw@oicr.on.ca

Informatics and Bio-computing, OICR, Toronto, Ontario, Canada
Modern health initiatives and drug discovery are focused increasingly on targeting diseases that arise from perturbations in complex cellular events. Consequently, there has been a tremendous effort in biological research to elucidate the molecular mechanisms that underpin normal cellular processes. A reaction-network pathway knowledgebase is the tool of choice for assembling and visualizing the “parts list” of proteins and functional RNAs, as a foundation for understanding cellular processes, function and disease. The Reactome Knowledgebase (www.reactome.org) is a publicly accessible, open access bioinformatics resource that stores full descriptions of human biological reactions, pathways and processes. Curated pathway knowledgebases, like Reactome, are uniquely powerful and flexible tools for extracting biologically and clinically useful information from the flood of genomic data. Our data model accommodates the annotation of disease processes, allowing us to represent the altered biological behaviour of mutant variants frequently found in cancer, and to describe the mode of action and specificity of drugs and therapeutics. Bio- and chemoinformaticians use Reactome to interpret high-throughput experimental datasets, to develop novel algorithms for data mining and visualization, and to build predictive models of normal and abnormal pathways. Specific features of Reactome support the visualization of interactions of many gene products in a complex biological process, and the application of bioinformatics tools to find causal patterns in genomic data sets. To maximize Reactome’s coverage of the genome, we have supplemented curated data with a conservative set of predicted functional interactions (FI), roughly doubling our coverage of the translated genome.
We have developed a Cytoscape app called “ReactomeFIViz”, which utilizes this FI network to help biologists perform pathway and network analysis, search for gene signatures within gene expression data sets, or identify significant genes within a list. Pathway and network-based tools for building and validating interaction networks derived from multiple data sets will give researchers substantial power to screen intrinsically noisy experimental data in order to uncover biologically relevant information.

2:45pm-3:05pm CINF 116: Competitive intelligence workbench: Getting access to information for decision making

Huijun Wang, huijun.wang@merck.com

Merck, Kenilworth, New Jersey, United States
Pharmaceutical companies hold large collections of previously generated data, and these collections continue to grow. Meanwhile, new techniques have made rich information available externally. Information is vital for identifying innovative drugs and drug targets. However, it remains a challenge for research scientists to quickly and easily obtain information and use it to make informed decisions. Our competitive intelligence workbench aims to provide a self-service platform that enables scientists to access the latest information from both internal and external sources and to make decisions with strong supporting data. In this project, we integrated multiple sources using a big data approach and built various reusable components and services to find associations among compounds, targets, and clinical phenotypes, which are useful for identifying novel repurposing opportunities, MOA elucidation, etc. We also developed project dashboards that provide a comprehensive knowledge overview of projects in an easy-to-navigate interface. Scientists were able to access the most recent advances in their chosen fields to support decision-making. More importantly, the change in information access methods will reduce the data bottleneck for new medicine innovation in the ever-changing landscape of research.


 

3:05pm-3:15pm Intermission
3:15pm-3:35pm CINF 117: Using systems biology in computational drug design workflows

George Nicola, george.nicola@outlook.com, Bruce Kovacs

Afecta Pharmaceuticals, Irvine, California, United States
We have built an automated, workflow-based system that predicts mechanism of action for new indications of safe, off-patent drugs. The platform technology can also design new molecules for a known target or an active drug program. We do this through a combination of enumerating derivatives from a patent, generating a combinatorial library of analogues around a Markush scaffold, chemical fingerprint searches, 3D similarity (shape, pharmacophores, electrostatics), ADMET descriptor matching, gene expression profiling, and protein docking.

The platform is built in the KNIME workflow environment, and uses both open source as well as proprietary software. The prediction algorithm is custom designed using machine learning models that have been trained on large data sets. We connect and make use of multiple web-accessible databases including those for binding activity, chemical and protein structures, biological pathways, and gene expression.

To feed compounds into the workflow, we have also built a comprehensive compound registration system that analyses, isomerizes, de-duplicates, and uploads compounds to an Instant JChem-enabled MySQL database server. Our base library consists of 10,000 commercially available drug compounds, as well as several hundred hand-picked compounds with known activities.

Our workflow-based platform technology has proven especially useful when partnering with small and mid-size pharmaceutical companies seeking to address an unmet medical need by redesigning an existing product, and where regulatory approval is likely to be achieved rapidly. We provide an example of this platform being used successfully to repurpose an antipsychotic molecule into a drug candidate currently in Phase III clinical trials. We are currently in the process of designing better molecular analogues for this project.

3:35pm-3:55pm CINF 118: Combining semantic triples across domains to identify new and novel relationships and knowledge
Matthew Clark, m.clark@elsevier.com, Frederik van den Broek, Anton Yuryev, Maria Shkrob, Sherri Matis-Mitchell, Timothy Hoctor

R & D Solutions, Elsevier Inc., Philadelphia, Pennsylvania, United States
The focus on methods to analyze large databases, ‘big data’, continues to increase as the collections of scientific observations accumulate. Elsevier has collected tens of millions of facts from scientific literature in the form of semantic triples. In biology an example triple is “A regulates/causes/changes B” where A and B can be compounds, diseases, drugs, or other entity types. The relationship is also qualified by species, tissues, and other variables. In chemistry the triples are similar, e.g. “compound C inhibits protein A”, and are also qualified by variables such as potency, assay type, species, and variant. The possible combinations increase factorially with the number of facts joined together by disease, target, or chemical compound.
By combining these observations in biology and chemistry we can explore questions such as “based on the known targets drug A inhibits, what other diseases might it treat, based on disease pathways reported for all other diseases?” and “given the proteins related to a disease, and compounds known to inhibit those proteins what known compounds/structure scaffolds could be tested to treat the disease?” We will present examples of using data frameworks that combine Elsevier and open source pathway and biological activity databases to explore these questions with the broadest available knowledge base.
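
The kind of cross-domain join described in this abstract can be illustrated with a minimal Python sketch. It is not the presenters' implementation: the triples, entity names, and the `candidate_links` helper are hypothetical and exist only to show how chemistry and biology triples can be combined on a shared protein.

```python
from collections import defaultdict

# Hypothetical triples for illustration only; not taken from any real database.
CHEMISTRY = [
    ("compound_C", "inhibits", "protein_A"),
    ("compound_D", "inhibits", "protein_B"),
]
BIOLOGY = [
    ("protein_A", "regulates", "disease_X"),
    ("protein_B", "regulates", "disease_Y"),
    ("protein_A", "regulates", "disease_Y"),
]


def candidate_links(chemistry, biology):
    """Join (compound, inhibits, protein) with (protein, relation, disease)."""
    # Index the biology triples by their subject (the protein).
    diseases_by_protein = defaultdict(set)
    for protein, _relation, disease in biology:
        diseases_by_protein[protein].add(disease)

    # Follow each inhibition triple through the shared protein to a disease.
    links = defaultdict(set)
    for compound, relation, protein in chemistry:
        if relation != "inhibits":
            continue
        for disease in diseases_by_protein.get(protein, ()):
            links[compound].add((protein, disease))
    return {compound: sorted(pairs) for compound, pairs in links.items()}


if __name__ == "__main__":
    for compound, pairs in candidate_links(CHEMISTRY, BIOLOGY).items():
        print(compound, pairs)
```

A real knowledge base would also carry the qualifiers the abstract mentions (species, tissue, potency, assay type), which would be additional fields on each triple and additional filters in the join.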

3:55pm-4:05pm Concluding Remarks
CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 8:15am - 11:55am
Wednesday, March 16
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 125: Analytical data, the web, and standards for unified laboratory informatics databases
Graham McGibbon1, scitechmaven@gmail.com, Patrick Wheeler2, pwheeler@yahoo.com

1 Advanced Chemistry Development (ACD/Labs), Toronto, Ontario, Canada; 2 Product Development, Advanced Chemistry Development, Encinitas, California, United States
For knowledge management solutions to be widely embraced by the chemical community there must be standards for handling not just chemical structures but also analytical data and metadata. This includes dealing with different data sources, types and formats. More importantly platforms must support this from experiment inception through data acquisition and interpretation then eventually to presentation including appropriate storage and querying capabilities. Technology integration also gains importance considering the modern laboratory informatics environment and increasing externalization.
We present here how our organization has applied 20 years of experience in chemistry and informatics to develop technologies that unify data from distinct formats and types, use common exchange protocols, and are compatible with web-based presentation layers. At the heart of this are chemical nomenclature, molecular structure, spectral and chromatographic information, and databases that store, relate and allow access to these elements and their associated relationships. Further, we illustrate one such technology, namely a platform for live data and unified laboratory intelligence, and how it is utilized. We will also look toward future applications of this knowledge management representation.

8:45am-9:10am CINF 126: From molecular formulas to Markush structures: Different levels of knowledge representation in chemistry
Michael Braden, mbraden@chemaxon.com

ChemAxon, Cambridge, Massachusetts, United States
Chemical compounds can be characterized in many different ways. Depending on the level of detail we have as to the composition, we can easily end up with very general or very specific descriptions. The representation of the available information is crucial in many use cases, where the actual chemical knowledge drives further important decisions. These use cases include quite 'simple' ones, like checking the uniqueness of compounds, but also very complex ones, like the coverage of the patent space by a certain Markush structure. The presentation will provide a review of the existing solutions for these problems within a suite of informatics tools, a comprehensive knowledge management solution for chemical sciences. It will also describe the motivation behind the development of these resources, a vision of how they can be used by others, and successful user stories along with the exciting science behind them.

9:10am-9:35am CINF 127: Strategies for creating knowledge from chemistry and text data
Tom Oldfield1, tom.oldfield@dotmatics.com, Mariana Vaschetto1, mariana.vaschetto@dotmatics.com, Jeff Nauss2, jeff.nauss@linguamatics.com

1 Dotmatics, Bishops Stortford, United Kingdom; 2 Linguamatics, San Diego, California, United States
Chemical data representation is a challenge that has been addressed using different methodologies. Representation includes not only a set of unique chemical descriptors for the molecules themselves, but also the linking process (reactions) that they belong to in the form of metadata. The structured nature of this data makes it easy to store in structured databases. However, one common issue remains: the low quality of metadata associated with each chemical entity. This could hinder the extraction of meaningful knowledge from the stored information without time consuming human intervention. Efforts have been made in a) the optimization of chemical and reaction representation in order to achieve real-time text and data mining and b) the integration of chemical information with semantic analysis of surrounding text generated by researchers. In this talk we will focus on addressing the first issue in detail and discuss strategies for the second part.
We will provide the background on the chemical and reaction representations used by Dotmatics and the tools that enable chemists to generate them within a comprehensive chemistry toolkit. Additionally, this talk will cover how chemistry descriptors can be converted into computer fingerprints or bit-strings, allowing high-performance searching (superstructure and substructure searching) and ranking of chemistry data. These solutions also take advantage of advanced memory mapping and threading to provide interactive capabilities in addition to those available on standard laptop computers. This enables data discovery to be done at the application level instantly and without recourse to large-scale server infrastructure.
Finally, we will explore how all Dotmatics technologies can make use of standardized ontology dictionaries and other commercially available natural language processing (NLP) based text mining tools, adding value to the knowledge discovery process.
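
As a general illustration of bit-string searching and ranking (not the Dotmatics implementation), the following Python sketch assumes fingerprints stored as plain integers. The library entries and function names are hypothetical. A bitwise screen discards molecules that cannot contain the query, and the Tanimoto coefficient ranks the survivors.

```python
def popcount(bits: int) -> int:
    """Count the bits set in a fingerprint."""
    return bin(bits).count("1")


def may_contain(target_fp: int, query_fp: int) -> bool:
    """Substructure screen: every query bit must also be set in the target.

    Passing the screen is necessary but not sufficient; a full atom-by-atom
    match would normally follow for the surviving candidates.
    """
    return (target_fp & query_fp) == query_fp


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def rank(query_fp: int, library: dict) -> list:
    """Screen the library, then sort the hits by decreasing similarity."""
    hits = [
        (name, tanimoto(query_fp, fp))
        for name, fp in library.items()
        if may_contain(fp, query_fp)
    ]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    # Made-up 6-bit fingerprints, purely for illustration.
    library = {"mol_1": 0b101101, "mol_2": 0b001101, "mol_3": 0b110010}
    print(rank(0b001101, library))
```

Because the screen is a single AND and comparison per molecule, it stays fast even on modest hardware, which is the property that makes interactive searching on a laptop plausible.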

9:35am-10:00am CINF 128: Combined structure and reaction retrieval in scientific content: What satisfied users in the past and what they demand for the future
Guido Herrmann1, guido.herrmann@thieme.de, Josef Eiblmaier1, Valentina Eigner-Pitto1

1 Georg Thieme Verlag Kg, Stuttgart, Germany; 1 InfoChem GmbH, Munich, Germany
Thieme has been a chemistry publisher since 1909. We publish scientific information in various formats: journals, reference works, encyclopaedias, monographs and textbooks. Together with InfoChem GmbH, a software company focusing on the production and marketing of new products for chemical information, advanced solutions have been developed to handle, store and retrieve chemical structures and reactions.

In our talk we present the motivation behind the development of these resources and a vision of how they can be used by others. We will highlight, for reference works, journals and encyclopaedias, how semantic technologies and advanced text, structure and data representation, combined with sophisticated search technologies, lead to a greatly enhanced user experience and discoverability.

10:00am-10:15am Intermission
10:15am-10:40am CINF 129: Harnessing chemical and toxicological data for the evaluation of food ingredients and packaging
Diane Schmit, dschmit@alumni.ucla.edu, Tammy Page, Kirk Arvidson, Patra Volarath, Leighna Holt

US Food and Drug Administration, College Park, Maryland, United States

The U.S. Food and Drug Administration’s (FDA’s) primary mission is to promote and protect public health. FDA's Center for Food Safety and Applied Nutrition (CFSAN) is one of six product-oriented centers within the FDA that carry out the mission of FDA to enforce the Federal Food, Drug, and Cosmetic (FD&C) Act and other laws that are designed to protect consumers' health and safety. The Office of Food Additive Safety (OFAS) within CFSAN manages FDA's pre- and post-market safety review of food additives, color additives, food contact substances, and generally recognized as safe (GRAS) substances. As a result of these responsibilities, OFAS has amassed a very large volume of chemical, toxicological and regulatory data on chemicals under its purview. As such, OFAS has developed a number of web-based informatics tools that link regulatory submissions, regulations, chemical data and toxicological data to facilitate the identification of the regulatory history of a particular chemical as well as the chemical and toxicological data available within our internal administrative files. STARI is an ontology of scientific and food terminology and regulatory data, organized in a multi-hierarchical structure, and cross-linked to CERES and other data resources. CERES is OFAS’ chemical-centric knowledgebase that links regulatory history with human intake estimates and toxicological data in one resource. CERES also provides informatics tools to probe potential toxicity as well as identify potential structural analogs for read-across approaches to assist in the safety evaluation of new and previously regulated food additives and ingredients.

10:40am-11:05am CINF 130: Expansion of DSSTox: Leveraging public data to create a semantic cheminformatics resource with quality annotations for support of U.S. EPA applications
Christopher Grulke2, Inthirany Thillainadarajah1, Antony Williams1, David Lyons1, Jeff Edwards1, Ann Richard1, richard.ann@epa.gov

1 National Center for Computational Toxicology, US EPA, Research Triangle Park, North Carolina, United States; 2 Zachary Piper Solutions, New Hill, North Carolina, United States
The expansion of chemical-bioassay data in the public domain is a boon to science; however, the difficulty in establishing accurate linkages from CAS registry number (CASRN) to structure, or in properly annotating names and synonyms for a particular structure, is well known. DSSTox has long been considered a trusted source for highly curated CASRN to name to structure relationships within the environmental toxicology community. DSSTox recently expanded to include accurate annotation of the more than 8000 chemical substances being tested in the ToxCast and Tox21 programs. To extend cheminformatics integrity beyond DSSTox’s initial 25K substances, we collected data from various public sources and performed a series of checks to evaluate the consistency of chemical information within and across these public repositories. Incoming data were constrained by strictly enforcing a 1:1 mapping of CASRN to structure, and each substance was assigned to one of six “QCLevels” to capture the level of confidence in CASRN to name to structure associations. The number of chemicals now supported in DSSTox has expanded to over 750k, with over 150k curated to a higher quality than public resources. This expanded version of DSSTox is available to the public in legacy DSSTox flat file and SDF formats, through web interfaces supporting EPA’s Chemical Safety and Sustainability (CSS) projects (including ToxCast and Tox21), and in RDF graph format to facilitate semantic data efforts. Our efforts have quantified a high degree of inconsistency in publicly available chemical annotations, as well as highlighted the challenges caused by limited adoption of semantic data in chemistry to date. This abstract does not reflect U.S. EPA policy.
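
The 1:1 constraint between CASRN and structure mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DSSTox curation pipeline: the identifiers are placeholders, the `one_to_one` helper is hypothetical, and the six QCLevels are not modelled.

```python
from collections import defaultdict


def one_to_one(records):
    """Split (casrn, structure_key) records into clean and conflicting CASRNs.

    A CASRN is accepted only if it maps to exactly one structure and that
    structure maps back to exactly one CASRN across all incoming sources.
    """
    structures_by_cas = defaultdict(set)
    cas_by_structure = defaultdict(set)
    for casrn, structure in records:
        structures_by_cas[casrn].add(structure)
        cas_by_structure[structure].add(casrn)

    accepted, conflicts = {}, []
    for casrn, structures in sorted(structures_by_cas.items()):
        if len(structures) == 1:
            (structure,) = structures
            if len(cas_by_structure[structure]) == 1:
                accepted[casrn] = structure
                continue
        conflicts.append(casrn)  # left for manual curation
    return accepted, conflicts


if __name__ == "__main__":
    # Placeholder identifiers, not real CASRNs or structure keys.
    records = [
        ("CAS-A", "struct_1"),  # source 1
        ("CAS-A", "struct_1"),  # source 2 agrees
        ("CAS-B", "struct_2"),
        ("CAS-B", "struct_3"),  # one CASRN, two structures
        ("CAS-C", "struct_4"),
        ("CAS-D", "struct_4"),  # one structure, two CASRNs
    ]
    print(one_to_one(records))
```

In practice the structure key would be a canonical identifier, so that the same structure drawn differently by two sources is not reported as a conflict.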

11:05am-11:30am CINF 131: Comparative toxicogenomics database: Advancing understanding of molecular connections among chemicals, genes, and diseases

Cynthia Grondin, cjgrondin@ncsu.edu, Allan Davis, Thomas Weigers, Carolyn Mattingly

Biology, North Carolina State University, Raleigh, North Carolina, United States
Exposure to chemicals in the environment plays a key role in the etiology of many human diseases and phenotypes. Chemicals influence genes and proteins, molecular pathways, and disease susceptibility, yet a clear understanding of their direct role in human disease is lacking. The Comparative Toxicogenomics Database (CTD; http://ctdbase.org) promotes understanding about the effects of environmental chemicals on human health by manually curating and presenting data from scientifically reviewed literature on the interactions between chemicals, genes, and diseases in vertebrates and invertebrates. In our curation paradigm, CTD scientists use controlled vocabularies, ontologies, mnemonic codes, symbols, and structured notation to transform the scientific literature into a semantic, computable structure. This information is integrated with gene attributes (including Gene Ontology annotations), molecular pathways, species, and general toxicology information to provide a free knowledgebase of over 28 million toxicogenomic relationships that can inform user hypotheses. CTD chemicals align with MeSH chemical terms and link to CCRIS, ChEBI, ChemIDplus, GENE-TOX, Household Products Database, Hazardous Substances Data Bank, PubChem and TOXLINE. Numerous CTD tools enable novel enrichment and comparative analyses of user-defined or CTD-based data sets. In addition, the structured information is made available for computational analysis in the form of XML, BEL, and other formats. Here, we present an overview of CTD functionality with emphasis on chemical representation and its integration with molecular and disease data.

11:30am-11:55am CINF 132: Wikidata: Advancing science through semantic integration of genes, diseases, and drugs
Benjamin Good1, bgood@scripps.edu, Elvira Mitraka2, Andra Waagmeester1,3, Sebastian Burgstaller-Muehlbacher1, Timothy Putman1, Andrew Su1, Lynn Schriml4

1 Department of Molecular and Experimental Medicine, Scripps Research Institute, La Jolla, California, United States; 2 Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States; 3 Micelio, Antwerp, Belgium; 4 Epidemiology and Public Health, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States
Wikidata (https://www.wikidata.org/) is an openly accessible and editable, Semantic Web-compatible knowledge base that now underlies Wikipedia as a “knowledge commons”. It is a full-fledged member of the linked data cloud, with a SPARQL endpoint available at https://query.wikidata.org/. Wikipedia articles can now render content queried directly from Wikidata and each Wikipedia article is hyperlinked to a corresponding data item in Wikidata.
Our team is addressing the ongoing challenge of biomedical knowledge dissemination and integration by populating Wikidata with the seeds of a semantic network linking genes, drugs and diseases. Nodes and edges in this network are populated automatically by ‘bots’ that integrate data from trusted authorities such as NCBI’s Entrez Gene, DrugBank, and the Human Disease Ontology. Using this content, we are automatically enhancing the number, content and semantic inter-relations of Wikipedia articles about genes, diseases and drugs.
Outside of Wikipedia, the open API of Wikidata provides the capacity to generate or enhance many other applications. For example, useful queries of chemical data such as “what clinically relevant drug-drug interactions are known for the drug methadone” are already possible with Wikidata’s SPARQL endpoint. Supporting APIs also provide access to the edit history of all items in the graph, providing programmatic capabilities to detect and correct vandalism and to reward individual contributors.
Wikidata is unique among biomedical Semantic Web resources in that it is editable by anyone and is embedded directly in the context of all other human knowledge. This openness and centrality make it the ideal foundation upon which to build the next generation of Web-scale semantic data. We encourage the chemical informatics and chemical biology community to join us in expanding Wikidata’s coverage of the chemical universe, in particular, the development of drug-gene and drug-disease semantic relations.
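
A drug-drug interaction query of the kind mentioned in this abstract might be assembled as in the Python sketch below. It is not the presenters' code. The property ID `P769` (taken here to be Wikidata's "significant drug interaction" property) and the `/sparql` path on the query service are assumptions that should be checked against the live Wikidata documentation. The sketch only builds the request URL and does not contact the service.

```python
from urllib.parse import urlencode

# Assumed machine-readable path of the query service named in the abstract.
ENDPOINT = "https://query.wikidata.org/sparql"


def interaction_query(drug_label: str, interaction_property: str = "P769") -> str:
    """Build a SPARQL query for items linked to a drug by one property.

    P769 is an assumed property ID; verify it before relying on the results.
    The label is inserted verbatim, so it must not contain quote characters.
    """
    return f"""
SELECT ?drug ?other ?otherLabel WHERE {{
  ?drug rdfs:label "{drug_label}"@en ;
        wdt:{interaction_property} ?other .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
""".strip()


def request_url(query: str) -> str:
    """Encode the query as a GET request asking for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    print(request_url(interaction_query("methadone")))
```

Looking the drug up by its English label avoids hard-coding an item identifier, at the cost of possibly matching more than one item that shares the label.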

CINF: Reimagining Libraries as Innovation Centers: Enabling, Facilitating & Collaborating throughout the Research Life Cycle 8:45am - 12:00pm
Wednesday, March 16
Room 24C - San Diego Convention Center
Ye Li, Vincent Scalfani, Organizing
Ye Li, Presiding
8:45am-8:50am Introductory Remarks
8:50am-9:15am CINF 133: From dusty stacks to an information hub: Reimagining the UF libraries
Neelam Bharti1, neelambh@ufl.edu, Sara Gonzalez2

1 Marston Science Library, University of Florida, Gainesville, Florida, United States; 2 Marston Science Library, Gainesville, Florida, United States
In the last several years, the University of Florida Libraries has been working on redesigning the Marston Science Library, rising above expectations and collaborating actively on a program of strategic planning and innovation. Once a library that contained mostly books and journals with very little study space and few electrical outlets, the science library was transformed into an innovative collaboration center by providing modern technologies and study space for students and researchers. The science library quickly became a focal point for the university with the inclusion of 3D printing, the MADE@UF lab, a visualization and conference room, and open floor seating in the new Collaboration Commons. Considering that most of the research resources and journals are online, results have been very impressive, with our user counts doubling in the last year. The project has transformed not just the library space and workflow but also the library’s organizational culture and the responsibilities of librarians. Marston’s transformation has been so successful that other UF libraries are following with similar renovations (known as “Marstonization”). This transition represents a huge step in rethinking and redesigning a traditional library space on the way towards the inventive libraries of the future.

9:15am-9:40am CINF 134: Expanding the research commons model into disciplinary instances
Jeremy Garritano, jgarrita@umd.edu

University Libraries, University of Maryland, College Park, Maryland, United States
In a distributed library environment, providing services to faculty, staff and students can be complicated by dispersed library staff, the need to target services properly to appropriate campus segments, and the need to leverage the various infrastructures of individual libraries. At the University of Maryland, a Research Commons model was first developed at the “main library,” with a focus on both virtual and physical services. After an initial academic year, the development of a disciplinary Commons was considered to complement the Research Commons and the previously established Learning Commons. In the summer of 2015, a taskforce was created to outline the creation of a Science Commons that would be connected to the Research Commons. This talk will present the general philosophy of the Commons model as interpreted at the University of Maryland as well as discuss the administrative and organizational evolution of the Commons. Preliminary services of the Science Commons and their assessment will also be discussed.

9:40am-10:05am CINF 135: Libraries for the future: A digital economy perspective
Jeremy Frey, j.g.frey@soton.ac.uk, Steven Brewer

University of Southampton, Southampton, United Kingdom
The discussion of Libraries for (and of) the future formed a major theme for the IT as a Utility (ITaaU) Challenge area network (http://www.itutility.ac.uk) of the Research Councils UK (RCUK) Digital Economy theme. We present a summary of the discussion and conclusions of the workshops and meetings examining the future role of research and community libraries that have taken place under the auspices of the ITaaU Network. The concept of a library in the digital age was informed by considering the origins and uses of research libraries over time as not only a repository but also an active research space. A key role of the University library has been as a meeting point between disciplines, enabling and informing interdisciplinary discourse long before it became necessary to formally acknowledge this need. New ways of interacting with the research and wider community will be discussed, along with the way in which the digital presence (“digital aura”) of people and “books” alters the information flow between organisations and people.

10:05am-10:20am Intermission
10:20am-10:45am CINF 136: Leveraging the interdisciplinarity of chemistry: Building interdisciplinary collaborations
Kiyomi Deards, kiyomideards@gmail.com

Research and Instructional Services, University of Nebraska-Lincoln, Lincoln, Nebraska, United States
An outreach event started by three chemists in Nebraska has spawned several collaborations both nationally and statewide. Learn how they are leveraging the interdisciplinary nature of chemistry and chemical information to create outreach and scholarly collaborations within STEM (Science, Technology, Engineering, Math) and with the social sciences.

10:45am-11:10am CINF 137: Predicting local trends in scholarly communication for decision-making in collection development: An exploration beyond citation analysis
Ye Li, liye@umich.edu

University of Michigan, Ann Arbor, Michigan, United States
Data-driven collection development has been one of the means employed by academic librarians to revolutionize library collections and services for many years. Citation analysis of scholarly publications from researchers of an institution, in combination with resource usage data and interlibrary loan data, can often generate a baseline of needed resources during a given time period. However, most analyses have focused on counting the frequency of current or past citations or usage, and few studies have used current data to predict future trends in scholarly communication and demand for new resources. In this study, we explore the possibility of using basic regression models and machine learning tools from the emerging data science field to analyze citation, usage and other library transaction data in chemistry and related research fields. This analysis will identify useful features, such as citation counts, subjects, and keywords, and their corresponding weights for prediction of future trends and potential research directions in a specific institution. Other features like costs, budgets, and usage statistics could be included in the model to predict the importance and the likelihood of keeping a title or a group of titles in the next few years. One focus of the study is to make a useful model for revealing trends in the publishing of open access articles among chemists. A successful data model has the potential to help librarians select open access journals for recommendation and decide whether to pay the member institution fees. By applying the model, we hope to tie our decision-making in collection development closer to the local trends of research and scholarly communication through an evidence-based approach.
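
To make the idea of a "basic regression model" concrete, here is a minimal Python sketch of an ordinary least squares trend line fitted to yearly counts. It is not the model used in the study: the numbers are invented for illustration, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Evaluate the fitted line at x."""
    intercept, slope = model
    return intercept + slope * x


if __name__ == "__main__":
    # Invented yearly counts of open access articles; not real data.
    years = [2011, 2012, 2013, 2014, 2015]
    counts = [10, 14, 18, 22, 26]
    model = fit_line(years, counts)
    print("projected count for 2016:", predict(model, 2016))
```

A single-feature line is only a starting point. The features named in the abstract (citation counts, subjects, keywords, costs, usage) would call for a multivariate model, and any projection should be validated against held-out years.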

11:10am-11:35am CINF 138: Academic technologies: A new library service to offer advanced software training
Vincent Scalfani, vincent.scalfani@gmail.com, Melissa Green

University Libraries, University of Alabama, Tuscaloosa, Alabama, United States
Libraries have started to offer access to a tremendous amount of advanced academic software such as graphing, 3D design, and technical computing programs. Traditionally, students are expected to learn much of this software on their own or within their courses. We have found a great need to go beyond simply offering access to software in our libraries. As such, there is a tremendous opportunity for new collaborations and teaching initiatives with software applications among librarians, students, and faculty in their coursework and research. This presentation will cover what libraries are doing to meet software training needs as well as our own experience teaching workshops and offering consultations to support various software applications such as Adobe Creative Suite, ChemDraw, IBM SPSS Statistics, MathWorks Matlab, MS Office, QtiPlot, and Trimble SketchUp Pro. We will conclude this presentation with our ideas for the future role of libraries with advanced software training and collaborations.

11:35am-12:00pm CINF 139: Enhanced chemical understanding through 3D-printed models

Amy Sarjeant1, sarjeant@ccdc.cam.ac.uk, Peter Wood4, Ian Bruno1, Ye Li2, Vincent Scalfani3, Shawn O'Grady2

1 Cambridge Crystallographic Data Centre, Cambridge, United Kingdom; 2 University of Michigan, Ann Arbor, Michigan, United States; 3 University Libraries, University of Alabama, Tuscaloosa, Alabama, United States; 4 CCDC, Cambridge, United Kingdom
With the advent of affordable 3D printing technology, including in-house printers and web-based commercial enterprises, what had long been a novelty is rapidly becoming a useful tool in the education process. Students have long used chemical model kits to create tactile molecules which help elucidate principles of bonding, VSEPR theory and other three-dimensional properties which are difficult to understand from the two-dimensional world of textbooks and slide presentations. The ability to print copies of common molecules, as they appear in the solid state, can not only bring a sharper understanding of these “static” concepts but can also shed light on dynamic processes such as those involved in molecular machines, host-guest chemistry, protein docking phenomena, and molecular motions. The inherent difficulty in creating models which can demonstrate these dynamic behaviors is finding the correct parameters and materials which will produce an interlocking, flexible model which remains robust. Many academic libraries have created 3D labs providing 3D visualization and printing services. These 3D labs enable researchers, educators, digital fabrication specialists and librarians to develop these models collaboratively.

Using data available from the nearly 800,000 structures in the Cambridge Structural Database (CSD) and software embedded in the visualization and exploration program Mercury, we explore the procedures needed to produce such classroom aids. This presentation, timed to coincide with the end of the 50th anniversary year of the Cambridge Structural Database, will describe our attempts to create dynamic 3D models as well as several educational modules which can be used in conjunction with them.

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 1:30pm - 4:45pm
Wednesday, March 16
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 140: IUPHAR/BPS guide to pharmacology (GtoPdb): Concise mapping for the triples of chemistry, data, and protein target classifications
Christopher Southan, cdsouthan@gmail.com, Joanna Sharman, Adam Pawson, Elena Faccenda, Jamie Davies

Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden
The International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR) provides authoritative reports on G protein-coupled receptors (GPCRs), nuclear hormone receptors and ion channels as pharmacology-based classifications. While these recommendations have appeared as Pharmacological Reviews papers (i.e. unstructured) since the 1990s, they were already underpinning the protein tables in GtoPdb's predecessor, IUPHAR-DB, by 2003. By 2012 this hierarchical data structure had expanded into the GtoPdb schema covering essentially all target classes for pharmacology, drug discovery and chemical biology. As of August 2015 the expert-curated relationship capture from the literature covers 1505 target-to-ligand mappings of which 1228 human protein IDs have quantitative interaction data recorded against 5860 chemical structures. The motivation, evolutionary trajectory, the need for community engagement to fill data gaps and future directions of the resource will be outlined. Descriptions will cover the challenges of cross-referencing alternative gene/protein hierarchies, each of which has different navigational utilities and linkages to chemistry in GtoPdb. These now extend beyond receptors to enzymes and include NC-IUPHAR, HGNC, UniProt, Ensembl, InterPro, Gene Ontology and E.C. numbers. The adaptation of our classifications to encompass a new immunopharmacology project will also be discussed.

2:00pm-2:25pm CINF 141: Open PHACTS: Semantic interoperability for drug discovery
Herman Van Vlijmen1, hvvlijme@its.jnj.com, Open PHACTS Consortium2

1 Computational Chemistry, Discovery Sciences EU, Janssen, Beerse, Belgium; 2http://www.openphacts.org, Vienna, Austria
The Open PHACTS project (http://www.openphacts.org) has built a semantic platform for drug discovery that integrates data over diverse sets of public chemistry and biological data. It currently connects linked open data from 12 different data sources, including chemical compounds, protein targets, biological pathways and tissues, and diseases. The diversity and size of the Open PHACTS data are growing rapidly, and it currently contains more than 3 billion triples. The Open PHACTS project is a unique collaboration between European academic groups, small businesses and large pharmaceutical companies, partially funded by the EU. The driver for the project is to enable scientists to easily access and process data from multiple sources to solve real-world drug discovery problems that were very difficult to solve before. These drug discovery problems formed the basis for selecting which public data sources were integrated in the Open PHACTS project. Anyone can freely access the Open PHACTS data through a well-documented interface (API), and numerous workflows to answer specific biomedical questions have been developed and published using the KNIME and Pipeline Pilot pipelining tools. In addition, several custom applications have been built using the API. Open PHACTS has shown that Linked Open Data in the form of RDF triples can be used effectively by the scientific community, and allows queries that were previously very difficult or impossible to run. Future directions include the integration of additional public and commercial data sources, integration of internal company data with Open PHACTS data, and the continued development of workflows for scientific questions that can only be answered using linked data.

2:25pm-2:50pm CINF 142: Representation of drug discovery knowledge in the ChEMBL and SureChEMBL databases
Anna Gaulton, agaulton@ebi.ac.uk

Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom
The ChEMBL [1] bioactivity database and the SureChEMBL patent resource were both originally developed as commercial products but have been transferred to EMBL-EBI and are now freely available to academic and industrial researchers. The increasing availability of such open chemistry data has had a dramatic impact on the field of cheminformatics. This talk will address some of the issues and complexity involved in curating and maintaining these large-scale chemistry resources and the strategies employed to facilitate integration and mining of these data [2,3].

References

1. Bento, A.P., Gaulton, A., Hersey, A., Bellis, L.J., Chambers, J., Davies, M., Krüger, F.A., Light, Y., Mak, L., McGlinchey, S., Nowotka, M., Papadatos, G., Santos, S., Overington, J.P. The ChEMBL bioactivity database: an update. Nucleic Acids Res. 42, D1083-D1090 (2014).

2. Papadatos, G., Gaulton, A., Hersey, A., Overington, J.P. Activity, assay and target data curation and quality in the ChEMBL database. J. Comput. Aided Mol. Des. DOI:10.1007/s10822-015-9860-5 (2015).

3. Hersey, A., Chambers, J., Bellis, L., Bento, A.P., Gaulton, A., Overington, J.P. Chemical databases: curation or integration by user-defined equivalence? Drug Discov. Today Technol. 14, 17-24 (2015).

2:50pm-3:15pm CINF 143: Chemical knowledge representation and access in Wolfram|Alpha and Mathematica

Eric Weisstein, eww@wolfram.com

Scientific Content, Wolfram|Alpha, Champaign, Illinois, United States
Wolfram|Alpha (http://www.wolframalpha.com) is a freely available website that contains and exposes curated data sets taken from hundreds of technological, scientific, sociological, and other domains, including a substantial and growing set of chemical data. This data is accessible directly via the website, through its API, and through a number of other specialized sources (such as various apps and Siri). More recently, it has also become available in the Wolfram Language and Mathematica as a set of built-in functions centered around an entity-property approach to information representation.

This talk will focus on the infrastructure developed for representing and accessing data (especially chemical data) in Wolfram|Alpha and on the Wolfram Language functionality for making this data even more computationally accessible within Mathematica. The talk will also touch on the extensive unit system now built into Mathematica through the unit and physical quantity infrastructure backend developed for Wolfram|Alpha.

The introduction of entity, entity class, property, property qualifier, and related Wolfram Language symbols provides a flexible way to represent, access, and compute with data. At the same time, Wolfram|Alpha and Mathematica implement a natural language encoding and processing system for easily accessing information and automatically converting plain text inputs and computational queries into the Wolfram Language. The resulting synthesis of data representation, exposure, and access provides a powerful and extensible framework that is practically applicable to virtually any domain of interest (including chemistry).

Wolfram|Alpha's knowledge comes from a combination of Mathematica computations, roughly 1000 curated data sets, and links to a number of real-time data sources. Additional chemistry-specific functionality currently under development in the Wolfram Language includes service connections to the Open PHACTS, ChemSpider, and PubChem databases (which will allow computations using chemical databases more extensive than those directly built in to the Wolfram Language), computational encoding of functional groups (for further graph related exploration), and improved support for pharmaceuticals and chemical compounds relevant in medical sciences.

3:15pm-3:30pm Intermission
3:30pm-3:55pm CINF 144: Helping people navigate the changing seas of scientific information
David Evans1, david.evans@relx.ch, Pieder Caduff1, Thibault Geoui2, Juergen Swienty-Busch2

1 Reed Elsevier Properties SA, Neuchatel, Switzerland; 2 Elsevier Information Systems, GmbH, Frankfurt, Germany
Many people suggest that chemistry is the central science. It certainly underpins much of our modern life, and has a central role in delivering many of the solutions to key problems facing mankind today. At RELX Group we provide high quality content, data, and analytics tools to help scientists make these new discoveries. In order to provide the systems that will meet the demands of tomorrow's scientists we are rebuilding our infrastructures today, including different classifications and linking technologies. In this presentation we will discuss the ongoing requirements for the production of major online research resources, in particular ontologies and taxonomies in the chemical, biological and biomedical areas, automatic indexing of content, and simplification of search strategies. We will also provide an insight into how our strategies for the future are being influenced by changes in user behaviour and demands, and by technology on the web.

3:55pm-4:20pm CINF 145: Characterization and categorization of novel knowns, unknowns, and the interface between physical and digital
Graeme Whitley1, gwhitley@wiley.com, Bernd Berger2, Timothy Adams2

1 Wiley, Hoboken, New Jersey, United States; 2 Wiley-VCH, Weinheim, Germany
We present our experience as a publisher in categorizing novel compounds, partially characterized metabolites, mixtures, and other edge cases in the interface between lab instrument, literature, and the chemical knowledge space. Examples and solutions from a variety of domains, including toxicology and clinical, will be provided, with an emphasis on spectroscopic data.

4:20pm-4:45pm CINF 146: Semantic approaches for biochemical knowledge discovery
Michel Dumontier, michel.dumontier@gmail.com

Medicine, Stanford University, Stanford, California, United States
With its focus on investigating the basis for the sustained existence of living systems, biochemistry has always been a fertile, if not challenging, domain for formal knowledge representation and automated reasoning. Thousands of databases and hundreds of ontologies are publicly available, and there is a salient opportunity to mine these for discovery. In this talk, I will discuss our efforts to build a rich foundational network of ontology-annotated linked data, develop methods to intelligently retrieve content of interest, uncover significant biochemical associations, and pursue new avenues for drug repositioning. As the portfolio of semantic technologies continues to mature in terms of functionality, scalability, and an understanding of how to maximize their value, biochemical researchers will be strategically poised to pursue increasingly sophisticated projects aimed at improving our overall understanding of human health and disease.

CINF: Reimagining Libraries as Innovation Centers: Enabling, Facilitating & Collaborating throughout the Research Life Cycle 1:30pm - 4:45pm
Wednesday, March 16
Room 24C - San Diego Convention Center
Ye Li, Vincent Scalfani, Organizing
Vincent Scalfani, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 147: Leveraging the VIVO research networking system to facilitate collaboration and data visualization

Michaeleen Trimarchi, Danielle Bodrero Hoggan, danielle@scripps.edu

Kresge Library, The Scripps Research Institute, La Jolla, California, United States
VIVO is a Research Networking System (RNS) based on open source software originally developed at Cornell. The Scripps Research Institute's Kresge Library staff created the Scripps VIVO Scientific Profiles RNS with NIH grant support in 2009-2011 and have continued to enhance this Linked Open Data resource. At the start of the research life cycle, faculty can search VIVO to identify potential collaborators. When they are preparing grant proposals and submitting renewals, they can include VIVO's NIH Biosketch Lists and PubMed Papers links. VIVO's metadata allows for the automated creation of Map of Science and Co-author Network data visualizations based on journal articles in a faculty member's profile. In addition, Library staff reuse the data generated for VIVO publication ingest to create custom research collaboration network visualizations to support NIH training grant applications.

2:00pm-2:25pm CINF 148: Stanford profiles created to support the university’s scholarly community
Grace Baysinger, graceb@stanford.edu

Swain Chem & Chem Eng Library, Stanford University Libraries, San Jose, California, United States
In 2014, Stanford created the Stanford Profiles website to support faculty and to facilitate their research activities by extending to other schools, institutes and administrative units on campus the Community Academic Profiles (CAP) system that has been available to School of Medicine faculty since 2004. Currently, there are more than 18,000 profiles of faculty, graduate students, postdocs and staff in Stanford Profiles. The profiles are available through both a public and a Stanford-only view. A faculty profile may include biographical information, research interests, publications, courses taught, names of graduate and postdoctoral advisees, doctoral programs the faculty member is associated with as a PhD advisor, faculty collaborators, plus cross-references for faculty members by keywords. Data is pulled into profiles from the unit that generates the data (e.g. University Registrar for courses taught). A user is able to download a curriculum vitae created from profile data. Under a system developed by Stanford University Libraries, new publications are 'exported' to Stanford Profiles, where the citations and other relevant data are displayed in the profile owner's inbox for review. Once an individual has approved a publication, it will appear on his or her profile. Salesforce Chatter, a leading social-networking platform designed for the business context, is integrated into the Stanford-only view, making it easy to work closely with colleagues in a private, secure environment. The CAP Working Group, a collaborative group containing representatives from participating units, has overseen the development of Stanford Profiles. Data from Stanford Profiles can be shared with other Drupal-based websites on campus via APIs, thus saving time and duplication of effort.

2:25pm-2:50pm CINF 149: Managing researchers' reputations throughout the research life cycle
Linda Galloway, galloway@syr.edu, Anne Rauh

Syracuse University Libraries, Syracuse, New York, United States
Publicly documenting research impact using professional, academic, and social networks has become an increasingly important component of the research life cycle. At Syracuse University Libraries, STEM Librarians assist researchers in developing and managing their online portfolios. Tools like figshare, GitHub, SlideShare, Academia.edu, ResearchGate, Google Scholar, and more can be used in building one’s online reputation. From data to peer-reviewed journal articles, teaching researchers how to best promote their work will highlight their accomplishments and create opportunities for researcher and librarian interactions. This presentation will give an overview of networking tools and include descriptions of recommended services and outreach strategies. Attendees will learn the best tools and resources for managing their professional reputation and for helping researchers to do the same.

2:50pm-3:05pm Intermission
3:05pm-3:30pm CINF 150: Anatomy of the chemistry research enterprise in the academic sector: Serving the underserved in a large research institution
Leah McEwen, lrm1@cornell.edu

Clark Library, Cornell University, Ithaca, New York, United States
The Research Life Cycle (RLC) at any research institution involves a myriad of scientific and technical support roles, including instrumentation, data management, information access, and environmental health and safety. Researchers engage with many of these services and these providers in turn liaise across numerous disciplines and departments. All of these functions involve the use of technical information for analysis, interpretation and documentation. In supporting these other research support groups, libraries contribute more fully to the RLC and engage more broadly across the research community. This talk will outline outreach services developed for a variety of service groups on an academic university campus, including chemical analysis labs, chemistry IT services, Environmental Health & Safety and Occupational Medicine.

3:30pm-3:55pm CINF 151: Safety use case for chemical safety information
Ralph Stuart, secretary@dchas.org

Dept of Env Hlth Safety, Keene State College, Keene, New Hampshire, United States
Since 2010, increasing interest in chemical safety in general and laboratory safety in particular has led to the development of new tools for risk assessment of chemical use in the laboratory. In 2015, the NFPA issued a new standard for chemical safety in the teaching setting. This presentation will describe how these tools can be used to support prudent planning of laboratory research and teaching. The safety professional's and librarian's role in using these tools will be described and sources of chemical safety information highlighted.

3:55pm-4:20pm CINF 152: PubChem BioAssay: Grow with the community
Yanli Wang, ywang@ncbi.nlm.nih.gov

Building 38a, Room 5s506, Bethesda, Maryland, United States
The PubChem BioAssay repository was set up in 2004 by the National Center for Biotechnology Information (NCBI). While initially serving as an archival information system for small molecule bioactivity data from HTS experiments, the BioAssay database was further developed over the years to support depositions of small molecule and RNAi research results that are associated with publications. The data content in PubChem BioAssay is contributed by world-wide screening facilities, research laboratories, as well as literature curation projects. The database has now received over 1,000,000 bioassay record submissions (BioAssay accession, AID), containing 200 million bioactivity outcomes against tens of thousands of protein and gene targets. PubChem was created to meet the community’s need for data sharing, and more than a decade of tireless development has been supported, driven, and stimulated by the participation and enthusiasm of the community. This presentation will describe the efforts of the community and PubChem in working together to support and advocate data sharing and open access. BioAssay may be accessed at http://www.ncbi.nlm.nih.gov/pcassay/. Additional retrieval and data analysis tools are available at http://pubchem.ncbi.nlm.nih.gov. Bioassay data may be submitted using the PubChem Upload tool at: http://pubchem.ncbi.nlm.nih.gov/upload. PubChem provides an embargo mechanism to assist data deposition associated with manuscript submission.

4:20pm-4:40pm Discussion
4:40pm-4:45pm Concluding Remarks

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 8:15am - 11:55am
Thursday, March 17
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
8:15am-8:20am Introductory Remarks
8:20am-8:45am CINF 153: Linking chemical and non-chemical data in structured product labeling
Yulia Borodina, yulia.borodina@fda.hhs.gov, Bill Hess, CoCo Tsai, Pete Phong, Lonnie Smith

FDA, Catonsville, Maryland, United States
Structured Product Labeling (SPL) is a document markup standard approved by Health Level Seven (HL7) and adopted by FDA as a mechanism for exchanging product and facility information. Product information provided by companies in SPL format may be accessed from the FDA Online Label Repository (labels.fda.gov) and the National Library of Medicine DailyMed web site (dailymed.nlm.nih.gov). The product information indexing initiative has the goal of enhancing access to the electronic product information provided by the companies. Indexing refers to the creation by FDA of one or more files with machine-readable annotations that can be linked to the product SPL provided by the company. FDA maintains and publishes SPL Indexing Files for Pharmacologic Class, Substance, Product Concept, Biological Drug Substance, and Billing Units. Data from the Indexing Files can be linked to data in both SPL resources and external resources via chemical and non-chemical identifiers.

8:45am-9:10am CINF 154: Ginas: A global effort to define and index substances in medical products
Tyler Peryea1, tylerperyea@gmail.com, Lawrence Callahan2

1 Informatics, NIH NCATS, North Bethesda, Maryland, United States; 2 FDA, Silver Spring, Maryland, United States
Chemical databases have a rich history in the recent past. The development of systematic nomenclature, chemical data formats, and identity standards has allowed chemical data to become increasingly definable and searchable. However, the scope of definitional chemical databases, to date, has remained largely on small well-defined organic molecules. Large molecules and complex poly-disperse substances are often neglected, under-standardized, or entirely ignored. Historically, the scope of chemical databases has been slow to expand into other substance classes due to a lack of available standards, a lack of software tools, and a lack of motivating use cases for deeply describing such materials. The increasing need to track and monitor medical products of all forms on the global marketplace, however, has motivated the creation of the ISO IDMP standards, with ISO 11238 describing a strategy for encoding substance information of diverse forms and origins: from simple chemicals, to complex polymers, extending even to plant and animal material. ginas is a global effort to implement the ISO IDMP substance standard with useful and distributable software tools, with the aim of facilitating the global interchange of well-defined substance information from 'lithium' to 'leeches'.

9:10am-9:35am CINF 155: TranSMART Foundation: An open-data and open-science platform to integrate molecular and clinical data in translational research and precision medicine
Rudolph Potenzone, rudypoten@me.com

tranSMART Foundation, Redmond, Washington, United States
The tranSMART Foundation is a not-for-profit organization that fosters the evolution of the open source tranSMART Platform in support of translational research. With active research in over 100 labs worldwide, the Platform is used daily by scientists in industry, research foundations, academic labs and medical schools. Molecular data, genomics information, and proteomics experiments are stored together with anonymized patient data, outcomes, time series and wearable sensor data, and our system allows for subsetting, query and routine analyses. Advanced and complex analytics and visualization are possible through our API, and many examples of interconnectivity are available, such as with R, Spotfire and Genome browser.

In this talk, we will also cover an interesting approach to advance our understanding of disease and diagnostic and treatment options. We have held our first Datathon and others are in planning. A Datathon brings together multiple data repositories, often ones that have never been used in concert previously, within the tranSMART Platform. Key analytical tools and extensions that could be suited to the particular topic of the Datathon are also gathered into a single platform instance. Finally, a team of human experts is assembled that includes data scientists and machine learning practitioners along with experienced researchers from the particular disease area of interest. Over the course of three days, this team works with the Platform and the assembled data to attempt to learn new relationships and form novel hypotheses that can form the basis of future research efforts. Results from these Datathon sessions will be shared.

9:35am-10:00am CINF 156: Leveraging RxNorm and drug classifications for analyzing prescription datasets
Olivier Bodenreider, obodenreider@mail.nih.gov

Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, Maryland, United States
Prescription datasets (e.g., claims data obtained from Medicare Part D) represent a rich source of information for studying frequencies of prescription and co-prescription (i.e., concomitant medications). We demonstrate that RxNorm supports the conversion of various kinds of identifiers for clinical drugs (e.g., National Drug Code and First DataBank) to RxCUIs, the identifiers required for exchanging drug information as part of the Meaningful Use incentive program. Moreover, drug classes provide a convenient way of analyzing prescription datasets at a higher level (e.g., by aggregating specific medications, such as Lipitor 10 MG oral tablet, into the class statins). RxNorm is well integrated with many drug classification systems, such as the Anatomical Therapeutic Chemical (ATC) classes, and contributes to the class-level analysis of prescription datasets.
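
The class-level aggregation described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lookup tables and every identifier in them are hypothetical placeholders, not real NDC codes or RxCUIs, and a real analysis would populate the mappings from RxNorm and a classification such as ATC.

```python
from collections import Counter

# Hypothetical lookup tables with placeholder identifiers (not real codes).
NDC_TO_RXCUI = {
    "ndc-0001": "rxcui-A",  # stands in for one statin product package
    "ndc-0002": "rxcui-A",  # a second package of the same clinical drug
    "ndc-0003": "rxcui-B",  # stands in for a different statin product
    "ndc-0004": "rxcui-C",  # stands in for a non-statin product
}
RXCUI_TO_CLASS = {
    "rxcui-A": "statins",
    "rxcui-B": "statins",
    "rxcui-C": "biguanides",
}


def class_counts(claims):
    """Count prescriptions per drug class; codes with no mapping are tallied as 'unmapped'."""
    counts = Counter()
    for ndc in claims:
        rxcui = NDC_TO_RXCUI.get(ndc)
        drug_class = RXCUI_TO_CLASS.get(rxcui, "unmapped")
        counts[drug_class] += 1
    return counts


claims = ["ndc-0001", "ndc-0003", "ndc-0002", "ndc-0004", "ndc-9999"]
counts = class_counts(claims)
print(dict(counts))
```

Normalizing to a single identifier first means that different package codes for the same clinical drug collapse before the class lookup, so the class totals do not depend on which source vocabulary a claim used.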

10:00am-10:15am Intermission
10:15am-10:40am CINF 157: Evolution of digital and semantic chemistry at Southampton
Jeremy Frey1, j.g.frey@soton.ac.uk, Simon Coles2, Colin Bird1

1 University of Southampton, Southampton, United Kingdom; 2 University of Southampton, Hampshire, United Kingdom
We take a historical view of e-Science and e-Research (alternatively called Cyber-Infrastructure) developments within the range of Chemical Sciences at the University of Southampton (UK). We discuss the development of several stages of the evolving data ecosystem as Chemistry moves into the digital age of the 21st Century. We cover our research on aspects of the representation of chemical information in the context of the world wide web (WWW) and its semantic enhancement (the Semantic Web) and illustrate this with the example of the representation of quantities and units within the Semantic Web. We explore the changing nature of laboratories as computing power becomes increasingly powerful and pervasive and specifically look at the function and role of electronic or digital research notebooks. Having focussed on the creation of chemical data and information in context, we highlight the use and reuse of this data as facilitated by the features provided by digital repositories and their importance in facilitating the exchange of chemical information, touching on the issues of open and/or intelligent access to the data.

10:40am-11:05am CINF 158: Implementing chemistry platform for OpenPHACTS: Lessons learned
Colin Batchelor, Alexey Pshenichnov, Jon Steele, Valery Tkachenko, tkachenkov@rsc.org

Royal Society of Chemistry, Rockville, Maryland, United States
The Open PHACTS project delivers an online platform integrating a wide variety of data from across chemistry and the life sciences and an ecosystem of tools and services to query this data in support of pharmacological research, turning the semantic web from a research project into something that can be used by practising medicinal chemists in both academia and industry. In the summer of 2015 it was the first winner of the European Linked Data Award. At the Royal Society of Chemistry we have provided the chemical underpinnings to this system and in this talk we review its development over the past five years. We cover both our early work on semantic modelling of chemistry data for the Open PHACTS triplestore and more recent work building an all-purpose data platform, for which the Open PHACTS data has been an important test case. We also discuss what has worked well, what is missing and where this is likely to go in future.

11:05am-11:30am CINF 159: Representation of molecular structures and related computations on the semantic web: A universal data model and its ontology
Mirek Sopek2, sopek@makolab.com, Stuart Chalk1, Neil Ostlund2, Jacob Bloom2

1 Department of Chemistry, University of North Florida, Jacksonville, Florida, United States; 2 Chemical Semantics, Inc., Gainesville, Florida, United States
Chemical Semantics, Inc. is a company with a mission of bringing the semantic web to computational chemistry, with future goals covering chemical results from other areas. The company has built a universal portal that enables computational chemists to publish results of their computations on semantic web servers (powered by semantic triple stores) holding RDF data.

This presentation will report on the definition, implementation and evaluation of a new data model based on semantic web standards. This new model further exploits the RDF data model for efficient encoding of molecular structures and basic results of computational chemistry experiments. Various serialization methods were tested, including Turtle and JSON-LD.

The model is exceptionally flexible and allows for various types of chemical structure representation (e.g. Cartesian, fractional or based on Z-matrix). It enables the encoding of various structural units like residues for polymers and biopolymers and groups. Efficient encoding of bonding information enables fast substructure searches using standard tools like SPARQL and application in the domain of cheminformatics. The model offers maximum possible flexibility, allowing users to add their own data without destroying the readability of the core elements.
The model enables practitioners to interact with the data in a much more flexible way using a variety of current programming tools and languages.
The most important aspect of the model is its fully semantic character, i.e. encoding the meaning of the data in the data itself through reference to the new edition of the Gainesville Core ontology [1].

[1] Neil Ostlund, Mirek Sopek, Proceedings of the 6th International Workshop on Semantic Web Applications and Tools for Life Sciences, Edinburgh, UK, December 10, 2013. http://ceur-ws.org/Vol-1114
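
To make the idea of substructure search over bonding triples concrete, here is a minimal pure-Python sketch. It is not the authors' model: the predicate names (`hasAtom`, `element`, `bondedTo`) and the two toy molecules are hypothetical and are not Gainesville Core terms, hydrogens are omitted, and each bond is stored once in one direction. The `query` function joins triple patterns in the way a SPARQL basic graph pattern would.

```python
# Toy triple set; predicates and identifiers are illustrative assumptions only.
TRIPLES = {
    ("mol:methanol", "hasAtom", "a1"),
    ("mol:methanol", "hasAtom", "a2"),
    ("a1", "element", "C"),
    ("a2", "element", "O"),
    ("a1", "bondedTo", "a2"),
    ("mol:ethane", "hasAtom", "b1"),
    ("mol:ethane", "hasAtom", "b2"),
    ("b1", "element", "C"),
    ("b2", "element", "C"),
    ("b1", "bondedTo", "b2"),
}


def match(pattern, bindings):
    """Yield extended bindings for one triple pattern; terms starting with '?' are variables."""
    for triple in TRIPLES:
        new = dict(bindings)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound to something else.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new


def query(patterns):
    """Join the patterns left to right, carrying variable bindings forward."""
    results = [{}]
    for pattern in patterns:
        results = [b for binding in results for b in match(pattern, binding)]
    return results


# Substructure: a carbon atom bonded to an oxygen atom within one molecule.
c_o_bond = [
    ("?mol", "hasAtom", "?c"),
    ("?c", "element", "C"),
    ("?c", "bondedTo", "?o"),
    ("?o", "element", "O"),
]
molecules = sorted({b["?mol"] for b in query(c_o_bond)})
print(molecules)
```

In a real triple store the same four patterns would be written as a SPARQL `SELECT` query, with the store's indexes doing the joins.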

11:30am-11:55am CINF 160: GlyTouCan international glycan structure repository using semantic web technologies
Issaku Yamada1, issaku@noguchi.or.jp, Kiyoko Aoki-Kinoshita2,3, Nobuyuki Aoki2, Daisuke Shinmachi2, Masaaki Matsubara1, Akihiro Fujita2, Shinichiro Tsuchiya2, Shujiro Okuda4, Noriaki Fujita3, Hisashi Narimatsu3

1 The Noguchi Institute, Tokyo, Japan; 2 Graduate School of Engineering, Soka University, Tokyo, Japan; 3 Research Center for Medical Glycoscience, AIST, Tsukuba, Japan; 4 Graduate School of Medical and Dental Sciences, Niigata University, Niigata, Japan
Glycans are known as the third major biomolecules, next to DNA and proteins, and they have been found to be involved in various important biological functions. The structure of glycans, however, differs greatly from DNA and proteins in that they are branched, as opposed to linear sequences of amino acids or nucleotides. Therefore, the storage of glycan information in databases, let alone their curation, has been a difficult problem.
This has made efforts to integrate glycan data between different databases difficult, and has made necessary an international repository for glycan structures in which unique accession numbers are assigned to every identified glycan structure. To that end, an international team of developers and glycobiologists has collaborated to develop this repository, called GlyTouCan, which was released this year and is freely available at http://glytoucan.org/. It provides a centralized resource for depositing glycan structures, compositions and topologies, and for retrieving accession numbers for each of these registered entries.
GlyTouCan has been developed based on Semantic Web technologies, providing links to other major glycan databases such as GlycomeDB and BCSDB, using RDF. The RDF data of linked resources in GlyTouCan use GlycoRDF, an ontology to represent glycomics data. Moreover, the glycan structure representation called WURCS is used as the main format for storing glycans, thus ensuring uniqueness of even ambiguous glycan structures while representing them as linear strings. This allows for efficient searching of the repository for existing structures because a simple text comparison can be used. In addition, an enhancement of WURCS as an RDF representation allows a glycan structure to be searched using a SPARQL query.
As a result, GlyTouCan enables researchers to reference glycan structures simply by accession number rather than by chemical structure or text string, a practice that has hindered the integration of glycomics databases in the past. Moreover, GlyTouCan is supported by the MIRAGE initiative, which recommends that its accession numbers be used when reporting glycomics experiments in publications that include identified glycan structures. This will also allow easier identification of glycan structures in publications.
Thus, in the future, not only can GlyTouCan serve as a central registry, but it can serve as a portal to search for glycan-related publications as well as other biological information.

CINF: General Papers 9:00am - 11:50am
Thursday, March 17
Room 24C - San Diego Convention Center
Elsa Alvaro, Erin Davis, Organizing
Elsa Alvaro, Erin Davis, Presiding
9:00am-9:05am Introductory Remarks
9:05am-9:35am CINF 161: Progress toward a conformational database for sesquiterpene reaction pathways
Jordan Zehr2, jordan.zehr001@albright.edu, Dean Tantillo1, Christian Hamann2, chamann@albright.edu

1 Dept Chemistry, UC Davis, Davis, California, United States; 2 Chemistry & Biochemistry, Albright College, Reading, Pennsylvania, United States
The transformation of the bisabolyl cation into a range of sesquiterpene natural products has been described in the literature (Hong, YJ; Tantillo, DJ. J. Am. Chem. Soc. 2014, 136, 2450−2463). Hong and Tantillo proposed unifying mechanistic pathways by which the monocyclic bisabolyl cation is converted into mono-, di- and tricyclic molecules containing an array of interesting structural features including 3- to 7-membered rings, fused rings, spiro centers, geometric and stereoisomers, and conjugated dienes. The sesquiterpene products of these pathways include barbatene, bazzanene, chamigrene, chamipinene, cumacrene, cuprenene, dunniene, isobazzanene, iso-γ-bisabolene, isochamigrene, laurene, microbiotene, sesquithujene, sesquisabinene, thujopsene, trichodiene, and widdradiene. Now that the chemical steps for the pathways leading to these products have been established, we are focused on establishing a conformational library, in database format, of sesquiterpene carbocation intermediates and products. We propose that analysis of this database will provide insight into the detailed stereoelectronic requirements of these chemically complex carbocation cascades.

9:35am-10:05am CINF 162: OMPOL: Visualization of large chemical spaces
Peter Corbett, Colin Batchelor, Alexey Pshenichnov, Valery Tkachenko, tkachenkov@rsc.org

Royal Society of Chemistry, Rockville, Maryland, United States
In the last few years the number and size of chemical databases have been steadily increasing, as has the complexity of the information residing in those databases, creating truly multidimensional chemical spaces. Yet the most common user interface approach is still based on a search-and-browse workflow, which essentially prevents proper navigation through such databases and hides data patterns that may belong to other dimensions. As we at the Royal Society of Chemistry are building a chemical database service, it is potentially useful to be able to visualize large chemical spaces, ranging in size from tens of thousands to tens of millions of compounds. Dimensionality reduction techniques such as PCA have been used to produce two-dimensional displays of large chemical spaces, via the production of scatterplots. Standard chart-plotting libraries allow interactive scatterplots to be produced, but do not scale well to large numbers of data points. Our new visualisation tool, OMPOL, is a browser-based tool for displaying and interacting with these data sets, allowing people to smoothly and responsively pan and zoom these plots, view the names and structures associated with the data points, select regions of chemical space and find typical and atypical members of those regions.

10:05am-10:35am CINF 163: Comparison of machine learning algorithms for the prediction of critical values and acentric factors for pure compounds
Wendy Carande, wendy.carande@nist.gov, Andrei Kazakov, Kenneth Kroenlein

NIST, Boulder, Colorado, United States
Speed and accuracy are primary factors to consider when choosing a machine learning algorithm for prediction of thermophysical properties. Individually, swift computational methods often incur large deviations between predicted and experimental values, but ensemble methods can make up for this shortcoming. We propose a boosting method in which multiple “weak learners” are combined to create a stronger predictive algorithm, and we present predictions for critical temperature, critical pressure, and acentric factor. Our training set for a given compound consists of the 15 most structurally similar (as determined by the Tanimoto metric) compounds for which we have experimental data. Nineteen predictive models, each with automated feature selection, are combined to construct our ensemble. These methods include multivariate adaptive regression spline models, linear models (using ridge regression, lasso, elastic net, and partial least squares strategies), rule-based model trees with nearest-neighbor corrections, and single-variable quadratic models. The median of the prediction pool provides a property estimate for the target compound, and the median absolute deviation (MAD) of the predictions provides an uncertainty measure. We find that combining these methods performs favorably against any individual method in the prediction algorithm pool.
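
The neighbor selection and the pooling step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fingerprint representation (sets of "on" bit indices), the function names, and all numbers in the usage note are assumptions made purely for illustration.

```python
from statistics import median


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def nearest_neighbors(target, library, k=15):
    """Names of the k library compounds most similar to the target fingerprint.

    `library` maps a compound name to its fingerprint (hypothetical layout).
    """
    ranked = sorted(library, key=lambda name: tanimoto(target, library[name]),
                    reverse=True)
    return ranked[:k]


def pool_estimate(predictions):
    """Median of the model predictions and their median absolute deviation."""
    center = median(predictions)
    mad = median(abs(p - center) for p in predictions)
    return center, mad
```

As a usage illustration with made-up values, `pool_estimate([500.0, 510.0, 505.0, 650.0, 495.0])` returns `(505.0, 5.0)`: the single outlying prediction of 650 shifts neither the median estimate nor the MAD-based uncertainty, which is the robustness that pooling by the median is meant to provide.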

10:35am-10:50am Intermission
10:50am-11:20am CINF 164: Optimal superposition of arbitrarily ordered molecules using the Kuhn-Munkres algorithm
Berhane Temelso1, berhane.temelso@bucknell.edu, Joel Mabey1, Toshiro Kubota3, George Shields2

1 701 Moore Avenue, Bucknell University, Lewisburg, Pennsylvania, United States; 2 Deans Office, 113 Marts Hall, Bucknell University, Lewisburg, Pennsylvania, United States; 3 Mathematical Sciences, Susquehanna University, Selinsgrove, Pennsylvania, United States
When assessing the similarity between two isomers whose atoms are ordered identically, one typically translates and rotates their Cartesian coordinates for best alignment and computes the pairwise root mean square distance (RMSD). However, if the atoms are ordered differently, it is necessary to find the best ordering of the atoms and check for chirality before calculating a meaningful pairwise RMSD. The exponential scaling of the computational cost of finding the best ordering makes it too expensive for any system with more than ten atoms. We report the use of the Kuhn-Munkres matching algorithm to reduce the cost of finding the best ordering from exponential to polynomial scaling. This allows the scheme to be applied to any arbitrary system in a reasonably short time. The implementation of this approach and its application to systems ranging from molecular clusters to large peptides will be demonstrated.
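
The reordering step can be sketched in plain Python as follows. This is only an illustration of Kuhn-Munkres matching under stated assumptions, not the authors' code: it assumes the two structures are already roughly aligned, uses squared interatomic distance as the assignment cost, forbids matches between different elements with a large penalty (`MISMATCH`, an arbitrary value), and omits the rotation and chirality checks mentioned in the abstract. The function names and the atom layout `(element, (x, y, z))` are hypothetical.

```python
import math

MISMATCH = 1e9  # arbitrary large cost that prevents matching different elements


def hungarian(cost):
    """Kuhn-Munkres assignment for a square cost matrix, O(n^3).

    Returns match, where match[i] is the column assigned to row i.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (n + 1)   # column potentials
    p = [0] * (n + 1)     # p[j]: row currently assigned to column j (1-based)
    way = [0] * (n + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip assignments along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    match = [0] * n
    for j in range(1, n + 1):
        match[p[j] - 1] = j - 1
    return match


def best_ordering(ref, mol):
    """Permutation perm such that mol[perm[i]] is matched to ref[i]."""
    cost = [[math.dist(r[1], m[1]) ** 2 if r[0] == m[0] else MISMATCH
             for m in mol] for r in ref]
    return hungarian(cost)


def rmsd(ref, mol, perm):
    """RMSD between ref and mol after reordering mol by perm."""
    total = sum(math.dist(r[1], mol[j][1]) ** 2 for r, j in zip(ref, perm))
    return math.sqrt(total / len(ref))
```

The assignment runs in cubic time in the number of atoms, in contrast to enumerating all permutations. In a complete workflow one would presumably alternate this matching with a rigid-body alignment, which this sketch leaves out.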

11:20am-11:50am CINF 165: Predicting drug-induced hepatic systems' toxicity by integrating transporter interaction profiles

Eleni Kotsampasakou, eleni.kotsampasakou@univie.ac.at, Gerhard Ecker

Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
Systems pharmacology studies that utilize large data sets, such as protein–protein interaction networks and the FDA adverse event reports, can enhance the understanding of drug adverse events and pinpoint off-targets [1]. In this context, drug-induced liver injury (DILI) is a major challenge for drug development, as it comprises one of the main causes of attrition [2]. There are several indications in literature associating hepatic transporter inhibition with manifestations of DILI, such as OATP1B1 and 1B3 with hyperbilirubinemia [3] and BSEP with cholestasis [4].
To this end, we developed statistical classification models for predicting hepatotoxic endpoints, such as hyperbilirubinemia, cholestasis and DILI, by combining the physicochemical and structural properties of compounds with their hepatic transporter inhibition profiles. For the latter task, we used our in-house transporter inhibition models for BSEP, P-glycoprotein, BCRP, OATP1B1 and OATP1B3. Several meta- and base-classifiers were investigated and the classification models obtained were of reasonable performance for all three endpoints.
For the case of hyperbilirubinemia, OATP1B1 and 1B3 inhibition profiles are evaluated as important descriptors, even though there is no significant improvement in the statistical performance of the model when transporter information is used. In contrast, for cholestasis the use of transporter inhibition profiles significantly improves the model’s performance, although the individual transporters are not ranked high in comparison to other physicochemical descriptors. Finally, for general DILI prediction, descriptors annotating transporter inhibition do not influence the model’s performance. In addition, their importance is low compared to other physicochemical descriptors, such as lipophilicity. This suggests that for the entire liver system, there is no clear association pattern with transporters, at least not for the particular ones investigated.

Acknowledgements
The research leading to these results has received support from the Innovative Medicines Initiative Joint Undertaking under grant agreement n°115002 (eTOX), as well as from the Austrian Science Fund, grant F3502.

References
1. Berger, S.I. et al., Interdiscip Rev Syst Biol Med 2011, 3, (2), 129–135
2. O’Brien, P.J. et al., Arch Toxicol 2006, 80, 580–604
3. Chang, J.H. et al., Mol Pharm 2013, 10, (8), 3067–3075
4. Vinken, M. et al., Toxicol Sci 2013, 136, (1), 97–106

CINF: Chemistry, Data & the Semantic Web: An Important Triple to Advance Science 1:30pm - 4:20pm
Thursday, March 17
Room 25B - San Diego Convention Center
Evan Bolton, Stuart Chalk, Organizing
Evan Bolton, Stuart Chalk, Presiding
1:30pm-1:35pm Introductory Remarks
1:35pm-2:00pm CINF 166: Ontology for biomedical investigations (OBI)
Bjoern Peters, bpeters@lji.org, James Overton, Randi Vita, OBI consortium

Division of Vaccine Discovery, La Jolla Institute for Allergy & Immunology, La Jolla, California, United States
The Ontology for Biomedical Investigations (OBI) provides terms with precisely defined meaning to describe all aspects of how biomedical investigations are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. OBI covers all phases of the investigation process, such as planning, execution and reporting. It represents information and material entities that participate in these processes, as well as roles and functions. Prior to OBI, it was not possible to use a single internally consistent resource that could be applied to multiple types of experiments for these applications. OBI has made this possible by creating terms for entities involved in biological and medical investigations and by importing parts of other biomedical ontologies such as GO, ChEBI and PATO without altering their meaning. OBI is being used in a wide range of projects covering genomics, multi-omics, immunology, and catalogs of services. The OBI project is an open cross-disciplinary collaborative effort, encompassing multiple research communities from around the globe. The OBI Consortium maintains a web resource (http://obi-ontology.org) providing details on the people, policies, and issues being addressed in association with OBI. The current release of OBI is available at http://purl.obolibrary.org/obo/obi.owl.

2:00pm-2:25pm CINF 167: Protein ontology: Fostering connections in chemical biology
Darren Natale1,2, dan5@georgetown.edu

1 Georgetown University Medical Center, Washington, District of Columbia, United States; 2 PRO Consortium, Washington, District of Columbia, United States
Our understanding of The Way Things Work advances when we are able to make connections between individual observations and the entities such observations are about. Such understanding is especially facilitated when we have the ability to say precisely what is known, without overstating or understating, about precisely what entity. With this notion in mind, the Protein Ontology (PRO) was developed to provide protein entity representation at key levels of abstraction, ranging from general protein families down to specific protein PTM forms. Here, we describe PRO, its development, and its place in the network of knowledge using examples from the fields of proteomics, glycobiology, and pharmacology.

2:25pm-2:50pm CINF 168: Ontologies for classifying and modeling drug discovery data
Stephan Schuerer1,3, stephan.schurer@gmail.com, Asiyah Yu Lin1, Saurabh Mehta1, Hande Kücük McGinty2, Qiong Cheng3, Amar Koleti3, Nooshin Zadeh1, Dusica Vidovic1,3

1 Pharmacology, University of Miami, Miami, Florida, United States; 2 Computer Science, University of Miami, Miami, Florida, United States; 3 Center for Computational Science, University of Miami, Miami, Florida, United States
Several research consortia and countless projects in pharmaceutical companies generate, organize, and analyze small molecule drug screening data. Such consortia supported by the NIH Common Fund include the (now past) Molecular Libraries Program (MLP), and currently the Illuminating the Druggable Genome (IDG) and the Library of Integrated Network-based Cellular Signatures (LINCS) projects. A large component of the MLP program was the development of chemical probes to study a wide variety of biological questions. This program generated new assay technologies, huge amounts of chemical biology screening data and over 350 chemical probes. The observation of an apparent strong bias of drug discovery research and development efforts towards targets that are already well studied motivated the IDG program to prioritize novel drug targets and catalyze the development of chemical entities that target understudied proteins in these families. The LINCS program has a systems biology focus. The project creates a reference 'library' of molecular signatures, such as changes in gene expression and other cellular phenotypes that occur when cells are exposed to a variety of perturbing agents, and computational tools for data integration, access, and analysis. Dimensions of LINCS signatures include the biological model system (cell type), the perturbation (e.g. small molecules) and the assays that generate diverse phenotypic profiles.
Data integration is a common and critical challenge in these and other projects, and it requires common metadata standards and conventions for data representation and exchange. Towards the goal of creating common data standards to represent data in these and other projects that produce data relevant for drug discovery, and to support software tools that we and others have been building as part of these projects, we have been developing ontologies including the BioAssay Ontology (BAO) and the Drug Target Ontology (DTO). The goal of these ontologies is to enable the knowledge-based classification of diverse datasets into categories that facilitate re-use and context-specific integration of these data, for example to develop predictive models or to quickly explore and correlate different datasets.
BAO, DTO and other ontologies provide a robust framework to represent, integrate, model, and query diverse drug discovery data generated in different projects.

2:50pm-3:05pm Intermission
3:05pm-3:30pm CINF 169: Immune Epitope Database (IEDB) and its use of formal ontologies
Randi Vita, rvita@liai.org, James Overton, Bjoern Peters

Division of Vaccine Discovery, La Jolla Institute for Allergy & Immunology, La Jolla, California, United States
The Immune Epitope Database (IEDB) is a resource provided by the NIH/NIAID to make all published experimental data regarding immune epitopes freely available to the scientific public. An immune epitope is the specific portion of a pathogen, allergen or autoantigen that is recognized by antibodies or T cells of the immune system. Epitopes are most often linear peptides, but can also be carbohydrates, lipids, metals, or other structures. The IEDB represents experimental assays demonstrating the binding of an epitope-specific adaptive immune receptor (TCR, antibody, or MHC molecule) to an antigen in a consistent and easily searchable manner by harnessing established biological ontologies for its data representation and creating new ontologies when needed. Formal ontologies provide standardized nomenclature, hierarchical relationships, and logical definitions. They also provide a simple mechanism to link disparate resources and allow sophisticated queries across these resources.

3:30pm-3:55pm CINF 170: PubChemRDF: Semantic annotation and search
Gang Fu1, gangfu1982@gmail.com, Evan Bolton2

1 NCBI, NIH, Rockville, Maryland, United States; 2 NCBI, NIH, Bethesda, Maryland, United States
PubChem is an open repository for chemical substance descriptions, biological activities and biomedical annotations. PubChem databases have been cross-referenced with other National Center for Biotechnology Information (NCBI) resources, such as PubMed, Gene, and Biosystems. Semantic Web standards offer a well-defined syntax for the formal representation of the PubChem knowledgebase, and Semantic Web technologies facilitate querying of and reasoning over PubChem data. The PubChemRDF project focused on the semantic annotation of PubChem databases, which was accomplished by using standardized ontologies that promise high compatibility and consistency with existing cheminformatics and bioinformatics resources. Semantic annotations may help PubChem data to be shared, reused, and analyzed across chemical, biological, and life science domains. PubChemRDF provides a new ability for researchers to utilize schema-less databases with rule-based reasoners to search and analyze data. We will demonstrate how to combine SPARQL queries and Description Logic (DL) queries for question answering using PubChemRDF data.

3:55pm-4:20pm CINF 171: Generic scientific data model and ontology for representation of chemical data
Stuart Chalk, schalk@unf.edu

Department of Chemistry, University of North Florida, Jacksonville, Florida, United States
The current movement toward openness and sharing of data is likely to have a profound effect on the speed of scientific research and the complexity of questions we can answer. However, a fundamental problem with currently available datasets (and their metadata) is heterogeneity in terms of implementation, organization, and representation.

To address this issue we have developed a generic scientific data model (SDM) to organize and annotate raw and processed data, and the associated metadata. This paper will present the current status of the SDM, implementation of the SDM in JSON-LD, and the associated scientific data model ontology (SDMO). Example usage of the SDM to store data from a variety of sources will be discussed, along with initial efforts to develop SPARQL queries, based on the SDMO, that allow federated search across different datasets.
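
To give a feel for what annotating a data point in JSON-LD can look like, here is a minimal Python sketch. It does not reproduce the SDM or the SDMO: the namespace `https://example.org/sdm#`, the term names, the `make_record` helper, and the values are all hypothetical placeholders. Only the JSON-LD keywords `@context`, `@id`, and `@type` come from the JSON-LD standard itself.

```python
import json

# Hypothetical namespace; the actual SDM/SDMO terms are not given in the abstract.
EX = "https://example.org/sdm#"


def make_record(record_id, quantity, value, unit, source):
    """Build one annotated data point as a JSON-LD document (illustrative only)."""
    return {
        "@context": {
            "ex": EX,
            "quantity": "ex:quantity",
            "value": "ex:value",
            "unit": "ex:unit",
            "source": "ex:source",
        },
        "@id": record_id,
        "@type": "ex:Datapoint",
        "quantity": quantity,
        "value": value,
        "unit": unit,
        "source": source,
    }


# Placeholder content, not real measurement data.
record = make_record("ex:datapoint/1", "melting point", 0.0, "degC",
                     "illustrative value")
text = json.dumps(record, indent=2)
```

Because the context maps each plain key to a term in a shared vocabulary, records produced by different groups could in principle be merged and queried together, which is the kind of interoperability the abstract is aiming for.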

Technical Program with Slides (new)

Here is a complete list of presentations with slides if available
  • 1) 'Relational database file can take us beyond the plain text file format'
    T O'Donnell1
    1. gNova, San Diego, CA, United States.
     
  • 3) 'Rule-based capture/storage of scientific data from PDF files and export using a generic scientific data model'
    Stuart J. Chalk1, Audrey Bartholomew1, Bashar Baraz1, John Turner1
    1. Department of Chemistry, University of North Florida, Jacksonville, FL, United States.
     
  • 4) 'Building linked-data, large-scale chemistry platform: Challenges, lessons, and solutions'
    Valery Tkachenko1, Alexey Pshenichnov1, Aileen Day1, Colin Batchelor1, Peter Corbett1
    1. Royal Society of Chemistry, Rockville, MD, United States.
     
  • 5) 'Towards a functional database for enzyme data: STRENDA DB'
    Carsten Kettner1, Martin G. Hicks1
    1. Beilstein Institut, Frankfurt/Main, Germany.
     
  • 6) 'Virtues and vicissitudes of curatorial data wrangling: The guide to pharmacology experience'
    Christopher Southan1
    1. Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden.
     
  • 7) 'Finding better aim at a moving target by exploiting structural data'
    Marcel Verdonk1
    1. Astex Pharmaceuticals, Cambridge, United Kingdom.
     
  • 8) 'Bridging the dimensions: Seamless integration of 3D structure-based design and 2D structure-activity relationships to guide medicinal chemistry'
    Marcus Gastreich1, Matthew D. Segall3, Carsten Detering2, Edmund Champness3, Christian Lemmen1
    1. BioSolveIT, Sankt Augustin, Germany. 2. BioSolveIT Inc, Bellevue, WA, United States. 3. Optibrium Ltd, Cambridge, United Kingdom.
     
  • 9) 'Predicting binding affinity doesn't work, or does it?'
    Christian Lemmen1
    1. BioSolveIT, Sankt Augustin, Germany.
     
  • 10) 'Structural knowledge by prediction: Crystal structure prediction tests and progress'
    Colin Groom1, Jason Cole1, Anthony M. Reilly1
    1. Cambridge Crystallographic Data Centre, Cambridge, United Kingdom.
     
  • 11) 'Using physicochemical data and predictions in the risk assessment of mutagenic impurities'
    Susanne Stalford1
    1. Lhasa Limited, Leeds, United Kingdom.
     
  • 12) 'Profile-QSAR generation 2: Perfection, the enemy of the good?'
    Valery R. Polyakov1, Eric J. Martin2, Li Tian1
    1. GDC, NIBR, Lafayette, CA, United States. 2. Computational Chemistry, Novartis, El Cerrito, CA, United States.
     
  • 13) 'Open data is not enough: A look at the Research Data Alliance'
    Mark Parsons1
    1. Research Data Alliance, Boulder, CO, United States.
     
  • 14) 'Responses to the data revolution: CODATA on policy, data science, and capacity building'
    Simon Hodson1, John Rumble2
    1. CODATA, Paris, France. 2. R&R Data Services, Gaithersburg, MD, United States.
     
  • 15) 'Moving research forward with persistent identifiers and services'
    Patricia Cruse1
    1. DataCite, Berkeley, CA, United States.
     
  • 16) 'Discoverability and reusability of FAIR chemistry research data as a key outcome of registering persistent identifiers and standardised metadata with DataCite'
    Henry S. Rzepa1, Matthew J. Harvey2, Andrew Mclean3
    1. Chemistry, Imperial College London, London, United Kingdom. 2. HPC division, Imperial College London, London, United Kingdom. 3. ICT Division, Imperial College London, London, United Kingdom.
     
  • 17) 'Surveying and tracking the biomedical data landscape'
    Maryann E. Martone1
    1. Neurosciences, University of California, San Diego, San Diego, CA, United States.
     
  • 18) 'Data Observation Network for Earth: Earth and environmental science data management and discovery'
    Amber E. Budden1, William Michener1, Dave Vieglais2, Rebecca Koskela1, Heather Soyka1
    1. University of New Mexico, Albuquerque, NM, United States. 2. University of Kansas, Lawrence, KS, United States.
     
  • 19) 'California Digital Library: Advancing the digital transition of scholarly information'
    John Chodacki1
    1. California Digital Library, University of California, Oakland, CA, United States.
     
  • 20) 'Sigma-hole interactions for rational drug design'
    Suman Sirimulla1
    1. Basic Sciences, St. Louis College of Pharmacy, St. Louis, MO, United States.
     
  • 21) 'Deep convolutional neural networks for autonomous discovery of molecular interactions'
    Abraham Heifets1, Izhar Wallach2, Michael Dzamba3
    1. Atomwise, Inc., San Francisco, CA, United States. 2. Atomwise, Inc., San Francisco, CA, United States. 3. Atomwise, Inc., San Francisco, CA, United States.
     
  • 22) 'Crystallographic informatics: Similarity and statistics'
    Simon J. Coles2, Graham J. Tizzard2, Philip Adler1
    1. Chemistry, Haverford College, Haverford, PA, United States. 2. University of Southampton, Hampshire, United Kingdom.
     
  • 23) 'Chemical fragment analysis of halogen bonds in protein binding sites'
    AhWing Chan1
    1. UCL, London, United Kingdom.
     
  • 24) 'Mining interaction data in the Cambridge structural database: Getting the rewards and removing the risks!'
    Jason Cole1, Peter A. Wood1, Neil Feeder1, Robin Taylor1, Colin Groom1
    1. CCDC, Cambridge, United Kingdom.
     
  • 25) 'Fast mining of adaptable interaction patterns in protein-ligand interface'
    Therese Inhester2, Matthias Rarey1
    1. University of Hamburg, Hamburg, Germany. 2. Center for Bioinformatics, University of Hamburg, Hamburg, Germany.
     
  • 26) 'Dual nature of a halogen atom'
    Mahesh Narayan1
    1. Chemistry, University of Texas at El Paso, El Paso, TX, United States.
     
  • 27) 'Crystal clear: Using statistical descriptions and analysis to understand crystallisation'
    Philip Adler2, Simon J. Coles4, Alex J. Norquist1, Joshua Schrier2, Dave Woods4, Sorelle Friedler1, Lucy Mapp3
    1. Haverford College, Bryn Mawr, PA, United States. 2. Chemistry, Haverford College, Haverford, PA, United States. 3. Chemistry, University of Southampton, Southampton, United Kingdom. 4. University of Southampton, Hampshire, United Kingdom.
     
  • 28) 'Towards a fully automated creation of large protein structure ensembles'
    Stefan Bietz1, Matthias Rarey1
    1. University of Hamburg, Hamburg, Germany.
     
  • 29) 'On our way to the automated search for ligand-sensing cores'
    Tobias Brinkjost1, 2, Christiane Ehrt2, Petra Mutzel1, Oliver Koch2
    1. Faculty of computer science, TU Dortmund University, Dortmund, Germany. 2. Faculty of chemistry and chemical biology, TU Dortmund University, Dortmund, Germany.
     
  • 30) 'Deep learning in the 3rd dimension: Structure-based bioactivity prediction on novel targets'
    Abraham Heifets1, Izhar Wallach2, Michael Dzamba3
    1. Atomwise, Inc., San Francisco, CA, United States. 2. Atomwise, Inc., San Francisco, CA, United States. 3. Atomwise, Inc., San Francisco, CA, United States.
     
  • 31) 'CDD vision: Advanced analytics, calculations, and visualization live in CDD vault'
    Barry A. Bunin1
    1. CDD, Belmont, CA, United States.
     
  • 32) 'Advances in data provisioning'
    Barry A. Bunin1
    1. CDD, Belmont, CA, United States.
     
  • 33) 'Chemical information on the web: Find and be found'
    Asta Gindulyte1
    1. National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, MD, United States.
     
  • 34) 'Quantifying the effect that chemical environment exerts upon changes in property in matched molecular pairs analysis'
    Iva Lukac1, Andrew Leach1, 3, Edward J. Griffen3, Alexander Dossetter2
    1. School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, United Kingdom. 2. MedChemica Limited, Macclesfield, United Kingdom. 3. Medchemica Ltd, Macclesfield, United Kingdom.
     
  • 35) 'CSNAP: A new chemoinformatics approach for target identification using chemical similarity networks'
    Yu-Chen Lo1, Silvia Senese1, Chien-Ming Li3, Qiyang Hu2, Yong Huang3, Robert Damoiseaux4, Jorge Torres1
    1. Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA, United States. 2. Institute for Digital Research and Education, University of California, Los Angeles, Los Angeles, CA, United States. 3. Drug Study Units, University of California, San Francisco, San Francisco, CA, United States. 4. Molecular Shared Screening Resource, University of California, Los Angeles, Los Angeles, CA, United States.
     
  • 36) 'Prediction and quantification of cation-π interactions in ligand-bromodomain binding: Using quantum chemistry to capture electronic effects'
    Wilian Augusto Cortopassi1, Robert S. Paton1
    1. Chemistry Research Laboratory, University of Oxford, Oxford, United Kingdom.
     
  • 37) '3Dmol.js: Chemical structure visualization for the modern web'
    Jasmine L. Collins1, Matthew Ragoza3, Justin Jensen4, David Koes2
    1. Computer Science/Neuroscience, University Of Pittsburgh, Pittsburgh, PA, United States. 2. Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, United States. 3. University Of Pittsburgh, Pittsburgh, PA, United States. 4. Pittsburgh Science & Technology Academy, Pittsburgh, PA, United States.
     
  • 38) 'General purpose 2D and 3D similarity approach to identify hERG blockers'
    Patric Schyman1, Ruifeng Liu1, Anders Wallqvist1
    1. DoD Biotechnology High Performance Computing Software Applications Institute, Frederick, MD, United States.
     
  • 39) 'Indexing techniques and algorithms to efficiently mine interaction patterns in large sets of protein-ligand-complexes'
    Therese Inhester2, Matthias Rarey1
    1. University of Hamburg, Hamburg, Germany. 2. Center for Bioinformatics, University of Hamburg, Hamburg, Germany.
     
  • 40) 'Development and application of multiclass QSAR models for predicting human skin sensitization'
    Vinicius M. Alves3, 2, Alexey Zakharov1, Eugene Muratov3, Denis Fourches5, Nicole Kleinstreuer4, Judy Strickland4, Carolina H. Andrade2, Alexander Tropsha3
    1. CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, Frederick, MD, United States. 2. Faculty of Pharmacy, Federal University of Goias, Goiania, Goias, Brazil. 3. UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States. 4. Contractor supporting the NTP Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), ILS, Inc., Research Triangle Park, NC, United States. 5. Department of Chemistry and Bioinformatics Research Center, North Carolina State University, Chapel Hill, NC, United States.
     
  • 41) 'Virtual screening in the cloud computing environment'
    Aaron Cooper1, Mathew R. Koebel3, Grant Schmadeke1, Suman Sirimulla2
    1. Basic Sciences, St. Louis College of Pharmacy, St. Louis, MO, United States. 2. Basic Sciences, St. Louis College of Pharmacy, St. Louis, MO, United States. 3. Basic Sciences, St. Louis College of Pharmacy, St. Louis, MO, United States.
     
  • 42) 'Structural evolution of Tcn (n = 4-20) clusters from first-principles global minimization'
    Chad Priest1, De-en Jiang2
    1. Chemistry, University of California, Riverside, Riverside, CA, United States. 2. Department of Chemistry, University of California, Riverside, Riverside, CA, United States.
     
  • 43) 'PubChem BioAssay: A decade's practice for managing chemistry research data'
    Yanli Wang1
    1. NCBI, NLM, NIH, Building 38A, Room 5S506, 8600 Rockville Pike, Bethesda, MD, United States.
     
  • 44) 'Data infrastructural design for informing critical evaluation'
    Kenneth Kroenlein1
    1. Thermodynamics Research Center, National Institute of Standards and Technology, Boulder, CO, United States.
     
  • 45) 'Community-driven disciplinary data repositories: A case study'
    Ian Bruno1, Colin Groom1
    1. Cambridge Crystallographic Data Centre, Cambridge, United Kingdom.
     
  • 46) 'ICSU World Data System: Trusted data services for global science'
    Mustapha Mokrane1, Jean-Bernard Minster2, Rorie Edmunds1
    1. International Programme Office, ICSU World Data System, Koganei, Tokyo, Japan. 2. Institute of Geophysics and Planetary Physics, Scripps Institution of Oceanography, La Jolla, CA, United States.
     
  • 47) 'STRENDA and MIRAGE: Examples of community-based data reporting standardization initiatives'
    Martin G. Hicks1, Carsten Kettner1
    1. Beilstein Institut, Frankfurt, Germany.
     
  • 48) 'Standardizing the description of nanomaterials: The CODATA uniform description system'
    John Rumble1, Steven Freiman2, Clayton Teague3
    1. R&R Data Services, Gaithersburg, MD, United States. 2. Freiman Consulting, Potomac, MD, United States. 3. Teague Consulting, Gaithersburg, MD, United States.
     
  • 49) 'Scientific units in the electronic age'
    Stuart J. Chalk1
    1. Department of Chemistry, University of North Florida, Jacksonville, FL, United States.
     
  • 50) 'Toward semantic representation of science in electronic laboratory notebooks (ELNs)'
    Stuart J. Chalk1
    1. Department of Chemistry, University of North Florida, Jacksonville, FL, United States.
     
  • 51) 'New cloud-based ELN with built-in raw analytical data support and automatic structure confirmation capabilities'
    Santiago Dominguez Vivero1, Juan C. Cobas Gomez1, Santiago Fraga Castro1, Francisco Javier Sardina2
    1. Mestrelab Research SL, Hereford, Herefordshire, United Kingdom. 2. Chemistry, University of Santiago de Compostela, Santiago De Compostela, A Coruña, Spain.
     
  • 52) 'Mobile interfaces for a digital research notebook'
    Jeremy G. Frey2, Cerys Willoughby2, Simon J. Coles1, Richard J. Whitby3, Colin L. Bird2
    1. University of Southampton, Hampshire, United Kingdom. 2. University of Southampton, Southampton, United Kingdom. 3. University of Southampton, Southampton, Hants, United Kingdom.
     
  • 53) 'Not just another reaction database'
    Aileen Day2, Valery Tkachenko2, Alexey Pshenichnov2, Leah McEwen1, Simon J. Coles3, Richard J. Whitby3
    1. Clark Library, Cornell University, Ithaca, NY, United States. 2. Royal Society of Chemistry, Rockville, MD, United States. 3. University of Southampton, Hampshire, United Kingdom.
     
  • 54) 'Directly upload data from an ELN into PubChem'
    Ben Shoemaker1, Asta Gindulyte1, Evan Bolton1, Steve Bryant1
    1. NCBI / NLM / NIH, Bethesda, MD, United States.
     
  • 55) 'Intuitive collaboration platform: A Scilligence story'
    Rajeev Hotchandani1, Jinbo Lee2
    1. Scilligence, Watertown, MA, United States. 2. Scilligence Corporation, Burlington, MA, United States.
     
  • 56) 'ACAS LIMS simplifies diverse data loading, management, and querying'
    John McNeil1, Guy Oshiro1, Brian C. Fielder1, Eva Gao1, Samuel Meyer1, Brian Bolt1, Fiona McNeil1, Matthew Shaw1, Kelley Carr1
    1. John McNeil & Co., San Diego, CA, United States.
     
  • 57) 'ChemEngine: An automated chemical data harvesting tool for molecular inventory and chemical computing from scientific literature'
    Muthukumarasamy Karthikeyan1, Renu Vyas2
    1. Digital Information Resource Centre, CSIR National Chemical Laboratory, Pune, India. 2. Chemical Engineering and Process Development, CSIR-National Chemical Laboratory, Pune, MH, India.
     
  • 58) 'Screening of materials for energy applications based on transport properties: Methods and data automation tools'
    Boris Kozinsky1
    1. Bosch Research, Waban, MA, United States.
     
  • 59) 'High-throughput chemical simulations and virtual screening for materials discovery'
    Mathew Halls1, David Giesen1, Thomas Hughes1, Shaun Kwak1, Thomas Mustard1, Jacob Gavartin1, Alexander Goldberg1, Yixiang Cao1
    1. Schrodinger Inc., San Diego, CA, United States.
     
  • 60) 'Machine learning and high-throughput quantum chemistry methods for the discovery of organic materials'
    Alan Aspuru-Guzik1
    1. Harvard University, Cambridge, MA, United States.
     
  • 61) 'Using drug discovery methods to accelerate the search for better battery materials'
    Joshua Schrier1
    1. Chemistry, Haverford College, Haverford, PA, United States.
     
  • 62) 'Combining density functional theory with cheminformatics for development of a new-paradigm ligand screening method in computational drug discovery'
    Art Cho1, 2
    1. Korea University, Seoul, Korea (the Republic of). 2. Quantum Bio Solutions, Seoul, Korea (the Republic of).
     
  • 63) 'Discovery through deterministic optimization: Navigating chemical space for effective material design'
    Jennifer M. Elward1, Christopher B. Rinderspacher1
    1. Army Research Laboratory, Aberdeen Proving Ground, MD, United States.
     
  • 64) 'Authoring tools to automate data sharing in scientific publishing'
    John R. Kitchin1
    1. Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA, United States.
     
  • 65) 'Facilitating the inclusion of analytical raw data in the submission and review process'
    Santiago Dominguez Vivero1, Juan C. Cobas Gomez1, Felipe Seoane1, Jose A. Garcia Pulido1, Agustin Barba1, Jesus A. Varela Carrete2
    1. Mestrelab Research SL, Hereford, Herefordshire, United Kingdom. 2. Chemistry, University of Santiago de Compostela, Santiago de Compostela, A Coruña, Spain.
     
  • 66) 'Crystallography: A domain exemplar for chemistry data management'
    Simon J. Coles1
    1. University of Southampton, Hampshire, United Kingdom.
     
  • 67) 'Are data management solutions developed for commercial organizations suitable for academic research?'
    Mariana E. Vaschetto1, Tom Oldfield1, Michael J. Hartshorn1
    1. Dotmatics, Bishops Stortford, United Kingdom.
     
  • 68) 'Data sharing in life sciences R&D: Pre-competitive collaboration through the Pistoia Alliance'
    Carmen I. Nitsche1
    1. Pistoia Alliance, San Antonio, TX, United States.
     
  • 69) 'The Royal Society of Chemistry and the data publication landscape'
    Serin Dabb1
    1. The Royal Society of Chemistry, Cambridge, United Kingdom.
     
  • 70) 'Digital IUPAC: The need for global representation of chemistry and chemical information in the digital age'
    Jeremy G. Frey1
    1. University of Southampton, Southampton, United Kingdom.
     
  • 71) 'DIG chemistry: Establishing a research data interest group to address the many faces of chemical data management'
    Leah McEwen1
    1. Clark Library, Cornell University, Ithaca, NY, United States.
     
  • 72) 'Building a business with and without scientific computing: The five W's and one H'
    Steven M. Muskal1
    1. Suite 103-475, Eidogen, Oceanside, CA, United States.
     
  • 73) 'Interactive cheminformatics for occasional use in SMEs'
    Therese Inhester1, Matthias Hilbig3, Matthias Rarey2
    1. Center for Bioinformatics, University of Hamburg, Hamburg, Germany. 2. University of Hamburg, Hamburg, Germany. 3. Center for Bioinformatics, University of Hamburg, Hamburg, Germany.
     
  • 74) 'Playing by the rules: Knowing what applies and what information you have to maintain regarding your chemical inventory'
    Frankie K. Wood-Black1
    1. Ag., Science and Engineering, Northern Oklahoma College, Ponca City, OK, United States.
     
  • 75) 'ChemSpider: Search and share chemistry... for free'
    Serin Dabb1
    1. The Royal Society of Chemistry, Cambridge, United Kingdom.
     
  • 76) 'What chemists and other scientists need to know about their duty of disclosure under the new law governing the patenting process in the US'
    Xavier Pillai1
    1. Leydig Voit Mayer Ltd, Chicago, IL, United States.
     
  • 77) 'Monitoring the minnows: Using IP information to understand what small businesses are doing'
    Stephen R. Adams1
    1. Magister Ltd, Roche, Cornwall, United Kingdom.
     
  • 78) 'Patent information in PubChem for small businesses and startups'
    Sunghwan Kim1, Paul Thiessen1, Evan Bolton1, Steve Bryant1
    1. National Library of Medicine, National Institutes of Health, Rockville, MD, United States.
     
  • 79) 'Open patent chemistry "big bang" presents large opportunities for small enterprises'
    Christopher Southan1
    1. Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden.
     
  • 80) 'In silico, high-throughput screening of non-fullerene acceptor materials for applications of organic photovoltaic devices: A Harvard clean energy project study'
    Steven A. Lopez1, Edward Pyzer-Knapp1, Alan Aspuru-Guzik1
    1. Harvard University, Cambridge, MA, United States.
     
  • 81) 'Regioselectivity prediction of metabolic reactions based on ab initio derived descriptors'
    Arndt R. Finkelmann2, Andreas H. Göller1, Gisbert Schneider2
    1. Global Drug Discovery, Bayer Pharma AG, Wuppertal, Germany. 2. Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland.
     
  • 82) 'COSMO-based approach for the design of solvents to optimize reaction rates'
    Nicholas D. Austin1, Nikolaos V. Sahinidis2, Daniel W. Trahan3
    1. Chemical Engineering, Carnegie Mellon University, Bowling Green, KY, United States. 2. Dept Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA, United States. 3. The Dow Chemical Company, Freeport, TX, United States.
     
  • 83) 'Efficient, first-principles-based screening for high-charge carrier mobility in organic crystals'
    Christoph Schober1, Karsten U. Reuter1, Harald Oberhofer1
    1. Chair of Theoretical Chemistry, Technical University Munich, Garching, Germany.
     
  • 84) 'Data-driven chemistry: From small molecules to discovery of new functional materials'
    Olexandr Isayev2, Alexander Tropsha1
    1. Univ of North Carolina, Chapel Hill, NC, United States. 2. UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States.
     
  • 85) 'Multi-agent approach for molecular modeling in chemical vapor deposition'
    Luke E. Achenie1
    1. Virginia Tech, Blacksburg, VA, United States.
     
  • 86) 'Towards knowledge representation improvements in chemistry'
    Evan Bolton1
    1. NCBI / NLM / NIH, Warrenton, VA, United States.
     
  • 87) 'Chemical classifications for biology and medicine'
    Minoru Kanehisa1
    1. Institute for Chemical Research, Kyoto University, Uji Kyoto, Japan.
     
  • 89) 'ChEBI database and ontology: A key resource for chemical biology and metabolomics'
    Gareth Owen1
    1. EMBL-EBI, Ely, United Kingdom.
     
  • 90) 'Classifying chemistry: Current efforts in Canada'
    David S. Wishart1
    1. Biological Sciences, University of Alberta, Edmonton, AB, Canada.
     
  • 91) 'Classifying compounds in public databases'
    Lutz Weber1
    1. IT, OntoChem, Germering, Germany.
     
  • 92) 'Automated structural and functional annotation of small molecules using integrated chemical ontologies: ClassyFire, ChemOnt, and downstream applications'
    Yannick Djoumbou Feunang1
    1. Biological Sciences, University Of Alberta, Edmonton, AB, Canada.
     
  • 93) 'Evaluation of machine-generated chemical ontologies for molecular information'
    Stephen Boyer1, Thomas Griffin1, Eric Louie1
    1. IBM Research, San Jose, CA, United States.
     
  • 94) 'Connecting 3D chemical data with biological information'
    Ian Bruno1, Suzanna Ward1, Elizabeth Thomas1, Colin Groom1
    1. Cambridge Crystallographic Data Centre, Cambridge, United Kingdom.
     
  • 95) 'PubChem BioAssay: Link chemical research to GenBank and beyond'
    Yanli Wang1
    1. Building 38a, Room 5s506, Bethesda, MD, United States.
     
  • 97) 'Predicting adverse drug events using literature-based pathway analysis'
    James Rinker1, Timothy Hoctor1
    1. R & D Solutions, Elsevier Inc., Philadelphia, PA, United States.
     
  • 98) 'Intersecting different databases to define the inner and outer limits of the data-supported druggable proteome'
    Christopher Southan1
    1. Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden.
     
  • 99) 'Applications of drug-target data in translating genomic variation into drug discovery opportunities'
    Anna Gaulton1
    1. Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom.
     
  • 100) 'NIH public access policy'
    Neil Thakur1
    1. NIH, Rockville, MD, United States.
     
  • 101) 'U.S. Department of Energy public access plan'
    Laura Biven1
    1. US Department of Energy, Washington, D.C., District Of Columbia, United States.
     
  • 102) 'Helping authors and funders achieve open access goals at ACS Publications'
    Darla Henderson1
    1. Publications Division, American Chemical Society, Washington, District Of Columbia, United States.
     
  • 103) 'Libraries at the hub as the federally funded research wheel turns to open'
    Shannon Kipphut-Smith1, Betty Rozum2, Becky Thoms3
    1. Rice University, Houston, TX, United States. 2. Utah State University, Logan, UT, United States. 3. Utah State University, Logan, UT, United States.
     
  • 104) 'SHARE phase II: Enhancing the dataset and engaging the community'
    Judy Ruttenberg1
    1. Association of Research Libraries, Washington, DC, United States.
     
  • 105) 'Supporting openness and reproducibility in scientific research: The Center for Open Science'
    Sara Bowman1
    1. Center for Open Science, Charlottesville, VA, United States.
     
  • 106) 'Impact of open publishing: Scalability, sustainability, and success'
    Ann Gabriel1
    1. Elsevier, New York, NY, United States.
     
  • 107) 'Representing the chemistry of 800,000 crystal structures'
    Suzanna Ward1, Ian Bruno1, Colin Groom1
    1. Cambridge Crystallographic Data Centre, Cambridge, United Kingdom.
     
  • 108) 'CHEMnetBASE and beyond: CRC handbooks and dictionaries in today's world'
    Fiona Macdonald1, Megan Eisenbraun2
    1. Taylor and Francis, Boca Raton, FL, United States. 2. Taylor & Francis, London, United Kingdom.
     
  • 109) 'Collection, curation, and communication of thermophysical and thermochemical property data at the NIST Thermodynamics Research Center'
    Andrei Kazakov1, Robert Chirico3, Chris D. Muzny4, Vladimir Diky5, Eugene Paulechka1, Ala Bazyleva1, Joseph Magee2, Scott A. Townsend1, Kenneth Kroenlein2
    1. NIST, Boulder, CO, United States. 2. Thermodynamics Research Center, National Institute of Standards and Technology, Boulder, CO, United States. 3. National Institute of Standards Technology, Boulder, CO, United States. 4. NIST, Boulder, CO, United States. 5. NIST, Boulder, CO, United States.
     
  • 110) 'Building a better materials science database: Challenges and opportunities'
    Robin Padilla1, Michael Klinge1
    1. Corporate Markets & Databases, Springer Nature, Heidelberg, Germany.
     
  • 111) 'TCI's approaches to chemical information for researchers'
    Haruhiko Taguchi1, Tracey Barber2
    1. RD (Information Management) Department, Tokyo Chemical Industry Co Ltd, Chuo-ku Tokyo, Japan. 2. Marketing, TCI America, Cambridge, MA, United States.
     
  • 112) 'Presenting the latest scientific knowledge on an e-commerce website'
    Jonathan Stephan1
    1. Sigma Aldrich, Saint Louis, MO, United States.
     
  • 113) 'Beyond chemistry: Collect, organize, and visualize scientific data on the web'
    David Deng1, Rajeev Hotchandani1, Jinbo Lee1
    1. Scilligence, Burlington, MA, United States.
     
  • 115) 'Reactome pathway knowledgebase: Connecting pathways, networks, and disease'
    Robin A. Haw1
    1. Informatics and Bio-computing, OICR, Toronto, ON, Canada.
     
  • 116) 'Competitive intelligence workbench: Getting access to information for decision making'
    Huijun Wang1
    1. Merck, Kenilworth, NJ, United States.
     
  • 117) 'Using systems biology in computational drug design workflows'
    George Nicola1, Bruce Kovacs1
    1. Afecta Pharmaceuticals, Irvine, CA, United States.
     
  • 118) 'Combining semantic triples across domains to identify new and novel relationships and knowledge'
    Matthew Clark1, Frederik van den Broek1, Anton Yuryev1, Maria Shkrob1, Sherri Matis-Mitchell1, Timothy Hoctor2
    1. R & D Solutions, Elsevier Inc., Philadelphia, PA, United States. 2. R & D Solutions, Elsevier Inc., Philadelphia, PA, United States.
     
  • 119) 'Are we ready to define the scholarly commons?'
    Maryann E. Martone1, 2
    1. Neurosciences, University of California, San Diego, San Diego, CA, United States. 2. Hypothes.is, San Francisco, CA, United States.
     
  • 120) 'Research data curation services at UC San Diego library'
    Ho Jung Yoo1, David Minor1
    1. Library, UC San Diego, San Diego, CA, United States.
     
  • 121) 'Is open science an inevitable outcome of e-science?'
    Jeremy G. Frey1
    1. University of Southampton, Southampton, United Kingdom.
     
  • 122) 'Navigating the research data ecosystem'
    Dan Valen1
    1. figshare, Brooklyn, NY, United States.
     
  • 123) 'Funding mandates and policies: A database provider's response'
    Ian Bruno1, Colin Groom2, Amy Sarjeant1
    1. Cambridge Crystallographic Data Centre, Cambridge, United Kingdom. 2. CCDC, Cambridge, United Kingdom.
     
  • 124) 'Quest to find "broader impact": How funding bodies are using altmetrics to evaluate funded research and grant applications'
    Sara Rouhi1
    1. Altmetric, Washington, DC, District Of Columbia, United States.
     
  • 125) 'Analytical data, the web, and standards for unified laboratory informatics databases'
    Graham A. McGibbon1, Patrick D. Wheeler2
    1. Advanced Chemistry Development (ACD/Labs), Toronto, ON, Canada. 2. Product Development, Advanced Chemistry Development, Encinitas, CA, United States.
     
  • 126) 'From molecular formulas to Markush structures: Different levels of knowledge representation in chemistry'
    Michael Braden1
    1. ChemAxon, Cambridge, MA, United States.
     
  • 127) 'Strategies for creating knowledge from chemistry and text data'
    Tom Oldfield1, Mariana E. Vaschetto1, Jeff Nauss2
    1. Dotmatics, Bishops Stortford, United Kingdom. 2. Linguamatics, San Diego, CA, United States.
     
  • 128) 'Combined structure and reaction retrieval in scientific content: What satisfied users in the past and what they demand for the future'
    Guido F. Herrmann1, Josef Eiblmaier2, Valentina Eigner-Pitto2
    1. Georg Thieme Verlag Kg, Stuttgart, Germany. 2. InfoChem GmbH, Munich, Germany.
     
  • 129) 'Harnessing chemical and toxicological data for the evaluation of food ingredients and packaging'
    Diane M. Schmit1, Tammy Page1, Kirk B. Arvidson1, Patra Volarath1, Leighna Holt1
    1. US Food and Drug Administration, College Park, MD, United States.
     
  • 130) 'Expansion of DSSTox: Leveraging public data to create a semantic cheminformatics resource with quality annotations for support of U.S. EPA applications'
    Christopher Grulke2, Inthirany Thillainadarajah1, Antony J. Williams1, David Lyons1, Jeff Edwards1, Ann Richard1
    1. National Center for Computational Toxicology, US EPA, Research Triangle Park, NC, United States. 2. Zachary Piper Solutions, New Hill, NC, United States.
     
  • 131) 'Comparative toxicogenomics database: Advancing understanding of molecular connections among chemicals, genes, and diseases'
    Cynthia J. Grondin1, Allan P. Davis1, Thomas C. Weigers1, Carolyn J. Mattingly1
    1. Biology, North Carolina State University, Raleigh, NC, United States.
     
  • 132) 'Wikidata: Advancing science through semantic integration of genes, diseases, and drugs'
    Benjamin M. Good1, Elvira Mitraka2, Andra Waagmeester1, 3, Sebastian Burgstaller-Muehlbacher1, Timothy Putman1, Andrew Su1, Lynn Schriml4
    1. Department of Molecular and Experimental Medicine, Scripps Research Institute, La Jolla, CA, United States. 2. Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States. 3. Micelio, Antwerp, Belgium. 4. Epidemiology and Public Health, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, United States.
     
  • 133) 'From dusty stacks to an information hub: Reimagining the UF libraries'
    Neelam Bharti1, Sara Gonzalez2
    1. Marston Science Library, University of Florida, Gainesville, FL, United States. 2. Marston Science Library, Gainesville, FL, United States.
     
  • 134) 'Expanding the research commons model into disciplinary instances'
    Jeremy R. Garritano1
    1. University Libraries, University of Maryland, College Park, MD, United States.
     
  • 135) 'Libraries for the future: A digital economy perspective'
    Jeremy G. Frey1, Steven Brewer1
    1. University of Southampton, Southampton, United Kingdom.
     
  • 136) 'Leveraging the interdisciplinarity of chemistry: Building interdisciplinary collaborations'
    Kiyomi Deards1
    1. Research and Instructional Services, University of Nebraska-Lincoln, Lincoln, NE, United States.
     
  • 137) 'Predicting local trends in scholarly communication for decision-making in collection development: An exploration beyond citation analysis'
    Ye Li1
    1. University of Michigan, Ann Arbor, MI, United States.
     
  • 138) 'Academic technologies: A new library service to offer advanced software training'
    Vincent F. Scalfani1, Melissa F. Green1
    1. University Libraries, University of Alabama, Tuscaloosa, AL, United States.
     
  • 139) 'Enhanced chemical understanding through 3D-printed models'
    Amy Sarjeant1, Peter A. Wood4, Ian Bruno1, Ye Li2, Vincent F. Scalfani3, Shawn O'Grady2
    1. Cambridge Crystallographic Data Centre, Cambridge, United Kingdom. 2. University of Michigan, Ann Arbor, MI, United States. 3. University Libraries, University of Alabama, Tuscaloosa, AL, United States. 4. CCDC, Cambridge, United Kingdom.
     
  • 140) 'IUPHAR/BPS guide to pharmacology (GtoPdb): Concise mapping for the triples of chemistry, data, and protein target classifications'
    Christopher Southan1, Joanna L. Sharman1, Adam J. Pawson1, Elena Faccenda1, Jamie A. Davies1
    1. Guide to PHARMACOLOGY, University of Edinburgh, Göteborg, Sweden.
     
  • 141) 'Open PHACTS: Semantic interoperability for drug discovery'
    Herman Van Vlijmen1, Open PHACTS Consortium2
    1. Computational Chemistry, Discovery Sciences EU, Janssen, Beerse, Belgium. 2. http://www.openphacts.org, Vienna, Austria.
     
  • 142) 'Representation of drug discovery knowledge in the ChEMBL and SureChEMBL databases'
    Anna Gaulton1
    1. Chemogenomics Team, European Molecular Biology Laboratory - European Bioinformatics Institute, Cambridge, United Kingdom.
     
  • 143) 'Chemical knowledge representation and access in Wolfram|Alpha and Mathematica'
    Eric W. Weisstein1
    1. Scientific Content, Wolfram|Alpha, Champaign, IL, United States.
     
  • 144) 'Helping people navigate the changing seas of scientific information'
    David Evans1, Pieder Caduff1, Thibault Geoui2, Juergen Swienty-Busch2
    1. Reed Elsevier Properties SA, Neuchatel, Switzerland. 2. Elsevier Information Systems, GmbH, Frankfurt, Germany.
     
  • 145) 'Characterization and categorization of novel knowns, unknowns, and the interface between physical and digital'
    Graeme Whitley1, Bernd Berger2, Timothy Adams2
    1. Wiley, Hoboken, NJ, United States. 2. Wiley-VCH, Weinheim, Germany.
     
  • 146) 'Semantic approaches for biochemical knowledge discovery'
    Michel Dumontier1
    1. Medicine, Stanford University, Stanford, CA, United States.
     
  • 147) 'Leveraging the VIVO research networking system to facilitate collaboration and data visualization'
    Michaeleen Trimarchi1, Danielle Bodrero Hoggan1
    1. Kresge Library, The Scripps Research Institute, La Jolla, CA, United States.
     
  • 148) 'Stanford profiles created to support the university's scholarly community'
    Grace Baysinger1
    1. Swain Chem & Chem Eng Library, Stanford University Libraries, San Jose, CA, United States.
     
  • 149) 'Managing researchers' reputations throughout the research life cycle'
    Linda Galloway1, Anne Rauh1
    1. Syracuse University Libraries, Syracuse, NY, United States.
     
  • 150) 'Anatomy of the chemistry research enterprise in the academic sector: Serving the underserved in a large research institution'
    Leah McEwen1
    1. Clark Library, Cornell University, Ithaca, NY, United States.
     
  • 151) 'Safety use case for chemical safety information'
    Ralph Stuart1
    1. Dept of Env Hlth Safety, Keene State College, Keene, NH, United States.
     
  • 152) 'PubChem BioAssay: Grow with the community'
    Yanli Wang1
    1. Building 38a, Room 5s506, Bethesda, MD, United States.
     
  • 153) 'Linking chemical and non-chemical data in structured product labeling'
    Yulia Borodina1, Bill Hess1, CoCo Tsai1, Pete Phong1, Lonnie Smith1
    1. FDA, Catonsville, MD, United States.
     
  • 154) 'Ginas: A global effort to define and index substances in medical products'
    Tyler A. Peryea1, Lawrence Callahan2
    1. Informatics, NIH NCATS, North Bethesda, MD, United States. 2. FDA, Silver Spring, MD, United States.
     
  • 155) 'TranSMART Foundation: An open-data and open-science platform to integrate molecular and clinical data in translational research and precision medicine'
    Rudolph Potenzone1
    1. tranSMART Foundation, Redmond, WA, United States.
     
  • 156) 'Leveraging RxNorm and drug classifications for analyzing prescription datasets'
    Olivier Bodenreider1
    1. Lister Hill National Center for Biomedical Communications, National Library of Medicine, Bethesda, MD, United States.
     
  • 157) 'Evolution of digital and semantic chemistry at Southampton'
    Jeremy G. Frey1, Simon J. Coles2, Colin L. Bird1
    1. University of Southampton, Southampton, United Kingdom. 2. University of Southampton, Hampshire, United Kingdom.
     
  • 158) 'Implementing chemistry platform for OpenPHACTS: Lessons learned'
    Colin Batchelor1, Alexey Pshenichnov1, Jon Steele1, Valery Tkachenko1
    1. Royal Society of Chemistry, Rockville, MD, United States.
     
  • 159) 'Representation of molecular structures and related computations on the semantic web: A universal data model and its ontology'
    Mirek Sopek2, Stuart J. Chalk1, Neil S. Ostlund2, Jacob W. Bloom2
    1. Department of Chemistry, University of North Florida, Jacksonville, FL, United States. 2. Chemical Semantics, Inc., Gainesville, FL, United States.
     
  • 160) 'GlyTouCan international glycan structure repository using semantic web technologies'
    Issaku Yamada1, Kiyoko Aoki-Kinoshita2, 3, Nobuyuki Aoki2, Daisuke Shinmachi2, Masaaki Matsubara1, Akihiro Fujita2, Shinichiro Tsuchiya2, Shujiro Okuda4, Noriaki Fujita3, Hisashi Narimatsu3
    1. The Noguchi Institute, Tokyo, Japan. 2. Graduate School of Engineering, Soka University, Tokyo, Japan. 3. Research Center for Medical Glycoscience, AIST, Tsukuba, Japan. 4. Graduate School of Medical and Dental Sciences, Niigata University, Niigata, Japan.
     
  • 161) 'Progress toward a conformational database for sesquiterpene reaction pathways'
    Jordan D. Zehr2, Dean J. Tantillo1, Christian S. Hamann3
    1. Dept Chemistry, UC Davis, Davis, CA, United States. 2. Chemistry & Biochemistry, Albright College, Reading, PA, United States. 3. Chemistry & Biochemistry, Albright College, Reading, PA, United States.
     
  • 162) 'OMPOL: Visualization of large chemical spaces'
    Peter Corbett1, Colin Batchelor1, Alexey Pshenichnov1, Valery Tkachenko1
    1. Royal Society of Chemistry, Rockville, MD, United States.
     
  • 163) 'Comparison of machine learning algorithms for the prediction of critical values and acentric factors for pure compounds'
    Wendy Carande1, Andrei Kazakov1, Kenneth Kroenlein1
    1. NIST, Boulder, CO, United States.
     
  • 164) 'Optimal superposition of arbitrarily ordered molecules using the Kuhn-Munkres algorithm'
    Berhane Temelso1, Joel Mabey1, Toshiro Kubota3, George C. Shields2
    1. 701 Moore Avenue, Bucknell University, Lewisburg, PA, United States. 2. Deans Office, 113 Marts Hall, Bucknell University, Lewisburg, PA, United States. 3. Mathematical Sciences, Susquehanna University, Selinsgrove, PA, United States.
     
  • 165) 'Predicting drug-induced hepatic systems' toxicity by integrating transporter interaction profiles'
    Eleni Kotsampasakou1, Gerhard F. Ecker1
    1. Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria.
     
  • 166) 'Ontology for biomedical investigations (OBI)'
    Bjoern Peters1, James A. Overton1, Randi Vita1, OBI consortium1
    1. Division of Vaccine Discovery, La Jolla Institute for Allergy & Immunology, La Jolla, CA, United States.
     
  • 167) 'Protein ontology: Fostering connections in chemical biology'
    Darren Natale1, 2
    1. Georgetown University Medical Center, Washington, DC, United States. 2. PRO Consortium, Washington, DC, United States.
     
  • 168) 'Ontologies for classifying and modeling drug discovery data'
    Stephan Schuerer1, 3, Asiyah Yu Lin1, Saurabh Mehta1, Hande Kücük McGinty2, Qiong C. Cheng3, Amar Koleti3, Nooshin Zadeh1, Dusica Vidovic1, 3
    1. Pharmacology, University of Miami, Miami, FL, United States. 2. Computer Science, University of Miami, Miami, FL, United States. 3. Center for Computational Science, University of Miami, Miami, FL, United States.
     
  • 169) 'Immune Epitope Database (IEDB) and its use of formal ontologies'
    Randi Vita1, James A. Overton1, Bjoern Peters1
    1. Division of Vaccine Discovery, La Jolla Institute for Allergy & Immunology, La Jolla, CA, United States.
     
  • 170) 'PubChemRDF: Semantic annotation and search'
    Gang Fu1, Evan Bolton2
    1. NCBI, NIH, Rockville, MD, United States. 2. NCBI, NIH, Bethesda, MD, United States.
     
  • 171) 'Generic scientific data model and ontology for representation of chemical data'
    Stuart J. Chalk1
    1. Department of Chemistry, University of North Florida, Jacksonville, FL, United States.
     

2016 CINF Officers and Functionaries

Chair
Rachelle Bienstock
National Institute of Environmental Health Sciences rachelleb1@gmail.com

Chair-Elect
Erin Davis
Cambridge Crystallographic Data Centre erinsdavis@gmail.com

Past-Chair
see Chair

Secretary
Tina Qin
Michigan State University ginna@mail.lib.msu.edu

Treasurer
Rob McFarland Washington University rmcfarland@wustl.edu

CINF Councilors
Bonnie Lawlor chescot@aol.com
Andrea Twiss-Brooks University of Chicago atbrooks@uchicago.edu
Svetlana N. Korolev, University of Wisconsin, Milwaukee skorolev@uwm.edu

CINF Alternate Councilors
Carmen Nitsche carmen@cinformaconsulting.com
Charles Huber, University of California, Santa Barbara huber@library.ucsb.edu
Jeremy Ross Garritano University of Virginia jg9jh@virginia.edu

Archivist/Historian
Bonnie Lawlor
See Councilor

Audit Committee Chair
TBD

Awards Committee Chair
David Evans david.evans@relx.ch

 

Careers Committee Co-Chairs

Pamela Scott, Pfizer pamela.j.scott@pfizer.com

Sue Cardinal, University of Rochester scardinal@library.rochester.edu

 

Communications and Publications Committee Chair
Graham Douglas graham_c_douglas@hotmail.com

Constitution, Bylaws & Procedures
Susanne Redalje, University of Washington curie@u.washington.edu

Education Committee Chair
Grace Baysinger, Stanford University graceb@stanford.edu

Finance Committee Chair
Rob McFarland
See Treasurer

Fundraising Interim Committee Chair
Communications and Publications Committee Chair

Membership Committee Chair
Donna Wrublewski Caltech Library dtwrub@caltech.edu

Nominating Committee Chair
see Chair

2016–2017 Program Committee Chair
Elsa Alvaro, Northwestern University elsa.alvaro@northwestern.edu

2015–2016 Program Committee Chair
Erin Davis, Cambridge Crystallographic Data Centre erinbolstad@gmail.com

Tellers Committee Chair
Susan Cardinal
see Careers Committee Chair

Chemical Information Bulletin Editor Spring
Vincent F. Scalfani, The University of Alabama vfscalfani@ua.edu

Chemical Information Bulletin Editor Summer
Judith Currano, University of Pennsylvania currano@pobox.upenn.edu

Chemical Information Bulletin Editor Fall
Teri Vogel, UC San Diego Library tmvogel@ucsd.edu

Chemical Information Bulletin Editor Winter
David Shobe, Patent Information Agent avidshobe@yahoo.com

Webmaster
Patti McCall, University of Central Florida patti.mccall@ucf.edu

Assistant Webmaster
Stuart Chalk, University of North Florida schalk@unf.edu

 

Spring 2016 CINF Bulletin Contributors

Articles and Features
Rachelle Bienstock
Bonnie Lawlor
Robert E. Buntrock
Vincent F. Scalfani
Adam Bernacki
Alexandra Williams

Sponsor Information
Graham Douglas

Technical Program
David Martinsen

Production
Vincent F. Scalfani
Teri Vogel
Patti McCall
David Martinsen
Bonnie Lawlor
Wendy A. Warr

 

 

 

 

 

 

Schedule of Future ACS National Meetings

252nd: Aug. 21–25, 2016, Philadelphia, PA. Theme: Chemistry of the People, by the People, and for the People
253rd: Apr. 2–6, 2017, San Francisco, CA. Theme: TBD
254th: Aug. 20–24, 2017, Washington, DC. Theme: TBD
255th: Mar. 18–22, 2018, New Orleans, LA. Theme: TBD
256th: Aug. 19–23, 2018, Boston, MA. Theme: TBD
257th: Mar. 31–Apr. 4, 2019, Orlando, FL. Theme: TBD
258th: Aug. 25–29, 2019, San Diego, CA. Theme: TBD
259th: Mar. 22–26, 2020, Philadelphia, PA. Theme: TBD
