Chemical Information Skills: The Essential Toolkit for Chemical Research

The symposium, chaired by Grace Baysinger (Stanford University) and Jonathan Goodman (University of Cambridge), opened with a presentation by Charles Huber (University of California, Santa Barbara) on the “Chemical Information Sources Wikibook: the open source created by chemical information professionals for chemical information professionals.” He traced the origins of the resource to the printed reference created by Gary Wiggins (Indiana University) in 1991, followed its subsequent migrations to the Internet and to the Wikibook platform, and concluded with a description of its current format, plans for the future, and a call for interested parties to contribute.

Donna Wrublewski (Caltech) and Neelam Bharti (University of Florida) presented the paper “Soft skills of chemical research: academic integrity and research ethics.” They defined the concepts involved: academic integrity, which includes authorship, plagiarism, responsible conduct of research, and conflicts of interest; and research ethics, which includes honesty, objectivity, integrity, openness, respect for intellectual property, and responsible publication, among others. They then described a program created by the library at the University of Florida, in partnership with the Office of Research and the Office of Undergraduate Research, that introduced these concepts to undergraduates, to graduate students in master’s and Ph.D. programs, and to distance learning students.

“Integrating bibliographic management tools in chemical information literacy instruction,” by Svetla Baykoucheva and Joseph Jouck (both University of Maryland), described how they incorporated bibliographic management tools (EndNote Online, Mendeley, and Zotero) into a chemistry research assignment that used Web of Science, PubMed, and SciFinder. Teaching assistants for the large classes were trained in the use of the tools, and students also had access to online tutorials and guided assignments. (Note: Svetla Baykoucheva is the author of Managing Scientific Information and Research Data, Elsevier, 2015. ISBN: 978-0-08-100195-0.)

Vincent Scalfani (University of Alabama) spoke on “Replacing the traditional graduate chemistry literature seminar with a chemical information literacy course” (co-authored with Stephen Woski and Patrick Frantom, both also University of Alabama). They developed a course (CH584: Literature and Communication in Graduate Chemistry) for second-year graduate students that combined the presentation of a literature seminar with instruction on chemistry information resources, critical analysis, scientific writing and presentation, and peer review, among other topics. The students learned more about the basic skills and did a superior job on their presentations. The course was deemed so successful that an equivalent course is being developed for the Chemical and Biological Engineering program.

Elaine Cheeseman of the ScienceIP service of Chemical Abstracts Service described the work of a professional chemical information searcher in a talk entitled “Chemical information skills: a searcher’s perspective.” Professional searching requires not only expertise in sophisticated search tools, but also skill in analyzing a client’s needs and in generating reports that distill a huge mass of data into a form that the client finds useful. Collaboration at every stage of the process is crucial. Elaine stressed the two key qualities of the professional searcher: meticulous attention to detail and a desire to learn.

The next presentation, “Patents: the essential multifunctional tool for science, business and intellectual property information,” by Edlyn Simmons (Simmons Patent Information Service, LLC), described the nature, content, and uses of patents. She likened patents to a Swiss army knife, able to reveal not only technical information, but also competitive intelligence and legal information. The description of the invention allows users to build on the substances and techniques described, the claims delineate what intellectual property is legally protected, and the information about inventors and assignees can be valuable for analyzing trends in industry, as well as possible licenses, acquisitions, or recruitments. Edlyn also described some of the key tools that patent searchers in both the governmental and commercial sectors use in their trade.

Grace Baysinger spoke on “Career information resources for graduate students and postdocs.” There is a wide variety of tools that chemists in the early stages of their careers can use to their advantage. Professional societies, such as ACS and RSC, offer career information as well as job listings. Funding agencies, such as NSF and NIH, provide guidance for grant writing. Libraries supply guides for job seeking and resume writing, as well as newer skills such as establishing your research identity through ORCID, developing data plans, and establishing a professional presence on social networks.

Some of the “do’s” and “don’ts” of dealing with chemical structures for cheminformatics were discussed in “So I have an SD file…what do I do next?” by Rajarshi Guha (National Center for Advancing Translational Sciences) and Noel O’Boyle (NextMove Software). Their principal message was: avoid ambiguity in chemical structure description! Some file formats for chemical structure description have ambiguities, or they lose data via data compression. Users should stick to formats that have a unique identifier for each structure and a unique structure for each identifier. Stereochemical descriptors, such as R/S and +/-, are also frequently ambiguous and must be used with care. Verification of structures is key, and ChemSpider provides a useful service for this purpose.
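The one-to-one rule between identifiers and structures can be checked mechanically. The following is a minimal sketch, not a method taken from the talk: the function name `find_ambiguities`, the compound identifiers, and the record layout are hypothetical, and the structure strings are assumed to have already been canonicalized by a cheminformatics toolkit, since plain string comparison is only meaningful for canonical forms.

```python
from collections import defaultdict


def find_ambiguities(records):
    """Report violations of the one-identifier/one-structure rule.

    `records` is an iterable of (identifier, structure) string pairs.
    The structure strings are ASSUMED to be canonical already; this
    sketch compares them as plain text and does no chemistry.
    """
    structures_by_id = defaultdict(set)
    ids_by_structure = defaultdict(set)
    for identifier, structure in records:
        structures_by_id[identifier].add(structure)
        ids_by_structure[structure].add(identifier)

    # Identifiers that point at more than one structure.
    ambiguous_ids = {
        ident: sorted(structs)
        for ident, structs in structures_by_id.items()
        if len(structs) > 1
    }
    # Structures registered under more than one identifier.
    duplicated_structures = {
        struct: sorted(idents)
        for struct, idents in ids_by_structure.items()
        if len(idents) > 1
    }
    return ambiguous_ids, duplicated_structures


# Hypothetical records for illustration only.
records = [
    ("CMPD-1", "CCO"),
    ("CMPD-2", "CC(=O)O"),
    ("CMPD-2", "CC(=O)N"),  # same identifier, different structure
    ("CMPD-3", "CCO"),      # same structure as CMPD-1, different identifier
]

ambiguous_ids, duplicated_structures = find_ambiguities(records)
print(ambiguous_ids)
print(duplicated_structures)
```

A check like this only flags candidates for review. Deciding which record is correct still requires verifying the structures themselves.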

Leah McEwen (Cornell University) addressed a related topic in her paper entitled “Chemical literacy for the ages: essential skills in 2D chemical representation.” She discussed the history of 2D chemical structure notation and chemical nomenclature. Both have had evolving standards from a variety of sources, such as IUPAC and Chemical Abstracts Service. The electronic era of chemical information retrieval has, if anything, made knowledge of the grammar and vocabulary of this basic language of chemistry even more important for chemists.

Neelam Bharti returned in the afternoon session with “From the lab to the library: a new journey.” She described the skills required of a chemistry librarian, and how research chemists can make the transition to librarianship using many of the key skills that they bring with them to their advantage.

In “Experiments with chemists and information,” Jonathan Goodman described how chemical information instruction at the University of Cambridge has changed over time. The focus has shifted from the print literature of handbooks and journals to open data and social media. In some areas, the students now have greater familiarity with the tools than their faculty do. The challenge for instructors lies in directing the students to apply these skills to chemical research (and in enlightening their faculty in the process).

The next few presentations looked at specific tools and their uses to enhance skills in chemistry. Stuart Chalk (University of North Florida) discussed “ChemData: a web application for learning chemical informatics.” ChemData is a prebuilt website application framework that he is using in his Chemical Information Science course. It allows students to learn basic concepts in manipulating chemical data. It is now being incorporated into an OLCC course, where students will use datasets from the NIST Reference Data collections for their projects, learning how to deal with metadata, XML, Scientific Markup Language, and the Semantic Web.

Andras Stracz (ChemAxon) discussed the use of Marvin Live in his talk, “Improving geographically distributed research with real time collaboration.” Many organizations have widely scattered research teams that must find ways to share ideas and data. Marvin Live automatically captures ideas from project meetings and brainstorming sessions, and provides a framework for connecting to other cheminformatics applications that the participants may wish to call upon in their discussions. By automating these processes, Marvin Live saves time and ensures that no important ideas are lost through poor documentation.

Joshua Bishop (PerkinElmer Informatics) discussed “Chemical research toolkit: an end-to-end solution.” He focused on the problems faced by biomedical researchers in dealing with huge numbers of compounds, their properties, and the structure-activity relationships among them. He described a variety of tools that can aid in collecting and analyzing the data at each step of the research process.

Rajeev Hotchandani (Scilligence) presented “ELN, RegMol and Inventory: from synthesis to registration to inventory,” describing three interconnected software solutions from Scilligence. The electronic lab notebook (ELN) package collects data. RegMol can capture molecule data from the ELN to create a compound registry, and transfer the information to Inventory for tracking samples and lots of chemicals. Single-click connectivity makes the system simple, quick, and effective for the researcher.

