Technical Program

CINF Technical Program Highlights

Thanks to everyone for attending ACS San Francisco and helping make it one of CINF’s BIGGEST meetings yet!

We had about 130 talks, 20 Sci-Mix posters, and 5 competitor posters for the CINF Scholarship for Scientific Excellence, all organized into 13 multi-tracked sessions including the Herman Skolnik Award Symposium. With San Francisco continuing the trend of high attendance, it was a very full and busy meeting.

The program was very diverse, ranging from a brand new symposium (Epidrugs) to one of our largest symposia on a classic CINF topic (Chemistry Text Mining). With the rapidly developing field, we see new technology and research presented every time we host this popular symposium.

We are very pleased to report that our symposium “Nature's Second Act: Revisiting Natural Products” was selected for ACS Presentations on Demand. This was a surprising symposium with a very diverse array of presentations, which all tied in together well to the topic. An excellent organizing job, Roger Schenck! Check out a selection of these talks at Presentations on Demand (http://presentations.acs.org/common/tracks.aspx/Fall2014). While you’re there, be sure to also check out CHED’s Sustain-Mix symposium, a collection of talks on sustainability across the Society from the perspectives of many divisions. CINF’s own Chair, Judith Currano, contributed an excellent presentation on the sustainability and utility of an ever-expanding scholarly record. (Editor’s note: due to technical issues, the recording of Judith Currano’s presentation is currently not available at the ACS website. To see her presentation slides, please follow this link: /PDFs/maintaining-sustainable-scholarly-record.pdf).

The fall meeting also saw some in-depth and exciting discussions within CINF and among the attendees. The symposium “Inspiring the Next Generation To Pursue Computational Chemistry and Cheminformatics” led to some long and interesting discussions about informatics education, as well as newfound plans for outreach across other divisions for CINF and the student body. This is something we plan to focus on for the upcoming Denver and Boston meetings.

CINF’s regular Tuesday Luncheon had an exciting keynote address presented by Prof. Dr. Barend Mons (short bio and presentation abstract, slides) discussing an open initiative to get data into a standard and consumable format for easy searching and connecting (Datafairport.org). This falls under a rapidly evolving area of semantic technology, a subject very important to CINF, and to electronic information in general.

Despite CINF being placed quite far from the conference center, our attendance was still high, which led to some fabulous discussions across this broad array of symposia. We are still in the process of collecting the presentation slides from speakers so that we can share them with you: check back soon for these at the CINF website (/node/621).

Thanks again to everyone who put so much work into making this San Francisco meeting a resounding success! 

Erin Bolstad, Chair, CINF Program Committee

Computational Methods and the Development/Production of Biologics and Biosimilars

For a long time the pharmaceutical industry has been dominated by the production of small organic molecules as drugs. This has changed recently with the introduction of biological medicines or biologics. Biologics, as opposed to synthetic organic small molecules, are derived from living cells and are typically larger-sized molecules. As a group they include therapeutic and fusion proteins, monoclonal antibodies and DNA vaccines. They are much more difficult to manufacture and characterize than typical small-molecule organic drugs and are almost always taken by injection (as opposed to orally). The development of biologics, including hormone therapies and targeted monoclonal antibodies, has contributed greatly to enhanced cancer treatments for patients. Biosimilars are similar, but not identical, copies of a biologic drug. Generic drugs are identical copies of small organic molecules but, because biologics are much more complex, biosimilars cannot be identical copies of them. The manufacturing of biologics and biosimilars adds a level of complexity because biologics are not made with a standard set of starting materials like typical drugs, but are made by genetically engineered living cells. A number of steps involved in creating a biological drug are much more complex than for a typical drug and any slight process variation can affect the biological product in terms of stability, efficacy and immunogenic properties. In 2009 the World Health Organization developed guidelines for regulating biosimilars and biological pathways, and for ensuring that their manufacturing process meets the same standards as required for the originator products. However, even simple issues such as naming conventions and compound registration systems can be of concern. This symposium sought to initiate discussion on some of these issues.

Dr. Roger Sayle of NextMove Software gave a presentation entitled “Classification, representation, and analysis of cyclic peptides and peptide-like analogs,” which described some of the difficulties involved in naming nonstandard amino acids, as well as defining macrocyclics due to the diverse ways in which peptides can form cyclic linkages. The naming and machine-recognition of synthetic macrocyclic peptides provide significant informatics challenges. For example, covalently cross-linked side-chains may have multiple possible (degenerate) primary sequences, requiring the selection of a preferred canonical form during biological registration. In this presentation, Dr. Sayle described the development of a software program, “Sugar and Spice,” which can provide some assistance in standardization and machine-recognition of non-standard amino acids. There is a need for standardized naming conventions and notations, which currently are not in place for designed and engineered non-alpha-amino acids, acyclic peptide backbones, and different types of peptide disulfide bridges.
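The idea of choosing one preferred form among degenerate sequences can be illustrated with a minimal Python sketch. It is not the method used by “Sugar and Spice”; it assumes the simplest case only, a head-to-tail cyclic peptide whose residues can be written starting from any position, and it picks the lexicographically smallest rotation as the canonical form. The function name and the example peptide are hypothetical.

```python
def canonical_cyclic_sequence(residues):
    """Return the lexicographically smallest rotation of a cyclic sequence.

    Assumption: the peptide is cyclized head-to-tail, so every rotation
    of the residue list describes the same molecule. The N-to-C direction
    is kept, so reversed sequences are NOT treated as equivalent.
    """
    residues = list(residues)
    if not residues:
        return []
    rotations = [residues[i:] + residues[:i] for i in range(len(residues))]
    return min(rotations)


# Three ways of writing the same hypothetical cyclic tetrapeptide.
variants = [
    ["Gly", "Ala", "Phe", "Pro"],
    ["Phe", "Pro", "Gly", "Ala"],
    ["Pro", "Gly", "Ala", "Phe"],
]

# All variants collapse to a single canonical form.
canonical_forms = {tuple(canonical_cyclic_sequence(v)) for v in variants}
```

Side-chain cross-links, disulfide bridges and non-standard residues would need a richer representation than a flat list; the sketch only shows why a registration system has to settle on one form.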

Dr. Suman Sirimulla, St. Louis College of Pharmacy, gave a talk entitled “Non-covalent interactions in protein-ligand interactions: Applications of halogen bonds and carbon bonds in designing PTSD drugs,” in which he presented the significance of considering non-covalent protein-ligand interactions in drug design. There is a growing need to consider the significance of halogen bonds and carbon bonds in protein-ligand interactions. He presented an application in designing drugs for post-traumatic stress disorder (PTSD) with the nociceptin receptor as a target, and the design and development of nociceptin analogues using halogen-bond information and halogen-amino acid interaction data to search and mine the Protein Data Bank.

Dr. Sandeepkumar K Kothiwale, Chemistry, Vanderbilt University, Nashville, Tennessee, presented “BCL:Conf A knowledge based ligand flexibility algorithm and application in computational drug discovery like online drug design game Foldit,” which described a fragment conformational database derived from frequently sampled experimental structures within the Cambridge Structural Database (CSD) and the Protein Data Bank (PDB). The likely fragment conformations were stored as a rotamer library. A hierarchical search algorithm was used to perform substructure searching, and conformations were sampled by a random Monte Carlo procedure.
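To give a feel for Monte Carlo sampling over a rotamer library, here is a toy Python sketch. It does not reproduce the BCL algorithm presented in the talk: the fragment names, dihedral angles and scoring function are all invented for illustration, and the acceptance rule is a plain Metropolis criterion.

```python
import math
import random

# Hypothetical rotamer library: fragment name -> allowed dihedral angles (degrees).
ROTAMER_LIBRARY = {
    "ethyl": [60.0, 180.0, 300.0],
    "amide": [0.0, 180.0],
    "biphenyl": [40.0, 140.0, 220.0, 320.0],
}


def toy_score(conformer):
    """Made-up score (lower is better): penalise dihedrals far from 180 degrees."""
    return sum(abs(angle - 180.0) / 180.0 for angle in conformer.values())


def sample_conformer(library, steps=200, temperature=0.5, seed=0):
    """Metropolis Monte Carlo over rotamer choices; returns the best conformer seen."""
    rng = random.Random(seed)
    # Start from a random rotamer for every fragment.
    current = {frag: rng.choice(angles) for frag, angles in library.items()}
    current_score = toy_score(current)
    best, best_score = dict(current), current_score
    for _ in range(steps):
        # Propose a new rotamer for one randomly chosen fragment.
        frag = rng.choice(sorted(library))
        trial = dict(current)
        trial[frag] = rng.choice(library[frag])
        trial_score = toy_score(trial)
        delta = trial_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = dict(current), current_score
    return best, best_score


best, score = sample_conformer(ROTAMER_LIBRARY)
```

A knowledge-based method would replace the invented library with rotamers observed in experimental structures and the toy score with one that reflects how often each rotamer occurs.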

All the talks illustrated the application of novel computational methods to the area of biologics drug design.

Rachelle Bienstock, Symposium Organizer

 


The Future of the History of Chemical Information
Editors: Leah McEwen, Robert Buntrock
Volume 1164
Sponsoring Division: ACS Division of Chemical Information
Publication Date (Web): August 6, 2014
ISBN: 9780841229457; eISBN: 9780841229464

The aim of this collection is to critically examine trajectories in chemistry, information and communication as determined by the authors in the light of current and possible future practices of the chemical information profession. Along with some additional areas primarily related to present and future directions, this book contains most of the topics covered in the meeting symposium held on August 20, 2012, at the 244th American Chemical Society Meeting in Philadelphia, PA.


Presentation titles, abstracts and slides (#47-51 & 59-65) are listed at: /node/347.

A symposium summary was published in Chemical Information Bulletin, Winter 2012, at:  /node/393

 

Science and the Law

How the Communication of Science Influences Science-Based Policy Development in the Environment, Food, Health, and Transport Sectors


This symposium set out to explore how the public communication of science influences the interaction between science and policy development in the regulation of the environment, food, health and transport. It presented a series of case studies illustrating the impact of science communication on policy development. The controversy surrounding the science behind the study of global warming and the resulting focus on the reduction of carbon dioxide emissions by international agreement and by national and international regulation is one example of such an area where science communication and policy development are inextricably intertwined. The symposium was one of a series which is seeking to identify other areas where science-based policy development is of increasing importance. The first symposium, held in Philadelphia in 2012, has recently been published as an ACS Symposium book Science and the Law: Analytical Data in Support of Regulation in Health, Food, and the Environment, Editors: William G. Town, Judith N. Currano, Volume 1167, ISBN13: 9780841229471, eISBN: 9780841229488.

Our first speaker, Fred Stoss, gave a presentation entitled “Coming out from under the cloud of “Climategate”: Are scientists effectively communicating with the public on climate change?” He discussed the strategies, resources, and tools of the groups that challenge scientific communication with the public, and described various STEM-based information, communication and education resources. Communicating with the public is not a traditional role of scientists, but is of increasing importance in today’s virtually connected world. It is essential that the public understands the basic scientific principles of the environments in which they live, work, learn, and play.

Climate change emerged in the 21st century as one of the most complex and controversial environmental problems. Enhanced scientific communication by scientists and their proxies was a response to the theft of emails and documents from the University of East Anglia in November 2009. These emails were strategically “leaked” days before the United Nations Framework Convention on Climate Change in Copenhagen. Climate change deniers attacked researchers, and Conservative radio talk shows dispersed allegations of fraud, withholding and manipulation of data, and suppression of publications. “Climategate” neutered the UN negotiations and, more lastingly, created public confusion and mistrust about the scientific consensus on climate change.

Scientific organizations challenged scientists to evaluate their roles as communicators and set the stage for increasing the proactive discussion of their research. In particular, they needed to explain how their research contributes to a more informed decision-making process, which is necessary if we are to transform our ways of thinking: from living in a greenhouse gas-constrained world to a world not constrained by greenhouse gases.

The second speaker, Hanna Breetz, experienced logistical problems and was unable to deliver her talk on “Carbon accounting for indirect land use change (ILUC) in biofuels policy.” Fortunately, we had advance notice of the logistical problems, which were caused by Hanna’s new university position, and Fred was able to fill the gap in the program without any difficulty.

Our final speaker before the intermission, Jim Solyst, described strategies for “Communicating the risk of nicotine delivery products.” The rapid increase in the use of electronic cigarettes by smokers of tobacco cigarettes has highlighted the risk perception and communication challenge facing the US Food and Drug Administration (FDA) in characterizing nicotine delivery products. Tobacco smokers believe that by switching to electronic cigarettes they are reducing their risk level. This is likely to be true, but the scientific evidence is only now being collected. FDA is a public health and science-based agency, and cannot communicate the risk reduction potential of electronic cigarettes until they have sufficient evidence, regardless of the intuitive risk reduction potential of the product.

The 2009 Tobacco Control Act provides authority to the FDA Center for Tobacco Products to regulate tobacco products, including electronic cigarettes. Section 911 of the Act, Modified Risk Tobacco Product (MRTP), provides a scientific evidence-based process by which a company can apply for and receive an MRTP order. If a product can be demonstrated to reduce harm to the individual and benefit the overall public health then it may be characterized as modified risk and FDA may communicate that information to the public.

There are products, for example, Swedish snus (a smokeless tobacco powder), for which there is a great deal of human health evidence and which may at some time be granted an MRTP order; but there is no such human health evidence for electronic cigarettes due to the recent introduction of the product to the market. So what does FDA say to the tobacco smoker who is considering switching to electronic products? The best advice is not to use nicotine products at all, but does FDA have an obligation to inform smokers of the obvious benefits of switching, even if the evidence is not complete?

Our first speaker after the intermission was George Lunn from the FDA who presented “PEPFAR - a US Government program that is helping to keep millions alive around the world.” This was a good news story. PEPFAR, the President’s Emergency Plan for AIDS Relief, was announced by President George W. Bush in 2003 with the aim of preventing infections, treating infected people, and caring for infected individuals and orphans in resource-limited countries.

In a unique arrangement, low-cost manufacturers submit to the FDA New Drug Applications or Abbreviated New Drug Applications for antiretroviral drugs to treat AIDS and these applications are reviewed to the same standards as applications for products that are destined for the US market. To expedite the preparation and submission of these applications, the FDA has reached out to the manufacturers, distributors, and other interested parties. At the beginning of 2014 the FDA had taken an action on 168 applications and 6.7 million people worldwide are being treated with these antiretroviral drugs. George described the communication challenge that resulted from this outreach program.

Our next speaker was Neil Ravenhill of Weber Shandwick (who was a late substitute for Tamora Langley) and he discussed the topic “Does science or communications have greater influence in formulating policy? A UK perspective.” The dynamics and principles of the scientific environment are starkly at odds with the dynamic of the political environment in which policies are made or broken. While a scientific approach is rational, evidence-based, and formed through consensus of experts, the political environment is emotional, driven by communications, and adversarial. Although decision makers aspire to evidence-based policy-making, in contested areas the side with the most effective communications often seems to “win.”

The recent economic crisis in the UK and across mainland Europe has necessitated drastic cuts in public spending, impressing on officials the need to make savings and squeeze more value out of public resources. In the UK, the government marked out the health budget as one of only two areas of public spending to be shielded from the cuts. Still, relatively flat health spending has been outstripped by rising demand for services, and so any new policies requiring additional resources remain in theory unaffordable. Where new health policies have been introduced, such as the Cancer Drugs Fund (CDF), they have been driven not by scientific developments so much as by public opinion and political decision-making. Similarly, attempted policy change driven by or expressed in terms of economic or rational imperatives (such as attempts to reconfigure health services, or attempts to change statutory regulation of dispensing medicines), have failed in the face of patient and professional campaigns. To conclude that those who shout loudest will always win is to over-simplify. Besides, in some policy debates, patient and professional advocacy groups are divided. Even if policies are pushed through by noisy campaigns, they can be reversed or stalled by the public officials who “outlive” their political masters and realize they are in practice unworkable or inefficient. The answer? Begin with the science, but recognize that the communication of science is just as critical.

David P. Richardson described some of the pitfalls in the “Consumer communication of nutrition science and impact on public health.” Dietary interventions for vulnerable groups such as the elderly, women of childbearing age, children and adolescents can contribute beneficially to help reduce the risk of suboptimal intakes and deficiencies of micronutrients, to control costs of healthcare, and to promote the health and quality of life of people globally. Examples presented included the communication of the scientific evidence for (a) the use of folic acid/folate to reduce the risk of neural tube defects, (b) the reduction in prevalence of iron-deficiency anemia, (c) the relationship of calcium and vitamin D to bone health and reduced risk of osteoporosis, and (d) the modulation of the age-related decline in most organ functions and reduction in the development and/or progression of many chronic diseases. Richardson highlighted the need for evidence-based healthcare and communication policies, including the use of nutrition and health claims on food products to raise awareness of the role of diet in health.

The final speaker, Sarah Cooney, illustrated the problems of “Communicating controversial science: The case of tobacco harm reduction and the ethics of blanket censorship” and the difficulties encountered when publishing research from the tobacco industry. It has long been accepted that cigarette smoking causes serious disease and death, and public policy has focused on reducing tobacco use. In the US, the FDA has had regulatory jurisdiction over tobacco products since 2009 and is committed to an evidence-based approach for regulatory decision-making anchored by sound science. In an effort to generate much more data about tobacco science, the FDA has established an interagency partnership with the National Institutes of Health (NIH), which is making available billions of research dollars to study priority questions about tobacco science to inform FDA regulations. This new funding should attract many new researchers, creating a larger and more diverse, transparent and results-orientated tobacco science community.

The FDA has set an example in acknowledging tobacco manufacturers both as an important stakeholder and as a potential source of valuable scientific expertise. Perhaps as a result, there is a general increase in scientific publications resulting from research undertaken by tobacco industry scientists. Additionally, most tobacco manufacturers are even more committed to developing products substantially less risky than cigarettes, and the science to evaluate the potential of such products to promote harm reduction. At the same time there is an increase in the number of scientific journals introducing blanket bans on publishing science from tobacco manufacturers, with the BMJ (British Medical Journal) being a recent example. Cooney described the ethical dilemmas surrounding scientific censorship and the role of peer review in protecting scientific integrity.

The session concluded with a panel discussion with active participation of the speakers and the audience. The layout of the room precluded the normal format of a panel discussion with all the speakers on a podium, but this worked to our advantage and may have inspired more active involvement of the audience, many of whom had followed the whole symposium.

William Town, Symposium Organizer

International Conference on Chemical Structures


10th International Conference on Chemical Structures
10th German Conference on Chemoinformatics
June 1-5, 2014, Noordwijkerhout, Netherlands

The International Conference on Chemical Structures (ICCS: http://www.int-conf-chem-structures.org/) is a leading chemical informatics meeting, held every three years since 1987, and this year it was combined with the 10th Annual German Conference on Chemoinformatics (GCC). CINF is one of the sponsoring organizations for ICCS. This combination of two 10th anniversaries worked well, and although the subjects covered tended more towards computational chemistry and QSAR, with a reduced emphasis (compared with previous ICCS sessions) on informatics per se, data analysis and integration, the result was interesting and an eclectic mix of papers with plenty for anyone involved in chemical structures and informatics to learn from.

The conference was well-attended with 204 participants from 20 countries. There was a reasonably balanced demographic: 37% academic (including 12 students supported by bursaries), 10% government/non-profit, 24% commercial, and 28% vendors.

The conference received 144 abstract submissions from 20 countries, and the scientific advisory board selected 34 plenary and 83 poster presentations. Papers were presented in the following thematic sessions:

  • Cheminformatics
  • Structure-based drug design and virtual screening
  • Analysis of large chemistry spaces
  • Dealing with biological complexity
  • Integration of chemical information with other resources
  • Structure-activity and structure-property prediction

Where abstracts were submitted, they are available at http://www.int-conf-chem-structures.org/fileadmin/user_upload/program/Book.of.Abstracts.pdf; and for those authors who choose to submit them, their papers will be published (subject to satisfactory peer review) in a special edition of Journal of Chemical Information and Modeling due for publication in early 2015.

In addition to these rather staid and 20th century methods of communication, the minute-by-minute goings on during the conference were live-tweeted by some participants, the most active (the “tweetiest”?) being Wendy Warr and Egon Willighagen, who kept the Twittersphere informed almost on a slide-by-slide basis.

In the majority of the papers, the focus was (correctly) on the science, experimental studies and results, and there was very little mention of informatics products or solutions per se as the focus of any talks, again correctly; but wearing my vendor hat briefly, it is always interesting to see when off-the-shelf products can be used, and when organizations feel compelled to build their own proprietary solutions.

There is now a growing set of open source offerings to complement/compete with the commercial vendors. One example that was mentioned by several speakers was Open PHACTS (http://www.openphacts.org/) which is being widely used in industry and academia in Europe as a source for integrated published pharmacological data. It can be accessed via an API (https://dev.openphacts.org/), a simple forms-based application Open PHACTS Explorer (http://www.openphacts.org/explorer), and some other third-party applications such as ChemBioNavigator (www.chembionavigator.org), and PharmaTrek (www.pharmatrek.org), for exploring pharmacological space.

In addition to the plenary papers, there was a small exhibition with 17 exhibitor companies, and there were two poster sessions. All these were well-attended and provided the backdrop for serious discussions, networking, rumor-mongering and light-hearted social chit-chat. 

And on the light-hearted social front, one highly anticipated part of ICCS is the afternoon excursion. This year’s was no exception, and included a tour of the newly re-opened Rijksmuseum in Amsterdam, surviving a brief torrential downpour, a fascinating guided boat tour of Amsterdam’s canals, and a sumptuous conference dinner in an old converted church.

All told this was a very successful conference. The three year cycle for ICCS means that enough “new stuff” has happened between conferences to make attendance almost mandatory to keep abreast; and the injection of the GCC in 2014 added new content and different speakers, resulting in a detailed and up-to-date snapshot of the state of chemoinformatics.

Phil McHale, Conference Reporter

Image

Photo credit: pic.twitter.com/avN8DCMGDJ

The 2014 CSA Trust Mike Lynch Award presented to the “InChI Team” at the 10th ICCS/GCC Conference, Noordwijkerhout, Netherlands

 

Impact of IUPAC InChI on Finding and Linking Information on Chemicals

With the inception of the internet and the proliferation of vast and scattered resources of chemical information on the web, it became evident that the community needed an open identifier to allow for a free and meaningful discoverability and exchange across this body of data. So back in 2000 IUPAC began a project to establish just such an identifier, releasing the first version in 2005. This fall’s symposium on InChI, taking place over 10 years since that effort began, clearly demonstrated that the original goals of the IUPAC efforts have succeeded, as most chemical database producers support or use InChI in one way or another and the exchange of chemical compound information across web resources is now commonplace and successful, thanks to InChI.

The symposium covered many aspects, from ideas for improvement of the existing implementation, to ideas and actions to expand coverage to include new compound types, to examples of how InChI is being used at various organizations today to improve and foster data exchange, and how it could be used in the future.

InChI Expansion

We heard from the InChI Trust (Richard Kidd, Treasurer) and several working groups, whose mission it is to maintain and further develop the InChI algorithm. Polymers are next on the list to be handled by InChI. The specifications have been completed and the standard will be programmed by end of this year. The organometallics proposal is out for comment. For now, funding is still being sought for inorganics and Markush structures.

A fair amount of work has also been completed on the RInChI, a chemical identifier to handle reactions (update given by Guenter Grethe). Efforts in the biomolecules project are moving forward, with an InChI working group holding a requirements-gathering meeting at NIH later this fall (update provided by Keith Taylor). Don Burgess of NIST described a project underway to use the existing InChI structure and expand the tiers to capture conformer, electronic state, and a quantum enumeration layer to create a representation for elementary reactions (InChI-ER).

InChI Improvements

The NCI/CADD group has worked with InChI since the beginning as part of their free web services. Marc Nicklaus presented an interesting analysis of the current state of tautomers within the InChI 1.x algorithm. With this data they are putting together a proposal to improve the algorithm so InChI can achieve its original design goal of being a “tautomer-invariant” identifier.

Community Use of InChI

Major database producers updated us on their uses of InChI. Tony Williams of the Royal Society of Chemistry discussed how InChI has allowed them to integrate disparate compound databases, and how it supports the pursuit of their open source drug discovery platform. He held out the hope that more chemists outside of the CINF Division would become more aware of InChI and related topics. Evan Bolton of National Center for Biotechnology Information discussed how InChI is fundamental for their cross-resource correlation within PubChem and how they have started offering programmatic services that include InChI. Users can import, export, and compute InChIs, in addition to searching by them. He also discussed the PubChemRDF project, which allows users to download slices of related PubChem data. Ian Bruno of the Cambridge Crystallographic Data Centre discussed how they were using InChI to identify the overlap between the Protein Data Bank (PDB) and the Cambridge Structural Database (CSD).

Future Uses of InChI


One interesting application of InChI was as part of the information incorporated into a QR code, discussed by Don Cruickshank of University of Southampton. This could be a good way to deliver emergency information as well as speed up inventory management. Even labels damaged up to 30% are still readable.

InChIKeys also lend themselves to text mining, and Tom Griffin of IBM reported on processing full text documents and assigning InChIs and InChIKeys (“entity insertion”) to make the originally text-based chemical information indexable and retrievable.
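A very small Python sketch can show what “entity insertion” looks like in principle. This is a toy dictionary lookup, not the system described in the talk: the three-entry lexicon, the function name and the bracketed output format are assumptions made for illustration, and the identifiers are the values commonly listed for these compounds in public databases (verify them before reuse).

```python
import re

# Small illustrative lexicon: chemical name -> (standard InChI, InChIKey).
LEXICON = {
    "ethanol": ("InChI=1S/C2H6O/c1-2-3/h3H,1-2H3", "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"),
    "water": ("InChI=1S/H2O/h1H2", "XLYOFNOQVPJJNP-UHFFFAOYSA-N"),
    "methane": ("InChI=1S/CH4/h1H4", "VNWKTOKETHGBQD-UHFFFAOYSA-N"),
}

# Match any lexicon name as a whole word, ignoring case.
_NAME_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in LEXICON) + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Insert InChIKeys after recognised names.

    Returns (annotated_text, index), where index maps each InChIKey to the
    character offsets (in the original text) at which the name was found.
    """
    index = {}

    def insert_key(match):
        _inchi, key = LEXICON[match.group(1).lower()]
        index.setdefault(key, []).append(match.start())
        return f"{match.group(1)} [{key}]"

    return _NAME_PATTERN.sub(insert_key, text), index


annotated, index = annotate("Ethanol was dissolved in water.")
```

Once the keys are in place, a standard full-text index can retrieve documents by compound regardless of which synonym the author used; real systems need name-to-structure conversion rather than a fixed lexicon.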

One question that kept coming up in various guises at the Q&A sessions was why the “same compound” often seemed to have “different” InChIs. This tended to have to do with the way a molecule was normalized rather than with a failure of the algorithm. While the vendors have always understood the challenges of normalization and representation rules, I believe this forum allowed the wider audience to appreciate the rich and nuanced intricacies behind our taken-for-granted chemical drawing tools.
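One way to see where two InChIs for the “same compound” diverge is to split them into layers and compare. The Python sketch below does only string handling and makes no chemical judgment; the function names are hypothetical, and the two example strings are illustrative inputs intended to represent alanine drawn with and without a defined stereocentre.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula, and prefixed layers.

    Simplification: if a prefix letter occurs more than once (as it can in
    non-standard InChIs), the later occurrence overwrites the earlier one.
    """
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    version, *parts = inchi[len("InChI="):].split("/")
    layers = {"version": version}
    # The formula layer has no prefix letter; other layers start with a lowercase one.
    if parts and parts[0] and not parts[0][0].islower():
        layers["formula"] = parts.pop(0)
    for part in parts:
        if part:
            layers[part[0]] = part[1:]
    return layers


def differing_layers(inchi_a, inchi_b):
    """Return the sorted names of layers that differ or exist in only one InChI."""
    a, b = split_inchi(inchi_a), split_inchi(inchi_b)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


# Illustrative inputs: same connectivity, stereo layers present only in the first.
with_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
without_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"

diff = differing_layers(with_stereo, without_stereo)
```

Here the formula, connectivity and hydrogen layers agree and only the stereo layers differ, which points to how the structure was drawn rather than to the algorithm.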

Please visit /node/621 for the full program with abstracts and slides, where available. 

Carmen Nitsche, Symposium Reporter

Global Challenges in the Communication of Scientific Research

Science is global now: it is not only carried out in more places than ever, but also addresses worldwide problems and is increasingly collaborative. Forty years ago, about two-thirds of publications had an author from one of the G7 countries (Canada, France, Germany, Italy, Japan, the United Kingdom, and the United States). Nowadays, the contributions of other countries to the research landscape have increased, especially from the so-called BRICKS countries (Brazil, Russia, India, China, South Korea, and South Africa).

The ACS CINF Symposium “Global Challenges in the Communication of Scientific Research” addressed this new geography of science from different perspectives. How are publishers reaching out to new audiences? How are developers shaping their products to exploit the cheaper technologies and the interconnected world? How can we facilitate international science? And ultimately, what are the challenges ahead of us in the scholarly communications arena?

The symposium was organized by David Martinsen and Norah Xiao, and chaired by David Martinsen. The morning sessions featured talks centered around two broad themes: tools and resources that can facilitate global science, and global cross-cutting issues. In the first group, Steven Muskal described the efforts of Eidogen-Sertanty to use cloud-based mobile app development and pipelining technologies to meet global needs; Tony Williams and Valery Tkachenko talked about the Royal Society of Chemistry (RSC) efforts to build a chemical data repository that can enable storage, validation, standardization, and sharing of data, and that can also enhance scientific publishing; Charlie Weatherall presented an overview of CDD Vault, which is a tool to manage, visualize and share chemical and biological data and described some examples of its use in team science; Andrey Yerin described the new IUPAC organic nomenclature and the concept of Preferred IUPAC Name (PIN) and how ACD/Labs is working to incorporate it into the newest iterations of their products. Global issues broached in this session included the state of photovoltaics in the world, by Colin Perry from the University of North Texas, and the need of open data for drug discovery of rare and ultra-rare diseases, by Sean Ekins of Collaborations in Chemistry.

The afternoon sessions focused on how the changing global landscape of science affects scholarly communication. A common issue raised by several speakers was how the global nature of science is pushing publishers to engage with new audiences and creating new markets for companies offering services to international authors. Thus, Amy Beisel of Research Square described some of the challenges that international authors face when submitting articles to English-language journals: preparing and formatting the manuscript, making sure that the topic fits the journal scope, responding to reviewer comments, and understanding the correspondence from editors and reviewers.

Talks by ACS and RSC staff brought the chemistry publisher perspective and highlighted their focus on global outreach. Steve Hansen and ACS journal editors Kirk Schanze and Prashant Kamat presented an overview of the expansion of the well-established ACS on Campus program to Mexico, China, and India. They also summarized what they have learned about participants in those programs, including their desire for information and advice on the peer review and publication process, the value they place on recognition and their eagerness to see their research published, and their concerns about the fairness of the peer review system. Along the same lines, Stephen Hawthorne and Daping Zhang presented RSC efforts to support and facilitate the development of research beyond the scientific powerhouses in chemistry, such as their community engagement in Far East Asia and Latin America, and their focus on developing skills in Russia, Africa, and India.

Chinese journals are also seeking to increase their international influence. Xiaowen Zhu of University & Higher Education Press described the situation of chemical journals in China from the Chinese publisher point of view. Increased financial support by the government, international editorial boards, and collaboration with international publishers were some of the strategies mentioned.

New trends in scientific publishing are also likely to have a global impact. Thus, the session also included two talks centered on new trends in scholarly communication by open access publishers. Frederick Fenter described Frontiers’ efforts to increase article discoverability and visibility in the global research community by using a scientific social network. Martin Hicks of Beilstein-Institut focused on the challenges in scholarly communication, such as data reproducibility and integrity, issues in the peer review system, and plagiarism.

Finally, the symposium concluded with a talk on how to engage with non-scientific audiences to fight chemophobia.

Elsa Alvaro, Symposium Reporter

Please join us again for the CINF symposia in Denver, March 22-26, 2015!

Research Results: Reproducibility, Reporting, Sharing & Plagiarism. Martin Hicks.

Defining “Value” in Scholarly Communications: Evolving Ways of Evaluating Impact on Science. Sara Rouhi.

A list of the CINF symposia planned for Denver is at: http://www.acs.org/content/acs/en/meetings/abstract-submissions/cinf.html

 

Herman Skolnik Award Symposium Honoring Engelbert Zass

Introduction

Throughout his career, Dr. Engelbert Zass (“Bert”), head of the Chemistry Biology Information Centre at ETH Zürich (retired), has been a bridge builder and mediator between database producers, vendors, publishers, librarians, and end users in chemistry, contributing to advancing chemical information as a whole. Specializing in chemical information after receiving his Ph.D. in organic chemistry, Dr. Zass has more than 30 years of experience in searching, operating, and designing chemistry databases, as well as in the support, training, and education of users of chemical information. He has given numerous lectures and courses in Europe and the United States, is the author of more than 60 papers on chemical information, and has served on several publisher advisory boards. From 1999 until 2004, he was a partner in the German Federal Ministry of Education and Research’s project “Vernetztes Studium – Chemie,” where he was engaged in the design of multimedia educational material for chemical information. Through his leadership, vision, and collaborative efforts with his staff, ETH Zürich developed a model 21st century library. Dr. Zass did his undergraduate studies in chemistry at Universität zu Köln, followed by a Master’s degree (Diplom) in Chemistry with Prof. E. Vogel. He went on to complete his Ph.D. (Dr. sc. nat.) studies with Prof. A. Eschenmoser at ETH Zürich. He then became a lecturer and senior scientist at ETH, later serving as Head of the expanded ETH Chemistry Biology Pharmacy Information Center until his retirement in 2012.

Evolution and transformation of journals in a digital environment

Grace Baysinger of Stanford University opened the proceedings with a talk about electronic journals. The number of electronic journals continues to increase: CrossRef (http://www.crossref.org/01company/crossref_indicators.html) covers over 35,000 and the Directory of Open Access Journals (http://doaj.org/) lists about 9,700. Chemists make extensive use of journal articles. Most publishers now offer them electronic manuscript submission systems and authoring tools. The Royal Society of Chemistry (http://www.rsc.org/Publishing/Journals/guidelines/AuthorGuidelines/AuthoringTools/), for example, offers author templates, an experimental data checker, and a Crystallographic Information File data importer.

We need more automated data checking tools, not just to aid authors, but also to help prevent fraud. It is important that readers should be able to reproduce the work reported in an article; reproducibility depends partly on the availability of supporting information. Open source software and lower computing costs make it easier nowadays to re-use data. NISO and NFAIS have published Recommended Practices for Online Supplemental Journal Article Materials (http://www.niso.org/workrooms/supplemental). CrossCheck (http://www.crossref.org/crosscheck/index.html) helps publishers detect plagiarism; CrossMark (http://www.crossref.org/crossmark/) provides a standard way for readers to locate the authoritative version of a piece of content; FundRef (http://www.crossref.org/fundref/) provides a standard way to report funding sources; and ORCID (http://orcid.org/) provides a persistent digital identifier that distinguishes an author from every other researcher.

Many people think that traditional peer review is “broken” (http://www.nature.com/nature/peerreview/debate/) because it causes delays, and reviewers are overloaded. Alternatives are pre-publication review as carried out by arXiv.org (http://arxiv.org); post-publication review as in Faculty of 1000 (http://f1000.com/); and open two-stage peer review as used by Atmospheric Chemistry & Physics (http://www.atmospheric-chemistry-and-physics.net/review/review_process_and_interactive_public_discussion.html).

Copyright may be transferred to publishers, or held by the author, who grants a distribution license to a publisher, or held by an institution or employer. There are also Creative Commons licenses, and some works are in the public domain. The Copyright Clearance Center operates RightsLink (http://www.rightslink.com) to enable publishers to give permission for an item to be reproduced. SHERPA/RoMEO (http://www.sherpa.ac.uk/romeo/) summarizes permissions that are normally given as part of a publisher’s copyright transfer agreement.

Collection management is different in the era of the electronic library. Expenditure reports have to be produced; electronic resource management systems are needed for storing licenses; and authentication and security must be handled correctly. Catalog records for bibliographic data and knowledge bases for online holdings must be maintained. COUNTER (http://www.projectcounter.org/about.html) and SFX reports (http://www.exlibrisgroup.com/category/SFXOverview) measure electronic usage. Print collections are sent to storage, and archiving of print may be shared. Online access may be perpetual or there may be archival access to an online version. Repositories for data have been established. Pricing is a major issue. Chemistry journals reportedly have the highest average cost (http://lj.libraryjournal.com/2014/04/publishing/steps-down-the-evolutionary-road-periodicals-price-survey-2014/#_) of all subject areas: $4,215. A recent article has drawn attention to variations in pricing across institutions and a lack of transparency.1 All sorts of metadata issues may arise. There can be multiple titles in one catalog record for the print version of a journal. Only the latest title may be on the publisher site, or in the OpenURL knowledge base, but the link and content may include older titles. Digitization and metadata for a journal may be incomplete. Multiple versions or copies of the same article may be on different sites. Problem-solving is more difficult now that print copies from libraries are in storage or withdrawn. NISO has published Recommended Practices for the Presentation and Identification of E-Journals (http://www.niso.org/workrooms/piej).

Archival material can be accessed through the Wayback Machine (http://archive.org/web/), Portico (http://www.portico.org/digital-preservation/), HathiTrust (http://www.hathitrust.org/help_general), Lots of Copies Keep Stuff Safe (LOCKSS, http://www.lockss.org/), and Controlled LOCKSS (CLOCKSS, http://www.clockss.org/clockss/Home), but there are still problems with “bit rot” and multimedia content. LOCKSS and CLOCKSS are the only services that check for bit rot. Data are being stored in repositories and databases, and on the Amazon cloud.

Mobile access is another big theme. Enhanced journal article services are now being added by publishers: see, for example, the tools supplied by ACS (http://pubs.acs.org/doi/abs/10.1021/cb500271c) and Wiley (http://onlinelibrary.wiley.com/doi/10.1002/ajoc.201402054/full). The University of California Irvine has an online guide to research impacts using metrics (http://libguides.lib.uci.edu/researchimpact-metrics). Article level metrics are increasingly becoming an alternative method of measuring the impact of scholarly and other output: Altmetric (http://www.altmetric.com/) is an example.

There are many resources for discovery and delivery, including OpenURL knowledge bases, Portico (http://www.portico.org/digital-preservation/), databases, federated sites and tools (e.g., xSearch at Stanford, https://xsearch.stanford.edu/search/), alerts and RSS feeds, data mining and visualization, XML parsing of content, linked data and taxonomies, and machine-to-machine retrieval via APIs and Web Services (e.g., Stanford Profiles (https://profiles.stanford.edu/), a LinkedIn-type service for Stanford faculty).

End user behavior

Research publications are ultimately intended to be read by scientists. What do they want and need in a publication? How do they gather their information and decide what to read? When and how much do they read? User studies of various kinds have been done to try and find answers to these questions. Andrea Twiss-Brooks of the University of Chicago Library reported on recent studies in which she has been involved.

User studies carried out around 2005, many conducted by Tenopir and King, showed that users browse for current articles and current awareness, but search databases or follow citations for older articles. They rely on the library copy for all but a few core titles, and authoritative and trusted sources are preferred. They need an efficient means of accessing literature: time management is a priority. Reading supports primary research, background research, teaching, and writing.2

User behavior research may be basic or applied, and methods may be quantitative or qualitative. Quantitative methods include surveys (for example, using the Likert scale, a psychometric scale often used with questionnaires), return on investment metrics, citation metrics, altmetrics, and server log analysis. These methods answer the “how much?”, “what?” and “when?” types of question. They are sometimes easier to design and manage, empirical, and perhaps extensible. Standardized statistical approaches can be used.

Qualitative methods include focus groups, interviews, open-ended comments on surveys, and applied ethnographic techniques, which may include research diaries, mapping the diaries with interviews, and observation. Qualitative methods often require more oversight by Institutional Review Boards (IRBs). They produce information only on the cases studied: generalization is more difficult. They give insight into “how?”, “why?” and “who?” These questions often cannot be answered by quantitative methods, but analysis of qualitative results can be more challenging and is usually not statistical.

Nancy Fried Foster and Susan Gibbons have used anthropological and ethnographic methods to examine how undergraduate students at the University of Rochester write their research papers. Students (with informed consent) were watched, and they kept diaries. The results were published in Studying Students: the Undergraduate Research Project at the University of Rochester (http://www.ala.org/acrl/sites/ala.org.acrl/files/content/publications/booksanddigitalresources/digital/Foster-Gibbons_cmpd.pdf).

Early in 2012, David Bietila and Gina Petersen at the University of Chicago conducted a qualitative study of the research process of eight graduate students in the humanities and social sciences. This study aimed to model the research process, enhance the understanding of the relationships between information tools and services, and identify gaps between participant behavior and librarian expectations. The study’s data collection comprised three components: research logs, semi-structured interviews, and commentary by subject librarians on student research practices. The findings indicated that source discovery was not a primary concern for participants, but that developing a feasible, clear, and relevant topic was a much greater obstacle for most participants. Bietila and Petersen spoke about this in “Guiding Interface Design with Ethnographic Methods” at the American Library Association Annual Meeting in 2013.

Some examples of research log questions were:

  • What were you trying to accomplish during this research session?
  • At what time did you conduct this research session?
  • Where were you when conducting this research session?
  • What tools did you use to help conduct your research?

Examples of interview questions were:

  • What do you anticipate will be the most difficult part of the research?
  • How do you evaluate sources?
  • How have you converted your topic into a searchable phrase or keywords?

Graduate students tended to use the same resources, for example, JSTOR. They did a broad search and read a few articles. They used a non-linear, non-structured process; they did not have a set process for doing things. The librarian would always have done the job differently. A report on the study has been published (http://www.lib.uchicago.edu/gradstudy).

A second project, “A Day in the Life,” is ongoing. This mapping project is applying ethnographic methods to clinical health information research. It is a low-cost, six-institution study investigating how third-year medical students seek and use information in the course of their daily activities. The students mark their movements on a map for one full day. Each participant is then interviewed (and rewarded with a $100 gift card). Interviews are audio recorded and transcribed. The transcripts are coded and analyzed in order to identify possible service, facility, resource and other improvements.

Nancy Fried Foster, who was involved in the University of Rochester work, is the analyst and consultant. The medical students have a much more focused life than the University of Rochester students did: they went to the clinic and stayed there. The interviewers carried out “interested, neutral listening.” They were taught how to code the transcripts in a two-day workshop.

Preliminary results suggest that “putting on the white coat” is significant, and time management matters. When selecting the best information tool, print books are more important than predicted. Andrea has learned some other lessons concerning timeframes, the challenges of multi-institutional studies, the funding and IRB process, and achieving consistency of methods in study design. The team leader at each institution and the consultant are significant success factors.

Panel discussion on information literacy

Two speakers were unable to attend at the last minute so an impromptu panel discussion was arranged. While impromptu, this discussion sparked a lively dialogue between panel members and attendees. The panelists were Engelbert Zass, Grace Baysinger, Andrea Twiss-Brooks and Donna Wrublewski (of Caltech).

Bert has been running courses at ETH Zürich and at the Universities of Bern and Innsbruck, teaching chemical information since 1981. Nowadays, he said, the providers do a much better job of training, but it is source-oriented; Bert does problem-oriented training. He reported that registration for SciFinder is unpopular; at Innsbruck, students even found it difficult. In order to register as SciFinder users, students have to find a link on their University’s website, and from there connect to the specific CAS connection page; for SciFinder, there is no direct registration from the general database start page as in Web of Knowledge, Scopus, or Reaxys. Trainees are better at searching nowadays (because of better user interfaces), but they find it harder to find the full text and original source if the link is broken: they often do not know how to use the library’s online public access catalog (OPAC). Many also have problems in defining a structure query.

Grace offers workshops, gives presentations to classes, and does a lot of one-on-one consultation with users. All of the major database vendors also provide short online tutorials that users can consult. The spectrum of expertise levels varies considerably. While some users are very “tech savvy,” they may have limited experience searching chemical information. Because Google provides a couple of highly relevant citations quickly and easily, multi-tasking users have developed “short attention span theater” and now expect to locate information without any training. Unfortunately, they do not know what they do not know. One colleague reported that some students hired to work in a corporate environment did not have the information skills needed for the job because they had relied too much on Google while in school. One area that Grace has been concentrating on is compiling information about laboratory safety resources, as users may not be as familiar with them as they are with materials in other areas of chemistry. With so many materials now being purchased only in a digital format, it is critical for users to learn how to navigate in the OPAC rather than to rely on browsing print copies in the stacks. With student populations in universities now being more diverse, English may be a second language. Hands-on practice is essential in chemistry, not just a demo.

Donna was taught by a member of the audience as an undergraduate student; her academic advisers were not much help. Some people just go to the library to hide. When Donna arrived at the University of Florida she was able to think about what would have made her thesis project less painful. She modularizes her information literacy courses. Controlled vocabulary is very important in chemistry. She teaches one useful strategy for each of Google, Wikipedia, the OPAC, and Web of Science. She uses a topic with which the user is familiar in order to start formulating a strategy. If you can find it on Amazon you can use an OPAC.

Andrea outlined some challenges faced at the University of Chicago. Two thirds of students are graduate students and one third are undergraduates. There are non-bibliographic information sources and tools used in science and medicine research that librarians may not be familiar with, nor have the training or knowledge to use. Students are being asked to use all sorts of new IT and analysis tools, for example, OpenBLAST, FlyBase, and other bioinformatics tools. The University’s Computer Science Instructional Laboratory tutors teach the use of computer programming tools. Librarians collaborate with the tutors to see what resources the students need and how best to refer the students needing expert training. The library tries to keep aware of where the expertise lies at the University. For example, 3D visualization is available at the Research Computing Center. In joint courses, the library provides the space, and the partners provide the expertise. The library also works with the IT group: the IT group can do problem-solving for mobile device setup, and they will do things such as MATLAB training. The library also works with vendors for on-site and Webinar training, providing the room with computers, or the needed audio and video setups.

Questions were invited from the audience. One person said that the four panelists were all from big libraries; what about smaller schools? Bert replied that Bern and Innsbruck are much smaller than ETH Zürich. Even with a limited number of databases you can teach people to use what they have, but this demands creativity. The questioner felt that Europe is different. Grace said that at undergraduate institutions librarians usually spend more time teaching than doing collection development, but the number of resources being taught is smaller. Grace helped revise a document on information literacy skills that undergraduate chemistry students should achieve by the time they graduate (e.g., search by topic, author, and physical properties). This document included resources that could be used to learn each skill. Because budgets are small at some schools, an effort was made to include high-quality free resources and to put a dollar sign by resources that need to be purchased. Often, instruction that is integrated into a course and tied to an assignment that a student must complete is the most effective type of training at the undergraduate level. Starting her academic career at a junior college, Grace benefited by getting personalized help from her instructors and the librarians. Because so much in chemistry can be related to ordinary life, it provides an opportunity to introduce undergraduates to chemical information tools and to increase their interest in chemistry as a discipline.

Another person asked if it is appropriate to use Google. Yes, said Donna, with some hesitation. Bert recommends using all sources. Yes, said Grace, if the risk is low. It should not be used for explosives, for example. The questioner pointed out that Google points you to a source. Could you run a course on Google? Andrea thought you probably could; searching is a general skill that everyone should have. Grace would prefer to teach Google Scholar rather than Google for finding information. When Grace attended the Biennial Conference on Chemical Education (BCCE) in 2012, she heard a presentation by a chemistry professor who is using Wikipedia to help teach students better writing skills. First the students evaluate an article present in Wikipedia and then they have to write an article for Wikipedia. With Google you can get 1 million answers in under a minute, but it can take hours to evaluate the hits if you are trying to do a more comprehensive search. Donna pointed out that Google has offered courses, but not in chemistry. A librarian in the audience has taught a laboratory class for non-chemistry majors. She used Wikipedia and got the message across about the pros and cons. Faculty and students do not think they need teaching. There are cultural differences too: students need to get something useful that they can use in their own projects.

Donna reported the same problem in teaching ethics and communication and safety. Faculty do not have time to care. Grace said that the Stanford Chemistry Department has laboratory safety coordinators for each laboratory group, who are semi-expert. She has resisted this model for chemical information as everyone should learn how to use information, but maybe an information coordinator in the laboratory would be helpful. Bert said that in “the STN days” ETH had this concept in order to save money. When Beilstein CrossFire came along, the information coordinators were known and could be useful contact persons.

Another librarian in the audience said that the answer is in the kind of question you ask on Google. In the print days you asked the librarian if you could not find something in a book. In the electronic era you do find something so you may not ask the librarian. Google is superlative at the “good enough” answer, but it fails on an exhaustive search. Yet another librarian said that he introduces Web of Science. Students do not use the right fields. He uses queries such as “XYZ was published in Tetrahedron; how often has it been cited?”

A German attendee pointed out that courses of this sort are not usually taught in Germany. Andrea said that, in theory, information literacy has to be taught in the United States. The German attendee said that the course could take two hours out of a laboratory class. Grace noted that if it becomes a library class rather than a chemistry class, the faculty member may object. For example, faculty may want the students to analyze peaks in their spectra manually, rather than do a spectral peak search in a database. Doing this may be harder and take longer when trying to identify an unknown, but the students understand chemistry better if they have done the analysis manually. Donna said that learning how to use SciFinder should not be made difficult.

Grace added that in the days of print, people used to be able to browse the stacks; they have to use the OPAC now. Bert feels that it is important to show people how to find a good book or review. You have to show them an example. Grace said that some people know how to use RSS feeds, but others do not know what tools are available for keeping current.

Chemical publications revisited

Guido F. Herrmann of Georg Thieme Verlag also addressed the topic of chemical publications, but concentrated on the connections between full text information and information embedded in the chemical structures and reactions. Thieme (http://www.thieme.com/) is a medium-sized publisher that has had an internationally strong position in chemistry (https://www.thieme.de/en/thieme-chemistry/home-51399.htm) since 1886. It produces journals, text books, monographs, reference works, dictionaries, databases, continuous education products and interactive online libraries, in multiple formats. These basic categories have been very robust and stable for more than a hundred years. In contrast, the published formats (digital versus print), the user expectations, the production processes, and the distribution channels have seen significant change over the last decade.

According to the November 2012 STM report An overview of scientific and scholarly journal publishing (http://www.stm-assoc.org/2012_12_11_STM_Report_2012.pdf) “the number of articles published each year, and the number of journals, have both grown steadily for over two centuries, by about 3% and 3.5% per year respectively. The reason is the equally persistent growth in the number of researchers, which has also grown at about 3% per year.” A major change, a veritable revolution, has been the move from print to electronic distribution, and Thieme is still managing an ongoing change process. All Thieme’s chemistry publications now have digital versions, including backfiles from 1909 onwards.
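To give a feel for what the quoted growth rates imply, here is a minimal Python sketch of the compound-growth arithmetic. It assumes a constant annual rate, which is a simplification of the report's "about 3% and 3.5% per year"; the function and variable names are illustrative only and do not come from the STM report.

```python
import math


def doubling_time(annual_rate: float) -> float:
    """Years for a quantity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


def growth_factor(annual_rate: float, years: int) -> float:
    """Multiplication factor after `years` of constant compound growth."""
    return (1 + annual_rate) ** years


# Rates quoted in the STM report: articles about 3%, journals about 3.5%.
articles_doubling_years = doubling_time(0.03)
journals_doubling_years = doubling_time(0.035)

print(f"Articles double in about {articles_doubling_years:.1f} years")
print(f"Journals double in about {journals_doubling_years:.1f} years")
print(f"Growth over 200 years at 3%: x{growth_factor(0.03, 200):.0f}")
```

Under this assumption, article output doubles roughly every 23 years and the number of journals roughly every 20 years, which is why steady single-digit growth sustained over two centuries produces a literature several hundred times larger.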

The basic role of a publisher has remained, but the actual operations and production processes have changed drastically. The STM Tech Trends 2013 poster (http://www.stm-assoc.org/future-lab-trend-watch-2013/) illustrates many things that are starting to happen: where should a medium-sized publisher focus? Can a publisher help in converting scientific information into applied knowledge? One goal is not just to produce information, but to create value. In May 2008, a Research Information Network report (http://www.rin.ac.uk/system/files/attachments/Activites-costs-flows-report.pdf) estimated that “the global cost each year of undertaking and communicating the results of research reported in journal articles is £175bn, made up of £116bn for the costs of the research itself; £25bn for publication, distribution and access to the articles; and £34bn for reading them.” Publishers could add value by reducing the cost of reading, that is, by providing researchers with relevant information more effectively.
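The cost breakdown quoted from the Research Information Network report can be turned into shares with a few lines of Python. This is only an illustrative sketch: the figures are those quoted above, while the dictionary keys and variable names are made up for the example.

```python
# Annual global costs in billions of pounds, as quoted from the RIN report.
costs_bn_gbp = {
    "research": 116,
    "publication, distribution and access": 25,
    "reading": 34,
}

total_bn_gbp = sum(costs_bn_gbp.values())

# Share of the total attributable to each activity.
shares = {activity: cost / total_bn_gbp for activity, cost in costs_bn_gbp.items()}

for activity, share in shares.items():
    print(f"{activity}: {share:.1%} of £{total_bn_gbp}bn")
```

The three components add up to the quoted £175bn total. Reading accounts for about a fifth of it, more than publication, distribution and access combined, which is the basis for the suggestion that publishers could add value by reducing the cost of reading.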

Guido gave two examples. The first concerned primary data. Guido estimates that there are 500,000 to 1 million datasets a year in organic chemistry. To preserve them, and make them discoverable and re-usable, requires servers and data centers, metadata, and digital object identifiers (DOIs). FIZ Karlsruhe (http://www.fiz-karlsruhe.de/home.html?&no_cache=1&L=1) houses the Thieme data, and Technische Informationsbibliothek (TIB, the German National Library of Science and Technology, http://www.tib-hannover.de/en/) assigns DOIs to them, stores the metadata and keeps them searchable. TIB is the managing agent of the DataCite organization (http://www.datacite.org/). At the same time as an article is published, the primary data are published as an independent entity: the article quotes the research data as reference items with the assigned DOI.

Authors of articles in the Thieme journals SYNLETT and SYNTHESIS are now being invited to submit their datasets for publication alongside their articles. The primary data have their own DOI, different from the one of the paper, and can thus be cited independently. Spectra, for example, are published not as PDFs or JPEGs, but as raw, interactive data, which can be downloaded and analyzed. Benefits are citability and high visibility of research data, easy re-use and verification of the datasets, avoidance of duplication, and motivation for new research. Unfortunately, authors are, thus far, not enthusiastic about supplying their data, and reviewers claim they have no time to check the data.

Guido’s second example concerned full text, structures and reactions. Science of Synthesis (https://science-of-synthesis.thieme.com/app/home) is the successor to Houben-Weyl, the archive of which contains approximately 146,000 experimental procedures, 580,000 structures and 700,000 references, in 160 volumes. Science of Synthesis (from year 2000 onwards) contains approximately 50,000 experimental procedures, 270,000 reactions, and 1,250,000 structures in 48 volumes. Science of Synthesis Updates (since 2010) contains a further 18,000 experimental procedures and 40,000 reactions in 17 volumes. Science of Synthesis Reference Library (since 2010) contains 15,000 experimental procedures and 40,000 reactions in 13 volumes. New material is uploaded several times a year.

Science of Synthesis 4.0 has a new production system this year, developed in collaboration with InfoChem (http://www.infochem.de). Previously, all reaction schemes were completely redrawn and indexing was done manually. Now authors’ schemes are used (with modification), the schemes are checked and modified by a scientific editor, and the indexing is mostly automated. In the days of manual indexing, each structure was taken from a scheme and loaded into a database; starting materials, products, reagents, solvents, temperature and yield were defined by an indexer; each individual structure and reaction was extracted from tables and scheme tables; and it took about three months in all to index a Science of Synthesis volume. Now Thieme, in a continuing collaboration with InfoChem, has developed an automated indexing system: structures and reactions are automatically indexed using InfoChem’s SchemeAnalyzer software, which is 85% successful in extracting structures and single-step reactions directly from complex schemes in ChemDraw files. Another new development is the implementation of a MarkLogic NoSQL database (http://www.marklogic.com/what-is-marklogic/) for a new graphical user interface, and full-text and data search. Work is ongoing to further improve the link between the InfoChem system (i.e., chemical information) and the MarkLogic system (i.e., full-text information). The final system will give even better value to the user.

CAS keeps pace with the worldwide growth in disclosed chemistry

Chemical Abstracts Service (CAS) has always been a leader in providing scientists with access to chemical information. Matt Toussant of CAS described how CAS has adapted to the phenomenal growth in chemical information being published today. The ACS is committed to “improving people’s lives through the transforming power of chemistry.” Its mission is “to advance the broader chemistry enterprise and its practitioners for the benefit of Earth and its people.” The mission of CAS is “to provide the world’s best digital research environment to search, retrieve, analyze, and link chemical information.” Chemistry is the central science.

Matt showed a timeline of some influential events in publishing. Johannes Gutenberg invented the printing press in about 1450. The Internet started with the time-sharing of computers in the early 1960s at U.S. universities and with the Advanced Research Projects Agency Network (ARPANET), developed after the launch of Sputnik in 1957. Ebooks are now an alternative to printed books. Galileo’s heliocentric dialogue was published in Latin by Elzevir.3 Robert Boyle’s five-person dialogue The Sceptical Chymist4 was published in 1661. Books were printed in small numbers in those days; nowadays books are widely available.

The history of scientific journals dates from 1665, when the French Journal des sçavans and the English Philosophical Transactions of the Royal Society first began systematically publishing research results. The number of serials (http://www.stm-assoc.org/2012_12_11_STM_Report_2012.pdf) has grown every year since then. Eventually abstracting and indexing services such as Chemisches Zentralblatt (born in 1830) were needed to help readers keep pace with the literature. Chemical Abstracts (CA) began in 1907. Later, CAS extended its reach into the patent world. In 1641, Samuel Winslow was granted the first patent in North America for a new process for making salt. The first U.S. patent was granted in 1790. The first patent in Chemical Abstracts dates back to 1808 and concerns an alcohol still.

CAS has covered several serials for more than 100 years, for example, Annalen der Chemie und Pharmacie (later Justus Liebigs Annalen der Chemie und Pharmacie, now part of the European Journal of Organic Chemistry) which began in 1840. CAS has covered 50,000 journal titles over the years; it now covers 10,000. It has more than 100 years’ experience of analyzing and organizing disclosed chemistry from around the world. The work is no longer done manually in a library at Ohio State University; nowadays computerized data entry and sophisticated tools are used. Chemist labor around the world, and “postal support” to analyze chemical publications, ended in 1994 because it was too slow. At its peak in 1967, this process involved nearly 3,500 people. A newsletter called The Little CA, published up to four times a year, kept the “volunteers” informed. E. J. Crane (editor of CA from 1915 to 1958) was the main author. The volunteers worked all over the world: Czechoslovakia, Poland and the United States were well-represented, but the largest number was that of Japanese chemists. Neutral parties helped during the war years. Much analysis is now outsourced to India, Japan and China.

The history of CAS REGISTRY goes back to a concept of Malcolm Dyson’s in the 1950s. The original database was a file of fluorine compounds using the Dyson-IUPAC notation on edge-notched cards. In the 1960s, Harry Morgan of CAS, building on the work of Donald Gluck at DuPont, published an algorithm that converted structure diagrams into unique tabular forms.5 This handled aromatic and tautomer bond representations and established the basis for REGISTRY. The building of REGISTRY began in 1964.

The CAS indexer analyzes the whole document, creates a CAS REGISTRY record and interprets when compounds are described in terms other than singular structures or names. A typical chemistry patent (a PCT application for “A new antibacterial,” with 250 pages and 24 claims) took 15 days to index completely, with 917 compounds, 576 new compounds, 613 single-step reactions, 5,394 multistep reactions, 1,029 reaction participants, and one MARPAT Markush structure with 2,119 substituent definitions. CAS specialists in many fields of chemistry interpret author terminology to register compounds. Spectra, numeric properties, tags, and published sources are recorded.

CAS databases continue to show strong growth. In particular the number of patents has greatly increased recently (9.2% growth in 2012); much of the increase is due to China, Korea and Japan. About half of small molecule registrations are compounds from patents. Sixty-three patent authorities are now covered, and it usually takes less than 27 days from receipt for a patent to appear in CA. In non-patent information, Matt drew attention to growth in Asia. Nowadays, 69% of items indexed by CAS are in English, 13% in Chinese. More than 89 million CAS Registry Numbers have been issued; 29 million registered substances have been indexed in the last 5 years alone. The number of prophetic substances and Markush structures is up more than 10%.

Apart from journals and patents, CAS also covers dissertations, meeting abstracts, and conference proceedings, more than 1,000 ahead of print journals, valuable Web sources, commercial chemical suppliers and regulatory inventories. Attendance at national meetings by larger societies does not really seem to be decreasing. “Ahead of print” appearance of journal articles, rapid publication, and “letters” journals are other trends. Speed of publication is critical. The proportion of ACS articles with supporting information rose from 60% in 2012 to 70% in 2014; supporting information is mostly in the form of PDF files, but there are also Crystallographic Information Files.

Matt ended with some predictions. Numbers of all forms of publications will continue to increase annually by 3% to 5%; Web-based “As Soon As Publishable” will dominate. Supporting information will broaden. Open access will play a significant role. Asia will be the origin of much new science. Disclosures originating from commercial sources will increase as new substances are being created in laboratories around the world. The pace of growth in patent applications will slow, but patents will remain a main vehicle for the monetization of science. Finally, meetings with a geography requirement will lessen further, as technology makes global connections easier and more efficient.

InChI and the information chain

Steve Heller of the National Institute of Standards and Technology (NIST) gave an overview of the IUPAC International Chemical Identifier (InChI, http://www.iupac.org/inchi). InChI is a non-proprietary, freely available, machine-readable string of symbols that enables a computer to represent a compound in a completely unequivocal manner. InChIs are produced by computer from structures drawn on screen with existing structure drawing software, and the original structure can be regenerated from an InChI with the same software.

Like bar codes and QR codes, InChIs are not designed to be read by humans. InChI should be thought of as “plumbing,” a modern enabling technology. It is not something the average chemist needs to know about: researchers merely use it to find and link information on the Web. There is too much information on the Web and it lacks integration and connection; InChI is an infrastructure foundation that allows for linking, and hence for higher productivity. It is not a replacement for any existing internal structure representations; it is in addition to what is used internally. There are four amusing videos on the Web that explain InChI simply:

What on Earth is InChI? (http://www.youtube.com/watch?v=rAnJ5toz26c),

The Birth of the InChI (http://www.youtube.com/watch?v=X9c0PHXPfso),

The Googlable InChIKey (http://www.youtube.com/watch?v=UxSNOtv8Rjw), and

InChI and the Islands (http://www.youtube.com/watch?v=qrCqJ0o4jGs).

Some people question why InChI should be used instead of SMILES. SMILES is a popular line notation but it is not a published standard. Each vendor has a different implementation of SMILES, so strings cannot reliably be compared. SMILES has no structure normalization, so different structural representations yield different SMILES strings: a subscriber to chminf-l has reported finding 172 different SMILES representations for caffeine on the Web. InChI is easy to generate using existing software, expressive of structural information, unique and unambiguous, and amenable to searching for structures using Internet search engines (using a hash key).

The InChI standard was developed by consensus, by a technically competent team, with political and technical cooperation. The work involved precompetitive collaboration among publishers, database producers, and software vendors. InChI is not in competition with commercial products, it has no “mission creep,” and it is endorsed by IUPAC. The standard has been widely adopted: there are, for example, tens of millions of InChIs in each of PubChem, ChemSpider and Reaxys, and an InChI can be input to SciFinder to search the 89 million compounds in CAS REGISTRY.

If the work of the InChI project were to endure, it needed to be turned over to an entity that would ensure its ongoing activities, and be acceptable to the community. A not-for-profit organization was best; hence the decision to create and incorporate the InChI Trust (http://www.inchi-trust.org) as a UK charity. The Trust has about 60 Members, Associate Members, and (non-paying) Supporters. The InChI project has experienced remarkable cooperation and support. It is a truly international project with programming in Moscow, computers in the cloud, incorporation in the United Kingdom, and a project director in the United States. Collaborators from over a dozen countries, from academia, the pharmaceutical industry, publishers, and the chemical information industry, have all offered senior scientific staff to develop the InChI standard.

Organizations need a structure representation for their content (databases, journals, chemicals for sale, etc.), so that it can be linked to and combined with other content on the Internet. InChI provides an excellent return on investment and increases productivity. It is a freely available, open source algorithm that anyone, anywhere can freely use, and it is certainly widely used: its success is proved by un-coerced adoption. InChI’s combination of the Internet, open source software, crowdsourcing, graph theory, existing representation algorithms, digitized data available on the Web, and search engines has created a very valuable tool. This is taking advantage of “the second machine age,”6 which includes “recombinant innovation,” or mashups.

InChI is a “layered” line notation, currently with the following layers:

  • Formula
  • Connectivity (no formal bond orders)
    • disconnected metals
    • connected metals
  • Isotopes
  • Stereochemistry
    • double bond (Z/E)
    • tetrahedral (sp3)
  • Tautomers (on or off).

Charges are added to the end of the string. Layers are separated by slash marks. Opening characters before the formula denote the version of the algorithm used. An example is alpha-D-glucose:

Image

InChI=1S/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2/h2-11H,1H2/t2-,3-,4+,5-,6?/m1/s1
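The layered structure can be illustrated with a short Python sketch that splits the glucose string above at its slash marks. The function name split_inchi_layers is hypothetical. The sketch relies on the InChI convention that each layer after the formula begins with a one-letter prefix (for example c for connectivity, h for hydrogens, and t, m and s for stereochemistry). It assumes a formula layer is present, so it is a simplification rather than a validating parser.

```python
def split_inchi_layers(inchi: str) -> dict:
    """Split an InChI string into version, formula and prefixed layers.

    Simplified sketch: assumes the second segment is the formula and
    that no layer prefix occurs twice in the string.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # The first character names the layer; the rest is its content.
            layers[part[0]] = part[1:]
    return layers


glucose = ("InChI=1S/C6H12O6/c7-1-2-3(8)4(9)5(10)6(11)12-2"
           "/h2-11H,1H2/t2-,3-,4+,5-,6?/m1/s1")
layers = split_inchi_layers(glucose)
for name, content in layers.items():
    print(name, content)
```

Here the version segment "1S" denotes version 1 of the standard InChI, which matches the description of the opening characters given above.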

The InChI algorithm normalizes chemical representation, and includes a “standardized” InChI, and a “hashed” form called the InChIKey. The key facilitates Web searching, previously complicated by unpredictable breaking of InChI character strings by search engines. The “standard InChI” and InChIKey for caffeine are shown below.

Image

InChI=1S/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
InChIKey=RYYVLZVUVIJVGH-UHFFFAOYSA-N

The first block of 14 letters of the InChIKey (RYYVLZVUVIJVGH) encodes the molecular skeleton (the connectivity). The first eight letters of the second block (UHFFFAOY) encode stereochemistry and isotopes. After that, “S” indicates that the key was produced from standard InChI and “A” indicates that version 1 of InChI was used. The final character, “N,” means “neutral.” The first 14 characters of an InChIKey can be used to search for structures with the same skeleton (e.g., to find all stereoisomers).
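The block structure described above can be sketched in a few lines of Python. The function names parse_inchikey and same_skeleton and the dictionary keys are hypothetical, and the second key in the usage example is made up purely to show a comparison; it does not correspond to a real compound.

```python
import re

# 14 letters, hyphen, 8 letters plus two flag letters, hyphen, one final letter.
INCHIKEY_PATTERN = re.compile(r"^([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])$")


def parse_inchikey(key: str) -> dict:
    """Split an InChIKey into the blocks described in the text."""
    if key.startswith("InChIKey="):
        key = key[len("InChIKey="):]
    match = INCHIKEY_PATTERN.match(key)
    if match is None:
        raise ValueError("not a well-formed InChIKey")
    skeleton, second_block, standard, version, final = match.groups()
    return {
        "skeleton": skeleton,          # connectivity hash
        "second_block": second_block,  # stereochemistry and isotopes hash
        "standard_flag": standard,     # "S" for standard InChI
        "version_flag": version,       # "A" for InChI version 1
        "final_flag": final,           # "N" in the caffeine example
    }


def same_skeleton(key_a: str, key_b: str) -> bool:
    """True if two InChIKeys share the first 14-letter block."""
    return parse_inchikey(key_a)["skeleton"] == parse_inchikey(key_b)["skeleton"]


caffeine = parse_inchikey("InChIKey=RYYVLZVUVIJVGH-UHFFFAOYSA-N")
print(caffeine)

# Made-up key with the same first block, for illustration only.
made_up = "RYYVLZVUVIJVGH-ABCDEFGHSA-N"
print(same_skeleton("RYYVLZVUVIJVGH-UHFFFAOYSA-N", made_up))
```

Comparing only the first block is the same idea as typing the first 14 characters into a search engine to retrieve all stereoisomers of a structure.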

The InChI certification suite is a software package designed to check that an installation of the InChI program has been performed correctly; it ensures that InChIs have been generated properly and consistently. Currently, InChI handles straightforward organic molecules; it is being extended to handle more complex entities such as organometallics, Markush structures, macromolecules, and reactions.

Virtual communities and beyond

Wendy Warr, of Wendy Warr & Associates (the author of this CIB article), looked at the evolution of virtual communities and publishing platforms. As research has become increasingly collaborative, possibilities for communication and collaboration on the Web have also increased. Virtual communities in science such as EiVillage and BioMedNet began to spring up in the 1990s. The earliest virtual community in chemistry, ChemWeb.com,7 was announced in August 1996 by MDL and Current Science Group, and launched in April 1997. It was acquired by Elsevier in October 1997. Elsevier closed all its “portals” in the middle of 2003, and ChemWeb.com was sold to ChemIndustry in April 2004.

ChemWeb.com was a pioneer in virtual conferences: the first was held in December 1997. The technology (interactive chat alongside PowerPoint slides and audio) was hardly ready to support such innovations at that time. Many chemists were unwilling to register themselves into a virtual community in the 1990s, but by 2002, ChemWeb.com had 300,000 members, and it offered 350 journals, 25 databases, structure searching, a careers center, a conference center, a bookstore, a magazine (The Alchemist), 11 forums, and a preprint server, the Chemistry Preprint Server (CPS, http://www.sciencedirect.com/preprintarchive).

The CPS was launched as an experiment in 2000. By Elsevier’s own criteria it was a partial success: the number of readers and their geographic spread were excellent; the number of preprints (466) was encouraging, but rather less than had been hoped for; and, unfortunately, it was difficult to ascertain the number of preprints going on to traditional publication. The CPS was terminated in 2002. An evaluation has been published.8

Chemistry as a discipline has been slow in adopting open access, but some journals merit a mention: Chemistry Central Journal, Journal of Cheminformatics, the Beilstein Journal of Organic Chemistry, Frontiers in Chemistry, Chemical Science, and ACS Central Science. ChemSpider (http://www.chemspider.com/), in particular, is worthy of note. Most of the publishing services discussed in the rest of this talk concern biology, medicine and biomedical sciences rather than chemistry.

In 2004, the year that ChemWeb.com changed hands, the term “Web 2.0” was first coined by O’Reilly. Not everyone would agree on the definition, or usefulness, of the term, but the era of wikis, blogs, feeds, podcasts, webinars, social networks, social bookmarking, and virtual worlds had begun.9 Facebook had 66 million users in March 2008: compare that with ChemWeb’s 300,000 in 2004. People are no longer afraid of signing up to virtual communities. The traditional peer review process can now be challenged.

Peerage of Science (https://www.peerageofscience.org/), for example, promotes one peer review process for multiple journals, in ecology, and evolutionary and conservation biology. It is run by a for-profit organization, funded by publishers etc. The reviews themselves are peer-reviewed, but there is no journal editor in control of the peer review process. As of July 2014, 176 manuscripts have received 381 peer reviews, and there have been 942 peer review evaluations. Rubriq (http://www.rubriq.com/) also supports one peer review process for multiple journals. It offers independent, double-blind peer review and manuscript submission, recommends a suitable journal, and provides a score-card for reviews. The reviewer is rewarded financially, while the author pays for the R-score etc. Axios Review (http://axiosreview.org/) is another independent review service in ecology and evolutionary biology. The author aims at a top choice journal and “plays safe with others.” (Few manuscripts completely fail to be published, rather they are resubmitted, after rejection by one journal, to a less prestigious journal, in the so-called “journal cascade.”) eLife, BioMedCentral, PLoS and EMBO have started a peer review consortium in which papers are redirected with reviewer reports. Reviewers are anonymous to the author, but anonymity is optional for the journal cascade editor. SciRev (https://scirev.sc/) claims to be “speeding up scientific knowledge:” authors rate journals on efficiency and seek an efficient journal. An editor can compare his or her own journal with competitors.

So-called “mega-journals” (PLoS ONE, PeerJ, and eLife) are another trend. eLife arose from the San Francisco Declaration on Research Assessment (DORA, http://am.ascb.org/dora/). It is opposed to journal Impact Factors and runs a consolidated, pre-publication peer review service. Researchers can read and publish in eLife for free; the journal is supported by the Howard Hughes Medical Institute, the Max Planck institutes, and the Wellcome Trust. As of July 2014, it had published 488 articles, 85 of them in biochemistry. PeerJ is a peer-reviewed journal and preprint server in biological and medical sciences. It offers cheap, lifetime accounts (pre-paid). An editor handles pre-publication peer review. Some reviews (about 40%) are not anonymous. It links to Publons (vide infra). As of July 2014, it had published 476 articles and 433 preprints.

Frontiers (http://www.frontiersin.org/) was launched in 2007 by scientists from the Swiss Federal Institute of Technology, Lausanne, with a major investment by Nature Publishing Group. Its journals are largely in medicine etc., but also in chemistry, earth science, ecology and evolution. It offers open peer review in two phases (independent and interactive); anonymity is not allowed. Analytics automatically track views and downloads. The Frontiers evaluation system allows an entire community to score a paper. Frontiers has published 20,000 articles in 45 community-driven journals.

The business model is author-pays. The community also shares job postings. Other communities or social networks include Academia.edu (http://www.academia.edu/) and ResearchGate (http://www.researchgate.net/), where millions of researchers share papers, see analytics, and follow other people. The data repository Figshare (http://figshare.com/) has collaborative space in the cloud.

A number of centralized commenting platforms have sprung up. PubMed Commons (http://www.ncbi.nlm.nih.gov/pubmedcommons/) enables authors to share opinions and information about scientific publications in PubMed. Publons (https://publons.com/) collects peer review information from reviewers and publishers, produces reviewer profiles with publisher-verified peer reviews, and handles pre- and post-publication peer review. Authors (and a few peers) are notified of comments, and reviewers get credit in the form of DOIs. As of July 2014, 1,954 reviewers had produced 4,247 reviews. The service is free to academics. PubPeer (https://pubpeer.com/) offers anonymous post-publication peer review. Users can comment on any scientific article with a DOI, or on an arXiv preprint, adding comments to a centralized database. Authors and other interested parties are alerted to comments. Journal Lab (http://www.journallab.org/) handles open summaries and peer review of PubMed papers, but anonymity is optional. It also offers journal clubs and discussions.

A few “publishing platforms” such as Faculty of 1000 (http://f1000.com/), ScienceOpen (https://www.scienceopen.com/) and The Winnower (https://thewinnower.com/) offer a wider range of services. Faculty of 1000 has F1000Prime (http://f1000.com/prime) literature filtering, and F1000Research (http://f1000research.com/), an open access journal and journal club, with open post-publication peer review. It has published 522 articles. F1000Posters (http://f1000.com/posters) was launched recently.

ScienceOpen is an open access, research and publishing network, launched on May 29, 2014. It features almost 1.3 million articles from PubMed Central and arXiv, by 2 million networked authors. It will publish all sorts of article types in the sciences, humanities, and social sciences. It offers collaborative pre-publication workspaces where authors can manage draft versions and share files, and easily collaborate on a paper. Authors get almost immediate publication with a DOI. Reviewers get DOIs for their open, post-publication peer reviews. Article metrics are powered by Altmetric. Authors benefit from automatic proofs, easy corrections, and versioning. User roles are allocated based on ORCID publication history. Public and private groups can be constructed.

In the world of open access, open data and open science, what might happen next? It certainly seems likely that there will be consolidation (or closure) among the services mentioned above. Does ScienceOpen have anything to learn from ChemWeb.com? ChemWeb was ahead of its time. If Elsevier had hung on to ChemWeb for just a short time longer, it would have had a ready-built community for Article of the Future and its other ventures. More than a decade after the Chemistry Preprint Server closed, chemistry still has a different culture10,11 from other disciplines, but a number of members of ScienceOpen’s scientific advisory board were in the audience, wishing the new venture well.

Reaxys: a digital transformation

Sebastian Radestock of Elsevier Information Systems gave this talk, replacing David Evans of Reed Elsevier Properties, who was indisposed. Sebastian talked about the long road leading to the current version of Reaxys (http://www.elsevier.com/online-tools/reaxys). Reaxys has its origins in the preeminent Gmelin and Beilstein Handbooks, begun by Prof. Leopold Gmelin in 1817 and Prof. Friedrich K. Beilstein in 1881. In the 1980s the focus was on database development: SANDRA (a structure-based tool to locate references in the Beilstein Handbook), and the Gmelin Formula Index were released in 1987-1988, and the Gmelin and Beilstein databases went online on STN and Dialog in 1989-1990.

Since then, the development focus has been the user. CrossFire was launched and improved between 1993 and 1995, and the printed Handbooks were discontinued in 1997-1998. The Patent Chemistry Database was launched in 2005. In 2009, Reaxys was launched, based on Gmelin, Beilstein, and the Patent Chemistry Database. In 2013, Reaxys was completely overhauled and its scope was expanded. In 2014, it was upgraded with re-indexing and concept search.

The 1989 version of Beilstein on STN was text-based; knowledge of the database structure and STN commands was needed; and not all query forms or results were readily accessible. CrossFire Commander in 1996 was a graphical, client-server based system. Access was improved with input forms, and structure and reaction searches could be combined with factual queries, but displays were rather cluttered. The 2014 version of Reaxys has a subject-oriented, customizable, Web-based user interface; concept search is enabled; all types of chemical information needs are supported; and there is one-click access to advanced-query forms for multiple source databases.

The CrossFire database structure had three different, tightly connected contexts: substances and properties, reactions and reaction details, and citations and abstracts. This concept is still valid today: Reaxys is a bibliographic database with more than 46 million records from 16,000 journal titles; it is a substance database with more than 57 million unique substances and more than 500 million experimental facts; and it is a reaction database with more than 36 million single- and multi-step reactions. The database structure is built around a sustainable chemical substance model for single compounds, component compounds, and Markush compounds.

Chemists like to do graphical structure searching, but Elsevier wondered if they would also like to start searching for a topic by typing text. The company therefore surveyed 700 chemists with a wide range of job roles, years of experience, and worldwide locations. The chemists were from many different areas of interest (14% were in materials chemistry, 9% in organic chemistry, and 4% in electrochemistry, for example), and all sorts of organizations in many sectors.

The survey revealed a definite need for keyword search capabilities. On average, chemists search for chemistry-related information five times a week. Researchers in organic, inorganic, medicinal and organometallic chemistry are the most intensive searchers. Some 70% of the chemists are keyword searchers (in that they spend more than 60% of their time searching for keywords); 10% are structure searchers; 20% search for both keywords and structures. Structure search is highest among organic and inorganic chemists.

The searching pattern for both structure search and keyword search is similar. The top four use cases are:

  • find reviews, introductory articles and other starting points for research
  • find the very latest information on a certain topic
  • search and retrieve substance properties, and
  • compile a comprehensive survey of the published literature.

Keyword searchers more often search for reviews and introductory articles, properties, and comprehensive surveys of the published literature. There are some use cases which are ideally supported by structure searching.

Based on these results, Elsevier developed Ask Reaxys to be more than just a text search. Text input is analyzed to identify all possible queries; all queries are assigned a probability factor; and the query exceeding a certain probability factor threshold is automatically selected, or the user is prompted to select a query manually from the list of possible queries. Each word, or group of words, can be classified as bibliography, compound, concept, date, or keyword, or it can be ignored. An example is “electrical conductance of titanium.” This query could be “electrical conductance” as a concept and “titanium” as a compound; or it could be “electrical conductance” and “titanium” as keywords with “of” ignored. The former option is translated into a combined structure and factual query, and this is executed in the substances context (on a substances tab in the Reaxys interface). The latter is translated into a pure keyword query, which is executed against all text fields in the citations context.
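The selection step described above can be sketched as follows. This is only an illustration of the stated logic: the function name select_query, the threshold value of 0.8, and the probability factors attached to the two interpretations are all assumptions, since the talk did not disclose how Ask Reaxys computes or thresholds its probabilities.

```python
from typing import List, Optional, Tuple

# (description of the candidate query, probability factor)
Candidate = Tuple[str, float]


def select_query(candidates: List[Candidate],
                 threshold: float = 0.8) -> Tuple[Optional[str], List[Candidate]]:
    """Pick a query automatically if one exceeds the threshold.

    Returns (selected, ranked). When selected is None, the ranked list
    is what would be offered to the user for manual selection.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if ranked and ranked[0][1] > threshold:
        return ranked[0][0], ranked
    return None, ranked


# Hypothetical probability factors for "electrical conductance of titanium".
candidates = [
    ("keywords: electrical conductance, titanium", 0.35),
    ("concept: electrical conductance + compound: titanium", 0.9),
]

selected, ranked = select_query(candidates)
print(selected)

# With a stricter threshold nothing is selected automatically.
fallback, options = select_query(candidates, threshold=0.95)
print(fallback, [description for description, _ in options])
```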

To improve the relevancy of the results, all citations and abstracts have been indexed, and enriched with additional Elsevier keywords.

For text input analysis, the input string is tokenized and each token is annotated using

  • a proprietary tool for chemical entity recognition
  • a regular expression for third party registry numbers and InChIKeys
  • a list of author names
  • a regular expression for dates
  • a list of words that can be ignored, and
  • a large chemistry taxonomy.

Text input analysis also involves annotation clean-up. Query translation takes the annotated text input and returns an advanced database query string.
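The annotation resources listed above can be mimicked in a small Python sketch. Everything here is a stand-in: the word sets replace the proprietary entity recognizer, author list and taxonomy; the registry number pattern assumes the CAS Registry Number shape; the date pattern only recognizes four-digit years; and mapping registry numbers and InChIKeys to the "compound" class is an assumption. The function name annotate is hypothetical.

```python
import re
from typing import List, Tuple

INCHIKEY_RE = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")
REGISTRY_RE = re.compile(r"^\d{2,7}-\d{2}-\d$")   # assumed CAS RN shape
YEAR_RE = re.compile(r"^(1[89]|20)\d{2}$")        # years only, 1800-2099

# Tiny illustrative stand-ins for the real resources.
IGNORED = {"of", "the", "in", "a", "an", "for"}
COMPOUNDS = {"titanium", "caffeine"}
CONCEPTS = {"electrical conductance"}
AUTHORS = {"woodward", "eschenmoser"}


def annotate(text: str) -> List[Tuple[str, str]]:
    """Tokenize on whitespace and label each token or two-word group."""
    tokens = text.split()
    annotations = []
    i = 0
    while i < len(tokens):
        # Try a two-word concept before falling back to single tokens.
        if i + 1 < len(tokens):
            pair = (tokens[i] + " " + tokens[i + 1]).lower()
            if pair in CONCEPTS:
                annotations.append((pair, "concept"))
                i += 2
                continue
        token = tokens[i]
        lower = token.lower()
        if INCHIKEY_RE.match(token) or REGISTRY_RE.match(token):
            label = "compound"
        elif YEAR_RE.match(token):
            label = "date"
        elif lower in IGNORED:
            label = "ignored"
        elif lower in COMPOUNDS:
            label = "compound"
        elif lower in CONCEPTS:
            label = "concept"
        elif lower in AUTHORS:
            label = "bibliography"
        else:
            label = "keyword"
        annotations.append((token, label))
        i += 1
    return annotations


example = annotate("electrical conductance of titanium")
print(example)
```

A query translation step would then take such an annotated list and build the database query string; that step is not sketched here because the Reaxys query syntax was not described in the talk.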

ReaxysTree, a new chemical taxonomy, has been developed. Tree development started with looking at the Reaxys data structure (the field codes) and the database content. The tree was further extended using keywords from more than 46 million records from 16,000 journal titles. Synonyms and spelling variants were also added to each term. ReaxysTree contains more than 15,000 concepts with more than 40,000 synonyms. It is poly-hierarchically structured and organized in several chemistry-related facets. There are broad and narrow terms, each with a unique label ID, language, date, source, type, case sensitivity, etc. Elsevier’s taxonomy construction rules were followed. ReaxysTree is used for indexing the content in a continuous process of application and learning. After automatic indexing has been carried out, statistical analysis reveals new synonyms and additional terms, which are subjected to editorial work, and then fed back into ReaxysTree before further automatic indexing.
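A poly-hierarchical taxonomy with synonyms can be pictured with a minimal Python data structure. The class and field names, the label IDs and the four example concepts are hypothetical and are not taken from ReaxysTree; lookups are made case-insensitive for simplicity, whereas the talk noted that each real term carries its own case-sensitivity setting.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Concept:
    label_id: str
    label: str
    synonyms: Set[str] = field(default_factory=set)
    # IDs of broader terms; more than one parent makes it poly-hierarchical.
    broader: Set[str] = field(default_factory=set)


class Taxonomy:
    def __init__(self) -> None:
        self.concepts: Dict[str, Concept] = {}
        self._index: Dict[str, str] = {}

    def add(self, concept: Concept) -> None:
        self.concepts[concept.label_id] = concept
        for name in {concept.label, *concept.synonyms}:
            self._index[name.lower()] = concept.label_id

    def lookup(self, term: str) -> Optional[Concept]:
        """Resolve a label, synonym or spelling variant to its concept."""
        label_id = self._index.get(term.lower())
        return self.concepts[label_id] if label_id is not None else None

    def ancestors(self, label_id: str) -> Set[str]:
        """Collect all broader terms reachable from a concept."""
        seen: Set[str] = set()
        stack = list(self.concepts[label_id].broader)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.concepts[current].broader)
        return seen


tree = Taxonomy()
tree.add(Concept("C0", "properties"))
tree.add(Concept("C1", "electrical properties", broader={"C0"}))
tree.add(Concept("C2", "transport properties", broader={"C0"}))
tree.add(Concept("C3", "electrical conductance",
                 synonyms={"electric conductance"}, broader={"C1", "C2"}))

hit = tree.lookup("Electric conductance")
print(hit.label_id, sorted(tree.ancestors(hit.label_id)))
```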

The Ask Reaxys keyword search functionality is now prominently available at the top of the Reaxys query page (with a Google-like appearance). ReaxysTree can also directly be accessed, searched and browsed.

Award address

Finally, Bert gave his own presentation. In the last century, publications about total syntheses in journals would usually have full experimental details. A remarkable exception was the total synthesis of vitamin B12, considered a landmark in organic synthesis, involving the research groups of Robert Burns Woodward at Harvard, and Albert Eschenmoser at ETH Zürich. The synthetic target was actually cobyric acid, because this compound had already been converted to Vitamin B12.12 Hence total synthesis of cobyric acid would amount to a formal synthesis of Vitamin B12.

Image

There are two variants of the total synthesis, differing in their overall strategy of creating the corrin system (in red in the vitamin structure below).

Image

The variant collaboratively pursued closes the macrocyclic corrin ring between rings A and B (the “A/B variant”), while the synthesis accomplished at ETH achieves the corrin ring closure between rings A and D by a photochemical process (the “A/D variant”); the final steps toward cobyric acid were jointly carried out at Harvard and ETH, using material from the respective variants. Woodward reported on the A/B variant in lectures published in 1968, 1971, and 1973, culminating in the announcement of the total synthesis of the vitamin in his lecture13 at the IUPAC Conference in New Delhi, in July 1972. Eschenmoser discussed the ETH contributions to the A/B variant in his Centenary Lecture,14 published in 1970, and presented the approach to the photochemical A/D variant of the B12 synthesis at the 23rd IUPAC Congress in Boston, published in 1971 (http://e-collection.library.ethz.ch/eserv/eth:8691/eth-8691-01.pdf).15 A full report on the photochemical variant is given in a Science article16 which is an extended English translation of an article based on a lecture17 by Eschenmoser.

Seventy-seven postdoctoral students, but no Ph.D. students, worked on the project at Harvard between August 1961 and December 1975. Twelve Ph.D. students and 14 postdoctoral students worked on the project at ETH between September 1960 and August 1974. Research records consist of 67 postdoctoral reports (not publicly accessible) and 48 individual experimental procedures at Harvard, and 12 Ph.D. theses (publicly accessible), one diploma thesis (not publicly accessible) and 6 postdoctoral reports (not publicly accessible) at ETH. Unfortunately theses are not indexed in Chemical Abstracts and most European theses are not even covered by Chemical Abstracts or Dissertation Abstracts.

Bert found that a SciFinder search for “total synthesis of vitamin B12 (total synthesis of cobyric acid)” does not retrieve the significant publications by Woodward and Eschenmoser, although they can be found in Google. Web of Science finds Eschenmoser’s paper in Science16 and is the only database to find two papers in Chimia.18,19 (These references are just one short abstract about two talks given at a Swiss Chemical Society Meeting, so it is perhaps not surprising that they are not found in SciFinder.) Scopus finds Eschenmoser’s paper in Science,16 a paper in Japanese that is not found by SciFinder, although it is in Chemical Abstracts, and a paper by Wintner with recollections of ETH.20 Note the absence of papers by Woodward. Woodward’s 1972 lecture is found in Scopus and SciFinder if you include a space between “B” and “12.”

In summary, the primary publication record is incomplete: Harvard has no experimental details and ETH has them only in theses. In the secondary literature, too many publications are missed altogether and those that are covered are hard to retrieve. The tertiary publication record (plus the Web and Wikipedia) is often incomplete, and the change of paradigm exemplified by the two variants of the total synthesis is not recognizable.

In a lecture given at Wesleyan University on September 29, 1972, Woodward gave the only complete list of both Harvard and ETH co-workers made public so far; 50 Harvard postdoctoral students are not mentioned in other lectures published by Woodward. Only the Harvard-ETH A/B route is mentioned in this 1972 lecture; the A/D alternative is not. The tapes and slides of this lecture are no longer available, but Bert has a shorthand transcript by Eschenmoser’s secretary, Miss H. Gächter (now Frau Zass).

The ETH publication project started by Eschenmoser and the Zasses in 1979 is without precedent. It seeks to produce a high-quality, fully documented record of 2,123 man-months of work, 3,732 pages of postdoctoral reports and procedures, and 1,889 pages of Ph.D. theses. Bert started by applying for all the Harvard records and recording individual reaction steps; 75 out of 77 postdoctoral students are now covered; two Russians are not included, for lack of information. Bert showed pages and pages of reports, listings, strategies, and handwritten records. Attribution is given to scientists for the specific work they did, with detailed comments; everyone’s work should be acknowledged. The summaries for 238 compounds, with nomenclature and spectra, were at first recorded on edge-notched cards, in different colors to distinguish Harvard and ETH. Reaction pathways and flow charts for synthesis pathways and reaction details were drawn up. Compound-centered manuscripts and modular, standalone, experimental descriptions were produced. Literature references were listed and standards were drawn up for the data, solvents, reagents etc. All this work was patiently typed by Frau Zass and then retyped when corrections were needed.

Between 1979 and 1986, 599 handwritten pages were typed on a Remington typewriter and later retyped into Macintosh Word; 210 pages were processed on an Olivetti ETV 300 and later converted into Macintosh Word. From 1984 a computer graphics program was used. By 1983 the Harvard ring A/D work was finished; ETH rings B/C and D were finished in 1986. Bert had other work to do for ETH at this time, including most non-routine online searches, but the project was resumed in January 1990. The Olivetti work was converted in 1990. Corrections to the A/D seco-corrin records were completed by June 1991, and retyping of the Remington material was finished in February 1992. In 2006, work began to add theses to the ETH collection. By February 2009, the Macintosh Word 5.1 files were converted to Windows 97-2003. Unfortunately for the project, Bert and Eschenmoser had an enormous number of other commitments between 1984 and 2012, and the final steps carried out at Harvard and ETH are still missing. Bert is now resuming work on the project (in 1979, his first cheminformatics project) and he intends to finish it as a tribute to all the chemists involved, and in particular to Albert Eschenmoser.

Conclusion

The symposium was ably chaired by Andrea Twiss-Brooks. After Bert’s award address, Judith Currano, chair of the ACS Division of Chemical Information, formally presented the Herman Skolnik Award:

Image

References

  1. Bergstrom, T. C.; Courant, P. N.; McAfee, R. P.; Williams, M. A. Evaluating big deal journal bundles. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (26), 9425-9430.
  2. Are Chemical Journals Too Expensive and Inaccessible? A Workshop Summary to the Chemical Sciences Roundtable. Heindel, N. D.; Masciangioli, T. M.; Schaper, E. v., Eds.; The National Academies Press: Washington, DC, 2005.
  3. Galilei, G. Discorsi e dimostrazioni matematiche intorno à due nuove scienze. Elzevir: Leiden, The Netherlands, 1638.
  4. Boyle, R. The Sceptical Chymist. J. Cadwell for J. Crooke: London, England, 1661.
  5. Morgan, H. L. The generation of a unique machine description for chemical structures - a technique developed at Chemical Abstracts Service. J. Chem. Doc. 1965, 5 (2), 107-113.
  6. McAfee, A.; Brynjolfsson, E. The Second Machine Age. W. W. Norton & Company: New York, NY, 2014.
  7. Warr, W. A. Communication and communities of chemists. J. Chem. Inf. Comput. Sci. 1998, 38 (6), 966-975.
  8. Warr, W. A. Evaluation of an Experimental Chemistry Preprint Server. J. Chem. Inf. Comput. Sci. 2003, 43 (2), 362-373.
  9. Warr, W. A. Social Software: Fun and Games or Business Tools? J. Inf. Sci. 2008, 34 (4), 591-604.
  10. Velden, T.; Lagoze, C. Communicating chemistry. Nat. Chem. 2009, 1 (9), 673-678.
  11. Velden, T.; Lagoze, C. The extraction of community structures from publication networks to support ethnographic observations of field differences in scientific communication. J. Am. Soc. Inf. Sci. Technol. 2013, 64 (12), 2405-2427.
  12. Friedrich, W.; Gross, G.; Bernhauer, K.; Zeller, P. Synthesen auf dem Vitamin-B12-Gebiet. 4. Mitteilung Partialsynthese von Vitamin B12. Helv. Chim. Acta 1960, 43 (3), 704-712.
  13. Woodward, R. B. The total synthesis of vitamin B12. Pure Appl. Chem. 1973, 33 (1), 145-178.
  14. Eschenmoser, A. Centenary Lecture. (Delivered November 1969). Roads to corrins. Q. Rev., Chem. Soc. 1970, 24 (3), 366-415.
  15. Eschenmoser, A. Studies on Organic Synthesis. Pure Appl. Chem. Suppl. 1971, 2, 69-106.
  16. Eschenmoser, A.; Wintner, C. E. Natural Product Synthesis and Vitamin B12. Science 1977, 196 (4297), 1410-1420.
  17. Eschenmoser, A. Organische Naturstoffsynthese heute Vitamin B12 als Beispiel. Naturwissenschaften 1974, 61 (12), 513-525.
  18. Fuhrer, W.; Schneider, P.; Schilling, W.; Wild, H.; Schreiber, J.; Eschenmoser, A. Totalsynthese von Vitamin B12: die photochemische Secocorrin-Corrin-Cycloisomerisierung. Chimia 1972, 26 (6), 320.
  19. Maag, H.; Obata, N.; Holmes, A.; Schneider, P.; Schilling, W.; Schreiber, J.; Eschenmoser, A. Totalsynthese von Vitamin B12: Endstufen. Chimia 1972, 26 (6), 320.
  20. Wintner, C. E. Recollecting the Institute of Organic Chemistry, ETH Zürich, 1972–1990. Chimia 2006, 60 (3), 142-148.

Wendy Warr, Symposium Presenter and Reporter

It Takes Two to Tango: A Symposium Honoring Dana Roth

Chemistry Librarians Partnering with Publishers and Researchers to Advance the Chemical Sciences

“During almost 50 years of service to the chemical information profession, Dana Roth has clearly demonstrated the need for chemistry librarians and information professionals to work closely with researchers and publishers. In honor of his achievements, we present a day of accepted and invited presentations, highlighting the many ways that chemistry librarians and information professionals are working closely with publishers and researchers.” (Call for papers)

This symposium was organized by Judith Currano (University of Pennsylvania) and Ted Baldwin (University of Cincinnati) to honor Dana Roth in the best way possible. His dedication to building relationships among all the players in the chemical information “ecosystem” was emulated in all presentations. Talks from librarians described their relationships with their campuses, publishers and other information providers, and the broader community of librarians and information professionals. From the information provider side, we heard about improvements and growth that were directly influenced by valued interactions with the library community.

As the slides for all talks will be made available, the summary provided below is primarily from my own notes and thoughts, and does not exactly follow the order of the presentations – please consult the program listing for those details and abstracts (/node/621#W1a). Any mischaracterizations and omissions are solely errors on my part.

Understanding one’s campus is arguably the most important information a librarian can have. In recent years, campus libraries have borne the brunt of both reorganizations and budget cuts. Susanne Redalje (University of Washington) gave a poignant update on the closure of the campus's Chemistry Library five years ago. One important note was that with the loss of physical space also came the loss of both virtual space and organizational structure, both major hindrances to effective outreach. She stressed the importance of a physical presence in an increasingly virtual world, noting that this presence depends more on the librarian and the services offered than on the physical place itself.

However, campus feedback indicated that the library was missed, and there was a distinct lack of understanding as to why it was closed. In turn, this emphasizes that direct engagement and communication with users, and especially with administration, is critical to successful outreach. This includes staying abreast of campus developments such as new research centers or initiatives, hosting on-campus events (such as ACS On Campus), conducting surveys on services, and sending direct email notifications to patrons whenever possible. Finally, organizing on-campus services, as well as networking with the larger science librarian community through events such as the Science Librarian Boot Camps, has provided a structure and a “way forward” for improving and innovating services.

Ted Baldwin described several projects underway at the University of Cincinnati in response to input on meeting the needs of researchers. The Science Libraries are being reorganized to provide additional collaborative and study space, as well as a Research Commons with features including a data visualization studio and GIS capabilities. This involved significant relocation of print materials, which was done with “lots of consultation” with the affected researchers, emphasizing again the importance of clearly communicating changes. Another innovation is the creation of the “science informationist” position, working with everything from data management to digital repositories to best practices. Finally, fall 2014 will see collaborations with faculty “early adopters” of the campus digital repository and open journal systems, emphasizing the focus on responding to researchers, giving them what they need.

Understanding what your campus is doing is vital to both informing the services you provide and lobbying effectively for services that you need from information providers and publishers. Grace Baysinger (Stanford University) talked about the various needs of a campus, both in terms of education and research, and the importance of partnering with information providers to meet these needs. Campus visits, one-on-one meetings, and focus groups are just some of the ways to give feedback to providers, and one should take advantage of any opportunity presented. This will in turn improve offerings and services, including everything from content to metadata to training materials. Engaging with providers and offering substantive feedback is a highly effective method to truly “partner” with providers and serve your campus users.

From the provider viewpoint, two longtime stalwarts in the chemical information field, CAS and ACS Publications, provided insight into their development and relationship with librarians, pointing to Dana Roth as an influential partner and how librarians in general provide valued feedback.

Steve Hansen from ACS Publications, speaking in place of Sara Rouhi, echoed Grace’s sentiments, this time from the publisher side. He described ACS’ efforts to engage librarians in discussions and summits to address issues ranging from pricing to Open Access to metrics. As a nice side note, Steve relayed a story from Sara: a visit to Caltech was one of her first trips as an ACS Publications representative, and she spoke warmly of Dana’s kindness and hospitality.

Roger Schenck (Manager, CAS Content Promotions) told a brief history of CAS, including the evolution of the CAS REGISTRY, arguably CAS’s most “far-reaching development.” He emphasized that the adoption by libraries and engagement of users with its services, SciFinder Key Contacts in particular, have been instrumental to CAS’s growth and success. As a Key Contact, Dana has provided feedback throughout the years to improve and expand CAS’s services, including its databases and interfaces. Roger closed with some of Dana’s most notable contributions to the “chemical information enterprise,” including co-editing the volume “Chemical Information for Chemists” with Judith Currano and his significant activity on the CHMINF-L mailing list. Roger counted over 700 messages (and rising) sent by Dana since the list’s inception in 1991: yet another statement of his service to his fellow information specialists and his contribution to the field as a whole.

Returning to the academic realm, my presentation touched on a number of topics to show how Dana’s vision for service and librarianship at Caltech is being continued and expanded upon by his current colleagues. Among some of the projects described were innovations in awareness, the continuing relationship with Thomson Reuters by providing extensive feedback on Web of Science, and the growth of Caltech’s Institutional Repository, CODA (Collection of Open Digital Archives). Although the methods and technological tools available to libraries have changed throughout the years, we are still fundamentally doing the same things we always have been doing: serving the information and research needs of our patrons.

Taking both campus awareness and provider collaboration to a new level for patron service, Leah McEwen (Cornell University) gave an update on a data management project. There are many, many issues involved, including what to keep, how to document and annotate it, and how to store it, among others. Her collaboration with researchers at the Royal Society of Chemistry and the University of Southampton saw her embed with and observe chemical researchers on her campus. “People are already doing data management as part of their research,” she noted, “but a significant issue is a lack of standards and/or best practices particular to chemistry,” something her project hopes to help resolve. One goal will be to evaluate how an electronic laboratory notebook can be useful to streamline information needs, including linking to repository data as well as chemical safety considerations.

The presentations so far had touched upon interactions with campuses and companies, but what about our colleagues? How do librarians learn about these tools, and learn about how to communicate with their researchers? Because chemistry essentially has its own language, it may be hard for librarians without a background in chemistry to talk to researchers and understand research questions being asked. Judith Currano described the evolution of “Chemistry for the Non-Chemist Librarian,” a continuing education course offered through the Chemistry Division of the Special Libraries Association (SLA). Dana was an inaugural instructor of the course in 1999, along with Bartow Culp. At the time, the course was called “Chemistry and Chemical Librarianship for Non-Chemists”; it gradually expanded in scope after Judith and Sue Cardinal (University of Rochester) joined the instruction team. In addition to overviews of the main areas of chemistry, it covers information sources and tips for framing research questions and how to find out what your patron is really asking. One common thread was that the time slot of four hours is simply “not enough time” to explore the relevant topics. The course continues its success and influence, and was offered in 2013 outside of SLA for the first time.

Finally, Dana himself said a few words in gratitude to Judith and Ted for organizing such a wonderful slate of talks. He also thanked Caltech for being such a unique institution that allowed for much of his influential work to happen. Because of its small size and sharp focus, meeting the community’s needs still allowed for time and inspiration to pursue projects for improving service not only to the Caltech campus, but also to the profession as a whole. And this unwavering dedication to service, local and global, to students, librarians, and publishers, is what makes Dana such an extraordinary librarian and luminary to others in this field.

“Caltech Chemistry Librarian Discovers Equation for a Satisfying Career” Caltech News, 05/26/2011 http://www.caltech.edu/content/caltech-chemistry-librarian-discovers-equation-satisfying-career

Papers in Caltech CODA by Dana: http://authors.library.caltech.edu/view/person-az/Roth-D-L.html

Dana Roth’s Caltech Library Profile: http://libguides.caltech.edu/profile.php?uid=7047

Chemical Information Sources Wikibook: http://en.wikibooks.org/wiki/Chemical_Information_Sources

Chemical Information for Chemists: http://dx.doi.org/10.1039/9781782620655

Donna Wrublewski, Symposium Presenter and Reporter 

Exploring the Application of New Technologies in Chemical Research and Education

This session was somewhat unusual in the CINF program, with a focus on devices offering new mechanisms for processing and collecting chemical or scientific information. Sean Ekins from Collaborative Drug Discovery led off with a description of an iPad-based tool for analyzing TB data, using Bayesian statistical analysis to rank active molecules and probable targets for TB prediction and visualization. The resulting app can handle approximately 800 compounds and associated targets. While the app represents a niche use of informatics, it demonstrates that other datasets could also be “appified” to make data visualization and prediction tools available to a broad audience.

Rajeev Hotchandani from Scilligence discussed the use of a touch-optimized Electronic Lab Notebook (ELN). ELNs have been largely unused in academic environments, but this is beginning to change. Since Scilligence’s app is JavaScript-based, and because it makes use of the cloud, there is a very small footprint on the client. The app also integrates with Scilligence’s registration system, so entry of substances, and even ordering of chemicals, is seamless.

Image

Vin Scalfani from the University of Alabama described the use of 3D printers to print crystal structures. While 3D printers have been around for many years, the appearance of inexpensive printers in the last couple of years has enabled the technology to be used in academic labs. Scalfani described several software packages that could be used to convert CIF files into the .stl format recognized by 3D printers. The process uses Jmol to convert the CIF files into VRML, then AccuTrans 3D to convert VRML to .stl, and finally Netfabb to repair the .stl file. In order to make 3D files more readily available to users, Scalfani enlisted Bob Hanson at St. Olaf College to create a custom version of Jmol to automatically handle counterions and solvents, and to pack the crystal lattice, allowing the software to run in batch mode. Scalfani has created over 30,000 3D files and stored them in a crystal data repository hosted by the Royal Society of Chemistry. Scalfani also referred to a 3D print exchange hosted by NIH. (Photo credit: http://chemistry.ua.edu/3-d-printing-of-molecular-models/).
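For readers who have not met the .stl format mentioned above: it simply lists the triangular facets of a surface, each with a normal vector. The following minimal Python sketch writes an ASCII .stl description of a single tetrahedron. It is an illustration of the target format only, not the Jmol, AccuTrans 3D, and Netfabb workflow described in the talk; the function names and the solid name are assumptions made for this example.

```python
def facet_normal(a, b, c):
    """Unit normal of triangle (a, b, c), from the cross product (b-a) x (c-a)."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    return (nx / length, ny / length, nz / length)


def to_ascii_stl(name, triangles):
    """Return ASCII STL text for a list of triangles (three xyz tuples each)."""
    lines = [f"solid {name}"]
    for a, b, c in triangles:
        normal = facet_normal(a, b, c)
        lines.append("  facet normal {:.6e} {:.6e} {:.6e}".format(*normal))
        lines.append("    outer loop")
        for vertex in (a, b, c):
            lines.append("      vertex {:.6e} {:.6e} {:.6e}".format(*vertex))
        lines.append("    endloop")
        lines.append("  endfacet")
    lines.append(f"endsolid {name}")
    return "\n".join(lines) + "\n"


# A unit tetrahedron; vertices are ordered counter-clockwise when seen from
# outside, so every computed normal points away from the solid.
P0, P1, P2, P3 = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
TETRAHEDRON = [(P0, P2, P1), (P0, P1, P3), (P0, P3, P2), (P1, P2, P3)]

stl_text = to_ascii_stl("tetrahedron", TETRAHEDRON)
```

Writing `stl_text` to a file with the .stl extension gives something a slicer program can open. A real crystal structure would need thousands of such facets for its spheres and cylinders, which is why dedicated conversion and repair tools are used.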

Jeffrey Lancaster, Emerging Technology Director at Columbia University, described his approach at Columbia to envision the library as a neutral space where access to devices, software, and even software training modules complements the traditional access to content. Lancaster noted that while 3D printing was a highly visible technology to install in the library, “3D printing is not about the printing.” People are interested in the printing technology itself at first, but once the novelty has passed, the real question becomes how the printer can be used to help the researcher. In some cases, there may be nothing. In other cases, with the help of the expert, the researcher may find something useful that would justify purchase of a device to use in his or her own lab.

Julea Vlassakis, a graduate student at UC Berkeley and UC San Francisco, is part of an innovative group of graduate students who launched the website http://www.teklalabs.org/. The purpose of the website is to share designs for laboratory equipment that can be created by the end user on either an additive device (such as a 3D printer) or a subtractive device (such as a laser cutter), resulting in devices that are far less expensive to purchase and maintain than vendor equipment. The designs are peer reviewed and must be research grade. Safety is also a concern. Developing countries often receive used equipment from the developed world, and too often the equipment is impossible to maintain.

Teklalabs’ experience is that there are many willing “makers” in the world, and they are happy to share their experience and ideas, especially when there is a contest involved. Participation grew from 17 designs in a 3D printing design competition in 2012 to 174 designs in the PrintMyLab competition in 2014. The BuildMyLab 2 contest is scheduled for 2015. Teklalabs cosponsored a Diagnostics by Design Hackathon, which brought together 30 experts in global health, engineering, and computer science to build a prototype to address a global health need. The winning team produced a bug zapper that counted mosquitoes for malaria monitoring.

Image

Steve Feng, from UC Los Angeles, described several projects that involved creation of 3D printed attachments to smartphones in order to create portable instruments. His latest development involved a technique using Google Glass to record and analyze results from home HIV test kits, which are often misread by consumers. The technique involved capturing a picture of the test kit using Google Glass, applying an algorithmic enhancement of the low-quality Glass image, and then determining whether the result was positive or negative. Results were promising, and similar test procedures are under development for other visual test kits as well. While this experimental setup and workflow point out some of the limitations of Glass, they demonstrate the potential for more robust Glass-like devices. (Photo credit: http://www.universityofcalifornia.edu/news/google-glass-app-performs-instant-diagnostic-tests).

Manu Prakash, Stanford University, with a vision to equip labs and citizens in remote and poverty-stricken areas with tools to understand their environments better, described his approach to what he has called “frugal science.” He has created a microscope, the Foldscope, out of a single sheet of paper with a lens made in his lab. For under a dollar, a working microscope enables parents and children to see the difference between washed and unwashed hands, impure drinking water, and swimming areas. With this knowledge, the adults and children might be more motivated to acquire healthy practices. Equipping hospital labs is also a challenge, but Prakash is now focusing on the development of punch-card chemistry to assist in medical and environmental analysis. In the technique, a music box mechanism, normally used to play music, is instead used to inject droplets into microfluidic channels based on the program in the punch-card. In almost any CINF program, someone will allude to the historic use of punch-cards as a precursor to the digitized and automated databases we use today. The concept of punch-card chemistry brings that legacy medium back to practical use. Following the symposium, Prakash demonstrated the Foldscope.

I don’t remember ever hearing the word “inspiring” used to describe our CINF sessions, but that was the word used by a couple of those attending. Thanks to all of the speakers for a great symposium.

David Martinsen, Symposium Organizer

IUPAC Solubility Data Series

During the IUPAC 47th General Assembly, which occurred in 2013 in Istanbul, it was proposed that members of the Subcommittee on Solubility and Equilibrium Data (SSED) of the IUPAC Analytical Chemistry Division participate in the 2014 Fall American Chemical Society National Meeting in San Francisco as a venue to celebrate the publication in early 2014 of the 100th volume of the IUPAC-NIST Solubility Data Series. The Solubility Data Series (SDS, http://www.iupac.org/index.php?id=593) has been providing comprehensive compilations and (whenever possible) critical reviews of published data. This has been a major vehicle for helping IUPAC fulfill one of its long-range goals: international standardization of physical constants. The symposium was hosted by the Division of Chemical Information and cosponsored by the Divisions of Analytical Chemistry and History of Chemistry. The presentations were organized so as to highlight the practical importance and present relevance of the work being done inside the SSED framework on solubility data and stability constants.

Mark Salomon, the present editor-in-chief of the IUPAC-NIST Solubility Data Series, briefly reviewed the history of the project, which can be traced back to 1972 when A. S. (Stevan) Kertes proposed to the then IUPAC Commission V.6 a project on collecting and evaluating solubility data. Publication began in 1979 with the first volume on the solubility of helium and neon in liquids. Since then 102 volumes have been published. Besides the historical aspects, Salomon also outlined the process of data compilation and evaluation. In the compilation process all the available literature sources of data for a specific solute/solvent system have to be considered, even the very old ones. For example, solubilities published in the 19th century often compare favorably with values measured recently. Thus, solubilities of NaCl in H2O published in 1885 were found to be comparable to the best modern results. Sometimes such old values constitute the only source of data available. For each published paper, Compilation (data) sheets are built providing information on materials, experimental methods and errors. Where sufficient literature data exist, contributors to the SDS provide critical evaluations of the data to determine their merits. Data can be classified as Recommended when agreement between independent authors exists, Tentative when sufficient literature comparisons cannot be made but the data appear to be reasonable, or Rejected when qualitative or incorrect. The format of a typical Compilation sheet was presented as well as the general format for critical evaluations.
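The three evaluation categories can be read as a simple decision rule. The Python sketch below is only a toy illustration of that reading: in practice the classification is an expert judgment, and the function name, the 5% agreement tolerance, the extra "Unresolved" outcome for independent sources that disagree, and the sample numbers are all assumptions made for this example, not SDS criteria or measured data.

```python
from statistics import mean


def classify(values, quantitative=True, rel_tol=0.05):
    """Toy classification of solubility values reported by independent authors.

    values       -- one value per independent literature source (same units)
    quantitative -- False if the sources give only qualitative statements
    rel_tol      -- assumed relative tolerance that counts as "agreement"
    """
    if not values:
        raise ValueError("at least one literature value is required")
    if not quantitative:
        return "Rejected"        # qualitative (or known incorrect) data
    if len(values) < 2:
        return "Tentative"       # nothing to compare against
    centre = mean(values)
    spread = max(abs(v - centre) for v in values) / centre
    # Disagreement between sources is not covered by the three categories;
    # this sketch flags it for the evaluator instead of guessing.
    return "Recommended" if spread <= rel_tol else "Unresolved"


# Illustrative numbers in arbitrary units, not taken from any SDS volume.
print(classify([10.0, 10.2, 9.9]))   # Recommended
print(classify([10.0]))              # Tentative
```

The point of the sketch is that "Recommended" requires agreement between at least two independent sources, whereas a single plausible measurement can be at best "Tentative."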

Allan Harvey, the co-editor-in-chief of the Journal of Physical and Chemical Reference Data (JPCRD, http://scitation.aip.org/content/aip/journal/jpcrd) where the IUPAC-NIST Solubility Data Series has been published since volume 66 in 1998, described the link between NIST Standard Reference Data and the Solubility Data Series. The publication of SDS volumes as articles in the JPCRD replaced their earlier publication as monographs by Pergamon Press (volumes 1 to 53) and Oxford University Press (volumes 54 to 65). In their communication, Harvey and D. R. Burgess analyzed the fruitful cooperation between NIST and IUPAC from the perspective of the journal and in the context of NIST’s mission. Efforts are being made to make the data published in the journal more accessible and useful in the future. An effort to make the contents of the pre-1998 volumes available on the Web in a format that will be easily searchable by researchers was described.

Stuart Chalk (University of North Florida) spoke next on “REST API for the IUPAC Solubility Data Series: a ‘Skunkworks’ project.” The focus of his presentation was to show a way to make the data published by NIST available in a more web-enabled format. Chalk outlined a project to scrape data and metadata from pages of the IUPAC Solubility Data Series (http://srdata.nist.gov/solubility/) and make them available via a REST API on the author’s website. The data points/datasets will be published at unique REST URLs for referencing. Finally, multiple export options (HTML, XML, JSON, JSON-LD) are available to allow both human and software usage of the data.
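To make the idea of unique URLs and multiple export formats concrete, here is a minimal Python sketch using only the standard library. It is not Chalk's implementation: the base URL, the URL layout, the field names, the vocabulary address, and the placeholder value are all hypothetical, chosen only to show how one record could be given a stable address and serialized as JSON, JSON-LD, and XML.

```python
import json
import xml.etree.ElementTree as ET

BASE_URL = "https://example.org/sds"  # hypothetical host, not the real service


def datapoint_url(volume, system_id, point_id):
    """Build a unique, citable address for one data point (assumed layout)."""
    return f"{BASE_URL}/volumes/{volume}/systems/{system_id}/points/{point_id}"


def to_json(record):
    return json.dumps(record, sort_keys=True)


def to_jsonld(record):
    """Same data, with the URL as @id and an assumed vocabulary as @context."""
    doc = {"@context": {"@vocab": f"{BASE_URL}/terms#"}, "@id": record["url"]}
    doc.update({k: v for k, v in record.items() if k != "url"})
    return json.dumps(doc, sort_keys=True)


def to_xml(record):
    root = ET.Element("datapoint", attrib={"url": record["url"]})
    for key, value in record.items():
        if key != "url":
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


record = {
    "url": datapoint_url(1, "he-water", 3),
    "solute": "helium",
    "solvent": "water",
    "temperature_K": 298.15,
    "value": 0.0,           # placeholder, not a measured solubility
    "unit": "placeholder",
}

print(to_json(record))
print(to_jsonld(record))
print(to_xml(record))
```

Keeping one in-memory record and several serializers is one common way to serve both people (HTML) and software (XML, JSON, JSON-LD) from the same address.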

Glenn Hefter (Murdoch University, Australia) spoke about the work being done in IUPAC on the critical evaluation of stability (formation) constants of metal-ion complexes with inorganic and organic ligands in aqueous solution. He traced the history of projects in this area back to the 1950s with the creation of IUPAC Commission V.6 on Equilibrium Data. The present SSED of the IUPAC Analytical Chemistry Division was formed in 2000 when V.6 was re-combined with Commission V.8 on Solubility Data. Stability constants are important for modeling chemical speciation in areas as diverse as medicine, engineering, process control, extractive metallurgy, environmental management, and so on. In his talk Hefter provided an overview of the many contributions that have been made by the IUPAC group to the important task of compiling and critically evaluating the plethora of available stability constant and related thermodynamic data, which are widely dispersed across the scientific literature.

Johan Jacquemin (Queen’s University Belfast, UK) and William E. Acree (University of North Texas, USA) spoke about specific aspects related to the IUPAC projects they chair, “Progresses and prospects in the database on ionic liquids solubilities in molecular solvents” and “Models to evaluate experimental solubility data for crystalline nonelectrolyte solutes in organic mono-solvents and solvent mixtures,” respectively.

Clara Magalhães (University of Aveiro, Portugal) and Earle Waghorne (University College Dublin, Ireland) spoke about the need for reliable data that can help in the creation of new paradigms about the present impact of carbon dioxide in global warming, environment remediation technologies, and the effects of solvents on the thermodynamics of electrolyte and non-electrolyte solubilities, respectively.

Clara Magalhães, Symposium Organizer

The following overview of the IUPAC Subcommittee on Solubility and Equilibrium Data (SSED) is a handout that was made available at the CINF Symposium in San Francisco and kindly provided to the Chemical Information Bulletin by Clara Magalhães, Chairman of the IUPAC SSED.

 

The IUPAC Subcommittee on Solubility and Equilibrium Data (SSED)

Who are we?

Membership of the SSED is open to all scientists who wish to contribute. The current membership includes contributors from 21 countries spread over 4 continents.

Roles

The main roles of the SSED are the comprehensive compilation and critical evaluation of selected thermodynamic data, specifically:

  • solubilities of gases, liquids and solids in liquids and solids, and
  • stability constants for homogeneous reactions.

Topics

Topics range from those of pure scientific interest through to those of pressing environmental, medical and technological importance.

Examples of current projects:

  • Solubility of non-steroidal anti-inflammatory drugs in both neat organic solvents and organic solvent mixtures
  • Mutual solubility of rare earth metal (Sc, Y, Lanthanoides) bromides in molten alkali bromides
  • Database on solubility and liquid-liquid equilibria of binary mixtures of ionic liquids and molecular solvents
  • Critical evaluation of thermodynamic data of sulfate complexes in solution.

If you are interested in joining an existing project or proposing a new one, please contact us. Your expertise will be valued.

How to contribute?

New contributions are made through proposals addressed to the chairman of the SSED (Prof. M.C. Magalhães, mclara@ua.pt). We welcome proposals and suggestions for work on new projects from chemists everywhere.

Awards

Franzosini Award

The Solubility Data Commission, now the SSED, established the Franzosini Award to assist promising young contributors to the Solubility Data Project to attend SSED meetings and conferences. Since 1988, when the award was established, through 2014, the prize has been awarded to 20 recipients from 14 countries.

Outputs

Stability constants (critically evaluated data) are currently published as papers in the IUPAC journal Pure and Applied Chemistry (impact factor in 2013 of 3.1).

Most recent publication:

Powell, K. J.; Brown, P. L.; Byrne, R. H.; Gajda, T.; Hefter, G.; Leuz, A.-K.; Sjöberg, S.; Wanner, H. Chemical speciation of environmentally significant metals with inorganic ligands. Part 5: The Zn2+ + OH-, Cl-, CO32-, SO42-, and PO43- systems. IUPAC Technical Report. Pure Appl. Chem. 2013, 85(12), 2249-2311. http://dx.doi.org/10.1351/PAC-REP-13-06-03.

Solubility data (critically evaluated) are currently published as papers in the Journal of Physical and Chemical Reference Data (impact factor in 2013 of 3.2).

Currently 102 volumes (containing over 30 000 pages) have been published; others are in the pipeline. Editor-in-Chief of the Solubility Data Series: Dr. M. Salomon (marksalomon@comcast.net).

Most recent publication:

Acree, W. E. IUPAC-NIST Solubility Data Series. 102. Solubility of Nonsteroidal Anti-inflammatory Drugs (NSAIDs) in Neat Organic Solvents and Organic Solvent Mixtures. J. Phys. Chem. Ref. Data 2014, 43, 023102; http://dx.doi.org/10.1063/1.4869683

Electronic databases of stability constants and ionic liquids have been or are being developed.

Books

  • Chemicals in the Atmosphere: Solubility, Sources and Reactivity; Fogg, P., Sangster, J., Eds.; John Wiley & Sons: Chichester, U.K., 2003.
  • The Experimental Determination of Solubilities; Hefter, G., Tomkins, R. P. T., Eds.; John Wiley & Sons: Chichester, U.K., 2003.
  • Biomineralization: Medical Aspects of Solubility; Königsberger, E., Königsberger, L., Eds.; John Wiley & Sons: Chichester, U.K., 2006.
  • Developments and Applications in Solubility; Letcher, T. M., Ed.; RSC Publishing: Cambridge, U.K., 2007.
  • Thermodynamics: Solubility and Environmental Issues; Letcher, T. M., Ed.; Elsevier: Amsterdam, 2007.

Conferences

The SSED organizes the International Symposium on Solubility Phenomena and Related Equilibrium Processes (ISSP), which has been held every two years for over 30 years. The next symposium is planned for 2016, in Switzerland.

Additional information

For more information consult the IUPAC web page: http://www.iupac.org/nc/home/about/members-and-committees/divisions.html, and choose “Analytical Chemistry Division” and then “Subcommittee on Solubility and Equilibrium Data,” or contact Chairman Prof. M.C. Magalhães (mclara@ua.pt), or Secretary Prof. E. Waghorne (earle.waghorne@ucd.ie). 

Multidisciplinary Program Planning Group

The theme of the 248th ACS National Meeting in San Francisco, August 10-14, 2014, was “Chemistry and Global Stewardship.” The CINF division participated with a theme-related full-day symposium: “Nature’s Second Act: Revisiting Natural Products.”

The formal Multidisciplinary Program Planning Group (MPPG) meeting was held late Saturday afternoon, August 9, at the Hilton San Francisco Union Square. Lisa Houston welcomed all MPPG participants; division representatives introduced themselves; Richard Love was recognized for his years of service to MPPG as Staff Liaison; and the Dallas General Meeting minutes were reviewed and approved. Dan Daly was elected 2016 MPPG Chair and Christine McInnis was elected 2015-2017 MPPG at-Large Representative. Thematic program chairs gave their reports.

Robin Rogers from the University of Alabama, program chair for the San Francisco meeting, summarized the plenary speakers and the subthemes of the San Francisco thematic program, and mentioned a few notable division events, such as the 100th Anniversary of the ACS ENVR Division and its multiple symposia in San Francisco. He ended with a recommendation that future theme organizers be aware of subthemes and other competing programming.

Thematic program chair for the upcoming 2015 Spring ACS National Meeting in Denver, Robert Weber from Pacific Northwest National Laboratory, updated MPPG on the Denver theme: Chemistry of Natural Resources. Plenary speakers have been chosen:

  • Peter Kareiva, Santa Clara University and Chief Scientist for the Nature Conservancy
  • Dr. Paul Bryan, formerly VP at Chevron and formerly manager of the biomass program at the US DOE
  • Dr. Carolyn Koh, Department of Chemical Engineering at the Colorado School of Mines.

The Fred Kavli Innovations in Chemistry lecture will be given by Dr. Laura Kiessling from the University of Wisconsin-Madison. The Kavli Foundation Emerging Leader in Chemistry lecture has yet to be determined.

The theme for fall 2015, Boston, is Innovation from Discovery to Application. Thematic program chair, Rick Wagner from the University of Michigan, was not able to attend, but submitted his report. Plenary speakers will be:

  • Paula Hammond, MIT
  • Peter Schultz, Scripps Research Institute
  • Karen Wooley, Texas A&M

The Fred Kavli Innovations in Chemistry lecture will be given by Dr. George Whitesides from Harvard. The Kavli Foundation Emerging Leader in Chemistry lecture has yet to be determined.

The theme for spring 2016, San Diego, is Computers in Chemistry. Thematic program chair, Kenneth Merz from Michigan State University, updated MPPG on progress so far. Modeling will be the broad theme of the meeting with subthemes on computer-aided drug design, big data, nanomaterials with a focus on energy and sustainability, and multiscale modeling.

The theme for fall 2016, Philadelphia, is Chemistry of the People. Thematic program chair, Rudy Baum, retired Editor-in-Chief of C&EN, is just getting started and asked that anyone interested in contributing or helping to organize get in touch with him.

Spring 2017, San Francisco
Theme: Advanced Materials, Technologies, Systems and Processes
Thematic Program Chair: TBD

Fall 2017, Washington, DC
Theme: Chemistry’s Impact on the Global Economy
Thematic Program Chair: TBD

The MPPG meeting ended with a general discussion and vote on themes for 2018 and beyond. The following were approved by the MPPG General Representatives and will be sent to the Divisions for confirmation.

Spring 2018, New Orleans
Theme: Energy Solutions and the Environment (To Be Confirmed)
Thematic Program Chair: TBD

Fall 2018, Boston
Theme: Chemistry – From Bench to Market (To Be Confirmed)
Thematic Program Chair: TBD

Spring 2019, Orlando
Theme: Chemistry for New Frontiers (To Be Confirmed)
Thematic Program Chair: TBD

Roger Schenck, CINF Representative on MPPG

Image

image credit: http://www.acs.org/content/acs/en.html

Abstract submission deadline for CINF symposia at the Denver meeting: October 17, 2014