The United Nations General Assembly has designated 2014 the International Year of Crystallography, to commemorate the 100th anniversary of the discovery of X-ray diffraction and the 400th anniversary of the observation of the symmetry of ice crystals. To mark the occasion, we describe some of the crystallographic databases that may be of interest to the chemical information professional.
The Biological Macromolecule Crystallization Database, available at http://xpdb.nist.gov:8060/BMCD4/index.faces, contains data on 43,000 proteins, nucleic acids, and viruses. It is provided as a free service by the National Institute of Standards and Technology. Each entry includes identifying data such as protein function and the species it is derived from, as well as space group and unit cell data. It does not contain a complete crystallographic structure, but a bibliographic reference to primary literature is provided.
The Cambridge Structural Database (CSD), found at http://www.ccdc.cam.ac.uk/Solutions/CSDSystem/Pages/CSD.aspx, is provided by the Cambridge Crystallographic Data Centre. The CCDC was originally part of the Department of Chemistry at the University of Cambridge, but is now a separate entity. The CSD contains crystal structures of nearly 700,000 organic, organometallic, and boron-containing compounds; over 40,000 new structures are added per year. One has to pay to access the database, but submission of new structures is free.
CRYSTMET, available at http://www.tothcanada.com/databases.htm, is provided by Toth Information Systems. The database provides crystallographic and supporting bibliographic information on inorganic systems, with a focus on metals, alloys, intermetallic compounds, and minerals. Three-dimensional structures can be displayed, with a variety of plot options. One has to pay for CRYSTMET; there is a demo version, but that requires Materials Toolkit, which is not free.
The Database of Zeolite Structures, available at http://www.iza-structure.org/databases, is provided by the International Zeolite Association, free of charge. It contains structural information, bibliographic references, and powder diffraction data for zeolites. The database is designed to be searched by three-letter structural code (e.g. FAU for faujasite).
The Inorganic Crystal Structure Database (ICSD), which can be found at http://www.fiz-karlsruhe.de/icsd.html, is provided by FIZ Karlsruhe. It contains about 166,000 crystal structures for elements and inorganic compounds. The term "inorganic" is used strictly here: organometallic structures are not included. The ICSD is not free, although there is a 30-day demo version. It is available on STN, or at a dedicated, password-protected website.
MINCRYST, found at http://database.iem.ac.ru/mincryst/index.php, is provided by the Institute of Experimental Mineralogy of the Russian Academy of Sciences. As the name implies, this is a mineralogical crystallographic database. MINCRYST includes 8557 entries (some minerals have more than one entry), containing space group, unit cell parameters, atomic positions, and other data. It is provided free of charge.
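Unit cell parameters, which several of these databases report, are enough to compute derived quantities such as the cell volume. The following is a minimal sketch using the standard formula for a general (triclinic) cell; the function name and the units (ångströms and degrees) are assumptions made for illustration and are not taken from any of the databases described here.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general (triclinic) unit cell.

    a, b, c are edge lengths (assumed here to be in angstroms);
    alpha, beta, gamma are the inter-axial angles in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # V = abc * sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
    #               + 2 cos(alpha) cos(beta) cos(gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)
```

For a cubic cell the formula reduces to the cube of the edge length, which makes a convenient sanity check.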
NIST Crystal Data Standard Reference Database, described at http://www.nist.gov/srd/nist3.cfm, is provided by the National Institute of Standards and Technology. It contains data on “more than 237,671” organic and inorganic systems, including minerals, drugs, and pesticides. One has to pay for this database, but it appears to be fairly inexpensive as these things go: $490, as a one-time rather than per-year charge.
The Nucleic Acid Database, available at http://ndbserver.rutgers.edu, is a free service provided by researchers at Rutgers University. It contains three-dimensional structures of about 7000 DNA and RNA species, and their complexes with drugs and proteins. Structures from solution NMR are included, as well as those derived from X-ray diffraction. The site includes a rotatable 3D viewer.
The Pauling File, described at http://paulingfile.com, is provided by Japan Science and Technology Corporation (JST) and Material Phases Data System (MPDS). The Pauling File contains phase diagrams, crystal structures, and physical properties of elements and inorganic compounds. It includes 271,710 crystal structures. The Pauling File is commercially available in various formats from several vendors, including ASM, Materials Design, and Springer Materials, although some of these offerings are subsets of the complete Pauling File.
The Powder Diffraction File, found at http://www.icdd.com/products/pdf4.htm, is provided by the International Centre for Diffraction Data. An annual subscription is required. The inorganic database contains 340,653 structures; the organic database, sold separately, contains 479,278 structures. A minerals-only database is also available, at a lower price than the full inorganic database.
The Protein Data Bank (PDB), available at http://www.rcsb.org/pdb/home/home.do, is provided by the Research Collaboratory for Structural Bioinformatics. It contains structures of just under 100,000 proteins and nucleic acids. The PDB is available for free, and the site includes a rotatable 3D viewer. A “detailed view” gives such information as the space group, the primary structure (amino acid or base sequence), the resolution of the structure, and a citation to the primary literature.
Thus, several crystallographic databases are available to the information professional, either online or via shippable media. Some are free of charge; others require either a one-time fee or an annual subscription. Most specialize in some broad region of chemical space, such as biomolecules or inorganics. On that last note, the ACS Division of Inorganic Chemistry hosted "A Celebration of Crystallography in Solid-State and Materials Chemistry: Complex Problems and New Solutions in Inorganic Small-Molecule Crystallography" at the Spring 2014 ACS Meeting, a sign that research using this technique remains active.
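For readers who want to keep track of the access terms summarized above, one option is a small lookup table. The sketch below records only what this article states about each database; the variable and function names are hypothetical, "paid" marks cases where the article does not say whether the fee is one-time or recurring, and the terms may have changed since the time of writing.

```python
from collections import defaultdict

# Access terms as described in this article; "paid" means the article
# does not specify whether the charge is one-time or recurring.
ACCESS = {
    "Biological Macromolecule Crystallization Database": "free",
    "Cambridge Structural Database": "paid",
    "CRYSTMET": "paid",
    "Database of Zeolite Structures": "free",
    "Inorganic Crystal Structure Database": "paid",
    "MINCRYST": "free",
    "NIST Crystal Data": "one-time fee",
    "Nucleic Acid Database": "free",
    "Pauling File": "paid",
    "Powder Diffraction File": "annual subscription",
    "Protein Data Bank": "free",
}


def group_by_access(access):
    """Group database names by their access terms."""
    groups = defaultdict(list)
    for name, terms in access.items():
        groups[terms].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Print each access category followed by the databases in it.
    for terms, names in group_by_access(ACCESS).items():
        print(f"{terms}: {', '.join(names)}")
```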
David Shobe, Assistant Editor, Chemical Information Bulletin