Evolving roles of librarians in data management and curation

Bio: Kristin Briney recently obtained an M.A. in library and information science from the University of Wisconsin-Madison (May 2013) and shortly afterward started in a temporary position as Data Services Librarian at the University of Wisconsin-Milwaukee. She holds a PhD in physical chemistry from the University of Wisconsin-Madison (2010) and a B.A. in chemistry and computer science from DePauw University (2005).

Svetlana Korolev: Kristin, please accept my sincere congratulations on being the recipient of the 2013 Lucille Wert Scholarship awarded to you by the Division of Chemical Information. The award announcement says that you are planning to combine your advanced scientific background in physical chemistry with knowledge of library and information science to tackle challenges in the burgeoning field of data curation. What has brought you to the realization of such professional interest? Has it been triggered by a person, an event, or something else?

Kristin Briney: It has long been my goal to do something at the intersection of science and technology, but it wasn’t until recently that I figured out this was data curation. I was never a great laboratory scientist and much preferred hacking at my data, thinking about e-lab notebooks, and engaging other scientists to handling solvents. Happily, all of these interests align with issues in data management.

I found out about data curation when I was looking into career options after finishing my PhD. I was speaking with Emily Wixson, the now retired chemistry librarian at UW-Madison, about science librarianship and she mentioned this new growing area called data curation. At that point, everything started to fall into place. I really credit Emily for introducing me to this field and for giving me a push into this career by connecting me with the right people.

SK: Speaking of the “burgeoning field of data curation” and, coincidentally, cheering for Antony Williams, 2013 CINF Chair, on his recognition with the 2012 Jim Gray eScience Award for making chemistry publicly available through collective action via ChemSpider, let me ask you about the terminology of the field. Since this is an evolving methodology, which terms do you use when talking to chemical scientists for describing your services?

KB: This is a good question. The area in which I work is a fairly new space so the terminology is still settling and being spread outside of the community. I consider my work to be in “data management,” which I think is a more accessible way of saying “data curation.”

Data curation is related to eScience, in that they both focus on digital data, but there are distinct differences. Data curation is about the organization, management, and preservation of research data, while eScience concerns the new modes of scientific research, such as data mining, that can be done because data are now digital. Good data management enables better eScience, but I don’t actually do any scientific research. I’ve recently read an article, Thoughts on “eResearch”: a Scientist’s Perspective by Amanda Whitmire, that discusses the etymology of “eScience” and “eResearch.” It attributes the term “eScience” to John Taylor. If you are interested in eScience, I recommend the books Reinventing Discovery: The New Era of Networked Science by Michael Nielsen and The Fourth Paradigm: Data-Intensive Scientific Discovery edited by Tony Hey, Stewart Tansley, and Kristin Tolle.

Complicating this terminology is that there are other newish fields related to data curation, such as digital preservation, open data, and open notebook science. Digital preservation focuses on the retention of digital information, research data included, over the long term. Open data is the growing movement to disseminate datasets along with their published articles. Open notebook science takes this concept further by putting laboratory notebooks online in real time. The latter two terms fall under the umbrella of “open science.”

Part of what I like about data management is that it borrows so heavily from other fields, not limited to the ones listed here. Things are still in flux, which makes it an exciting time to be in this field.

SK: Kristin, you are the most recent graduate from the School of Library and Information Studies, University of Wisconsin – Madison (May 2013). What is the current state of the curriculum supporting campus research needs and someone’s specialization in the data management processes, especially focusing on natural sciences? Have you benefited from taking any courses on this subject? What advice could you give to individuals wishing to get educated on this matter?

KB: Data curation has been around long enough that there is some established coursework in the area, but very few whole programs. I know of two universities that offer MLIS degrees specifically in data curation: the University of Illinois at Urbana-Champaign and the University of North Carolina at Chapel Hill. Other universities offer one, some, or no data curation classes, depending on the program.

I went to a more traditional library school and had to create my own path through the curriculum. I had only one class in digital curation, but was able to piece together a useful program from classes both inside and outside of the department. Having a flexible advisor really helped.

Honestly, the most important part of my library school experience was working on data curation projects outside of the classroom. I spent a year embedded in a microscopy laboratory focusing on data management issues, a semester teaching information literacy and good data management practices, and another semester on a team building a data management system for a virtual reality laboratory. Each of these experiences added to my understanding of data management at a deeper level than I could ever get in the classroom. So if I had one piece of advice for anyone going into data curation, it would be to get as much hands-on experience as you can.

SK: Based on your job-hunting experience, have you noticed whether many libraries are recruiting for positions similar to Data Services Librarian? What are the common titles for such positions and the skills required of librarians in this new area? Were you able to find electronic discussion groups or other networking opportunities? Which professional societies could be relevant for your specialization?

KB: It is a really good time to be in data management because many universities are looking to establish data services. These jobs are listed under many names: Data Services Librarian, Data Librarian, E-Science Librarian, Research Data Librarian, etc. I have also seen quite a few “science librarian” positions in which a portion of the job responsibilities pertain to data curation.

Data curation jobs come with a variety of required skills, most often: an MLS, data curation experience, research experience or an advanced degree, various technical skills, communication skills, and project management experience. Beyond matching up skills in a job post, I also look to work in an environment that is open to new initiatives. There is no one standard way to address data problems, meaning that data managers must try new things (risking the occasional failure) and build incremental progress. Being in a supportive environment makes a big difference to the ultimate success of data services.

Now that I’m in a data curation position, it’s important for me to keep up with the field because new things are always coming up. I can’t recommend Twitter enough here. Twitter lets me network with peers in my field (many of whom are heavy Twitter users), keep up with the latest articles, and get immediate feedback when I’m stuck on a data problem. Besides Twitter, I subscribe to several listservs (acr-idgc-l, asis-l, sts-l, and chminf-l) and am looking forward to going to the Research Data Access and Preservation Summit this spring. There is really no one place to talk about data curation at the moment, so I’m always on the lookout for new forums for discussion.

SK: What are the major trends and opportunities for librarians to be involved in the lifecycle of the scholarly creation and management of data? At what stage of the research process do you think the librarians could contribute to this process? Could you envision possible new activities evolving in five years?

KB: I think that academic libraries are going to become much more involved with research data as data dissemination becomes a regular part of the scholarly process. I anticipate that in the near future we will be helping patrons find and cite research datasets in the way that we currently help them find and cite the data’s corresponding journal articles.

But to be the leaders in finding and citing data, we need to take part in solving the current data problem: disorganized and mismanaged data. It’s not a role that we have traditionally played, but it is an area where we have important skills: organization, documentation via metadata, and preservation. We risk losing our relevance as experts in these areas if we ignore the data problem entirely. Additionally, being involved early in the data management process lets us shape the data dissemination systems that we will soon be helping our patrons use.

SK: How are you embracing the first tasks in the newly created Data Services Librarian position at the University of Wisconsin-Milwaukee? What projects have you been working on recently?

KB: One of my big focuses is on outreach and education. In particular, I don’t think that we are adequately preparing students to manage data well once they become working scientists. I remember being incredibly frustrated by dealing with data when I was a grad student in chemistry and that’s an experience that I don’t want other students to have.

So data management training is one of the two services I’m starting up in my new position at UWM (the other is data management plan consultations). In addition to the formal sessions I’m planning, I’m hoping to have a lot of informal discussions about data management with students and faculty on campus. I have found that once you start talking with researchers they intrinsically understand the problems around data management, but they don’t necessarily know the solutions. That’s where I come in. It’s my goal to engage with and help as many researchers along as many avenues as I possibly can.

SK: Tell us something about yourself. Do you enjoy living in Wisconsin? What hobbies do you have?

KB: I have been living in Wisconsin for eight years now, but only just moved to Milwaukee. I have really enjoyed learning the culture of this state and the rich history of Milwaukee. At this point, I have been on more brewery tours than I can count and have come to believe that one should always live near a lake of some sort.

When I’m not thinking about data, I like to go out biking. That is wonderful for about half the year in Wisconsin, and for the other half I have wool and knitting. My other big hobby is blogging. My knitting blog has been around for over five years, and I just started a blog called “Data Ab Initio” that aims to demystify data management for researchers. The data blog has been a very fun project and has really helped my own understanding of many data management issues.

SK: Kristin, thank you very much for an inspiring discussion of the opportunities for librarians to get involved in data management and curation. Once again, congratulations on being the recipient of the 2013 Lucille Wert Scholarship and best wishes on your endeavors!

