Can you hack it? On librarian-ing at hackathons

I had the great pleasure of spending the last few days working on a team at the latest NCBI hackathon.  I think this is the sixth hackathon I’ve been involved in, but this is the first time I’ve actually been a participant, i.e. a “hacker.”  Prior to working on these events, I’d heard a little bit about hackathons, mostly in the context of competitive hackathons – a bunch of teams compete against each other to find the “best” solution to some common problem, usually with the winning team receiving some sort of cash prize.  This approach can lead to successful and innovative solutions to problems in a short time frame.  However, the so-called NCBI-style hackathons that I’ve been involved in over the last couple years involve multiple teams each working on their own individual challenge over a period of three days. There are no winners, but in my experience, everyone walks away having accomplished something, and some very promising software products have come out of these hackathons.  For more specifics about the how and why of this kind of hackathon, check out the article I co-authored with several participants and the mastermind behind the hackathons, Ben Busby of NCBI.

As I said, this was the first hackathon that I’ve actually been involved in as a participant on a team, but I’ve had a lot of fun doing some librarian-y type “consulting” for five other hackathons before this, and it’s an experience I can highly recommend for any information professional who is interested in seeing science happen in real time.  There’s something very exciting about watching groups of people from different backgrounds, with different expertise, most of whom have never met each other before, get together on a Monday morning with nothing but an often very vague idea, and end up on Wednesday afternoon with working software that solves a real and significant biomedical research problem.  Not only that, but most of the groups manage to get pretty far along on writing a draft of a paper by that time, and several have gone on to publish those papers, with more on their way out (see the F1000Research Hackathons channel for some good examples).

As motivated and talented as all these hackathon participants are, as you can imagine, it takes a lot of organizational effort and background work to make something like this successful.  A lot of that work needs to be done by someone with a lot of scientific and computing expertise.  However, if you are a librarian who is reading this, I’m here to tell you that there are some really exciting opportunities to be involved with a hackathon, even if you are completely clueless when it comes to writing code.  In the past five hackathons, I’ve sort of functioned as an embedded informationist/librarian, doing things like:

  • basic lit searching for paper introductions and generally locating background information.  These aren’t formal papers that require an extensive or systematic lit review, but it’s useful for a paper to provide some context for why the problem is significant.  The hackers have a ton of work to fit into three days, so it’s silly to have them spend their limited time on lit searching when a pro librarian can jump in and likely use their expertise to find things more easily anyway.
  • manuscript editing and scholarly communication advice.  Anyone who has worked with co-authors knows that it takes some work to make the paper sound cohesive, and not like five or six people’s papers smushed together.  Having someone like a librarian with editing experience to help make that happen can be really helpful.  Plus, many librarians have relevant expertise in scholarly publishing, which is especially useful since hackathon participants are often students and earlier career researchers who haven’t had as much experience with submitting manuscripts.  They can benefit from advice on things like citation management and handling the submission process.  Also, I am a strong believer in having a knowledgeable non-expert read any paper, not just hackathon papers.  Often writers (and I absolutely include myself here) are so deeply immersed in their own work that they make generous assumptions about what readers will know about the topic.  It can be helpful to have someone who hasn’t been involved with the project from the start take a look at the manuscript and point out where additional background or explanation would make it easier to understand.
  • consulting on information seeking behavior and giving user feedback.  Most of the hackathons I’ve worked on have had teams made up of all different types of people – biologists, programmers, sys admins, other types of scientists.  They are all highly experienced and brilliant people, but most have a particular perspective related to their specific subject area, whereas librarians often have a broader perspective based on our interactions with lots of people from various different subject areas.  I often find myself thinking of how other researchers I’ve met might use a tool in other ways, potentially ones the hackathon creators didn’t necessarily intend.  Also, at least at the hackathons I’ve been at, some of the tools have definite use cases for librarians – for example, tools that involve novel ways of searching or visualizing MeSH terms or PubMed results.  Having a librarian on hand to give feedback about how the tool will work can be useful for teams with that kind of a scope.

I think librarians can bring a lot to hackathons, and I’d encourage all hackathon organizers to think about engaging librarians in the process early on.  But it’s not a one-way street – there’s a lot for librarians to gain from getting involved in a hackathon, even tangentially.  For one thing, seeing a project go from idea to reality in three days is interesting and informative.  When I first started working with hackathons, I didn’t have that much coding experience, and I certainly had no idea how software was actually developed.  Even just hanging around hackathons gave me so much of a better understanding, and as an informationist who supports data science, that understanding is very relevant.  Even if you’re not involved in data science per se, if you’re a biomedical librarian who wants to gain a better understanding of the science your users are engaged in, being involved in a hackathon will be a highly educational experience.  I hadn’t really realized how much I had learned by working with hackathons until a librarian friend asked me for some advice on genomic databases. I responded by mentioning how cool it was that ClinVar would tell you about pathogenic variants, including their location and type (insertion, deletion, etc), and my friend was like, what are you even talking about, and that was when it occurred to me that I’ve really learned a lot from hackathons!  And hey, if nothing else, there tends to be pizza at these events, and you can never go wrong with pizza.

I’ll end this post by reiterating that these hackathons aren’t about competing against each other, but there are awards given for certain “exemplary” achievements.  Never one to shy away from a little friendly competition, I hoped I might be honored for some contribution this time around, and I’m pleased to say I was indeed recognized. 🙂

There is a story behind this, but trust me when I say it’s true, I’m the absolute worst at darts.

The Librarian’s First Dataset: A Treatise on Incredible Nerdiness

I must preface this post by saying that, if you didn’t know already, I’m a huge nerd.  The biggest.  There’s nothing I’m more passionate about than knowledge and learning, and this has often earned me very perplexed looks from people who probably think I’m crazy.  In this post, I’m going to wax poetic about knowledge and reveal the depths of my geekiness.  However, I’m guessing if you’re here reading this blog, this is probably not going to come as any sort of a surprise to you.

For the last few weeks, I’ve been working on planning a research data management class.  Working with researchers on their data is hands-down my favorite part of my job.  I adore science and the best part of being a medical librarian/research informationist is that I get to work with all different researchers and hear about all sorts of fascinating things.  Sometimes I regret that I didn’t get a science degree, but mostly I’m okay with it because this job allows me to get my hands into all sorts of different things and never have to choose a specialty. Talking to researchers is fascinating.  However, the more I talk to them, the more I realize that a lot of them really have no idea what they’re doing when it comes to data management.  These are brilliant people, to be sure, but the way they handle their data makes me cringe.  They’ve never been trained to do it properly, but as a librarian, I have that training.  Part of what I do is helping people with their data, but I also believe in the adage about giving a man a fish versus teaching him to fish.  I’m one librarian in a huge research enterprise.  As much as I’d like to, there’s no way I could possibly reach everyone to personally help them figure out their data.  So one of the things I decided to do to help mitigate the fact that I can’t be in eight million places at once is to offer a class on research data management.

Because I work in the field of medicine, in which everything must be evidence-based, of course I wasn’t satisfied just to offer a class and hope people liked it.  I am a data librarian, so I decided that I should probably gather some data!  My plan was to devise a pre-test that people would take before the class, then a follow-up post test.  Obviously the goal was that they wouldn’t know the answers to the questions on the pre-test, and then they would after the class. I spent weeks agonizing over how best to assess this. I’ve had very, very preliminary training in devising assessment instruments, but mostly I was just kind of taking a shot in the dark when I came up with my pre-test. I changed the questions a million times, but I finally came up with something that I thought would probably work.

Today, our office manager sent out the reminder email about tomorrow’s class to those who had RSVP’d.  The email contained a link to the survey and a brief explanation of why I was asking people to complete it.  It was a short survey that took only a couple of minutes to complete, but I had this sinking feeling that everyone would ignore it.  Because of IRB (Institutional Review Board) requirements, I had emphasized in the email that people weren’t required to take the survey in order to take the class.  I figured people would see that and just ignore the survey, but I was keeping my fingers crossed.  I was on the train to the airport in San Francisco on my way back to Los Angeles when I saw that the email had gone out.

So now, allow me to set the scene for one of the nerdiest moments of my life.  I had gotten to the airport and had some time to kill before my flight, so I was sitting in a wine bar getting something to eat (and drink of course).  I ordered a glass of Champagne (yeah, that’s how I roll) and pulled out my laptop.  I was logging on when the Champagne arrived.  I pulled up the survey site.  The email had only gone out maybe an hour or so earlier, so I wasn’t expecting any responses yet.  But when I logged on, you know what I found?  Almost EVERY SINGLE PERSON who had registered for the class had taken the survey!  When I saw the number of responses, I made an audible, astonished gasp, and several people in the restaurant turned and looked at me.  I refrained from getting up from my seat and jumping up and down in excitement, though this is what I would have done if I had been alone. 🙂

Not only did people respond to my survey, but they responded exactly as I hoped they would.  I won’t go into detail here, since obviously I’m going to attempt to publish all of this in a peer-reviewed journal.  🙂  But essentially, these pre-test results reveal that, as I had suspected, these people really need a lot of help with this stuff and don’t have a lot of knowledge of the many awesome resources out there.  Hopefully that will all change tomorrow when I teach this class.

So that is the story of how I came to have my very own research dataset.  This is incredibly heartening for me.  For one thing, I’ve always felt like I really ought to have more hands-on experience working with data if I’m going to teach it.  My dataset is super tiny compared to the datasets I help researchers with, but this is a good start.  More importantly, I am so excited that this actually worked.  I’ve been wanting to move forward with additional research in this area, but I wasn’t entirely sure if it was worthwhile, since I basically only had anecdotal evidence to suggest this kind of thing was needed, and there have been a few naysayers whose words weighed heavily on my mind.  I’ve worked really hard on all of this, and it’s been exhausting, especially with having to work around sort of a crazy travel schedule.  But now it feels like things are all falling into place.  All those little ideas I’ve had floating around in my mind about additional research I’d like to do feel a little more feasible now.  So it’s an exciting time for me career-wise.  Now that I’m a little more assured that I know what I’m doing, I have some good ideas about how to move forward. I’ve got a hunger for data and research now and I need more. 🙂

So yeah, again, probably news to no one, but I’m a huge nerd.  Now, in celebration, I’m going to order a second glass of Champagne to enjoy in the hour before I have to catch my flight.  Cheers!