So you think you can code

I’ve been thinking about many ideas lately dealing with data and data science (this is, I’m sure, not news to anyone).  I’ve also had several people encourage me to pick my blog back up, and I’ve recently made my den into a cute and comfy little office, so why not put all this together and resume blogging with a little post about my thoughts on data?  In particular, in this post I’m going to talk about coding.

Early on in my library career, when I first got interested in data, I was talking to one of my first bosses and told her I thought I should learn R, which is essentially a scripting language, very useful for data processing, analysis, statistics, and visualization.  She gave me a sort of dubious look, and even as I said it, I was thinking, yeah, I’m probably not going to do that.  I’m no computer scientist.  Fast forward a few years, and not only have I actually learned R, it’s probably the single most important skill in my professional toolbox.

Here’s the thing – you don’t have to be a computer scientist to code, especially in R.  It’s actually remarkably straightforward, once you get over the initial strangeness of it and get a feel for the syntax.  I started offering R classes around the beginning of this year and I call my introductory classes “Introduction to R for Non-programmers.”  I had two reasons for selecting this name: one, I had been using R for less than a year myself and didn’t (and still don’t) consider myself an expert.  When I started thinking about getting up in front of a room of people and teaching them to code, I had horrifying visions of experienced computer scientists calling me out on my relative lack of expertise, mocking my class exercises, or correcting me in front of everyone.  So, I figured, let’s set the bar low. 🙂  More importantly, I wanted to emphasize that R is approachable!  It’s not scary!  I can learn it, you can learn it.  Hell, young children can (and do) learn it.  Not only that, but you can learn it from one of a plethora of free resources without ever cracking a book or spending a dime.  All it takes is a little time, patience, and practice.

The payoff?  For one thing, you can impress your friends with your nerdy awesome skills!  (Or at least that’s what I keep telling myself.)  If you work with data of any kind, you can simplify your work, because using R (or other scientific programming languages) is faaaaar more efficient than using point-and-click tools like Excel.  You can create super awesome visualizations, do crazy data analysis in a snap, and work with big huge data sets that would break Excel.  And you can do all of this for free!  If you’re a research and/or medical librarian, you will also make yourself an invaluable resource to your user community.  I believe that I could teach an R class every day at my library and there would still be people showing up.  We regularly have waitlists of 20 or more people.  Scientists are starting to catch on to all the reasons I’ve mentioned above, but not all of them have the time or inclination to use one of the free online resources.  Plus, since I’m a real human person who knows my users and their research and their data, I know what they probably want to do, so my classes are more tailored to them.

I was introduced yesterday to Hadley Wickham, who is a pretty big deal in the R world, as he created some very important R packages (kind of like apps).  My friend and colleague who introduced me said, “This is Lisa; she is our prototypical data scientist librarian.”  I know there are other librarian coders out there because I’m on mailing lists with some of them, but I’m not currently aware of any other data librarians or medical librarians who know R.  I’m sure there are others and I would be very interested in knowing them.  And if it is fair to consider me a “prototype,” I wonder how many other librarians will be interested in becoming data scientist librarians.  I’m really interested in hearing from the librarians reading this – do you want to code?  Do you think you can learn to code?  And if not, why not?

17 thoughts on “So you think you can code”

  1. I am interested in coding, but feel like I would be overwhelmed because I only speak computer to an extent, and after that it feels like it is over my head. Also, not sure how open the hospital would be to me coding.

    • That’s an interesting point – I know from hospital library friends that hospitals are notoriously locked down when it comes to IT. My (admittedly uneducated) impression was that most of the issues had to do with connecting to outside servers, which makes sense since you wouldn’t want patient data getting out. Is it hard to get basic programs installed too?

  2. Can you give me a practical example of how you have used R in your library setting?
    In other words, please name some projects where you used R and it was better than using another program.
    Thanks very much,

  3. I’m glad you wrote this post Lisa! Hearing how easy it was for you to learn R and how useful you find it will inspire others to learn it. Echoing Kellee’s request: could you give an example of a specific project or type of data for which you used R?

  4. I would really like to learn it. Have you thought about giving a class at MLA? I am also a part of the South Central Chapter (SCC/MLA), and I would love to have you give a class at one of our chapter meetings.

    • I’m thinking of proposing a class for MLA 2017, and this post was partly intended to feel out whether people are interested in learning it. I love going to chapter meetings too!

  5. I think for future blog posts I’ll go into more detail about some specific things I’ve done with R (and Python, as I’m in the process of learning it). My work is a little different from “typical” library work since I’m an informationist and focus almost exclusively on data. I do a lot of my own library research, so I use R for data analysis. For example, I did most of the analysis and all the figures for this article in R: Really you can use it for anything you’d use Excel for. If you’re just tracking a simple spreadsheet, Excel is probably easier, but if you’re doing anything with charts, graphs, or visualization, R is far superior. It’s also really good for automating tasks you’d repeat over and over. Basically you can write the code once and use it over and over, so it ends up being way quicker and more efficient than pointing and clicking. I’ll try to think of other specific examples relevant to librarians for future blog posts! 🙂

    • Thanks for this blog post! I learned a bit of Python in library school, but I have to review it or use it every so often to retain the knowledge. Do you find it’s easy to retain your R skills – since you use it for projects – or does it require periodic reviewing?

      • I use R on a near daily basis, between using it for projects, doing consultations, teaching classes, and developing new classes. I can definitely see how you could get a bit rusty if you weren’t using it often. On the other hand, I really do think it’s a pretty straightforward language, much more so than Python, so once you have the syntax down, I don’t think you’d forget it, even if you maybe didn’t remember all the specific functions off the top of your head.

        I’m curious about your experience with Python in library school. Was that required, or just something you had an interest in?

        • It was part of a required course. The class was essentially an introduction to computing for those with no prior technical experience and Python was the language we learned.

  6. I write a lot of code. I’ve only played around with R, which I like, but which I haven’t yet needed for anything. On the server-side I mostly use PHP, although I’ve been getting more and more into Python (I like its object and data models a lot) and I sometimes end up writing some bash scripts. You can write pretty much anything in PHP. I’ve also written some pretty sophisticated XSLT for some XML parsing purposes, although it’s been a while. I write a lot of stuff using client-side web code, including some games. I’ve dabbled in a number of other languages, but mostly I end up writing web code of various types or writing scripts to work with different kinds of (usually library) data.

    • I LOVED your Zombie Emergency game demo at MLA in Austin. 🙂 Sounds like you’re doing some cool stuff – is your job as a library developer, or is this something you’re doing on the side?

      • Thanks! I’m basically a developer (my current title is Innovation Architect) and, I guess, utility infielder. I’d love to branch more into data, and I may get my chance, but most of my work in that vein to date has been data wrangling for catalog migrations, converting data formats, or parsing things like EZProxy logs. I’ve got some cool Kung Fu for that, but I haven’t really done much with researcher data sets. I’d like to play around with some novel storage types and metadata concepts (like graph databases and my emergent metadata model).

  7. Pingback: OTR Links 09/22/2015 | doug — off the record

  8. “…I’m not currently aware of any other data librarians or medical librarians who know R.” I think there may be more of us than you think! I taught myself R (via Coursera, and just plowing through it for a research project) after having used Matlab for 10+ years in my prior field. My buddy here at OSU Libraries, Steve, also programs in R. I think several of us #datalibs have been through the Coursera Data Science Certificate program (I’m in progress), but I don’t know how many of us use R as often as you do. I do intermittently, as I need it to do data analysis. I wish I had an excuse to use it more regularly!

    • Good, I had a feeling there were probably many of us coding out there and I just didn’t know who they were! I would suspect it’s probably more common among #datalibs, but I don’t know many other #medlibs who are doing much with R. Or if they are, it’s not something that I hear discussed much within our professional organizations.
