Scientific “artifacts” – #overlyhonestmethods and indirect observation

This week I’ve been reading the first half of Bruno Latour and Steve Woolgar’s book Laboratory Life: The Construction of Scientific Facts.  Like many of the other pieces I’ve been reading lately, this book argues for a social constructivist theory of scientific knowledge, which is a perspective I’m really starting to identify with.  What I’m finding most interesting about this book is the ethnographic approach the authors took to observing the creation of scientific knowledge.  Basically, Bruno Latour spent two years observing in a biology lab at the Salk Institute.  Chapter 1 begins with a snippet of a transcript covering about 5 minutes of activity in a lab – all the little seemingly insignificant bits of conversation and activity that, taken together, would allow an outside observer to understand how scientific knowledge is socially constructed.

The authors emphasize that real sociological understanding of science can only come from an outside observer, someone who is not themselves too caught up in the science – someone who can’t see the trees for the forest, as it were.  They even suggest that it’s important to “make the activities of the laboratory seem as strange as possible in order not to take too much for granted” (30).  Why, you may ask, should we need someone to spend two years in a lab watching research happen when the researchers are going to write up their methods and results in an article anyway?  The authors argue that “printed scientific communications systematically misrepresent the activity that gives rise to published reports” and even “systematically conceal the nature of the activity” (28).  In my experience, this is true – a great example is #overlyhonestmethods, my absolute favorite Twitter hashtag of all time, in which scientists reveal the dirty secrets that don’t make it into the Nature article.

I’ve been thinking that an ethnographic approach might be an effective way to approach my research, and I’m thinking it makes even more sense after what I’ve read of this book so far.  However, this research was done in the 1970s, when research was a lot different.  Of course there are still clinical and bench researchers who are doing actual physical things that a person can observe, but a lot of research, especially the research I’m interested in, is more about digital data that’s already collected.  If I wanted to observe someone doing the kind of research I’m interested in, it would likely involve me sitting there and staring at them just doing stuff on a computer for 8 hours a day.  So I’m not sure if a traditional ethnographic approach is really workable for what I want to do.  Plus, I don’t think I’d get anyone to agree to let me observe them.  I know I certainly wouldn’t let someone just sit there and watch me work on my computer for a whole day, let alone two years (mostly because I’d be embarrassed for anyone else to know how much time I spend looking at pictures of dogs wearing top hats and videos of baby sloths).  Even if I could get someone to agree to that, I do wonder about the problem of observer effect – that the act of someone observing the phenomenon will substantively change that phenomenon (like how I probably wouldn’t take a break from writing this post to watch this video of a porcupine adorably nomming pumpkins if someone was observing me).

This thought takes me back to something I’ve been thinking about a lot lately, which is figuring out methods of indirect observation of researchers’ data reuse practices.  I’m very interested in exploring these sorts of methods because I feel like I’ll get better and more accurate results that way.  I don’t particularly like survey research for a lot of reasons: it’s hard to get people to fill out your survey, sometimes they answer in ways that don’t really give you the information you need, and you’re limited in what kind of information you can get from them.  I like interviews and focus groups even less, for many of the same reasons.  Participant observation and ethnographic approaches have the problems I’ve discussed above.  So what I think I’m really interested in doing is exploring the “artifacts” of scientific research – the data, the articles, the repositories, the funny Twitter hashtags.  This idea builds on the concept I discussed in my blog last week – how systems can be studied and can tell us something about their intended users.  I think this approach could yield some really interesting insights, and I’m curious to see what kind of “artifacts” I’ll be able to locate and use.

If data sharing is difficult, what can it tell us? An Actor-Network Theory approach

In my ongoing adventures in science and technology studies readings, this week I’ve been reading The Social Construction of Technological Systems.  It diverges a little from my interests, strictly speaking, focusing more on the development of technologies than on the laboratory and clinical science I’m interested in, but I’m still glad I read it because it sparked some thoughts and ideas that could be interesting to pursue.

The portions of the collection that I read were rooted in social constructivist theory (as you might guess from the title of the book), specifically Actor-Network Theory (ANT).  The preface to the 25th anniversary edition explores some new developments in the field since the original edition, including “posthuman” approaches that consider nonhuman actants within social systems (xxv).  Scientific researchers operate within a complex system – not only because scientific research is itself often complicated, but also because science happens within a social system involving things like grant funding and scholarly articles and citations and so on.  Data play important roles in that system: as the raw product of scientific research, as evidence for scientific claims, and, now that many researchers operate in fields where data sharing is becoming more expected, as something of a commodity.  In ANT, actants can be nonhuman, so I think it would be reasonable to consider data an actant in the social network of scientific research, and potentially one of the more interesting parts of that network, even more so than the humans.

The other avenue this collection sent my mind down had to do with data repositories.  At the start of the chapter “Society in the Making: The Study of Technology as a Tool for Sociological Analysis,” Michel Callon argues that “the study of technology itself can be transformed into a sociological tool of analysis” (77).  Essentially, his thesis is that technological systems are created by what he calls “engineer-sociologists” – the designers or creators of the technology, who have had to transform themselves into sociologists and study the intended users in order to develop technologies that will meet their needs.  If this is true, then these new technologies should be able to tell us something about their intended users.

This chapter got me thinking about some of the systems that are in place for data sharing, like some of the major data repositories.  I won’t name any names, but there are a couple of very well-known data repositories that people often complain to me about when it comes to submitting their data.  In some labs, researchers have mentioned that they have one person who knows how to submit the data, and they all have to bug that person because they can’t figure out how to do it properly.  I’ve read some of the help documentation for some of these repositories, and those people weren’t complaining for nothing.  Many of these systems are a big pain – opaque in many of their requirements and onerous to use, yet many researchers are specifically required to put their data there because of grant or journal requirements.

So if we take Callon’s approach and view the system as a tool for sociological analysis, what does it say about the state of data sharing that some of these repositories are so difficult to use?  I can think of a few possibilities:

  • that the engineers haven’t really been in all that close of contact with the users, so they’ve built a system that doesn’t actually meet their users’ needs;
  • that the needs of the system administrators (good quality data with a minimal amount of effort on their part) are directly at odds with the needs of the data submitters (also a minimal amount of effort on their part) and the administrators’ needs won out;
  • that the engineers are aware of issues but there just isn’t money/time/resources to make the system easier to use.

Another possibility is that sharing data isn’t really that much of a priority for most researchers, so they go along with a hard-to-use system because it’s not worth the trouble to try to get it changed.  It’s sort of like how I find it a huge pain to deal with the DMV, but I only have to go there once every few years, so I’m not about to start a huge campaign to reform the DMV, especially when there are bigger problems our elected officials should be dealing with.  Maybe sharing your data in some of these systems is like that – an annoyance you deal with because you have to.

This is all entirely speculation on my part, but I do think it’s an interesting approach to take.  It would be interesting to sit down with some of the people who built or who currently run some of these systems and get the story on why things are the way they are.