p-hacking and science with an agenda

I recently read this post about p-hacking (see also: data dredging, fishing, snooping). Two things I found noteworthy were an interactive example of how p-hacking works and a description of an experiment in which different research teams analyzed the same data set:

 

“Twenty-nine teams with a total of 61 analysts took part. The researchers used a wide variety of methods, ranging — for those of you interested in the methodological gore — from simple linear regression techniques to complex multilevel regressions and Bayesian approaches. They also made different decisions about which secondary variables to use in their analyses.

 

Despite analyzing the same data, the researchers got a variety of results. Twenty teams concluded that soccer referees gave more red cards to dark-skinned players, and nine teams found no significant relationship between skin color and red cards.”

 

To reiterate, all of the methods used were justifiable. There was no fudging or fabrication of data. A group of skilled analysts sat down and came up with 29 defensible methods for analyzing the same data, and those methods gave different answers. To me, this is the stuff of existential crises. To quote the article, “[e]very result is a temporary truth”. Which I think is pretty concerning if you’re working in a situation where temporary truths don’t cut it.
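To get a feel for the mechanism, here is a minimal sketch in Python (my own toy simulation, not the 29 teams’ analyses and not the soccer data): every variable is random noise, yet an analyst who is free to choose among a handful of equally defensible outcome measures will report a “significant” effect far more often than the nominal 5% of the time. It assumes only numpy and scipy, and the sample sizes are hypothetical.

```python
# Toy p-hacking simulation (illustration only, not the soccer data set):
# every variable is pure noise, but an analyst free to pick among several
# defensible outcome measures "finds" an effect far more often than 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n, n_outcomes = 2000, 100, 6   # hypothetical sizes, chosen for illustration

false_positives = 0
for _ in range(n_studies):
    x = rng.normal(size=n)                  # predictor of interest (noise)
    ys = rng.normal(size=(n_outcomes, n))   # several candidate outcome measures (noise)
    pvals = [stats.pearsonr(x, y)[1] for y in ys]
    false_positives += min(pvals) < 0.05    # report a "finding" if any outcome gives p < 0.05

print(f"'Significant' result in {false_positives / n_studies:.0%} of null studies "
      "(nominal rate: 5%)")
```

With six independent candidate outcomes, the chance of at least one spurious hit is roughly 1 − 0.95^6 ≈ 26%. The many-analysts setting is different, since all 29 teams worked from the same data, but the arithmetic gives a sense of how much room analytic freedom leaves for results to move.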

 

Joshua Tewksbury is a biologist who spent 10 years as a professor at the University of Washington before moving to a position with the World Wildlife Fund. About a year ago, he wrote a post about transitioning to an NGO position where, he writes, “[s]cience shows up as just another wrench in the toolkit.” A deeply malleable tool, apparently. On the one hand, it’s troubling to think about making decisions with temporary truths. On the other hand, and this strikes me as almost heretical to type, if you deeply believe in your cause, maybe it’s not so bad to (ethically and with full disclosure) make subjective analytic decisions that advance it.

 

After thinking about it for a while, I’m still not sure how bad my crisis should be. In the first post, one of the project leaders is quoted as saying:

 

“On the one hand, our study shows that results are heavily reliant on analytic choices,” Uhlmann told me. “On the other hand, it also suggests there’s a there there. It’s hard to look at that data and say there’s no bias against dark-skinned players.”

 

At first pass, this didn’t help me. As somebody who takes comfort in certainty (and don’t most scientists?), the “squint at it” method of assessing data is an endless source of frustration. But I’ve also realized that we might feel confident about one other thing from the soccer data set: no team concluded that lighter-skinned players received more red cards. Maybe there are some relatively permanent truths; they just don’t answer the question we set out to answer.

Open data

I am a firm believer in open access. It is why I flout copyright and post a pdf of every paper we publish on this site. Damn backward publishers who make that illegal.

But open data…? That makes me a bit tense. It takes years to raise the money, recruit the people, and deal with the animal paperwork, the health and safety paperwork, the HR paperwork, the grant administration paperwork… all just to enable folks in the lab to generate novel data. We then have to sweat together through the arduous process of figuring out what to do, then doing it, then making sense of the outcomes. Where’s the justice if we just put the raw data out there, for free, for any passing dilettante who can’t be bothered to get their hands dirty…?

It is what it is. For two of Silvie’s papers (2010, 2011), the data are freely available in Dryad (2010, 2011), and we are in the process of posting the raw data for a third (2013). The 2010 data have been available for a few days and have already been downloaded 5 times. The 2011 data became available a while back; they have been downloaded 51 times.

Fifty-one times. Who are they? What are they doing with it? What have they learned? All very intriguing.

How to change science

I have just finished reading The Silwood Circle. It’s by an historian of science with a big interest in the philosophy of science. Despite that, I could hardly put it down. I found it riveting partly because I know the players involved, and partly because it is about putting math into ecology (and why that matters even though the models are largely heuristic). But mostly I could not put it down because the book is really about a bunch of men (all men) who set out to insert ecology into the heart of British science and The Establishment – and why they succeeded. I think it has lessons for young folk who want to change science – and for older folk who want the next generation to change science.

Silwood Park is a campus of Imperial College London. In the late 1960s, Richard Southwood and, slightly later, Bob May set out to use Silwood to transform ecology, particularly British ecology. By the time I came on the scene in the mid-to-late 1980s, they had done it. It was achieved by picking the right people (smart, ambitious, sociable) and opening doors (career opportunities, prizes) once those people performed (which the anointed ones did). But more importantly, it seems, it was done by putting together people who shared a common philosophy about how to do science but whose interests and specific expertise were complementary within the group. They drove each other forward (as big egos do), but as part of an us-against-them mentality, not a dog-eat-dog approach. And my sense is they laughed and argued and socialized as a group, something that really glued them together. Together, they rode 1970s environmentalism into the upper reaches of the British establishment. Much of it because they hiked together. It might all hinge on the hiking.

I have heard the criticism that the book fails to acknowledge what happened elsewhere in the world at the same time. That is perhaps a little fair. I also wonder whether the Southwood ambition and its associated narrative look rather clearer in retrospect. But to me, what is really missing from the book is an analysis of the impact of charisma. Several of the protagonists are (or were) among the most charming, forceful, articulate, erudite, stylish, visionary, self-confident, stimulating people I have ever met in science. Add to that potent mix their ability to unite previously disparate subjects like pesticides, parasitoids, parasites, predators, pathogens, public health, plants and a whole lot of other p-words like parties and pubs, and well, ka-Pow.

(Obese) man’s best friend?

Perverse as it may be, I actually quite enjoy the lonely hours in the windowless mouse room. I chew through BBC radio podcasts (The Infinite Monkey Cage included) and audiobooks. Indeed, a single experimental stint could generate a slew of book reviews and musings based on the content of these media. Here’s an example…

The other day, as the denouement of both sampling and the Freakonomics podcast approached, I heard something that made me think. The Robert Wood Johnson Foundation had convened a round-table discussion of economists, nutritionists, political scientists and public health experts, recorded for the podcast, to discuss solutions to the ‘obesity epidemic’. All ideas, however taboo, were admissible.

After discussing taxation, parental behaviour and stigmatization, the Harvard economist David Laibson piped up with the idea that if one could ‘consume a parasite that could not reproduce and…would…make the things that I eat ineffective as nutrients… I could have more of those…and the food companies could sell more food!’. Steve Levitt, an economics professor at the University of Chicago and coauthor of the Freakonomics books, then declared that after doing some research he came ‘to believe…that there are some pretty good tapeworms out there…If we could domesticate them…we could make tapeworms some of the most loyal and serviceable pets that would ever be out there’. This exchange made me smile and led to the following thoughts:

  1. The ghost of Weinstock was in the room! Laibson’s suggestion echoed the proposal that helminths can be harnessed to treat autoimmune diseases. Were these ideas at the root of his suggestion, however incorrect it might be*? Was this evidence that one of evolutionary medicine’s insights had been widely disseminated (forgive the wishful thinking)?
  2. That the idea was floated at all, along with recent positive coverage of fecal transplants, suggests that the notion that parasites are by their very nature ‘bad’ is breaking down. The ‘ick factor’ is losing its power. Is this due to a (re)appreciation of the importance of the natural world in the lives of humans, driven by high-profile coverage of the beneficial effects of gastrointestinal microbiota? Or does it reflect a belief that, as evidenced by agriculture and medicine, organisms can be so well controlled that we can harness even the most threatening?
  3. At the risk of allowing megalomania to go unchallenged, does it really matter why opinions are changing, if indeed they are? If pathogenic organisms become decoupled from disease in the imaginations of patients and policymakers alike, interventions that permit the maintenance of parasite populations, as opposed to their complete eradication, might become a reality. As we all know, maintaining a parasite population might reduce selection for resistant parasites, reduce the invasibility of a host by pathogenic bacteria, and reduce the huge expenditure involved in killing that One Last Parasite.
  4. What a wonderful Secret Santa present a tapeworm would be!

*My assessment of this idea is at a nascent stage. At first, I wondered whether foraging behaviour (also known as eating) would increase upon infection, to compensate for the increased energetic demands of the parasite and of the immune response it induces. I thought this likely because the energetic costs of excess foraging would be minimal in the presence of supermarkets. Thus, while parasite pets would enable the consumption of more food and capitalist glory, they might not aid in the process of losing weight. Rather, one’s weight might only stabilize.

That said, a completely non-exhaustive perusal of the literature revealed no evidence for increased foraging upon infection with gastrointestinal parasites. Indeed, it appears that calorie restriction may be the rule, at least in wild animals. Calorie restriction may be a voluntary (and perhaps adaptive) behaviour or a parasite-induced one (which seems counter-intuitive). Furthermore, gastrointestinal helminths both reduce the efficiency of nutrient absorption and increase energetic expenditure, achieving the holy grail of ‘eat less, do more exercise’, if you will. Perhaps it’s the next Atkins, then?

This all assumes that a non-reproducing tapeworm would even extract many resources from its host, of course…

How do we balance respect for individual privacy with the demands of epidemiological research?

Human disease epidemiologists love Scandinavian countries. Not only are they delightful sites for conferences (side trip to Norway’s fjords, anyone?), they also maintain fanatically detailed health records for their citizens that they are more than willing to share with scientists. These datasets offer amazing opportunities for disease modeling and allow breakthroughs that simply wouldn’t be possible with less complete data. It’s much harder to do this type of work in the US. Our health records are incomplete, decentralized, and difficult to access. And when datasets are identified that do offer opportunities to bypass these problems (Guthrie cards, for instance), legal maneuvering by privacy advocates can make meaningful studies untenable.

As a researcher, I am awed by the potential of detailed personal health data to revolutionize disease management. We’re now able to track influenza outbreaks by analyzing Twitter streams; imagine what we could do with unrestricted access to Guthrie cards, computerized health records, network data from Facebook, and GPS data from cell phones. But I also understand the perspective of the private citizen, who may or may not be informed about how their data are being used, and who may be subject to increased insurance premiums, job discrimination or public shunning if those data are not properly handled.

Very few people would feel that private testing of infants for easily treatable genetic disorders is a bad thing. Likewise, very few would be comfortable with full public disclosure of all genetic data, health records and spatial information. But between these two extremes lies a huge gray area in which we could respect individual privacy while allowing research to move forward. Public outreach and education, opt-in vs. opt-out programs and a firm commitment to privacy and transparency will increase public trust and facilitate scientific inquiry, but will this be enough? What kinds of safeguards and incentives do you think will be necessary to move public health research forward in the US?

Quality control


Delicious-looking health care (photo from Wikimedia Commons).

At CIDD lunch this week, we discussed this paper by Kumar et al. (2005). It’s a compilation of data from randomized clinical trials of cancer treatments in children conducted between 1955 and 1997, and it shows that, on average, new treatments are neither better nor worse than standard care.

Although scientists are not getting better at producing treatments that are more effective than standard care, the overall survival rate for children with cancer has been increasing since the 1950s, which suggests that standard care has changed for the better. During the CIDD lunch, someone referred to this improvement as “moving the goal posts” for cancer treatments. In my mind, this raises the question: how standard is standard care at any given time, for any given doctor, in any given hospital?

At the end of the discussion, Andrew touched on one of the problems with standard care, which is that ineffectual or even detrimental treatment regimes can become common because of misplaced confidence in anecdotal observation. An op-ed published last August in the New York Times suggested that the solution to this problem is to throw money at it – and, of course, medical researchers to spend the money. More specifically: take some small fraction of health care spending and put it towards testing current treatments.

Yet even if all treatments were well supported by data, the widespread and timely implementation of best practices is not assured, because the pipeline from the NEJM or the Lancet to your general practitioner’s office is long and leaky. Dr. Atul Gawande wrote about this problem in the New Yorker, also published this past August. As an example of the inconsistency in standard practices, Dr. Gawande described his mother’s knee replacement and the extensive variation between surgeons in all aspects of the procedure, from anesthesia to physical therapy. A solution to this problem, according to Dr. Gawande, is strict, centralized quality control, just as at the Cheesecake Factory and other chain restaurants.

Which brings me to my final question: how important is standard care? For double-blind, randomized clinical trials (i.e. science), it’s quite important to know that your control group is being treated consistently; but for patient care (i.e. medicine), I can see benefits, as well as costs, to multiple approaches.

NB: My experience with health care has been largely within the U.S., which certainly colors my perspectives.

Scientists to be jailed for failing to predict earthquake

Yesterday, seven Italian scientists were convicted of manslaughter and sentenced to six years in jail for, as many headlines asserted, failing to predict an earthquake that killed just over 300 people in L’Aquila, Italy in April 2009.

The formal details of the case are a bit more nuanced than the sound bites suggest, but as a scientist I still find the ultimate outcome disturbing. An “unidentified woman on Sky television” thought it was “just a tiny bit of justice so that it doesn’t happen again” (NY Times). I found it upsetting that individuals would feel comforted by dumping the burden of the deaths of hundreds of people on someone for not pinpointing a natural disaster, and ridiculous that they would think this would somehow get negligent scientists back on their game. What do people think that scientists can/should do?

These scientists sat on the country’s National Commission for the Forecast and Prevention of Major Risks. Given that the scientific consensus on the possibility of earthquake prediction seems to be something like “certainly not within a timeframe very useful for informing short-term evacuations,” what did these scientists say they could do?

There are many interesting discussion points surrounding this case, beyond just the fundamental scientific question about the extent to which different aspects of the world and life are predictable; my mind swirls with them. But I guess I’ll ask this mini-blogosphere directly: do you think this verdict sets a worrying precedent? Do you think it is likely to help or hurt more people, and how?

Other articles: 1, 2