There is no relationship between the type of food being sold in a neighborhood and obesity among its children and adolescents.

Sure thing, Sturm. But I suspect you wouldn't think certain neighborhoods are swamped with good food if you actually got out of your office and went to one of them. After all, what are you going to believe: a nice data set or your lying eyes?
Within a couple of miles of almost any urban neighborhood, “you can get basically any type of food,” said Roland Sturm of the RAND Corporation, lead author of one of the studies. “Maybe we should call it a food swamp rather than a desert,” he said.
"Food outlet data ... are classified using the North American Industry Classification System (NAICS)" (p. 130). Assuming validity and reliability of NAICS industry categories is quite a red flag. It means that if something is coded "445110," then -- poof -- it's a grocery store! What could make for easier analysis? But your 445110 may not be like my 445110. Does your supermarket look like this:
Well the NAICS says it does because they're both coded 445. New York is filled with bodega "grocery stores" (probably coded 445120) that don't sell groceries. You think this matters? It does. And the study even acknowledges as much, before simply plowing on like it doesn't. A cigarette and lottery seller behind bullet-proof glass is not a purveyor of fine foodstuffs, and if your data doesn't make that distinction, you need to do more than list it as a "limitation." You need to stop and start over.
Here's one way to do it: a fine 2010 Johns Hopkins study edited by Stephen Haering and Manuel Franco. They actually care about their data. Read the first page in particular for the problems of food-store categorization. It matters. And notice the sections titled "residents personal reflections on their local food environment" and "food store owners' attitudes regarding stocking healthy food." What a concept for researchers to actually talk to people! (The picture above is from this study.)
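To make the categorization problem concrete, here is a minimal sketch in Python. The two NAICS codes are the ones mentioned above; the store names and produce counts are invented purely for illustration and do not come from either study.

```python
from collections import defaultdict

# Hypothetical store records. Names and produce counts are invented;
# only the NAICS codes (445110, 445120) come from the post.
stores = [
    {"name": "Large supermarket", "naics": "445110", "fresh_produce_items": 250},
    {"name": "Corner store A", "naics": "445110", "fresh_produce_items": 3},
    {"name": "Corner store B", "naics": "445120", "fresh_produce_items": 0},
]


def count_by_prefix(records, digits):
    """Count outlets per NAICS code truncated to its first `digits` digits."""
    counts = defaultdict(int)
    for record in records:
        counts[record["naics"][:digits]] += 1
    return dict(counts)


def produce_range(records, code):
    """Return (min, max) fresh produce items among stores with this exact code."""
    items = [r["fresh_produce_items"] for r in records if r["naics"] == code]
    return min(items), max(items)


by_subsector = count_by_prefix(stores, 3)   # everything collapses into "445"
by_full_code = count_by_prefix(stores, 6)   # the full six-digit codes
spread_445110 = produce_range(stores, "445110")

print(by_subsector)
print(by_full_code)
print(spread_445110)
```

Even at the full six-digit level, the code alone says nothing about what is on the shelves. That takes something like the store inventories the Hopkins researchers collected.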
I find this so frustrating because so much quantitative analysis is so predictably problematic, over and over, again and again, in exactly the same way. Here's the mandatory (and then ignored) disclaimer (p. 134, emphasis added):
Possibly even more of a limitation is the quality of the ... business listings, although this is a criticism that applies to all similar studies, including those reporting significant findings.... More generally, categorizing food outlets by type tends to be insufficient to reflect the heterogeneity of outlets, and it is possible that more detailed measures, such as store inventories, ratings of food quality, and measuring shelf space, would be more predictive for health outcomes. Unfortunately, such data are very costly and time consuming to collect and may never exist on a national scale.

So let me get this right: because “all similar studies” use this flawed data, it’s OK? And because good data may be “very costly and time consuming to collect,” we’ll simply settle for what we have at hand? Bullshit!
You know, perhaps we never will have good data on a national level about what produce is sold in each and every store in America. I can live with that. But it is neither very costly nor time consuming to simply go into every store in any one neighborhood and see what is there. Do a spot check. Or at least read and learn from the Johns Hopkins study. I just found it on Google without even trying. They managed just fine. And if a corner store sells three moldy heads of iceberg lettuce and some rotting root vegetables, it is not the same as Whole Foods simply because they're both coded 445!*
Ironically, An and Sturm may still be right about their conclusions, but more by accident than design. Maybe the focus on food deserts is barking up the wrong tree. Perhaps obesity is not caused primarily by lack of access to good food. Maybe people do not want to eat healthy foods. Or maybe people simply don't know how to cook. Maybe we need to bring back Home Ec. I don't know. Certainly, I think we can agree, culture matters. But quantitative people don't like looking at culture because it's so hard to count. And who has the time to do time-consuming ethnographies when we've all got to get our name on as many co-authored quantitative peer-reviewed journal articles as possible?
There actually is (or was?) an excellent produce store in Baltimore's Eastern District, Leon's Produce. Conveniently it was right by a busy drug corner. Talk about one-stop shopping! Seriously, as a cop, I could suppress the corner drug market and buy onions and carrots. And yet people would indeed pass up this local family-run store to buy a cheesesteak or yakomee.
Maybe the problem is intense neighborhood isolation. Drawing a geographic circle around somebody and saying a grocery store is "close enough" may not matter if you've never left your neighborhood, don't have access to a car, or are afraid to walk down the block. Speaking of cars, Sturm also uses CHIS data in which "Only 3% of households ... report not having access to a car."
Well there's another red flag.
What does "access" mean? I suspect to some it is gathering $10 for a gypsy cab or knowing somebody who may let you borrow their car in an emergency.
The authors acknowledge the limitations of CHIS data, and then go right on using it: "The response rate ... remains low, and the current study sample has a large proportion of missing values" (30%, in fact!). If you're looking at the problems of poverty in America and believe data that say 97% of people have access to a car, you've got your head up your ass.
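Here is a rough sketch of why missing data like this matters when the people who don't answer differ from the people who do. Every number below is made up for illustration. None of it is an estimate of CHIS or of any real population.

```python
def observed_share(n_without, n_with, response_without, response_with):
    """Share of respondents without a car, plus the overall response rate.

    n_without / n_with: households without / with car access (hypothetical).
    response_without / response_with: fraction of each group that responds.
    """
    resp_without = n_without * response_without
    resp_with = n_with * response_with
    respondents = resp_without + resp_with
    share = resp_without / respondents
    rate = respondents / (n_without + n_with)
    return share, rate


# Invented scenario: 20% of households truly lack a car, but they respond
# far less often (30%) than households with a car (80%).
true_share = 200 / (200 + 800)
share, rate = observed_share(200, 800, 0.30, 0.80)

print(f"true share without a car:     {true_share:.1%}")
print(f"observed share in the sample: {share:.1%}")
print(f"overall response rate:        {rate:.1%}")
```

In this invented scenario the survey would report less than half the true share of car-less households, and nothing in the responses themselves would reveal the gap.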
And if you have bad data, it doesn't matter what fancy quantitative methods you use. It's putting lipstick on the damn pig of correlation. Garbage in, garbage out:
The primary dependent variables (i.e., counts of food consumption) are regressed on the explanatory variables using negative binomial regression models, a generalization of Poisson models that avoids the Poisson restriction on the mean-variance equality.

Wow! Negative binomial Poisson regression models to avoid the mean-variance equality restriction. I (to my shame) no longer have any idea what that means, even though Poisson regressions were all the rage when I was in graduate school. But I do remember the fatal flaw of non-random missing data.
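For what it's worth, the mean-variance point can be sketched in a few lines. A Poisson model assumes the variance of the counts equals their mean. A negative binomial model, in its common "NB2" form, allows the variance to be the mean plus a dispersion term, mu + alpha * mu**2. The counts below are invented, and the `alpha` parameterization is my assumption. The quoted passage doesn't say which form the authors used.

```python
from statistics import mean, variance

# Hypothetical counts (say, fast-food meals per week for ten people). Invented.
counts = [0, 0, 1, 1, 2, 2, 3, 5, 8, 18]

m = mean(counts)           # sample mean
v = variance(counts)       # sample variance (n - 1 denominator)
dispersion_ratio = v / m   # roughly 1 if the Poisson assumption holds


def nb2_variance(mu, alpha):
    """Variance implied by the NB2 form; alpha = 0 reduces to Poisson."""
    return mu + alpha * mu ** 2


# Method-of-moments guess at alpha: solve v = m + alpha * m**2 for alpha.
alpha_hat = (v - m) / m ** 2

print(f"mean = {m}, variance = {v:.2f}, ratio = {dispersion_ratio:.2f}")
print(f"alpha_hat = {alpha_hat:.2f}")
```

With counts this lopsided the variance is several times the mean, which is the situation the negative binomial model is meant to handle. None of that fixes bad input data.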
I'm not against quantitative methods. I'm against bad research.
And I also believe you need to talk to the people you're studying no matter what methods you use. I don't trust your study on poverty if you've never talked to a poor person. I don't trust your research on police if you've never talked to a cop. I don't trust your research on crime if you've never talked to a criminal. Nor do I trust your research on obesity if you don't talk to a fat person. And if you're going to write about food deserts, you'd better talk to some people who live in one. If you're not careful, you may learn something before it's done. Once you quant-heads actually talk to the people you're studying, then you can go ahead and run all the regressions you want.
*Update (April 29): As one commenter pointed out, a Whole Foods is not coded the same as a corner store (because the Whole Foods is larger). Indeed. But you still get my point.
And here's a picture of a corner "deli-grocery" in Crown Heights, Brooklyn (NYC):
It was in the Daily News because 14 were arrested for running a drug ring from it. I strongly suspect it wasn't a good place for quality groceries.