[Update: also see Nick Selby's take on this. And David Klinger's]
How do we get data on police-involved shootings?
Trick question. We don't! A few departments, like the NYPD, issue great annual reports on shots fired by police. But other than that, we don't know. We don't know how many people cops shoot. So at best we're left with those shot and killed by police. And that's probably less than half of those shot by police.
When academics call for "more study," it's usually a cliché. But the need here is real. We don't know how many Americans get shot by police each year? Are you effing kidding me?!
Given that, there's nothing wrong with using the best data you have. And I'm partial to using the Washington Post data myself. But that doesn't mean the data are good. (By good I mean valid, in that they show what they claim to show.) (I also have a bias problem with their "ticking" counter, like last year's shootings numbers are still going up. No, dude, every time I click on 2016 data, it's going to go to 963. You're not actually compiling the data on the spot.)
1st question: Is the basic number of people shot and killed by police correct?
Answer: Probably. It's an unknown unknown, but we have a lot of reasons to think most killings are here.
2nd question: Is their coding correct?
Answer: Depends on what you want. For race, probably. For threat, probably not. The data might be "reliable" (you might get the same code if you did it again). But what does "threat" labeled "other" mean? And how is that different from "undetermined"?
Others have pointed out to me that reporters don't have the expertise to judge what experienced police officers are trained to see. There's a great deal of truth to that. But more importantly, is the Post categorization valid? We don't know.
Say somebody gets killed on the street. How does that data get to us?
Well, in the traditional manner -- going from the street to the Uniform Crime Report (UCR) -- usually somebody calls 911 'cause a crime happened. Some young officer shows up and takes a report. This is a local form, for a local department, not at all coded to the standards requested by the UCR (Hispanic data is a key issue here). The cop writes a report that is collected by their sergeant toward the end of their shift. It's well enough written, so it goes up to some supervisor and then to some police data consolidator and then, once a year, to the FBI.
At each stage it might get "cleaned up" a bit, as needed. And then, 9 to 21 months after the incident occurred, it gets published in the UCR index or Part I (or II) crimes. I've actually been able to check individual incidents I handled, later, in the UCR data. It checked out. All the facts were basically correct.
But you only know what the UCR tells you, and it isn't much. Nevertheless the UCR is considered the "gold standard" of crime data. But it sure ain't perfect. And it's particularly bad when it comes to police-involved shootings. Mostly because most departments simply do not report data on those killed by police.
Because of that, after Michael Brown in Ferguson, the Washington Post (and to a worse extent the British Guardian) said "we're going to start counting." Good on them, because nobody else was. [As was pointed out to me, and I should have mentioned, killedbypolice was doing it first.] They use whatever they can, which means Google searches of news accounts, basically.
So a cop (or criminal) shoots somebody. Some local reporter (most likely) with a police scanner goes to the scene and files a report. People don't get killed by police that often. It bleeds, so it leads.
That reporter either does or does not do a good job. They gather some of the information that seems relevant. But since they weren't there, they don't really know what happened. It's called an investigation. Who do you believe? The cops say the guy was armed; his family says he wasn't. Reporters file a story and then the Washington Post has to decide if the guy was armed. Usually (for good reason) they go with the cop's version. But what if the cop is lying? Isn't that the crux of the matter? Even if it doesn't happen much, how would we know? Of course high-profile cases get more investigation.
Which system is better? Neither. Both. It depends. But no existing data-gathering system is universal, mandatory, or really gets to the context of the incident.
But then even more, there's the subjective recording of data.
Miscoding threat level
The Washington Post labels a threat as either "attack," "other," or "undetermined." That's an odd trichotomy. Police care if a shooting is "justified" or not (aka "good" or "bad"). Courts care if it was criminal or not. The public may care if it was "necessary" or not. These are all different standards. But how can one tell 3rd-hand if a shooting was "good"?
The article's authors equate "other" with non-attack. This is wrong.
Take Paul Alfred Eugene Johnson, who robbed a bank with replica guns.
He forced the bank employees into the vault at gunpoint, told them he would kill them if they called police, and stole cash, police said shortly after the robbery.
There was a crazy chase. Johnson got out of his car and officers opened fire. I wasn't there, but I'm willing to call this a justifiable shooting. The threat level in the data is coded as "other," but in the journal article this gets recoded to "non-attack"? Come on, now.
Surveillance images from both robberies show someone dressed in similar-looking white hooded sweatshirts and carrying guns in their left hands.
Kevin Allen charged at officers with a knife. Kaleb Alexander had a gun he wouldn't drop. Troy Francis chased his wife and roommate with a knife, and then charged at responding officers. Hashim Abdul-Rasheed, previously not guilty by reason of insanity in an attempted murder case, tried to stab a Columbus, Ohio, police officer and was then shot and killed. Markell Atikins was wanted for the death of a 1-year-old, and then threatened officers with a knife. Tyrone Holman threatened to kill officers with a rifle and a grenade. Joseph Tassinari told an officer he was armed (he was) and then reached for his waistband. Harrison Lambert threatened his father with a knife before officers responded.
What do all these cases have in common (along with mental illness in most of them)? They're all categorized as "other" in the threat department. I don't fault the Washington Post for how they categorize. They may not have proof of attack beyond an officer's (self-justifying) account. I wish they did better, but they do what they need to do. (And nobody is doing better.) I do fault others who then group all these "others" into "non-attack" (n = 212), implying the cops did wrong.
I'm more curious about the label of threat called by the Post: "undetermined" (n = 44). Many of the potentially worst shootings are in this category. And yet: "Cases involving an undetermined threat level were excluded from multivariate regression models." I'm not certain why. Couldn't you go one by one and look at them? Isn't that what researchers do? I looked at a few.
The Post says Robert Leon:
exchanged gunfire with police, stole another car at gunpoint and fled.
He was first accused of shooting at cops and then shooting himself. This account seems simply to be not true. Further investigation may have revealed that Leon didn't have a gun and died from police bullets. I wasn't there. I don't know. But it sure seems like an odd one to me.
The "unarmed" issue
If you're looking for bad shootings, "unarmed" sure seems like a good place to start. But it's not enough. "Unarmed" is a flag, but it is no guarantee that a suspect isn't a lethal threat. Officers have been and will be attacked and killed by "unarmed" suspects.
Some of these cases, like white officer Stephen Rankin killing unarmed black William Chapman, resulted in the officer's criminal conviction. The Washington Post codes Chapman as attacking the officer. The jury may not have thought so.
The problem here, one the researchers seemed to have, is that if you look at "unarmed" suspects and those categorized as "non-attack" (the ones that people are most concerned about) you don't have a large enough n (number of cases) to do statistical analysis.
In 2015, you'd be down to a grand total of 50 people shot and killed by cops. It's enough for an outrage of the week, but you can't do much data analysis with 50 cases. And if you were to use "undetermined" rather than "other" as meaning "non-attack" (I think a better but still horribly flawed categorization) you'd be down to a total of 9 cases.
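To make the recoding problem concrete, here is a rough Python sketch of how the count of "unarmed and non-attack" cases swings depending on which threat label you decide means "non-attack." The records are toy rows invented for illustration, not real cases, and the field names ("armed", "threat_level") and the helper function are my assumptions about how such data might be laid out, not anything taken from the Post or the journal article.

```python
from collections import Counter

# Toy records, invented for illustration only; these are NOT real cases.
# Field names "armed" and "threat_level" are assumed for this sketch.
records = [
    {"armed": "unarmed", "threat_level": "other"},
    {"armed": "unarmed", "threat_level": "other"},
    {"armed": "unarmed", "threat_level": "undetermined"},
    {"armed": "unarmed", "threat_level": "attack"},
    {"armed": "knife", "threat_level": "other"},
    {"armed": "gun", "threat_level": "attack"},
    {"armed": "toy weapon", "threat_level": "other"},
]


def count_unarmed_non_attack(rows, non_attack_labels):
    """Count unarmed cases whose threat label is treated as non-attack.

    `non_attack_labels` is the analyst's recoding choice, which is
    exactly the subjective step under discussion.
    """
    return sum(
        1
        for row in rows
        if row["armed"] == "unarmed"
        and row["threat_level"] in non_attack_labels
    )


# Recode 1: treat "other" as non-attack.
by_other = count_unarmed_non_attack(records, {"other"})

# Recode 2: treat only "undetermined" as non-attack.
by_undetermined = count_unarmed_non_attack(records, {"undetermined"})

# Raw label counts, before any recoding.
threat_counts = Counter(row["threat_level"] for row in records)

print("unarmed + 'other' as non-attack:", by_other)
print("unarmed + 'undetermined' as non-attack:", by_undetermined)
print("raw threat labels:", dict(threat_counts))
```

Same rows, same labels, different n. The only thing that changed is what the analyst decided a label means, and that decision never shows up in the final number.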