
by Peter Moskos

September 20, 2017

St. Louis and the acquittal of Officer Stockley

So somehow I thought doing a podcast would be less time-consuming or easier than writing a blog post? No. Hell, no. Do you know what editing entails? Even light audio editing? But it's different. Kind of fun. What the hell. I hope it's educational (and maybe also entertaining).

Anyway, here are Nick Selby and I talking about the acquittal of Officer Stockley in St. Louis.

We now have six episodes up. (Though with our odd counting system, it only counts as three.) And Nick finally got a decent mic (not to be heard till the seventh episode).

The episode we're most proud of is our interview with former Decatur Police Officer Andrew Wittmer. He talks about his police-involved shooting and his post-incident PTSD.

September 11, 2017

Quality Policing: Episode 2

Enjoy. You can add Quality Policing to your podcast subscription or download the MP3 audio file old-school style. Either way, head on over to the webpage for info and links.

September 8, 2017

Still trying to explain...

What's wrong with the Brennan Center's analysis? There are many problems. But here are a few:

1) They take a non-random sample (which isn't bad in and of itself) and then A) don't tell the reader that in the text and B) state conclusions as if the sample were a random sample (every data point having an equal chance of being picked) representative of the nation.

2) They take short time frames (1 year) to point out that fluctuations could be random. True, for a short time frame. They could take a longer time frame (3 years) and see more clearly developed patterns.

3) This is a bit trickier to explain. And that's why I'm giving it another shot. They base their findings on the magnitude of changes within their sample. This has the perverse effect of producing attention-getting conclusions -- "more than half" -- that are noteworthy only in direct proportion to the limitations of their sample.

Let's take an analogy. I want to look at murder in my City of Moskopolis (a fine city, despite a bit of a crime problem). So I take a sample of two police districts (out of ten equally sized police districts). Now it just so happens that we already know that murder in Moskopolis is up 20 percent. But our study looks at District #1, where murder is up 30 percent, and District #2, where murder is up 10 percent.

Now maybe District #1 is important for its own reasons. "Murder is up 30 percent in District #1." No problem there. Or maybe, as mayor of Moskopolis, I prefer to give a bit of spin: "Murder is up 30 percent in District #1, but not so much in the rest of the city." That's fine, too.

But I can't say this: "District #1 accounts for 75 percent of the murder increase in Moskopolis." This is not true. It is false. District #1 accounts for 15 percent of the city's murder increase.



But some guy who has a stick up his ass about accurate data (even though he really does have better things to be doing with his time) gets all huffy and points out this inconvenient truth to the Washington Post, which quoted my incorrect statement because I'm generally a trustworthy guy.

So the Washington Post calls me and says "What's up?"

"Oh," I say. "I'm sorry. I was talking about 75 percent in my sample. Did I not make that clear?"

The Washington Post dutifully makes the correction and updates the story: "District #1 accounted for 75 percent of the murder increase in two districts."

This is now no longer a false statement, but it's still a meaningless one. Who cares what percentage of the change happened in one district of my sample? Why are we talking about two districts when we could be talking about six, eight, or even all ten of them? And here's a doozy: What if murder went down in District #2? Could District #1 account for more than 100 percent of the increase in my sample? Mathematically, yes, says my calculator. But statistically, a district accounting for more than 100 percent of an increase is absurd. Methodologically, this should be a big red flag.
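To make the arithmetic concrete, here's a minimal sketch in Python. The District #1 and District #2 numbers come straight from the example above; the one added assumption is that the other eight districts split the rest of the citywide 20 percent rise evenly.

```python
# Hypothetical Moskopolis numbers: ten equally sized districts,
# 100 murders each last year, citywide murder up 20 percent.
last_year = {f"District {i}": 100 for i in range(1, 11)}
this_year = dict(last_year)
this_year["District 1"] = 130  # up 30 percent (from the example above)
this_year["District 2"] = 110  # up 10 percent (from the example above)
for i in range(3, 11):         # assumption: the rest rise evenly to 120
    this_year[f"District {i}"] = 120

city_increase = sum(this_year.values()) - sum(last_year.values())  # +200
d1_increase = this_year["District 1"] - last_year["District 1"]    # +30

# True statement: District #1's share of the CITY's increase.
print(d1_increase / city_increase)   # 0.15 -> 15 percent

# Misleading statement: District #1's share of the SAMPLE's increase.
d2_increase = this_year["District 2"] - last_year["District 2"]    # +10
print(d1_increase / (d1_increase + d2_increase))  # 0.75 -> "75 percent"

# The doozy: if District #2 had instead DROPPED by 10 murders,
# District #1 would "account for" 150 percent of the sample's increase.
print(30 / (30 - 10))  # 1.5 -> the big red flag
```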

Anyway, Moskopolis is still a fine place. And indeed, we shouldn't overreact to an increase in murder. But if the mayor says murder isn't up, perhaps you shouldn't believe the mayor.

September 7, 2017

Quality Policing Podcast: Interview With Jeff Asher

There's another Quality Policing podcast, in which I talk to data analyst Jeff Asher about the Brennan Center's latest report on crime. Asher had posted this thread about methodological problems in their data and analysis.
Brennan has a new report out showing murder down 2.5% nationally, but there are some major issues with that finding.

1) The figures cited aren't year-to-date, they're projected year end numbers based on around midyear counts.

2) Murder tends to pick up over the second half of the year, and any projection using midyear numbers will almost certainly be wrong.

3) They found murder -2.5% but included San Fran's 2016 count in that. There was no count for 2017. Removing SF makes murder -1.5%.

4) Detroit is estimated to be -27%, but that's based on Detroit's open data site.

5) That's problematic because the open data site is slow to add murders, so any year-to-date count will be wrong.

6) Detroit had over 130 murders as of late June according to the Detroit Police Department, and the 220 murders they project would be the fewest there since 1966.

7) Taking Detroit's inaccurate count out takes murder in their sample from -1.5% to +0.7% overall. So Detroit's inaccuracy explains the drop

8) The Phoenix count is similarly wrong. Phoenix had about 150 murders in 2016 but this report says they had 80 and project 60 for 2017.

9) The Phoenix figure was reached by using MCCA midyear data and doubling it, but Phoenix only reported Q1 data to the MCCA.

10) As of May Phoenix had 58 murders year-to-date in 2017 and 56 in 2016. Take away Phoenix and Detroit and suddenly murder is up 1.2% in the sample.

11) Which is to say nothing of the methodological issue of projecting midyear for 30 cities to a full year and calling it a national trend.

12) For what it's worth, my midyear piece for @FiveThirtyEight shows murder up a few % but rising slower than previous years.

13) Also worth reading is @Jerry_Ratcliffe on why doing year-to-date analysis isn't a great idea

14) Larger point is that measuring murder nationally is tough, drawing sweeping conclusions from badly incomplete data is a huge mistake in my opinion
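To put Asher's first two points in concrete terms, here's a minimal sketch with made-up numbers. The only point is that a year-end projection from a midyear count is entirely hostage to what you assume about the second half of the year.

```python
# Hypothetical city: 150 murders as of June 30.
midyear_count = 150

# Naive projection: assume the second half mirrors the first.
naive = midyear_count * 2                    # 300

# If murder historically picks up late in the year -- say the first
# half accounts for only 45 percent of the annual total (a made-up
# share) -- the same midyear count implies a higher year-end total.
first_half_share = 0.45
seasonal = midyear_count / first_half_share  # ~333

print(naive, round(seasonal))  # 300 vs. 333: same data, different "trend"
```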
This isn't the first time the Brennan Center has released faulty and misleading reports on the rise in homicide. In July, after the last one, I finally made an attempt to talk to one of the report's authors. Once I laid out my concerns, the correspondence ended. Today I asked the other author (via Twitter) if he wished to be interviewed or engage in a civil discussion of methods. No dice; apparently he's "alright, thanks." It's still an open invitation.

There are numerous problems with their analysis, but the most irksome to me is the straight-up misleading statement. I asked:
Is this statement [from your report] true? "Notably, 55.6% of murder increase 2014 to 2017 is attributable to two cities — Chicago and Baltimore."
Because I know it's not true, since about 14 percent of the murder increase from 2014 to 2017 is attributable to Chicago and Baltimore. He replied:
Yes. It's true for the 30 largest cities (our cohort), not nationally.
This is not an explanation so much as a confession, because they don't say "for the 30 largest cities (our cohort), not nationally" in their report.



I understand how they got their numbers; on my calculator, I can replicate their methods. That's good, but not good enough. Their methods are faulty.

Here are some of my remaining unanswered questions, which I posted on Twitter.
Since 2013, what is the change in homicides in those 30 cities? I get a decrease in 3 cities and an increase in 27. Is this correct?

Do you understand problems in saying a "percentage of increase in sample"? Substantively meaningless & statistically absurd.

If you have three years of data, why do 2017 tables only compare with last year, 2016?

It may turn out to be true, but it still seems an odd choice that the only mention of the (20%!) 2-year homicide increase is as a "short-term fluctuation"

If twitter can't do this justice, I'd be happy to interview you for @QualityPolicing podcast.
I asked if we could "continue w/ a civil discussion of your methods?" Alas, the reply was: "I'm alright, thanks."

For two main reasons, I'm not OK. I'd like the Left to stay committed to the truth. The generally decent Brennan Center should be above Heritage-Foundation-style BS.

But more importantly: when you say murder is down when murder is up, it's not just an issue of truth. It's also an attempt to make the murder victims -- disproportionately poor young black men -- disappear from our consciousness. As if they never existed. Do their lives not matter, too?

September 5, 2017

Quality Policing Podcast

Nick Selby and I made a podcast! Check it out at qualitypolicing.com/. The first episode is up. And cut us some slack, it's the first episode.

September 4, 2017

The Freddie Gray Effect in Baltimore

Building on my previous post on data presentation, I did some grunt work to get a count of murders and shootings for each and every day since January 1, 2012. (If you think that's easy, or that the data can be readily downloaded, you're wrong. Update: I could have saved a few hours of grunt work had I thought of using the =VLOOKUP function in Excel to fill in missing dates that had no major crimes.)
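For anyone doing this outside Excel, here's a minimal pandas sketch of that grunt work: count incidents per day, then fill in the missing dates. The file name and column name are placeholders, not Baltimore's actual open-data schema.

```python
import pandas as pd

# One row per incident; "incidents.csv" and "crime_date" are placeholders.
incidents = pd.read_csv("incidents.csv", parse_dates=["crime_date"])

# Count incidents per calendar day.
per_day = incidents.groupby(incidents["crime_date"].dt.normalize()).size()

# Days with zero major crimes are simply absent from the data, so
# reindex against a full daily calendar and fill the gaps with 0
# (the step the =VLOOKUP trick handles in Excel).
all_days = pd.date_range("2012-01-01", "2017-08-31", freq="D")
per_day = per_day.reindex(all_days, fill_value=0)
```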

If you simply chart the data, you get this kind of chart, which might be cool in an abstract expressionist blurry kind of way, but it's next to worthless as a form of data presentation.



Here's the same data, given a bit of love and handling. For all the reasons mentioned in my previous post, I went back to a one-year moving average, split on April 27, 2015, the day of the Baltimore riots. (Pre-riot takes the average from the preceding year; post-riot, from the year following.) What I'm trying to highlight, in an honest way, is the large spike in murders and shootings immediately after the riots and Mosby's decision to bring flimsy criminal charges against six Baltimore City police officers.



Unlike other crimes, shootings and homicides are reported quite accurately. Other crimes will rise and fall in sync with them. (And if the data don't show that, consider those data flawed, particularly in terms of less accurate reporting.) And if you're more partial to a line graph:



The riots were a big deal, but nobody died. More important to policing and public safety was what happened after the riots. Nobody was holding the tiller. The department was basically leaderless. The mayor had been almost in hiding. Then Mosby made the biggest mistake of all. She criminally charged six officers for doing their job -- legally chasing and arresting a man running from an active drug corner (this man, Freddie Gray, then died in the police van, and that led to the riots). Mosby got no convictions because she had no case. She couldn't prove a crime, much less culpability. She would later say, "I think the message has been sent." Police got the message: if you do your job and somebody dies, you might face murder charges. Activists and Baltimore's leaders pushed a police-are-the-problem narrative.

Police were instructed -- both by city leaders and then by the odd DOJ report city leaders asked for -- to be less proactive, since such policing disproportionately affects minorities. Few seem to care that minorities are disproportionately affected by the rise in murder. Regardless, police were told to back off and end quality-of-life policing. So police did. But, unlike the arrest-'em-all strategy formulated by former Mayor O'Malley (which worked at reducing crime a little), discretionary enforcement of low-level offenses targeting high-risk offenders reduced violence a lot. It also sent a proper message to non-criminals: your block and your stoop were not going to be surrendered to the bad boys of the hood.

Of course these efforts will disproportionately affect blacks. In a city where more than 90 percent of murderers and murder victims are black, effective anti-violence policing will disproportionately affect blacks. (Of course, bad policing will, too.) The rough edges of the square can be sanded down, but this is a square that cannot be circled. Reformers wanted an end to loitering and trespass arrests. Corner clearing basically came to a stop. Add other factors -- fewer police officers, the suspension of one-person patrol units, poor leadership -- and voilà: more violent criminals committing more violent crime.

Murders and shootings increased literally overnight, and dramatically so. Of course this took the police-are-the-problem crowd by surprise. By their calculations, police doing less, particularly in black neighborhoods, would result in less harm to blacks. And indeed, arrests went way down. So did stops. So did complaints against police. Even police-involved shootings are down. Everything is down! Shame about the murders and robberies, though.

Initially this crime jump was denied. Now we're supposed to think it's just the new normal for a city in "transition." How about this narrative: police and policing matter; and despite all the flaws in policing at both a systemic and an individual level, police and policing are still more good than bad, especially for society's most at risk. There is no reason to believe that the path to better policing must pass through a Marxist-like stage of "progressive reform." We pay police, in part, to confront violent criminals in neighborhoods where more than 20 percent of all men are murdered. We owe this to those, all of those, who live there. To abdicate police protection in the name of social justice is morally wrong.

And lest you think this rise in crime is only a problem in Baltimore, be aware that over the past three years, homicide is up dramatically in America, almost everywhere. Not just Baltimore and Chicago. Unprecedentedly so, in fact.


In related news, the odds of dying if shot in Baltimore have gone down slightly since 2012, presumably because of better medical care. It's a crude measure, but notice the downward slope of the trend line. The chance of dying has gone down from 39 percent to 34 percent. Also note the seasonal changes in mortality. I don't know why that is.
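For what it's worth, the crude measure here is just deaths as a share of everyone hit by gunfire. A sketch, with placeholder counts rather than the actual Baltimore numbers:

```python
# Crude shooting-mortality measure: fatal shootings as a share of all
# people shot (fatal plus non-fatal). Counts below are placeholders.
def chance_of_dying_if_shot(fatal, nonfatal):
    return fatal / (fatal + nonfatal)

print(chance_of_dying_if_shot(39, 61))  # 0.39 -> roughly the 2012 level
print(chance_of_dying_if_shot(34, 66))  # 0.34 -> roughly the recent level
```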


September 3, 2017

Data presentation and the crime rise in Baltimore

Data presentation fascinates me because it's both art and science. There's no one right way to do it; it depends on hard data, good intentions, and interpretive ability. Data can be manipulated and misinterpreted, both honestly and dishonestly. And any chart is potentially yet another step removed from whatever "truth" the hard data holds.

Where I'm going isn't exactly technical, but there's no point here other than data presentation and honest graph making (and also crime being f*cked up in Baltimore after the riots, but that's not my main point). If that doesn't interest you, stop here. [Update: Or jump to the next post.]

I took reported robberies (all), aggravated assaults, homicides, and shootings from open data, from 2012 through last month. I then took a simple count of how many happened per day (which is strangely not simple to do, at least with my knowledge of SPSS and Excel). You get this.



It takes a somewhat skilled eye to see what is going on. Also, since the day of the riot is so high (120), the y-axis range is too large. With some rejiggering, and simply letting that one day go off the scale unnoticed, you get this.



It's still messy, but it's the kind of thing you might see on some horrible PowerPoint. Things bounce up and down too much day to day. And there are too many individual data points. Nobody really cares that there were more than 60 one day in July 2016 and fewer than 5 in early 2016 (I'm guessing blizzard). It's true and accurate, but it's a bad chart because it does a poor job of what it's supposed to do: present data. Again, a skilled eye might see there's a big rise in crime in 2015, but the chart certainly doesn't make it easy.

Here are crimes per day with a two-week moving average. A moving average means that for, say, September 7, you take September 1 through September 14 and divide by 14. Why take an average at all? Because it smooths out the chart in a good way. It's a little less accurate literally, but much more accurate in terms of what you, the reader, can understand. One downside is that the number of crimes listed for September 7 isn't actually the number of major crimes that happened on that day. You can see why that might be a big deal in another context. But here it isn't.
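Assuming a per_day Series of daily counts like the one in the sketch above, the two-week centered average is a one-liner in pandas:

```python
# Each day's value is the mean of the 14-day window centered on it.
# The first and last days of the series come out empty -- that's the
# "buffer" issue mentioned below.
smoothed = per_day.rolling(window=14, center=True).mean()
```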



For a general audience it's not clear what exactly the point is. You still have lots of little ups and downs, and the seasonal changes are an issue. (Crime always goes up in summer and down in winter. It's not because of anything police do, and it has nothing to do with the non-fiction story I'm trying to tell.) On the plus side, you do see a big spike in late April 2015, after the riots and the absurd criminal prosecution of innocent Baltimore cops. But it needs explaining.

Also, you need some buffer for the data: the bigger the averaging window, the more of a buffer you need. But I think this is one perfectly fine way to present these data, at least for an academic crowd used to charts and tables.

Another tactic is to take the average of the past year. Jeff Asher (on Twitter, and over at 538.com) does good work with NOLA crime and is a fan of this. It totally eliminates seasonal issues (that's huge) and gives you a smooth line of information (and that's nice).



You can see a drop in crime pre-riot (true) and a rise in crime post-riot (also true). That's important. Baltimore saw a drop in crime pre-2015 that wasn't seasonal. It was real. And the rise afterward is very real. But there are two problems with this approach: 1) you need a year of data before you get going, and 2) everything is muted. What looks like a steady rise (the slope since 2015) is actually a huge rise, but it looks less severe than it is because each point averages in the previous year. The apparent gradual climb isn't what happened: crime went up on April 27, 2015, and basically stayed up, with a slight increase over time.

Here's my problem. I want to show the rise in crime post-riot. But I want to do so honestly and without deception. But yes, for the purpose of this data presentation, I have a goal. (My previous attempts were pretty shitty.)


Here's my latest idea. If one is looking at a specific date on which something happened -- in this case April 27, 2015 -- and trying to eliminate seasonal fluctuations, why not take, for dates before the break, the average of the preceding year, and, for dates after it, the average of the following year? I think it's kosher, but I'm not certain.
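Here's a minimal sketch of that idea, again assuming the per_day Series from the sketch above. For dates before the break, average the preceding year; for dates on or after it, average the following year. Neither window ever crosses April 27, 2015, so the break stays sharp.

```python
import pandas as pd

riot = pd.Timestamp("2015-04-27")

# Trailing mean: each day averaged with the 364 days before it.
trailing = per_day.rolling(window=365).mean()

# Leading mean: each day averaged with the 364 days after it
# (reverse the series, take a trailing mean, reverse back).
leading = per_day[::-1].rolling(window=365).mean()[::-1]

# Pre-riot dates get the trailing average; the rest get the leading one.
split_avg = trailing.where(per_day.index < riot, leading)

# Drop the riot day itself, an outlier (the white line in the chart).
split_avg[riot] = None
```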

Here's how that works out:



This shows that the increase was real and immediate. And as a minor point, I like the white line on the day of the riot, which I got by removing April 27 from the data (because it was an outlier).

Now if I wanted to show the increase in starker form, I would move the y-axis to start at 20. But being the guy I am, I always like to have the y-axis cross the x-axis at 0. That said, if the numbers were higher and it helped the presentation of the data, I'd have no problem with a y-axis starting at some arbitrary point.

Take into account that graphs are like maps: while very much based on truth, they exist to simplify and present selected data. I mean, you can have my data file, if you want it. But I do the grunt work so you don't have to. And of course my reputation as an academic depends on presenting the data honestly, even though there's always interpretation (e.g., in the case of a map: the world, scientists say, isn't flat). The point, rather, is whether the interpretation is honest and whether the distortion serves a useful purpose. (In the case of the Mercator projection it was sea navigation; captains didn't give a shit about the comparative size of the landmasses of Greenland and Africa.)

So taking an average smooths out the line of a chart, which is a small step removed from the "truth" but a good step toward a better chart. It's not a bad approach. But it tends to mask quick changes as a slow slope, since each day's count is spread across many data points of the average. A change in slope in the graph actually indicates a rather large change in day-to-day crime. There are always pluses and minuses.


If you're still with me, here's what you get when looking just at murder. Keep in mind everything up to this point has been the same data on the same time frame. This is different. But homicides matter because, well, along with people being killed, murder has gone up much more than reported crime.



[My data set for daily homicides (a file I keep up myself, rather than one from Baltimore Open Data) only goes back to January 2015. So I don't have the daily homicide count pre-2015. Instead, 2014 is given the same average for every day (0.5781 homicides per day). This makes the first part of the line (pre-April 27, 2015) straighter than it should be. This matters, and I would do better for publication, but it doesn't change anything fundamentally, I would argue. At least not in the context of the greater change in homicide. Even this quick and imperfect method gets the major point across honestly.]

Update and spoiler alert: Here's a better version of that chart, from my next post.