Friday, October 27, 2017

The Search for Useful Metrics & Fair Discussion

Imagine that there are 10 high schools in a school district and nearly everyone suspects that one school (Acme High School, the one in the poorest neighborhood) has inferior football equipment compared to the rest of the schools. 

Concerned parent David Francis writes an op-ed in the local paper stating that something has to change.  He lays out some pretty convincing evidence that Acme is underfunded when it comes to equipment.  He also points out that Acme has never even reached the state championship quarterfinals in football (which everyone agrees is true).  “For me,” David writes, “the ultimate metric of quality equipment is whether a team is ever represented in even the quarterfinal round of the state championship.”

Another parent, Mark Smith, responds in the comments that although everyone knows Acme has worse equipment, if we use state championship qualification to measure status or progress, we 1) won’t know whether the equipment is improving at all, and 2) are asking the kids to prove, by winning, that they are properly supported.  Can’t we find a different way to measure the equipment, or at least an effect of having good equipment that is closer to the cause (and therefore more reliable, with fewer confounding variables)?

Parents at Acme (and their supporters) can’t believe Mark doesn’t care about these kids getting better equipment!  Mark responds that he does care about the kids, and points out that the teams that make it to the state championship playoff level have 100 advantages other than equipment over Acme High.  Kids move to play at those schools.  The best coaches coach there.  They have every advantage you can think of that comes with caring deeply about winning at football and being in a position to help yourself do it.  Importantly, even with the exact same equipment, Acme will not be making the playoffs in the next 5 or 10 years. 

“So, you don’t think we should even improve the equipment?”  Mark didn’t say that.  In fact, he said more than once that we should improve the equipment.  They accuse Mark of simply playing a semantics game.  Exactly which game, and which semantics got unfairly twisted, they never quite say.  David wrote that there is something important here and that we ought to track our progress.  Everyone agreed.  Then Mark pointed out that the way David suggests we do so simply won’t work.

Finally, David adds, “Well, whatever the total circumstances are that keep Acme High out of the state championships, we need to remedy them all.  Stop being a dick.”  But Mark points out that Acme spends its discretionary budget on after-school programs and SAT training, and that those programs might suffer if funds were redirected to football.  In other words, Acme has made different choices than schools that go all out for football championships.  Mark believes we can search for ways to improve the equipment without demanding a state championship.

_____________

This is where we find ourselves whenever someone suggests that we measure our progress in creating a welcoming environment for women and girls by checking in with the Pro Tour and seeing how many women are succeeding there.  Most recently, this article (http://manadeprived.com/men-magic-building-community/) offered Pro Tour participation as not just one thing to check in on, but as “the ultimate metric of success for all efforts meant to make Magic more welcoming.”  There is no fair way to read that without concluding that Daniel, its author, believes that’s where you look to investigate our progress.

But although we do need to measure our progress in inclusiveness, this metric doesn’t work.  Looking there does not significantly improve our understanding of what has happened over the last 1, 5, or 10 years.

I wrote on Twitter:

The factors that I believe contribute to women not showing up on the Pro Tour are (in no order since I don’t know enough to come close to ranking anything): sexism and discrimination denying them opportunities and resources,  personal choices about how to spend their time and energy, personal choices about what a successful and healthy hobby is, the competitive advantage men have accrued over years of having fewer barriers and more interest (whether that interest is caused by a toxic community or not, the gap that accumulates is there), the fact that Wizards of the Coast has hired the most promising players and role models off of the Pro Tour into jobs that prohibit them from competing, and other factors. 

My relationship to this hobby was the most negative in my life when I was trying the hardest to qualify for the Pro Tour.  I put in hours and hours and got little tangible benefit in return.  Yes, I made friends along the way, but that’s the kind of thing people say whenever they do something stupid for a long time and happen to make a friend out of someone else walking that same misguided path.  Was I financially rewarded?  Was I emotionally rewarded in a way that let me find balance in my life?  (Eventually, yes, but not until after more than a decade of financially and emotionally draining choices.)  I don’t want to fall into the familiar trap of measuring women’s progress using men’s goals.  And there is an overwhelming amount of evidence that men and women do not choose to spend their time the same way or pursue the same goals, for a multitude of reasons.  (This is an article about the research which I found informative on the topic: http://slatestarcodex.com/2017/08/07/contra-grant-on-exaggerated-differences/)


If you go survey women at local game stores and try to measure inclusion that way, there will be confounding variables too (selection bias comes to mind, unless you somehow find all the women who stopped showing up), but does anyone doubt that that survey is closer to the source of what we are trying to measure?  There are countless ways to check in with how we’re doing as a community when it comes to inclusiveness.  Stop focusing on the one with the most confounding variables, the one that hasn’t moved whether things have gotten worse or better over the years.  If you want to measure how you’re doing on something, get close to the damn thing and measure it fairly.

Lastly, if we want to measure diversity on the Pro Tour, let's measure it.  In that case, we are close to what we intend to measure, in the same way we can always claim "Acme High isn't very competitive in football" and make a case.  What gets us in trouble is making a claim of one type and backing it with a measurement of a very different type.  The trouble we get into can be described as a severed feedback loop: total blindness about our progress toward an important goal (and a lack of clarity about what that goal is).