Data-driven does not mean value-added metrics (VAM).  And VAM is not the same thing as merit pay.

The last few weeks have been an exciting time for K-12.  There’s all the media attention being showered on Waiting for Superman, as well as the release of the Nashville merit pay study.  Unfortunately, I’ve detected a consistent conflation of different concepts – data-driven decision-making, value-added metrics, and merit pay – that threatens to undermine the education and reform communities’ ability to make steady improvements to instructional and human capital operations.

As part of a very compelling and critical review of Waiting for Superman, Dana Goldstein quotes LA teacher and social justice unionist Alex Caputo-Pearl.  In the process of advancing an exciting vision of what teachers unions could be in the future, Caputo-Pearl says:

“Data! There’s a good term out there,” he says with a laugh. “There are all sorts of problems with standardized tests, but that doesn’t mean you don’t look at them as one small tool to inform instruction. You do. The problem with value-added, on top of its severe lack of reliability and validity, is that if you use it in a high-stakes way where teachers are constantly thinking about it in relationship to their evaluations, you will smother a lot of the beautiful instincts that drive the inside of a school, with teachers talking to each other, collaborating and teaming up to support students.”

What I find distressing about his quote is the conflation of the different data concepts, a pattern that I see often in the current discussion about the direction of education reform.  Just look at the comments section of any online piece about the Nashville study and you’ll get a sense of what I mean.

This conflation is very problematic.  Let’s return to the Caputo-Pearl quote to examine why.

For starters, it’s unclear what exactly Caputo-Pearl means by “data.”  Data and data-driven decision-making in K-12 are about a lot more than Bill Sanders and his value-added methodology.  At the most basic level, they’re about helping teachers make good instructional decisions by providing them assessment data, either in printouts or through a web portal.  At most districts, this is the real labor-intensive “data” work: teasing out insights from assessments to improve instruction.  It’s also an area that few (if any) districts master at every campus under their jurisdiction.  Such data-driven work is distinct from VAM or merit pay, and quite important to core instruction.
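To make the distinction concrete, here is a minimal sketch (in Python, with fabricated data and hypothetical column names) of what that basic assessment-data work looks like.  Notice that nothing in it involves value-added modeling or compensation:

```python
# Turning item-level assessment results into a per-standard mastery
# summary a teacher can act on. Data and column names are hypothetical.
import pandas as pd

# Each row: one student's response to one assessment item.
responses = pd.DataFrame({
    "student_id": [1, 1, 2, 2, 3, 3],
    "standard":   ["4.NF.1", "4.NF.2", "4.NF.1", "4.NF.2", "4.NF.1", "4.NF.2"],
    "correct":    [1, 0, 1, 1, 0, 0],
})

# Percent correct per standard, plus how many responses were scored.
mastery = (
    responses.groupby("standard")["correct"]
    .agg(pct_correct="mean", n_responses="count")
    .sort_values("pct_correct")
)
print(mastery)  # lowest-mastery standards first: where to reteach
```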

Further, Caputo-Pearl would limit VAM’s role in high-stakes decisions such as evaluation.  This is perhaps the strangest forced dichotomy in media coverage, and I was annoyed that someone with Goldstein’s reporting chops didn’t press him on what proportion he favors.

In the real world, every practitioner I have ever talked to realizes that it’s not 100% either/or but rather a share of each.  For example, in real-world districts with VAM, there is usually a (disjointed) mix of quantitative and qualitative information that goes into teacher evaluation.  And teacher evaluation is just one of the high-stakes decisions.  VAM information can be helpful in targeting teacher recruitment, in creating optimal mixes of teachers in schedules, and in uncovering the impact of specific professional development.  I thought the national discussion had moved on to figuring out the optimal mix depending on local contexts, but national coverage, such as NBC’s Education Nation, still makes it sound like a binary that it simply is not when actually implemented.
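To illustrate the “share of each” point, here is a toy composite in Python.  The function name, the weights, and the normalized inputs are all hypothetical; the only point is that evaluation is a blend, not a binary:

```python
# A toy evaluation composite: a weighted mix of a (normalized) VAM score
# and a (normalized) observation score. The weight is illustrative only;
# a real district would set it locally.
def evaluation_composite(vam_score, observation_score, vam_weight=0.35):
    """Blend quantitative (VAM) and qualitative (observation) evidence."""
    assert 0.0 <= vam_weight <= 1.0
    return vam_weight * vam_score + (1.0 - vam_weight) * observation_score

# A teacher with a middling VAM estimate but strong observations:
print(evaluation_composite(vam_score=0.5, observation_score=0.9))  # 0.76
```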

Moreover, the idea that somehow there is some robot-like district that gets a VAM score for teachers and immediately fires up a well-oiled performance-based termination machine is incorrect.  There is no such district.  In real-world districts, there are countless veto points to teacher termination even after an improvement plan fails.  Some veto points are good (parent support, campus support) and some are bad (nepotism, patronage).  VAMs are useful in that mix to complement other information.  The process is not deterministic now and is unlikely ever to be.  VAM is not perfect, but it is valuable, and criticisms that can be made of VAM apply even more so to subjective observation.  Anybody who’s ever tried to help a district create an instructional observation tool that can be database-backed knows the havoc that poor inter-rater reliability, social influence, and unfriendly interfaces can wreak on even the most concrete and well-thought-out qualitative tools.
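That inter-rater reliability problem is at least measurable.  A hand-rolled sketch of Cohen’s kappa, with fabricated ratings, shows how quickly “agreement” between two observers shrinks once you correct for chance:

```python
# Cohen's kappa for two observers scoring the same lessons on a 1-4
# rubric: observed agreement corrected for chance agreement. The ratings
# below are fabricated for illustration.
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / (n * n)
    return (observed - expected) / (1 - expected)

rater_a = [3, 2, 4, 3, 1, 2, 3, 4]
rater_b = [3, 3, 4, 2, 1, 2, 2, 4]
# The raters agree on 5 of 8 lessons (62.5%), but kappa is only ~0.49:
print(round(cohen_kappa(rater_a, rater_b), 2))
```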

Finally, let’s move to the high-stakes decision that is most often conflated with the push for more data-driven decision-making and the creation of VAMs: merit pay.  For starters, any discussion that sticks with the monolithic idea of “merit pay” is a useless discussion.  In the real world, there are very specific, and greatly varied, compensation arrangements and implementations.  Scholarly work always points this out, but this doesn’t seem to filter through to the larger public discussion.  There is no one merit pay scheme.  And differences in their design and implementation matter.
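A quick way to see why the design details matter is to write two schemes down as configuration and compare the parameters.  Every value below is invented for illustration:

```python
# Two hypothetical merit pay designs. Same label ("merit pay"), very
# different incentives; all parameters are invented for illustration.
scheme_individual = {
    "unit": "teacher",          # bonus tied to a teacher's own estimate
    "trigger": "top_quintile",  # only the top 20% receive an award
    "award_usd": 8000,          # one-time individual bonus
    "measures": ["vam"],
}
scheme_schoolwide = {
    "unit": "school",           # bonus shared across the whole campus
    "trigger": "growth_target", # every school clearing the target qualifies
    "award_usd": 2000,          # per-teacher amount
    "measures": ["vam", "observations", "attendance"],
}
```

A tournament-style individual bonus and a shared schoolwide award can produce very different collaboration dynamics, which is exactly why evaluating “merit pay” as a monolith tells you so little.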

Critics of data-driven decision-making, value-added metrics, and merit pay not only conflate them but often try to tar them with the brush of corporate irresponsibility.  For example, in The Death and Life of the Great American School System, high-visibility reform critic Diane Ravitch often lumps the aforementioned ideas together under the banner of “business,” “free-market,” or “privatization.”  Goldstein herself refers to the reform movement as “free-market.”

The problem is that the conflation is about normative judgments, not about advancing knowledge. Normative language and conflation are shutting down conversations that can yield incremental, steady improvements that are win-wins for administrators, teachers, and advocates.

For example, we need to get better at helping school districts scale the work of helping teachers understand and mine assessment data.  We need to replace the black-box model of VAM calculation with open-source, crowd-verified models that are the product of multi-district consortia.  Critics of VAM and merit pay raise many great points; a lot of their criticism is useful feedback in determining the optimal way of calculating value-added.  We also need a national focus on data quality, both for the quantitative data that fuels VAMs and for the qualitative data behind classroom observations.  We need to continue trying to determine the optimal components of compensation policies to improve teacher quality.
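As a gesture toward what an open, inspectable value-added calculation could look like, here is a deliberately minimal sketch: regress current scores on prior scores plus teacher indicators, and read the teacher coefficients as value-added estimates.  The data are fabricated, and real VAMs add many more controls (and shrinkage); the point is only that the math can live in plain sight:

```python
# A minimal, fully inspectable value-added sketch. Real models control
# for far more than prior achievement; data here are fabricated.
import numpy as np

prior   = np.array([50.0, 55.0, 60.0, 52.0, 58.0, 61.0])
current = np.array([54.0, 60.0, 64.0, 53.0, 59.0, 63.0])
teacher = np.array(["A", "A", "A", "B", "B", "B"])

# Design matrix: intercept, prior score, and a dummy for teacher B
# (teacher A is the reference category).
X = np.column_stack([
    np.ones_like(prior),
    prior,
    (teacher == "B").astype(float),
])

# OLS: the last coefficient is teacher B's effect relative to teacher A,
# holding prior achievement constant.
coef, *_ = np.linalg.lstsq(X, current, rcond=None)
print(f"teacher B effect vs. A: {coef[2]:+.2f} score points")
```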

But we can’t have those conversations if we use polarizing language that reifies binary thinking.  What we should focus on instead is learning from the evidence and having empathy for each other.  Simplistic statements like “charters are better” or “merit pay doesn’t work” tell us more about the ideological blinders of those making them than about how we can use the expanding body of knowledge to optimize our school systems for student outcomes. (JG)