LOL...standardized test scores are shit. They might be somewhat useful in the aggregate, but they are less than worthless in IDing an effective teacher. VAM scores - like EVAAS - are based on mathematical models developed to assess the growth of CROPS. They treat students like carrots and assume the only factor responsible for their growth is the teacher in their classroom.
And a particular teacher's score can vary dramatically from year to year. One study found that of the teachers judged to be in the top 20% based on test scores one year, a full third were in the bottom 40% the next year.
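For scale on that kind of churn: a crude simulation shows how much quintile reshuffling pure noise produces when a teacher's scores correlate only weakly from one year to the next. The r = 0.2 correlation here is my own illustrative assumption, not a figure from the study; a minimal sketch:

```python
import random

random.seed(1)
n = 100_000  # simulated teachers
r = 0.2      # assumed year-to-year score correlation (illustrative only)

# Two years of standardized scores with correlation r
year1 = [random.gauss(0, 1) for _ in range(n)]
year2 = [r * s + (1 - r * r) ** 0.5 * random.gauss(0, 1) for s in year1]

# Thresholds: top 20% of year 1, bottom 40% of year 2
cut_top = sorted(year1)[int(0.8 * n)]
cut_bot = sorted(year2)[int(0.4 * n)]

top_teachers = [i for i in range(n) if year1[i] >= cut_top]
flipped = sum(1 for i in top_teachers if year2[i] <= cut_bot)
frac = flipped / len(top_teachers)
print(f"{frac:.0%} of year-1 top-quintile teachers land in the year-2 bottom 40%")
```

With a weak correlation like this, somewhere around a quarter to a third of the "top" teachers fall into the bottom 40% the next year on noise alone, which is the same pattern the study reports.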
A study in NC indicated the "value" of VAM-based rankings when it found that a student's 5th grade teacher was a better predictor of students' 4th grade growth than was the student's 4th grade teacher. Yeah...read that again...teachers somehow impacted students they had never taught in a statistically significant way. Figure that one out.
And then, of course, there's the bias inherent in the models.
A very recent study used actual EVAAS data and found that teachers in schools with the lowest relative populations of minority, FRL, SPED, or ELL students are more effective. Or, perhaps, and in line with current research on other VAMs (Amrein-Beardsley & Collins, 2012; Capitol Hill Briefing, 2011; Hill et al., 2011; McCaffrey et al., 2004; Newton et al., 2010; Rothstein, 2009, 2010, 2014; Rothstein & Mathis, 2013), the EVAAS may be biased against teachers who teach relatively higher proportions of these students. The take-away is that good teachers facing high-stakes testing will refuse to go where they are needed most...why would they when it will negatively affect their compensation?
BTW, EVAAS...the most widely used VAM...is proprietary to SAS Institute. They make bank selling it. And they have never allowed their source code to be evaluated outside their own walls. More relevant, the output generated by that source code - the actual information it generates - is also held as proprietary, meaning their claims for EVAAS and what it can show have never been analyzed by any entity other than SAS itself. One internal SAS reliability analysis using 3 years of EVAAS data found correlations among teachers' EVAAS estimates from one year to the next to be significantly high (i.e., ranging from r = .70 to r = .80). Sounds great.
SAS cites the result in their sales pitch but...strangely...refuses to release the actual paper for academic scrutiny.
We have to take their word for it, in other words.