Teacher Evaluations: The Devil We Know v. the Devil We Don’t

By Tamara

 

First, thank you Rob for inspiring the title of this post in one of your recent comments.

Last weekend I attended OSPI/CSTP’s symposium on Teacher Principal Evaluations and Common Core Standards implementation. I walked away with the overall sense that most teachers want an evaluation system that validates their efforts and provides opportunities for professional growth. There was also an overall sense of anxiety about how these new evaluations will be implemented. Who is doing the evaluating? How and when are they being trained to evaluate? Will my evaluator be knowledgeable about my content area or grade level? Or about goals and standards for special populations (hello, I teach English Language Development-I can assure you my students are not going to be meeting standard as defined by Common Core any more than they are with EALRs and Power Standards now)?

The overwhelming theme in my small group session was the need for implementation to be approached with positive intent by all involved. No one wants to feel trapped in a game of “Gotcha”. At the same time, the only positive thing I heard about our current evaluation model was that it doesn’t involve student data.

That is the ultimate sticking point. Everyone seems to see new teacher and principal evaluation as a positive until we get to the part about using student performance data. I agree this issue needs to be approached with caution and careful consideration. But I also ask: what is our work ultimately supposed to produce? Is it not improved student performance/learning over time? How often do we bemoan that the public does not see teaching as a bona fide profession? All other “professionals” are evaluated to some degree or other on how their work directly contributes to achieving specific outcomes. Granted, children are complex packages of multiple variables that make their growth as learners difficult to quantify. But going through the National Board Certification process opened my eyes to the fact that learning (as defined by growth demonstrated over time) is absolutely quantifiable. And because student learning is the core of what we do, we should not shy away from having that data as a part of our evaluations.

But student data cannot be used as a “one shot” snapshot of teachers’ performance. And it cannot be based on a single measure (like MSP, HSPE, pick your alphabet soup high-stakes test), especially if we can all agree that student learning is defined as growth demonstrated over time. We talk about portfolio assessment being a more accurate measure of student progress than individual on-demand performance assessments or tests. Why not a portfolio assessment model for teachers when it comes to the student data portion of our evaluation? That would bring us far closer to the balance of accountability and flexibility I hear so many of us pining for.

6 thoughts on “Teacher Evaluations: The Devil We Know v. the Devil We Don’t”

  1. Tom

    Great post, Tamara. I wanted to go to that symposium, but it happened to fall on my fiftieth birthday, and even I don’t like discussing education policy that much!
    The use of student data as you described (classroom-based assessments, pre/post test comparison) should be used as much as possible in teacher evaluation. Unfortunately, though, when ed reformers talk about using student data for teacher evaluation, that’s not what they’re referring to. They’re talking about high-stakes tests.

  2. Tamara

    Mark-excellent point about what can get overlooked in “observational” data. I hadn’t been thinking about all those moments in class that can’t be captured (unless you are constantly video or audio taping-which would be overkill). It adds a whole other layer to my thinking. I also appreciate the reminder to all of us that assessments were designed to evaluate students not teachers. When we keep that in mind, the conversation changes dramatically.

  3. Mark

    Here’s a kind of corollary I noticed again in my literature classes this past week. If my goal is to assess a student’s comprehension of the reading, oftentimes discussion works well. We had a brilliant, goosebump-inducing conversation about Tim O’Brien’s “On the Rainy River” in my special education inclusion English 12 class this week. I framed this discussion as an assessment of my students’ reading comprehension, and three students in particular were able to orally articulate a depth of comprehension and analysis that blew me away.
    However, in the written assessment that accompanied the discussion (since in discussion sometimes not everyone speaks in order to enable my assessment) other kids showed similar depth of understanding via writing–yet these three boys who not only shone in discussion but clearly comprehended and achieved the understanding I aimed for–they utterly flailed in the writing assessment.
    Had I only used the “observational” data of the writing sample, I would have drawn a very different conclusion about what these boys had understood. I would have assumed they had failed, when actually, in the discussion which preceded the writing prompt, they clearly had achieved my learning target…in spades.
    My point, I guess, is what we were taught in teacher-school. Make sure the instrument you are using to assess your target is crafted for the assessment of that target. Because these particular students were struggling writers–what amounts to a “writing assessment” could have easily obscured the results of what was intended to be a reading comprehension assessment.
    Test scores were not designed to assess teacher effectiveness. They were designed to assess student learning, and through understandable logic, people then claim that they also can assess teacher effectiveness.
    However, all I have to say is that I am certainly glad that most of my seniors have a certain high quality history teacher this year… this history teacher is quite talented at teaching critical thinking and writing, so if I am assessed on my students’ abilities and even growth over the year, what she teaches them will make me look better than I really am… so, maybe it’s time to show a movie?
    Clearly I’m kidding about the movie…but that birdwalk shows another layer of the testing problem that I think further complicates reliance on data, particularly in the English classroom, since every other discipline requires reading to some extent and writing to some extent…so if some other teacher is effective in those realms, the perception of my performance is enhanced. That, to me, is a serious problem.

  4. Rob

    Data is just data…information. Often data leads to more questions but data rarely gives answers (especially to complex questions like is this teacher effective).
    Why is there such a push to quantify effective teaching? Could part of the reason be that teaching is complex? Yes, we know good teaching when we see it, but most people (evaluators included) have difficulty pointing at exactly what a teacher isn’t doing effectively and why their actions are ineffective.
    It takes an expert set of skills to truly diagnose effective teaching. It’s nearly impossible in a sixty-minute snapshot. Many who possess this expert set of skills may not be good managers of people; they may not be effective principals.
    So data becomes a shortcut. Set a particular benchmark that signifies effectiveness. Those who fail to meet it are removed. Those who remain are the “best”. Thereby the quality of teachers improves. Survival of the fittest. How gross, artificial, and shortsighted.

  5. Tamara

    And did you see the release of Gregoire’s preliminary budget today?!?!?! Eliminating a full week of the school year. Ending funding for transportation. Limiting English Language Development services to only the lowest linguistically capable (um…cognitive academic language requires 7-10 years to approach fluency)…Oh yeah, and gutting NBCT stipends.
    I LOVE your suggestion of “Identify master teachers, have them teach part time and serve as evaluators (instead of having admin do this). Create administrative or teacher-leader positions whose sole purpose is the evaluation of instruction (no dealing with discipline issues, no standing up at pep rallies, no dealing with parents), which includes the identification of master teachers and the remediation of struggling teachers” but clearly there will be no funding for it. The governor is also talking about raising class size, and yet somehow there are all kinds of monetary resources for implementing Teacher/Principal evaluation and Common Core? Where from? Not Race to the Top-we didn’t get that money…Why do our best, most viable ideas have to be pipe dreams?
    But back to your comment about the medical community: I do think they can (and should) be evaluated on data. I certainly relied on data such as where residency was done, how many c-sections performed annually, how many high-risk patients attended when choosing a doctor to deliver my last baby. Now does my doctor’s job depend on my choosing her? Probably not entirely, but in certain private practices it must factor in. As much as consumers choose to educate themselves on their options….
    Another example of difficult but possible to quantify data.

  6. Mark

    I keep reading and hearing that other professionals are judged on their performance based on data. I’m curious: what professions, and what data? I get that there might be data in sales, other areas of commerce, etc. But what about the profession I feel bears the most resemblance to education: medicine? Theoretically, the medical community must treat whoever shows up at their facility. Same with us.
    I am not opposed to use of longitudinal data within my students’ experience with me. I am opposed to one-off test scores as a measure of my “proficiency” as an educator.
    There are solutions out there, but the budgets don’t allow them. For example:
    Identify master teachers, have them teach part time and serve as evaluators (instead of having admin do this).
    Create administrative or teacher-leader positions whose sole purpose is the evaluation of instruction (no dealing with discipline issues, no standing up at pep rallies, no dealing with parents), which includes the identification of master teachers and the remediation of struggling teachers–and require that this position involve at least four or five hours of in-the-classroom, in-the-trenches observation every day… and require that each FTE in this role have a case load of no more than forty teachers in order that real longitudinal observations of individual teachers can be performed.
    Alas, we can’t even muster the funding to keep the arts up and class sizes down.
