As New York State prepares to roll out a Value-Added Model (VAM) to evaluate high school principals, education leaders are raising red flags. With the school year nearing its end, high school principals still lack a clear understanding of how their VAM score—representing 25% of their Annual Professional Performance Review (APPR)—will be calculated.
The model, introduced with little transparency and no stakeholder input, is raising critical questions about construct validity, reliability, and practical fairness. The metrics involved and the decisions made by the State Education Department and its contractor, AIR, could have profound and damaging effects on school leadership, especially in high-needs schools.
What Is Construct Validity and Why Does It Matter?
In any fair evaluation model, construct validity must be central. This means the measurement must accurately reflect the specific role or influence of a high school principal—isolating their effect from external variables like funding, district policy, and school demographics.
Unfortunately, New York’s proposed VAM system has no grounding in the research literature on principal effectiveness. The metrics being used do not isolate the principal’s contribution from broader systemic factors, and the mere presence of a statistical bell curve does not give the evaluation meaning if the underlying data is flawed or contextually irrelevant.
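To see what isolating a principal’s effect would even require, it helps to look at the generic form value-added models take in the research literature: a regression that predicts current scores from prior scores and demographics, then attributes each school’s leftover average to its leader. Because NYSED and AIR have not published their specification, the sketch below is purely illustrative; every variable name, coefficient, and data point is hypothetical.

```python
# A minimal sketch of a generic value-added regression, in the form
# common in the research literature. NYSED/AIR have not published their
# specification, so everything here (variables, coefficients, data) is
# hypothetical and for illustration only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_schools, n_students = 20, 100

df = pd.DataFrame({
    "school": np.repeat(np.arange(n_schools), n_students),
    "prior_score": rng.normal(65, 10, n_schools * n_students),
    "poverty": rng.binomial(1, 0.4, n_schools * n_students),
})

# Simulate current scores that depend on prior achievement, demographics,
# and a small school-level effect.
school_effect = rng.normal(0, 2, n_schools)
df["current_score"] = (
    0.8 * df["prior_score"]
    - 3.0 * df["poverty"]
    + school_effect[df["school"].to_numpy()]
    + rng.normal(0, 8, len(df))
)

# The "value added" credited to each school, and by extension its
# principal, is simply the school fixed effect left over after
# controlling for prior scores and poverty.
fit = smf.ols("current_score ~ prior_score + poverty + C(school)", data=df).fit()
print(fit.params.filter(like="C(school)").head())
```

Note what even this textbook form cannot do: the school fixed effect bundles together teachers, funding, district policy, and the principal. Calling that residual the “principal effect” is an assumption, not a finding.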
Problematic Components of the Principal VAM Model
One major component of the evaluation compares student scores from middle school state tests with performance on the Regents exams in Integrated Algebra and English Language Arts (ELA). However, this is not a true growth measure. The Algebra Regents exam is fundamentally different from the 7th- and 8th-grade math assessments: it is a high-stakes graduation exam, not a measure of skill progression.
In districts where large numbers of students pass the Algebra Regents in 8th grade, very few scores remain for use in high school principal evaluation—often only from students who struggled or transferred in. As a result, principals may be judged based on the performance of a small, unrepresentative student group.
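The statistics behind this concern are straightforward, as the hypothetical simulation below illustrates: when only a handful of scores remain, the estimated average growth swings so widely that a principal’s number is mostly noise. All figures are invented for illustration.

```python
# Illustrative simulation of the small-cohort problem: when most students
# pass the Algebra Regents in 8th grade, the few scores left for the high
# school estimate are dominated by chance. All numbers are hypothetical.
import numpy as np

rng = np.random.default_rng(1)
true_growth = 5.0    # assumed "true" average growth, in score points
student_sd = 15.0    # assumed student-to-student variability

for n in (200, 10):  # a full cohort vs. the few students who remain
    estimates = [rng.normal(true_growth, student_sd, n).mean()
                 for _ in range(10_000)]
    lo, hi = np.percentile(estimates, [2.5, 97.5])
    print(f"n = {n:3d}: 95% of growth estimates fall in [{lo:5.1f}, {hi:5.1f}]")
```

With 200 students, the estimate stays near the true value; with 10, a school whose students genuinely grew can easily post a negative number, and vice versa.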
Similarly, ELA growth comparisons are complicated by inconsistent test schedules: some schools give the ELA Regents in January, others in June. This variability makes any growth calculation unreliable. Moreover, many principals haven’t even worked at their schools for the entire three-year ELA instructional window, further distorting attribution.
The second component of the VAM counts the number of Regents exams passed per student and compares that to similar students in other schools. This measure is fraught with inconsistencies. For example:
- Not all Regents exams are equally difficult—passing Earth Science is not the same as passing Physics, yet both count the same.
- Schools with innovative programs, like Scarsdale High School or portfolio schools, offer fewer or different exams, leaving their principals penalized despite high student performance.
- Special populations, such as BOCES students or students with disabilities, present additional evaluation challenges that the current VAM does not address fairly.
- Course acceleration practices vary widely. Students in some schools may enter high school with multiple Regents already passed, skewing comparisons.
These examples expose how this model fails to account for course offerings, curriculum design, and student placement strategies that vary from school to school.
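A toy calculation makes the distortion concrete. In the hypothetical comparison below (the course menus and pass rates are invented), the school steering students toward easier exams wins on the raw count even though its coursework is less demanding.

```python
# Toy illustration of the "count" distortion: a raw tally of Regents
# exams passed treats Earth Science and Physics as interchangeable.
# Course menus and pass rates below are hypothetical.
exam_menus = {
    # school -> exams a typical student attempts, with assumed pass rates
    "School A (easier menu)": {"Earth Science": 0.90, "Living Environment": 0.90},
    "School B (harder menu)": {"Physics": 0.60, "Chemistry": 0.65},
}

for school, exams in exam_menus.items():
    expected_passes = sum(exams.values())  # expected exams passed per student
    print(f"{school}: expected Regents passed per student = {expected_passes:.2f}")

# School A posts 1.80 to School B's 1.25, so the count rewards the easier
# menu regardless of the rigor of the coursework behind it.
```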
Dangerous Incentives and Unintended Consequences
Perhaps the most troubling aspect of the VAM system is the set of perverse incentives it creates, incentives that may actively undermine student learning. A model that rewards passing scores on easier exams may lead principals to:
- Encourage students to take less rigorous science or math exams
- Pressure special education students to retest repeatedly, even when not educationally appropriate
- Discourage participation in BOCES or arts-based programs that do not contribute to the Regents count
- Prioritize “the count” over curriculum quality or academic depth
These unintended consequences shift the focus away from student growth and well-being toward administrative survival in a punitive accountability system.
Who Would Want to Lead Under These Conditions?
Under this model, principals in high-needs schools face a disproportionate risk of being labeled “ineffective” through no fault of their own. Without controlling for teacher effects or school conditions, the model assigns responsibility where influence may be minimal.
This could lead to:
- Increased turnover among principals in struggling schools
- Fewer qualified candidates willing to step into high-stakes, low-support environments
- Disruption to school communities already facing instability
If a principal with only one or two years in a school receives a low VAM score, they may be discouraged from staying long enough to implement lasting change. Even though VAM is just one part of the APPR, a low score in this category can disproportionately affect a principal’s overall rating.
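A bit of hedged arithmetic shows why. The actual APPR point bands are set in regulation and may differ from the round numbers used here, but with VAM weighted at 25%, a collapse in that single subscore can pull an otherwise strong composite down by a full rating band.

```python
# Hedged arithmetic: how a weak VAM subscore drags down the composite.
# The 25% weight is from the APPR structure described above; the
# subscores and the notion of "rating bands" here are illustrative,
# since actual cut points are set in regulation.
def composite(vam_score: float, other_score: float) -> float:
    """Weighted APPR composite on a 0-100 scale: 25% VAM, 75% everything else."""
    return 0.25 * vam_score + 0.75 * other_score

strong_elsewhere = 90  # assumed strong score on the other 75%
for vam in (90, 50, 10):
    print(f"VAM subscore {vam:3d} -> composite {composite(vam, strong_elsewhere):.1f}")

# 90 -> 90.0, 50 -> 80.0, 10 -> 70.0: the VAM column alone can move a
# principal with uniformly strong other measures down a full band.
```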
A Call for Responsible, Research-Based Reform
Education leaders across New York are urging the Board of Regents to pause the implementation of this untested, unreliable system. They ask for transparency, stakeholder input, and a commitment to evaluation tools that are meaningful, equitable, and aligned with what actually drives school improvement: professional development, leadership stability, and community engagement.
Until the VAM system meets accepted standards of validity, reliability, and fairness, it should not be used to judge high school leadership. A rush to implement poorly designed models may look like progress on paper—but it risks doing real harm to the students and educators it claims to serve.