The traditional way of measuring schools based on how many students pass a test “plays to your worst biases around privilege,” Boasberg said. “…The most important thing is for schools to make sure when kids come in, whatever level they’re at, that they grow.”
But the district has been criticized, including by candidates in this year’s heated school board election, for giving high ratings to schools that may have above-average growth but where, for example, just 10 percent of third-graders can read and write at grade-level.
The percentage of schools rated blue or green, the two highest ratings, has grown over the years. In 2010, 45 percent of schools were blue or green. This year, more than 60 percent were. The district’s goal is for 80 percent of schools in every neighborhood to be blue or green by 2020.
Sean Bradley, the president and CEO of the Urban League of Metropolitan Denver, is concerned that all that blue and green is misleading to parents.
“The district has a duty to tell the truth,” he said. “And the current calculations that the district is putting out there may not be as accurate as we assume they are.”
Early literacy concerns
Last year, just 9 percent of third-graders at Barnum Elementary in southwest Denver scored at grade-level or above on the PARCC literacy test, which the state requires be given to students in grades three through nine and which it considers the gold standard measure of what students should know.
But 57 percent of those same third-graders scored at grade-level or above on the iStation literacy test, another state-chosen test that’s given to students in kindergarten through third grade.
For the purposes of Denver’s school ratings, that 48-point gap and others like it are troubling to advocates like Van Schoales, CEO of the nonprofit education advocacy group A Plus Colorado.
“What’s happened this year on the elementary school front, primarily because of the early literacy scores, threatens undermining the whole system,” Schoales said. “Most importantly, it is saying to families that schools are good when they aren’t.”
This year, the district increased the number of points schools could earn for doing well on iStation and other early literacy tests by adding metrics measuring how groups of traditionally underserved students did, which district leaders consider key to closing achievement gaps.
That increase in the number of points came at the same time schools across Denver, including Barnum, saw big jumps in the number of young students scoring at grade-level on iStation and other tests, which leaders credit to an increased focus on, and investment in, early literacy.
As a result, Barnum earned nearly every possible point on the framework for its early literacy scores, while earning far fewer points for its PARCC scores, including zeroes in several categories. The school, which serves a primarily low-income student population, was rated green this year after being rated yellow the year before.
In a statement provided to Chalkbeat, Principal Beth Vinson said Barnum is proud to have been rated green. She said its focus on early literacy “is starting to show good results” that she hopes will lead to higher achievement in its upper grades.
Barnum was not the only green school with a big chasm between its third-grade early literacy scores and its third-grade PARCC scores. One of the biggest was at Castro Elementary, where 73 percent of third-graders scored on grade-level on iStation but just 17 percent did on PARCC. Castro jumped all the way from a red rating, the lowest, to green this year.
Boasberg agrees that the misalignment between PARCC and tests like iStation is concerning. Because PARCC is relatively new, he said it was only recently that the district had enough data to confirm the mismatch. To remedy it, the district announced this fall that it will raise the early literacy test cut points, which were previously set by test makers and the state. Doing so will make it harder for schools to earn points, which Boasberg suspects will affect ratings.
The higher cut points will go into effect for 2019, giving schools time to get used to them. Boasberg rejected an idea floated by some critics to eliminate the early literacy tests from the framework altogether. While he acknowledged they’re an imperfect measure, he said the district added them in response to complaints that elementary school ratings long ignored progress being made in the lower grades because those students don’t take PARCC.
“We definitely agree the PARCC assessment is a stronger, higher quality assessment,” he said. But the early literacy tests are useful, too, he said, and the district is better off using them than nothing. “…The question is,” he said, “‘Do you let the perfect be the enemy of the good?’”
The debate over academic gaps
Another pervasive complaint this year has been how the district’s focus on academic gaps between more-privileged and less-privileged students is dragging down some schools’ ratings.
Two years ago, the district launched a new part of the framework it called the “equity indicator.” Meant to shine a light on educational disparities, it measured how traditionally underserved students — low-income students, students of color, special education students and English language learners — were scoring on tests compared to set benchmarks, and how they were scoring compared to students not in those groups, so-called “reference students.”
The district warned schools that the following year, the equity indicator could count against them. If they didn’t score blue or green on the indicator, they couldn’t be blue or green overall.
During that hold-harmless year, 33 blue or green schools scored poorly on equity. The hold-harmless period also provided a chance to highlight issues with the indicator. Some school leaders, for example, complained it was unfairly dinging them for having large gaps even though their traditionally underserved students were scoring better than average.
What sort of message was it sending low-income parents, they asked, when a school with a big gap between poor and affluent students, but where poor students were doing above average, was rated lower on equity than a school where all students were doing below average?
The district took those concerns into account and tweaked the indicator this year, Boasberg said. It still measures gaps within a school, but it awards twice as many points for whether traditionally underserved students are meeting the benchmarks, taking the emphasis off the comparisons and putting it on whether underserved kids are on grade-level.
The district also gave the indicator a more precise name: the “academic gaps indicator.”
But concerns persist.
The Downtown Denver Expeditionary School, a charter elementary school where about 40 percent of students are minorities and a quarter are low-income, scored red on the academic gaps indicator for the second year in a row and was rated orange overall.
School leaders acknowledge the school has work to do in closing its gaps. Last year, for example, 61 percent of middle- and upper-income third-graders scored at grade-level on the state literacy tests, while just 23 percent of students who qualify for subsidized lunches did.
But they said despite the district’s tweak, it continues to make little sense that schools with smaller gaps but 8 percent literacy proficiency are green, while their school is orange.
“This isn’t about not holding us accountable for our achievement gaps,” said principal Erin Sciscione. “We want to be held accountable to that. We just don’t think the current system of measuring that is doing what it says it’s doing.”
Chantel Maybach, a special education coordinator at George Washington High, was among a group of teachers, parents and students who spoke publicly about the indicator at a recent school board meeting. She said she was “discouraged and sickened” to learn from one of the school’s data specialists that if white students at George had just not answered every fifth question on the test, the school would have done better on the indicator and been green overall instead of yellow.
Senior Emily Ostrander said the lower rating was unfair for a school that serves “some of the highest-achievers in the district.” George is home to a rigorous International Baccalaureate program that for years fueled a divide among students, often along racial lines, that the school is working to erase. About 72 percent of George students last year were students of color, and about 55 percent qualified for free or reduced-price lunch.
“In a way, it dings the school for being as diverse as it is,” said student Yemi Kelani.
Nine schools were downgraded this year because they didn’t score high enough on the academic gaps indicator. George wasn’t among them, but Brown International Academy, an elementary school in northwest Denver, was. Kate Tynan-Ridgeway, a third-grade teacher at Brown, wrote an opinion piece in the Denver Post calling the ratings misleading.
Sixty-one other teachers signed on in support of the opinion piece.
If Brown were located a few blocks west and over the border of Jefferson County, where there is no academic gaps indicator, Tynan-Ridgeway said, it’d be green and not yellow.
“The achievement gap worries us all,” she said. “As educators, we’re differentiating all the time.”
But Tynan-Ridgeway said that with the indicator highlighting the performance of traditionally underserved students, “it feels to me that the district is saying those kids are far more important than what could potentially be the bulk of your student body.”
Boasberg responded with an opinion piece of his own explaining why the indicator exists. He wrote that it’s already showing promising results: The number of would-be green schools with poor indicator scores dropped by two-thirds from the hold-harmless year to this year.
The district is still fine-tuning the indicator, Boasberg said, and it’s possible more tweaks are coming. One issue, he said, is whether it should apply to schools where nearly all students belong to traditionally underserved groups. This year, the district decided not to downgrade the overall ratings of three high-poverty schools even though they did poorly on the indicator.
With such high stakes as funding, enrollment and even possible closure attached to school ratings, there are plenty of theories about the reasons behind the frequent changes. Is the district embellishing the ratings to make its schools look better and insulate itself from criticism about closing low-performers? Or is it inventing new ways to drive traditional schools’ ratings down so it can justify replacing them with charter schools?
Boasberg insisted it’s neither. But he said he understands why people hold such passionate, and often conflicting, opinions about the way the district rates its schools.
“There’s no perfect way to do it,” he said. “… At the end of the day, it’s enormously helpful for teachers, for parents and for school communities to have a school performance framework that takes data from many different sources and brings it together in a way that’s understandable.”
While the district debates what to do about the academic gaps indicator and gives schools another year to get used to higher early literacy cut points, there is one change that’s definitely happening for the 2018 framework. After the district lowered the bar in 2016 to essentially give schools a reprieve from the new and rigorous PARCC tests, all cut points for the literacy and math tests will go up next year, inching blue and green ratings a bit further out of reach.