## About the Growth Points

I’ve lost count of how many times grades have changed. I think it depends on how frequently superintendents were hitting refresh in their browsers. Over the weekend, schools logging on to the SDE secure website were likely to see their A-F Report Card grades jump around. Again. Maybe even multiple times.

We received the explanation Friday that the grades would be released late due to “an abundance of caution.” It’s an abundance of something, all right. That caution quickly turned into a wagging finger of scorn, blaming schools for continuing to find mistakes in the grades.

In the last two weeks, countless people have asked SDE staff for an explanation about their grades resembling the Celebrated Jumping Frog of Calaveras County. To their credit, SDE staff have tried to keep up with the volume of questions. Thankfully, many of those responses have been forwarded to me. Again, I credit the SDE staff for patiently trying to answer the questions they’re getting. Those charged with directly facing external constituents did not create this mess. The policy-makers above them, the legislature, and Jeb Bush did.

The central idea of most responses is that the growth points are the root of most recalculations. To understand the growth points, it is probably a good idea to review the report card formula first.

The 2013 formula for the A-F Report Cards has three components:

- Student Performance (50%)
- Overall Student Growth (25%)
- Bottom Quartile Student Growth (25%)

*There are also bonus points available for certain other criteria. But those are an add-on and not included in the formula per se. And discussing them here is probably not critical to help with understanding the growth issues.*

The first part of the formula is simple. Half of a school’s grade is based on the percentage of valid tests that students passed, as long as the students were in school for the full year. Special education students count (whether they took the regular test, took a modified test, or completed a portfolio). English language learners count. All kids count. All subjects count (except for social studies: see here). If students at a school took a total of 100 tests and passed 80, then the school would get an 80 for that section.
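That arithmetic can be sketched in a few lines. This is only an illustration of the percentage described above, with invented data; the function name and record layout are my own, not the SDE's.

```python
# Sketch of the Student Performance component (50% of the grade):
# the percentage of valid, full-academic-year tests that were passed.
# Field names and data are hypothetical.

def performance_score(tests):
    """tests: list of dicts with 'passed' (bool) and 'full_year' (bool)."""
    valid = [t for t in tests if t["full_year"]]  # only full-year students count
    if not valid:
        return None  # no valid tests to score
    passed = sum(t["passed"] for t in valid)
    return 100 * passed / len(valid)

# 100 valid tests, 80 passed -> a score of 80 for this section
tests = [{"passed": True, "full_year": True}] * 80 + \
        [{"passed": False, "full_year": True}] * 20
print(performance_score(tests))  # 80.0
```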

For the next section, the formula counts all students who have a valid score for a test that can be matched to a prior test. For example, if a student took a 3^{rd} grade reading test in 2012 and a 4^{th} grade reading test in 2013, those scores would be matched and compared. For growth consideration, only reading and math count. If a student took a modified test in 3^{rd} grade but a regular test in 4^{th} (or vice-versa), there wouldn’t be a match. The process intends to show linkage between compatible exams. For the formula, any student scoring Proficient or Advanced in 2013 automatically gets a point. Additionally, any student scoring Limited Knowledge in 2013 who scored Unsatisfactory in 2012 also gets a point. From there, the remaining students count for zero points, **unless their growth on the test from 2012 to 2013 is greater than the state average growth.**
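The decision rules in that paragraph can be written out as a short sketch. The performance-level names come from the post; the function signature and the way the state-average threshold is passed in are my assumptions, not the actual SDE implementation.

```python
# Hedged sketch of the growth-point rules described above, for one student
# with a matched 2012/2013 score pair. The real rules live in the SDE's
# technical manual; this only mirrors the post's description.

def growth_point(level_2013, level_2012, opi_2012, opi_2013, state_avg_growth):
    if level_2013 in ("Proficient", "Advanced"):
        return 1  # automatic point for scoring proficient or better in 2013
    if level_2013 == "Limited Knowledge" and level_2012 == "Unsatisfactory":
        return 1  # moved up from the bottom performance level
    # everyone else earns a point only if their OPI growth beats the state average
    return 1 if (opi_2013 - opi_2012) > state_avg_growth else 0

print(growth_point("Advanced", "Proficient", 700, 710, 4))          # 1
print(growth_point("Unsatisfactory", "Unsatisfactory", 600, 602, 4))  # 0
```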

This is where interpretations matter. The SDE counts only students who actually showed growth into this average. The metric they use is the scaled score, or the Oklahoma Performance Index (OPI). There are two huge statistical problems with this. First, the OPI was never intended to be used for comparisons across grades or tests. A scaled score of 650 on the 3^{rd} grade reading test may not reflect the same relative deficit from a proficient score as a 650 on the 4^{th} grade reading test. Second, the method excludes a significant number of students from the calculation of growth. In some cases, we were told last year, including those students would actually have made the average change a decline of a few points.

If we are to blindly accept the idea that OPI change from year to year is meaningful, then we should insist that all students’ OPI change be considered. Excluding students for whom the change was negative introduces all kinds of selection bias to the process. Then again, we wouldn’t want anything about the report cards to resemble an *academic* study. We all know how the SDE and their newspaper feel about things being too researchy.
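A toy example makes the selection-bias point concrete. The numbers below are invented; the only claim is the arithmetic one, that averaging only the gainers inflates the "average growth" relative to the average over all matched students.

```python
# Toy illustration of the selection-bias concern: the state-average growth
# looks very different depending on whether all matched students are
# included or only those whose OPI change was positive. Numbers are invented.

changes = [12, 8, 5, 0, -3, -7, -10]  # OPI change, 2012 -> 2013

avg_all = sum(changes) / len(changes)
positive_only = [c for c in changes if c > 0]
avg_positive_only = sum(positive_only) / len(positive_only)

print(avg_all)            # ~0.71 -- a slight overall gain
print(avg_positive_only)  # ~8.33 -- looks like strong growth
```

The threshold a student must beat to earn a growth point is much higher under the second calculation, which is exactly the complaint: excluding decliners raises the bar for everyone else.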

That brings me to the third section – calculating the bottom quartile growth. This sounds like it should be simple. Take all the matched scores and rank them based on last year’s performance. Remove the top 75 percent. For all the students who remain, count their growth point (or lack thereof) a second time. Well, it’s almost that simple. From page 16 of the technical manual:

> The bottom 25% is determined by rank ordering the previous year’s OPI scores for all students with both pre- and post-scores at a specific school. Students who scored at or below the 25% percentile at that site will be included in the bottom 25% growth calculation. The bottom 25% group is calculated separately for Math and Reading. Because OCCT, EOI, OMAAP, and OAAP exams are on different scales, a bottom 25% will be identified separately for each exam type. In other words, for a school that administers both OCCT and OMAAP exams, the bottom 25% will consist of the bottom 25% of OCCT Math scores, the bottom 25% of OCCT Reading scores, the bottom 25% of OMAAP Math scores, and the bottom 25% of OMAAP Reading scores. A school must have at least four (4) exams of the same type (e.g., OMAAP Math, OAAP Reading, etc.) in order to identify a bottom 25% for that specific type.

The bottom 25 percent of regular tests (OCCT and EOI) are calculated separately from the modified tests (OMAAP) and portfolios (OAAP). Last year, the growth index for the bottom quartile stopped short of 25 percent if a school had few enough low-performing students. This year we are supposedly counting a full 25 percent, even if that takes us into students who scored in the advanced range last year.
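The selection step the manual describes can be sketched for a single exam type and subject. The rounding rule (rounding the quartile cutoff up) is my assumption; the manual's "at or below the 25% percentile" language leaves the exact tie-breaking and rounding unstated.

```python
# Sketch of the bottom-25% selection for ONE exam type and subject
# (e.g., OCCT Math), per the technical manual: rank by the 2012 OPI and
# keep the bottom quartile, but only if at least four exams of that type
# exist. The cutoff rounding is an assumption, not the SDE's stated rule.

import math

def bottom_quartile(students):
    """students: list of (student_id, opi_2012) pairs for one exam type/subject."""
    if len(students) < 4:
        return []  # fewer than four exams: no bottom 25% for this type
    ranked = sorted(students, key=lambda s: s[1])  # rank by previous year's OPI
    cutoff = math.ceil(len(ranked) * 0.25)         # assumed rounding of the quartile
    return ranked[:cutoff]

roster = [("s1", 612), ("s2", 480), ("s3", 555), ("s4", 701), ("s5", 530)]
print(bottom_quartile(roster))  # [('s2', 480), ('s5', 530)]
```

Each student who lands in this group then has their growth point (or lack of one) counted a second time, as described above.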

I say *supposedly* because most changes to schools’ grades have come from the bottom 25 percent growth. The first mistake was that the SDE actually applied the formula to the *top* 25 percent. Schools loved that. Since then, there have been too many tweaks to count. You would think by this point, whatever changes remained to be made would be small. They must not be. Many schools have seen their grades move by three points or more over the weekend. While there may not be a big difference between an 82 and an 85, there is a huge difference between a 78 and an 81.

Even if every school in the state sends out a letter like Keith Ballard’s explanation to Tulsa parents, there is going to be a perception problem that schools not receiving a good grade will have to manage. It won’t be based on anything we can trust, but the problem will exist just the same.

The point here is simple. We’re too far into the process now to see scores still acting with this much volatility. However hard the SDE staff charged with managing this travesty work, nothing can explain that.

***

Additional resources from the SDE:

2013_State_Average_Positive_OPI_Change_By_Grade-Subject_1

This is interesting, but be sure to click on the link to Keith Ballard’s letter!!!!!! HE ROCKS!!!


I don’t see how you can compare the bottom quartile students in one year to the bottom quartile the next year. If students improve significantly then they will not be in the bottom quartile the next year. So while the average score of the bottom quartile will improve, the improvement will appear statistically to be less than it actually is. If a teacher submitted this grading system to an administrator it would be denied.


Even though they changed the grades over the weekend, they left 10 am Monday as the deadline for data verification requests.


I have seen several references to the SDE using the 2013 rank order, instead of the 2012 rank order, to determine which students belong in the “bottom 25%.” Do you know if this is true? (If so, it would be backwards, because it would eliminate all the students who moved up out of the bottom 25%. It would, undeservedly, lower school grades.)


The scores are ranked based on 2012 performance and then matched to 2013. You’re right; doing this the other way around would not work at all.


I love the link to the celebrated jumping frog. Maybe in the next article you could work in a link to the song “What Does the Fox Say.” And all that other stuff about the A-F formula. I’m so pleased to read something that is reliable and understandable. You are serving your community well.

LikeLike