Carleton faculty have been discussing grade inflation recently. The data indicate an increase of roughly 0.14 grade points per decade in the average grade at Carleton over the last 30 years, which puts us squarely in the middle of the pack in one of the better studies, and we are going through one of those periods where considering what this means, and what we might do about it, seems appropriate. But don’t worry: of course there are people who say that there is no grade inflation at all. (You can read about all sorts of things related to grade inflation on the Wikipedia page, for example, though of course, caveat lector and all that.)
Why and how has this happened? There is the ‘the world is going to hell in a hand-basket (we’ve screwed up undergraduate education and no one learns anything anymore)’ argument from Stuart Rojstaczer, which you can use as a baseline.
Then there is a whole different set of arguments that (speaking from the Carleton perspective) notes that (a) we’ve tremendously increased the support structure for students: they have access to writing support centers, math skills centers, reference librarians, information technologists, and other such personnel and resources that didn’t exist previously; (b) this support system extends beyond the purely academic side: we have different and more thoughtful ways of tracking students’ progress through the Dean of Students office, and of providing help through Wellness Center counselors and elsewhere, in ways that were simply not possible earlier; (c) the students are better trained on average coming in to Carleton than they’ve ever been (SAT scores keep creeping up); and finally (d) our pedagogy has improved tremendously.
This last point perhaps needs to be spelled out a little more: it’s not that current faculty are claiming to be intrinsically better teachers than the various legendary professors who have gone before us. It’s that the way we teach has benefited tremendously from research on how students learn. The simplest example that speaks directly to the grade inflation question is that in writing-intensive courses, we’ve learned to give students the opportunity to revise their work, sometimes multiple times, working with writing assistants and professional support staff as well as the faculty member in question before submitting their final paper. Is it any wonder that these grades are a lot better than they would be if based on the first effort?
None of these arguments discounts the possibility of a natural ratcheting effect: Assume that there is something, anything, that leads faculty to grade slightly more generously than they were themselves graded. Perhaps it is because they believe, correctly or otherwise, that their students are learning more than they ever did. Or anything else, it doesn’t matter what it is. Once you have that assumption, then you can see how grades ratchet up, generation after generation, with or without a ‘real’ improvement in student learning.
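The ratcheting argument is easy to make concrete with a toy simulation. Here is a minimal sketch in Python; the starting average, the per-generation bump of 0.05 points, and the 4.0 cap are all illustrative assumptions, not measured values:

```python
# Toy model of the grading ratchet: each faculty generation grades a
# small, fixed increment more generously than it was itself graded.
# All numbers here are illustrative assumptions.

def ratchet(start_gpa=2.8, bump=0.05, generations=10, cap=4.0):
    """Return the average grade after each generation of ratcheting."""
    grades = [start_gpa]
    for _ in range(generations):
        # Each generation adds its small bump, up to the hard ceiling.
        grades.append(min(grades[-1] + bump, cap))
    return grades

history = ratchet()
# Grades drift upward monotonically until they saturate at the cap,
# with or without any 'real' improvement in student learning.
```

The point of the sketch is only that the upward drift requires nothing beyond the small per-generation bias, whatever its cause.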
Next, why should we do anything about it? If you assume that current grades are NOT earned, then the reason is straightforward. We shouldn’t be doing ‘false advertising.’
And if so, how would we tackle it?
Well, how would we find out if the grades are earned? One way would be to introduce standardized tests of some sort to calibrate, and then re-set our grading system accordingly. Let’s assume we mean a standardized exam that is broadly cumulative and tests skills across the board. It turns out that there are some national attempts to talk about such things. I’ll let you hunt down these ideas on your own because of what I’m about to tell you: Carleton students broke one of the better-designed-and-known diagnostics when it was administered to them. They scored so high that the tests had to be renormed (twice, I believe) to accommodate our student scores. And then they hit the ceiling again (after which the test-makers refused to budge). We can’t talk about the details of this in public, so I’ll let it hang as a mysterious allusion. Unfair, I know, but my hands are tied in this case. Anyway, what we learned from this exercise is that we can’t yet calibrate our internal grades against an abstract external evaluation, but our students do extremely well on such tests.
So far, no reason to doubt our high grades with external benchmarks. In the absence of a benchmark telling us what our grades ‘should’ be, it’s hard to move forward except on some abstract principle, and some ‘gut-level’ feel of what is right. This hasn’t prevented various schools from trying different strategies. I refer you to a Princeton experiment (and ongoing consequences, including fears that it is affecting job-placement rates) which mandates that only a certain percentage of grades can be As. Will we go this route? Unlikely, but stay tuned.
I’ll note in passing that there has been no suggestion of standardized/external examinations at the microscopic level (that is, for each course), because of the idiosyncratic nature of our courses.
However, even if the high grades are earned, a second strong reason for doing something about grade compression is that we are starting to lose the ability to distinguish between student performances: If everyone gets an A, then even if all were over the ‘excellent’ bar, surely there are differences in performance that are being made invisible because the system is saturated by this compression.
Somewhere in the recesses of my memory is an article that said that while grade inflation did indeed compress grades within a class, differences still showed up in the final GPA when averaged over a student’s career. In short, the argument was that you might have to look at more digits beyond the decimal point than you used to, but you could still find differences. I can’t track down this article, so I just throw that out there.
Triggered by this, I have been idly throwing around one way in which we can decouple the second issue (distinguishing performance) from the first (is it earned): Decimal grading. That is, instead of going with As, Bs, and Cs, (with pluses and minuses as you like) which then get translated back to their numerical equivalents, why don’t we just assign numerical grades on a 4.0 scale? That allows us to distinguish with much better resolution between students.
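To see what decimal grading buys in resolution, here is a hypothetical illustration in Python; the score-to-letter cutoffs and the sample scores are made up for this sketch, not Carleton policy:

```python
# Illustrative only: the cutoffs and sample scores below are
# assumptions invented for this sketch, not any school's policy.

def to_letter(score):
    """Quantize a raw 0-4.0 score into a coarse letter bucket."""
    if score >= 3.85:
        return "A"
    if score >= 3.5:
        return "A-"
    if score >= 3.15:
        return "B+"
    return "B"

students = {"p": 3.97, "q": 3.88, "r": 3.62}
letters = {name: to_letter(s) for name, s in students.items()}
decimals = {name: round(s, 1) for name, s in students.items()}
# Letter grading collapses p and q into the same "A" bucket;
# decimal grades on the 4.0 scale keep them distinguishable.
```

The compression is entirely an artifact of the quantization step: the underlying scores already carry the distinctions, and reporting them numerically simply stops throwing that information away.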
Or perhaps you can think of other imaginative versions of the ‘revaluation’ of currency that happens after hyperinflation, when you just redefine 100 of your old currency thingy-bobs to be 1 of your new currency thingy-bobs.
Of course the asymptotic consequence of decimal grading is … a ‘grade’ or report that presents your scores on a 100-point scale (or a 1000-point scale, your choice, but reducing it to percentages is ultimately sensible). Exactly the one I grew up with, in India.
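The revaluation analogy is, in the end, just a linear change of units. A minimal sketch, assuming nothing beyond the 4.0 and 100-point scales mentioned above:

```python
# The 'currency revaluation' as arithmetic: a linear rescaling
# preserves the ordering and relative spacing of grades, so a
# 4.0-scale grade and a 100-point grade carry the same information,
# just at different resolutions. Purely illustrative.

def rescale(grade, old_max=4.0, new_max=100.0):
    """Linearly map a grade from one scale onto another."""
    return grade * new_max / old_max

assert rescale(3.7) == 92.5            # 4.0 scale -> percentage
assert rescale(92.5, 100.0, 4.0) == 3.7  # and back again, losslessly
```

Which is why the choice between a 4.0, 100, or 1000-point report is a matter of convention and resolution, not of substance.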