On adjusting final marks

For me, the process of adjusting final marks depends on the course and the circumstances, but it doesn't differ much from term to term, so a single example will be illustrative. Here's what happened in fall 2003 with fellow instructor Steve Mann and tutor Terry Vaskor.

Terry reported that the final grade average, without taking into consideration all the special cases (plagiarism cases under review, failures to pass the weighted average of the exams), was a little over 70%, with about 10% failures. That, viewed as a coarse measure, seemed to indicate that we didn't have to adjust the marks globally. Terry, Steve, and I gathered at 9am one morning in the tutor office to look at individual cases.

We didn't have to adjust the marks globally, but the fact that I had to think about it only underscores the arbitrary nature of the measurement. I allude to this in my essay on the distribution of midterm marks. What does a mark mean, when I can move it this way and that with a few keystrokes?

There was a little more basis to what we did that morning, but still a fair amount of subjectivity. Terry had created a file with weighted exam marks and final grades for four groups of people: those who failed on a final grade basis, those who failed the weighted average of the exams, those who had accepted a plagiarism penalty, and those who had appealed a plagiarism penalty (and consequently would get a grade of UR, or "Under Review", pending the decision of the Associate Dean).

We would re-examine each of these cases, going as far into the numbers as we thought they merited. In upper-year courses, usually the head TA keeps marks in a spreadsheet, and I do this process in my office. In the case of CS 251, ISG had set up a database that is queried via the Web. This is a little safer -- it's just too easy to wreck hidden formulas in a spreadsheet -- but more awkward for this process. Terry had to keep flipping between a StarOffice spreadsheet, the text files of special cases, a file from which we could "grep" sequence numbers to retrieve paper exams, and the browser to look at a student's history.

First, we looked at the weighted exam score and the computed final grade. If these were really low, below 40, there was little point in looking further. If one of these fell between 40 and 50, we would pull up the student's full grade record. We had a slight anomaly in the marking scheme: the formula for the final grade allowed the final exam mark to substitute for the midterm mark if the exam mark was better, but the requirement to pass the weighted average of midterm and final said nothing about such a substitution. Since we had an easier-than-usual midterm and a final that seemed pretty much dead centre to me, we had to watch for students who improved on the final but whose weighted exam average might cause them to fail. In the end there were only one or two such students.
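
To make the anomaly concrete, here's a rough sketch of the two rules in Python; the weights are invented (I'm not reproducing the actual CS 251 scheme), and only the shape of the calculation matters.

    # A sketch only: these weights are invented, not the actual CS 251 scheme.
    MIDTERM_WT, FINAL_WT, EXAM_WT_TOTAL = 0.25, 0.40, 0.65

    def final_grade(midterm, final, assignment_component):
        # assignment_component: already-weighted assignment credit (invented).
        # Substitution rule: the better of the two exam marks stands in
        # for the midterm when computing the final grade.
        midterm_used = max(midterm, final)
        return MIDTERM_WT * midterm_used + FINAL_WT * final + assignment_component

    def passes_exam_average(midterm, final, threshold=50):
        # Pass requirement: weighted average of midterm and final,
        # with NO substitution -- hence the anomaly.
        avg = (MIDTERM_WT * midterm + FINAL_WT * final) / EXAM_WT_TOTAL
        return avg >= threshold

    # Someone who bombed the midterm but did well on the final can get a
    # passing final_grade() and still fail passes_exam_average():
    print(final_grade(20, 65, 25))      # 67.25 -- substitution lifts the grade
    print(passes_exam_average(20, 65))  # False -- raw exam average is under 50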

When we pulled up the full grade records, we would look to see how they did on the midterm and final, and what their pattern of assignment marks was like. The marking scheme put all missed credit for assignments onto the weight of the final, but the record did not distinguish between a question not attempted and one attempted but no credit earned. (A mark of zero on a whole assignment pretty much indicated no submission.) If that did not give us enough information, we had Terry pull their final exam from the folders, and Steve and I looked over each question.
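
In case that reweighting isn't clear, here's one plausible reading of it, again as a sketch with invented weights: the weight of any assignment not handed in simply moves onto the final.

    # Sketch of one plausible reading of the reweighting; the numbers are invented.
    ASSIGN_WTS = [0.07] * 5   # five assignments at 7% each (invented)
    FINAL_WT = 0.40           # invented

    def effective_final_weight(assignment_marks):
        # assignment_marks[i] is a percentage, or None if not handed in.
        missed = sum(w for w, m in zip(ASSIGN_WTS, assignment_marks) if m is None)
        return FINAL_WT + missed   # the final absorbs the missed assignment weight

    print(effective_final_weight([80, None, 65, None, None]))  # 0.61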

Normally, for students who fall into the grey area -- either weighted exam mark or final grade between 45 and 50 -- we end up pushing a fair number over to a passing grade. But that is in a situation where they are forced to do the assignments. This situation was distinctly different. The students we were seeing had nearly all gotten mediocre or poor marks on their assignments, usually because they had not handed them in. When we looked at their exam, it would nearly always turn out that they earned their marks on the easy questions -- where they just had to do a trace, or add a single line of microcode. Looking at their short answer questions usually showed a considerable lack of understanding of the issues raised by the second half of the course (which in most courses is harder than the first half).

You can read this in many ways: perhaps we had let students down by not forcing them to do assignments, thereby depriving them of the practice they needed to master the material; perhaps we had only deprived them of the ability to counterfeit knowledge to the point where they put themselves into the grey zone from which we would normally pass them. Whatever the reason, there was much less ambiguity than usual. We passed a few of the students -- those who actually showed some ability to earn genuine marks across the exam -- but failed most of them. Failing marks were pulled down to 46, because a reported mark of 49 almost certainly guarantees an appeal to try to scramble for a couple of marks here or there, and we had already gone through the exams to make sure there weren't any errors (we found one case where Steve had forgotten to record five marks, which pushed a borderline case to a clear, if marginal, pass). Those who did not write the final exam got a DNW recorded instead of their computed failing mark (not counting the two known INCs).
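
For what it's worth, the reporting rules in that paragraph amount to something like the following sketch; the 46 and the grade codes come from the process described, while the pass line of 50 and the function itself are only illustrative.

    # Sketch of the reporting rules; the 46, DNW, UR, and INC come from the
    # process above, and the pass line of 50 is an assumption.
    def reported_mark(computed, wrote_final, under_review=False, incomplete=False):
        if incomplete:
            return "INC"
        if under_review:
            return "UR"    # plagiarism appeal pending the Associate Dean's decision
        if not wrote_final:
            return "DNW"   # did not write the final exam
        if computed < 50:
            return min(computed, 46)   # failing marks pulled down to 46
        return computed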

When we got down to the plagiarism cases (students who, by cheating, had lost the benefit of the liberal assignments policy), things got even clearer: most of them clearly failed due to poor exam marks, and only a few clearly passed. We couldn't do much about the UR cases without knowing the outcome of the appeals (and some of these appeals are upheld at higher levels), but looking at the two possibilities, it was clear that the same pattern would hold.

So most of what we had to do was to shuffle between windows; there were few really difficult decisions to make. We were done in a little over an hour (sometimes it takes two or three). The final failure rate was over 10%, with more to come from the UR cases (even some of those whose appeals would succeed); the class average would not be significantly lower than the initial estimate. --PR

(Adapted from a blog posting made December 12, 2003.)