Exam Rubrics: 'inert pair' marking in match-up questions

I have been thinking a lot about the ‘nitty gritty’ detail of setting exam questions recently because I judge that this is the problem which lies at the heart of my Department’s gender awarding gap (or at least the immediately-addressable aspects of the problem).

I’d like to use this post to discuss a specific format of question, which asks students to match one list of things with another. For example, I set my third-year students a question asking them to match a list of nickel compounds with the observed magnetic properties of several unidentified compounds.

A tutorial question on matching four nickel compounds to the magnetic data for four unidentified compounds.

Here is the thing I notice about this format of question: students either get two of the assignments correct or they get all four of the assignments correct. No-one has ever (Ever) got three out of four in this question. It is unforgivably whimsical of me, but I think of this as an ‘inert pair’ of marks; just as tin commonly adopts the Sn(II) oxidation state rather than Sn(IV), so do my students commonly score two marks rather than four.

The ‘inert pair’ outcome is a general feature of this style of exam question. The mechanism for this is that students mis-assign unknown compounds A and C. This one error costs them two assignments: incorrectly assigning the chloride as A means that you cannot correctly assign the fluoride as A. You’ve ‘used up’ your chance to assign A correctly. (It is in principle possible to assign both the chloride and fluoride as A which would score three marks, but I have never seen a student do this.)
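
The counting behind this can be checked with a short sketch. It assumes that every student uses each compound exactly once (a strict one-to-one matching, which is what students seem to do in practice) and that one mark is awarded per correct assignment. The function name and the use of integers to stand for compounds are illustrative choices, not part of the original question.

```python
from collections import Counter
from itertools import permutations


def score_distribution(n_items):
    """Count how many one-to-one answers earn each possible score.

    Assumption: an answer is a permutation of the n_items options,
    and one mark is given for each option placed in its correct slot.
    """
    key = tuple(range(n_items))  # the correct matching, e.g. (0, 1, 2, 3)
    counts = Counter()
    for answer in permutations(key):
        # One mark for every position where the answer agrees with the key.
        score = sum(a == k for a, k in zip(answer, key))
        counts[score] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Maps each achievable score to the number of answers that earn it.
    print(score_distribution(4))
```

Under that assumption a score of three never appears among the 24 possible answers: once three assignments are right, the fourth is forced to be right as well, so the only step above two marks is full marks.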

Is this a problem?

It might be that this mechanism seems attractive to examiners and is chosen deliberately. It could be that reserving two marks for flawless execution is what someone thinks is appropriate to the judgement of someone’s skill in this topic. But it isn’t clear to me what the mark of two out of four tells you about a student. Does it tell you something about their scientific ability or about their exam technique?

In this particular case, it is also worth noting that the gap between 50% and 100% of the marks for the question is the range containing most of the grades which distinguish degree classifications. When Oxford’s gender gap is specifically about the proportion of women getting a first-class degree (roughly three marks out of four), this rubric is an interesting symbol for the structures through which we assign marks.

Conclusion

I don’t have any grand insight into this phenomenon. But while it’s hard to generalise this analysis beyond the narrow ‘inert pair’ observation, it does resonate with other thematic problems in Oxford’s assessment structure.

Specifically, I think it probably serves as a useful case study for people new to the idea that exams are not a neutral test of someone’s ability. In Oxford’s system of high-stakes summative exam assessment, this revelation seems like a critical missing link in solving the crisis of gendered degree outcomes.

If degree outcomes are largely a function of exam outcomes, gendered degree outcomes must surely relate closely to the detail of how we choose, write, administer, and mark exam questions. Here is one - very narrow! - case where the structure (rather than the scientific content) of an exam question introduces a gap between full marks and half marks. When women are being awarded first-class grades less frequently than men, surely this ‘inert pair’ of marks demands a moment or two of reflection.