As I wrote previously, there are five million hits on ‘how to write effective MCQs.’ We do not need another ‘how to’ post – of that I am sure and yet that is almost what I am about to do.
Only a slight side step, then: here are a few more general and, hopefully, thought-provoking points I have learnt about MCQs and the statistics behind improving exam design from the last two weeks' MCQ immersion.
First – just to address why I think you should be interested in MCQs. For sure, they don’t show the thought process and, agreed, they don’t develop a question response style, but they do allow you to be precise, to be objective. Right tool for the right job.
One of the clearest benefits of MCQs is that their rapid marking means you can spend more time teaching and actioning remedial learning and feedback. And with a mountain of comparative information (both objective and precise) to hand, relatively secure class differences, and even the impact of the various teaching methods teachers used to teach the content, can be explored.
Signposts in the literature
Write questions in pairs or even as a department to avoid bias and improve question quality. Designing questions and discussing 'distractors' (the options that are not the key response) can spark very lively and interesting professional debate. Question writing then becomes a PD activity.
Use a wide range of questions, from below to beyond what you think your students are capable of. Because MCQs are quick to mark, that breadth gives students the opportunity to confirm what you expect them to know (an ugly surprise if they don't) and also the chance to surprise you. The exam statistics can be used to make your conclusions even more precise (more on that later).
MCQs should not be a test of comprehension, so keep the stem as clear as possible. I have found that image / graph / diagram MCQs support comprehension and, if anything, are even more challenging. I accept I have no empirical evidence for that last statement.
The stem should also state clearly what is expected of the student before the options are read. This requirement would rule out questions that start with 'Which of the following is correct?'
Options or distractors
Options = simple. Distractors raise the level of difficulty (and of course you can investigate the exam itself and the impact of the distractors themselves).
Try to keep options a consistent length, avoiding the use of absolutes (never, always, none of the above, all of the above). Absolutes are rarely used as correct responses.
Try to avoid overlapping responses or responses that are a subset of another response. Double negatives cause confusion; if you must use them, highlight / capitalise / underline the negative.
Testing the test
One of the advantages of using multiple-choice questions is the statistics that the test holds about itself. The parameters which define the quality of a test item are known as the discrimination index (DI) and the facility (F).
Let's take discrimination first. Discrimination is the ability to distinguish low- and high-ability students; that is one of the tasks we all have to face as teachers.
If the number of correct responses from students in the lower band for the whole test (bottom 27%) is greater than the number of correct responses from students in the top band (top 27%), then the question may not be discriminating effectively between students.
The higher the DI value, the greater the discrimination. Questions with a near-zero or negative DI should be reviewed: a negative value indicates that students who did poorly on the test overall did better on that particular question than students who did well overall. If that is the case, we need to check the question construction.
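To make the banding concrete, here is a minimal Python sketch of the DI calculation. It assumes a list of (total test score, correct on this question) pairs, one per student; the function name and the 27% band size are my own illustration rather than from any particular package.

```python
# Minimal sketch of the discrimination index (DI) for one question.
# `results` holds (total_test_score, answered_this_question_correctly) per student.

def discrimination_index(results):
    """DI = (correct in top 27% - correct in bottom 27%) / size of one band."""
    ranked = sorted(results, key=lambda r: r[0], reverse=True)  # best scorers first
    band = max(1, round(len(ranked) * 0.27))                    # size of each 27% band
    top_correct = sum(correct for _, correct in ranked[:band])
    bottom_correct = sum(correct for _, correct in ranked[-band:])
    return (top_correct - bottom_correct) / band

# Hypothetical class of 10: the question is answered well by the high scorers.
results = [(28, 1), (27, 1), (25, 1), (22, 1), (20, 0),
           (18, 1), (15, 0), (12, 0), (10, 0), (8, 0)]
print(discrimination_index(results))  # 1.0 -> strong discrimination
```

A DI near +1 means the question separates the two bands cleanly; a value near zero means it tells you little either way.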
Facility (F) is the percentage of the class obtaining the correct answer. It is calculated by dividing the sum of the marks actually scored across all candidates by (the maximum mark multiplied by the number of candidates).
In general:
- if F < 30% the question is hard
- if F = 30-75% the question is satisfactory
- if F > 75% the question is easy
It is generally reported that questions with facility values around the pass mark for the assessment (e.g. pass mark = 70%, facility = 0.7) give the most useful information about the students.
Elsewhere I read that questions answered correctly by more than 85%, or fewer than 25%, of the students are of questionable validity. Perhaps that is simpler and easier to remember.
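As a quick sanity check on the arithmetic, here is a hedged Python sketch for a one-mark item, using the bands above; the answer list and function names are hypothetical.

```python
# Facility (F) for a one-mark question: the proportion of candidates answering
# correctly, classified using the hard / satisfactory / easy bands above.

def facility(answers):
    """`answers` is a list of 1 (correct) / 0 (incorrect), one per candidate."""
    return sum(answers) / len(answers)

def classify(f):
    if f < 0.30:
        return "hard"
    elif f <= 0.75:
        return "satisfactory"
    return "easy"

answers = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]   # 7 of 10 candidates correct
f = facility(answers)
print(f, classify(f))                       # 0.7 satisfactory
```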
What can the distractors tell us – distractor efficiency (DE)?
Analysis of the incorrect responses also provides valuable information on both question difficulty and distractor attractiveness, i.e. the extent to which answers that were meant to distract actually did. This data is useful in revising questions and uncovering misunderstandings. These values can also be calculated.
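A rough sketch of what that calculation might look like: count how often each option was chosen and flag distractors that attract almost nobody. The 5% cut-off and the option labels here are my own illustration, not a fixed rule.

```python
# Distractor analysis for one question. `responses` is a hypothetical list of
# the option each candidate chose; distractors chosen by fewer than 5% of
# candidates are flagged as non-functioning.
from collections import Counter

def distractor_report(responses, key, options=("A", "B", "C", "D", "E"), threshold=0.05):
    counts = Counter(responses)
    n = len(responses)
    for option in options:
        share = counts[option] / n
        label = "KEY" if option == key else (
            "functioning" if share >= threshold else "non-functioning")
        print(f"{option}: {counts[option]:2d} ({share:.0%}) {label}")

responses = ["B", "A", "B", "C", "B", "B", "D", "B", "A", "B"]
distractor_report(responses, key="B")   # E, chosen by nobody, is flagged
```

Distractors that nobody picks are effectively wasted options and are worth rewriting.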
Guessing
Objective tests are often, and simplistically, criticised because students can guess the right answers. Just read part I and tell me ARQs are easy, or review the hinge question, or write questions with subtle distractor differentials. As I said, "simplistically criticised." Still, here are a few practical suggestions to counter that view.
Raising the pass mark. If you use five options, as QuickKey does, then in a test of 30 questions random guessing should allow a score, on average, of 6. The effective range of marks is therefore 6 to 30, and the mid-point is 18, not 15.
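In code the arithmetic is trivial (the variable names are mine):

```python
# Chance-adjusted scale for a test of five-option items: random guessing
# yields n_questions / n_options on average, so the usable range runs from
# that floor up to full marks.
n_questions, n_options = 30, 5
chance_score = n_questions / n_options          # 30 / 5 = 6
mid_point = (chance_score + n_questions) / 2    # (6 + 30) / 2 = 18
print(chance_score, mid_point)                  # 6.0 18.0
```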
You could consider using deductions for wrong answers, no penalty for omitted answers, or a combination of both.
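One common way to implement that combination is the standard correction-for-guessing ("formula scoring") rule: deduct 1/(k−1) of a mark for each wrong answer on a k-option item and nothing for omissions, so pure guessing averages roughly zero. The sketch below assumes five options; the function name is my own.

```python
# Formula scoring: right answers minus a fraction of the wrong answers,
# omissions unpenalised.

def formula_score(n_right, n_wrong, n_options=5):
    return n_right - n_wrong / (n_options - 1)

print(formula_score(n_right=20, n_wrong=8))   # 20 - 8/4 = 18.0
print(formula_score(n_right=6, n_wrong=24))   # pure guessing on 30 items = 0.0
```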
Of course, this assumes that an incorrect response is a guess and that all wrong answers are equal. Of course they are not.
What if you want to assess thinking and logic? Where poor distractors are used, students are encouraged to use logical deduction to help them find the correct answer rather than guessing. See, even poor distractors can be used purposefully.
Next week I am discussing MCQs with Walter from QuickKey and exam feedback with @dataeducator as we push forward with question level analysis.