Creating good multiple-choice question exams

Anthony J. Evans
11 min read · Feb 5, 2021


I can remember encountering multiple-choice questions (MCQs) as a student. I assumed that the instructor was neglecting their duties as an educator, either by outsourcing the assessment or by opting for convenience in grading over effectiveness in learning. And so I am instinctively skeptical about their use.

Photo by Nguyen Dang Hoang Nhu on Unsplash

However, three things have changed recently. Firstly, having taken several online courses, I have come to see their clear benefit for asynchronous learning. Secondly, I have been noting their effectiveness within the classroom as an enjoyable way for students to check their learning and for me to monitor their progress. And thirdly, the shift to online assessment due to Covid-19 removed the traditional, invigilated exam as an option. Now that we have better tools, more experience, and weaker alternatives, it seems sensible to be open-minded.

Companies use MCQs for important internal compliance training. If we accept the validity of MCQs for critical governance matters, I see no reason to dismiss them out of hand for higher education.

Many educators are burying their heads in the sand, waiting for things to “return to normal”. But I think this approach is misguided. The industry has been moving online for some time already, and existing assessment methods have their own flaws. We need to move forward with confidence, and I’m keen to leave the pandemic with a richer set of pedagogical tools and a stronger understanding of when and where to use them. I believe that MCQs should remain, provided they are done well.

In researching the pedagogical effectiveness of MCQs I found two articles particularly helpful: “Writing multiple-choice questions for higher-level thinking”, by Mike Dickinson, and “Writing good multiple choice test questions”, by Cynthia J. Brame. I recommend both articles, and attempt to incorporate some of their key points below.

The advantages of MCQs

MCQs offer advantages to students as well as instructors. These include:

  • They can be clearer, and less ambiguous, than open-ended questions.
  • They can cover more course content than a smaller number of open-ended questions.
  • They remove discretion from marking and therefore generate greater consistency and transparency.
  • They automate grading and therefore eliminate human error.

Note that I do not include “reduces the time commitment of the instructor” on the list above because, although MCQs are quicker to grade, their construction and implementation are more burdensome than traditional exam questions. Therefore MCQs are beneficial for instructors who are comfortable shifting their labour from marking exams to designing them.

The suitability of MCQs

I split my assessment questions into three categories (building on the work of J.P. Guilford):

  • Command questions — these test convergent thinking, in that students are intended to arrive at the same answer as each other. The solutions should be unambiguous and objectively assessed. Different instructors should be expected to give identical grades. Verbs for convergent thinking include: choose, select, identify, calculate, estimate, and label.
  • Discussion questions — these test divergent thinking, in that students are expected to provide original answers. What constitutes a good answer can be communicated via a mark scheme but there is significant scope for students to deviate from each other in their work. Verbs for divergent thinking include: create, write, and present.
  • Exhibit questions — these can be either command- or discussion-style questions, but the students provide their solution in the form of a visual image. For example, students must create a graph or complete a worksheet.

In terms of Bloom’s taxonomy of learning (e.g. see the University of Florida), the top two levels broadly relate to discussion questions, and the bottom two are more suitable for command questions. I believe that when constructed well, MCQs can also occupy the middle two (and indeed the articles above both advocate MCQs for higher-order thinking). But this isn’t a necessary condition for their usefulness.

https://citt.ufl.edu/resources/the-learning-process/designing-the-learning-experience/blooms-taxonomy/

This article explores the use of MCQs in their proper domain, provided that other assessment methods are used to verify higher-order thinking.

The structure of MCQs

A well-composed MCQ has two elements:

  1. The stem — this is the “question”, and should provide a problem or situation.
  2. The alternatives — a list of options from which students select, containing one (correct) answer and several distractors.
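
To make this structure concrete, here is a minimal sketch of how a question-bank entry might be represented in code. It is purely illustrative (the class and field names are my own assumptions, not any platform’s format), and it borrows the “elements of justice” question discussed below.

```python
from dataclasses import dataclass

@dataclass
class MCQ:
    """One multiple-choice question: a stem plus its alternatives."""
    stem: str                # the problem or situation posed to the student
    answer: str              # the single correct alternative
    distractors: list[str]   # plausible but incorrect alternatives
    points: int = 1          # weight of the question within the exam

    def alternatives(self) -> list[str]:
        """All options shown to the student (shuffling comes later)."""
        return [self.answer] + self.distractors

# An illustrative entry, borrowing the justice question discussed below:
q = MCQ(
    stem="Which of the following, according to the course material, "
         "is an element of justice?",
    answer="Reciprocity",
    distractors=["Power", "Respect", "Equity"],
)
```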

Some examples of best practice include:

  • The stem should be succinct and meaningful — it should only contain relevant information, and focus attention on the learning objective without testing reading skills. It should be meaningful when read in isolation, and it should provide a direct test of students’ understanding, rather than inviting a vague consideration of a topic.
  • Material should relate to course content — the questions should find a balance between having zero relation to the course and being a trivial memory test. If the question relates to a simple definition, or indeed anything that can be googled, it calls into question the value of the course. (If the instructor is using a bank of questions from a textbook, perhaps students should just study the textbook directly.) And a set of questions that are general and widely used creates an unfair advantage for students who have prior background knowledge. That said, attempts to ensure that students actually took the course (e.g. “which football team does the professor support?”) are clearly superficial. Therefore questions should include some nuance, to detect whether students were actively engaged. For example, I don’t believe that senior management are a proper “stakeholder”, since they are the decision makers, and stakeholder analysis is geared toward understanding who is impacted by those decisions. If I posed an MCQ asking “which of the following are stakeholders” and included options such as “senior managers” and “employees”, then both should be counted as correct. But if I asked “which, according to the lectures, should be considered as stakeholders”, then it is only “employees”. Similarly, if I ask a question that relates to a case, then I think using key information from that case is valid, even if it is not provided in the exam question. Instructions such as “using your knowledge of the case” and “according to the lectures” are good prompts to ensure that students don’t view these as “trick” questions.
  • Consider the Texas two-step of higher-level assessment — as Dickinson explains, this is a way to increase the capability of MCQs by introducing higher-level thinking. The idea is that although students can’t “describe” a concept if they have pre-assigned alternatives, they can “select the best description”. MCQs don’t allow them to make an interpretation, but they can “identify the most accurate interpretation”. Having identified the verb that the instructor wants to use, simply change it to a noun and place a contingent verb in front.
  • Try to avoid negative phrasing — providing a list of options and asking which is not correct has the advantage of adding a layer of complexity (and thus difficulty), but especially for non-native English speakers it moves us toward testing reading comprehension rather than subject comprehension. If negative phrasing is to be used, using italics to draw emphasis is sensible (e.g. “which of the following statements is false:”).
  • Try to avoid initial or interior blanks — requiring students to fill in missing words is another way of shifting cognitive load away from mastery of subject-specific knowledge. In many cases stems can be rewritten to retain the purpose of the question.
  • All alternatives should be plausible — there is nothing wrong with using distractors as bait, but instead of listing a correct answer and several random words, each distractor should be chosen deliberately. Brame argues that “common student errors provide the best source of distractors”, and as long as they are genuine errors, and not evidence of poorly worded questions, this is true. This is also another situation where nuance is useful. If I ask “which of the following, according to the course material, is an element of justice? a) Power; b) Reciprocity; c) Respect; d) Equity”, a student who selects “c) Respect” may be frustrated if they don’t get any points. They may even submit, as evidence in their favour, internet articles arguing that the concept of respect is highly relevant to understanding justice. As indeed it is. But in my class we use a framework that looks at four elements of justice, one of which is b) Reciprocity, and none of which are c) Respect. Note that it is precisely because one can make a plausible argument that respect is an important element of justice that it makes a good distractor. (In some cases partial credit can be a good way to ensure students don’t get “wiped out” by trickier questions.)
  • Alternatives should be reasonably homogeneous — having wildly different options can serve as a clue to the correct answer, so the alternatives provided should be reasonably similar in language, form, and length. Astute and savvy students shouldn’t be advantaged over more naive ones.
  • Don’t always make B the correct answer — whenever I create MCQs I remember a former student who “revealed” his strategy of always choosing option B. Given that this was a postgraduate course, his presence implied that he’d had previous success :-)
  • Be careful with using “all of the above” or “none of the above” as options — these aren’t considered best practice because they allow students with partial knowledge to deduce the correct answers. However, when used deliberately they can prompt a closer engagement with the question by forcing students to read it several times. This is especially true with alternatives such as “a) A and B; b) B and C; c) none of the above; d) all of the above”.
  • Use different numbers of distractors — providing only 4 options implies that random guessing is worth 25% of the exam (see the sketch after this list for the arithmetic). Offering more distractors therefore reduces the value of guessing and forces students to confront each option. It also makes it more likely that students will try to answer the question for themselves, and then see if their answer is listed, as opposed to reverse-engineering each option to see if it’s correct. However, too many options make it harder to maintain plausibility and homogeneity. Therefore using two, three, or four distractors, depending on the question, is entirely appropriate.
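
The arithmetic behind that last point is worth making explicit. Here is a minimal sketch (the function name is my own) of the expected score from uniform random guessing for different numbers of alternatives:

```python
def expected_guessing_score(num_alternatives: int, points: float = 1.0) -> float:
    """Expected points per question earned by guessing uniformly at random."""
    return points / num_alternatives

# 4 alternatives (1 answer + 3 distractors) make guessing worth 25%:
for n in (3, 4, 5, 6):
    print(f"{n} alternatives -> guessing earns {expected_guessing_score(n):.0%}")
# 3 alternatives -> guessing earns 33%
# 4 alternatives -> guessing earns 25%
# 5 alternatives -> guessing earns 20%
# 6 alternatives -> guessing earns 17%
```
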
Photo by Rishabh Agarwal on Unsplash

The structure of the exam

Once the instructor has a bank of suitable questions, the issue becomes how to arrange them into an effective exam. Here are my key points:

  • Provide a range of difficulty — there’s no reason to assume that questions should be of equal difficulty, even if they count for the same number of points. In an essay exam students can obtain the first 50% of the grade fairly easily, but every percentage point above 90% is progressively harder to achieve. This helps to create a normal distribution. A risk with an MCQ exam is that instructors have no discretion to deliver a curve by retrospectively modifying the mark scheme, and therefore great care needs to be given to the mix of questions. (When instructors grade essays they typically restrict themselves to a narrow range, which ensures the right distribution. In an MCQ exam you bring both tails into play.) In an MCQ exam with 10 questions, ensuring that 5 are relatively easy reduces the risk of a large number of fails. And ensuring that at least 1 question is very difficult prevents a cluster of 100% scores, which would deny stronger students the ability to distinguish themselves from the rest of the cohort. Assigning different points for different difficulty levels will also help with this: for example, have lots of simpler questions worth 1 point each, and several harder questions worth 2 or more.
  • Keep questions independent — questions that follow on from one another create higher risk for students (missing one question severely impacts the next) and can provide information for savvy students that reduces their need to use course content (i.e. information provided in one question can be used as an input to solving another). If multiple questions relate to the same exhibit, duplicate the exhibit in each question.
  • Take steps to reduce cheating — the main risk of cheating is student communication. Providing a mixed order of questions, and a mixed order of alternatives within each question, reduces this (see the sketch after this list).
  • Don’t reveal the answers — when using MCQs as a practice test it is important that students see their score, and I think it is helpful to let them see which questions they got right and which they got wrong. However, in my experience, revealing the correct answers provides a too-easy shortcut and stops the learning process. Giving away the solutions denies students the opportunity to work the answers out for themselves!
  • Don’t reveal the grades — make sure that you have an opportunity to review the grade distribution, correct any errors in the mark scheme, and finalise any partial credit decisions, before students see their scores.
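
As a rough illustration of the shuffling idea above, here is a sketch that reuses the MCQ class from the earlier sketch. Seeding the shuffle with a student identifier is an assumption of mine: it makes each student’s paper unique yet reproducible when marking or handling disputes.

```python
import random

def shuffled_paper(questions: list[MCQ], student_id: str) -> list[tuple[str, list[str]]]:
    """Return (stem, shuffled alternatives) pairs unique to one student.

    Seeding with the student ID keeps the shuffle deterministic, so the
    exact paper a student saw can be regenerated later.
    """
    rng = random.Random(student_id)   # per-student, reproducible randomness
    order = list(questions)
    rng.shuffle(order)                # mixed order of questions
    paper = []
    for question in order:
        alts = question.alternatives()
        rng.shuffle(alts)             # mixed order of alternatives
        paper.append((question.stem, alts))
    return paper

# Each student sees the same questions in a different arrangement:
paper_for_alice = shuffled_paper([q], student_id="alice@example.com")
```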

Implementing MCQs

The points above have convinced me that MCQ exams are a useful assessment method to use. However, I remain cautious and adopt the following rules:

  1. Give students an opportunity to do a practice test before any graded exam.
  2. Provide clear instructions and communicate them effectively.
  3. Monitor the results. It is important that instructors test their intuition about what constitutes a good or bad MCQ exam against the data. If a significant number of students misunderstand a question, it is probably phrased badly. If every student gets a question right, or every student gets it wrong, consider whether it is serving its purpose (a sketch of this kind of check follows below).
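
A crude version of that monitoring step can be automated. The sketch below computes each question’s “facility” (the fraction of students answering it correctly) and flags the extremes; the response format is an assumption made purely for illustration.

```python
def item_facility(responses: list[set[str]], question_ids: list[str]) -> dict[str, float]:
    """Fraction of students answering each question correctly.

    `responses` holds, per student, the set of question IDs they got
    right (an assumed format, for illustration only).
    """
    n = len(responses)
    return {qid: sum(qid in r for r in responses) / n for qid in question_ids}

facility = item_facility(
    responses=[{"Q1", "Q2"}, {"Q1"}, {"Q1", "Q3"}],  # three students
    question_ids=["Q1", "Q2", "Q3"],
)
for qid, p in facility.items():
    if p in (0.0, 1.0):
        print(f"{qid}: everyone {'right' if p else 'wrong'} - is it doing any work?")
    elif p < 0.3:
        print(f"{qid}: only {p:.0%} correct - check the phrasing")
```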

The first few times I utilised MCQs I got them badly wrong. But I try to be a quick learner.

Photo by NIPYATA! on Unsplash

Some key terms

When using MCQs I distinguish between four main uses:

  • A form — this is something that students fill in, either online or on paper, and submit for me to monitor their progress. For example, if I provide a structured assignment I will require them to complete a form as they go. This allows me to compare the progress of different groups, and see how much more time is needed. It is ungraded and helps manage the flow of the session. I typically use Google Forms for MCQ forms. Given that this is ungraded, I also include open-ended questions, where students write in their answers.
  • A quiz — this is intended to help students check their learning. Quizzes can be used at the end of a session to cover recent concepts, or at the start of a session to permit time for reflection. They are ungraded, low risk, and can be fun. I typically use Kahoot! for MCQ quizzes.
  • A test — this is primarily to allow students to practice for an exam. It has a similar format and uses similar types of question. I typically use Google Forms, set as a “Quiz”, for MCQ tests.
  • An exam — this is a formal assessment and constitutes part of a student’s grade. In order to integrate with the grade book I use whatever learning platform is used by the programme (e.g. Blackboard or Canvas).

Note that in this article I’ve focused on the latter two.
