Low stakes Quizzing and Retrieval Practice 4

Everyone seems to be doing retrieval practice now and there is an abundance of research in support of the effectiveness of self-testing as a learning strategy, particularly with regard to increasing long term retention. Ever since retrieval practice became popular amongst teachers, there has been a notable concern about how it is being approached and whether or not it really is as effective as its proponents claim. One line of criticism is that the questions (often closed, recall questions) are nothing like the final performance that students encounter when they take an exam. Merely asking students something along the lines of ‘What word means excessive pride or ambition?’ is, on its own, not going to help students with their understanding of Macbeth. However, understanding the meaning of ‘hubris’ (even in this most restrictive question-answer example) may well be the necessary, inflexible beginning of their journey towards knowing how Macbeth’s hubris is his hamartia. It is the job of teachers to skillfully transform this inflexible, rote knowledge into flexible understanding.


In Learning as a Generative Activity by Fiorella and Mayer, a book that explores 8 learning strategies that promote understanding, self-testing is explained as being an effective study strategy, mirroring the findings of Dunlosky in this AFT paper. One of the strengths of Learning as a Generative Activity is that the authors are careful to outline the boundary conditions under which a strategy is most effective. In the minds of many teachers, retrieval practice has reached the status of ‘universally a good thing’ and this is potentially a problem. Like all pedagogical approaches, the decision when and how to apply it requires thought and judgment. If a strategy reaches the status of ‘100% effective’ then the nuance and theory that supports it will be lost as teachers pursue the surface features, unaware that the deep structure of the approach requires more than the mere robotic delivery of a quiz every single lesson.

Fiorella and Mayer point out that, for retrieval practice to be most effective, there are a number of important things that need to be considered:

1. Learners need to receive corrective feedback following practice testing

This can act as a laser-precise form of AfL as, when corrections are provided, students are able to plug tiny gaps in their knowledge. With instant corrective feedback, students can also benefit from the hyper-correction effect. This is the idea that the more confident students are in an answer that turns out to be wrong, the less likely they are to repeat the error once they have been corrected.

2. Self-testing is often more effective when questions are free-recall or short answer.

Free-recall is also known as a ‘brain dump’ and involves students writing down everything that they know regarding a specific topic.

Here are some examples:

a) Write down everything you know about Hyde

b) Spend 5 minutes writing as much as you can about Hitler’s rise to power.

3. Tests should be taken repeatedly

Distributed practice can massively help with long term retention. If we want students to retain information, then spacing out retrieval practice is crucial. Engelmann highlights it as one of the important shifts of task design: beginning with massed practice and moving to distributed practice. Damien Benney writes in detail about attempting to optimize the spacing gap here.

4. There should be a close match between practice test items and the final test.

Opponents of retrieval practice would point to the disconnect between quizzing and final performance. This is most apparent in subjects where the final assessment is extended writing, as there is a stark difference between closed recall questions and essays. Being able to recall that a word beginning with ‘At…’ means relating to or characterized by reversion to something ancient or ancestral is in no way going to help a student with writing an essay response that explores how the boys in Lord of the Flies descend into barbarism and savagery. However, proponents of low stakes quizzing would point out that if retrieval practice is being used appropriately, the closed question about ‘atavism’ would not exist on its own, instead being the beginning of a series of questions or being part of a wider recall activity that allows students to make the necessary links between vocabulary, character and theme. If the retrieval practice is effective, the concept of ‘atavism’ would not be retrieved in isolation or seen as an end in itself. The teacher would carefully situate it within a wider body of knowledge, asking questions and discussing it in terms of the final test outcome: extended critical interpretation.

A recent paper by Pooja Agarwal entitled ‘Retrieval Practice & Bloom’s Taxonomy: Do Students Need Fact Knowledge Before Higher Order Learning?’ explored the efficacy of different forms of retrieval practice and came to similar conclusions to those in Learning as a Generative Activity.

Here are some of Agarwal’s key findings with some commentary:

1. Closed, unconnected ‘fact’ quizzing will not help students perform well in higher order tasks

If retrieval practice means merely asking a series of closed recall questions, then this activity will probably not lead to successful performance on higher order tasks like extended writing. Mirroring the findings in Learning as a Generative Activity, the paper again stresses the importance of matching the practice test to the final test.

2. Higher order quizzing helps students with higher order testing

We should ask students retrieval questions that span the higher strata in Bloom’s taxonomy. While there may be some contention regarding the strict hierarchical nature of the taxonomy, it is a good idea to ask questions that involve a deeper level of processing than mere factual recall. As an example, the distributed practice of analytical introductions is a good method of synoptic recall which involves higher order thinking.

I often have an open recall question on the board at the start of a lesson. Because students tend to trickle into the class over a number of minutes, this means that those who arrive the earliest can begin working instantly instead of waiting for all students to get there before we begin a quiz. Also, the tasks are deliberately open ended (I often give them a 5 minute limit) so that low and high attainers can attempt them successfully, the differentiation here being by the depth and complexity of the outcome. I will often follow these open ended tasks with something that looks more like a quiz.

Here are some examples of higher order, open recall questions:

  • Why is Gerald the most sinister character in An Inspector Calls?
  • Think back to London: how do you know that Blake was a Romantic poet from the content of the poem?
  • What kind of woman is The Landlady in Telephone Conversation?
  • What is the connection between The Blackmailer’s Charter and Jekyll and Hyde?

All of these questions are asking for higher order cognitive processing, ensuring that there is a close link between the practice and final test. However, if students are to produce high quality answers to these questions, it is often important to have previously asked them more restrictive retrieval questions on the required components, initially in isolation, then later asking students to make links between the individual items thereby facilitating the integration of the individual concepts. This process reflects the journey from inflexible to flexible knowledge: well planned and carefully sequenced retrieval tasks can help students move along this continuum. While initial quizzing may be factual and restrictive, later retrieval tasks will look far more like what is expected (extended writing). If I were to skip straight to asking open ended retrieval tasks then students may not be able to retrieve and therefore apply the relevant components, precluding them from producing a high quality response.

Let’s look at an example:

  • Why is Gerald the most sinister character in An Inspector Calls?

Assuming a student has attended the lessons where Gerald’s character has been taught, then they will be able to answer this question at some level. If, however, this was the first retrieval question that they were asked, then their answer may lack some of the specific components that the question requires. Teaching students the components and ensuring that they can retrieve and apply these before they are asked to attempt a more complex task may well be a more efficient approach to mastering the content than beginning with a higher order complex retrieval task. If students skip straight to the open ended retrieval task then their poor performance will necessitate complex and detailed feedback in order to close the gap. Not only will this be time consuming, but it may also be very difficult or even impossible for students to take on board the feedback because of the myriad omissions and errors that they made. It may be far more efficient to insist that students retrieve and master each component before attempting to integrate them into a complex task.

Here is a list of some of the components that I want students to be able to include in their answer:

Vocabulary

  • exploitative
  • objectify
  • infidelity
  • unscrupulous
  • disparity
  • benevolence
  • supercilious

Textual References

  • ‘I suppose it was inevitable’
  • ‘young and fresh and charming’
  • ‘made the people find food for her’
  • ‘I didn’t feel about her as she felt about me’
  • ‘I didn’t install her there so that I could make love to her’

Initial retrieval questions that focus on these components may look like classic closed questions, things like:

  • Which word beginning with EX means to take advantage of someone?
  • Complete the quotation: ‘I suppose it was inev………’

When feeding back with these initial retrieval questions, the teacher should ask a number of follow up questions to ensure that students begin, even at this early stage, to engage in higher order thinking. Although the initial retrieval question may be closed, these follow up questions will have mixed formats.

Here is the original closed single component retrieval question:

1) Which word beginning with EX means to take advantage of someone?

Here are some possible follow up questions:

  1. How does Gerald exploit Eva?
  2. What is it about Gerald that makes his actions so exploitative?
  3. What is the most sinister part of his exploitative behaviour?
  4. Who else exploits Eva?
  5. How were the lower classes exploited in Edwardian society?

In Direct Instruction programmes, individual components (perhaps vocabulary or sentence structures) are ‘firmed’ before students are asked to use them in wider applications. The term ‘firmed’ here refers to accuracy, fluency and retention, and students are often expected to demonstrate these stages of learning by applying a concept in a restricted context before they are asked to integrate a concept or skill into something broader or more complex. DI schemes use track planning where many different concepts are being ‘firmed’, each concept moving along a continuum from inflexible to flexible knowledge and slowly being combined and integrated with others. The idea here is that the atomization of content allows students to experience consistently high success rates, which can be really motivating, particularly for low attaining students. Equally, it allows the teacher to give instant, precise and effective feedback on each of the components. In the initial stages of learning, instant feedback is really important and, if used in conjunction with some form of atomization where components are taught and practiced initially in isolation, it can help prevent cumulative dysfluency. If students are asked to skip straight to the higher order retrieval question (Why is Gerald the most sinister character in An Inspector Calls?) then the danger is that they may make so many errors and omissions that effective feedback becomes impossible.

Effective and efficient instructional sequences will depend upon two important variables. Firstly, the context and type of retrieval activities should begin as restrictive tasks and move slowly towards wider application. Secondly, retrieval will be distributed over time in order to ensure long term retention.

This graph shows the relationship between the two variables and how specific retrieval tasks may be more appropriate at the start or the end of an instructional sequence:



At the start of an instructional sequence, a ‘quiz’ of restrictive closed questions may be most appropriate; at the end of a sequence and closer to the final test, wider retrieval tasks like paragraph and essay writing may be more suitable. As time progresses, the retrieval tasks should become wider, eventually mirroring the final test: extended writing.

With essay writing as the final outcome, this table explores the benefits and detriments of different question types:

[Image: retrieval practice table]

Next Post: Retrieval Practice 5: further findings and extended quizzing

