Applying Cognitive Load Theory to English Part 2: The Alternation Strategy - How example problem pairs can work in English

This is the second post looking at the application of Cognitive Load Theory to English. The first one can be found here.

‘The Alternation Strategy’, also referred to as ‘worked example problem pairs’, is the idea that ‘for an example to be most effective, it had to be accompanied by a problem to solve.’ The most effective use of worked examples is to ‘present a worked example and then immediately follow this example by asking the learner to solve a similar problem.’ Interestingly, the researchers found that giving students a massed set of worked examples, followed later by a similar massed set of problems, led to poor outcomes. For the Alternation Strategy to work effectively, each example needs to be presented together with its problem, one immediately after the other.

Greg Ashman has written here about how this applies to Maths teaching.

One of the specific benefits of using worked example problem pairs is that they accelerate learning, reducing the time required for instruction. In Efficiency in Learning: Evidence-Based Guidelines to Manage Cognitive Load by Clark, Nguyen and Sweller, the authors explore a 1985 study by Sweller and Cooper which used algebra problems. Students were assigned to two groups: one in which students completed eight problems and the other in which they studied four sets of example problem pairs. Here are the findings:

[Table: worked example results]

The ‘all practice’ group took nearly six times as long to complete the instructional sequence. Students in the example problem pair group were not only faster at completing the lessons, but they were also faster at completing the test which followed. Additionally, students who had studied under the Alternation Strategy (example problem pairs) made fewer test errors.

Another, later study quoted in the book was ‘Conducted in Chinese middle schools in which a traditional three-year course consisting of two years of algebra and one year of geometry were successfully completed in two years by replacing some practice with worked examples!’ This supports the findings of the previous study and suggests that the technique can be adapted to real-life classrooms and learning.

Although both of these studies involve maths, in Cognitive Load Theory by Sweller, Kalyuga and Ayres, the authors make the important point that ‘the cognitive architecture…does not distinguish between well-structured and ill-structured problems’, meaning that the findings of Cognitive Load Theory apply to all domains.

So why does this work? In Efficiency in Learning: Evidence-Based Guidelines to Manage Cognitive Load by Clark, Nguyen and Sweller, the authors explain that ‘Having a worked example to study just prior to solving a similar problem provides the learner with an analogy available while solving the problem. When having to actively solve a problem without the benefit of an analogous example, most working memory capacity is used up for figuring out the best solution approach, with little remaining for building a schema’. Not only does the worked example present an indication of the ‘best solution approach’, providing students with a clear idea as to what is expected when they attempt the problem, but it also exemplifies the level of quality that they are to aim for.

Applying the Alternation Strategy in English

This is how we apply this approach in English. Here is a screenshot of a typical double page spread from one of our ‘booklets’ (essentially in-house textbooks that we produce and centrally plan for each unit of study):

[Image: booklet double page spread]

Before working through the stages, students would have typically completed a cumulative recap quiz or a sentence practice activity. They may also have begun the lesson with some whole class feedback and tasks that address common misconceptions from a previous piece of work.

An overview and key:

In the screenshot above, I have numbered each section in order to make this explanation clearer; lessons would normally work through the sections in numerical order.

Stage 1: A vocabulary table

This provides students with words to use in their analysis. We list all forms of the word, labelling each one with word class, an approach that, as a result of systematic and regular usage, has helped students to categorise words and to make generalisations about how different affixes can change one class to another. Crucially, the table contains an example sentence for each word, deliberately using the sentence structures that we want students to use when they are writing and each one acting as a mini worked example. The teacher will ask questions about the vocabulary, provide further examples, antonyms and synonyms and elaborate further. Students will make annotations to their vocabulary table, following how the teacher annotates under the visualiser. The teacher will make links to previous learning, asking about preceding lessons on the same text as well as other units, taking advantage of distributed retrieval practice.

They may ask students to complete oral because/but/so sentences:


After reading and discussing the relevant row from the table:

Teacher: Finish the sentence, starting from the beginning: ‘Gerald objectifies Eva Smith because…’

Student: Gerald objectifies Eva Smith because he compliments her appearance.

Teacher: Add evidence.

Different Student: Gerald objectifies Eva Smith because he compliments her appearance, calling her ‘Young and fresh and charming.’

Stage 2: The First ‘problem’

In this case, the question is ‘What kind of man is Gerald?’

Stage 3: The first extract to annotate

Students have already read many of these quotations in the vocabulary table. Many of the example sentences from the table contain embedded evidence that uses these lines. Through a process of quick questioning, the students will get lots of massed practice in seeing how the vocabulary words apply to the quotations. The teacher will annotate their copy under a visualiser: I tell students that their annotations must, at a minimum, match mine exactly, although they are free to add additional ideas, opinions and interpretations that come up in focussed discussion and questioning. This stage of the process allows the teacher to model annotation ‘live’, providing a worked example of this crucial stage of textual analysis. The skill of annotation, something that Joe Kirby has written about here, allows students to engage in close analysis and deconstruction of language, helping them to understand the important link between text and interpretation. Annotations also serve as a useful source of revision, capturing ideas and thoughts and making them permanent.

Underlined quotations are the ones that students will memorise (essential for GCSE literature), most likely in a test that will be given a month or so after they have encountered them here, exploiting the benefit of distributed practice.

Stage 4: Worked Example

The worked example uses the example sentences from the vocabulary table. The idea here is to demonstrate how the constituent sub-components (sentences and vocabulary) fit together to form a more complex whole. It also uses the lines and ideas from stage 3, demonstrating how annotations and notes can be transformed into analytical prose. The worked examples also contain ‘The 6 Skills’ (our analytical framework) and exemplify how they are applied to a specific problem. The 6 Skills are first taught in year 7, the idea being that they are a generalisable strategy that can be applied to a wide range of analytical problems and situations. In year 7, they are taught and practised at a sentence level, building cumulatively to their application within more extended writing. Over a five-year curriculum, students will see them used in conjunction with the full spectrum of analytical writing, including character, setting and thematic responses.

How do we use the worked example?

In Efficiency in Learning, Sweller et al posit that ‘To be effective, a worked example must be studied’. Following Engelmann’s idea that tasks should ‘shift’ from prompted to unprompted, as well as from teacher led to student led, none of the worked examples are labelled in the booklets, allowing teachers to judge the approach based upon the proficiency of the class. All classes use the same models; the level of in-class support, as well as the focus of the final writing task, is chosen by the teacher.

Here are some possible approaches:

A) Low level class who are inexperienced with writing analytically.

Under a visualiser, a teacher may underline, label, explain and question most if not all relevant aspects of the model, making it clear which parts are important and asking students to do the same on their copy. They may choose to merely focus on the easier skills like ‘3 part explanations’ or ‘zoom in on a word’, or even simpler, more fundamental skills like embedded evidence.

B) Class who have some proficiency with analytical writing.

In Efficiency in Learning, Sweller et al describe something called a completion example, essentially a hybrid strategy where ‘some of the steps are demonstrated as in a worked example and the other steps are completed by the learner as in a practice problem’. With students who have already acquired some of the analytical skills or specific sentence constructions, a teacher may label one or two examples of a specific analytical skill within the model paragraph, asking students to copy their annotations. The teacher may then ask students to find other examples, using the teacher-directed ones as models to guide their annotations. The next post will explore the utility of completion problems in more detail.

C) More proficient students

If a class is proficient, the teacher may ask students to annotate with minimal teacher instruction, essentially using the model as a means of retrieval practice of analytical skills as well as allowing students to broaden their understanding of the scope and breadth of those skills. If a student has seen multiple, slightly different examples of how a specific skill has been applied across different contexts, then this exposure will hopefully lead to a firmer understanding of it.

These particular approaches will be used, irrespective of proficiency level:

  • Teachers will ask multiple questions about the target vocabulary, asking students to cover the vocabulary table beforehand to ensure that they are engaging in actual retrieval practice. They may make annotations next to words e.g. exploit=use/take advantage
  • Teachers will ask about the interpretations and analysis, often asking students to use vocabulary from the table. Why might Gerald’s attempt to find food mean he is benevolent? Is he exploiting Eva here or being compassionate? What does ‘distressed’ tell us about Gerald?

Stage 5: The second extract to annotate

Students have already been taught vocabulary in stage 1, applied it in stages 3 and 4, and now they will need to apply it again in stage 5. Again, the proficiency of the class will determine how teacher led this annotation segment will be but, most importantly, students will apply the vocabulary from the table in stage 1 when annotating the lines. In stage 3, students were led through a worked example of annotation and this can be used as ‘an analogy’ with which to support their annotations in this second text extract. Thinking analytically about a text is, by its very definition, something covert and implicit, a process that exists in the mind of a writer but cannot be observed. Asking students to ‘overtise’ this process by annotating a text can provide valuable feedback to the teacher. Have they understood and applied the vocabulary from the table to the appropriate evidence or extract? Have they correctly identified a technique? Have they made a link between similar pieces of evidence to build up an argument? The teacher can circulate during this stage and address any misconceptions promptly, ensuring that the errors do not manifest themselves when the student completes the final written task. During the acquisition stage of learning, when students are learning new content and lack proficiency, immediate feedback is key to preventing errors from becoming embedded.

Stage 6: Second Problem

When I first started using model answers, I would demonstrate one and then ask students to write their own. This almost always resulted in indolent or weaker students copying the model without thinking at all about the task. Higher ability students would also explain that the model was a hindrance. It was as if the model had an intrusive and malign anchoring effect: students didn’t want to deviate from it as the implicit assumption was that it exemplified excellence; however, they didn’t want to entirely emulate it as this was clearly just plagiarism.

Here is a sequence from the old, problematic approach:

1) Question

2) Worked Example


3) Student response (plagiarised from or hindered by the model!)

Here is a sequence for the new approach:

1) Vocabulary that applies to both questions (containing multiple worked examples of sentences to be studied)

2) First Question

3) First extract to annotate (acting as a worked example of annotation)

4) Worked Example to study (contains vocabulary and lines from stages 1 and 3)

5) Second extract to annotate (these are different lines to the ones explored in the model)

6) Second Question

This six-stage process has been designed to avoid the weakness of my earlier approach. The first and second questions may vary in focus and wording, or they may be exactly the same. Because the lines in stage 5 are different, students cannot merely copy the model answer. Instead, the model can act in the way that Sweller intends when he explains that ‘Having a worked example to study just prior to solving a similar problem provides the learner with an analogy available while solving the problem’. However, students are still able to apply the sub-components (vocabulary, sentence structures, analytical skills) that they have been explicitly taught.

We have methodically and systematically threaded this approach into all schemes of work from the start of year seven onwards. When students engage in analytical writing, they almost always encounter these double page spreads.

The alternation strategy is a core part of our resources and this table provides a summary of how it works in our booklets:

[Table: summary of the alternation strategy in our booklets]

Next Post: ‘The Problem Completion Effect’: An Overview


Applying Cognitive Load Theory part 1: Overview and The Worked Example Effect

This is the first blog post looking at how Cognitive Load Theory can be applied in the classroom.

I first came across Cognitive Load Theory a few years ago via Greg Ashman’s informative and prolific blog. Since then, interest in John Sweller’s work has spiked and it seems to be one of the hottest topics amongst research-informed teachers.  Dylan Wiliam has even gone so far as to say:


While much of the interest and discussion surrounding Sweller’s ideas is centred around the domains of science and maths (this is understandable as a large number of his studies focus on these areas), this series of blogs will explore how I have been attempting to apply his ideas to English, a subject area that Sweller would call an ‘ill-structured learning domain’.

If you would like to know more about Cognitive Load Theory, here are some useful resources:

1) Greg Ashman’s blog has many detailed posts about CLT

2) This succinct and practical summary 

3) Oliver Caviglioli has made a fantastic graphic overview of Cognitive Load Theory by Sweller, Kalyuga and Ayres

Six years ago I read Why Don’t Students Like School by Daniel Willingham, a text that not only made me reconsider almost all aspects of how I was teaching, but also acted as a springboard into the depths of educational research. His explanation of the importance of memory and the conceptual distinction between working and long term memory revolutionised how I thought about instruction and made it abundantly clear that I had not been focussing upon the vital notion of retention. Cognitive Load Theory is also based on the conceptual difference between working and long term memory and provides a number of strategies to optimise instruction within that framework.

All subsequent quotations and references that I use are from Cognitive Load Theory by Sweller, Kalyuga and Ayres.

An Overview of some of the Theory 

What is it that makes experts proficient? In 1973, a study was conducted to investigate what made grandmaster chess players superior to other players. While an intuitive answer may have attributed their dominance to more proficient problem solving abilities, the application of a generic ‘means-ends’ analytical approach or the fact that they weighed up and considered a wider range of alternative strategies, the reality was a difference in their memories. Players, both expert and novice, were shown a chess board with pieces arranged in plausible and typical game situations for 5 seconds. When asked to recall the positions of the chess pieces, expert players were significantly and consistently better than novices. However, if the pieces were arranged randomly, then this gap in performance disappeared: experts and novices performed the same. With the random configurations, experts could not rely upon recalling thousands of game configurations as the pieces did not conform to or fit game patterns that they had stored in long term memory. Similar results have also been found in other domains, including recall of text and algebra. The conclusion of these studies is that when solving problems or engaged in cognitive work, experts within a field rely upon their larger and more developed long term memory deposits, patterns of information that are also called schemata. While short term memory has a limited capacity, long term memory capacity is vast and seemingly endless.

Recognising the fact that novices have less relevant knowledge stored in their long term memory, Sweller et al explain: ‘Novices need to use thinking skills. Experts use knowledge’. Because ‘thinking skills’ rely upon working memory, an aspect of cognition that has a small and fixed capacity for holding and manipulating items, novices soon reach its limits and, due to excessive cognitive load, find tasks difficult or impossible. The implications of these findings are striking for teachers. In a general sense, we should be spending much if not most of our time as teachers trying to increase our students’ domain specific background knowledge so that we can help them overcome the seemingly unalterable limits of their short term memory and instead recall, apply and use relevant knowledge from their long term memories. Sweller et al posit that ‘we should provide learners with as much relevant information as we are able’ and that ‘assisting learners to obtain needed information during problem solving should be beneficial’. They also posit that ‘Providing them with that information directly and explicitly should be even more beneficial.’ I have written before about how we provide students with relevant vocabulary when responding to texts, as well as how to create focussed practice activities that assist learners when solving problems.

The Worked Example Effect

In short, the worked example effect refers to the idea that if you want novices to succeed in a domain, they would be better off studying the solutions to problems rather than attempting to solve them. Asking students to repeatedly write extended answers to questions ‘unnecessarily adds problem-solving search to the interacting elements, thus imposing an extraneous cognitive load’. In the absence of well-developed background knowledge, students flounder because they have little stored in their long term memories to help them. Comments in class such as ‘I don’t know how to start’ and ‘what do I write’ are sometimes indicative of this scenario.

Responding analytically to texts is a complex activity containing multiple components, many of which are abstruse for novice learners. If you try to describe these elements, you are forced to use abstract phrases like ‘sophisticated analysis’ and ‘judicious use of quotations’ and, in the absence of examples, these terms merely serve to mystify the process further. This is the language of mark schemes, terminology that may make sense to experts but leaves novices confused. Creating worked examples (in English this may mean sentences, paragraphs or essays) exemplifies these opaque terms, converting the abstract into the concrete.

Sweller et al posit ‘worked examples can efficiently provide us with the problem solving schemas that need to be stored in long-term memory’. Studying worked examples is beneficial because it helps to build and develop students’ background knowledge within their long term memories, information that can then be recalled and applied when attempting problems. The grandmasters in the chess study were successful because of the breadth and depth of their background knowledge. Similarly, English teachers find writing (one of the problems in our domain) easy because we have long term memories that contain myriad ‘problem solving schemas’ and mental representations of analytical responses to texts.

If we accept the notion that short-term memory capacity is pretty much fixed as well as the idea that we cannot really teach generic higher order thinking skills, then building domain specific background knowledge may be our most important job as teachers. Studying worked examples is more effective and efficient than merely attempting problems: deconstructing and studying model sentences, paragraphs and essays should, in the long run, be superior to merely writing them.

Research into The Worked Example Effect in English

In Cognitive Load Theory, Sweller et al refer to English, the humanities and the arts as ‘ill structured learning domains’ to distinguish them from mathematics and science. They make the point that while maths and science problems have ‘clearly specified problem states or problem solving operators’, essentially rules that dictate process and approach, ‘ill structured domains’ do not have such rigid constraints. There are subjective elements within English and often innumerable ways of approaching a task, so different approaches may be considered of equal worth and demonstrate a comparable level of proficiency. The variables within analytical writing can, like the colours within a painter’s palette, be arranged in numerous and diverse patterns; however, these different configurations can be judged to contain equivalent skill and quality. Despite this, the researchers make the important point that ‘the cognitive architecture…does not distinguish between well-structured and ill-structured problems’, meaning that the findings of Cognitive Load Theory apply to all domains. The researchers also posit ‘the solution variations for ill-structured problems are larger than for well-structured problems but they are not infinite and experts have learned more of the possible variations than novices.’ Over the years, teachers have read, thought about and produced innumerable pieces of analysis and, as a result, have developed rich schemata of this kind of knowledge which we can recall, choose from and apply when dealing with problems.

Sweller et al point out that ‘even though some exposure to worked examples is used in most traditional instructional procedures, worked examples, to be most effective, need to be used much more systematically and consistently to reduce the influence of extraneous problem-solving demands’. A five-year curriculum that systematically and consistently uses worked examples should help students build rich schemata of ‘possible variations’, moving them more quickly and efficiently along the continuum from novice to expert than if they had just completed lots of writing tasks. The constant studying of concrete worked examples is far superior to describing proficiency using abstracted and often vague descriptors and success criteria. When describing complex performance in the absence of concrete examples, which is the purpose of a mark scheme, the sheer breadth and possible variation of what is being described necessitates a wide lens of representation. While this is advantageous to the expert, allowing complexity to be summarised and condensed, it is obfuscatory and perhaps even meaningless for students. Experts have detailed and abundant schemata that exemplify abstract terms like ‘critical analysis’, ‘judicious references’ and ‘contextual factors’; novices do not.

In Cognitive Load Theory, two studies directly relevant to English are referenced. In the first (Oksa, Kalyuga and Chandler 2010), students were given extracts from Shakespearean plays, half receiving texts with accompanying explanatory notes, the other half receiving no additional notes. Perhaps unsurprisingly, the group who were given the notes performed better on a comprehension task. In another study cited in the book (Kyun, Kalyuga and Sweller), students were given an essay question to answer. One group received model answers to study, the other did not. The study found that ‘the worked example group performed significantly better than the conventional problem-solving group’.

What does this look like in English?

If we want students to perform well in complex tasks like writing, we should be giving them the necessary information ‘directly and explicitly’. Echoing Engelmann’s sentiment that we should teach everything students will need, Sweller et al’s work also points to the superiority of explicit, direct instruction, approaches that seem more efficient and effective for novice learners. With regard to English, we should be explicitly teaching sentence structures and vocabulary. We should provide this information to students when they are completing extended writing and one way of doing this is through vocabulary tables that contain definitions and examples. Not just examples of how the vocabulary words are used, but also examples of the sentence styles that students should include. Each of these example sentences is a worked example in itself and, with effective teacher questioning and annotation, can be a powerful way of turning abstract and amorphous success criteria (use sophisticated sentences/use a range of complex sentences etc) into concrete examples that the learner can ‘study and emulate’. Here is a section of a vocabulary table for ‘London’, one of the poems studied for GCSE poetry:

[Image: vocab table]

To minimise cognitive load, students have these tables when they are annotating the poem, allowing them to make the link between text and interpretation.

Although Cognitive Load Theory contains a number of different effects, the worked example effect is described by the researchers as being ‘the most important’ and, because of this importance, we have incorporated it into all stages and aspects of our curriculum. Almost always, when students are asked to write, they will have studied a related and relevant worked example.

Next post: The Alternation Strategy: How example problem pairs can work in English.