Insights from DI part 9: The Sequencing of Skills

This is the ninth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first eight can be found here: one, two, three, four, five, six, seven, eight.

Like the last post, this one will primarily examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here.

In the last few posts, I have explored a number of factors that determine whether or not an instructional sequence is effective. Here is a quick recap:

a) We should choose to teach high utility concepts that have wide applications, ensuring that students can ‘exhibit generalised performance to the widest possible range of examples and situations.’

b) Teaching through examples and non-examples may be more efficient than relying upon lengthy, abstract and confusing explanations. Examples should be rigorously chosen and sequenced to maximise clarity and efficacy.

c) Instructional formats should be suited to the level of proficiency of the students. Earlier on, they should be overt and split into logical, sequential steps, allowing teachers to give precise feedback and error diagnosis. Later on, the steps should be removed, encouraging more independent application.

d) Linked heavily to the last point, practice activities and tasks should broadly move along six different continuums as students develop in proficiency.

The Sequencing of Skills

One of the core premises of Direct Instruction is that ‘students should be well prepared for each step of the program to maintain a high rate of success’. On new material, students should be at least 70% successful; on material that is being firmed and practised, they should be 90% successful. In order to achieve these staggeringly high and impressive success rates, sequences of learning should be systematically ordered according to four main guidelines.

1) Prerequisite skills for a strategy should be taught before the strategy itself.

In a previous post I explored the idea that DI schemes teach ‘everything students will need for later applications’, and this is almost always achieved by teaching the individual, component parts of a more complex skill. The most obvious example of this idea is decoding. If a student cannot decode properly (and it is unfortunate that some still leave primary school in this position), then all other higher order skills are unattainable. You cannot understand the meaning of a text if you cannot convert graphemes into phonemes. You cannot analyse language if you cannot decode it. In fact, if you cannot decode properly, then it is very hard to learn anything at all. Although a few years ago I would perhaps have accepted that some people will never learn to read (ignorance allows the mind to form all kinds of excusatory justifications), I have since learnt that almost everyone can learn to decode, given the right instruction. At our school, we have six teachers, including myself, who are trained to deliver Thinking Reading, a systematic and highly effective reading intervention. Our first student graduated from the programme a month or so ago: she began the course decoding at the level of a nine-year-old and, after six months, she is now decoding at the level of a fifteen-year-old, making an average rate of progress of one year for every three hours of instruction. If you have students who are weak decoders, then I would highly recommend Thinking Reading!

I wrote here about the importance of teaching the different components that make up a complex task before students actually attempt the more difficult application. Here is another example:

[Image: components before whole]

The underlined parts of the sentences are noun appositives, constructions that rename nouns within a sentence. The bullet pointed list contains some (not all: you can break this down into far more components!) of the ‘prerequisite skills’ that would need to be taught before students can successfully attempt to create appositives. Kris Boulton wrote a series of blogs about how he applied Engelmann’s ideas to teaching simultaneous equations; this one explores the thirteen sub-components that he decided needed explicitly teaching before students attempted the entire equation.
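For readers who like to see the logic spelled out, the 'prerequisites before strategy' rule can be treated as an ordering problem. The sketch below is only an illustration: the skill names and their dependencies are hypothetical stand-ins that I have made up (they are not the actual list from the image above), and it uses Python's standard `graphlib` module to produce a teaching order in which every prerequisite appears before the skill that depends on it.

```python
from graphlib import TopologicalSorter

# Hypothetical skill map (an assumption for illustration only):
# each skill is mapped to the skills that must be taught before it.
PREREQUISITES = {
    "identify nouns": [],
    "identify noun phrases": ["identify nouns"],
    "punctuate with commas": [],
    "write noun appositives": ["identify noun phrases", "punctuate with commas"],
}


def teaching_order(prerequisites):
    """Return the skills in an order where every prerequisite comes first."""
    return list(TopologicalSorter(prerequisites).static_order())


order = teaching_order(PREREQUISITES)
print(order)
```

Several valid orders usually exist; the point is simply that the complex skill (here, writing appositives) can only appear once all of its components have been taught.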

2) Instances consistent with a strategy should be taught before exceptions to that strategy.

According to the article, ‘Students learn a strategy best when they do not have to deal with exceptions’. When learning something new, exceptions will confuse students and impede their understanding, particularly if the learning is centred around a rule or some form of ‘If-then’ statement. However, ‘once students have mastered the basic strategy, they should be introduced to exceptions. For example, when the VCe rule is first introduced, students apply the rule to many examples (e.g. note) and nonexamples (e.g. not). Only when they are proficient with these kinds of words will they be introduced to exception words (e.g. done).’

Here is an example about eusociality.

[Image: incident exception]

Most eusocial species are insects, many of which belong to the Hymenoptera order. If you were teaching this concept and following Engelmann’s sequencing rule, it may make sense to deal with this large set first, waiting until students have mastered it before introducing mammalian and crustacean examples.

In Theory of Instruction, Engelmann presents detailed instructions about how to deal with large sets of items, specifically sets that contain subsets that should be split from the main group and taught separately to avoid confusion.

3) Easy skills should be taught before more difficult ones.

There seems to be a position amongst teachers that allowing students to struggle is a good and desirable thing. Lots of people seem to be talking about Grit at the moment, valorising the idea of resilience in the face of difficulty. Some teachers equate hard work with learning, seeing a strong and fixed line of causation running between them. The problem with this idea is that the success of a program of study is predicated on the tenacity and perseverance of the student instead of the quality of the curriculum or the teacher. While persistence and stoicism are admirable traits that we should praise and encourage, relying upon them for a student’s success seems risky: if the student does not have these characteristics, what then? When I began teaching, I foolishly and ignorantly believed that if I raised the level of challenge, students would magically respond with increased determination and effort, resulting in improved attainment and higher levels of proficiency. Unsurprisingly, this did not work. All I was doing was increasing their feelings of inadequacy and highlighting their lack of understanding whilst doing nothing at all to help them improve. If students begin with easier tasks, they are far more likely to succeed; as the article points out ‘The experience of success is one of the most important bases of motivation in the classroom.’

Here is a possible overview of KS3 poetry teaching, following the idea that easier skills should be taught first.

Year 7: Teach techniques, sentence level analysis and single paragraph responses.

  • Focusing on the sub-components of analysis and getting students to master them in isolation is probably more effective than continually practising the production of multi-paragraph analysis. Not only is feedback easier to give when skills are drilled, but students experience success which is the ultimate form of motivation.

Year 8: Begin to teach comparative structures; multiple paragraph responses.

  • Comparing poems is hard: synthesising information from two texts is more complex than dealing with just one. Like in year 7, we begin by drilling comparative structures in isolation, building up to writing comparative multi-paragraph responses.

Year 9: How to approach unseen

  • We deliberately leave unseen until year 9 as success here depends heavily upon a student’s vocabulary and background knowledge. Although we teach an approach to unseen in year 9, they still experience the majority of the poems that they encounter through teacher led, explicit instruction. Once students have mastered the approach to unseen, there are diminishing returns to practising lots of unseen poems. Assuming students have mastered the generic approach to unseen poems, if students are attempting multiple unseen tasks without support, what are they learning? Could this time be spent teaching more complex poems and the rich vocabulary that would be needed when responding to them?

GCSE: Consolidation of KS3; analytical essay introductions; development of entire essay responses.

  • The idea here is that students enter KS4 having mastered everything that they will need for GCSE, allowing these two years to be used for honing whole essay responses.

Crucially, this progression model is cumulative and all of the easier skills that are taught in the earlier years are encountered, used and applied throughout all subsequent poetry sequences. Students are taught sibilance in year 7 in a unit on Poetry from other Cultures. They will use it again with Romantic poetry, Civil Rights poetry and Dystopian poetry in year 8. In year 9, they will need it when looking at War poetry, Victorian poetry and Shakespearean Sonnets. For GCSE, they will need it again when analysing their anthologies.

4) Keep confusing things separate.

If things are incredibly similar, then we should not introduce them at the same time. I am currently using ‘Teach Your Child to Read in 100 Easy Lessons’ to teach my daughter to read. The ‘d’ sound is taught in lesson 12 and the course waits until lesson 54 to teach the ‘b’ sound because of the huge potential for confusion. Both are voiced consonant sounds and the symbols are exactly the same except for their orientation on the page, reasons why confusion between these two is common amongst weak readers.

Participle phrases and some absolute phrases are very similar and as a result, they should be separated to minimise confusion. As absolute phrases are the furthest away from functional, everyday language, I would probably teach them last, beginning instead with participle phrases.

Present Participle phrase: Raising her voice, she seemed to be getting angry.

Absolute phrase: Her voice raising, she seemed to be getting angry.

Both constructions not only contain participles, but they use exactly the same words, demonstrating just how close these two phrases are to each other.

Whatever subject you teach, there will be numerous examples of pairs or groups of concepts that students confuse, perhaps because of their function, perhaps because of their spelling or maybe because they have similar definitions. If these points of confusion are already apparent, then thinking deliberately about when they are taught and separating them from each other may go some way towards preventing this confusion.

Next post: Cognitive Load Theory - The Worked Example Effect.





Insights from DI part 8: The 6 Shifts of Task Design

This is the eighth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first seven can be found here: one, two, three, four, five, six, seven.

Like the last post, this one will primarily examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here.

The 6 ‘shifts’ of task design

The last post looked at the idea that ‘the support that is so important during initial instruction must be gradually reduced until students are using the skill independently, with no teacher assistance.’ According to the article, ‘Becker and Carnine (1980) described six ‘shifts’ that should occur in any well-designed teaching program to facilitate this transition’.

1) The shift from ‘overtised’ to ‘covertised’ problem-solving strategies.

Initial instruction will break down a concept or strategy into multiple individual steps (see ‘Format 1’ in this post). In Theory of Instruction, Engelmann explains the difference between physical and cognitive operations. A physical operation includes things like ‘fitting jigsaw puzzles together, throwing a ball, ‘nesting’ cups together, swimming, buttoning a coat’: essentially any process that involves a series of steps that may include motion or the manipulation of matter, and one that receives immediate ‘feedback’ from the environment. If you are trying to hammer a nail and not doing it properly, the environment will give you ‘feedback’ and prevent you from completing the physical operation. Perhaps you missed the nail. Perhaps you didn’t use a hammer. Maybe you didn’t hit the nail hard enough. Perhaps you used the wrong striking technique. The important idea is that it is easy for an instructor to observe the reason behind your failure because the entire process is overt and each step is observable. For cognitive operations, ‘there are no necessary overt behaviors to account for the outcome that is achieved’. Essentially, we do not know how an outcome has been reached unless all the individual steps are made overt. A student may have got lucky; they may have relied upon a rule that, while being effective in this instance, may end up causing problems as a sequence becomes more complex. Unlike with physical operations like hitting a nail with a hammer, the physical environment does not provide feedback for cognitive operations. This passage from Theory of Instruction explains this idea further:

‘The physical environment does not provide feedback when the learner is engaged in cognitive operations. If the learner misreads a word, the physical environment does nothing. It does not prevent the learner from saying the wrong word. It does not produce an unpleasant consequence. The learner could look at the word form and call it ‘Yesterday’ without receiving any response from the physical environment. The basic properties of cognitive operations-from long division to inferential reading-suggest both that the naïve learner cannot consistently benefit from unguided practice or from unguided discovery of cognitive operations. Unless the learner is provided with some logical basis for figuring out possible inconsistencies (which is usually not available to the naïve learner), practicing the skills without human feedback is likely to promote mistakes.’

Making the steps of a cognitive operation overt allows extremely precise feedback to be given as the instructor can easily see the reason behind a particular outcome. During the acquisition phase of learning (the initial stage where, in the absence of teacher support, a student would soon become confused), precise and immediate feedback is of crucial importance, preventing errors from becoming embedded and ensuring accuracy is achieved. If students engage in ‘unguided practice or…unguided discovery’ then they will flounder; this is the central idea behind the influential Kirschner, Sweller and Clark paper, which critiques one of the popular premises behind progressive education.

How does this apply to English?

1) In this post, I demonstrated an initial (flawed) communication sequence for teaching students to select and punctuate quotations, exemplifying how the covert can be made overt.

2) Precise feedback is vital. Try making restrictive practice activities that isolate and focus on the specific content that is being taught; this post explains one way of doing this.

3) When students write sentences, ask them to feed back orally by saying/narrating the punctuation, turning the covert into the overt and allowing you to quickly and efficiently ascertain if they have succeeded without having to actually mark or look at their book. If appropriate, other students can listen and give feedback about accuracy too!


Teacher: Read out your absolute phrase sentence, narrating the punctuation.

Student: His paranoia growing COMMA Macbeth seems increasingly unstable and unhinged FULL STOP.

4) Live modelling: creating example sentences, paragraphs or even essays under the visualiser whilst narrating your thought process is a really powerful way of letting students observe the process of writing. When this process is made interactive with lots of teacher-student questioning, it can be even more effective.

2) The shift from simplified contexts to complex contexts.

[Image: simple to complex]

In sport as much as in teaching, drills deliberately isolate elements to practise, increasing the amount of practice and decreasing cognitive load. Drills allow students to focus on ‘critical new learning’.

In Expressive Writing 2, one of the DI corrective writing schemes, one of the key skills that is taught is punctuating direct speech. Here is an overview of the ‘track’ (series of lessons) where this skill is taught and applied: this track spans 48 lessons! (See this post which explains the idea of a strand curriculum where multiple ‘tracks’ are intertwined over time).

1) Lessons 2, 3 and 4: The rules for punctuating direct speech are introduced. Students are heavily directed by teachers and, following detailed and precise modelling of examples, have to complete isolated sentences of punctuated speech.

2) Lessons 5 to 9: Students write their own simple sentences that contain direct speech with minimal prompts. Sometimes they are statements; sometimes they are questions.

3) Lessons 10, 11 and 12: Students edit and correct individual sentences that contain direct speech, adding in missing punctuation marks and capital letters.

4) Lessons 13 and 14: Students edit sentences in a paragraph that includes direct speech, adding in missing punctuation marks and capital letters.

5) Lessons 16 and 17: Students punctuate direct speech that includes two consecutive sentences.

6) Lesson 18: Students edit and correct an entire passage with two-sentence direct speech.

7) Lessons 24 to 28: Students punctuate direct speech that appears at the start of a sentence.

8) Lessons 33 to 37: Students punctuate sentences that include two different pieces of direct speech that are separated.

The practice activities gradually build in complexity in two ways: firstly, the subject matter and what is being taught slowly becomes more challenging; secondly, the context of the practice changes from isolated, supported drills to increasingly complex and contextualised activities. As the scheme progresses, students are increasingly asked to apply these component skills within a freer piece of writing, something that almost all of the lessons end with.

How does this apply to English?

Following this ‘shift’, I wrote here about how skills and items that are taught can be moved from simplified to more complex contexts.

3) The shift from prompted to unprompted formats


According to the article, ‘In the early stages of instruction, formats include prompts to help focus students’ attention on important aspects of the item and to increase their success. These prompts are later systematically removed as students gain a skill. By the end of the instruction, students apply the skill without any prompts.’

How does this apply to English?

1) In the earlier stages of instruction, worked examples should be labelled clearly, identifying relevant bits and making clear to students which parts are important.

Here is an example:

[Image: 6 skills]

2) When teaching sentence styles, arrows and labelling can help make the implicit interactions and relationships between different components obvious to students.

Here is an example:

[Image: prompt grammar]

4) The shift from massed practice to distributed practice

According to the article, ‘Initially, students learn a new skill best when they have many practice opportunities in a short period of time. In later learning, retention is enhanced by practice opportunities conducted over a long period of time.’ During the acquisition stage of learning, it may be helpful to have multiple practice opportunities in order for students to become proficient with a concept. As the sequence progresses, this practice should become increasingly distributed.
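To make the massed-to-distributed shift concrete, here is a minimal sketch of how the practice opportunities for one concept might be spread across lessons. Everything numerical in it is my own assumption rather than anything prescribed by the article or by Engelmann: three consecutive lessons of massed practice, followed by four further encounters with a gap that doubles each time. The function name `practice_schedule` and its parameters are likewise hypothetical.

```python
def practice_schedule(start_lesson, massed_sessions=3,
                      distributed_sessions=4, first_gap=2):
    """Return the lesson numbers in which a concept is practised.

    Assumed model: practice happens in consecutive lessons at first
    (massed), then the gap between encounters doubles (distributed).
    """
    # Massed phase: practise in back-to-back lessons.
    lessons = [start_lesson + i for i in range(massed_sessions)]

    # Distributed phase: each gap is twice as long as the one before.
    gap = first_gap
    for _ in range(distributed_sessions):
        lessons.append(lessons[-1] + gap)
        gap *= 2
    return lessons


schedule = practice_schedule(start_lesson=1)
print(schedule)  # [1, 2, 3, 5, 9, 17, 33]
```

A doubling gap is just one simple way to thin out practice over time; a real scheme of work would adjust the spacing in response to how well students are retaining the concept.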

How does this apply to English?

1) This post looks at an example lesson plan, exemplifying how concepts are initially taught via massed practice and then move to distributed practice, through regular and systematic retrieval practice.

2) This post looks at the journey of a test item across multiple lessons, again moving from massed to distributed practice.

5) The shift from immediate feedback to delayed feedback

[Image: immediate delayed]

At the beginning of an instructional sequence, feedback should be immediate and precise, preventing errors from becoming embedded. Although feedback is lauded as a universally positive thing (the implication being that the more you give, the better students will perform and learn), this notion is overly simplistic and erroneous. The EEF point out that ‘Feedback studies tend to show very high effects on learning. However, it also has a very high range of effects and some studies show that feedback can have negative effects and make things worse.’ Referencing Soderstrom and Bjork’s ‘Learning versus performance’ paper (accessible here), David Didau points out that ‘there is empirical evidence that “delaying, reducing, and summarizing feedback can be better for long-term learning than providing immediate, trial-by-trial feedback.”’ This last point seems to corroborate Engelmann’s idea that the optimum type of feedback will change according to the stage of instruction.

How does this apply to English?

1) Initial practice activities of all concepts (analytical skill, vocabulary, context knowledge, sentence styles, punctuation etc) should be through restrictive and isolated drills, allowing precise and immediate feedback to be given. A teacher can either circulate, giving verbal feedback, or write a model and, by displaying it under the visualiser, provide students with an answer against which to check their own efforts.

2) Regular cumulative recap quizzes at the start of lessons provide the perfect opportunity for immediate feedback regarding spelling and conceptual understanding. Referring to the ‘hypercorrection effect’, the idea that ‘The more confident someone is that an incorrect answer is correct, the more likely they are not to repeat the error if they are corrected’, Dylan Wiliam explains that ‘The benefits of testing come from the retrieval practice that students get when they take the test, and the hypercorrection effect when they find out answers they thought were correct were in fact incorrect. In other words, the best person to mark a test is the person who just took it.’ Following this advice, we conduct retrieval practice under the visualiser, filling in the answers immediately after the quiz, and asking students to check and correct their own work.

6) The shift from an emphasis on the teacher’s role as a source of information to an emphasis on the learner’s role as a source of information

This shift matches the idea of ‘I-we-you’, where responsibility and agency gradually move from teacher to student. This table from p.72 of Teach Like a Champion illustrates this shift:

[Image: i we you]

How does this apply to English?

Let’s take an example from writing paragraphs:

1) The teacher writes a paragraph under the visualiser, narrating thought process and decisions. (The ‘I’ stage)

Using a semantic field of light, Dickens describes Scrooge’s room with words like ‘bright…gleaming…glistened’, symbolising warmth, comfort and opulence. Perhaps the ghost wants Scrooge to experience a convivial and celebratory scene so that Scrooge will not only realise that being as ‘solitary as an oyster’ is a bad choice, but that he could spend his money and enjoy himself instead.

  • If students have already acquired the analytical skills, you could ask them to identify them (3 part explanation/evidence in explanation/tentative language/multiple interpretations)
  • Quick fire questioning about vocabulary (convivial/opulence/semantic field) will provide valuable retrieval practice.


2) Teacher begins a second paragraph under the visualiser, asking students to help them complete it. (The ‘We’ stage)

Using a hyperbolic metaphor, Dickens describes the food as ‘…..

  • If you use the same structure as the first paragraph, students have a framework to follow (taking advantage of the alternation effect from Cognitive Load Theory)
  • The teacher can prompt students to use taught vocabulary or specific sentence styles
  • When complete, you could undertake another round of quick fire questioning, again providing valuable retrieval practice.

3) Students now have 2 models to use as analogies. They should then write their own paragraph using different evidence but following the same process and structure. (The ‘You’ Stage)


Next post: Insights from DI part 9 - The Sequencing of Skills. What principles should guide how we order instruction?