Insights from DI part 9: The Sequencing of Skills

This is the ninth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first eight can be found here: one, two, three, four, five, six, seven, eight

Like the last post, this one will primarily examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here:

In the last few posts, I have explored a number of factors that determine whether or not an instructional sequence is effective. Here is a quick recap:

a) We should choose to teach high utility concepts that have wide applications, ensuring that students can ‘exhibit generalised performance to the widest possible range of examples and situations.’

b) Teaching through examples and non-examples may be more efficient than relying upon lengthy, abstract and confusing explanations. Examples should be rigorously chosen and sequenced to maximise clarity and efficacy.

c) Instructional formats should be suited to the level of proficiency of the students. Earlier on, they should be overt and split into logical, sequential steps, allowing teachers to give precise feedback and error diagnosis. Later on, the steps should be removed, encouraging more independent application.

d) Linked heavily to the last point, practice activities and tasks should broadly move along six different continuums as students develop in proficiency.

The Sequencing of Skills

One of the core premises of Direct Instruction is that ‘students should be well prepared for each step of the program to maintain a high rate of success’. On new material, students should be at least 70% successful; on material that is being firmed and practised, they should be 90% successful. In order to achieve these staggeringly high success rates, sequences of learning should be systematically ordered according to four main guidelines.

1) Prerequisite skills for a strategy should be taught before the strategy itself.

In a previous post I explored the idea that DI schemes teach ‘everything students will need for later applications’, and this is almost always achieved by teaching the individual component parts of a more complex skill. The most obvious example of this idea is decoding. If a student cannot decode properly (and it is unfortunate that some still leave primary school in this position), then all other higher order skills are unattainable. You cannot understand the meaning of a text if you cannot convert graphemes into phonemes. You cannot analyse language if you cannot decode it. In fact, if you cannot decode properly, then it is very hard to learn anything at all.

A few years ago, I would perhaps have accepted that some people will never learn to read (ignorance allows the mind to form all kinds of excusatory justifications), but I have since learnt that almost everyone can learn to decode, given the right instruction. At our school, we have six teachers, including myself, who are trained to deliver Thinking Reading, a systematic and highly effective reading intervention. Our first student graduated from the programme a month or so ago: she began the course decoding at the level of a nine-year-old and, after six months, she is now decoding at the level of a fifteen-year-old, an average rate of progress of one year for every three hours of instruction. If you have students who are weak decoders, then I would highly recommend Thinking Reading!

I wrote here about the importance of teaching the different components that make up a complex task before students actually attempt the more difficult application. Here is another example:

[Image: components before whole]

The underlined parts of the sentences are noun appositives, constructions that rename nouns within a sentence. The bullet-pointed list contains some (not all: you can break this down into far more components!) of the ‘prerequisite skills’ that would need to be taught before students can successfully attempt to create appositives. Kris Boulton wrote a series of blogs about how he applied Engelmann’s ideas to teaching simultaneous equations; this one explores the thirteen sub-components that he decided needed to be taught explicitly before students attempted the entire equation.

2) Instances consistent with a strategy should be taught before exceptions to that strategy.

According to the article, ‘Students learn a strategy best when they do not have to deal with exceptions’. When learning something new, exceptions will confuse students and impede their understanding, particularly if the learning is centred on a rule or some form of ‘If-then’ statement. However, ‘once students have mastered the basic strategy, they should be introduced to exceptions. For example, when the VCe rule is first introduced, students apply the rule to many examples (e.g. note) and nonexamples (e.g. not). Only when they are proficient with these kinds of words will they be introduced to exception words (e.g. done)’.

Here is an example about eusociality.

[Image: incident exception]

Most eusocial species are insects, many of which belong to the Hymenoptera order. If you were teaching this concept and following Engelmann’s sequencing rule, it may make sense to deal with this large set first, waiting until students have mastered it before introducing mammalian and crustacean examples.

 In Theory of Instruction, Engelmann presents detailed instructions about how to deal with large sets of items, specifically sets that contain subsets that should be split from the main group and taught separately to avoid confusion.

3) Easy skills should be taught before more difficult ones.

There seems to be a position amongst teachers that allowing students to struggle is a good and desirable thing. Lots of people seem to be talking about Grit at the moment, valorising the idea of resilience in the face of difficulty. Some teachers equate hard work with learning, seeing a strong and fixed line of causation running between them. The problem with this idea is that the success of a program of study is predicated on the tenacity and perseverance of the student instead of the quality of the curriculum or the teacher. While persistence and stoicism are admirable traits that we should praise and encourage, relying upon them for a student’s success seems risky: if the student does not have these characteristics, what then? When I began teaching, I foolishly and ignorantly believed that if I raised the level of challenge, students would magically respond with increased determination and effort, resulting in improved attainment and higher levels of proficiency. Unsurprisingly, this did not work. All I was doing was increasing their feelings of inadequacy and highlighting their lack of understanding whilst doing nothing at all to help them improve. If students begin with easier tasks, they are far more likely to succeed; as the article points out ‘The experience of success is one of the most important bases of motivation in the classroom.’

Here is a possible overview of KS3 poetry teaching, following the idea that easier skills should be taught first.

Year 7: Teach techniques, sentence level analysis and single paragraph responses.

  • Focusing on the sub-components of analysis and getting students to master them in isolation is probably more effective than continually practising the production of multi-paragraph analysis. Not only is feedback easier to give when skills are drilled, but students experience success which is the ultimate form of motivation.

Year 8: Begin to teach comparative structures; multiple paragraph responses.

  • Comparing poems is hard: synthesising information from two texts is more complex than dealing with just one. Like in year 7, we begin by drilling comparative structures in isolation, building up to writing comparative multi-paragraph responses.

Year 9: How to approach unseen

  • We deliberately leave unseen until year 9 as success here depends heavily upon a student’s vocabulary and background knowledge. Although we teach an approach to unseen in year 9, they still experience the majority of the poems that they encounter through teacher-led, explicit instruction. Once students have mastered the approach to unseen, there are diminishing returns to practising lots of unseen poems. If students who have mastered the generic approach are attempting multiple unseen tasks without support, what are they learning? Could this time be spent teaching more complex poems and the rich vocabulary that would be needed when responding to them?

GCSE: Consolidation of KS3; analytical essay introductions; development of entire essay responses.

  • The idea here is that students enter KS4 having mastered everything that they will need for GCSE, allowing these two years to be used for honing whole essay responses.

Crucially, this progression model is cumulative and all of the easier skills that are taught in the earlier years are encountered, used and applied throughout all subsequent poetry sequences. Students are taught sibilance in year 7 in a unit on Poetry from other Cultures. They will use it again with Romantic poetry, Civil Rights poetry and Dystopian poetry in year 8. In year 9, they will need it when looking at War poetry, Victorian poetry and Shakespearean Sonnets. For GCSE, they will need it again when analysing their anthologies.

4) Keep confusing things separate.

If things are incredibly similar, then we should not introduce them at the same time. I am currently using ‘Teach Your Child to Read in 100 Easy Lessons’ to teach my daughter to read. The ‘d’ sound is taught in lesson 12 and the course waits until lesson 54 to teach the ‘b’ sound because of the huge potential for confusion. Both are voiced consonant sounds and the symbols are exactly the same except for their orientation on the page, reasons why confusion between these two is common amongst weak readers.

Participle phrases and some absolute phrases are very similar and as a result, they should be separated to minimise confusion. As absolute phrases are the furthest away from functional, everyday language, I would probably teach them last, beginning instead with participle phrases.

Present Participle phrase: Raising her voice, she seemed to be getting angry.

Absolute phrase: Her voice raising, she seemed to be getting angry.

Both constructions not only contain participles, but they use exactly the same words, demonstrating just how close these two phrases are to each other.

Whatever subject you teach, there will be numerous examples of pairs or groups of concepts that students confuse, perhaps because of their function, perhaps because of their spelling or maybe because they have similar definitions. If these points of confusion are already apparent, then thinking deliberately about when they are taught and separating them from each other may go some way towards preventing this confusion.

Next post: Cognitive Load Theory-The Worked Example Effect.





Insights from DI part 8: The 6 Shifts of Task Design

This is the eighth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first seven can be found here: one, two, three, four, five, six, seven.

Like the last post, this one will primarily examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here.

The 6 ‘shifts’ of task design

The last post looked at the idea that ‘the support that is so important during initial instruction must be gradually reduced until students are using the skill independently, with no teacher assistance.’ According to the article, ‘Becker and Carnine (1980) described six ‘shifts’ that should occur in any well-designed teaching program to facilitate this transition’.

1) The shift from ‘overtised’ to ‘covertised’ problem-solving strategies.

Initial instruction will break down a concept or strategy into multiple individual steps (see ‘Format 1’ in this post). In Theory of Instruction, Engelmann explains the difference between physical and cognitive operations. A physical operation includes things like ‘fitting jigsaw puzzles together, throwing a ball, ‘nesting’ cups together, swimming, buttoning a coat’: essentially any process that involves a series of steps that may include motion or the manipulation of matter, and one that receives immediate ‘feedback’ from the environment. If you are trying to hammer a nail and not doing it properly, the environment will give you ‘feedback’ and prevent you from completing the physical operation. Perhaps you missed the nail. Perhaps you didn’t use a hammer. Maybe you didn’t hit the nail hard enough. Perhaps you used the wrong striking technique. The important idea is that it is easy for an instructor to observe the reason behind your failure because the entire process is overt and each step is observable. For cognitive operations, ‘there are no necessary overt behaviors to account for the outcome that is achieved’. Essentially, we do not know how an outcome has been reached unless all the individual steps are made overt. A student may have got lucky; they may have relied upon a rule that, while being effective in this instance, may end up causing problems as a sequence becomes more complex. Unlike with physical operations like hitting a nail with a hammer, the physical environment does not provide feedback for cognitive operations. This passage from Theory of Instruction explains this idea further:

‘The physical environment does not provide feedback when the learner is engaged in cognitive operations. If the learner misreads a word, the physical environment does nothing. It does not prevent the learner from saying the wrong word. It does not produce an unpleasant consequence. The learner could look at the word form and call it ‘Yesterday’ without receiving any response from the physical environment. The basic properties of cognitive operations-from long division to inferential reading-suggest both that the naïve learner cannot consistently benefit from unguided practice or from unguided discovery of cognitive operations. Unless the learner is provided with some logical basis for figuring out possible inconsistencies (which is usually not available to the naïve learner), practicing the skills without human feedback is likely to promote mistakes.’

Making the steps of a cognitive operation overt allows extremely precise feedback to be given as the instructor can easily see the reason behind a particular outcome. During the acquisition phase of learning (the initial stage where, in the absence of teacher support, a student would soon become confused), precise and immediate feedback is of crucial importance, preventing errors from becoming embedded and ensuring accuracy is achieved. If students engage in ‘unguided practice or…unguided discovery’ then they will flounder; this is the central idea behind the influential Kirschner, Sweller and Clark paper which critiques one of the popular premises behind progressive education.

How does this apply to English?

 1) In this post, I demonstrated an initial (flawed) communication sequence for teaching students to select and punctuate quotations, exemplifying how the covert can be made overt.

2) Precise feedback is vital. Try making restrictive practice activities that isolate and focus on the specific content that is being taught: this post explains one way of doing this.

3) When students write sentences, ask them to feed back orally by saying/narrating the punctuation, turning the covert into the overt and allowing you to quickly and efficiently ascertain if they have succeeded without having to actually mark or look at their book. If appropriate, other students can listen and give feedback about accuracy too!


Teacher: Read out your absolute phrase sentence, narrating the punctuation.

Student: His paranoia growing COMMA Macbeth seems increasingly unstable and unhinged FULL STOP.

4) Live modelling: creating example sentences, paragraphs or even essays under the visualiser whilst narrating your thought process is a really powerful way of letting students observe the process of writing. When this process is made interactive with lots of teacher-student questioning, it can be even more effective.

2) The shift from simplified contexts to complex contexts.

[Image: simple to complex]

In sport as much as in teaching, drills deliberately isolate elements to practise, increasing the amount of practice and decreasing cognitive load. Drills allow students to focus on ‘critical new learning’.

In Expressive Writing 2, one of the DI corrective writing schemes, one of the key skills that is taught is punctuating direct speech. Here is an overview of the ‘track’ (series of lessons) where this skill is taught and applied: this track spans 48 lessons! (See this post which explains the idea of a strand curriculum where multiple ‘tracks’ are intertwined over time).

1) Lessons 2 to 4: The rules for punctuating direct speech are introduced. Students are heavily directed by teachers and, following detailed and precise modelling of examples, have to complete isolated sentences of punctuated speech.

2) Lessons 5 to 9: Students write their own simple sentences that contain direct speech with minimal prompts. Sometimes they are statements; sometimes they are questions.

3) Lessons 10 to 12: Students edit and correct individual sentences that contain direct speech, adding in missing punctuation marks and capital letters.

4) Lessons 13 and 14: Students edit sentences in a paragraph that includes direct speech, adding in missing punctuation marks and capital letters.

5) Lessons 16 and 17: Students punctuate direct speech that includes two consecutive sentences.

6) Lesson 18: Students edit and correct an entire passage with two-sentence direct speech.

7) Lessons 24 to 28: Students punctuate direct speech that appears at the start of a sentence.

8) Lessons 33 to 37: Students punctuate sentences that include two different pieces of direct speech that are separated.

The practice activities gradually build in complexity in two ways: firstly, the subject matter and what is being taught slowly becomes more challenging; secondly, the context of the practice changes from isolated, supported drills to increasingly complex and contextualised activities. As the scheme progresses, students are increasingly asked to apply these component skills within a freer piece of writing, something that almost all of the lessons end with.

How does this apply to English?

Following this ‘shift’, I wrote here about how skills and items that are taught can be moved from simplified to more complex contexts.

3) The shift from prompted to unprompted formats


According to the article, ‘In the early stages of instruction, formats include prompts to help focus students’ attention on important aspects of the item and to increase their success. These prompts are later systematically removed as students gain a skill. By the end of the instruction, students apply the skill without any prompts.’

How does this apply to English?

1) In the earlier stages of instruction, worked examples should be labelled clearly, identifying relevant bits and making clear to students which parts are important.

Here is an example:

[Image: 6 skills]

2) When teaching sentence styles, arrows and labelling can help make the implicit interactions and relationships between different components obvious to students.

Here is an example:

[Image: prompt grammar]

4) The shift from massed practice to distributed practice

According to the article, ‘Initially, students learn a new skill best when they have many practice opportunities in a short period of time. In later learning, retention is enhanced by practice opportunities conducted over a long period of time.’ During the acquisition stage of learning, it may be helpful to have multiple practice opportunities in order for students to become proficient with a concept. As the sequence progresses, this practice should become increasingly distributed.

How does this apply to English?

1) This post looks at an example lesson plan, exemplifying how concepts are initially taught via massed practice and then move to distributed practice, through regular and systematic retrieval practice.

2) This post looks at the journey of a test item across multiple lessons, again moving from massed to distributed practice.

5) The shift from immediate feedback to delayed feedback

[Image: immediate delayed]

At the beginning of an instructional sequence, feedback should be immediate and precise, preventing errors from becoming embedded. Feedback is often lauded as a universally positive thing, the implication being that the more you give, the better students will perform and learn, but this notion is overly simplistic. The EEF point out that ‘Feedback studies tend to show very high effects on learning. However, it also has a very high range of effects and some studies show that feedback can have negative effects and make things worse.’ Referencing Soderstrom and Bjork’s ‘Learning versus performance’ paper (accessible here), David Didau points out that ‘there is empirical evidence that “delaying, reducing, and summarizing feedback can be better for long-term learning than providing immediate, trial-by-trial feedback.”’ This last point seems to corroborate Engelmann’s idea that the optimum type of feedback will change according to the stage of instruction.

How does this apply to English?

1) Initial practice activities for all concepts (analytical skills, vocabulary, context knowledge, sentence styles, punctuation etc.) should take the form of restrictive and isolated drills, allowing precise and immediate feedback to be given. A teacher can either circulate, giving verbal feedback, or write a model and, by displaying it under the visualiser, provide students with an answer against which to check their own efforts.

2) Regular cumulative recap quizzes at the start of lessons provide the perfect opportunity for immediate feedback regarding spelling and conceptual understanding. Referring to the ‘hypercorrection effect’, the idea that ‘The more confident someone is that an incorrect answer is correct, the more likely they are not to repeat the error if they are corrected’, Dylan Wiliam explains that ‘The benefits of testing come from the retrieval practice that students get when they take the test, and the hypercorrection effect when they find out answers they thought were correct were in fact incorrect. In other words, the best person to mark a test is the person who just took it.’ Following this advice, we conduct retrieval practice under the visualiser, filling in the answers immediately after the quiz, and asking students to check and correct their own work.

6) The shift from an emphasis on the teacher’s role as a source of information to an emphasis on the learner’s role as a source of information

This shift matches the idea of ‘I-we-you’, where responsibility and agency gradually move from teacher to student. This table from p.72 of Teach Like a Champion illustrates this shift:

[Image: i we you]

How does this apply to English?

Let’s take an example from writing paragraphs:

1) The teacher writes a paragraph under the visualiser, narrating thought process and decisions. (The ‘I’ stage)

Using a semantic field of light, Dickens describes Scrooge’s room with words like ‘bright…gleaming…glistened’, symbolising warmth, comfort and opulence. Perhaps the ghost wants Scrooge to experience a convivial and celebratory scene so that Scrooge will not only realise that being as ‘solitary as an oyster’ is a bad choice, but that he could spend his money and enjoy himself instead.

  • If students have already acquired the analytical skills, you could ask them to identify them (3 part explanation/evidence in explanation/tentative language/multiple interpretations).
  • Quick fire questioning about vocabulary (convivial/opulence/semantic field) will provide valuable retrieval practice.


2) Teacher begins a second paragraph under the visualiser, asking students to help them complete it. (The ‘We’ stage)

Using a hyperbolic metaphor, Dickens describes the food as ‘…..

  • If you use the same structure as the first paragraph, students have a framework to follow (taking advantage of the alternation effect from Cognitive Load Theory).
  • The teacher can prompt students to use taught vocabulary or specific sentence styles.
  • When complete, you could undertake another round of quick fire questioning, again providing valuable retrieval practice.

3) Students now have two models to use as analogies. They should then write their own paragraph using different evidence but following the same process and structure. (The ‘You’ stage)


Next post: Insights from DI part 9-The Sequencing of Skills. What principles should guide how we order instruction?

Insights from DI part 7: Instructional Formats

This is the seventh post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first six can be found here: one, two, three, four, five, six

Like the last post, this one will examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here:

Instructional Formats

Once you have decided upon what you will teach (choosing high utility, generalizable concepts that allow students to ‘further develop their expertise’ in a subject), and have devised sequences of clear communication, the next step is to decide upon the instructional format which ‘specifies the way that teachers will present each example, explanations that they will give, questions that they will ask, and corrections that they will use’. Broadly, teaching should be massively structured, explicit and supported at the beginning in order to ‘ensure a high level of success when strategies are initially introduced’. As student proficiency increases, this support will be faded out ‘so that students learn to apply the skills independently’, preventing them from becoming reliant on teacher assistance. Sequences of learning and practice should move along a continuum from restricted to freer practice, the intention being that student understanding moves from being inflexible to flexible.

Let’s have a look at an example from the article which uses these items:

[Image: CVC word table]

If students were learning to read ‘VCe’ words like the ones in the table above (words that, due to the addition of an ‘e’ at the end, cause a long vowel sound) and were trying to discriminate these from ‘VC’ words like rat and not where the vowel sound is short, then an initial, highly supported instructional sequence may look like this:

[Image: instructional format box]

Detailed and explicit, the sequence breaks what seems initially to be a simplistic concept into five logical, sequential steps. Each step potentially provides accurate information to the teacher not only about whether a student is proficient but also, if they are not, about what specific remedial work or additional teaching may be required in order to address their lack of understanding. If a student fails step 2, perhaps they cannot recognise or do not understand the concept of ‘e’ and may require more practice distinguishing between letters of the alphabet. If students are correct on step 2, but incorrect on step 3, perhaps they require more practice with learning and saying the rule that is explained in step 1.

Later on in a sequence of learning, instructional formats may look like this:

[Image: instructional format box 2, later]

Compared to format 1, these are far less detailed and involve far fewer steps, the assumption being that students are proficient enough to cope with reduced teacher support and instruction.

How does this apply to the everyday classroom?

Following this idea, I wrote here about how items that are taught should move from narrow to wider tasks, probably beginning with massed practice and moving towards spaced practice where students are increasingly expected to apply the knowledge in wider, less structured applications like paragraphs and essays.

Here is an example from teaching students to select and punctuate quotations (the assumption being that this is a new skill that they currently cannot do properly):

Format 1: (heavily supported and broken into sequential steps)

Text extract from Telephone Conversation by Wole Soyinka (a poem from a year 7 unit).

 “ARE YOU DARK? OR VERY LIGHT?” Revelation came.

“You mean–like plain or milk chocolate?”

Her accent was clinical, crushing in its light

Impersonality. Rapidly, wave-length adjusted,

I chose. “West African sepia”–and as afterthought,

“Down in my passport.” Silence for spectroscopic

Flight of fancy, till truthfulness clanged her accent

Hard on the mouthpiece. “WHAT’S THAT?”

Instructional Format:

1) Teacher: Here’s a rule: evidence should start and end with quotation marks.

2) Teacher: What’s the rule? Students: Evidence should start and end with quotation marks.

3) Teacher: In the last line, underline the words that tell us that the landlady doesn’t understand what the poet is saying.

4) Teacher: Say the words. Students: WHAT’S THAT?

5) Teacher: Did she say something or ask something? Students: Ask something.

6) Teacher: How do you know? Students: There’s a question mark.

7) Teacher: This is how you start: The landlady asks QUOTATION MARKS

8) Teacher: What comes after the quotation marks? Students: What’s that?

9) Teacher: What is the rule about quotation marks? Students: Evidence should start and end with quotation marks.

10) Teacher: What comes after the last word? Students: Quotation marks.

11) Teacher: How do you know? Students: Because evidence should start and end with quotation marks.

Although this sequence is imperfect and certainly violates some of Engelmann’s rigorous principles, I have included it to demonstrate the necessity of breaking down communications into logical, sequential steps as well as to highlight the importance of explicit, teacher-directed instruction in the early stage of a sequence of learning. The use of evidence seems like a simple skill to an expert, but it actually requires multiple tiny steps. Each step, in the absence of explicit teaching, guidance and support, is a potential source of confusion and a hindrance for a novice student.

Like the examples from the article, later instructional formats would gradually fade out some if not all of the steps within the earlier sequence, making the process less overt, eventually resulting in the expectation that students apply this important skill in extended writing without any teacher direction or support at all.

As experts we are prone to expert-induced blindness: it is hard for us to remember that the processes, skills and abilities which we have automatized to the point of accurate and effortless fluency contain multiple steps which require explicit teaching if students are to reach a similar level of proficiency. Engelmann talks about the idea of dysteachia, the notion that student failure is almost always the result of ineffective teaching, and I think that many students struggle because we often fail to see just how far we need to break a process or concept down when teaching. I will explore this idea more in the next post.

The goal is that ‘by the completion of the instructional program the students’ performance is independent, widely generalized, and applied to various contexts and situations.’ What students could only initially achieve with massive and detailed support, they should eventually be able to do independently, making adaptations and generalisations and succeeding across a range of different and novel problem scenarios.

Next post: Insights from DI part 8: The 6 ‘shifts’ of task design




Insights from DI part 6: Five Principles for Sequencing and Ordering Examples.

This is the sixth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first five can be found here: one, two, three, four, five

Like the last post, this one will examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here.

Direct Instruction teaches ‘generalizable strategies that students can use to solve a wide range of problems’. Instead of teaching a ‘set of discrete specific cases’, it teaches the ‘general case’ and the teaching ‘clearly communicates one and only one meaning and enables students to exhibit generalized responding.’

The authors of the article explain that ‘In order to teach a general case, it is necessary to show students a set of items that includes examples and non-examples arranged so that similarities and differences are readily apparent. Irrelevant aspects of the teaching must be held constant to minimize confusion, and relevant aspects must be carefully manipulated to demonstrate important differences.’

Although the theory behind the sequencing and ordering of examples is incredibly detailed and varies according to the nature of the concept being taught, there are five overarching, general principles that should be followed in order to ensure that communication is clear.

1) The Wording Principle

The idea of ‘faultless communication’ is a key tenet of DI and I touched on this idea here. The wording principle dictates that we should use the same wording for all items in a sequence and, by precluding variance in the language used, we can minimise potential confusion and unnecessary distraction for students. While tiny or even substantial differences in wording may seem trivial to the teacher, they may have catastrophic effects for novice learners. In DI sequences, teacher scripts detail exactly what is communicated, ensuring that this principle is adhered to. This table demonstrates how it is applied in maths.

wording principle table

Scripted lessons are controversial: many people abhor the idea, perceiving them to be dehumanising, dystopian and robotic. This, however, is a clear straw man. If you accept the idea that large parts of the act of teaching involve communication between a teacher and students, then it is self-evident that the clarity of this communication will either hinder or enable learning. Although I am not advocating creating scripted lessons for all aspects of teaching, there are clear benefits to creating communication sequences that are as unambiguous as possible.

How might this apply to the everyday classroom?

a) It may be worthwhile agreeing upon and standardising definitions for different concepts within a department.

b) When teaching concepts, try writing down what you expect to say to see if you are being consistent with the words that you use. Are you using several synonyms for one concept? Are you explaining an idea using unnecessarily technical language? Is all of the communication necessary? Is your explanation meandering and protracted?

c) Try to ensure that communications are not contradicted by later examples. If you define a verb as ‘an action word’, then this is problematic because many verbs consist of more than one word (I am eating toast / The boys have been watching the news). If you are interested in the idea of creating communication sequences that avoid complication, ambiguity and contradiction, The Rubric for Identifying Authentic Direct Instruction Programs provides examples of flawed sequences of communication and suggestions for how they could be improved.

2) The Setup Principle

According to this principle, ‘Examples and non-examples selected for initial teaching of a concept should share the greatest possible number of irrelevant features’. This means that examples and non-examples should vary in only one way with all other aspects and features held constant. By doing this, you create a situation where interpretations and inferences are controlled, ensuring that ‘only one interpretation is possible’. Figure 2.2 demonstrates this idea:

setup principle

The items in the left column differ in only one way, meaning that a student could only logically infer one meaning of ‘on’. On the right, there are numerous variables between the example and the non-example. A novice learner could logically infer that ‘on’ means any of these things:

a) ‘on’ means rectangular

b) ‘on’ means things with corners

c) ‘on’ means horizontal

d) ‘on’ means light grey.

Because a student could make any of these inferences, the setup principle has been violated and the presentation would be considered unnecessarily ambiguous.

In Theory of Instruction, Engelmann talks about ‘stipulation’. This is the idea that ‘the presentation implies that all features of these examples are necessary to the label. The result is that if the learner is presented with variations in any features, the learner will not treat the example in the same way’. After studying multiple examples that follow the left column of Figure 2.2, there may be a danger that a student will infer that ‘on’ only refers to rectangular objects. To prevent this, subsequent sequences would follow where the setup is changed, perhaps by using different shapes or objects or surfaces, demonstrating that ‘on’ is a wide ranging concept.

How might this apply to teaching English?

I have begun experimenting with creating sequences of examples and non-examples in order to teach grammar and specific sentence constructions. Students seem to be responding well to this approach, although I am certain that there are numerous tweaks and improvements that need to be made if the sequences are to fully conform to Engelmann’s rigorous theory.

Here are some examples from teaching present participles:

set up principle2

The first column contains only one variable between the example and non-example, meaning that a learner can only make one logical inference about the meaning of ‘present participle.’ In the second column, there are numerous variables between the example and non-example. Although there may be many more inferences, a learner could logically infer that ‘present participle’ means:

a) The inclusion of the word ‘Enfield’.

b) A sentence that is in the present tense.

c) A sentence that is in the active voice.

3) The Difference Principle

Carefully choosing non-examples is a crucial factor in helping students understand the ‘limits or boundaries of a concept’. To understand what something is, it is helpful to comprehend what it is not. Figure 2.3 demonstrates this idea:

difference principle

The column on the left provides far more accurate and precise information as to the ‘point at which an example is no longer horizontal’ because both example and non-example are highly similar. The difference in orientation is only several degrees. On the right, the examples do not provide clear information as to the delineation between horizontal and not-horizontal, the difference of orientation spanning 90 degrees. For the difference principle to be most effective, examples and non-examples should be juxtaposed consecutively, making ‘the similarities and differences most obvious.’

How might this apply to teaching English?

Here are some examples from teaching participial phrases:

difference principle 2

In the left hand column, the non-example is a gerund phrase (the subject of the verb ‘was’). Gerund phrases are often confused with participle phrases and the juxtaposition of these two examples demonstrates why that is: they are incredibly similar. The non-example here is helpful as it gives precise information as to the delineation between ‘present participle’ and ‘not present participle’. In the right hand column, the non-example is massively different, making it harder for a student to ascertain the boundaries of the concept being taught.

Like with the setup principle, to avoid stipulation (the idea where a student thinks that the examples in a sequence encapsulate the full range of the concept and that other, different examples will therefore fall outside of it), subsequent sequences would follow where the examples are changed, perhaps by using different sentence constructions or subject matter, demonstrating that ‘present participle’ is a wide ranging concept.

4) The Sameness Principle

In order to demonstrate the range and scope of a concept, we should juxtapose maximally different examples. If we were trying to teach the concept of ‘dog’, then we should choose examples that represent the widest possible variety of dogs. Let’s look at an example:

sameness principle

Although we could debate whether there are more strikingly different examples of dogs that we could use, these three have been chosen because, despite all being dogs, they are massively different. If we had merely shown different breeds of terrier, then a student may infer that any future examples that are not terriers would fall outside of the concept of ‘dog’: again, this is what Engelmann refers to as ‘stipulation’.

How does this apply to teaching English?

Here are some examples from teaching participial phrases:

sameness principle 2

The right hand column only demonstrates a tiny range of possible examples of the concept. If we had merely shown these examples, then a student may infer that any differing future examples fall outside of the concept of ‘present participle’. They may logically infer that present participle sentences:

a) Always contain two words.

b) Always begin with a word that ends in ‘ing.’

c) Always precede the subject of a sentence.

d) Always begin a sentence.

e) Always contain the word ‘gossip’.

The left hand column demonstrates a far wider range of examples. I have deliberately inserted an example that includes a quotation as this is how students will most frequently apply these constructions. The second example begins with an adverb, preventing students from inferring the misrule that all present participle phrases begin with an ‘ing’ word. The third example has the phrase at the end of a sentence, demonstrating that these constructions are not always used at the beginning. A full sequence would contain many more maximally different examples, further broadening the scope of the concept.

5) The Testing Principle

After demonstrating several examples and non-examples, ‘to test for acquisition, we should juxtapose new, untaught examples and non-examples in random order’.  Figure 2.5 demonstrates this idea:

testing principle

In order to ensure that we receive accurate information about a student’s understanding of a concept, we need to create tests that do not follow a predictable pattern.
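If you keep your test items in a list, the random ordering described above can be sketched in a few lines of Python. This is only an illustration, not anything taken from the article: the function name `build_test_sequence` and the sample sentences are my own inventions, and a real test would use untaught items chosen according to the other four principles.

```python
import random

def build_test_sequence(examples, non_examples, seed=None):
    """Mix examples and non-examples into one shuffled list.

    Each item is returned as (text, is_example) so the answer key
    travels with the item. `seed` is optional and only there so that
    the same order can be reproduced later if needed.
    """
    rng = random.Random(seed)
    items = [(text, True) for text in examples]
    items += [(text, False) for text in non_examples]
    rng.shuffle(items)  # shuffles in place, removing any fixed pattern
    return items

# Hypothetical items, invented for illustration only.
examples = [
    "Running down the hill, the boy laughed.",
    "The dog, barking loudly, chased the car.",
]
non_examples = [
    "Running down the hill was fun.",
    "The dog barked loudly at the car.",
]

test_items = build_test_sequence(examples, non_examples)
```

Shuffling the whole pool, rather than alternating example and non-example, means that students cannot use the position of an item to guess the answer.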

All of these five principles are presented here as separate guidelines. Creating sequences of communication through examples and non-examples often requires multiple sets of juxtaposed examples and non-examples, which helps to avoid misrules and faulty inferences, ensuring that students not only learn the precise point when something stops being a concept (like the horizontal line and the line that is ever so slightly slanted), but also that they learn that a concept can contain innumerable varieties and differences yet still have the same label (like a Chihuahua and an Irish wolfhound: massively different, but still dogs.)

Next post: Insights from DI part 7: Instructional Formats

Insights from DI Part 5: Teaching Generalisable, High-utility Content

This is the fifth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first four can be found here: one, two, three, four.

While the last four posts have primarily taken ideas from Successful and Confident Students with Direct Instruction, a recent book where Engelmann explains an overview of his approach, this post will examine The Components of Direct Instruction by Cathy L. Watkins and Timothy A. Slocum, an article from The Journal Of Direct Instruction and an extract from Introduction to Direct Instruction. The paper can be found here.

1) ‘The goal of Direct Instruction is to teach generalised skills: thus the first step in developing a Direct Instruction program is analysis of the content and identification of concepts, rules, strategies, and ‘big ideas’ (i.e. those concepts that provide strategies that students can use to further develop their expertise in a subject matter)…to enable them to exhibit generalised performance to the widest possible range of examples and situations.’ p.76

Lesson time is finite and one of the most important decisions when designing curricula is to consider the utility and importance of what is being taught. According to DI theory, we should design curricula by ‘identifying central organizing ideas and generalizable strategies that enable students to learn more in less time’. Although a wide curriculum that exposes students to myriad ideas, concepts and knowledge may seem optimal, if students merely experience the content at the expense of any real attempt to master or retain it, is it worthwhile? If a concept takes a long time to teach, yet its utility is limited to one specific unit or section of a curriculum, is it worth teaching and could the curriculum time be used for something that is more ‘generalizable’? If students are regularly taught concepts that they will never be expected to use in later lessons or applications, is this a good use of lesson time?

While all bits of knowledge are useful, some bits of knowledge are more useful than others.

Here are some things that I would consider to be high utility ‘concepts, rules, strategies, and ‘big ideas’’ in English. These ideas, skills and concepts can be applied across units, years, and key stages, as well as helping students to ‘further develop their expertise in a subject.’

Specific sentence constructions

Explicitly teaching specific sentence styles and grammatical constructions to students can help them to broaden their range of expression, moving them from functional and simplistic written communication to sophisticated, nuanced and complex writing. Teaching phrases (participles, appositives and absolutes) is one high utility strategy because they can be used with all of ‘The Big Three’ genres of writing that we focus on (analysis, rhetoric and description). If you teach the component parts of a sentence, then they can be combined, manipulated and generalised by students into an immense number of combinations. To draw an analogy, Spelling Through Morphographs, one of the DI spelling programmes, teaches 750 morphographs that can be combined into 12,000 to 15,000 different words. This is far more efficient than teaching spelling through lists of individual words. This table illustrates the efficiency of teaching morphographs to students:

morphographs table



When teaching vocabulary we use vocabulary tables and deliberately list all relevant forms of the words, allowing students to explore morphology and affixes. As a result of this, students are beginning to make generalisations.

While morphographs are the building blocks of words, phrases are some of the building blocks of sentences and can be combined in lots of different ways when writing. Here are a few examples of how these sub components can be combined:

phrases combine table

Choosing sentence styles that are high-utility is important and if students are to master them, they will need extended, distributed and varied practice, ideally spread across texts, units and years. Students begin to learn these structures in year 7, deconstructing worked examples that exemplify how they are applied within analytical paragraphs as well as practicing creating the structures themselves.

Here is a possible overview of a sequence for teaching present participle phrases, moving along a continuum from inflexible to flexible knowledge and gradually fading out teacher support. This sequence would span many lessons and perhaps weeks of school time:

  1. Students identify the specific structures within examples of isolated sentences. Following Engelmann’s theory, students should be presented with examples that demonstrate the full scope of the concept. Crucially, they should also see examples that are minimally different and, by treating them differently, be made aware of the limits of the concept. These non-examples will often elucidate common misconceptions. Engelmann’s theory behind sequencing and ordering examples in order to induce student understanding is fascinating and I am slowly working on creating sequences to teach specific sentence styles.
  2. Students finish half-completed sentences or combine sentences. See this post for an overview of sentence combining.
  3.  Students create sentences in response to a specific task:


Write three present participle sentences about Macbeth’s ‘Is this a dagger’ soliloquy.

Write three present participle sentences that describe the picture.

4. You could then ask them to attempt smaller pieces of writing, perhaps just a paragraph, where they can apply the structure in a freer context, perhaps combining it with other concepts and skills.


London: How is the omnipresence of suffering presented in the poem?

  • Participial phrase
  • 3 quotations
  • ‘denounce’ ‘indignant’ ‘marginalised’


5. After students have become proficient at the specific structures in isolated, scaffolded contexts, they should then be expected to apply these component skills in wider writing. This can be achieved by regularly drawing attention to the specific structures within worked examples as well as including the structures within the success criteria for a piece of writing.

Vocabulary to be used in analysis

Instead of only teaching the vocabulary that you encounter within a text, teach the vocabulary required when responding to a text. Although it will be useful to teach some of the words within a text as they will be integral to comprehension and analysis, other words may be so recondite, anachronistic or genre specific that their utility is limited. Focussing on Tier 2 words (vocabulary that spans contexts and domains) is one way of promoting generalisations. See this post for more information.

A generalised analytical framework:

We have developed an analytical framework that can be used across texts and different tasks. Tentatively called ‘The 6 Skills’ (I am still unsure if these 6 are sufficiently distinct or whether they comprehensively encapsulate analysis!), our intention is to promote a generalizable strategy when responding to texts. Unlike vocabulary, contextual information, interpretation, authorial intention and explaining the effect of techniques (all potential examples of declarative knowledge in English), it is an attempt to formalise the procedural knowledge required when writing analytically. We are heavily indebted to the great work of many other teachers here!

PEE/PEEL and other similar frameworks were problematic and restrictive, resulting in clunky, predictable and overly formulaic paragraphs. Invariably, iterations of these abbreviations and acronyms also have fixed orders, placing evidence in the middle, one of the problematic inferences being that each train of thought contains only one quotation. Interesting analytical writing does not follow a predictable, sequential order. Finally, PEE also precludes embedded quotations, due to the unnatural scaffolding of sentence stems like ‘My evidence for this is….’.

Here is an annotated worked example that exemplifies our generalizable framework:

6 skills

As well as exemplifying the 6 skills, the screen shot also demonstrates the utility of teaching phrases and how they can be used to create dense and linguistically sophisticated analysis.

Unlike PEE/PEEL, there is no fixed, consecutive order for how these skills are deployed and applied, meaning that student responses are not as formulaic and rigid. Although students will study worked examples that contain most if not all of these skills from the beginning of year 7, they will practice the skills individually and cumulatively, slowly building up to being able to use all 6. See this post for an outline of how to create focussed and cumulative practice activities.

A future post will look at this framework in more detail, explaining how it can be broken into constituent parts and deliberately practiced. I am aware that there are multiple potential flaws with this framework, not least regarding the amorphous nature of some of the skills, particularly those regarding the use of evidence.

Next Post: Insights from DI part 6– Five principles for sequencing and ordering examples

Insights from Direct Instruction: part 4

This is the fourth post looking at how ideas from Engelmann’s DI can be applied to the everyday classroom. The first three can be found here: one; two; three

DI book cover

Elements of well-designed programs for teaching to mastery (p.35)

1) They teach everything students will need for later applications

In DI programmes, the meticulous track system of planning means many activities happen each lesson, the intention being that these strands of knowledge weave together to help students succeed at a more complex skill like extended writing. Engelmann says ‘conceive of the program as being like a stairway that transports students to increasingly complex performance.’ p.13. For example, Expressive Writing lessons have activities that focus on comma splicing, run-on sentences, punctuating speech, writing complex sentences and avoiding pronoun ambiguity. The ‘later application’ in this scheme is writing a mini-story and each of the aforementioned areas of knowledge is crucial if students are to succeed. The idea is that if students master the parts, they will master the whole.

Writing sophisticated analytical responses to texts is complex: paragraphs and essays are made up of multiple strands of knowledge, all playing a vital role and each conspicuous in their absence. As a literate adult, it is easy to forget that these ‘applications’ are made up of so many individual yet interconnected parts. Our expertise not only blinds us to the fact that these parts exist, but also to the fact that we most likely achieved our expertise through extended, deliberate and purposive practice. While I used to think that endless extended writing practice was the best way for students to improve their analytical writing, I now think the opposite and favour a deliberate practice model, a framework that Daisy Christodoulou explains in her book Making Good Progress. Rosenshine makes a similar point, advising teachers to ‘present new material in small steps with student practice after each step, only present small amounts of new material at any time, and then assist students as they practice this material.’ His point about ‘student practice after each step’ is so important. We often underestimate just how much practice a student needs before even the tiniest concept is mastered. If a complex skill is made up of multiple tiny parts, then if one or many have not been mastered, the cumulative effect is a magnification of these separate deficiencies, resulting in catastrophic misunderstanding for the student and frustration for both student and teacher.

If you have ever been involved with sports training, you will recognise the notion that drills, repetitive practice of isolated techniques and mini-game scenarios also revolve around this idea of mastering parts before whole. Not only do they allow teachers to give more precise and effective feedback (we have all ‘marked’ extended writing that has so many errors that feedback seems futile), but they also motivate students: instead of writing loads of rubbish essays that contribute to feelings of inadequacy and fuel the notion that successful writing is somehow mystical and out of reach, students experience regular success in manageable practice activities. If these activities are sequenced well, they will build slowly towards the complex skill of extended analytical writing. Two overriding principles apply to the sequencing of these practice activities: firstly, they should move along a continuum from inflexible to flexible knowledge; secondly, teacher support should be gradually faded out.

I will show you a worked example from a year 10 Jekyll and Hyde unit, answering the question ‘How is Utterson presented at the start of the novella?’

Before you see the example, here is a list of some of the constituent parts that require explicit teaching and practice before a student can attempt something similar. Each one of these should have been taught, retrieved and practiced in focussed activities until students are competent and able to use them successfully in ‘applications’.

1) Vocabulary words: reputation, decorum, obsessed, exemplifies, secrecy, frivolity, impeccable, propriety, social conformity, disdaining, conveys, paranoia, repressing, repression, social inhibitions.

All of these words are initially presented via vocabulary tables and will be practiced using because, but, so or other focussed drills. Although some of these words will have been taught in previous units, having been chosen due to their high-utility nature, others will have been taught for the first time in this unit. Not only do I want students to use these words in the lesson that sees them writing an answer to the question above, I also want them to remember and apply these words across units and years: for this to happen, regular retrieval practice is important.

2) Context information: Fear of Blackmail and scandal, Repressive Victorian Social Mores, Upper Class

We generally deliver context information through non-fiction articles that complement the main text that we are studying, following Lemov’s advice in Reading Reconsidered. We then use retrieval practice to ensure that this information is retained as well as challenging students to apply the information through very specific success criteria.

3) Quotations: ‘was never lighted by a smile’, ‘austere’, ‘cold’, ‘embarrassed in discourse’, ‘drank gin…to mortify a taste for vintages’, ‘when the wine was to his taste, something eminently human beaconed from his eye’, ‘never found its way into his talk’

Crucially, I ask students to explore, manipulate and apply the quotations that I want them to memorise and use in the exam. Although we look at extracts and other quotations, the core exam list forms the main focus when writing paragraphs or analysing. Memorising and mastering 40 quotations from the novella is infinitely preferable to being exposed to hundreds, the latter approach being condemned by Engelmann as he says ‘the expectations for student performance are low because the teachers understand that students will not actually master the material. They will simply be exposed.’

While the first three categories of knowledge that I covered are about the content of the analysis-some of which is text specific-the next three are about the form and structure of analytical writing. This knowledge has even higher utility as students can use, adapt and apply these concepts across units, texts, years and key stages.

4) Grammatical constructions: noun appositive phrases, past participle phrases, present participle phrases

Once students have mastered the concept of a sentence (subject+verb), and can parse sentences fairly well, naming nouns, verbs and other word classes, we teach and practice these more advanced constructions, building on KS2 and attempting to follow Engelmann’s comment that ‘For mastery teaching to be possible, programs must be thoroughly coordinated from level to level’. p.34 One of his criticisms of traditional programs is that ‘different levels of traditional programs present the same topics and the same examples.’ p.34. We don’t want to merely repeat the grammar that they are taught in KS2; we want to take them a step further.

5) Embedded evidence

This is a threshold concept, without which students are effectively locked out of sophisticated analytical writing, and so should be explicitly taught and practised from the beginning of year 7. One of the unintended limitations of PEE and other formulaic analytical frameworks is that students do not have to learn how to manipulate or embed quotations, instead following the predictable and clunky framework: My evidence for this is ‘……….’ In our curriculum, students see models of embedded quotations from the beginning of year 7 and examples of embedded quotations are threaded throughout vocabulary tables and worked examples. Although many students intuit this skill, perhaps as a result of the high frequency of examples, many do not and require explicit teaching until they can master it. Teaching Expressive Writing has made me more methodical in the actual teaching of this skill: it is hard for weaker students!

6) 4 out of the 6 Analytical Skills: 3 part explanation, multiple interpretations, tentative language, evidence in explanation

Although I will explain this analytical framework in a future post, each of these constituent concepts should be practiced in isolation, again building up to applying them in paragraphs and essays.

Here is the model that exemplifies these constituent parts:

The archetypal Victorian gentleman, Utterson cares deeply about his reputation and attempts to act with the utmost decorum at all times. His face ‘was never lighted by a smile’, a description that exemplifies his serious nature and desire to be respected by his peers. Obsessed with secrecy and the opinions of others, Utterson is ‘austere’ and ‘cold’, avoiding all forms of frivolity in order to project an image of impeccable propriety. Like other upper class gentlemen, he was expected to display social conformity at all times because Victorian social mores were incredibly restrictive. He was ‘embarrassed in discourse’, disdaining gossip, conversation and small talk. Perhaps he does this in order to remain secretive and private; it is as if his reputation is the most important aspect of his life and he doesn’t want people to judge him. The Upper Class, concerned as they were with their reputations, greatly feared blackmail and scandal and Utterson’s behaviour conveys this fear and lack of trust. He ‘drank gin…to mortify a taste for vintages’, repressing his desire for pleasure. However, this repression and mask of seriousness occasionally slips as ‘when the wine was to his taste, something eminently human beaconed from his eye’, meaning that he sometimes loses his social inhibitions. Interestingly though, this lack of inhibitions ‘never found its way into his talk’, suggesting he is guarded, paranoid and incredibly secretive.

A typical English lesson will involve multiple activities that attempt to practise many of the constituent parts listed above.

Here is an example plan:

1) Cumulative retrieval practice of vocabulary, quotations to be memorised (KS4) and context information.

a) Some questions will be closed, rote style factual recall, perhaps with clues:

What adjective means flawless or perfect:  imp_cc_ble   ANSWER: impeccable

b) Some will be more open, the assumption being that fewer clues are needed for successful recall:

What noun means good behaviour?  ANSWER: propriety

c) Some will involve students making links, hopefully developing schemas and connections in their minds:

Name three terms that refer to standards of behaviour ANSWER: decorum, social conformity, propriety

d) Some will involve even wider links:

Complete the quotation: ‘embarrassed in……………….’ Write down another two quotations that describe Utterson’s secrecy. What context information is relevant here? Write down three adjectives to describe Utterson’s behaviour.

ANSWER: Quotations: ‘embarrassed in discourse’, ‘cold’, ‘never found its way into his talk’. Context: Fear of Blackmail and scandal, Repressive Victorian Social Mores. Adjectives: secretive, paranoid, obsessed.

This final answer is essentially the bare bones of an analytical paragraph, and asking these freer retrieval questions helps students to build mental plans of what to write. You could then ask them to use it to write a paragraph at speed.

2) Sentence Practice, using because, but, so or practising phrases and other relevant structures that you want students to use in their applications. These sentences will frequently require embedded evidence as success criteria, providing practice of this vital skill.

3) Line annotation, applying vocabulary that has been taught and using rapid sequences of questions in order to maximise student participation and recall.

4) Worked example annotation. The model answer contains all of the constituent elements mentioned above, exemplifying their usage within ‘applications’.

5) Focussed paragraph writing, allowing students to apply the constituent concepts.

Next post: Insights from Direct Instruction part 5


Insights from Direct Instruction: Part 3

This is the third post looking at the relevance and application of the five key philosophical principles that underpin Engelmann’s Direct Instruction. You can find the first one here and the second one here.

3) All teachers can succeed if provided with adequate training and materials p.2

When I started teaching, I was shockingly bad and I’m convinced that part of this ineptitude was down to a combination of two things. Firstly, I was largely left to my own devices and encouraged from the beginning to create schemes of learning and medium term planning in the absence of any evidence based principles, knowledge about cognitive science or actual experience. Secondly, I was broadly led down the path of group work, generic skills, discovery learning, being a ‘learning facilitator’ and creating engaging (fun) lessons. Although these methods may work in some contexts, in my experience (especially compared with an explicit, deliberate practice approach) they were pretty useless. I now fully subscribe to the idea that novices, and this includes both students and novice teachers, learn more effectively and efficiently through fully guided instruction. Summarising the seminal Kirschner and Sweller paper that has been so influential, this AFT article looks at this idea in more detail.

I have mentored a number of NQTs and, following the DI principle, I always give new teachers planning, lessons and resources to use, training them incrementally with regards to the theory behind them and their practical application. To draw an analogy, asking a new teacher to plan a scheme of work from the beginning is likely to be as ineffective as asking a student to write an essay in order to improve their writing: it will result in misconceptions, confusion and a lack of success. My own experience as a novice teacher saw me being encouraged to experiment with a whole spectrum of disconnected ideas and approaches and I was given little guidance as to their relative efficacy. While you could make the point that taking risks, attempting to innovate and developing an individual approach are laudable aims, they pose significant problems. School time is finite. Students get one shot at their education and, if this is made less effective due to teachers being encouraged to find their own methodology and innovative approaches-especially when there is extensive and robust research that demonstrates the efficacy of already existing techniques-then this is both unprofessional and harmful. Although comparing teaching with more established professions has its flaws and limitations, you would be appalled if an architect or a doctor approached a procedure in a way that ignored research, justifying their radicalism in the name of personal autonomy or creativity.

The idea that ‘By providing the most effective and efficient way to present materials, teachers are free to provide the support students need’ is also of vital importance. I am not claiming that our curriculum is flawless or complete-we have a long way to go; however, the fact that it is centralised and long term, planned by experienced teachers and based on sound research means that teachers are ‘free to provide the support students need’. I want teachers to have the time and energy to deal with the innumerable and spontaneous problems that arise each and every day. I want them to be able to give corrections that are efficient and immediate. I want new teachers to learn how to teach effectively through fully guided instruction, not through a drawn out and potentially flawed process of discovery.

When you remove the additional burden of planning lessons every day (or at least the requirement to continually create things from scratch) then these aims become realistic. Workload is a problem for teachers: a centralised approach helps to prevent unnecessary and unsustainable working habits. Instead of changing units every year, we evaluate, adapt and refine the ones that we already have.

4) Low performers and disadvantaged learners must be taught at a faster rate if they are to catch up to higher-performing peers.

This particular philosophical principle is a tricky one. How do you accelerate the progress of those that are behind? When I began teaching DI sequences, I taught Corrective Reading and Expressive Writing to bottom sets instead of their usual English curriculum. This decision was a double-edged sword: over time and as expected, their basic skills deficit, particularly with regards to writing, was reduced; however, removing them from the mainstream curriculum meant they were not accessing as much new vocabulary (see this AFT article, which explains that once decoding is secure, reading proficiency is almost entirely based upon vocabulary and domain knowledge). In future, we are looking at offering DI interventions in addition to normal English lessons, meaning students get the best of both worlds.

Students taught with DI ‘learn more in a short amount of time’ due to the meticulous design of the programmes and the extensive field testing that goes into their development. Although this level of complexity and rigour is out of our reach (I have been told that they can take up to ten years to develop), our choices as to what we teach can go some way towards helping low-attainers catch up. Our aim should be the development of well sequenced, methodical curricula filled with high-utility concepts, moving students along a continuum from inflexible to flexible knowledge whilst also promoting generalisation. Although an insistence on teaching for retention is important for all students, it is crucial for low performers and disadvantaged students as they will have less information stored within long term memory. Creating centralised resources that define content down to a word-level is the first step to addressing their knowledge deficit. In the absence of such detailed resources, it is very difficult to engage in effective and efficient retrieval practice.

5) All details of instruction must be controlled to minimise student misinterpretations and to maximise learning p.3

Engelmann states that ‘years of research on how children learn show that even minor changes in teachers’ wording can confuse students and slow their learning’. The idea of ‘faultless communication’ is a central part of DI. Sequences of examples and teacher wording should only communicate one logical interpretation. Naveen Rizvi has written here about how this idea applies to teaching algebra. I am currently reading Theory of Instruction, Engelmann’s comprehensive and detailed explanation of his theory, and have begun developing example and test sequences that follow this principle. Future posts will explore this.

To explain the idea of ‘faultless communication’ and to give you an idea of what it must be like to be a novice learner, here is an example from Clear Teaching by Shepard Barbash (a short introduction to Engelmann’s approach and a great place to start if you want to know more!):

‘Try this experiment. Make up a nonsense word for a familiar concept and try teaching the concept to someone without using its regular name. Engelmann holds up a pencil and says, “This is glerm.” Then he holds up a pen and says, “This is glerm.” Then he holds up a crayon—also glerm. So what is glerm? A student responds: “Something you write with.” Logical, but wrong, Engelmann says. Glerm means up. The student learned a misrule—Engelmann’s examples were deliberately ambiguous, exemplifying both the concepts for up and for writing implements, and the student came to the wrong conclusion. This is one of the exercises Engelmann uses to teach instructional design. His point is to make us aware of the minefield teachers must navigate to avoid generating confusion in their students. Next he wanders around the room giving examples of the concept graeb, without success. At last he opens the door, walks out and shouts: “This is not graeb.” Graeb means in the room. To show what something is, sometimes you have to show what it’s not. He points to a cup on his desk and says, “That’s glick.” Then he holds up a spoon and says, “Not glick.” He points to a book on a student’s desk—glick—then raises a pen—not glick. What’s glick? No one is sure. Finally he puts the spoon on his desk—that’s glick—lifts it—not glick—puts the pen on the student’s desk—glick—and lifts it—not glick. Everyone gets it: glick means on. (p.19)

For a novice learner, normal communication and instruction is riddled with ambiguity. As shown in this example, students are unaware as to which points are important and instructions may contain multiple terms that, despite being clear to the teacher, are vague, ill-defined or meaningless to the student.

One of the main controversies surrounding DI sequences is the fact that lessons are scripted and that all teachers are expected to teach the same thing in the same way. Critics believe that this removes teacher autonomy, replacing it with a mechanical and sclerotic approach, dehumanising students and deskilling practitioners. However, it is precisely because we are not robots that scripts and standardisation are helpful: teaching is incredibly complex, containing numerous variables and requiring hundreds of split-second decisions and sequences of communication every lesson, each one fraught with the potential for errors regarding interpretation. I disagree with the oft stated axiom that there is no best way to teach. Like all scientific endeavours-and if we agree with Engelmann we should definitely approach teaching as a science-some ways are probabilistically more effective than others.

In case you are wondering, we are not developing scripted lessons, although I see the benefit of creating communication sequences that are as unambiguous as possible, particularly when dealing with what Engelmann calls Basic form concepts (in English this could refer to specific sentence structures and vocabulary). No-one, not even staunch DI enthusiasts, is suggesting that scripted lessons are appropriate to teach extended essay responses, literary analysis or other such subjective, deeply complex skills.

Next post: Insights from Direct Instruction part 4


Insights from Direct Instruction: Part 2

In my last post, I outlined how Engelmann’s Direct Instruction has helped inform how we sequence, plan and resource our English curriculum. This post will look at the first two of five key philosophical principles that drive and underpin Engelmann’s programmes and how these ideas are relevant to everyday planning and teaching. The five principles have been taken from Successful and Confident Students with Direct Instruction.

1) All children can be taught p.2

With DI, the assumption is that ‘if children haven’t learned, the instruction is to blame-not the student’. This is a refreshing and interesting position, putting full responsibility for success upon instructional design, clarity of communication and teaching quality. Although we may not like to admit it, I’m sure we have all asked ourselves ‘Why don’t they get it?’ when presented with students who struggle to understand, our incredulity and frustration causing us to settle for the easiest explanation which, as a result of our exasperation, may end up being the students themselves. Blaming students, the alternative to taking full responsibility for the success of instruction, can often lead to the soft bigotry of low expectations where disadvantage, need, class set or other such labels and categories can be blamed for underperformance. This often leads to other related issues such as the dumbing down of content or problematic differentiation, ideas that this blog discusses in more detail.

This chain of thought is not being dismissive of student needs in any way whatsoever-I recognise students have myriad needs, differences and challenges that make learning and general school life difficult. In fact, by raising the level of accountability with regards to the programme of instruction and teaching, starting from the high expectation that ‘all children can be taught’, you could argue that teachers are paying greater respect to student needs as the first principle of this line of thinking is a solution not an excuse. If student success is entirely predicated upon the quality of instruction, then it forces teachers to be more rigorous, thoughtful, reflective and methodical: this cannot be a bad thing!

This post by Katie Ashford looks at how labels can damage students, the implication being that the label itself can sometimes be used to justify low expectations, reinforcing the idea that the student is at fault. Similarly, in a post about how labels can be used to make excuses for students who cannot read, Dianne Murphy writes ‘Various deficits within Richard are now being offered as the explanation for why he is struggling. No one questions the teaching’.

In this interview with Engelmann he talks in more detail about ‘dysteachia’, the idea that ‘confusing, illogical, or inconsistent’ teaching is the cause of poor achievement, not the student. Although the interview is worth reading in full, here are a few quotations from the transcript with some commentary:

  • ‘You need to look at their mistakes for qualitative information about what you need to change in your instruction to teach it right.’

Engelmann’s point about looking at mistakes and output in order to make inferences and judgements about the quality of the input is crucial. DI programmes are extensively field tested before they are published in order to check that they are effective. Although we cannot create an actual DI programme, we can and should analyse student responses carefully, both successes and mistakes, in order to inform our instructional decisions and adaptations. During a sequence of teaching, this could be through the use of whole class feedback, where common errors are compiled and addressed through extra teaching or practice in subsequent lessons. At a curriculum level, this could be through the analysis of summative assessments in order to draw inferences as to which elements of a course have been misunderstood. For example, we teach specific sentence constructions such as appositives, the intention being that students will be able to use them across types of writing. We use NoMoreMarking to summatively grade all of our student assessments and when judging an entire year group’s essays, we compile lists of misconceptions (see this post for an analogous approach). One such misconception in our last set of literature essays was the prevalence of redundant and non-analytical appositives such as:

a) Sheila, Birling’s daughter, seems to change as the play progresses.

b) Priestley uses dramatic irony, a technique where the audience knows something that the character does not, in order to accentuate Birling’s ignorance and pomposity.

Following Engelmann’s approach, this is the fault of the instruction, not the student. We should be looking back through our curriculum in order to see where best to include tasks that address this misconception: this will involve creating sequences of examples and non-examples in order to induce student mastery of the difference between analytical and superfluous constructions, an approach for a future blog post!

Both whole class feedback and the analysis of summative tests are reactive approaches, responding to student outcomes; however, the latter approach, if done thoroughly, is preventative: the adaptation of instructional sequences ‘based strictly on feedback’ should progressively refine the curriculum so that the errors picked up through whole class feedback become less frequent and less complex. In an ideal world, reactive feedback-especially complex, multifaceted corrective work-would be largely unnecessary as student success rates would be consistently high. In DI programmes, students should be at least 70% correct on anything that is completely new and 90% correct on items that have been introduced earlier in the programme. These statistics should make us pause for thought. If what we ask students to do consistently results in lesser percentages, is this adequately remedied by feedback? If students initially fail to reach a similar success rate, does our feedback reliably ensure that they will close the gap? Or, is our obsession with perfecting feedback blinding us to the imperfections within our instructional sequences?
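As a rough illustration of how the 70% and 90% figures above could be checked against a set of class responses, here is a minimal Python sketch. The function name, the labels "new" and "review" and the sample numbers are all my own hypothetical choices, not anything taken from a DI programme.

```python
# Thresholds quoted above: 70% on new material, 90% on earlier material.
THRESHOLDS = {"new": 0.70, "review": 0.90}


def meets_threshold(correct, attempted, kind):
    """Return True if the success rate reaches the threshold for `kind`."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return correct / attempted >= THRESHOLDS[kind]


# Hypothetical class results: (item type, correct responses, total responses)
results = [("new", 22, 30), ("review", 25, 30)]
for kind, correct, attempted in results:
    flag = "ok" if meets_threshold(correct, attempted, kind) else "revisit instruction"
    print(kind, f"{correct / attempted:.0%}", flag)
```

A check like this only tells you where a success rate falls short; deciding what to change in the instruction still depends on looking closely at the mistakes themselves.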

Although the idea of a yearly departmental review is fairly commonplace (see this post as an example), Engelmann’s ideas should make us look at student responses in far more detail. Feedback to the teacher about the effectiveness of a programme of study is vitally important. As he bluntly puts it in the interview: ‘If they make mistakes, they’re telling you, fundamentally, that you goofed up and they’re also implying exactly what they need to know.’

2) All children can improve academically and develop a stronger self-image p.2

Low attaining students often have low self-esteem and poor levels of motivation, but in which direction does causation run? Does their lack of motivation cause low attainment or does their low attainment cause poor motivation? Nick Rose has written a lot about the psychology behind motivation and it seems to be a complex and contentious area. In this post, he links to the 2014 report from The Sutton Trust entitled What Makes Great Teaching, a review that notes ‘In fact the poor motivation of low attainers is a logical response to repeated failure. Start getting them to succeed and their motivation and confidence should increase.’

Think back to when you were at school: I bet there is a strong positive correlation between the subjects that you enjoyed and those that you were successful at.

In Project Follow Through, the 30 year longitudinal study that compared pedagogical approaches in the US, Direct Instruction students placed first with regards to attainment and self-esteem. Perhaps unsurprisingly, being successful is motivational and contributes to a ‘stronger self-image’.

David Didau looks at motivation in this post, pointing out that Daniel Pink, the author of ‘Drive: The Surprising Truth about What Motivates Us’, sees motivation as being driven by mastery, autonomy and purpose. The ideas of purpose and mastery are linked to how students perceive the value and worth of what they are learning. A curriculum that ensures that ‘skills and knowledge do not go away. Once introduced, they are used throughout the rest of the program’ helps to reinforce the idea that what is being learnt has inherent value. In our English curriculum, students initially encounter items and concepts via decontextualized, restrictive practice exercises, moving towards freer application tasks, hopefully resulting in fluency and mastery. When combined with cumulative quizzing across units and years and the benefit of choosing high utility content, this process allows students to see the value of what they are learning.

In the next post I will explore the last three philosophical principles:

3) All teachers can succeed if provided with adequate training and materials

4) Low performers and disadvantaged learners must be taught at a faster rate if they are to catch up to higher-performing peers.

5) All details of instruction must be controlled to minimise student misinterpretations and to maximise learning


Insights from Direct Instruction part 1

I have been teaching DI schemes for a couple of years now, having first been made aware of Engelmann’s work via Joe Kirby’s blog post, which gives an excellent overview of the theory, approach and research base that supports it. The evidence base behind DI is wide ranging and robust: multiple large scale studies have been conducted over the past fifty years, demonstrating DI’s effectiveness. This recent article gives some links, and Kris Boulton has helpfully collated a number of resources and texts that explain the theory and evidence behind DI; you can find these here.

I teach Expressive Writing 2 and Corrective Reading B1 and B2; I have also set up a decoding intervention using Corrective Reading Decoding. Soon, I will be buying Spelling Through Morphographs in order to set up a spelling intervention.

Put simply, teaching these schemes and reading more about the theory behind DI has made me a more thoughtful and diligent teacher. While in the past, my approach to curriculum planning used to be the sum total of the number of lessons in a term, I now tackle it in a far more methodical and systematic way. This will be the first in a series of posts looking at what we can learn from Engelmann’s theory and approach.

While actual, published DI schemes take years to write, develop, test and refine, containing a daunting level of complexity, precision and detail, I will summarise some key ideas from Successful and Confident Students with Direct Instruction, a book that explains some general DI principles that can be applied to everyday teaching.

1) ‘A program design that supports mastery does not present great amounts of new information and skill training in each lesson. Rather, work is distributed so new parts in a lesson account for only 10-15 percent of the total lesson’ p.12

Only 15% of a lesson is new learning! 15%! 85% is spent practising, reviewing and recapping previous learning. Novices require extended practice and repeated exposure to new material if they are to truly master it. I would suggest that this percentage distribution is at best an exact inversion of how teaching is approached in many classrooms. Thinking of learning in terms of discrete, separate lessons combined with a fixation on variety, novelty and pace at all costs means that we may be making it hard for our students to master things. Quickly moving through content with no thought as to the necessity of recap and extended practice will almost always result in a lack of mastery and proficiency from students. Although I do not replicate this percentage distribution-my planning not being effective and efficient enough and the content being too wide to fit into the limited time that we have-since teaching DI schemes, I have massively increased the amount of practice and recap that I do.
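To make the arithmetic concrete, here is a minimal Python sketch of how the 10-15 percent guideline might translate into minutes. The 60 minute lesson length and the function name `split_lesson` are my own assumptions for illustration; the book only gives the percentages.

```python
def split_lesson(total_minutes, new_share):
    """Return (new_minutes, review_minutes) for a lesson.

    new_share is the fraction of the lesson given to new material,
    e.g. 0.10 to 0.15 under the guideline quoted above.
    """
    if not 0 <= new_share <= 1:
        raise ValueError("new_share must be between 0 and 1")
    new_minutes = total_minutes * new_share
    return new_minutes, total_minutes - new_minutes


# Hypothetical 60 minute lesson at both ends of the 10-15 percent range.
for share in (0.10, 0.15):
    new, review = split_lesson(60, share)
    print(f"{share:.0%} new: {new:.0f} min new, {review:.0f} min practice and review")
```

Seen this way, even the upper end of the range leaves well under ten minutes of a one hour lesson for genuinely new material.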

2) ‘nothing is taught in one lesson. Instead, new concepts and skills are presented in two or three consecutive lessons to provide students with enough exposure to new material that they are able to use it in applications.’ p.12

If you present new information to students once, you can make a fairly safe bet that they will forget it. This is why retrieval practice is so important as it prevents forgetting. If we are going to ask students to complete ‘applications’ (in English this may be writing paragraphs or essays), then we need to ensure that they retain and are able to use the relevant pieces of knowledge (plot details, vocabulary, sentence constructions, analytical skills etc.) before doing so. A typical DI lesson will contain new material, material from the last few lessons that is ‘being firmed’ and material from earlier in the sequence which, because students have demonstrated sufficient proficiency and competency in restricted drills and practice exercises, is now being used in wider problem solving and freer applications. I wrote about how extended retrieval practice across different lessons can help students move from restricted recall to wider and freer application here. The post explores how to move students from inflexible to flexible knowledge, describing an approach that is heavily indebted to DI planning and sequences.

3) ‘The systematic stairway design does not provide relief because skills and knowledge do not go away. Once introduced, they are used throughout the rest of the program, either as elements that are used regularly (such as a word type that is learned), as details that are embedded in the problems and applications…or as items that are frequently reviewed’ p.14-15

Making choices as to the utility of what you teach is important: as mentioned in this post, choosing and teaching vocabulary that can be used across units and years means that the knowledge ‘will not go away’. Although it may be hard to plan for the usage and practice of a concept or skill across an entire unit or even curriculum, we should be trying as much as possible to do so. As mentioned here, we try to choose words to teach that have high utility-words that can be used across texts, units and school years.

4) ‘Most programs do not teach to mastery…Students will work on a particular unit for a few days and then it will be replaced by another unit that is not closely related to the first and that does not require application of the same skills and knowledge. This design, referred to as a “spiral curriculum”, is more comfortable for the program designers, teachers and students; however, it is inferior for teaching skills and knowledge’ p.16

Since teaching DI sequences, I am now hyperaware of the flaws and inadequacies of conventional medium and long term planning. The idea of a ‘spiral curriculum’ is common across subjects: in English it might mean ‘doing’ poetry once a year for five years in the hope that if they don’t get it the first time, there will be further opportunities to do so. While the spiral approach to curriculum planning is commonplace across subjects, Engelmann points out a number of flaws. Firstly, it creates a low expectation for performance as students are merely ‘exposed’ to content without any real expectation for mastery. Secondly, students quickly come to realise that the information and concepts that are being taught are temporary and will soon be replaced with another unit that ‘does not require application of skills and knowledge from the previous unit’. It is almost as if the spiral curriculum, by its very design and approach, reinforces apathy and a lack of application: if you knew that you would not need to use or be asked to use a concept again, then it might be entirely rational to give it less than the desired level of effort and thought. Disposable content fosters indifference.

The alternative to a spiral curriculum, and the approach favoured by Engelmann’s DI schemes, is called the strand curriculum. This paper provides a useful comparison between spiral and strand curricula, focussing on Maths.

In Expressive Writing, a DI corrective writing scheme that I teach which is aimed at helping students who write with poor grammar, bad punctuation and little coherence in their compositions, items are taught across multiple lessons and combined with other items to form a ‘strand curriculum’. In total, there are 55 carefully and methodically sequenced lessons. Exemplifying the premise that knowledge ‘does not go away’, the course teaches students to identify the parts of a sentence from lessons 1-10, then again intermittently from lessons 16-27. Capitals and full stops are taught from lessons 1-3, 11-14, and then intermittently from 16-55. All lessons end with a freer application (writing a story in this case) where students are required to apply ‘the same skills and knowledge’ from the drills and restricted practice activities earlier in the lesson and course.
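One way to picture how these strands overlap is to map each strand to the lesson ranges mentioned above and ask which strands appear in a given lesson. The following Python sketch does this for the two strands described; treating ‘intermittently’ as ‘every lesson in the range’ is a simplifying assumption of mine, and the names `STRANDS` and `strands_in_lesson` are hypothetical.

```python
# Lesson ranges taken from the Expressive Writing description above.
# "Intermittently" is simplified to "every lesson in the range", which
# is an assumption made purely for illustration.
STRANDS = {
    "parts of a sentence": [(1, 10), (16, 27)],
    "capitals and full stops": [(1, 3), (11, 14), (16, 55)],
}


def strands_in_lesson(lesson, strands=STRANDS):
    """Return the names of the strands practised in a given lesson."""
    return [
        name
        for name, ranges in strands.items()
        if any(start <= lesson <= end for start, end in ranges)
    ]


for lesson in (2, 12, 20, 40):
    print(lesson, strands_in_lesson(lesson))
```

Laying a unit out like this makes it easier to spot lessons where an item has quietly dropped out of practice, which is exactly what a strand design is meant to prevent.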

Our English curriculum is deliberately narrow in terms of the types of texts that we ask students to write, focussing on ‘The Big Three’: analysis (responding to texts-a broad text type!); rhetoric (argument and persuasion) and descriptive/creative. The majority of student writing is focussed on analysis as the broad genre of responding to texts has the highest utility, applicable to all responses in literature and the reading responses in language. The finer details between different exam questions can be taught as exam technique in year 11: their core is broadly the same, sharing more similarities than differences. The three broad text types cover all literature and language questions, as well as hopefully providing a useful springboard for A-level and beyond. By narrowing our focus, we are able to go deeper and ensure that students hopefully master three, rather than being exposed to many and mastering none. Students know that subsequent units will ‘require application of the same skills and knowledge’. These ‘skills and knowledge’ are things like speech structure and rhetorical techniques for rhetoric; sentence structures for descriptive writing and our analytical approach (tentatively dubbed The 6 skills) for all forms of analysis and responding to texts, not to mention high-utility vocabulary that is applicable across units and texts. Our units deliberately attempt to intertwine all these aspects, containing elements from all three text types, deliberate sentence practice (see here for an overview), vocabulary and more. Although it is a million miles away from the level of rigour and complexity contained within DI schemes, it is an attempt to move beyond the flawed model of disposable, forgettable and separated units of unconnected lessons.

When choosing what items to teach, plan to teach them across multiple lessons, moving from a start point of restricted recall to an end point of freer application, as described in this post. We should also stop thinking of the ideal lesson as having one focus and one objective: instead, a lesson could have multiple foci: recalling, practising, ‘firming’ and applying items. As many others have pointed out, learning does not come in neat 60 minute chunks and our obsession with the lesson objective may not only be distorting our approach to curriculum planning, it may be actively lessening the efficacy of our teaching.

Next post: Insights from Direct Instruction part 2