I believe that applying cognitive load theory to teaching has the potential to alleviate some of the problems at the heart of programming, especially at primary school level, and can help pupils develop long-term agency.
What is cognitive load?
Totally new information is first processed by the brain’s limited short-term or working memory, which is typically able to hold two to seven elements at a time, depending on the complexity of the data and surrounding distractions. This information is then transferred into the brain’s unlimited long-term memory. If the brain’s working memory is overloaded with new information, it hampers, reduces, or stops the transfer of information into long-term memory, which is necessary to retain learning. Cognitive load theory is based on the idea that, to maximise learning, we need to ensure that we do not exceed the cognitive load that our pupils can deal with.
Managing intrinsic load
Intrinsic load refers to the complexity of the new information that needs to be processed. This is strongly dependent on a student’s prior knowledge, and while what we teach is often out of our control as classroom teachers, there are approaches we can use to keep intrinsic load manageable.
One problem comes when we introduce too many new concepts at the same time. To create even the most elementary programs, you often need many concepts, and it can be tempting to use ones that pupils haven’t learnt about yet. We like what the program does, and we want to share it with our pupils, but are we creating more cognitive load that will lead to less long-term assimilation of ideas?
We can also be tempted to introduce a concept using complex examples. The golden rule is: if the idea is totally new, strip away all complexity and present it in its most simple form, with the least information possible, and increase complexity gradually.
As an example, imagine we want to introduce the idea of conditional selection.
if you are hungry
    clap once
The above example presents a simple condition that is checked only once and, if true, triggers an action. It is written in everyday readable language, with the only nod to computing being the layout and the indentation. It draws on pupils’ understanding of language, which they study on a daily basis.
Now consider:

This also uses conditional selection, but uses the popular quiz format. There is extra complexity in understanding how the answer variable works with the input ‘ask’ block and in making sense of the more complex condition.
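To make that extra load concrete, here is a rough text-based sketch of a quiz-style selection in Python. The question, the expected answer, and the names check_answer and run_quiz are illustrative assumptions rather than anything taken from the example above; input() stands in for a block-based 'ask' block.

```python
def check_answer(answer):
    """Return feedback for a quiz answer. The condition is checked once."""
    feedback = ""
    # The comparison ignores surrounding spaces and letter case,
    # which is part of what makes this condition more complex.
    if answer.strip().lower() == "paris":
        feedback = "Correct!"
    return feedback


def run_quiz(ask=input):
    """Ask one question, store the reply in a variable, then check it."""
    # 'ask' defaults to input(); the reply is held in the 'answer' variable.
    answer = ask("What is the capital of France? ")
    return check_answer(answer)
```

Compared with the hungry/clap example, a pupil now has to follow three things at once: the question being asked, the variable that holds the reply, and the comparison inside the condition.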
Finally, we could also use a typical example of conditional selection, where the condition is checked repeatedly inside an infinite loop:

This repeated checking adds more complexity and thus more cognitive load. All of these examples are useful and will build understanding, but there is less cognitive load in the first, so I’d recommend it as the simplest starting point. It also has the added bonus of developing an algorithmic concept away from a specific programming language, thus increasing the chance of pupils transferring the knowledge to their long-term memories.
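As a minimal sketch of repeated checking, assuming a hypothetical program that moves a character whenever the space key is held down, the Python below tests the same condition on every pass of a loop. The name run_loop, the step size, and the list of key readings are invented for illustration, and a finite list stands in for a 'forever' loop so that the sketch terminates.

```python
def run_loop(key_states, step=10):
    """Check the same condition on every pass and return the final position."""
    position = 0
    # A finite list of readings stands in for a 'forever' loop so this ends.
    for space_pressed in key_states:
        # The condition is tested again on each pass, not just once.
        if space_pressed:
            position += step
    return position
```

Calling run_loop([False, True, True, False]) moves the character twice. The pupil has to hold the loop and the selection in mind together, which is where the added load comes from.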
Worked examples and completion problems
Extraneous cognitive load is affected by how the information is presented, and what the learner has to do as part of the learning process. A key strategy for reducing extraneous cognitive load is to provide pupils with worked examples, which allows them to see how a specific type of problem can be solved. In computing, this might involve showing pupils all the stages of programming, including a clear idea of what we want to create, an algorithm that includes initialisation, stages of coding, and the debugging process.
Designing effective worked examples is more challenging than we might think. Worked examples do not always elicit careful consideration in learners, and they can be less effective for pupils who are more advanced in the topic we are teaching.
One potential solution to these shortcomings is to make use of completion problems. These can provide a bridge between worked examples and conventional problems: pupils are given a partial solution that needs to be completed. This reduces scaffolding and ensures learners are engaged with the information being presented. One strategy I have used is to provide pupils with an idea, code, and a list of bugs solved, and ask pupils to create the algorithm. Alternatively, I have provided the algorithm, code, and bugs, and asked pupils to explain the idea.
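In a text-based language, a completion problem might look something like the sketch below. The dice task and both function names are hypothetical: the first function is the partial solution handed to pupils, with the selection left out, and the second is one possible completed version.

```python
def describe_roll_starter(roll):
    """Partial solution handed to pupils: the selection is left for them."""
    message = "Try again"
    # TODO (pupil): if the roll is a 6, change the message to "You win!"
    return message


def describe_roll_completed(roll):
    """One possible completed version of the same function."""
    message = "Try again"
    # The single piece pupils were asked to supply.
    if roll == 6:
        message = "You win!"
    return message
```

The starter already runs, so pupils can test it, see that a 6 is not yet treated differently, and concentrate on the one missing idea.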
Split attention effect
Another strategy for reducing extraneous cognitive load is to prevent the split attention effect. This occurs when two sources of information are presented alongside each other and neither is sufficient on its own: both are needed for understanding. The learner has an increased cognitive load from trying to integrate both sources of information into a whole.
As an example, imagine we want to explain networks to pupils. We might provide a basic diagram illustrating the pathway that data packets take from our home computing device to a waiting web server to retrieve a webpage. Alongside that, we give them a description of each item in the chain and what it does. Neither resource gives the full explanation on its own. To reduce cognitive load, we can integrate the text and the diagram into one resource with well-placed labels.
As with any aspect of learning science, there are criticisms and potential shortcomings of cognitive load theory. But the strategies I’ve shared here are well-grounded in empirical research and are an effective way of helping our students to learn. I’d really recommend trying some of the approaches I’ve outlined and considering cognitive load in your teaching.
Further reading
John Sweller’s recent review of cognitive load theory
Ton de Jong’s discussion of the limitations of the theory
Ofsted’s rationale on using cognitive load theory in its new inspection framework, from Daniel Muijs
Greg Ashman’s blog post on how cognitive load theory has influenced his teaching of maths