Working Memory

Working memory is an aspect of human memory that permits the maintenance and manipulation of temporary information in the service of goal-directed behavior. Its apparently inelastic capacity limits impose constraints on a huge range of activities from language learning to planning, problem-solving, and decision-making. A substantial body of empirical research has revealed reliable benchmark effects that extend to a wide range of different tasks and modalities. These effects support the view that working memory comprises distinct components responsible for attention-like control and for short-term storage. However, the nature of these components, their potential subdivision, and their interrelationships with long-term memory and other aspects of cognition, such as perception and action, remain controversial and are still under investigation. Although working memory has so far resisted theoretical consensus and even a clear-cut definition, research findings demonstrate its critical role in both enabling and limiting human cognition and behavior.

Introduction

The term working memory refers to human memory functions that serve to maintain and manipulate temporary information. The capacity supporting these functions is believed to be limited, and together they play a key role in cognitive processes such as thinking and reasoning, problem-solving, and planning. A common illustration is mental calculation, which typically involves maintaining some initial numerical information while carrying out a series of arithmetical operations on parts of it and maintaining any interim results. However, the range of activities that depend on working memory is very much wider than that example might suggest. Perception and action can also depend critically on maintaining and manipulating temporary information, as for instance when identifying a familiar constellation in the night sky, or when preparing a meal.

Information about a stimulus remains available for a few seconds after it is perceived (short-term memory) but without active maintenance it rapidly becomes inaccessible (Peterson & Peterson, 1959; Posner & Konick, 1966). Conceptually, working memory extends short-term memory by adding the active, attentional processes required to hold information in mind and to manipulate that information in the service of goal-directed behavior.

The short-term storage required for working memory can be distinguished from long-term memory, which is concerned with more permanent information acquired through learning or experience and includes declarative memory (retention of factual information and events) and procedural memory (underpinning skilled behavior; see Cohen & Squire, 1980). Notably, and in contrast to short-term memory, these forms of long-term memory are passive in the sense that, once acquired, memory for facts, events, and well-learned skills can persist over very long periods without moment-to-moment awareness. For example, a vocabulary of many thousands of words, including the relationship between their spoken forms and meanings, can be retained effortlessly over a lifetime. Similarly, once acquired through practice, complex and initially challenging behaviors such as swimming or riding a bicycle can become almost automatic and can be carried out with relatively little conscious control.

In early models of the human memory system (e.g., Atkinson & Shiffrin, 1968; see Logie, 1996) short-term memory was seen as a staging post or gateway to long-term memory, and it was recognized that it could also support more complex operations, such as reasoning, thus acting as a working memory. Subsequent research has attempted to refine the concept of working memory, characterizing its functional role, limits, and substructure, and distinguishing the processes involved in maintenance and manipulation of information from the storage systems with which they interact.

It has proven difficult, however, to disentangle working memory function from other aspects of cognition with which it overlaps. First, as described in more detail in the section “Substructure and Relationship to Other Aspects of Cognition,” many current accounts view the mechanisms of working memory as contributing to other perhaps more fundamental functions such as attention, long-term memory, perception, action, and representation. It is also notable that many informal descriptions of working memory emphasize consciousness and awareness as key features. Intuitively, many working memory functions are accessible to consciousness, and concepts such as mental manipulation, rehearsal, and losing track of information through inattention are subjectively encountered as characteristics of the conscious mind. Of course, by definition, people cannot be subjectively aware of any unconscious contributions to working memory (although they can potentially be inferred from behavior). Some theorists have argued that working memory is central to conscious thought (e.g., Baars, 2005; Carruthers, 2017), while other empirical researchers have sought to demonstrate nonconscious processes operating in what would typically be considered working memory tasks (e.g., Hassin et al., 2009; Soto et al., 2011). It is not clear whether, how, or to what extent consciousness is essential for working memory functions, or whether indeed the definition of working memory ought to include, or avoid, aspects of conscious experience. This article steers away from the topic, but the current status of the debate is captured in reviews such as Persuh et al. (2018). Overall, it is difficult to precisely delineate the boundaries of working memory, whether with other cognitive functions or with consciousness and awareness; in philosophical terms it may not constitute a “natural kind” (Gomez-Lavin, 2021).

These challenges make it difficult to establish a clear-cut and uncontroversial definition of working memory itself, its function, and substructure. Yet it is clear that working memory describes a cluster of related abilities that play a critical role in everyday thinking, placing important constraints on what we can and cannot do. Research on the topic has proved fruitful and although there remain many theoretical controversies about how working memory should be defined and analyzed, these mainly relate to the way in which its operations and substrates can be usefully subdivided, and their interrelationships with other cognitive systems such as those responsible for long-term memory and attention (see Logie et al., 2021 for in-depth discussion).

The following sections begin by identifying relatively uncontroversial characteristics of working memory and its temporal and capacity limits before outlining the main theoretical perspectives on the structure of working memory and its relationship to other forms of cognition. This is followed by a summary of the main experimental tasks and key empirical observations which underpin current understanding. Finally, a brief discussion of the importance of working memory beyond the laboratory is provided.

Limits

Temporal Limits

It is broadly agreed that its temporary or labile character is a defining characteristic of working memory. In contrast with established declarative and procedural memories that can be retained indefinitely, recently presented novel information is typically lost after a few seconds unless actively maintained. This active maintenance of short-term memory in order to complete a task is one of the core functions of working memory. As discussed further below (see “Limiting Mechanisms”), it is less clear how such information is lost over time, or whether forgetting is strictly linked to the passage of time (decay) or merely correlated with it (for example, through an accumulation of interfering information). Nonetheless, the vulnerability of short-term memory to degradation over time constrains the uses to which it can be put. Active maintenance processes include rehearsal (covertly subvocalizing verbal material) and attentional refreshing (selectively attending to an item that has not yet become inactive; see, e.g., Camos et al., 2009). These active processes are themselves limited by the modality and quantity of the stored material, so that, for instance, subvocal rehearsal is disrupted by speaking aloud at the same time (“articulatory suppression”; Murray, 1967), and attentional refreshing can only be directed at a limited number of items in a given period of time (Camos et al., 2018). Such active maintenance processes extend the temporal limits of short-term memory, but because they draw on limited attentional resources, they reduce the availability of those resources for other goals.

Capacity Limits

It is also agreed that the limited capacity of working memory is a defining characteristic; in subjective terms, only a limited number of items can be “held in mind” at once. For example, in the classic digit span test of short-term memory capacity, participants are asked to briefly store, and then recall in order, arbitrary sequences of digits of gradually increasing length. In this type of task, accurate performance is typically only possible for very short sequences of up to three or four items, beyond which errors of ordering become ever more frequent. Memory span is defined as the sequence length at which recall is correct half the time and is found to be between six and seven for digits, and even less for items such as unrelated words (Crannell & Parrish, 1957). Similar capacity constraints are evident in nonverbal tasks requiring the recall of spatial sequences or the locations or visual properties of objects in spatial arrays. For instance, in the Corsi Block task, participants follow an assessor in tapping out a sequence of blocks in a tabletop array or a sequence of highlighted squares on a computer display. In the standard task, nine blocks are used in a fixed configuration and healthy participants can only recall sequences of around six taps even when tested immediately after presentation (Corsi, 1972; Milner, 1971). Such tasks are helpful in identifying the fundamental capacity constraints on short-term memory, but working memory capacity is also constrained by the active processes that maintain and manipulate information. This is typically assessed using complex span tasks which measure how many items can be held in mind while carrying out an attention-demanding concurrent task, leading to far lower estimates than simple spans (Daneman & Carpenter, 1980). Similarly, participants show greatly reduced performance on a backward digit span task where mental manipulation is required to reverse the original sequence at recall. (Interestingly, the Corsi span is the same in both directions; Kessels et al., 2008.) Notably, forward and backward digit span and Corsi Block tasks are all used in the clinical assessment of neuropsychological patients as well as in research studies, highlighting the importance of working memory capacity in characterizing healthy and impaired cognitive function.
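The definition of memory span as the sequence length at which recall is correct half the time can be made concrete with a short calculation. The following is a minimal sketch in Python that linearly interpolates between the two tested lengths on either side of 50% accuracy. The function name `estimate_span`, the use of linear interpolation, and the accuracy values are illustrative assumptions, not a procedure or data taken from the studies cited above.

```python
def estimate_span(accuracy_by_length):
    """Return the interpolated sequence length at which accuracy crosses 0.5.

    accuracy_by_length maps a sequence length to the proportion of sequences
    of that length recalled correctly. Returns None if accuracy never falls
    from at least 0.5 to below 0.5 between two adjacent tested lengths.
    """
    lengths = sorted(accuracy_by_length)
    for shorter, longer in zip(lengths, lengths[1:]):
        p_short = accuracy_by_length[shorter]
        p_long = accuracy_by_length[longer]
        if p_short >= 0.5 > p_long:
            # Fraction of the way from the shorter to the longer length
            # at which the straight line between the two points hits 0.5.
            fraction = (p_short - 0.5) / (p_short - p_long)
            return shorter + fraction * (longer - shorter)
    return None


# Hypothetical accuracy data, invented purely for illustration.
example_accuracy = {4: 1.0, 5: 0.9, 6: 0.7, 7: 0.3, 8: 0.1}
print(estimate_span(example_accuracy))
```

With these invented values the estimate falls roughly midway between six and seven items, which is the range the text reports for digits.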

Just as the temporal limits of short-term memory can be extended by active maintenance processes, its capacity limits can be mitigated through strategic processing. Although it is clear that the number of items that can be stored in working memory is limited, there is some flexibility about what constitutes an item. For example, the sequence “1-0-0” might constitute three digits or might be represented as a single item, “hundred.” The possibility of more efficient forms of coding depends on interactions with long-term memory and can be exploited strategically to extend working memory capacity through “chunking” (Miller, 1956). Thus, for an IT professional, the sequence “CPUBIOSPC” is more easily maintained as the familiar acronyms “CPU,” “BIOS,” and “PC” than as an arbitrary sequence of nine letters.
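The acronym example can be sketched in code to show how prior knowledge reduces the number of items to be held. This is only an illustration in Python: the function `chunk`, its greedy longest-match rule, and the idea of representing long-term knowledge as a set of strings are assumptions made for the sketch, not a model proposed in the literature cited here.

```python
def chunk(sequence, known_chunks):
    """Split a letter sequence into items, preferring familiar chunks.

    At each position the longest familiar chunk is taken if one matches;
    otherwise a single letter is stored as an item of its own.
    """
    longest = max([len(c) for c in known_chunks] + [1])
    items = []
    i = 0
    while i < len(sequence):
        for size in range(longest, 0, -1):
            candidate = sequence[i:i + size]
            # A single letter always counts as an item, so the loop
            # is guaranteed to advance.
            if len(candidate) == size and (size == 1 or candidate in known_chunks):
                items.append(candidate)
                i += size
                break
    return items


letters = "CPUBIOSPC"
without_knowledge = chunk(letters, set())
with_knowledge = chunk(letters, {"CPU", "BIOS", "PC"})
print(len(without_knowledge), without_knowledge)
print(len(with_knowledge), with_knowledge)
```

Without the familiar acronyms every letter is a separate item; with them the same sequence reduces to three chunks.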

While the previous example exploits long-term knowledge, even arbitrary grouping can extend the capacity of working memory. For example, in the immediate serial recall of verbal sequences, performance is improved when items are presented in groups. A spoken sequence of digits like “352-168” (i.e., with a pause between the two groups of digits) is recalled more easily than the ungrouped sequence “352168” (Ryan, 1969). Again, this effect can be deployed strategically, and there is evidence that participants spontaneously group verbal material in memory.

More generally, prior learning and experience can not only expand effective storage capacity but can also contribute to efficient active processing operations. For example, children may initially use a counting-on strategy to perform simple sums such as 2 + 3 = 5, but later typically learn arithmetic number facts that automate such operations, in turn permitting more demanding mental arithmetic to be carried out within working memory (Raghubar et al., 2010). In the extreme, expert calculators may collect extraordinarily large “mental libraries” of number facts (Pesenti et al., 1999). Another powerful strategy for extending working memory capacity is seen in expert abacus operators who in mental calculation are able to use visual imagery to internalize algorithms learned from using the physical device (Stigler, 1984).

Limiting Mechanisms

Despite the clear consensus that limited capacity and duration are defining characteristics of working memory, distinguishing it from other forms of memory and learning, there is less agreement about the mechanisms through which information is limited and forgotten.

In one account, the ultimate capacity limits of the system are determined by its access to a limited number of discrete slots, each of which can be used to hold a chunk of information (Cowan, 2001; Luck & Vogel, 1997). However, an alternative and increasingly influential view is that working memory has access to a continuous resource which can be flexibly deployed to support a greater number of chunks or items on the one hand, or greater fidelity and precision on the other (Bays & Husain, 2008; see Ma et al., 2014 for discussion).
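The contrast between the two accounts can be illustrated with a deliberately simplified caricature of each, written in Python. The functions `slot_model` and `resource_model`, the default of four slots, and the assumption that a resource is divided evenly among items are illustrative choices for this sketch only; they are not the formal models of the cited papers.

```python
def slot_model(set_size, n_slots=4):
    """Probability that a given item is stored when items compete for
    a fixed number of all-or-none slots (assumed here to be four)."""
    return min(1.0, n_slots / set_size)


def resource_model(set_size, total_resource=1.0):
    """Share of a continuous resource received by each item when the
    resource is divided evenly (an assumption of this sketch)."""
    return total_resource / set_size


for set_size in (1, 2, 4, 8):
    print(set_size, slot_model(set_size), resource_model(set_size))
```

In the slot caricature nothing changes until the set size exceeds the number of slots, after which some items are simply not stored. In the resource caricature every item is stored, but each receives a smaller share (and hence, on this view, lower precision) as soon as a second item is added.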

The loss of information from working memory over time can similarly be attributed to different mechanisms, although here they do not amount to mutually exclusive models of the same phenomenon. One potential mechanism is decay, assumed to be a fundamental property of the substrate of short-term memory, through which information is lost due to the passage of time alone. In this view the attentional/executive component of working memory is typically deployed to extend its capacity by strategically (but effortfully) refreshing or rehearsing the content of short-term memory before it decays irretrievably. A further potential mechanism is interference. In this account, memory traces are prone to be confused with, or to gradually corrupt, one another. Several current models incorporate a combination of decay and interference (Baddeley et al., 2021; Barrouillet & Camos, 2021; Cowan et al., 2021; Vandierendonck, 2021), while Oberauer (2021) stands out in rejecting time-based forgetting and maintenance processes, proposing in their place loss due to interference, and requiring a process dedicated to the active removal of outdated information from working memory.

Substructure and Relationship to Other Aspects of Cognition

Because it is linked to such a wide range of cognitive capacities, it can be difficult to clearly distinguish mechanisms of working memory from those of its specialized subcomponents or of general-purpose cognitive mechanisms which contribute to nonmemory functions. There is a broad consensus that working memory involves the interaction of an active process (corresponding to “attention” or “executive control”) with a substrate that can represent the content of memory and thus act as a short-term store. Authors disagree, or are sometimes agnostic, as to the extent to which these components can be usefully subdivided and the degree to which they are uniquely involved in working memory or more generally in cognition. Authors also differ in the emphasis they put on different modalities and tasks. These different emphases may sometimes mask a deeper consensus in which models are complementary rather than incompatible (Miyake & Shah, 1999).

Although the term working memory had already been applied to the use of short-term memory in goal-directed behavior (Atkinson & Shiffrin, 1968), it was the influential work of Baddeley and Hitch (Baddeley, 1986; Baddeley & Hitch, 1974) that introduced the separation of attentional control processes (governed by a “central executive”) and short-term storage systems (thought of as “buffers,” i.e., distinct and specialized systems). They further identified a distinction between verbal and visual buffers which were subject to different forms of disruption and appeared to use distinct codes. In particular, verbal information could be stored in a speech-based system (termed the “phonological loop”), in which similar sounding items were more likely to be confused and which was disrupted by concurrent articulation. This work led to the development of the multicomponent model, which subsequently incorporated a richer characterization of the visuo-spatial store (the “visuospatial sketchpad,” see e.g., Baddeley & Logie, 1999; Logie, 1995) and, later, an additional store, the “episodic buffer,” which holds amodal information and interacts with episodic long-term memory (Baddeley, 2000). The possibility of further substructure within these core components is also recognized (e.g., Logie, 1995 on distinguishing visual and spatial subcomponents; see also Logie et al., 2021 on the possibility of multiple substrates within a multicomponent perspective).

An alternative view, the embedded processes model put forward by Cowan (1999), is that working memory can be seen as the controlled, temporary activation of long-term memory representations, with access to awareness being limited to three to four items or chunks. A key distinction with the multicomponent model hangs on whether working memory relies on a distinct substrate (as implied by the term “buffer”), or whether the substrate is shared with long-term memory. Oberauer (2002) similarly identifies working memory with activated representations in long-term memory. In this account, the activated region forms a concentric structure within which a subset of individual chunks inside a “region of direct access” compete to be selected as the focus of attention.

Other more recent theoretical accounts have also emphasized the role of attentional control in determining the limits of working memory. For example, Engle (2002) regarded capacity constraints as reflecting the limited ability to control domain-general executive attention in situations where there is the potential for interference among conflicting responses. The time-based resource sharing account (Barrouillet & Camos, 2004) highlights the need to balance the active refreshing of short-term memory with concurrent processing demands. In this view, constraints arise from the necessary trade-off between maintenance and manipulation, both of which rely on common attentional resources.

Many theoretical approaches to working memory do not follow Baddeley and Hitch in identifying modality-specific substrates for the temporary storage of information and assume instead a unitary system in which many different types of feature can be represented (e.g., Cowan et al., 2021; Oberauer, 2021). In such accounts, modality-specific phenomena are attributed to differences in the extent to which such features overlap within and between modalities. On the other hand, some authors acknowledge the possibility that there may be many alternative substrates, and that even within a modality further subdivisions may be possible. So, for example substrates supporting memory of verbal/linguistic content might further distinguish auditory-verbal, lexical, and semantic levels of representation (Barnard, 1985; Martin, 1993).

Neuroscientific investigations have tended to support the consensus idea of a broad separation between executive and attentional control processes on the one hand, and (often modality-specific) stores on the other, but if anything have highlighted even more extensive overlap of the neural substrate of working memory with other cognitive functions including sensory–perceptual and action–motor representation, and greater granularity and fractionation of function within both storage and control systems. This led Postle (2006) to argue that working memory should be seen as an emergent property of the mind and brain rather than a specialized system in its own right:

Working memory functions arise through the coordinated recruitment, via attention, of brain systems that have evolved to accomplish sensory-, representation-, and action-related functions.

Even in this view it is clear that the mechanisms of working memory (however they overlap with other cognitive functions) involve the interaction of distinct components (at minimum “attention” is distinguished from sensory/representation and action-related function, and these latter functions may also be further subdivided).

Empirical Investigation and Key Findings

A variety of tasks have been developed to investigate working memory in the laboratory. These tasks, of course, always require participants to briefly retain some novel information, often the identity of a set of items which might be visual (for example, colored shapes) or verbal (digits, words, letters). However, they vary quite considerably in the extent to which they require memory for the structure of the set (such as the order of a verbal sequence or the spatial layout of an array of items), the degree to which they place an ongoing or concurrent load on memory and attention, and the precision with which sensory and perceptual properties of the individual items must be represented. An excellent overview of these techniques and associated benchmark findings can be found in Oberauer et al. (2018).

Tasks

In an item recognition task, participants determine whether a specific item was in a set (a sequentially presented list or simultaneously displayed array) that they previously studied (McElree & Dosher, 1989). In probed recall, they are provided with a cue that uniquely specifies a given item from a previously presented set, which they are then required to recall (Fuchs, 1969). In free recall tasks, typically employed with verbal stimuli, participants are presented with an ordered list, but are allowed to recall the items in any order (Postman & Phillips, 1965), whereas in serial recall (Jahnke, 1963) they are required to retain the original order of presentation.

The preceding tasks place increasing demands on short-term memory for the structure as well as the content of the presented stimuli, but place relatively little requirement for attention or the manipulation of memory content. To address these aspects of working memory, a range of additional tasks have been developed. In complex span tasks the to-be-remembered items are interleaved with a processing task, placing a greater concurrent load on the attentional system (Daneman & Carpenter, 1980). In the N-back task, items are presented rapidly and continuously, with the participant being required to decide whether each new item repeats one encountered exactly n items earlier in the sequence; to do this they must not only maintain the order of the previous n items, but also manage the capacity-limited short-term memory resource as every new item arrives. These demands become increasingly taxing as the value of n increases, again giving an indication of the effects of load on performance or, since it is particularly amenable to neuroimaging, brain activity (see Owen et al., 2005 for review). As mentioned, the manipulation requirements of serial recall can be increased by reversing the order in which items are to be recalled. More involved forms of mental manipulation are explicitly tested in memory updating paradigms (Morris & Jones, 1990), within which, after being presented with an array or description, participants are instructed to carry out a sequence of operations before retrieving the result.
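The decision rule of the N-back task described above can be written down in a few lines. The following Python sketch marks which items in a stream are targets, that is, repeats of the item presented exactly n positions earlier. The function name `n_back_targets` and the letter stream are illustrative assumptions; real experiments add timing, response collection, and controlled target rates that are not modeled here.

```python
def n_back_targets(sequence, n):
    """Return one boolean per item: True if the item matches the item
    presented exactly n positions earlier, False otherwise.

    The first n items can never be targets because nothing was
    presented n positions before them.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [i >= n and sequence[i] == sequence[i - n]
            for i in range(len(sequence))]


stream = "ABABCAC"  # hypothetical letter stream
print(n_back_targets(stream, 2))
print(n_back_targets(stream, 3))
```

The same stream yields different targets for different values of n, which is why the participant must keep track of the order of the most recent n items rather than simply their identity.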

To assess its fidelity over brief intervals, tasks that require memory for detailed properties of the items are useful. In change detection tasks (e.g., Luck & Vogel, 1997), participants are required to respond to alterations in the stimulus (typically a visually presented array) between presentation and testing. These alterations can be made arbitrarily small, thus testing the precision of the underlying memory representation. Going beyond recognition-like responses to change, in continuous reproduction or delayed estimation tasks, participants are asked to recall continuous features of the stimuli such as the precise color or orientation of a shape within a previously-studied array (e.g., Bays & Husain, 2008). These tasks allow researchers to go beyond the question of whether information is merely retained or lost; they can be used to characterize and quantify the quality of the underlying representation, which in turn can shed light on the potential trade-off between capacity and precision in working memory.

The preceding tasks provide a very useful set of tools for investigating working memory in the laboratory. To investigate the structure and operation of the system, experiments typically manipulate characteristics of the items to be stored, and often employ concurrent tasks devised to selectively disrupt putative components or processes. In their standard forms, the individual items are treated as equally valuable or important, but it is also possible to cue specific items, locations, or serial positions in order to encourage participants to prioritize specific content (e.g., Hitch et al., 2020; Myers et al., 2017). Improved recall for such prioritized items can then reveal the operation of strategic processes. Overall, such manipulations show a range of replicable effects, not just on overall performance and response times, but also on patterns of error. In turn these benchmark effects have provided the impetus for current theories and provide important constraints for emerging computational models of working memory (Oberauer et al., 2018).

Set Size and Retention Interval Effects

The most important effects relate to capacity and temporal limits that have already been discussed, and these apply across all applicable experimental paradigms and modalities. Specifically, in terms of capacity limits, task accuracy is impaired as the number of items (set size) is increased (response times also generally increase with set size), and in terms of temporal limits, accuracy declines monotonically with the duration of a delay between presentation and testing. The latter effect is reliably seen for both verbal and spatial materials when the retention interval is filled with a distracting task. It does not apply to unfilled delays in tasks with verbal materials, and only sometimes occurs with spatial materials. The difference between filled and unfilled delays forms part of the evidence in favor of the core working memory concept of active executive/attentional processes in sustaining otherwise fleeting short-term memories.

Primacy and Recency Effects

Another signature of working memory is that items are retrieved with greater accuracy if they are presented at the beginning (primacy) or end (recency) of a sequence relative to other items. The operation of primacy and recency effects is seen in immediate serial recall and other tasks where the presentation order is well-defined, and for both verbal and visuo-spatial content. This leads to a serial position curve (in which accuracy is plotted for each serial position in a list) with a characteristic bowed shape. The effect suggests that a shared or general serial ordering mechanism privileges access to these serial positions in an ordered list and/or impairs access to other serial positions. It is important to note that primacy and recency effects are also observed in the immediate free recall of lists of words when the capacity of working memory is greatly exceeded and where they may have a very different explanation (see e.g., Baddeley & Hitch, 1993).
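A serial position curve of the kind described above is simply the proportion of trials on which each list position was recalled correctly. The Python sketch below computes it from pairs of presented and recalled sequences. The function name `serial_position_curve`, the strict position-by-position scoring, and the four trials are assumptions made for illustration; the trials are invented and are not data from any study.

```python
def serial_position_curve(trials):
    """Return the proportion of trials correct at each serial position.

    trials is a non-empty list of (presented, recalled) pairs, where all
    presented sequences have the same length. A position is scored correct
    only if the recalled item at that position matches the presented one.
    """
    list_length = len(trials[0][0])
    correct = [0] * list_length
    for presented, recalled in trials:
        for pos in range(list_length):
            if pos < len(recalled) and recalled[pos] == presented[pos]:
                correct[pos] += 1
    return [count / len(trials) for count in correct]


# Hypothetical trials, each containing one local transposition.
example_trials = [
    ("DFEOPQ", "DFOEPQ"),
    ("KRTMXL", "KTRMXL"),
    ("BHJNSW", "BHJSNW"),
    ("GVZCYA", "GVCZYA"),
]
print(serial_position_curve(example_trials))
```

With these invented trials accuracy is highest at the first and last positions and lowest in the middle, giving the bowed shape the text describes.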

Errors and Effects of Similarity

Working memory errors frequently involve confusion between items in the memory set. This is evident in a wide range of tasks (including variants of recognition, change-detection, and continuous reproduction tasks), but is perhaps clearest in immediate serial recall, where the most common forms of error involve the misordering of items. These errors most frequently involve local transpositions in which an item moves to a nearby list position, often exchanging with the item in that position. For example, a sequence like “D, F, E, O, P, Q” might be recalled as “D, F, O, E, P, Q.” Items are most likely to transpose to immediately adjacent list positions, with the probability of a transposition decreasing monotonically as the distance within the sequence increases. Note that there are fewer opportunities for local transpositions at the beginning and end of a sequence so the locality constraint on transpositions likely plays at least some role in primacy and recency effects.
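Transposition errors can be scored by comparing where each item was presented with where it was recalled. The following Python sketch does this for the example sequence given above. The function name `transpositions` and the signed-displacement convention are illustrative assumptions, and the sketch assumes that no item is repeated within a list.

```python
def transpositions(presented, recalled):
    """Map each misplaced item to its displacement, defined as recalled
    position minus presented position.

    Items recalled in their original position are omitted, as are recalled
    items that were never presented. Assumes no repeated items in a list.
    """
    original_position = {item: i for i, item in enumerate(presented)}
    return {
        item: j - original_position[item]
        for j, item in enumerate(recalled)
        if item in original_position and j != original_position[item]
    }


moves = transpositions("DFEOPQ", "DFOEPQ")
print(moves)
# Absolute displacements give the transposition distances.
print(sorted(abs(d) for d in moves.values()))
```

Here the two items "E" and "O" have exchanged places, each moving by a single position, which is the most common kind of local transposition according to the text. Tallying the absolute displacements over many trials would give the transposition distance distribution.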

In a verbal working memory task, when items from the memory set are confused with one another, they are most likely to be confused with phonologically similar items making performance for lists of similar sounding items poorer than for phonologically distinct items. In serial recall, this effect manifests itself as an increased tendency for phonologically similar items to transpose with one another, so that in the preceding example, items “D,” “E” and “P” (because they rhyme) would be more likely to transpose with one another than items “F,” “O,” and “Q.” Although these similarity effects are largely reported in verbal paradigms, analogous findings are sometimes observed with visual materials (for example, a sequence of similar colored shapes is harder to reconstruct than a sequence of distinctively colored shapes; Jalbert et al., 2008).

The analysis of errors and confusion has been critical in understanding the nature of representation in verbal working memory (for example, demonstrating the importance of speech-based rather than semantic codes), in developing the concept of the phonological loop, and in developing computational models which account for these findings in terms of underpinning serial ordering mechanisms.

Individual Differences and Links With Other Facets of Cognition

Speaking to questions about the relationship between working memory and other aspects of cognition, another set of benchmark findings is concerned with correlations between performance on working memory tasks and other measures. In particular, working memory is correlated with measures of attention and fluid intelligence (the capacity to solve novel problems independent of prior learning; see e.g., Engle, 2002) suggesting that all three constructs involve common resources. There is consensus that aspects of attention contribute to working memory, but attention is also relevant to tasks that make minimal demands on memory. At the same time, working memory plays an important role in problem solving in the absence of relevant prior learning, but it can also be applied to tasks that do not involve complex problems. This suggests a hierarchical relationship in which limited cognitive resources (i.e., attention) are applied to maintain and manipulate information in memory (attention + short-term memory = working memory) in the context of demanding problems (working memory + problem solving = fluid intelligence).

This somewhat simplistic sketch of the relationship between constructs omits the contribution of long-term memory and learning to working memory. That contribution is evident in several empirical phenomena. For example, the beneficial effect of chunking on recall often depends on familiarity with the chunks, as in the examples given previously. It is easily overlooked that the familiarity of the materials themselves is also important. For example, familiar words are recalled much better than nonwords (Hulme et al., 1991) suggesting that words act as specialized phonological/semantic “chunks.” Similarly, grammatical sentences are recalled better than arbitrarily ordered lists or jumbled sentences (Brener, 1940). The word–nonword and sentence superiority effects show that well-learned constraints on serial order (whether through syntax or phonotactics) can benefit recall. A related phenomenon, the Hebb repetition effect (Hebb, 1961), can be seen in the laboratory: immediate serial recall for a specific random list gradually improves over successive trials when it becomes more familiar through being repeatedly but covertly presented interleaved among other lists.

The Importance of Working Memory

The laboratory tasks and benchmark findings outlined in the section “Empirical Investigation and Key Findings” have established the key characteristics of working memory, but its practical significance extends well beyond these phenomena into everyday cognition and learning. Notably, the limits of working memory constrain what we can think about on a moment-to-moment basis and hence how quickly we can learn and what we can ultimately understand. An appreciation of the impact of working memory and its limitations is thus vitally important in the context of education (see e.g., Alloway & Gathercole, 2006 for a review). For example, individual differences in the capacity of phonological storage in verbal working memory are reciprocally linked to vocabulary acquisition in early childhood; children’s ability to repeat nonwords (i.e., unfamiliar phonological sequences) at age four predicts their vocabulary a year later. In turn, the emergence of vocabulary (i.e., phonological chunks) is associated with later improvements in nonword repetition (Gathercole et al., 1992). It is not hard to imagine that this process amplifies the initial effect of variation in capacity, affecting literacy and then more advanced learning (potentially well beyond language abilities) that depends on reading. Working memory can similarly exert an influence on the emergence of numeracy and through it more advanced skills in arithmetic and mathematics. For example, kindergartners’ performance on a backward digit span task predicts their scores on a mathematics test a year later (Gersten et al., 2005). In addition to these effects on the acquisition of foundational skills such as literacy and numeracy, working memory is important in maintaining and manipulating the information needed to carry out complex tasks in the classroom. Thus, students with lower working memory capacity can have difficulty retaining and following instructions (Gathercole et al., 2008), again potentially hampering their ability to build more advanced skills and knowledge. Because of its critical involvement in classroom learning, working memory plays a central role in “Cognitive Load Theory” (Sweller, 2011), an influential educational framework that aims to incorporate principles derived from the architecture of human cognition into teaching methods.

Many measures of short-term memory and working memory show marked year-on-year improvement in childhood, with developmental change likely reflecting the maturation of several components that underpin performance (Gathercole, 1999; Gathercole et al., 2004). These include changes in processes such as verbal recoding, subvocal rehearsal, the activation of temporary information and executive attentional control (Camos & Barrouillet, 2011; Cowan et al., 2002; Hitch & Halliday, 1983). As might be expected given the centrality of working memory in the acquisition of language and numeracy, developmental disorders are commonly associated with reduced short-term or working memory capacity. Prominent examples include dyslexia (Berninger et al., 2008), developmental language disorder (Archibald & Gathercole, 2006; Montgomery et al., 2010), and dyscalculia (Fias et al., 2013; McLean & Hitch, 1999). However, the nature of any causal role for working memory in developmental disorders has been controversial (see e.g., Masoura, 2006).

In adulthood, working memory capacity continues to limit the bandwidth that is available for cognitive operations, for example affecting planning and decision-making (Gilhooly, 2005; Hinson et al., 2003). As we grow older, working memory capacity tends to decline, and there are some indications that this is associated with failing attention and greater vulnerability to distraction (Hasher & Zacks, 1988; McNab et al., 2015; Park & Payer, 2006) rather than a mere reversal of earlier developmental gains. Across the entire lifespan, as it waxes and wanes, working memory plays an important part in shaping our daily experience.

Given its central role in constraining human cognitive abilities, extensive efforts have been made to develop interventions that can improve working memory, for example through computerized training programs. However, these efforts have so far met with limited success. Some working memory tasks show improvements with practice, but these effects tend to reflect near or intermediate transfer, specific to the trained task or to (often closely related) direct measures of working memory, rather than far transfer extending to more general improvements in other tasks thought to depend on working memory, such as reading comprehension or arithmetic (Melby-Lervåg et al., 2016; Owen et al., 2010; Sala & Gobet, 2017). It has been argued that near and intermediate transfer effects arise through improvements in task-specific efficiency via refinement of strategies and long-term memory support (e.g., chunking), whereas more general benefits and far transfer would be expected to depend on the underlying capacity of attentional and storage systems (von Bastian & Oberauer, 2014). The absence of clear evidence for far transfer despite such extensive research thus suggests that working memory capacity limits are a fundamental and unalterable feature of the human cognitive system.

Although it is perhaps premature to rule out the possibility of interventions that increase working memory capacity, it appears at present that capacity can be extended only in specific contexts, through more specialized training with particular tasks and materials. Paradoxically, this resistance to more general training may be what makes working memory so important; to the extent that its capacity limits are unavoidable, working memory helps to determine the scope of human cognition and spurs us to find strategies, technologies, and cultural tools that allow us to go beyond those limits.

In conclusion, through the development of a powerful toolkit of experimental methods and of replicable empirical phenomena, the study of working memory function has provided many useful insights into interactions between attention and short-term memory. On the one hand, these interactions can be used strategically to enhance goal-directed behavior and long-term learning, while on the other, they impose fundamental limits on cognition across the lifespan. Ongoing controversy over the structure of working memory relates to the difficulty in isolating these interactions from other facets of cognition, but there is little doubt about their importance in governing what we can and cannot do.

References

Alloway, T. P., & Gathercole, S. E. (2006). How does working memory work in the classroom? Education Research and Reviews, 1(4), 134–139.