# Variability in encoding precision accounts for visual short-term memory limitations

Edited by Richard M. Shiffrin, Indiana University, Bloomington, IN, and approved April 11, 2012 (received for review October 24, 2011)

## Abstract

It is commonly believed that visual short-term memory (VSTM) consists of a fixed number of “slots” in which items can be stored. An alternative theory in which memory resource is a continuous quantity distributed over all items seems to be refuted by the appearance of guessing in human responses. Here, we introduce a model in which resource is not only continuous but also variable across items and trials, causing random fluctuations in encoding precision. We tested this model against previous models using two VSTM paradigms and two feature dimensions. Our model accurately accounts for all aspects of the data, including apparent guessing, and outperforms slot models in formal model comparison. At the neural level, variability in precision might correspond to variability in neural population gain and doubly stochastic stimulus representation. Our results suggest that VSTM resource is continuous and variable rather than discrete and fixed and might explain why subjective experience of VSTM is not all or none.

Thomas Chamberlin famously warned scientists against entertaining only a single hypothesis, for such a modus operandi might lead to undue attachment and “a pressing of the facts to make them fit the theory” (ref. 1, p. 840). For half a century, the study of short-term memory limitations has been dominated by a single hypothesis, namely that a fixed number of items can be held in memory and any excess items are discarded (2–5). The alternative notion that short-term memory resource is a continuous quantity distributed over all items, with a lower amount per item translating into lower encoding precision, has enjoyed some success (6–8), but has been unable to account for the finding that humans often seem to make a random guess when asked to report the identity of one of a set of remembered items, especially when many items are present (9). Specifically, if resource were evenly distributed across items (6, 10), observers would never guess. Thus, at present, no viable continuous-resource model exists.

Here, we propose a more sophisticated continuous-resource model, the variable-precision (VP) model, in which the amount of resource an item receives, and thus its encoding precision, varies randomly across items and trials and on average decreases with set size. Resource might correspond to the gain of a neural population pattern of activity encoding a memorized feature. When gain is higher, a stimulus is encoded with higher precision (11, 12). Variability in gain across items and trials is consistent with observations of single-neuron firing rate variability (13–15) and attentional fluctuations (16, 17).

We tested the VP model against three alternative models (Fig. 1). According to the classic item-limit (IL) model (4), a fixed number of items is kept in memory, and memorized items are recalled perfectly. In the equal-precision (EP) model (6, 10), a continuous resource is evenly distributed across all items. The slots-plus-averaging (SA) model (9) acknowledges the presence of noise but combines it with the notion of discrete slots. Resource consists of a few discrete chunks, each of which affords limited precision to the encoding of an item. When there are fewer items than chunks, an item might get encoded using multiple chunks and thus with higher precision. To compare the four models, we used two visual short-term memory (VSTM) paradigms, namely delayed estimation (7) and change localization, each of which we applied to two feature dimensions, color and orientation (Fig. 2). We found that the VP model outperforms the previous models in each of the four experiments and accounts, at each set size, for the frequency that observers appear to be guessing. Thus, the VP model poses a serious challenge to models in which VSTM resource is assumed to be discrete and fixed.

## Theory

### VSTM Encoding and Variable Precision.

An observer memorizes *N* simultaneously presented stimuli. The task-relevant feature is orientation or color, both of which are circular variables in our experiments. Each stimulus is encoded with precision *J*, which is formally defined as Fisher information (18). We assume that the observer’s internal measurement of a stimulus is noisy and follows a Von Mises (circular normal) distribution,

$$p(x \mid s, J) = \frac{1}{2\pi I_0(\kappa)}\, e^{\kappa \cos(x - s)}, \tag{1}$$

where *I*_{0} is the modified Bessel function of the first kind of order 0 and the concentration parameter κ is uniquely determined by *J* through $J = \kappa\, I_1(\kappa) / I_0(\kappa)$ (*SI Text*). For a variable with a Gaussian distribution, *J* would be equal to inverse variance. A higher *J* produces a narrower distribution *p*(*x* | *s*, *J*) (Fig. 3*A*). In the VP model, *J* is variable across items and trials and we assume that it is drawn, independently across items and trials, from a gamma distribution with mean $\bar{J}$ and scale parameter τ (Fig. 3*A*). The measurement is then described by a doubly stochastic process, $(\bar{J}, \tau) \rightarrow J \rightarrow x$. We further assume that $\bar{J}$ depends on set size, *N*, in power-law fashion, $\bar{J} = \bar{J}_1 N^{-\alpha}$ (Fig. 3*B*). The free parameters $\bar{J}_1$, α, and τ are fitted to subject data.
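The generative process described above can be sketched in a few lines of Python. This is a minimal illustration rather than the fitting code used for the experiments: the function names, the bisection inversion of the precision-to-concentration mapping, and the parameter values in the usage lines are all assumptions made for the example. It relies on the standard result that the Fisher information of a Von Mises distribution is κ·*I*_{1}(κ)/*I*_{0}(κ).

```python
import math
import random


def bessel_ratio(kappa):
    """Return I1(kappa) / I0(kappa) for kappa >= 0."""
    if kappa > 50.0:
        # Large-argument asymptotic expansion of the ratio.
        return 1.0 - 1.0 / (2 * kappa) - 1.0 / (8 * kappa**2) - 1.0 / (8 * kappa**3)
    half_sq = (kappa / 2.0) ** 2
    i0_term, i1_term = 1.0, kappa / 2.0
    i0_sum, i1_sum = i0_term, i1_term
    k = 1
    while i0_term > 1e-17 * i0_sum:
        i0_term *= half_sq / (k * k)
        i1_term *= half_sq / (k * (k + 1))
        i0_sum += i0_term
        i1_sum += i1_term
        k += 1
    return i1_sum / i0_sum


def kappa_from_precision(J):
    """Invert J = kappa * I1(kappa) / I0(kappa) by bisection."""
    if J <= 0.0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi * bessel_ratio(hi) < J:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * bessel_ratio(mid) < J:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean_precision(set_size, J1_bar, alpha):
    """Power-law dependence of mean precision on set size."""
    return J1_bar * set_size ** (-alpha)


def sample_vp_measurement(s, set_size, J1_bar, alpha, tau, rng):
    """Draw one noisy measurement x of stimulus s (radians)."""
    J_bar = mean_precision(set_size, J1_bar, alpha)
    # A gamma distribution with mean J_bar and scale tau has shape J_bar / tau.
    J = rng.gammavariate(J_bar / tau, tau)
    kappa = kappa_from_precision(J)
    x = rng.vonmisesvariate(s % (2 * math.pi), kappa)
    return x, J, kappa


# Hypothetical parameter values, chosen only for illustration.
rng = random.Random(0)
for n in (1, 4, 8):
    x, J, kappa = sample_vp_measurement(math.pi, n, J1_bar=30.0, alpha=1.3, tau=20.0, rng=rng)
    print(f"N={n}: J={J:.3f}, kappa={kappa:.3f}, x={x:.3f}")
```

Precision is drawn first and the measurement second, which is what makes the process doubly stochastic: two trials with the same stimulus and set size can differ in precision as well as in noise.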

### Models for Delayed Estimation.

In experiments 1 and 2, observers estimated the value of a remembered stimulus (Fig. 2 *A* and *B*). The stimulus estimate, denoted $\hat{s}$, is equal to the measurement, *x*. In the IL model, the measurement of a remembered stimulus is noiseless but only *K* items (the “capacity”) are remembered (or all *N* when *N* ≤ *K*), producing a guessing rate of 1 − *K*/*N* for *N* > *K*. In the SA model, *K* chunks of resource are allocated and the estimate distribution has two components. When the tested item has no chunks, the observer guesses and the estimate distribution is uniform; otherwise, it is a Von Mises distribution with κ determined by the number of chunks. In the EP model, the estimate distribution is Von Mises as in Eq. **1**, but with precision *J* equal across items and across trials with the same *N* and dependent on *N* as $J = J_1 N^{-\alpha}$. In the VP model, the estimate distribution is a mixture of many Von Mises distributions, each with a different value of κ: $p(\hat{s} \mid s) = \int p(\hat{s} \mid s, J)\, p(J \mid \bar{J}, \tau)\, dJ$, where $p(J \mid \bar{J}, \tau)$ is the gamma distribution over precision with mean $\bar{J}$ and scale parameter τ (Fig. S1*A*). In all models, we assume that the observer’s response is equal to the estimate plus zero-mean Von Mises response noise with concentration parameter κ_{r}. Model details can be found in *SI Text*.
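The guessing behavior of the two slot-based models can be made concrete with a short sketch. The IL guessing rate follows directly from the description above. For the SA model, the text does not say how chunks are divided when they outnumber the items, so the sketch assumes they are spread as evenly as possible; the function names are hypothetical.

```python
from fractions import Fraction


def il_guess_rate(set_size, capacity):
    """IL model: probability that the probed item was not stored."""
    if set_size <= capacity:
        return Fraction(0)
    return 1 - Fraction(capacity, set_size)


def sa_chunk_distribution(set_size, n_chunks):
    """SA model: probability that the probed item holds c chunks, for each c.

    Assumption (not stated in the text): chunks are spread as evenly as
    possible, so any two items differ by at most one chunk.
    """
    base, extra = divmod(n_chunks, set_size)
    dist = {base: Fraction(set_size - extra, set_size)}
    if extra:
        dist[base + 1] = Fraction(extra, set_size)
    return dist


for n in range(1, 9):
    print(n, il_guess_rate(n, 4), sa_chunk_distribution(n, 4))
```

In this sketch an item holding zero chunks corresponds to a guess, so for *N* > *K* the SA guessing rate matches the IL value of 1 − *K*/*N*, while for *N* ≤ *K* every item holds at least one chunk.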

### Models for Change Localization.

In experiments 3 and 4, observers sequentially viewed two displays, which were identical except that one stimulus changed between them. Observers reported where the change occurred (Fig. 2 *C* and *D*). The stimuli in the first display and the magnitude of the change were all drawn independently from a uniform distribution. In each model, stimuli are encoded in the same way as in delayed estimation, but the decision-making stage is different (Fig. 3*C*). We denote the measurements of the stimuli in the first and second displays by vectors **x** and **y**, respectively, and the corresponding concentration parameters by a vector **κ**. In the EP and VP models, the observer has access to all *N* pairs of measurements, but in the SA model only to *K* of them (or *N* when *N* ≤ *K*). The statistical structure of the task-relevant variables is shown in Fig. S1*C*. In all models with noisy encoding, the observer’s decision process is modeled as Bayesian inference. The Bayesian decision rule is to report the location *L* for which the posterior probability of change occurrence is largest, which is equivalent to maximizing a per-location decision variable computed from the measurements and their concentration parameters (*SI Text*).
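To illustrate the form such a decision rule can take, the posterior over the change location can be worked out under one simplifying assumption that is not stated explicitly above: that the two measurements at location *i* share a single concentration parameter $\kappa_i$. The expression actually used by the authors is given in *SI Text* and may differ from this sketch. With a flat prior over *L*, a uniformly distributed change magnitude (which makes $x_L$ and $y_L$ independent and uniform), and uniformly distributed stimuli,

$$
\begin{aligned}
p(L \mid \mathbf{x}, \mathbf{y})
&\propto \frac{1}{(2\pi)^{2}} \prod_{i \neq L} \frac{1}{2\pi} \int_{0}^{2\pi} p(x_i \mid s, \kappa_i)\, p(y_i \mid s, \kappa_i)\, ds \\
&= \frac{1}{(2\pi)^{2N}} \prod_{i \neq L} \frac{I_0\!\left(\kappa_i \sqrt{2 + 2\cos(y_i - x_i)}\right)}{I_0(\kappa_i)^{2}} \\
&\propto \frac{I_0(\kappa_L)^{2}}{I_0\!\left(\kappa_L \sqrt{2 + 2\cos(y_L - x_L)}\right)} .
\end{aligned}
$$

The second line uses $\kappa\cos(x - s) + \kappa\cos(y - s) = \kappa\sqrt{2 + 2\cos(y - x)}\,\cos(s - \theta)$ for some angle $\theta$, together with $\int_0^{2\pi} e^{R\cos(s - \theta)}\, ds = 2\pi I_0(R)$. The last line follows because the product over $i \neq L$ equals the product over all locations, which does not depend on *L*, divided by the term for location *L*. Under this assumption the observer reports the location where the two measurements are least consistent with an unchanged stimulus, weighted by how precisely that item was encoded.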

## Psychophysics and Model Comparison

### Experiment 1: Delayed Estimation of Color.

To compare the models, we first performed a delayed-estimation experiment (7). Observers briefly viewed and memorized the colors of *N* discs (*N* = 1, … , 8) and reported the color of a randomly chosen target disk by scrolling through all possible colors (Fig. 2*A*). Following other authors (9), we fitted to the observer’s estimation errors a mixture of a Von Mises distribution and a uniform distribution (see Fig. S2 for an example). We refer to the mixture proportion of the Von Mises component as *w* and to its circular SD as CSD. Note that this fitting procedure does not constitute a model, but is simply a way of summarizing the data into two descriptive statistics. It would be premature to interpret *w* as the probability that an item was encoded and 1 − *w* as the guessing rate, as suggested in ref. 9, because such an interpretation is meaningful only if the true error distribution is a uniform+Von Mises mixture, which we argue here is not the case. We verified that observers did not report colors of nontarget discs (Fig. S3; a different response modality, namely clicking on a color wheel, did produce nontarget reports). For each model, we generated synthetic datasets of the same size as the subject datasets, using the maximum-likelihood estimates of the parameters obtained from the subject data (Table S1), and then fitted the uniform+Von Mises mixture to these synthetic data. The resulting model predictions, averaged over subjects, are shown in Fig. 4*A* (for individual-subject fits, see Fig. S4). Consistent with previous results (9), we find a significant main effect of set size on both *w* [one-way repeated-measures ANOVA; *F*(7, 84) = 42.1, *P* < 0.001] and CSD [*F*(7, 84) = 4.60, *P* < 0.001]. This result rules out both the EP model, which predicts *w* close to 1 at each set size (the slight deviation is an artifact of the limited number of trials), and the IL model, which predicts that CSD is constant. 
The SA and VP models explain the data better, with the VP model having the lowest root mean-square (RMS) error (Fig. 4*A*). In the SA model, capacity *K* equals 4.00 ± 0.34 (mean ± SEM), in line with earlier work (9). In the VP model, the power α equals 1.33 ± 0.14 (Fig. S5*A*).
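The descriptive uniform+Von Mises fit can be sketched with a small expectation-maximization routine. This is one possible way to obtain *w* and CSD, not necessarily the procedure used here: the EM algorithm, the cap `KAPPA_MAX`, the definition of CSD as $\sqrt{-2 \ln (I_1(\kappa)/I_0(\kappa))}$, and the synthetic data (true *w* = 0.7, κ = 8) are all assumptions of the example.

```python
import math
import random

KAPPA_MAX = 200.0  # hypothetical upper bound that keeps the series well behaved


def bessel_i0_i1(kappa):
    """Power-series values of I0(kappa) and I1(kappa) for 0 <= kappa <= KAPPA_MAX."""
    half_sq = (kappa / 2.0) ** 2
    t0, t1 = 1.0, kappa / 2.0
    s0, s1 = t0, t1
    k = 1
    while t0 > 1e-17 * s0:
        t0 *= half_sq / (k * k)
        t1 *= half_sq / (k * (k + 1))
        s0 += t0
        s1 += t1
        k += 1
    return s0, s1


def kappa_from_mean_cos(r):
    """Solve I1(kappa) / I0(kappa) = r by bisection on [0, KAPPA_MAX]."""
    if r <= 0.0:
        return 0.0
    lo, hi = 0.0, KAPPA_MAX
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        i0, i1 = bessel_i0_i1(mid)
        if i1 / i0 < r:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_mixture(errors, n_iter=100):
    """EM fit of w * VonMises(0, kappa) + (1 - w) * Uniform to circular errors."""
    cos_e = [math.cos(e) for e in errors]
    w, kappa = 0.5, 1.0
    for _ in range(n_iter):
        i0, _ = bessel_i0_i1(kappa)
        # E-step: responsibility of the Von Mises component for each error.
        resp = []
        for c in cos_e:
            vm = w * math.exp(kappa * c) / (2 * math.pi * i0)
            resp.append(vm / (vm + (1 - w) / (2 * math.pi)))
        # M-step: update the mixture weight and the concentration.
        total = sum(resp)
        w = total / len(resp)
        kappa = kappa_from_mean_cos(sum(r * c for r, c in zip(resp, cos_e)) / total)
    i0, i1 = bessel_i0_i1(kappa)
    csd = math.sqrt(-2.0 * math.log(i1 / i0)) if kappa > 0 else math.inf
    return w, kappa, csd


# Synthetic estimation errors from a known mixture (illustrative values only).
rng = random.Random(0)
errors = [
    rng.vonmisesvariate(0.0, 8.0) if rng.random() < 0.7 else rng.uniform(-math.pi, math.pi)
    for _ in range(2000)
]
w_hat, kappa_hat, csd_hat = fit_mixture(errors)
print(f"w = {w_hat:.3f}, kappa = {kappa_hat:.2f}, CSD = {csd_hat:.3f} rad")
```

Applying the same routine to subject data and to model-generated synthetic data yields two summary statistics per set size that can be compared directly, without committing to the mixture as a model of memory.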

There is a clear intuition for why the VP model, but not the EP model, accounts for the decrease of *w* with set size. Because of trial-to-trial variability in precision, the target item sometimes, by chance, receives so little resource that the estimate on that trial is grouped into the uniform distribution, even though it was not a “real” guess. When set size is larger, mean precision is lower, resulting in more probability mass near zero precision (Fig. 3*B*) and a higher apparent guessing rate. Thus, it is not necessary to assume discrete resources to explain the decrease of *w* with set size.
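This intuition can be checked numerically with a rough sketch. The code below draws precision values from a gamma distribution whose mean follows a power law in set size and counts how often precision falls below a small threshold. The parameter values and the threshold of *J* < 1 are hypothetical and serve only to illustrate the trend.

```python
import random


def fraction_low_precision(set_size, J1_bar, alpha, tau, threshold, n_draws, rng):
    """Monte Carlo estimate of P(J < threshold) at a given set size."""
    J_bar = J1_bar * set_size ** (-alpha)
    shape = J_bar / tau  # gamma with mean J_bar and scale tau
    low = sum(rng.gammavariate(shape, tau) < threshold for _ in range(n_draws))
    return low / n_draws


rng = random.Random(0)
fractions = {
    n: fraction_low_precision(n, J1_bar=60.0, alpha=1.3, tau=40.0,
                              threshold=1.0, n_draws=20000, rng=rng)
    for n in range(1, 9)
}
for n, frac in fractions.items():
    print(f"N={n}: fraction of draws with J < 1 is {frac:.3f}")
```

With these illustrative parameters, very low precision is rare at set size 1 and common at set size 8, even though no item is ever discarded.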

To further determine which model best describes the data, we performed Bayesian model comparison (19), a principled method that automatically corrects for the number of free parameters (*SI Text*). We found that the log likelihood of the VP model exceeds those of the IL, SA, and EP models by respectively 15.6 ± 3.1, 12.0 ± 3.1, and 40.3 ± 6.3 points (Fig. 5*A*). A log-likelihood difference (or log Bayes factor) of 12.0 means that the data are *e*^{12.0} times more probable under one model than under another. At the level of individual subjects (Fig. S6*A*), we find that the VP model is most likely for 12 of 13 subjects, whereas SA is slightly better for one. Consistent results were obtained using the Bayesian information criterion (20) (Fig. S6*B*).
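The arithmetic behind these comparisons is simple enough to sketch. The first function converts a log-likelihood difference into a Bayes factor, as in the sentence above. The second computes the Bayesian information criterion in its standard form, $-2 \ln \hat{L} + k \ln n$; the inputs in the BIC example are made-up numbers, not values from the experiments.

```python
import math


def bayes_factor(log_likelihood_difference):
    """Ratio of data probabilities implied by a log-likelihood difference."""
    return math.exp(log_likelihood_difference)


def bic(max_log_likelihood, n_params, n_trials):
    """Standard Bayesian information criterion (lower is better)."""
    return -2.0 * max_log_likelihood + n_params * math.log(n_trials)


print(f"e^12.0 = {bayes_factor(12.0):.3g}")

# Hypothetical example: two models fitted to the same 864 trials.
bic_a = bic(max_log_likelihood=-1500.0, n_params=4, n_trials=864)
bic_b = bic(max_log_likelihood=-1510.0, n_params=3, n_trials=864)
print(f"BIC difference (B minus A) = {bic_b - bic_a:.2f}")
```

A difference of 12.0 log-likelihood points therefore corresponds to the data being more than 100,000 times more probable under one model than under the other.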

### Residual in Delayed Estimation.

The VP model makes an intuitive prediction distinct from the other models. So far, we have fitted the data with a uniform+Von Mises mixture to obtain two descriptive statistics, *w* and CSD. The VP model postulates variability in precision, causing its predicted error distribution to be a mixture of a large number of Von Mises distributions, each with a different *J*. Such a mixture cannot be fitted perfectly with a uniform+Von Mises mixture and will therefore leave a residual. Using the synthetic data described above, we find that the residual predicted by the VP model, but not by other models, has a central peak and negative side lobes (Fig. 5*B*). The subject data show a residual of exactly this shape (Fig. 5*C* and Fig. S2). This result constitutes additional evidence for variability in precision.

### Experiment 2: Delayed Estimation of Orientation.

To investigate the generality of these results, we replicated the experiment using orientation (Fig. 2*B*). The data show a significant main effect of set size on both *w* [one-way repeated-measures ANOVA, *F*(7, 35) = 32.4, *P* < 0.001] and CSD [*F*(7, 35) = 3.28, *P* < 0.01] (Fig. 4*B* and Fig. S7), again ruling out the IL and EP models. The SA and VP models explain the data better, with the VP model having the lowest RMS error (Fig. 4*B*). In the SA model, capacity *K* = 3.33 ± 0.56. In the VP model, the power α = 1.41 ± 0.15 (Fig. S5*A*). Bayesian model comparison shows that the VP model outperforms the IL, SA, and EP models by 103 ± 15, 52 ± 11, and 142 ± 30 log-likelihood points, respectively (Fig. 5*D*). The VP model is most likely for all six subjects (Fig. S6*C*). Results were confirmed using the Bayesian information criterion (Fig. S6*D*). The residual after subtracting the uniform+Von Mises mixture has the shape predicted by the VP model (Fig. 5 *E* and *F*).

### Experiments 3 and 4: Change Localization.

To examine whether the VP model can account for human behavior in other VSTM tasks, we conducted two experiments in which subjects localized a change in the color or orientation of a stimulus (Fig. 2 *C* and *D*). Set size had a significant main effect on accuracy both for color [one-way repeated-measures ANOVA, *F*(3, 18) = 256.6, *P* < 0.001] and for orientation [*F*(3, 30) = 356.5, *P* < 0.001] (Figs. S8*A* and S9*A*). Magnitude of change has a significant effect on accuracy both for color [one-way repeated-measures ANOVA, *F*(8, 48) = 114.3, *P* < 0.001] and for orientation [*F*(8, 80) = 238.5, *P* < 0.001] (Fig. 6). Judged by RMS error, the VP model provides the best fits to the psychometric curves (Fig. 6). Individual-subject fits are shown in Figs. S8 and S9. In the SA model, capacity *K* = 2.86 ± 0.14 for color and 4.09 ± 0.39 for orientation. In the VP model, the power α = 0.974 ± 0.090 for color and 0.993 ± 0.075 for orientation (Fig. S5*B*). In Bayesian model comparison, the VP model outperforms the IL, SA, and EP models both for color (by 143 ± 11, 10.1 ± 2.6, and 15.0 ± 2.8 log-likelihood points) and for orientation (by 145 ± 11, 11.9 ± 2.6, and 17.3 ± 2.8 points) (Fig. 7 *A* and *C*). In both experiments, the VP model outperforms all other models for every individual subject (Fig. S10).

### Apparent Guessing in Change Localization.

To further distinguish the models, we computed an apparent guessing rate analogous to 1 − *w* in delayed estimation. We did so by fitting, at each set size separately, a Bayesian-observer model with equal, fixed precision and a guessing rate to both the subject data and the model-generated synthetic data. The EP model predicts an apparent guessing rate of zero. We found that subjects’ apparent guessing rate was significantly higher than zero at all set sizes [*t*(6) > 4.82, *P* < 0.002 and *t*(10) > 4.64, *P* < 0.001 for experiments 3 and 4, respectively] and increased with set size [*F*(3, 18) = 85.8, *P* < 0.001 and *F*(3, 30) = 26.6, *P* < 0.001, respectively]. The VP model reproduces the increase of apparent guessing rate with set size more accurately than the SA model (Fig. 7 *B* and *D*). As in delayed estimation, the apparent guessing rate predicted by the VP model is nonzero because items are sometimes encoded with very low precision, and this happens more frequently when set size is large.

## Discussion

### Do Slots Exist?

Our results suggest that VSTM limitations should be conceptualized in terms of quality of encoding rather than number of items. Earlier work proposing continuous-resource models in the study of VSTM (6–8) did not model variability in resource across items and trials. Here, we have shown that when such variability is not modeled, as in the EP model, human responses in delayed estimation and change localization cannot be accounted for. By contrast, the VP model accounts for all presented data, including the existence of apparent guessing and its increase with set size, which have so far been attributed to an item limit. Thus, the VP model poses a serious challenge to the notion of slots in VSTM and might reconcile an apparent capacity of about four items with the subjective sense that we possess some memory of an entire scene: Items are never discarded completely, but their encoding quality could by chance be very low.

Most neuroimaging and EEG studies of VSTM limitations consider only the slots framework (5, 21–24) (but see refs. 25 and 26). Without testing alternative models of VSTM, these studies cannot provide evidence for the existence of slots. The VP model offers a viable alternative, and we expect that quantities in the VP model will also correlate with neural variables.

We do not expect the VP model to end the debate about the nature of VSTM limitations. Variants of both the VP model and previous models can be conceived and should be tested. Possible hybrids between the SA and VP models include SA with trial-to-trial variability in capacity *K* (27, 28) and VP augmented with an item limit (continuous resource in discrete slots). We expect, however, that any alternative model will have to explicitly model variability in resource across items and trials to account for the data.

### Is Resource Discrete?

The SA model asserts not only that VSTM consists of slots, but also that resource comes in discrete chunks. The latter notion is difficult to reconcile with the fact that sensory noise is a graded rather than a discrete quantity. For example, stimulus contrast affects sensory noise and therefore encoding precision in a graded manner. Such continuous modulation is inconsistent with the allocation of “fixed-size, prepackaged boxes” (9) of resource, because those boxes allow for only a small, discrete number of noise levels. The VP model does not have this problem, because precision is a continuous quantity and is modulated by contrast in a continuous manner.

### Neural Basis of VSTM Resource.

Previous models have not specified a neural correlate of VSTM resource. Here, we propose to identify VSTM memory resource with the gain (mean amplitude) of the neural population pattern encoding a stimulus. Several arguments support such an identification. First, for Poisson-like populations, gain is proportional to encoding precision (29). Moreover, the energy cost associated with high gain (30) could explain why working memory is limited: As set size grows larger, the energy cost gradually outweighs the benefit of encoding items with high precision. Finally, gain in visual cortical areas is modulated by attention (31–33), and attentional limitations are closely related to working memory ones (8, 34).
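The proportionality between gain and precision can be illustrated for the simplest Poisson-like case, a population of independent Poisson neurons, for which Fisher information is the sum over neurons of $f_i'(s)^2 / f_i(s)$. The Von Mises shaped tuning curves, the population size, and the tuning width below are hypothetical choices made only for this sketch.

```python
import math


def tuning(s, preferred, gain, width_kappa):
    """Mean spike count of one neuron (hypothetical Von Mises shaped tuning)."""
    return gain * math.exp(width_kappa * (math.cos(s - preferred) - 1.0))


def fisher_information(s, gain, n_neurons=64, width_kappa=2.0):
    """Fisher information of independent Poisson neurons: sum of f'(s)^2 / f(s)."""
    total = 0.0
    for i in range(n_neurons):
        preferred = 2 * math.pi * i / n_neurons
        f = tuning(s, preferred, gain, width_kappa)
        df = -width_kappa * math.sin(s - preferred) * f  # analytic derivative of f
        total += df * df / f
    return total


for gain in (5.0, 10.0, 20.0):
    print(f"gain={gain:5.1f}  J={fisher_information(1.0, gain):.2f}")
```

Because both $f_i$ and $f_i'$ scale with gain, each term in the sum scales linearly with gain, so doubling the gain doubles *J* in this sketch.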

### Neural Basis of Variability in Precision.

Although our results point to variability in encoding precision as key in describing VSTM limitations, the VP model does not specify the origin of this variability. Variations in attention and alertness are likely contributors, but stimulus-related precision differences [such as cardinal orientations being encoded with higher precision (35)] might also play a role. There is evidence that microsaccades are predictive of variability in precision during change detection (36). Variability in precision provides a behavioral counterpart to recent physiological findings of trial-to-trial and item-to-item fluctuations in attentional gain (16, 17). A consequence of gain variability is that the neural representation **r** of a stimulus follows a doubly stochastic process: The spike count distribution is determined by gain *g*, which itself is stochastic. Supporting this notion, doubly stochastic processes can well describe spike counts in lateral intraparietal cortex (LIP) (13), visual cortex (15), and other areas (14). Thus, the VP model is broadly consistent with emerging physiological findings.
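A doubly stochastic spike count can be illustrated with a toy simulation. The sketch below compares a single Poisson neuron with fixed gain to one whose gain is itself drawn from a gamma distribution on every trial. The gamma form of the gain distribution, the parameter values, and the use of the Fano factor as a summary are assumptions of the example, not claims about the cited recordings.

```python
import math
import random


def poisson(lam, rng):
    """Poisson sample by Knuth's multiplication method (fine for small rates)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def fano_factor(counts):
    """Variance-to-mean ratio of a list of spike counts."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return var / mean


rng = random.Random(0)
mean_gain, shape, n_trials = 5.0, 2.0, 20000

# Fixed gain: ordinary Poisson spiking.
fixed = [poisson(mean_gain, rng) for _ in range(n_trials)]
# Variable gain: the rate is redrawn on every trial, then spikes are Poisson.
variable = [poisson(rng.gammavariate(shape, mean_gain / shape), rng) for _ in range(n_trials)]

print(f"Fano factor, fixed gain:    {fano_factor(fixed):.2f}")
print(f"Fano factor, variable gain: {fano_factor(variable):.2f}")
```

Both neurons have the same mean count, but gain variability adds variance on top of the Poisson variance, so the second Fano factor exceeds 1.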

### Decrease of Mean Precision with Set Size.

The VP model predicts that mean precision decreases gradually with increasing set size and, if encoding precision can be identified with neural gain, that gain does as well. Extant physiological evidence is consistent with this prediction. Neuronal responses in LIP, an area associated with spatial attention, are lower to the onset of four than to that of two choice targets (37). In the superior colliculus, an area associated with covert attention, firing rates also decrease with the number of choice targets (38). Similar measurements in areas encoding short-term memories of visual stimuli remain to be made.

In both change localization experiments, we found that the mean precision decreases with set size approximately as 1/*N*, which would be predicted by models in which the total amount of resource is, on average, independent of set size. However, in both delayed-estimation experiments, we found a steeper decline. This result shows that the decrease of mean precision with set size is task-dependent and that the trial-averaged total amount of resource might depend on set size. Perhaps the precise relation between mean precision and set size is set by a trade-off between energy expenditure and performance. In support of this speculation, a decrease of mean precision with set size is also observed in an attentionally demanding task without a memory component (39).
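The link between the power α and total resource can be written out explicitly. Assuming mean precision per item follows a power law in set size, with $\bar{J}_1$ denoting mean precision at *N* = 1 (a notational choice made for this sketch), the trial-averaged total resource across *N* items is

$$
N \bar{J}(N) = N \cdot \bar{J}_1 N^{-\alpha} = \bar{J}_1 N^{\,1-\alpha} .
$$

For α = 1 the total is independent of set size. For the change-localization estimate α ≈ 0.974, going from *N* = 2 to *N* = 8 changes the total by a factor of $4^{0.026} \approx 1.04$, which is nearly constant. For the delayed-estimation estimate α ≈ 1.33, going from *N* = 1 to *N* = 8 changes it by a factor of $8^{-0.33} \approx 0.50$, so under this reading about half as much total resource is deployed at the largest set size.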

### Neural Decoding.

Nonhuman primate studies have begun to investigate set size effects in VSTM (36, 40–42). Advances in simultaneous recordings from large populations of single neurons, as well as in the decoding of voxel patterns in functional MRI, might soon allow for model comparison more powerful than psychophysics allows. For instance, in delayed estimation, one could conceivably obtain estimates **x** = (*x*_{1}, … , *x*_{N}) of the stimuli **s** = (*s*_{1}, … , *s*_{N}) at all *N* locations simultaneously. The predictions for *p*(**x** | **s**) made by the SA and VP models can then be compared directly. Altogether, the VP model could help to consolidate the perspectives of cognitive psychology and systems neuroscience on VSTM limitations.

## Methods

Detailed experimental methods can be found in *SI Text*. In experiment 1 (Fig. 2*A*), observers memorized the colors of *N* discs (*N* = 1, … , 8) and reported the color of a randomly chosen target disk. Data of one subject were excluded, because her estimated value of *w* at set size 1 was extremely low (*w* = 0.72, compared with *w* > 0.97 for every other subject). A trial sequence consisted of the presentation of a fixation cross, the stimulus array, a delay period, and a response screen. Subjects responded by scrolling through all possible colors. Colors were drawn independently from a uniform distribution on a color wheel. Fourteen subjects each completed 864 trials in the scrolling condition. Experiment 2 (Fig. 2*B*) was identical except that stimuli were oriented Gabors. Set size was 2, 4, 6, or 8. Six subjects each completed 2,560 trials. In experiment 3 (Fig. 2*C*), observers were presented briefly with two displays containing *N* colored discs each (*N* = 2, 4, 6, or 8). The trial sequence consisted of the presentation of a fixation cross, the first stimulus array, a delay period, the second stimulus array, in which exactly one stimulus had changed color, and a response screen. Subjects clicked on the location of the stimulus that had changed. Colors in the first array and the magnitude of the change were drawn independently from a uniform distribution on a color wheel. Seven subjects each completed 1,920 trials. Experiment 4 (Fig. 2*D*) was identical except that stimuli were oriented ellipses. Eleven subjects each completed 1,920 trials.

## Acknowledgments

W.J.M. is supported by Award R01EY020958 from the National Eye Institute. R.v.d.B. was supported by the Netherlands Organisation for Scientific Research.

## Footnotes

^{1}R.v.d.B. and H.S. contributed equally to this work.

^{2}Present address: Max Planck Institute for Dynamics and Self-Organization, Georg August University Göttingen, 37077 Göttingen, Germany.

^{3}To whom correspondence should be addressed. E-mail: wjma{at}bcm.edu.

Author contributions: R.v.d.B., H.S., and W.J.M. designed research; R.v.d.B., H.S., W.-C.C., R.G., and W.J.M. performed research; R.v.d.B., H.S., W.-C.C., and R.G. analyzed data; and R.v.d.B. and W.J.M. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1117465109/-/DCSupplemental.

## References

- (8) Bays PM, Husain M
- (11) Seung HS, Sompolinsky H
- (15) Goris RLT, Simoncelli EP, Movshon JA. *Cosyne Abstracts* (Salt Lake City).
- (16) Cohen MR, Maunsell JHR
- (18) Cover TM, Thomas JA
- (19) MacKay DJ
- (21) Anderson DE, Vogel EK, Awh E
- (28) Sims CR, Jacobs RA, Knill DC
- (33) Salinas E, Sejnowski TJ
- (36) Lara AH, Wallis JD
- (38) Basso MA, Wurtz RH
- (39) Mazyar H, Van den Berg R, Ma WJ. *J Vis*, in press.
- (40) Heyselaar E, Johnston K, Pare M. *J Vis* 11(3):11, 1–10.
- (42) Buschman TJ, Siegel M, Roy JE, Miller EK
