A feedforward architecture accounts for rapid categorization
Communicated by Richard M. Held, Massachusetts Institute of Technology, Cambridge, MA, January 26, 2007 (received for review November 11, 2006)

Abstract
Primates are remarkably good at recognizing objects. The level of performance of their visual system and its robustness to image degradations still surpass those of the best computer vision systems despite decades of engineering effort. In particular, the high accuracy of primates in ultra-rapid object categorization and rapid serial visual presentation tasks is remarkable. Given the number of processing stages involved and typical neural latencies, such rapid visual processing is likely to be mostly feedforward. Here we show that a specific implementation of a class of feedforward theories of object recognition (that extend the Hubel and Wiesel simple-to-complex cell hierarchy and account for many anatomical and physiological constraints) can predict the level and the pattern of performance achieved by humans on a rapid masked animal vs. non-animal categorization task.
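To make the simple-to-complex idea concrete, the toy sketch below stacks one template-matching ("simple") stage and one pooling ("complex") stage, then applies a linear readout. It is an illustration only, not the implementation evaluated in the paper: the Gaussian tuning function, the max pooling, the layer sizes, and every name in the code (`simple_layer`, `complex_layer`, `features`) are assumptions chosen for brevity, and the templates and readout weights are random placeholders rather than learned values.

```python
import numpy as np

def simple_layer(image, templates, sigma=1.0):
    """Template-matching ("simple") units: Gaussian tuning (assumed here)
    of each k x k template to the image patch at every valid position."""
    k = templates.shape[1]
    h, w = image.shape
    out = np.empty((templates.shape[0], h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = image[i:i + k, j:j + k]
            # Squared distance between the patch and each template.
            dist2 = ((templates - patch) ** 2).sum(axis=(1, 2))
            out[:, i, j] = np.exp(-dist2 / (2 * sigma ** 2))
    return out

def complex_layer(s_maps, pool=2):
    """Pooling ("complex") units: max over non-overlapping pool x pool
    neighborhoods, giving tolerance to small shifts (assumed operation)."""
    n, h, w = s_maps.shape
    h2, w2 = h // pool, w // pool
    trimmed = s_maps[:, :h2 * pool, :w2 * pool]
    return trimmed.reshape(n, h2, pool, w2, pool).max(axis=(2, 4))

def features(image, templates):
    """One simple stage followed by one complex stage, flattened."""
    return complex_layer(simple_layer(image, templates)).ravel()

rng = np.random.default_rng(0)
templates = rng.random((4, 3, 3))   # hypothetical templates, not learned
image = rng.random((8, 8))          # hypothetical stand-in for a stimulus
feats = features(image, templates)

# Linear readout: in practice the weights would be fit on labeled
# training images; random values here are placeholders only.
weights = rng.standard_normal(feats.size)
score = feats @ weights
label = "animal" if score > 0 else "non-animal"
```

Because the weights are untrained, the resulting `label` carries no meaning; the point is only the data flow from image to pooled features to a single linear decision.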
Footnotes
- §To whom correspondence should be addressed. E-mail: serre{at}mit.edu
- Author contributions: T.S., A.O., and T.P. designed research; T.S. and A.O. performed research; T.S. analyzed data; and T.S., A.O., and T.P. wrote the paper.
- The authors declare no conflict of interest.
- This article contains supporting information online at www.pnas.org/cgi/content/full/0700622104/DC1.
- ¶ Thorpe, S., Biologically Motivated Computer Vision, Second International Workshop, Nov. 22–24, 2002, Tübingen, Germany, pp. 1–15.
- ‖ Serre, T., Riesenhuber, M., Louie, J., Poggio, T., Biologically Motivated Computer Vision, Second International Workshop, Nov. 22–24, 2002, Tübingen, Germany, pp. 387–397.
- ** Guyonneau, R., Kirchner, H., Thorpe, S. J., European Conference on Visual Perception, Aug. 22–26, 2005, Coruña, Spain.
- †† The full training set is used to adjust the synaptic weights of the classification unit.
- ‡‡ Other classifiers could be used (a linear SVM gave very similar results). A recent study (9) demonstrated that a linear classifier can indeed read out object identity, object category, and other information (such as the position and size of the object) from the activity of ≈100 neurons in IT, with high accuracy and over extremely short time intervals (a single bin as short as 12.5 ms).
- §§ A single classifier was trained on all four animal and non-animal categories together.
- Abbreviations: V1, primary visual cortex; V2, extrastriate visual area II; V4, extrastriate visual area IV; IT, inferotemporal cortex; PFC, prefrontal cortex; SOA, stimulus onset asynchrony; ISI, interstimulus interval.
- Freely available online through the PNAS open access option.
- © 2007 by The National Academy of Sciences of the USA