Functional computational model for optimal color coding

Contributed by A. Kimball Romney, April 28, 2009 (received for review January 26, 2009)
Abstract
This paper presents a computational model for color coding that provides a functional explanation of how humans perceive colors in a homogeneous color space. Beginning with known properties of human cone photoreceptors, the model estimates the locations of the reflectance spectra of Munsell color chips in perceptual color space as represented in the CIE L*a*b* color system. The fit between the two structures is within the limits of expected measurement error. Estimates of the structure of perceptual color space for color anomalous dichromats missing one of the normal cone photoreceptors correspond closely to results from the Farnsworth–Munsell color test. An unanticipated outcome of the model provides a functional explanation of why additive lights are always red, green, and blue and provide maximum gamut for color monitors and color television even though they do not correspond to human cone absorption spectra.
The aim of this article is to formulate a model for color coding that estimates how humans perceive color as a function of their cone sensitivity curves. The model is functional in the sense that it answers some of the how and why questions about color processing raised below. The model is computational in the sense that it provides formulas to predict, among other things, how normal human color perception with three cone photoreceptors differs from human color perception where one cone is missing. We will use the term optimal in the sense that the code carries the maximum amount of information using minimal channel capacity or bandwidth.
All visual information used by the brain in processing visual images, including the information about the color of objects, originates in the photoreceptors in the retina. The function of human color vision is to assign the colors of objects in the visual environment to locations in a coherent perceptual color space. The information about the color of an object resides in its reflectance spectrum. Color is inferred from light reflected from the surfaces of objects. That light is the product of the reflectance spectrum of the object and the wavelength composition of the illumination. Basically, the retina has to consolidate the information received by many millions of highly redundant wavelength-sensitive photoreceptor cells into an information-efficient and compressed code for transmission through the approximately one million fibers of the optic nerve. The code carrying the visual information down the optic nerve must contain information about the color of objects. We emphasize that the physiological implementation of color coding is beyond the scope of this article. Even though the abstract mathematical calculations of our model implemented by a computer clearly have no physiological analogs, we do think that computations that result in outcomes similar to those of our model are carried out by the human visual system in some fashion. The bandwidth limitations of the optic nerve suggest that much of the color coding is done in the retina.
The model will be tested on how well it predicts the appearance of the surface reflectance spectra of Munsell (1) color atlas chips for normal observers with three cone receptors and for anomalous dichromat observers missing one of the three cone receptors. Normal observers will be compared with the international standard of human perceptual color appearance, namely, CIE L*a*b* (2), whereas the dichromats will be compared with the performance on the Farnsworth–Munsell color test (3). We recognize that restricting stimuli to color chips constrains the applicability to a context of the appearance of the color of a single color chip observed on a neutral background in normal daylight. This constrained context is free of complex effects such as local color contrast, color induction, and other complicating phenomena. Because the data used in this article are derived from color chips in which the relationship among the chips is entirely a function of their reflectance spectra, we assume a flat illuminant equal to a constant that does not appear in the model. To test our model, we assume an observer with three cone photoreceptor sensitivity curves as specified by Stockman and Sharpe (4). The model also allows predictions of the color appearance of the Munsell chips for any observer who has known sensitivity curves that differ in location or variance from the Stockman and Sharpe curves.
The model is formulated to provide a functional explanation of various outstanding color vision puzzles. Examples of the kinds of how and why questions we explain include the following: (i) How is the redundancy caused by the close spacing of the long and medium wavelength-sensitive cones adjusted for in color coding? (ii) How is color constancy maintained over a wide range of intensities and wavelength composition differences of illumination? (iii) Why are afterimages always seen as complementary to the stimulus color? (iv) Why do red, green, and blue monochromatic lights provide the maximum gamut for any combination of wavelengths? (v) Why are the additive colors red, green, and blue, whereas the subtractive colors are cyan, magenta, and yellow and not vice versa? (vi) How does one compute metamers, i.e., separate what Wyszecki (2) calls the “fundamental color stimulus” from the “metameric black” component of any reflectance spectrum? (vii) And finally, how does one account for recent research that shows that individual differences in the ratio of long wavelength-sensitive to medium wavelength-sensitive cones vary among normal human observers from 10:1 to 0.4:1 (a 25-fold range) without affecting the perception of the wavelength of unique yellow as measured with an anomaloscope (5)?
Major Components of the Model
Before discussing the model and its implementation in the next section, we first examine each of the major conceptual components of the model. Oddly, all of these components are well known and have been in the literature for decades but have never been combined into a single comprehensive model. The first component we label redundancy reduction. The idea is to remove receptor redundancy by producing orthogonal channels of transmission. Attneave (6) and Barlow (7) noted the need for redundancy reduction and information-efficient coding in the retina several decades ago. In the early 1950s MacAdam (8, 9), in his search for a visually homogeneous color space, was the first actually to calculate an orthogonal transformation of color-matching functions (CMF), noting that, “There is an infinite variety of sets of orthogonal functions” (8). The well-known work of Buchsbaum and Gottschalk (10) later made extensive use of an optimal orthogonal transformation of cone sensitivity curves. They used information theory to demonstrate that orthogonal channels produce optimal coding for the transmission of information. Reflectance spectra were not considered in any of these studies.
The second component we label latent components. These latent components are never observed directly, but the concept is of critical importance in our model. The latent components constitute an infinite number of sets of three orthogonal vectors (as noted by MacAdam in ref. 8); each set contains common information related to the Stockman and Sharpe (4) cone sensitivity curves by a linear transformation. The Stiles and Burch (11) CMF are one example of a linear transformation of the Stockman and Sharpe cone sensitivity curves. Another example where two sets of curves are related by a linear transformation is Wald's (12) set of empirical cone measurements, where the three curves have very different heights compared with the Stockman and Sharpe curves, which are normalized to all have the same height. It is well known that one can use different primary lights in CMF experiments. When plotting the results of experiments derived by using various sets of primary lights, different curves are obtained for each set. However, all sets are related to each other by a linear transformation (13). Mathematically, these linear transformations are represented in a 3 by 3 matrix. Perhaps the most important class of latent components consists of the chromatic and luminance adaptations suggested by von Kries (2) more than 100 years ago. These are linear transformations of the cone receptors required to adjust to overall differences in intensity and spectral composition of the illumination. Any nonzero weight between 0 and 1 may be applied to any of the cones to adjust to changing illumination. Analogous adjustments could be made for individual differences in proportions of cone types (5). In effect, each of the infinite sets of latent components represents a virtual state of adaptation represented in orthogonal form.
The third component we label as the projection matrix. The projection matrix is a mathematical procedure that reduces the infinity of sets of latent orthogonal components to a common form. This result was first derived more than 25 years ago by Cohen and Kappauf (14), who designated it matrix R in their work. Their result is equivalent to the proof that a projection matrix obtained by multiplying any set of latent orthogonal components by its transpose is invariant for the infinity of sets of latent components. Koenderink and van Doorn (13) refer to the work of Cohen (14, 15) as the “only noteworthy development” in color theory since Schrödinger's (16) work in 1920. They go on to say, “It [matrix R] is an invariant, complete description, the ‘holy grail’ of colorimetry!” (13, p. 73). Our Fig. 1C is identical to matrix R. Our model is unique in applying the projection matrix to the cube root of the reflectance spectra; except for this critical step, Burns et al. (17) would have priority. In any event, Cohen and Kappauf demonstrate that when the reflectance spectrum of a color sample is multiplied by the projection matrix, the spectrum is partitioned into a fundamental part (representing information “seen” by the organism) and Wyszecki's (18) metameric black part (representing the part that is invisible to the organism).
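The invariance described above can be checked numerically. The following is a minimal sketch, not the authors' code: it uses Gaussian curves as hypothetical stand-ins for the cone sensitivity curves (the peak positions, the width, the von Kries gains, and the 3 by 3 mixing matrix are all illustrative assumptions) and confirms that the projection matrix built from the orthogonalized curves is unchanged when the curves are rescaled or linearly recombined.

```python
import numpy as np

# Wavelength grid matching the paper's sampling: 400..700 nm in 1 nm steps (301 points).
wavelengths = np.arange(400, 701)

def toy_cones(peaks=(565.0, 540.0, 440.0), width=40.0):
    """Gaussian stand-ins for L, M, S cone curves (NOT the Stockman-Sharpe data)."""
    return np.stack(
        [np.exp(-0.5 * ((wavelengths - p) / width) ** 2) for p in peaks], axis=1
    )

def projection_matrix(receptors):
    """Orthogonalize the receptor curves by SVD and return U @ U.T (301 x 301)."""
    u, _, _ = np.linalg.svd(receptors, full_matrices=False)
    return u @ u.T

R = toy_cones()                              # shape (301, 3)
von_kries = np.diag([0.9, 0.5, 0.2])         # hypothetical per-cone adaptation gains
mixing = np.array([[1.0, 0.3, 0.0],
                   [0.2, 1.0, 0.1],
                   [0.0, 0.4, 1.0]])         # arbitrary invertible 3 x 3 transform

P = projection_matrix(R)
P_adapted = projection_matrix(R @ von_kries)
P_mixed = projection_matrix(R @ mixing)

# Same projection matrix for every linearly related set of curves.
print(np.allclose(P, P_adapted), np.allclose(P, P_mixed))
# A projection matrix is symmetric and idempotent.
print(np.allclose(P, P.T), np.allclose(P @ P, P))
```

The orthonormal matrix U itself does change from one set of curves to the next; only the product of U with its transpose stays fixed, which is why the projection matrix serves as the common form.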
The fourth component we label as the cube root transformation. The usefulness of the cube root transformation arises from the observation that the appropriate geometry for the physical representation of color is conical, as described by Schrödinger (16) and Koenderink (19), whereas the appropriate geometry for the perceptual representation of color, as in the Munsell system, is cylindrical (2). Thus, for example, Burns et al. (17) found a conical structure for non-cube-rooted reflectance spectra of Munsell chips; in contrast, Romney (2) found a cylinder for the cube-rooted reflectance spectra of Munsell chips. Viewed physically (non-cube-rooted), hue circles of equal chroma form a cone that expands with increasing lightness. Viewed perceptually (cube-rooted), the same hue circles of equal chroma form a cylinder of equal-size circles with increasing lightness. When representing a cone in a 3D coordinate system, a cube root transformation is required to obtain a cylinder in the same coordinate system. We do not know how the neural system arrives at such a transformation; however, we use it to transform the physical cone representation into a perceptual cylinder representation.
The fifth and final component in the model is to produce a multidimensional scaling representation of the 1269 Munsell chips and the wavelength envelope of the projected reflectance spectra in perceptual space. The singular value decomposition (SVD) is a mathematical procedure for decomposing a matrix into three parts: an orthonormal matrix representing row vectors, a diagonal matrix of singular values, and an orthonormal matrix representing column vectors. In the case of reflectance spectra, SVD provides a convenient way of representing both the location of the reflectance spectra (row vectors) and the wavelength locations (column vectors) in a common space of three dimensions. Examples of representing the structure of the Munsell color chips in such a space may be found in refs. 20 and 21.
Model for Calculating Estimates of Color Appearance
The model departs from tradition in that it requires calculations to condition the state of the cone receptors before responding to visual stimuli. It may appear odd to have a model in which the responses of the receptors are initially transformed to orthogonal latent components by SVD and subsequently converted into an invariant projection matrix before interacting with the stimuli arriving at the retina from the visual scene. However, this is an essential element of the model, for two very important reasons.
The first reason for beginning with receptors is that this reflects the order of events in the real world. Adaptation of the retina to the intensity and spectral composition of the illumination precedes normal visual functioning including color perception. Rinner and Gegenfurtner (22) have shown that three components of adaptation can be identified by their temporal characteristics. They attribute the two slow components, with half-lives of 20 s and 40–70 ms, common to both appearance and discrimination, to photoreceptor adaptation; the third component, devoted exclusively to color appearance with a half-life faster than 10 ms and based on multiplicative spatial interactions, is attributed partly to cortical computations.
The second reason for beginning with receptors is that most aspects of vision, such as edge detection and color vision, rely on some kind of comparison among the responses of many receptors. This means that color coding depends on how responses of many cells are combined and not on the response of individual cells. In a recent study of neural coding, Jacobs et al. (23) point out that individual receptors by themselves do not carry much information, but together as a population, they do. Individual receptors are incapable of detecting the wavelengths of photons; in fact, when individual cones are stimulated by flashes of various monochromatic lights, the stimulation elicits seemingly random color names (24). By 1970, it was firmly established that neurons in the lateral geniculate nucleus (LGN), which receive input from the retinal ganglion cells, were both chromatically and spatially opponent (25–27). This means that ganglion cells contain color information that derives from many contiguous receptor cells constituting a center-surround receptive field. A reanalysis (28) of the response curves derived from responses of 147 LGN neurons to 12 spectral lights spanning the visual spectrum collected by DeValois et al. (26) demonstrates that they clearly contain color-coded information. Each LGN neuron contains information about the wavelength composition of light stimulating the receptive field of a specific ganglion cell. Information about wavelength composition over the whole of the receptive field is derived and computed from comparisons among numerous photoreceptor cells. In effect, the model looks at the output code produced by many neurons working as a system rather than beginning with quantum catch counts of isolated cone types.
We turn now to the implementation of the model for calculating estimates of color appearance. The model is general and applies to any human observer with known receptor curves and any set of measured reflectance spectra. An overview of the model is illustrated as a diagram in Fig. 1.
To obtain empirical predictions, we need to focus on a specific human observer and a specific input dataset of surface reflectance spectra. The receptor data on our selected observer are denoted as a matrix, R_{301×3}, and consist of the Stockman and Sharpe (4) cone sensitivity curves measured from 400 nm to 700 nm shown as solid lines in Fig. 1A. The input data consist of reflectance spectra measured in percentage reflectance (scaled 0 to 1) at each nm from 400 nm to 700 nm for 1269 Munsell color samples from the 1976 matte edition. These data and their source are described by Kohonen et al. (29) and are represented by the matrix A_{1269×301} in Eq. 3. In Fig. 1D we show a sample of 40 reflectance spectra representing highly colored chips of all hues in the color circle. The relatively smooth curves with only a single peak or valley are characteristic of most surface reflectance spectra.
We present the model in standard matrix notation (30) as four equations. To facilitate replication, we follow each equation with the actual code we used in Mathematica (31) to obtain our results. The data for the observer (R in Eq. 1) and Munsell reflectance data (A in Eq. 3) are publicly available from sources cited in the previous paragraph. The first component of the model is redundancy reduction and is indicated in the diagram as the transition between Fig. 1 A and B; it is calculated by using SVD as indicated in Eq. 1. We show two sets of curves, where the solid lines represent Stockman and Sharpe (4) normalized curves and the dotted lines represent Wald's (12) nonnormalized curves, to illustrate two sets of latent components related by a linear transformation. The second component of the model is shown in Fig. 1B and consists of the normalized matrix, U, from Eq. 1. Any of the infinite number of sets of latent components that are linear transformations of the Stockman and Sharpe receptor curves could be represented as an element of this component. The third component is the projection matrix shown in Fig. 1C and obtained by multiplication of the U matrix by its transpose as calculated with Eq. 2. The fourth component of the model is the cube root transformation that has been applied to a sample of each hue from the Munsell chips in Fig. 1D. Eq. 3 contains two operations: the first takes the elementwise cube root of all the Munsell spectra; the second projects the cube-rooted spectra through the projection matrix. This is a critical step that separates each reflectance spectrum into a fundamental part that can be located in perceptual color space and a small residual part that is the metameric black part not detected by the receptors. It is important to note that the reflectance spectra reside in the Stockman and Sharpe cone receptor space.
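Read from the description above, Eqs. 1 to 3 amount to an SVD of the receptor matrix (R = U D V^{T}), a projection matrix P = U U^{T}, and a projection of the cube-rooted spectra (S = A^{1/3} P). The sketch below follows those three steps in Python with NumPy rather than Mathematica. It is an illustration under stated assumptions, not the authors' code: the receptor curves and reflectance spectra are synthetic stand-ins, and the symbols D, V, and P are our own labels.

```python
import numpy as np

wavelengths = np.arange(400, 701)                 # 301 samples, 1 nm apart

# Toy receptor matrix R (301 x 3): Gaussian stand-ins, not the Stockman-Sharpe curves.
R = np.stack(
    [np.exp(-0.5 * ((wavelengths - p) / 40.0) ** 2) for p in (565.0, 540.0, 440.0)],
    axis=1,
)

# Toy reflectance matrix A (n_chips x 301): smooth single-peaked spectra in (0, 1).
rng = np.random.default_rng(0)
n_chips = 50
centers = rng.uniform(420.0, 680.0, size=n_chips)
heights = rng.uniform(0.2, 0.9, size=n_chips)
A = 0.05 + heights[:, None] * np.exp(
    -0.5 * ((wavelengths[None, :] - centers[:, None]) / 60.0) ** 2
)

# Eq. 1 (as described): SVD of the receptor curves; U holds the orthonormal latent components.
U, singular_values, Vt = np.linalg.svd(R, full_matrices=False)

# Eq. 2 (as described): projection matrix from U and its transpose.
P = U @ U.T

# Eq. 3 (as described): elementwise cube root, then projection.
A_cuberoot = np.cbrt(A)
S = A_cuberoot @ P                # fundamental part of each spectrum
residual = A_cuberoot - S         # metameric-black part of each spectrum

# The residual is invisible to the receptors: its product with R is numerically zero.
print(np.abs(residual @ R).max())
# The fundamental parts span only three dimensions.
print(np.linalg.matrix_rank(S, tol=1e-8))
```

Swapping in the published cone curves for R and the measured Munsell spectra for A would be the natural next step; the sketch makes no claim about what those data would produce.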
The fifth component of the model is the multidimensional scaling of the projected matrix shown as output in Fig. 1E and calculated by the SVD in Eq. 4. Note that every spectrum in matrix S is a linear combination of the vectors in matrix W^{T}, which in turn are a linear transformation of matrix U.
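The relationship stated in this paragraph can be illustrated with a second SVD. The sketch below is a hedged illustration on synthetic data (Gaussian receptor curves and random smooth spectra are assumptions, as is our choice to fold the singular values into M); it checks that the rows of S are linear combinations of the rows of W^{T} and that W^{T} is a 3 by 3 linear transformation of U.

```python
import numpy as np

wavelengths = np.arange(400, 701)
rng = np.random.default_rng(1)

# Toy stand-ins (assumptions): Gaussian receptor curves and smooth random reflectances.
R = np.stack([np.exp(-0.5 * ((wavelengths - p) / 40.0) ** 2)
              for p in (565.0, 540.0, 440.0)], axis=1)
centers = rng.uniform(420.0, 680.0, size=60)
heights = rng.uniform(0.2, 0.9, size=60)
A = 0.05 + heights[:, None] * np.exp(
    -0.5 * ((wavelengths[None, :] - centers[:, None]) / 60.0) ** 2)

U = np.linalg.svd(R, full_matrices=False)[0]
S = np.cbrt(A) @ (U @ U.T)        # projected cube-rooted spectra

# SVD of S, truncated to the three non-negligible dimensions.
left, sing, right_t = np.linalg.svd(S, full_matrices=False)
M = left[:, :3] * sing[:3]        # chip coordinates; scaling by singular values is our choice
Wt = right_t[:3, :]               # wavelength locations, one row per dimension

# Every row of S is a linear combination of the rows of Wt.
print(np.allclose(M @ Wt, S))

# Wt is a 3 x 3 linear transformation (here an orthogonal one) of U.
T = Wt @ U
print(np.allclose(T @ U.T, Wt), np.allclose(T @ T.T, np.eye(3)))
```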
Results
Matrix M of Eq. 4 contains the coordinates representing estimates of the locations of the 1269 color chips in a 3D perceptual color space. The first dimension represents lightness (value in Munsell, L* in CIE L*a*b*). The second and third dimensions represent chromaticity and are shown in Fig. 2B. For comparison, we have rotated the CIE L*a*b* coordinates to matrix M and plotted a* and b* in Fig. 2A. Because the first dimension of M correlates 0.9992 with L*, we do not show a plot. The second and third coordinates of matrix W^{T} of Eq. 4, which represent spectral wavelength locations, are also plotted in Fig. 2B. The relative scale of the chip locations and the wavelength location curves is arbitrary; however, comparisons of angles are valid.
A critical test of how well the model actually predicts the location of the Munsell chips in human perceptual color space is to compare the coordinates in the M matrix of Eq. 4 with the CIE L*a*b* (2) color space coordinates calculated by Kohonen et al. (29). The Stewart–Love redundancy index (32) between the model output, including the first component, and the CIE L*a*b* coordinates is 0.994, probably within experimental error. The Stewart–Love index is a multivariate measure analogous to a squared correlation coefficient and measures the amount of variance accounted for among the variables. The CIE L*a*b* color space is internationally recognized as the closest approximation to human perceptual color appearance available. Kohonen et al. (29) calculated the CIE L*a*b* measures by using an observer equivalent (a linear transform) to the Stockman and Sharpe receptor curves but also included a standard illuminant, whereas our model assumes a flat illuminant. It may be noted that the model chip locations are very similar (Stewart–Love index = 0.987) to a physical model computed without an observer (21), indicating that humans perceive the color of objects based primarily on their reflectance spectra. The importance of the cube root transformation is illustrated by the fact that the Burns et al. (17) model lacking the cube root transformation (otherwise similar to our model) has a Stewart–Love index of only 0.925 compared with the CIE L*a*b* color space.
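For readers unfamiliar with the index, the sketch below computes one common form of the Stewart–Love redundancy index: the mean squared multiple correlation of each variable in one set when regressed on the other set. Whether the authors used exactly this variant, and which set they treated as the predictor, is an assumption; the data are random placeholders, not Munsell or CIE L*a*b* coordinates.

```python
import numpy as np

def redundancy_index(X, Y):
    """Mean squared multiple correlation of each column of Y regressed on X.

    One common form of the Stewart-Love redundancy index; the exact variant
    used in the paper is an assumption here.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    coef, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)   # least-squares fit of Y on X
    resid = Yc - Xc @ coef
    r_squared = 1.0 - (resid ** 2).sum(axis=0) / (Yc ** 2).sum(axis=0)
    return float(r_squared.mean())

rng = np.random.default_rng(0)
model_coords = rng.normal(size=(200, 3))              # hypothetical model output
rotation = np.linalg.qr(rng.normal(size=(3, 3)))[0]   # random orthogonal 3 x 3 matrix
reference = model_coords @ rotation + 0.05 * rng.normal(size=(200, 3))

# A rotated copy with a little noise scores close to, but below, 1.
print(redundancy_index(model_coords, reference))
# A noise-free rotated copy is fully redundant with the original.
print(redundancy_index(model_coords, model_coords @ rotation))
```

Because the index is built from regressions, it is insensitive to rotation and rescaling of either coordinate set, which suits a comparison in which the orientation of the model axes is arbitrary.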
The question arises as to the interpretation of the wavelength location curve in Fig. 2B. Recall that the cone receptor curves derived from CMF experiments are carried out with monochromatic lights so that the wavelength locations represent monochromatic spectral lights. The sensitivity of the human cones to monochromatic spectral lights is proportional to how far the wavelength location curve is from the origin (Fig. 2B Center). Thus, in Fig. 2B human cones are most sensitive to spectral lights in the red, green, and blue segments of the spectrum indicated by the three prominent lobes in the figure. Human sensitivities are low for spectral lights in the cyan and yellow regions and zero in the magenta region because there are no spectral lights in this region to respond to. A surface reflectance spectrum reflects some light from every wavelength and is thus composed of mixtures of all monochromatic spectral lights; this accounts for why humans see an uninterrupted circle of colors, unlike the colors of natural spectral lights that do not close because there are no purple spectral lights.
The model is also capable of computing the estimated latent components of dichromats as was done by Buchsbaum and Gottschalk (10) (see their Figs. 6 and 7). The shapes of the latent components for the three types of dichromats shown in Fig. 3 E–G mirror almost exactly those of Buchsbaum and Gottschalk. The virtually identical shapes of the three dichromat curves in our results and theirs are striking because they used Vos–Walraven (33) logged cone sensitivity curves and we use Stockman and Sharpe (4) linear cone sensitivity curves. The components match down to the detail of the more sharply curved protanope component compared with the deuteranope component despite being computed on different datasets measured on different scales. This illustrates the efficiency and generality of the orthogonal transformation.
Although Buchsbaum and Gottschalk computed the latent components, they did not apply any calculations to reflectance spectra. Our model goes well beyond Buchsbaum and Gottschalk by applying the model to reflectance spectra to calculate all three dichromat types and to plot the wavelength locations as lines in model perceptual space in Fig. 3 A–C. These locations correspond exactly with the observed outcome axes obtained on real subjects by Farnsworth (3) when he developed the Farnsworth–Munsell 100-hue and dichotomous tests for color vision. The result of modeling the tritan case is shown in Fig. 3C, in which all colors are estimated to collapse to a line going from Munsell 10 Blue-Green on the Left to Munsell 10 Red on the Right. Alpern et al. (34) presented research on a unilateral tritanope who had normal color vision in one eye while the other lacked a short wavelength-sensitive cone. They found two isochromes, at 485 nm and 660 nm, in which spectral lights matched in the two eyes. These are shown in Fig. 3C as black filled circles and are located on both the normal locations and the tritan wavelength locations line. Alpern et al. reported that the matching lights, in the trichromatic eye, were predominantly red, white, or blue. We substitute gray for white in representing these colors in Fig. 3C. They go on to report that mixtures of the isochromes appear purple to the normal eye but, in proper proportions, white to the dichromatic eye. They note that Grassmann's additivity law is grossly wrong for dichoptic matches and conclude that there are exactly three “functionally independent, essentially nonlinear central codes for color perception, and that these codes are different from these suggested in existing theories of color perception” (34, p. 683). We might conjecture that these are our model output dimensions of matrix W^{T}.
As an aside we might suggest the potential usefulness of considering reflectance spectra of objects as the basis of a science of human color vision rather than artificial monochromatic lights. Reflectance spectra have an advantage of invariance for color-matching experiments that is lacking for monochromatic spectral lights in that two surfaces with identical reflectance spectra will appear identical to any observer in any light, whereas mixtures of spectral lights differ among observers such as, to take an extreme example, normal versus tritanope observers as noted by Alpern et al. above.
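One way to see why a missing cone collapses the chromatic plane to a line is to count dimensions: with only two receptor curves there are only two latent components, so the projected spectra span two dimensions, one of which carries lightness. The sketch below illustrates this counting argument on synthetic data. The Gaussian receptor curves, the random spectra, and the modeling of each dichromat type by simply deleting one column are simplifying assumptions, not the authors' procedure.

```python
import numpy as np

wavelengths = np.arange(400, 701)                 # 400..700 nm, 301 samples
rng = np.random.default_rng(2)

# Toy stand-ins (assumptions): columns of R are L, M, S Gaussian receptor curves.
R = np.stack([np.exp(-0.5 * ((wavelengths - p) / 40.0) ** 2)
              for p in (565.0, 540.0, 440.0)], axis=1)
centers = rng.uniform(420.0, 680.0, size=60)
heights = rng.uniform(0.2, 0.9, size=60)
A = 0.05 + heights[:, None] * np.exp(
    -0.5 * ((wavelengths[None, :] - centers[:, None]) / 60.0) ** 2)

def projected_rank(receptors):
    """Rank of the cube-rooted spectra after projection onto the receptor subspace."""
    U = np.linalg.svd(receptors, full_matrices=False)[0]
    S = np.cbrt(A) @ (U @ U.T)
    return int(np.linalg.matrix_rank(S, tol=1e-8))

ranks = {
    "trichromat": projected_rank(R),
    "protan (no L)": projected_rank(R[:, [1, 2]]),
    "deutan (no M)": projected_rank(R[:, [0, 2]]),
    "tritan (no S)": projected_rank(R[:, [0, 1]]),
}
print(ranks)
```

The direction of the remaining chromatic line depends on which receptor curves are left, which this toy setup cannot predict for real observers.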
Discussion
Further examination of the shape of the wavelength locations of normal trichromats, as shown in Fig. 2B, reveals that the three prominent lobes of the locations correspond almost exactly to the prime colors used widely in industry. As mentioned above, we interpret these lobes as regions of maximum wavelength sensitivity for the human color vision system. This unanticipated outcome is consistent with Thornton's (35) prime colors. Prime colors have been shown (36) to produce maximum perceived light for humans with minimum electrical power and optimal color rendering qualities when used in fluorescent lights. The interpretation of the lobes as maximum sensitivity areas provides a functional explanation for why red, green, and blue are the unique additive colors. This result resolves the disjunction between technological knowledge of color reproduction and academic color theory that arises from the incompatibility between the locations of human cone sensitivity curves and the locations of the red, green, and blue phosphors used in products such as color monitors, color television, color film, and digital cameras. Our model answers the question Hunt (37) raises in his second chapter of why it is impossible to use human cone sensitivity curves directly to produce any satisfactory modern color reproduction system. It also follows that the subtractive colors as used, for example, in dyes and jet-printer inks that act as filters would be the complements of the additive colors, namely, cyan, magenta, and yellow as shown in Fig. 2B. These facts link the model to the voluminous literature on color reproduction, prime colors, spectral sharpening (38), and other research topics bridging academic and technological color research. They also account for the ease of using prime colors as receptor locations to transform Munsell reflectance spectra into perceptual color space (39).
It may be possible to interpret the model dimensions as shown in Fig. 3D as consisting of a luminosity dimension and two chromaticity dimensions. The interpretation of the second and third dimensions as opponent processes might be questioned because they change shape drastically with rotation, which is arbitrary, and are never observed to act separately. Research that is consistent with the absence of cardinal axes would include the fact that simple negative afterimages are always complements of the stimulus color (40) with no evidence for special axes. The model could account for the exact complementarity of the afterimages on the basis of adaptation that reverses the polarity of the second and third dimensions regardless of rotational orientation. The fact that there is no interocular transfer between the eyes of negative afterimages might be taken as evidence that the effect takes place in the retina (41).
Is there any plausible model of how a calculation analogous to SVD could be carried out to obtain orthogonal dimensions? A possible model for such an operation is provided by Usui et al. (42), who reconstructed the Munsell color space from the reflectance spectra of the chips by using a five-layer neural network. They did not take into account the cone sensitivity curves, nor did they apply the cube root transformation. The calculations of the neural network, which are equivalent to SVD, resulted in a 3D coordinate structure virtually identical to that obtained on the same spectra by using SVD (21).
We have presented a model of color coding that provides a functional explanation for how humans perceive the color of objects in a perceptual space that closely matches CIE L*a*b* specifications. We demonstrated how humans perceive color when they completely lack one of the cone receptors as is the case for dichromats. The model also provides useful insights into how to bridge the current gap between color reproduction technology and current academic theory. Last, the model provides a functional explanation of why additive and subtractive colors are constrained to unique locations.
Acknowledgments
We thank Donald G. Saari for pointing out that the transformation of the volume of a cone into that of a cylinder is accomplished with a cube root. C.C.C. was supported in part by National Science Council of Taiwan Grant 972918I007004.
Footnotes
 ^{1}To whom correspondence should be addressed. Email: akromney{at}uci.edu

Author contributions: A.K.R. and C.C.C. designed research; A.K.R. and C.C.C. performed research; A.K.R. and C.C.C. analyzed data; and A.K.R. wrote the paper.

The authors declare no conflict of interest.
References
1. Munsell Color Company
2. Wyszecki G, Stiles WS
7. Barlow HB
10. Buchsbaum G, Gottschalk A
11. Stiles WS, Burch JM
12. Wald G
13. Koenderink JJ, van Doorn AJ (eds Sommer G, Zeevi YY)
15. Cohen JB
16. Schrödinger E (ed MacAdam DL)
18. Wyszecki G
23. Jacobs AL, et al.
27. Wiesel TN, Hubel DH
28. Romney AK, Indow T, D'Andrade RG
30. Strang G
31. Wolfram S
34. Alpern M, Kitahara K, Krantz DH
36. Brill MH, Finlayson GG, Hubel PM, Thornton WA
37. Hunt RWG
39. Romney AK, Fulton JT
41. McCollough C
Article Classifications
 Biological Sciences
 Psychology