# Monkeys choose as if maximizing utility compatible with basic principles of revealed preference theory

Edited by Charles R. Gallistel, Rutgers University, Piscataway, NJ, and approved January 23, 2017 (received for review July 21, 2016)

## Significance

Revealed preference theory scrutinizes utility maximization based on tradeoffs between goods. This notion concerns the transition from biological rewards (necessary for survival) to tradable economic goods (beneficial for welfare and evolutionary fitness). However, these assumptions have never been tested empirically in species closely related to humans, as would be necessary to infer a general biological mechanism. In this experiment, rhesus monkeys repeatedly chose between bundles of two goods. Their choice frequencies conformed to curves of equal choice frequency (indifference curves) and satisfied crucial consistency and axiomatic tests involving out-of-sample prediction from modeled indifference curves, transitivity, and axiomatic change of option set size. In satisfying stringent theoretical criteria, the data suggest the existence of well-structured preferences consistent with utility maximization.

## Abstract

Revealed preference theory provides axiomatic tools for assessing whether individuals make observable choices “as if” they are maximizing an underlying utility function. The theory evokes a tradeoff between goods whereby individuals improve themselves by trading one good for another good to obtain the best combination. Preferences revealed in these choices are modeled as curves of equal choice (indifference curves) and reflect an underlying process of optimization. These notions have far-reaching applications in consumer choice theory and impact the welfare of human and animal populations. However, they lack the empirical implementation in animals that would be required to establish a common biological basis. In a design using basic features of revealed preference theory, we measured in rhesus monkeys the frequency of repeated choices between bundles of two liquids. For various liquids, the animals’ choices were compatible with the notion of giving up a quantity of one good to gain one unit of another good while maintaining choice indifference, thereby implementing the concept of marginal rate of substitution. The indifference maps consisted of nonoverlapping, linear, convex, and occasionally concave curves with typically negative, but also sometimes positive, slopes depending on bundle composition. Out-of-sample predictions using homothetic polynomials validated the indifference curves. The animals’ preferences were internally consistent in satisfying transitivity. Change of option set size demonstrated choice optimality and satisfied the Weak Axiom of Revealed Preference (WARP). These data are consistent with a version of revealed preference theory in which preferences are stochastic; the monkeys behaved “as if” they had well-structured preferences and maximized utility.

To function properly, the body acquires particular substances contained in objects that are conceptualized as rewards in biology and goods in economics. Even the simplest drinks and foods contain multiple constituents such as amino acids, fats, and carbohydrates and attributes such as taste, color, and temperature. Water has taste and temperature. Beer has famously hundreds of components produced by fermentation. Sandwiches are composed of such constituents as bread, meat, and cheese. Components that can be varied individually may become tradable goods. For a balanced diet, the ancient farmer goes to the market and trades 5 lb of potatoes, of which he has plenty, against 1 lb of meat, of which he has little. Thus, considering biological rewards as multicomponent objects marks the transition to tradable economic goods. Revealed preference theory achieves exactly that: Each reward constitutes a bundle of tradable goods and is formally a vector.

In trading, one gives up some quantity of one good to obtain one unit of the other good. As the farmer gives up the minimal amount of potatoes for that 1 lb of meat, he expresses his preference for the two goods. In trying to obtain the most preferable combination of potatoes and meat, the farmer can be viewed as aiming to maximize the utility of the bundle. Utility is a numerical representation of preferences and a central tool for representing goodness in economics; utility maximization is a crucial mechanism in the quest for individual welfare and evolutionary fitness. However, neither utility nor preference can be measured physically; both need to be inferred from observable behavioral choices. The inference is valid only when assuming that preferences exist and are represented by an internal utility function. Then we may test empirically whether decision makers choose and reveal their preferences “as if” they had such internal preferences and utility function that would allow them to obtain the best possible good.

The notion of maximizing utility is conceptualized in revealed preference theory (1, 2), which invokes the multicomponent nature of objects to axiomatize preferences in the tradeoff between goods. The concept was initially targeted to markets and demand functions and constitutes one of the most elegant behavioral tools for assessing the implied process of maximization (3), thus laying the foundation for consumer choice theory, pricing, consumption, subsidies, and rationing (4, 5). Revealed preference theory formalizes the use of choices to connect the theory to observations: Decision makers should always choose the best bundle out of the set of available bundles. Crucially, the choice should be consistent with an underlying transitive preference, and the preference shown in the choice of pairs of bundles should extend itself to smaller and larger sets of choice options; this extension is the essence of Arrow’s Weak Axiom of Revealed Preference (WARP) (6). In always choosing the best option, irrespective of what else is available, decision makers act “as if” they had well-ordered, ranked preferences that give structure to impulses and errors as opposed to unstructured, ad hoc inclinations toward momentarily available options.

Using basic notions of revealed preference theory, we aimed to establish behavioral tools for testing utility maximization on reward neurons of rhesus monkeys. Previous behavioral studies based on revealed preference theory have been conducted on rodents (7, 8). Monkeys have superior cognitive abilities, show sophisticated behaviors, and are suitable for neuronal recordings with stringent sensory and movement controls. Their choice of skewed gambles follows third-order stochastic dominance and prediction from empirical utility functions (9), and their amygdala neurons code internally represented values across multistep reward-saving behavior (10). Important for trading, monkeys exchange tokens for food (11, 12) and smartly trade stolen items (13). The proper statistical analysis of neuronal data requires the use of repeated trials; in these repeated trials, animals typically choose the best option most frequently rather than choosing it always. This situation seems at odds with standard presentations of revealed preference theory that assume that a maximal-utility choice will be made with certainty on each occasion. Because of this requirement, the behavior of our monkeys can only be understood as involving a degree of randomness in the choice made on an individual trial. Indeed, whether subjects always choose from among the best bundles, always choose the best bundle, or merely tend to choose the best bundle has been an issue in economics for a long time. In this sense, we tested the implications not of revealed preference theory in the classic sense but of the class of stochastic choice models in which there exists a transitive ordering of all possible bundles of goods (or utility function) with the property that one bundle will be chosen more often than some other bundle when the choice set consists only of these two bundles, if and only if the first bundle is ranked more highly by the ordering. 
We can define a relation ≻ between bundles such that a ≻ b means that bundle a is more likely to be chosen than bundle b when the choice set is {a, b}, and this relation should satisfy the usual axioms of revealed preference. We then can test the empirical validity of these axioms using our observations of choice frequencies on the part of our monkeys when a given choice is presented repeatedly. The noisy process is well captured by psychophysical methods that model choice probability as a fitted function to empirically measured choice frequency (14). Note that theories of stochastic choice belonging to this class include such familiar theories as those of Luce and McFadden (15–17), but we did not test the more detailed predictions of such specific theories, preferring to focus on the common predictions of a broader class of stochastic choice theories. Also, the a priori assumption of existing utility in these theories seemed counterintuitive to our goal of identifying preferences and utility maximization regardless of the form their representation might take.

The current experiment investigated choices between bundles composed of two liquids with specific quantities. We tested the tradeoff between the two liquids by setting one liquid to a specific quantity and psychophysically varying the quantity of the other liquid until the animal chose this bundle with the same frequency as an unaltered reference bundle. The equal frequency indicated choice indifference and suggested an equal preference for each bundle. Repeating this test with systematically changed quantities of the two liquids of a given bundle resulted in a curve of indifference points (IPs), and several such curves set at different liquid quantities resulted in a 2D indifference map. Out-of-sample predictions used curviparallel homothetic polynomials that were fitted to all IPs of whole indifference maps to test the validity of individual indifference curves (ICs). We performed two crucial tests of revealed preference, namely transitivity for demonstrating choice consistency and extension to option sets of different sizes for satisfying the Weak Axiom of Revealed Preference (WARP) (6). The data suggest that rhesus monkeys made noisy choices “as if” they aimed overall for the best option irrespective of what else was on offer, as is compatible with the basic principles of revealed preference theory.

## Results

### Design.

In the standard two-option test, rhesus monkeys chose between two bundles with a single arm movement by touching a specific stimulus pair on a touch-sensitive computer monitor (Fig. 1*A* and *SI Methods*). The action required to obtain any of the chosen bundles consisted of a constant, single arm-reaching movement. Each bundle contained the same two liquid goods with independently set quantities (liquid A along the *y* axis, and liquid B along the *x* axis); the quantity of each was indicated by the respective vertical position of a bar within a rectangle (Fig. 1*B*). Liquid B was delivered 500 ms after liquid A to reduce taste interactions; this delay conceivably discounted, and was an integral and constant part of, the subjective value of any liquid B. One of the two bundles was the reference bundle, which was composed of preset quantities of liquids A and B. The other, the variable bundle, had a specifically set quantity of liquid B and a variable quantity of liquid A (or vice versa). We assumed from previous experience that the animals had positive, monotonic, nonsaturating internal value functions (“more is better”); they chose more liquid over less liquid. Only satiety and disfavored liquids would lead to violations of positive slope and value monotonicity. Thus, the reference bundle and the variable bundle defined the option set; the option set differed in every trial when the variable bundle changed during psychophysical testing.

A true tradeoff between bundle goods is possible only if preference for the bundles remains unchanged; we approached an unchanged preference by fixing the two goods of the reference bundle to a constant value. To test the tradeoff, we increased one good of the variable bundle by 0.1 mL and psychophysically varied the quantity of the other good of the variable bundle while the animal chose between the new variable bundle and the unchanged reference bundle (Fig. 1*C*). When too much liquid A had to be given up to obtain one additional unit of liquid B within the variable bundle, the animal forewent the tradeoff and chose the unchanged reference bundle; when the tradeoff required no or little loss of liquid A, the animal chose the variable bundle. An intermediate reduction of liquid A was met with equal choice probability (*P* = 0.5 each bundle) (Fig. 1*C*, red). Systematic variation of bundle settings resulted in a series of IPs that conformed to an IC; each bundle on that IC had the same utility (Fig. 1 *D* and *E*). The tradeoff apparent in the decreasing ICs demonstrates that the animals considered both bundle goods and did not simply follow, and maximize, one good, a process that would have resulted in strictly vertical or horizontal ICs (“lexicographic preferences”). By setting the reference bundle to specific liquid quantities, we obtained indifference maps with three to five ICs (Fig. 1*E*, red, orange, blue, and brown curves).
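The psychophysical step described above can be sketched as follows: fit a sigmoid to the frequency of choosing the variable bundle and read off the quantity at which choice probability is 0.5. This is a minimal sketch, not the authors' procedure. It uses a Weibull function (the family the paper reports for its IP fits), but the parameterization, the coarse grid search, and all data values are assumptions made for illustration.

```python
import math

def weibull_cdf(x, scale, shape):
    """Weibull cumulative distribution, used here as a psychometric function."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def fit_weibull(quantities, choice_freqs):
    """Coarse grid search for the (scale, shape) pair minimizing squared error."""
    best = None
    for i in range(1, 201):          # scale: 0.005 ... 1.000 mL
        scale = i * 0.005
        for j in range(1, 81):       # shape: 0.25 ... 20.0
            shape = j * 0.25
            err = sum((weibull_cdf(q, scale, shape) - f) ** 2
                      for q, f in zip(quantities, choice_freqs))
            if best is None or err < best[0]:
                best = (err, scale, shape)
    return best[1], best[2]

def indifference_point(scale, shape):
    """Quantity at which the fitted choice probability equals 0.5."""
    return scale * math.log(2.0) ** (1.0 / shape)

# Hypothetical data: quantity of liquid A (mL) in the variable bundle and the
# fraction of trials on which the variable bundle was chosen over the reference.
quantities = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
choice_freqs = [0.02, 0.10, 0.35, 0.70, 0.92, 0.99]

scale, shape = fit_weibull(quantities, choice_freqs)
ip = indifference_point(scale, shape)
print(f"estimated IP: {ip:.3f} mL of liquid A")
```

A maximum-likelihood fit on trial counts would be the more usual choice in practice; squared error on frequencies keeps the sketch short.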

We tested two rhesus monkeys during most days of the week for several months using a constant, full-reward range. We established a total of 921 IPs in psychophysical tests for a large variety of bundles (660 IPs), out-of-sample predictions using homothetic fits (228 IPs), and axiomatic tests involving three bundles (33 IPs); consistency tests with transitivity used 702 existing IPs set onto fitted ICs.

### ICs and Maps.

As a first step we established valid IPs, ICs, and indifference maps of bundles that captured the multicomponent nature of rewards as schematized in Fig. 1 *C*–*E*. The bundles contained specific quantities of two goods (Fig. 2): a good common to all bundles (blackcurrant) and one of several other goods, namely grape juice, strawberry juice, water, blackcurrant juice itself, apple squash, lemon juice, liquid yogurt, saline (NaCl), and combinations with monosodium glutamate (MSG) and inosine monophosphate (IMP). The IPs were rather precise in both animals, as judged from small 95% confidence intervals (CIs) of Weibull fits (0.019–0.05 mL) and small SEMs of repeatedly Weibull-estimated IPs (0.003–0.007 mL) (Fig. S1). For these bundles, we estimated 660 IPs and established 38 and 15 ICs for 11 and four indifference maps in monkeys A and B, respectively (3–15 IPs per IC, 40–50 IPs per indifference map).

Estimated IPs for four basic bundle types are shown as colored dots in Fig. 2 *A*–*D*. These bundles combined blackcurrant juice with grape juice, strawberry juice, water, or blackcurrant juice itself. The animals were indifferent between bundles indicated by dots of the same color. Second-degree (quadratic) polynomials and hyperbolas provided significantly better fits to individual ICs than first-degree (linear) polynomials (*P* < 0.05, Tukey–Kramer after *P* < 0.0001; one-way ANOVA on *R*^{2}s). Quadratic polynomial fits showed *R*^{2}s of 0.80–0.97 (Dataset S1*A*), which differed insignificantly from hyperbolas (*P* > 0.15); all further analyses used quadratic polynomials for simplicity. The polynomial fits showed the typical decrement on the *y* axis and increment on the *x* axis when giving up a quantity of good A to gain one unit of good B while maintaining choice indifference between the reference and variable bundles (Fig. 2 *A*–*D*). None of the two to five ICs of each indifference map crossed each other, and even their CIs rarely touched each other (Fig. 2 *E*–*H* and Fig. S2 *A*–*D*).
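The comparison of linear and quadratic fits to a single IC can be illustrated with a small least-squares sketch. The IP values below are invented for illustration, the helper names (`polyfit`, `predict`, `r_squared`) are hypothetical, and the sketch compares only *R*^{2} for one curve; it does not reproduce the ANOVA across ICs reported in the text.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first) via normal equations."""
    n = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    r = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            r[row] -= factor * r[col]
    # Back substitution
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (r[i] - s) / A[i][i]
    return coeffs

def predict(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def r_squared(xs, ys, coeffs):
    """Coefficient of determination of the fit."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - predict(coeffs, x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot

# Hypothetical IPs on one IC (mL): liquid B on the x axis, liquid A on the y axis.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
ys = [0.61, 0.39, 0.25, 0.12, 0.03, 0.01]

linear = polyfit(xs, ys, 1)
quadratic = polyfit(xs, ys, 2)
r2_linear = r_squared(xs, ys, linear)
r2_quadratic = r_squared(xs, ys, quadratic)
print(f"R2 linear = {r2_linear:.3f}, R2 quadratic = {r2_quadratic:.3f}")
```

Because the linear model is nested in the quadratic one, the quadratic *R*^{2} can never be lower on the same points; whether the gain is meaningful is what a test across many ICs addresses.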

The most relevant quadratic polynomial parameters are curvature and slope (currency). The curvatures of the ICs of the four basic bundle types were largely linear or slightly convex, suggesting similar exchange rates between the bundle goods along the whole curve (Fig. 2 *A*–*H* and Fig. S2 *A*–*D*). Apparently, at any point along the curve, the value gain in one bundle good compensated in a similar way for the value loss in the other bundle good, suggesting that the two bundle goods were substitutes. The IC slope demonstrated how much the animal gave up to obtain one unit of the other good and thus indicated the relative value (currency) of the two bundle goods. The steep, approximately −60° slope for the (blackcurrant, grape) bundle suggested that the animal gave up about twice as much blackcurrant juice for one unit of grape juice at choice indifference; thus, blackcurrant juice seemed less valuable to the animal than grape juice (Fig. 2 *A* and *E* and Fig. S2*B*). By contrast, the more symmetric, approximately −45° slope for the (blackcurrant, strawberry) bundle indicated similar valuation of the two juices (Fig. 2 *B* and *F* and Fig. S2*B*), and the approximately −30° slope for the (blackcurrant, water) bundle demonstrated that blackcurrant juice had a higher value than water (Fig. 2 *C* and *G* and Fig. S2*C*). Bundles containing only blackcurrant juice showed approximately symmetric ICs (Fig. 2 *D* and *H* and Fig. S2*D*), confirming the reliability of the animals’ choices. Thus, the ICs of the four basic bundle types showed well-characterized parameters captured by quadratic polynomials.
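The slope angles quoted above can be read as exchange rates through the tangent of the angle. This reading assumes that both axes are drawn on the same scale, which the text does not state explicitly.

```latex
\[
\mathrm{MRS} = -\frac{dy}{dx} = \tan|\theta|, \qquad
\tan 60^\circ \approx 1.73, \quad
\tan 45^\circ = 1, \quad
\tan 30^\circ \approx 0.58
\]
```

Under that assumption, a slope of about −60° corresponds to giving up roughly 1.7 units of blackcurrant juice per unit of grape juice (the "about twice" in the text), −45° to a one-for-one exchange with strawberry juice, and −30° to roughly 0.6 units per unit of water.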

Variations in the bundle components affected the curvatures of the ICs. MSG and IMP are known taste enhancers. Indeed, combining these taste enhancers with bundles of blackcurrant and grape juice consistently showed that choice indifference required lower quantities of enhanced juices than unenhanced juices, resulting in convex ICs (Fig. 2 *I* and *J*). Thus, the combination of goods (plotted at IC center) had higher value than the simple addition of singular goods (plotted at IC axes). These choices suggest synergistic gain from goods complementing each other. By contrast, bundles of blackcurrant juice and apple squash showed concave ICs, which indicated that choice indifference required higher quantities of combined than singular goods and demonstrated anti-synergistic effects of juice combinations (Fig. 2*K*). Thus, nonlinearities in estimated ICs revealed the interaction of bundle goods.

Reward-specific satiety may reduce reward value. To induce partial satiety, we administered substantial quantities (100–175 mL) of grape juice before testing (*SI Methods* and Fig. S2 *E* and *F*). As shown with 48 IPs, both animals gave up disproportionately less blackcurrant juice for increasing quantities of grape juice on which they were sated, thus flattening the ICs (Fig. 2*L*) compared with the unsated monkeys’ choices involving grape juice (Fig. 2 *A* and *E* and Fig. S2*A*). Apparently, at choice indifference, the animal sated on grape juice was willing to forego disproportionately less of the other juice to obtain higher quantities of grape juice. Thus, the ICs documented the reduced value of the specific reward on which the animal was sated, resembling the anti-synergistic effects of unfavorable bundle combinations (Fig. 2*K*).

Although bundles composed of goods with positive value presented negative slopes in ICs, bundles containing unfavorable goods were associated with positive slopes. Although the animals chose lemon juice, yogurt, and saline in bundle combinations with blackcurrant juice, choice indifference required additional, not lower, quantities of blackcurrant juice (Fig. 2 *M*–*O*). For example, an additional 0.1 mL of blackcurrant juice was required to maintain choice indifference when lemon juice was increased from 0.2 to 0.3 mL (Fig. 2*M*). These positive IC slopes indicated that choices of less favorable goods required compensation by a favorable good. The positive IC slope suggested the unfavorable nature of a single good (Fig. 2 *M*–*O*), whereas the concave curvature but negative slope of the IC demonstrated an unfavorable combination of the goods (Fig. 2 *K* and *L*).

Our bundles of two goods contrasted with the standard options containing only one good that are routinely used in neurophysiological experiments (18–24). To control for undue stimulus and choice biases, we tested choices between single-good bundles positioned at the axes of the 2D map. Using the same two-component stimuli shown in Fig. 1*B*, we kept liquid B at 0 mL and measured choices between different quantities of liquid A (variation along *y* axis) or kept liquid A at 0 mL while varying liquid B (variation along *x* axis). The measured behavioral choices integrated well into the ICs obtained with the full bundles of two nonzero outcomes, as shown by IPs between bundles set close to the same IC and different choice frequencies for bundles set across different ICs (colored dots along axes in Fig. 2 *A*–*D*). Thus, the visual presentation and choices of two-good choice options (bundles) did not seem to generate undue biases.

Taken together, the distinct linear, convex, and concave ICs with negative and positive slopes were characteristic for bundles with specific components. The orderly, nonoverlapping ICs reflecting systematic tradeoff between goods suggested that the rhesus monkeys behaved “as if” they used the multicomponent nature of rewards to obtain the best available option. These data laid the necessary ground for testing basic principles of revealed preference theory.

### Marginal Rate of Substitution.

Our procedure implemented the notion of marginal rate of substitution (MRS) by determining how much of one liquid the animal gave up to obtain one additional unit (usually 0.1 mL) of the other liquid (Fig. 1*D*). The MRS marks the transition from the multicomponent nature of rewards to the tradeoff between goods. All described IC characteristics were compatible with this notion. In the scheme of Fig. 1*D*, the animal initially gave up 0.3 mL (from 0.6 mL to 0.3 mL) of liquid A to gain 0.1 mL of liquid B, indicating an MRS of 3.0. The next step showed an MRS of 2.0. The MRS is defined as the negative of the slope of an IC at a given IP. For the best-fitting quadratic polynomial (*y* = *ax*^{2} + *bx* + *c*), the MRS is the negative first derivative, namely −d*y*/d*x* = −2*ax* − *b*, where *a* denotes the degree of deviation from linearity, called “curvature” or “elasticity,” and *b* denotes the value relationship between the two goods, called “currency.” Dataset S1 *A* and *B* shows the individual MRSs from all three fitting models and the average MRSs from the best-fitting polynomials together with their coefficients.
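The two ways of obtaining an MRS described above, from a single tradeoff step and from the fitted quadratic, can be written out as a short sketch. The function names and the coefficient values are hypothetical; only the 0.3 mL for 0.1 mL step is taken from the text.

```python
def mrs_discrete(a_given_up, b_gained):
    """MRS from one tradeoff step: quantity of liquid A given up per unit of liquid B gained."""
    return a_given_up / b_gained

def mrs_quadratic(a, b, x):
    """MRS of the IC y = a*x**2 + b*x + c at x, i.e. the negative first derivative."""
    return -(2.0 * a * x + b)

# Step from the scheme in the text: liquid A drops from 0.6 mL to 0.3 mL
# while liquid B increases by 0.1 mL.
first_step = mrs_discrete(0.6 - 0.3, 0.1)

# Hypothetical linear IC (a = 0, currency b = -2): the MRS is 2 at every x.
linear_mrs = mrs_quadratic(0.0, -2.0, 0.25)

# Hypothetical convex IC (a = 4, b = -4): the MRS falls as liquid B increases.
convex_mrs = [mrs_quadratic(4.0, -4.0, x) for x in (0.0, 0.2, 0.4)]

print(round(first_step, 2), linear_mrs, [round(m, 2) for m in convex_mrs])
```

With a positive curvature coefficient the MRS declines along the curve, which is the diminishing rate of exchange that a convex IC expresses.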

For linear ICs, elasticity is nil, and the MRS equals the negative currency. The highest ICs for bundles of (blackcurrant juice, grape juice) in monkey A and (blackcurrant juice, water) in monkey B were linear (Fig. 2 *A* and *E* and Fig. S2*C*). The animals gave up about 0.2 mL of blackcurrant juice to gain 0.1 mL of grape juice and 0.4 mL of water, indicating respective MRSs close to 2 and 0.5, suggesting that the monkeys valued grape juice almost twice as much as blackcurrant juice and valued water half as much. The second highest ICs of bundles of blackcurrant juice combined with strawberry juice or water were also linear, with MRSs close to 1 (Fig. 2 *B*, *C*, *F*, and *G*) and 0.5 (Fig. S2*C*), indicating specific exchange values between blackcurrant juice and these two liquids. These choices conforming to well-aligned, linear ICs straightforwardly reflected the animals’ exchange value of bundle goods, thus demonstrating the sensitivity of the animals’ preferences.

An adequate description of the many nonlinear ICs shown in Fig. 2 and Fig. S2 *A*–*D* required all MRS function coefficients (Dataset S1 *A* and *B*). The elasticity-curvature coefficient was positive for the typically convex ICs and negative for the occasional concave curves (blackcurrant combined with apple squash, grape juice for sated monkeys, or saline) (Fig. 2 *K*, *L*, and *O*). The currency coefficient capturing value relationships between two bundle goods was negative for all decreasing ICs (positive MRS) and was positive for the occasional increasing curves (negative MRS; blackcurrant combined with lemon, yogurt, or saline) (Fig. 2 *M*–*O*). Thus, distinct MRS coefficients characterized the animals’ choices appropriately, attesting to the systematic nature of their choices.

### Out-of-Sample Prediction with the Homothetic Model.

The validity of the ICs for representing preferences can be tested by out-of-sample predictions. To this end, we established a homothetic model for each bundle type and used it as a sufficiently general prediction for IPs not used for constructing the model.

The homothetic model consisted of a map of curviparallel curves that were derived from a single, continuous, quadratic, polynomial function fitted to all IPs on all or selected ICs of the studied bundle type. The fitted homothetic functions, MRSs, and coefficients matched well those of individually fit polynomials (Fig. 3 *A*–*C*, Fig. S3, and Dataset S1*C*). Average homothetic fits were high, despite local deviations of ICs (*R*^{2} of 0.67–0.99, mean *R*^{2} of 0.85) (Dataset S1*C*). The slopes of ICs were negative for bundle types containing only appetitive goods and were positive for bundle types containing one unfavorable good in both homothetic (Fig. S3) and individual polynomial fits (Fig. 2 and Fig. S2 *A–D*). Curvature was most convex (highest elasticity) in both homothetic and individual fits for bundles of (blackcurrant + MSG, grape + IMP), (blackcurrant + MSG, blackcurrant + IMP) and (blackcurrant, lemon) (Fig. 2 *I*, *J*, and *M* and Fig. S3 *E*, *F*, and *I*). Curvature was concave (negative elasticity) for bundles of (blackcurrant, apple squash) and (blackcurrant, grape juice for sated animals) (Fig. 2 *K* and *L* and Fig. S3 *G* and *H*). Thus, the good match between homothetic and individually fit polynomials confirmed the validity of the ICs that seemed to represent well the revealed preferences of the animals; the homothetic models seemed to provide valid predictors for out-of-sample tests.

For testing out-of-sample prediction, we estimated 228 new IPs. The animals chose between a reference bundle anchored to the *y* intercept of the tested homothetic IC and a variable bundle whose liquid B was set to a specific quantity (*x* axis) and whose liquid A was varied psychophysically in steps of 0.1 mL (*y* axis) (*SI Methods*). We tested prediction in two ways (Fig. 3*D*). For out-of-points prediction, we positioned 68 of the 228 new IPs at points not used for establishing the homothetic model. We tested five bundle types involving blackcurrant juice combined with grape juice, strawberry juice, water, MSG, IMP, and apple squash. Both animals preferred bundles above the ICs with *P* = 0.75 ± 0.1 (mean ± SEM), dispreferred bundles below the homothetic ICs with *P* = 0.3 ± 0.1 (Fig. 3*E*, *Right*), and showed indifference at the new IPs that deviated from the homothetic polynomials by 0.02 ± 0.01 mL (Fig. 3*F*). For out-of-curves prediction, we positioned 160 of the 228 new IPs on curves not used for establishing the homothetic model. We tested the four basic bundle types involving blackcurrant juice combined with grape juice, strawberry juice, water, or blackcurrant juice itself. We deleted one IC from the construction of the homothetic model in a bootstrap-like procedure and then placed test points onto homothetic ICs that were inferred from the remaining maps and most closely matched the deleted ICs (SI Methods). The new IPs deviated from the homothetic ICs by 0.04 ± 0.02 mL (Fig. 3*G*).

Taken together, the homothetic polynomials predicted well the animals’ choices and the newly established IPs. The precision of 0.02 mL and 0.04 mL was well within the 95% CIs of Weibull fits to IPs (0.02–0.05 mL) and polynomial fits to individual ICs (0.03–0.06 mL) (Fig. 2). These data supported the validity of the ICs and suggested that the animals behaved “as if” they had systematic and consistent preferences for the tested bundles.

### Transitivity.

To test the consistency of the animals’ choices, we investigated transitivity along a hierarchical chain of binary preference relationships of bundles a–d: d ≻ c ≻ b ≻ a. We considered direct, nontransitive preference as directly revealed and considered transitive preference as indirectly revealed, given the satisfaction of WARP shown below (6). To satisfy transitivity, if bundle c is directly revealed as preferred to bundle b (c ≽ b), and bundle b is directly revealed as preferred to bundle a (b ≽ a), then bundle c should be directly revealed as preferred to bundle a (c ≽ a, transitive closure). Or, more generally for any chain of bundles, if bundle c is indirectly revealed as preferred to bundle a, then bundle a cannot be directly revealed as preferred to bundle c (a ⊁ c). Or, more formally, to satisfy transitivity, a decision maker cannot prefer an option that is two ranks or more below that of the alternative option (if a_{1} ≽ a_{2} ... and ... a_{n-1} ≽ a_{n}, then a_{1} ≽ a_{n} and a_{n} ⊁ a_{1}, for *n* > 2).
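The transitivity condition stated above can be checked mechanically once preferences are read off choice frequencies. The following is a minimal sketch under stated assumptions: a bundle counts as directly revealed preferred when it is chosen on more than half of the trials, the choice frequencies are invented, and the function names are hypothetical. It flags violations per closure rather than counting violating trials as the experiment does.

```python
def choice_freq(freq, x, y):
    """Frequency of choosing x from the set {x, y}; each pair is stored once."""
    return freq[(x, y)] if (x, y) in freq else 1.0 - freq[(y, x)]

def revealed_preferred(freq, x, y):
    """Assumed criterion: x ≻ y if x is chosen on more than half of the trials."""
    return choice_freq(freq, x, y) > 0.5

def check_chain(chain, freq):
    """chain lists bundles from highest to lowest rank, e.g. d, c, b, a.

    Returns whether every adjacent link is directly revealed, plus the list of
    transitivity violations: pairs (low, high) at least two ranks apart in which
    the lower-ranked bundle is directly revealed as preferred.
    """
    links_ok = all(revealed_preferred(freq, chain[i], chain[i + 1])
                   for i in range(len(chain) - 1))
    violations = [(chain[j], chain[i])
                  for i in range(len(chain))
                  for j in range(i + 2, len(chain))
                  if revealed_preferred(freq, chain[j], chain[i])]
    return links_ok, violations

# Hypothetical choice frequencies for the chain d ≻ c ≻ b ≻ a.
chain = ["d", "c", "b", "a"]
freq = {("d", "c"): 0.85, ("c", "b"): 0.80, ("b", "a"): 0.90,   # direct links
        ("d", "b"): 0.95, ("c", "a"): 0.93,                     # short closures
        ("d", "a"): 0.98}                                       # long closure

links_ok, violations = check_chain(chain, freq)
print(links_ok, violations)
```

Replacing the long-closure frequency with a value below 0.5 would make the pair `("a", "d")` appear in the violation list.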

We conducted three transitivity tests with bundles of blackcurrant juice combined with grape juice, strawberry juice, and water. We set bundles onto 702 points of quadratic polynomial fits to ICs (278 points for the first two tests, 424 points for the third test). For each bundle, we inferred direct preference relationships as d ≻ c ≻ b ≻ a from their position on four ICs. Each transitivity test involved about 40 trials repeated up to five times and therefore assessed violations of directly revealed preference and of transitivity in terms of choice frequency rather than as singular, one-shot choice.

For the first transitivity test, we aligned bundles according to increasing ICs but monotonically decreasing physical quantity of one bundle good (increasing ICs require increasing physical quantity of the other bundle good); we confirmed empirically all preferences inferred from IC ranks. Thus, we tested for d ≻ b as “short” (high-end) transitive closure for d ≻ c and c ≻ b, and we tested for c ≻ a as short (low-end) transitive closure for c ≻ b and b ≻ a. In six such tests, transitivity violations occurred on average in 0–13% of trials, which in five of six closures were equal or inferior to violations of directly revealed preference of 0–21% of trials (Fig. 4 *A*–*C* and Dataset S2*A*). These choices demonstrated consistent preference relationships rather than simple physical-monotonic quantity following.

For the second transitivity test, we aligned bundles across ICs according to the monotonically increasing physical quantity of both goods (assuming positive, monotonic, nonsaturating value functions, i.e., “more is better”) and confirmed all inferred preferences empirically. Thus, we tested for d ≻ a as “long” transitive closure for d ≻ c, c ≻ b, and b ≻ a; we tested for d ≻ b as short (high-end) transitive closure for d ≻ c and c ≻ b; we tested for c ≻ a as short (low-end) transitive closure for c ≻ b and b ≻ a. In 41 such transitivity tests, violations occurred on average in 0–6.25% of trials, which in 37 of 41 closures were equal or inferior to violations of directly revealed preference of 0–20.8% of trials (Fig. 4 *D*–*I* and Dataset S2 *B*–*G*). In showing that the animals’ choices followed physically-monotonically ordered quantities, these data suggested that the animals had a positive monotonic internal value function as a necessary condition for the third transitivity test.

For the third transitivity test, we aligned bundles across ICs according to the monotonically higher or lower physical quantity of one good. In the chain of assumed preferences d ≻ c ≻ b ≻ a, we tested empirically the central preference relationship c ≻ b but only inferred physically-monotonically the relationships d ≻ c and b ≻ a from the bundles’ alignment by physical quantity (Fig. 5*A*). Then we tested for d ≻ a as long transitive closure for physically-monotonically aligned d ≻ c, empirically tested c ≻ b, and physically-monotonically aligned b ≻ a. Formally, to satisfy this transitivity test, bundle a cannot be directly revealed as preferred to bundle d (a ⊁ d). Furthermore, using shorter chains, we tested for d ≻ b as short (high-end) transitive closure for physically aligned d ≻ c and empirically tested c ≻ b; we tested for c ≻ a as short (low-end) transitive closure for empirically tested c ≻ b and physically-monotonically aligned b ≻ a (these two short transitivity tests can be formalized in analogy to the long transitivity test). In 106 such tests, violations of long and short transitivity occurred on average in 0.4–8.5% of trials (Fig. 5, Figs. S4 and S5, and Dataset S3), generally below the percentage of violations of directly revealed preference (nontransitive) (2.4–17.1% of trials; see upward and downward inset histograms in Fig. 5 *E–M* and Fig. S4 *A–F*). We further quantified transitivity compliance in 86 of the 106 long and short transitivity tests with Afriat’s Critical Cost Efficiency Index developed for budget lines (25–27) and found high satisfaction suggested by indices of 0.83–1.0 (Table 1 and Dataset S3). The results from this more demanding third transitivity test confirmed the choice consistency suggested by the other two transitivity tests.

Taken together, the consistency of choices in these three transitivity tests satisfied a central condition for assuming well-structured preferences in rhesus monkeys and complied with basic notions of revealed preference theory.
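The comparison between violations of transitive closures and violations of directly revealed preference can be illustrated with a minimal sketch. The choice counts below are invented for illustration and are not taken from Dataset S2 or S3. Comparing each closure against the mean violation rate of the direct links in its chain is an assumption of this sketch, not a rule stated in the text.

```python
# Hypothetical counts: (inferred-preferred, other) -> (times preferred chosen, trials).
COUNTS = {
    ("b", "a"): (36, 40),
    ("c", "b"): (34, 40),
    ("d", "c"): (35, 40),
    ("c", "a"): (38, 40),  # short (low-end) closure
    ("d", "b"): (39, 40),  # short (high-end) closure
    ("d", "a"): (40, 40),  # long closure
}
ORDER = "abcd"  # inferred ranking a < b < c < d from IC position


def violation_pct(pair):
    """Percentage of trials in which the inferred-preferred bundle was not chosen."""
    chosen, trials = COUNTS[pair]
    return 100.0 * (trials - chosen) / trials


def chain_links(hi, lo):
    """Adjacent (direct) pairs linking lo up to hi, e.g. (d, b) -> (c, b), (d, c)."""
    i, j = ORDER.index(lo), ORDER.index(hi)
    return [(ORDER[k + 1], ORDER[k]) for k in range(i, j)]


def closure_report():
    """Map each closure to (closure violation %, mean direct violation %, closure <= direct)."""
    report = {}
    for hi, lo in COUNTS:
        links = chain_links(hi, lo)
        if len(links) < 2:
            continue  # a direct relation, not a transitive closure
        direct_mean = sum(violation_pct(p) for p in links) / len(links)
        closure = violation_pct((hi, lo))
        report[(hi, lo)] = (closure, direct_mean, closure <= direct_mean)
    return report


REPORT = closure_report()
for pair, (closure, direct, ok) in sorted(REPORT.items()):
    print(pair, closure, direct, ok)
```

Working with percentages of trials rather than single choices mirrors the frequency-based assessment described above.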

### Testing WARP with Three-Bundle Option Sets.

Arrow’s extension of the WARP states that if some elements are chosen out of a set Y and if the alternatives are narrowed to subset X but still contain some previously chosen elements, then no previously unchosen element of X becomes chosen, and no previously chosen element becomes unchosen (6). Equivalently, if any of the elements chosen from X are also chosen when the set of options is expanded to Y (that contains X as a subset), then all the elements chosen from X are among the elements chosen from Y. Specifically, if bundle x is preferred over bundle y and over bundle z in a set Y composed of bundles x, y, and z, then x should remain preferred if the option set is narrowed to subset X that includes only x and y. Equivalently, if x is preferred over y in an option subset X composed of x and y, and x is also preferred over z in an option subset composed of x and z, then x should remain preferred if the option set is expanded to Y to include x, y, and z. We tested these predictions in our animals after the animals had experienced the full reward range for several months.

First, we tested whether choices within three-bundle sets would be consistent with ICs established previously with two-bundle sets. To be consistent, an animal should choose three bundles located on the same IC with equal frequency (*P* = 0.33 each bundle); however, a bundle located above the IC should be chosen more frequently than the other two bundles (*P* > 0.33), and a bundle below the IC should be chosen less frequently (*P* < 0.33) (Fig. 6 *A*–*C*; blue dotted lines define the three-bundle option sets). We fixed two degenerated anchoring bundles to the respective *x* and *y* intercepts of polynomial ICs; we varied both liquids of the third, center bundle roughly orthogonal to the IC tangent around a previously untested intermediate point; then we measured the frequency of choosing the center bundle at each test point and estimated the IP (*P* = 0.33 each option) by Weibull fitting (Fig. 6 *D* and *E* and Fig. S6 *A* and *B*). In both animals, 33 newly estimated IPs for four different bundle types with convex, linear, and slightly concave ICs deviated from the polynomial-fitted ICs in the *y* axis by 0.04 ± 0.02 mL (mean ± SEM) and were situated inside their 95% CIs (Fig. 6 *F* and *G* and Fig. S6 *C* and *D*). The animals preferred options that were 0.1 mL above these IPs in both bundle liquids with *P* = 0.52 ± 0.025 (mean ± SEM; *P* < 6.4 × 10^{−13}; *n* = 1,495 tests; paired *t* test) and dispreferred options 0.1 mL below the IPs with *P* = 0.26 ± 0.011 (*P* < 0.07; *n* = 566 tests). Thus, the animals’ choices in the tested three-bundle sets corresponded well to the ICs established in two-bundle choices.

Given this consistency, we assessed compliance with Arrow’s WARP in 1,438 choices in the three-bundle set Y containing bundles x, y, and z and in the two-bundle subset X containing bundles x and y, using two different bundle settings of (blackcurrant, water) (Fig. 6*H* and Fig. S6*E*) (*SI Methods* for the formal description). We placed bundles y and z on a polynomial-fitted IC that was established previously with two-bundle sets, and we placed bundle x well above that IC. In newly assessed choices, bundle x was directly revealed as preferred to bundle y in the two-bundle subset X {x,y}; choices were indifferent between bundles y and z, suggesting that bundle x was indirectly revealed as preferred to bundle z (Fig. 6*I* and Fig. S6*F*). Importantly, bundle x was directly revealed as preferred to bundles y and z in the three-bundle set Y {x,y,z}; the frequency of choosing bundle x was higher than the frequency of choosing bundles y or z within that set. These choices demonstrated maintained preference for bundle x in the three-bundle set Y {x,y,z} and in two-bundle subset X {x,y} and thus satisfied Arrow’s WARP as a necessary condition for utility maximization models.

## SI Methods

### General.

The Home Office of the United Kingdom approved all experimental procedures. Two male, experiment-naive rhesus monkeys (*Macaca mulatta*) weighing 9.0 kg and 10.0 kg, respectively, were used in the experiment. The animals were habituated during several months to sitting relaxed in a primate chair (Crist Instruments) in the laboratory for a few hours each working day. They were trained in specific, computer-controlled behavioral tasks in which they contacted visual stimuli on a touchscreen (Elo). During the experiments, a single animal sat in the primate chair 30 cm away from a horizontally mounted touch-sensitive computer monitor. Custom-made software (MathWorks; Matlab) running on a Microsoft Windows XP computer controlled the behavior and collected, analyzed, and presented data online. A solenoid valve (SCB262C068; ASCO) controlled by a Windows computer delivered specific quantities of liquids. Matlab and a Microsoft SQL Server 2008 Database served for offline data analysis.

### Rewards and Stimuli.

Before the experiment, we estimated the animals’ preferences in binary choices between various liquids (blackcurrant juice, grape juice, strawberry juice, water, apple squash, lemon juice, liquid yogurt, saline, and blackcurrant juice or grape juice combined with MSG or IMP). Specific quantities of two of these liquids were paired and considered a bundle (Fig. 1 *A* and *B*). The quantity of liquid A was plotted along the *y* axis in the 2D plots of IPs, and the quantity of liquid B was plotted along the *x* axis (Fig. 1 *C*–*E*).

A pair of visual stimuli represented a bundle with specific quantities of two liquids (Fig. 1*B*). Each stimulus consisted of a vertical rectangle on a background. The color of the background indicated liquid A (blue, top) and B (green, bottom); the vertical position of a bar in each rectangle indicated the physical liquid quantity. Liquid A was mostly blackcurrant juice; liquid B could be any of the liquids used. Two stimulus pairs representing two different bundles served as choice options that appeared at pseudorandomly alternating left and right positions relative to the center of the computer monitor; each pair contained the same two liquids with independent quantities (Fig. 1*B*). Selected control tests comprised three bundles (Fig. 6 and Fig. S6).

### Task.

Each trial began when the animal contacted a centrally located touch-sensitive key for 1.0 s after a pseudorandom intertrial interval of 1.6 ± 0.25 s. Then the two stimulus pairs representing two bundles appeared on the computer monitor (Fig. 1*A*). After 2.0 s, two blue spots appeared as a GO stimulus underneath the two stimulus pairs; when the stimulus appeared, the animal released the touch key and touched one of the stimulus pairs within 2.0 s. We kept the required action constant at one arm movement. After a target hold time of 1.0 s, the blue spot underneath the chosen bundle turned green, and a white frame appeared around that bundle to provide feedback for successful selection; the blue spot underneath the unchosen bundle disappeared. Then the computer-controlled liquid solenoid valve delivered first liquid A and then liquid B of the chosen bundle at 1.0 and 1.5 s after the choice, respectively. Task training was initially restricted to one bundle type and was extended to other bundle types only when satisfactory behavioral performance was obtained.

Although the longer delay for liquid B compared with liquid A likely affected choices asymmetrically through different temporal discounting, all delays were constant and thus were incorporated into the IP and the IC. We chose this delay, rather than simultaneous delivery or pseudorandomly alternating single liquid delivery, to prevent more serious interactions between simultaneously delivered liquids and to avoid introducing risk. Reaching for a target before the appearance of the blue dots and key release during key touch or target-hold epochs were considered as errors and led directly to the intertrial interval without reward.

### Psychophysics for IPs.

A psychophysical procedure served to estimate indifference between choice options (Fig. 1*D*). At least five different quantities of one liquid were tested for every unit change (0.1 mL) of the other liquid. Each bundle pair was tested at least eight times on two pseudorandomly alternating left–right stimulus positions. Thus, establishing a single IP took at least 80 trials. Choices were considered as indifferent between the two bundles when choice probabilities ranged between 45 and 55%. The IP at choice probability of *P* = 0.5 for each option was estimated from fit with a Weibull function.
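The estimation of an IP from a psychometric function can be sketched as follows. The text does not specify the parametrization of the Weibull function or the fitting routine, so this sketch assumes the two-parameter cumulative form P(x) = 1 − exp(−(x/λ)^k) and a plain least-squares grid search; the data points are hypothetical (five quantities, 16 trials each).

```python
import math

# Hypothetical data: quantity of the varied liquid (mL) and the fraction of
# 16 trials (8 repetitions x 2 stimulus positions) in which the variable bundle was chosen.
XS = [0.2, 0.3, 0.4, 0.5, 0.6]
PS = [1 / 16, 4 / 16, 8 / 16, 13 / 16, 15 / 16]


def weibull(x, lam, k):
    """Cumulative Weibull function (assumed parametrization)."""
    return 1.0 - math.exp(-((x / lam) ** k))


def fit_weibull(xs, ps):
    """Least-squares grid search over scale lam (mL) and shape k."""
    best = None
    for i in range(1, 201):        # lam = 0.005 ... 1.0 mL
        lam = i * 0.005
        for j in range(1, 101):    # k = 0.2 ... 20
            k = j * 0.2
            sse = sum((weibull(x, lam, k) - p) ** 2 for x, p in zip(xs, ps))
            if best is None or sse < best[0]:
                best = (sse, lam, k)
    return best[1], best[2]


def indifference_point(lam, k):
    """Quantity at which the fitted choice probability equals 0.5."""
    return lam * math.log(2.0) ** (1.0 / k)


LAM, K = fit_weibull(XS, PS)
IP = indifference_point(LAM, K)
print(f"lambda={LAM:.3f} mL, k={K:.1f}, IP={IP:.3f} mL")
```

The closed-form expression for the IP follows from solving 1 − exp(−(x/λ)^k) = 0.5 for x. A gradient-based fit would serve equally well; the grid search only keeps the sketch free of external dependencies.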

### ICs.

To establish a new IC, we first determined the bulk currency between the two bundle liquids in initial “anchoring” choices. We positioned the two bundles respective to the two axes; in one bundle, we offered only liquid A by setting it to a specific nonzero value (along the *y* axis) and by setting liquid B to 0 (*x* = 0); in the other bundle, we offered only liquid B by setting liquid A to 0 (*y* = 0) and psychophysically varying liquid B along the *x* axis to determine the IP. When an IP could not be determined at the two axes, we kept the bundle at *x* = 0 unchanged and increased the quantity of the liquid in the other bundle above 0 in the *y* axis until we could perform full psychophysics on both sides of the IP. This process was necessary for bundles whose ICs would not touch *y* = 0 at any value of *x* (Fig. 2 *B*, *C*, *F*, *G*, *L*, and *M*–*O*).

Subsequently, we established a whole IC in systematic steps. We pseudorandomly designated one of the anchor bundles as the reference bundle. Then we designated a variable bundle by copying the reference bundle but modifying it as follows: We set its nonaxis liquid one unit (0.1 mL) higher and psychophysically varied the other liquid toward choice indifference between the two bundles (Fig. 1 *C* and *D*). In this way, we explicitly implemented the notion of MRS, namely how much liquid the animal was ready to give up in order to gain one unit (0.1 mL) of the other liquid. The reference and variable bundles were visually indistinguishable, apart from their quantity variations. We changed sampling direction in a balanced manner. To this end, we designated the other anchor bundle as the reference bundle and repeated the initial step to establish the variable bundle. Subsequently, we moved away from the *x* and *y* axes toward the opposite anchors in 0.1-mL steps. Thus, when going from left to right on the *x*–*y* plane of indifference maps, we increased liquid B by one unit (0.1 mL) and assessed psychophysically how much liquid A the animal gave up at choice indifference; when moving from right to left, we increased liquid A by one unit and assessed how much liquid B the animal would give up at choice indifference. After every step, we made the variable bundle the new reference bundle.
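The stepwise reading of the MRS from successive IPs can be illustrated with a short sketch. The IP coordinates are hypothetical, and the helper name `stepwise_mrs` is introduced here only for illustration.

```python
# Hypothetical IPs on one IC: (liquid B on x axis, liquid A on y axis), in mL,
# obtained by increasing liquid B in 0.1-mL steps from the y-axis anchor.
IPS = [(0.0, 0.60), (0.1, 0.52), (0.2, 0.45), (0.3, 0.39)]


def stepwise_mrs(ips):
    """Amount of liquid A given up per unit of liquid B gained, for each step."""
    ordered = sorted(ips)
    return [
        -(y1 - y0) / (x1 - x0)
        for (x0, y0), (x1, y1) in zip(ordered, ordered[1:])
    ]


MRS = stepwise_mrs(IPS)
print([round(m, 3) for m in MRS])
```

In this invented example the MRS declines from step to step, which corresponds to a convex IC; a constant MRS would correspond to a linear IC.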

For repeatedly testing an already established IC on the same day or on different days, we used the same procedure but started at a pseudorandom position on a given IC and advanced in a single 0.1-mL step; then we chose the next direction in a balanced manner and continued doing so until we hit both axes. For each bundle, we measured an average of 11 IPs for each IC, resulting in about 44 IPs for a whole indifference map of usually three to five curves. Higher ICs required more IPs than lower curves (each IP required about 80 trials, as explained above).

For validation against choices between options containing only one good, we used the same two-component stimuli and set the reference bundle to one of the axes while varying the variable bundle along that axis. Thus, we varied either liquid A while keeping liquid B constant at 0 mL (variation along the *y* axis), or we varied liquid B while keeping liquid A constant at 0 mL (variation along the *x* axis).

Three-bundle option sets served to investigate the validity of the ICs irrespective of sampling direction, test the consistency of preferences, and formally assess Arrow’s WARP (Fig. 6 and Fig. S6) (6). Following earlier suggestions (37), we presented two fixed anchoring bundles at the *x* and *y* axes and varied the third, center bundle; a choice probability between *P* = 0.28 and *P* = 0.38 for all three bundles was required to indicate indifference between these options.
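The indifference criterion for three-bundle sets amounts to a simple band check on choice frequencies. The sketch below uses hypothetical choice counts; the function name is introduced for illustration only.

```python
def three_option_indifferent(counts, lo=0.28, hi=0.38):
    """Return (is_indifferent, frequencies) for choice counts of three bundles."""
    total = sum(counts)
    freqs = [c / total for c in counts]
    return all(lo <= f <= hi for f in freqs), freqs


# Hypothetical counts for (y-axis anchor, center bundle, x-axis anchor).
print(three_option_indifferent([33, 35, 32]))  # all frequencies inside the band
print(three_option_indifferent([52, 26, 22]))  # first bundle clearly preferred
```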

The sequential transfer of the variable bundle to the new reference bundle might have led to accumulating errors. We aimed to compensate for this possibility by pseudorandomly alternating the starting positions, by the mix of anchored and nonanchored starting positions, and by alternating the measurement directions toward the opposite axis. The small errors from the repeated measures (Fig. S1) and the confirmatory results from the three-option choices (Fig. 6 *F* and *G* and Fig. S6 *C* and *D*) attested to the fidelity of the method.

### Satiety and Osmometry.

Performance of a behavioral task may be reduced when it is reinforced with a reward for which the animal is sated. A simple measure of task performance is the number of trials performed during a daily session. Satiety also may result in variations in choice behavior, as shown with grape juice (Fig. 2*L*). To obtain a potential marker for satiety, we collected saliva and measured its osmolality on both animals immediately before each test session, using an established procedure with a 5500 vapor pressure osmometer (Wescor) (38–40). Fig. S2 *E* and *F* shows that fluid consumption (blue) and task performance (number of trials performed, green) correlated well with osmolality over several weeks of training in both animals. The average Pearson correlation coefficient between task performance (total of 3,800 trials) and saliva osmolality during an average week was at least *ρ* = 0.74 for monkey A and 0.67 for monkey B. This correlation differed significantly from Gaussian randomness (*P* < 0.1; *t* test on Gaussian distribution of random values). Similar values were obtained for correlations between water consumed and osmolality (*ρ* = 0.72, *P* < 0.1). Thus, performance was better on days with high saliva osmolality (e.g., Tuesday–Friday after one weekend day with ad libitum water). Days with abnormally low osmolality (<80 mOsm/kg, e.g., on Mondays) generally resulted in inconsistent performance, and ICs obtained during these periods were discarded.

The measurement of saliva osmolality also allowed us to control for incidental satiety during standard IC tests. We avoided abnormally low osmolality in each animal and aimed for symmetric osmolality between liquids by randomized sampling in both directions that increased consumption of both liquids nearly simultaneously. This procedure may explain unchanged choice frequency and reproducible preferences within the controlled consumption range.

### Curve Fitting.

We used three functions to test fits to the measured individual ICs by the least mean squared error method (*P* < 0.05). The quadratic (second-degree) polynomial provided the best combination of good fit and simplicity for individual ICs (Dataset S1A).

• Linear (first-degree) polynomial:

*y* = *ax* + *b*, with *a* as currency and *b* as offset; MRS = −*dy*/*dx* = −*a*

• Quadratic (second-degree) polynomial:

*y* = *ax*^{2} + *bx* + *c*, with *a* as curvature or elasticity or complementarity, *b* as currency, and *c* as offset; MRS = −*dy*/*dx* = −(2*ax* + *b*)

• Hyperbolic function:

*d* = *ax* + *by* + *cxy*, with *c* as curvature or elasticity or complementarity and (*ab* + *cd*)/*b*^{2} as currency; MRS = (*ab* + *cd*)/(*b* + *cx*)^{2}
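The MRS expressions for the three functions can be checked numerically against a finite-difference slope. The sketch below uses hypothetical coefficients chosen only for illustration; for the hyperbolic function it solves *d* = *ax* + *by* + *cxy* for *y* = (*d* − *ax*)/(*b* + *cx*) before differentiating.

```python
def mrs_linear(a, b, x):
    """MRS = -dy/dx for y = a*x + b."""
    return -a


def mrs_quadratic(a, b, c, x):
    """MRS = -dy/dx for y = a*x**2 + b*x + c."""
    return -(2.0 * a * x + b)


def y_hyperbolic(a, b, c, d, x):
    """Solve d = a*x + b*y + c*x*y for y."""
    return (d - a * x) / (b + c * x)


def mrs_hyperbolic(a, b, c, d, x):
    """MRS = -dy/dx for the hyperbolic function."""
    return (a * b + c * d) / (b + c * x) ** 2


def numeric_mrs(f, x, h=1e-6):
    """Central finite-difference estimate of -dy/dx."""
    return -(f(x + h) - f(x - h)) / (2.0 * h)


# Hypothetical coefficients and test point (liquid B quantity in mL).
X0 = 0.3
QUAD = (0.5, -1.2, 0.8)      # a, b, c of the quadratic polynomial
HYP = (1.0, 2.0, 0.5, 1.0)   # a, b, c, d of the hyperbolic function

print(mrs_quadratic(*QUAD, X0), numeric_mrs(lambda x: QUAD[0] * x**2 + QUAD[1] * x + QUAD[2], X0))
print(mrs_hyperbolic(*HYP, X0), numeric_mrs(lambda x: y_hyperbolic(*HYP, x), X0))
```

At *x* = 0 the hyperbolic MRS reduces to (*ab* + *cd*)/*b*^{2}, the term identified as currency.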

### Homothetic Maps for Out-of-Sample Prediction.

We established a homothetic function as a single model that provided the best fit to all IPs of a given indifference map. The coefficients of the best-fitting quadratic polynomials (Dataset S1A) served as starting parameters for coefficient search. The coefficient *a* in the quadratic polynomial parametrizes the curvature of the function, which is also referred to as elasticity (degree of deviation from linearity) and reflects the complementarity of the two bundle goods (together with the curvature of an assumed utility function of the two bundle goods). Coefficient *b* defines the currency (value relationship between the goods). The range of this coefficient defined the homothetic search range in the free-fit model. We used the Matlab Global Optimization Toolbox to implement this coefficient search (Dataset S1C), maximizing the mean *R*^{2} for each individual curve by iterating 100–1,000 times through all ICs of a given indifference map.

For out-of-points predictions, we used the homothetic model to predict new IPs that had not been used for constructing the homothetic model. These IPs were located on ICs whose other IPs had been used for constructing the model. We performed the following 68 tests in animals A or B:

• (Blackcurrant, grape): 15 IPs, five ICs, 240 trials per IC, three IPs on highest IC

• (Blackcurrant, strawberry): 12 IPs, four ICs, 240 trials per IC, three IPs on highest IC

• (Blackcurrant, water): 17 IPs, four ICs, 336 trials per IC, four IPs on highest IC

• (Blackcurrant + MSG, grape + IMP): 15 IPs, five ICs, 240 trials per IC, three IPs on highest IC

• (Blackcurrant, apple squash): nine IPs, three ICs, 240 trials per IC, three IPs on highest IC.

For out-of-curves predictions, we used the homothetic model to predict new IPs on new ICs that had not been used for constructing the homothetic model. We removed all IPs belonging to one IC from the homothetic fitting procedure and constructed a reduced homothetic model. Then we inferred a homothetic curve from the coefficients of the reduced homothetic function that corresponded most closely to the omitted IC. We selected a test point on the inferred homothetic curve by taking a specific *x* coordinate (liquid B) and reading the corresponding *y* coordinate (liquid A) on the homothetic function. Then we estimated a new IP in choices between a reference bundle anchored to the *y* intercept of the tested homothetic IC and a variable bundle; the variable bundle had the *x* coordinate of the test point, but its *y* coordinate was obtained at choice indifference from Weibull fits to psychophysical variations of liquid A (Fig. 3*E*). The crucial step in the out-of-curve prediction consisted in comparing the new IP and the inferred homothetic curve, specifically in *y* coordinates while holding the *x* coordinate constant (Fig. 3*E*). In this way, we performed 160 tests, namely 10 points per homothetic curve, on all four curves of a given bundle type, on all four basic bundle types (blackcurrant juice vs. grape juice, strawberry juice, water, and blackcurrant juice itself).

### Formal Description of Arrow’s WARP.

The standard economic model of preference and choice emerged as an explanation of relationships between observed market behavior and market demand functions and is derived from the hypothesis that the preferences of the choosing agents are rational. Although the market context of the model focuses on choices from sets of alternatives constrained by prices and incomes, Arrow (6) and Richter (41, 42) generalized the model to include choices over arbitrary sets. Our experiments rest on the generalization.

Formally, preferences are represented by a binary relation. Let Z be a finite set of alternatives. A preference relation is a binary relation, R, on Z. The relation xRy can be interpreted as x is preferred or indifferent to y. If a concept of utility is invoked, it could be interpreted as x produces at least as much utility as does y, but the use of the utility concept involves assumptions in addition to those developed here. Although a wide range of concepts of preference are found in the literature, especially as related to cases where the data are noisy, the standard model assumes that preference relations satisfy axioms A1 and A2. We formulate in analogy to Arrow (6):

• (A1) for all x and y, xRy or yRx.

• (A2) for all x, y, and z, xRy and yRz imply xRz.

Observed choices are represented by a choice function. For any X ⊂ Z, a choice function C(X) maps a nonnull set X into a nonnull subset of X. For a binary relation R, we define C(X) = {x: x ∈ X, xRy for all y ∈ X}. If the choice function exists it is said to be derived from R. In different language, if such a relationship exists, the choice function is said to be rationalized by R (41).

The standard model connects preferences and choices with the following axiom:

• (A3) WARP: If X ⊂ Y and C(Y) ∩ X is nonnull, then C(X) = C(Y) ∩ X.

Axiom A3 can be given the following intuitive interpretation: If some elements are chosen out of a set Y, and if the range of alternatives is narrowed to X but still contains some previously chosen elements, then no previously unchosen element becomes chosen, and no previously chosen element becomes unchosen. Equivalently, if any of the elements chosen from X are also chosen when the set of options is expanded to a set Y that contains all of X, then all of the elements chosen from X are among the elements chosen from Y. Axiom A3 is the so-called WARP and characterizes the classical model as summarized by theorem 2 and theorem 3 of Arrow (6, pp. 124–125):

Theorem: Let Z be a finite set and let C(X) be a choice function defined for all X ⊂ Z. If R is any binary relation defined on Z satisfying axioms A1 and A2, and if C(X) is derived from R, then C(X) satisfies WARP. Furthermore if C(X) satisfies WARP, then there exists a binary relation on Z that satisfies axioms A1 and A2 that rationalizes C(X).
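The first part of the theorem can be illustrated on a small finite set. The sketch below derives a choice function from a complete, transitive relation R and checks axiom A3 for every pair X ⊂ Y. The utility numbers that generate R are hypothetical, and the second choice function is deliberately constructed to violate WARP.

```python
from itertools import combinations

Z = frozenset("xyz")
UTILITY = {"x": 2, "y": 1, "z": 1}  # hypothetical values, used only to generate R


def weakly_prefers(x, y):
    """xRy: x is preferred or indifferent to y (complete and transitive by construction)."""
    return UTILITY[x] >= UTILITY[y]


def derived_choice(options):
    """C(X) = {x in X : xRy for all y in X}."""
    return {x for x in options if all(weakly_prefers(x, y) for y in options)}


def violating_choice(options):
    """Same as derived_choice, except that y is chosen from {x, y}."""
    return {"y"} if options == frozenset("xy") else derived_choice(options)


def satisfies_warp(universe, choose):
    """Check A3: if X is a proper subset of Y and C(Y) & X is nonnull, then C(X) == C(Y) & X."""
    subsets = [
        frozenset(s)
        for r in range(1, len(universe) + 1)
        for s in combinations(sorted(universe), r)
    ]
    for big in subsets:
        for small in subsets:
            if small < big:  # proper subset
                overlap = choose(big) & small
                if overlap and choose(small) != overlap:
                    return False
    return True


print(satisfies_warp(Z, derived_choice))    # choice function derived from R
print(satisfies_warp(Z, violating_choice))  # x chosen from {x, y, z} but y from {x, y}
```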

The theorem summarizes the classical economic model as having two parts. First, preference is modeled as a (complete) relation between pairs of options that satisfies transitivity. Secondly, the relationship between preferences over pairs and choice over larger sets is summarized (abstracting from the noisy features of observations) by an optimization relationship in which the choice over the large set is the set of options each of which is at least as good as any other option.

Noisy choices are accommodated as follows: Define xRy to mean p(x,{x,y}) ≥ p(y,{x,y}). To satisfy WARP, xRy and xRz should imply that p(x,{x,y,z}) > p(y,{x,y,z}) and p(x,{x,y,z}) > p(z,{x,y,z}). Furthermore, R is transitive. By contrast, the Luce model is more specific and assumes that p(x,{x,y})/p(y,{x,y}) = p(x,{x,y,z})/p(y,{x,y,z}) (15).
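The difference between the stochastic WARP condition and the more specific Luce condition can be shown with a minimal sketch. The choice probabilities are hypothetical, and the sketch assumes that the two probabilities in each two-bundle set sum to 1 (no omitted trials).

```python
import math

# Hypothetical probabilities. P2[(x, y)] is p(x,{x,y}); P3[x] is p(x,{x,y,z}).
P2 = {("x", "y"): 0.8, ("x", "z"): 0.8}
P3 = {"x": 0.6, "y": 0.2, "z": 0.2}
P3_LUCE = {"x": 4 / 6, "y": 1 / 6, "z": 1 / 6}


def weakly_prefers(p2, x, y):
    """xRy iff p(x,{x,y}) >= p(y,{x,y}), with p(y,{x,y}) = 1 - p(x,{x,y})."""
    px = p2[(x, y)]
    return px >= 1.0 - px


def stochastic_warp(p2, p3, x, y, z):
    """xRy and xRz should imply that x is chosen most often in {x, y, z}."""
    if not (weakly_prefers(p2, x, y) and weakly_prefers(p2, x, z)):
        return True  # antecedent not met, nothing to check
    return p3[x] > p3[y] and p3[x] > p3[z]


def luce_ratio_holds(p2, p3, x, y, rel_tol=1e-6):
    """Luce condition: the ratio of choice probabilities is the same in both sets."""
    binary = p2[(x, y)] / (1.0 - p2[(x, y)])
    ternary = p3[x] / p3[y]
    return math.isclose(binary, ternary, rel_tol=rel_tol)


print(stochastic_warp(P2, P3, "x", "y", "z"), luce_ratio_holds(P2, P3, "x", "y"))
print(stochastic_warp(P2, P3_LUCE, "x", "y", "z"), luce_ratio_holds(P2, P3_LUCE, "x", "y"))
```

In the first example the ratio of choosing x over y drops from 4 in the two-bundle set to 3 in the three-bundle set, so the Luce condition fails although the stochastic WARP condition holds.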

## Discussion

This study investigated basic economic choice processes modeled on principles of revealed preference theory. Monkeys made repeated choices between bundles of two goods; in each choice they performed a constant single-arm movement. The maximal-utility choices were not made with certainty on each individual trial, as in standard presentations of revealed preference theory, but involved a degree of randomness by which the best option was chosen most frequently but not invariably. The quantities of the chosen rewards at choice indifference conformed to nonoverlapping ICs and indicated that the animals gave up some quantity of one bundle reward to gain one unit of the other bundle reward; these choices did not seem to reflect the maximization of only one good (lexicographic preferences). The curvature of the measured ICs quantitatively reflected the relative exchange values of the two bundle goods and was convex with complementary, synergistic goods, linear with substitutable goods, and concave with noncomplementary, anti-synergistic goods. The IC slopes were negative (positive MRS) with attractive goods and were positive (negative MRS) with unfavorable goods. Higher ICs arose from larger liquid quantities in the bundles. The validity of the ICs was confirmed by out-of-sample predictions from homothetic maps fitted to all ICs of a given indifference map. The characteristics of these ICs suggested that the animals made choices “as if” they understood the multicomponent nature of rewards and meaningfully managed the tradeoff between the different goods of each bundle; the tradeoff indicated continuous integration of utilities from different goods, thus providing the highest benefit from all available goods. The results from our transitivity tests attested to the consistency of these preferences. The changes between option sets of different sizes satisfied WARP as defined by Arrow (6). 
According to this fundamental principle of revealed preference theory, the monkeys behaved overall “as if” they maximized utility based on internal representations of preferences characterized by principles of substitution. In this sense, the ICs mark the transition from biological rewards, which are necessary for survival, to tradable economic goods, which are beneficial for welfare and evolutionary fitness.

### Empirical Testing Conditions.

Our task design implemented the MRS directly: The animal gave up some amount of one good to obtain one unit of the other good. This tradeoff requires maintained utility, as evidenced by choice indifference, to avoid losses or gains. Maintaining utility in choices can be achieved by using a constant reference option against which the changed bundle is compared. Such a simple design would be beneficial for interpreting later neuronal data. Even simpler would be to use only a single good as the reference option, but this design would have compromised the symmetry against the variable bundle. We implemented the tradeoff by increasing one good by one unit in the variable bundle and psychophysically determining the amount of the other good being given up at choice indifference. By contrast, previous rat experiments on revealed preference modeled the tradeoff by allowing the animal to distribute freely a limited number of lever presses to obtain two single-good options (8, 28, 29). Although this design involved simpler choice options, it was more complex because of the variable number of movements, which would require additional controls in neurophysiological investigations. Importantly, the two animal species showed similar tradeoff between goods across different options (8, 28, 29) and within a single choice option (current study), demonstrating independence from the particular mechanism eliciting these preferences. Future studies may explore other eliciting mechanisms to assess the generality of reward tradeoffs.

Our bundles with two goods contrasted with choice options containing only one good in neurophysiological studies (18–24). Our anchoring choices between bundles positioned at the axes of the 2D map conformed to these proven methods. Specifically, using the same two-component stimuli shown in Fig. 1*B*, we varied liquid A while keeping liquid B at 0 mL (variation along *y* axis) or varied liquid B while keeping liquid A at 0 mL (variation along the *x* axis). The measured behavioral choices between these single-good, degenerated bundles integrated well into the ICs obtained with the full bundles of two nonzero outcomes (Fig. 2 *A–D*, colored dots along the axes). Thus, the behavioral data obtained with two-good choice options (bundles) compared well with data from single-good options tested in the same animals. Further similarities between our bundle options and previously used single-good options are seen with higher choice frequencies for larger, more frequent, or subjectively higher valued rewards (18–24) and with satisfaction of transitivity in noisy choices (9, 24). Taken together, our use of choice options with two goods (bundles) did not seem to generate undue choice biases.

Previous monkey experiments investigated choices with single-good options and constant action requirement (21, 24). Such studies do not test tradeoff under the assumption of constant utility, which is inherent in the notion of ICs underlying revealed preference principles. A tradeoff intrinsically requires maintained utility; otherwise the exchange becomes a gain or loss and prevents the establishment of an IC of equally valued options. An easy way to maintain utility would be to use a constant reference bundle. That bundle defines the utility and serves as an alternate option to a bundle whose components are being varied relative to each other while titrating for indifference against the constant reference. By contrast, a simple reward increase or decrease in a single-good option against a constant reference option amounts to a gain or loss rather than maintained utility. Thus, the current design with two-good options is close to being minimal, apart from a single-good reference option at the price of option asymmetry.

The use of ICs to study utility maximization rests on the hypothesis of an internal value function that links the physical value of the liquids, as measured in milliliters, to the subjective value conceptualized as utility. Such value functions are assumed to be positive, monotonic, and nonasymptotic for money in humans but may show saturation and even nonmonotonic curvature with alimentary rewards. The continuous linear or convex ICs for our basic bundles are consistent with positive, monotonic value functions, whereas the concave ICs for goods to which the animal has been sated may reflect nonmonotonic value functions, and the positively sloped ICs suggest negative value functions for unfavorable goods.

### Profiles of ICs.

The monkeys’ noisy choices conformed to standard ICs with the four basic bundles that combined blackcurrant juice with grape juice, strawberry juice, water, and blackcurrant juice itself (Fig. 2 *A–H*). The lack of overlap of the ICs at liquid steps of 0.15–0.2 mL attested to the validity of the estimated ICs. The IC slope showed the value relationship between the two bundle goods (currency). The near-linear ICs of the four basic bundles suggested that their goods were almost substitutes. The relatively flat IC of the bundle (blackcurrant, water) suggested that water has a lower subjective per-unit value than blackcurrant juice. Apart from the bundles containing only blackcurrant juice, the slope of the ICs of the basic bundles was asymmetric and differed from the −45° diagonal line, suggesting that the animals valued the bundle juices differently.

The convexity of ICs (positive elasticity) increased when MSG and IMP were added to the juices. These substances are known taste enhancers in humans (30, 31). Both bundles of (blackcurrant, grape) and (blackcurrant, blackcurrant) to which MSG and IMP were added showed this complementary effect (Fig. 2 *I* and *J*), perhaps suggesting similar taste-enhancing effects in monkeys. Convex ICs also characterized the choices rats made between single-reward options of root beer and quinine solution that were linked as a bundle by common budget constraint (7, 28, 29). The comparable ICs demonstrate similarly well-structured preferences in the two species.

Concave ICs (negative elasticity) were rare. The animal initially gave up less blackcurrant juice to obtain one unit of apple squash and traded in more blackcurrant juice only to receive larger quantities of apple squash (Fig. 2*K*). Thus, we were able to elicit concave ICs that are compatible with the notion of antagonism (inverse synergy) between the goods combined in a bundle.

A second instance of concave ICs involved satiety. Although osmolality is a good predictor of task performance (Fig. S2*E*) and valuation of liquids (32), it does not distinguish between general and sensory-specific satiety. Such a distinction would be desirable for neurophysiological studies in monkeys and rats (33, 34) but has been reported only occasionally (35). By contrast, ICs may reveal sensory-specific satiety by becoming flatter and even concave. Substantial quantities of grape juice (100–175 mL) had such an effect when contrasted with ICs for grape juice in unsated animals (Fig. 2 *E* and *L*). The antagonism reflected in IC concavity may suggest that the animal was unwilling to give up precious unsated juice (blackcurrant) to obtain sated juice (grape) (Fig. 2*L*), thus indicating the low value of the sated juice.

For some of the studied bundles, ICs had positive slope and thus negative MRS. Such bundles contained lemon juice, yogurt, or saline (Fig. 2 *M*–*O*). The animal required increasing quantities of blackcurrant juice to accept more of these goods but nevertheless showed well-organized preferences, as evidenced by the curviparallel, nonoverlapping character of the ICs. The most likely interpretation was that such goods were unfavorable for the animal but were not entirely inconsumable; the animal simply required more of one good to compensate for accepting the other, unfavorable good.

Taken together, the ICs for the various bundles showed distinct and meaningful patterns of value relationships between two goods (currency) and specific curvatures (elasticity). These systematic and consistent variations of ICs suggested that the monkeys had specific preferences that were elicited by the choices.

### Out-of-Sample Validation.

Predictions from the homothetic models of the empirical indifference maps provided stringent tests for the validity of the ICs. The model provided numerical data for the two main parameters of ICs, currency (exchange rate between bundle goods) and complementarity (how well the goods fit together), both of which corroborated the characteristics of the ICs and maps apparent from the observed choices. We then used the homothetic model to predict IPs that were not used for establishing the model. To be valid, the model’s accuracy in predicting IPs should lie within the accuracy of Weibull IP fits and polynomial IC fits. Indeed, it did, suggesting that the empirically measured ICs reflected systematic and reproducible choices by the monkeys as a necessary condition for investigating revealed preferences.
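The logic of an out-of-sample check can be sketched in a few lines. The sketch below is illustrative only: it swaps in an ordinary least-squares line for the homothetic polynomial model, and the function names, the data layout `(x, y, ci_half_width)`, and the numbers are assumptions, not the authors' procedure or data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def leave_one_out_hits(ips):
    """For each IP, fit the line on the remaining IPs and report whether
    the prediction falls within that IP's CI half-width.

    ips: list of (x, y, ci_half_width) tuples (hypothetical layout).
    """
    hits = []
    for i, (x, y, half) in enumerate(ips):
        rest = ips[:i] + ips[i + 1:]
        a, b = fit_line([p[0] for p in rest], [p[1] for p in rest])
        hits.append(abs(a * x + b - y) <= half)
    return hits


# Hypothetical IPs (mL of good A, mL of good B, CI half-width in mL).
example_ips = [
    (0.0, 0.80, 0.02),
    (0.2, 0.60, 0.02),
    (0.4, 0.40, 0.02),
    (0.6, 0.20, 0.02),
    (0.8, 0.00, 0.02),
]
print(leave_one_out_hits(example_ips))
```

A held-out IP counts as predicted when the model's error is no larger than the measurement uncertainty of the IP itself, which mirrors the criterion stated in the text.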

### Transitivity.

Satisfaction of transitivity is a crucial condition for inferring consistent preferences. We placed bundles at specific IPs of established ICs and tested transitivity in repeated choices as frequency of correct closures. The animals’ behavior satisfied transitivity by showing low frequencies of dominated choices in three tests. First, transitivity satisfaction was observed when bundles with partly physically decreasing bundle components were aligned according to increasing ICs. This test ruled out explanations by simple physical quantity ordering. Second, transitivity satisfaction with bundles ranked according to physical monotonicity confirmed the assumption of a positive monotonic value function, a necessary condition for the third transitivity test. Third, transitivity satisfaction was observed with bundles that were arranged partly according to physically inferred preferences. This test allowed us to confirm transitivity satisfaction with the sensitive Afriat-like index that accounted for physical reward differences (25–27). Together, these transitivity satisfactions suggested that the ICs reflected consistent rank-ordering of the animals’ preferences.
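One common way to formalize transitivity for noisy choices is weak stochastic transitivity: if a is chosen over b and b over c in at least half of the trials, then a should be chosen over c in at least half of the trials. The following sketch checks that condition on pairwise choice frequencies. It is not the authors' closure test; the function name and the frequencies are hypothetical.

```python
from itertools import permutations


def transitivity_violations(p, bundles):
    """Return ordered triples (a, b, c) that violate weak stochastic
    transitivity: p[a,b] >= 0.5 and p[b,c] >= 0.5 but p[a,c] < 0.5.

    p[(x, y)] is the fraction of trials in which x was chosen over y.
    """
    return [
        (a, b, c)
        for a, b, c in permutations(bundles, 3)
        if p[(a, b)] >= 0.5 and p[(b, c)] >= 0.5 and p[(a, c)] < 0.5
    ]


def complete(freq):
    """Add the complementary frequency p[(y, x)] = 1 - p[(x, y)]."""
    p = dict(freq)
    for (x, y), f in freq.items():
        p[(y, x)] = 1.0 - f
    return p


# Hypothetical choice frequencies for three bundles on increasing ICs.
consistent = complete({("a", "b"): 0.85, ("b", "c"): 0.80, ("a", "c"): 0.90})
print(transitivity_violations(consistent, ["a", "b", "c"]))
```

An empty list means that no triple of bundles forms a preference cycle at the 0.5 criterion.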

### Independence of Option Set Size: WARP.

These tests derive from a theory of choice with two identifiable concepts: preference and optimization. Indifference maps reveal a theoretical relationship among bundles to which the animal is indifferent. Thus, when presented with a set of several feasible options to which the animal revealed noisy preference or indifference, the choices from a smaller subset or larger set containing some previously chosen and unchosen elements should be similarly frequent. For several months before undergoing these tests, our monkeys had experienced a stable reward distribution, which is known to slow behavioral adaptations (36) and to render economic choices resistant to short-term adaptation (9). Such stable conditions would favor investigating the influence of option set size on preferences with little intervening adaptation to instantaneous change of option distributions.

The preferences remained stable irrespective of set size in two tests that extended the constructs of indifference maps to actual choices. First, when presented with a set of feasible options on ICs established with different bundle set sizes, the animal continued to choose the option on the highest IC irrespective of the bundle set. Choices in three-bundle sets showed higher-than-mean frequencies for bundles on superior ICs (that had been established with two-bundle sets), indifference for bundles on same ICs, and lower-than-mean frequencies for bundles on inferior ICs (Fig. 6 *A*–*G* and Fig. S6 *A*–*D*). This result demonstrated optimization; the animal exhibited the propensity to choose the optimum according to its preferences as evidenced by its indifference map, irrespective of bundle set size. Second, the preferences, as elicited by direct choices, were maintained when changing between two-bundle and three-bundle sets (Fig. 6 *H* and *I* and Fig. S6 *E* and *F*). This test involving explicit choices, beyond comparisons involving ICs established with different bundle sets, provided the most direct evidence for satisfaction of WARP according to Arrow’s definition (6). Taken together, in following the basic principles suggested by revealed preference theory, the animals behaved “as if” they were choosing the best option irrespective of what else was on offer.
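Arrow's formulation of the axiom can be stated for a choice function C: whenever T is a subset of S and C(S) contains elements of T, then C(T) must equal the intersection of C(S) and T. The sketch below checks this condition after reducing stochastic choice frequencies to modal choices. The reduction rule, the tolerance `tol`, the function names, and the frequencies are assumptions made for illustration and do not reproduce the statistical tests used in the study.

```python
def modal_choice(freqs, tol=0.05):
    """Options whose choice frequency is within tol of the maximum.

    Options within tol of each other are treated as indifferent.
    """
    top = max(freqs.values())
    return {opt for opt, f in freqs.items() if top - f <= tol}


def satisfies_arrow_warp(choice):
    """choice maps frozenset(option set) -> set of chosen options.

    Checks: for every T that is a proper subset of S, if C(S) & T is
    non-empty, then C(T) == C(S) & T.
    """
    for large in choice:
        for small in choice:
            if small < large:  # proper-subset test on frozensets
                kept = choice[large] & small
                if kept and choice[small] != kept:
                    return False
    return True


# Hypothetical choice frequencies in a two-bundle and a three-bundle set.
two = {"x": 0.70, "y": 0.30}
three = {"x": 0.60, "y": 0.25, "z": 0.15}
observed = {
    frozenset(two): modal_choice(two),
    frozenset(three): modal_choice(three),
}
print(satisfies_arrow_warp(observed))
```

In this example the same bundle is the modal choice in both set sizes, so the condition holds; a reversal between the two-bundle and three-bundle sets would make the function return `False`.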

## Methods

### Animals.

The Home Office of the United Kingdom approved all experimental procedures. Two male monkeys (*Macaca mulatta*) weighing 9.0 kg and 10.0 kg, respectively, were used in the experiment. Neither animal had been used in any prior study.

### Behavior.

To obtain individual ICs, we set one liquid (A or B) of the variable bundle either to one of the axes’ anchor points or to a pseudorandom quantity away from the axes, conforming to a unit grid of 0.1 mL. Then we psychophysically varied the quantity of the other liquid (B or A) of the variable bundle across the full testing range to estimate empirically the choice IP (*P* = 0.5 each bundle) against the reference bundle within a 95% CI from fitting a Weibull function (Fig. 1*C*); repeatedly tested, Weibull-fitted IPs varied very little (Fig. S1). After each IP assessment, we made the variable bundle the new reference bundle and defined the new variable bundle by incrementing liquid A or B. We alternated the direction of change in the variable bundle between left-to-right and right-to-left. An initial test in an inexperienced monkey had shown diverging, nonoverlapping ICs when the variable bundle advanced from opposite anchor points over longer distances toward the center of the *x*–*y* map; however, later probe tests failed to confirm such divergences and demonstrated consistent ICs in monkeys with several months of experience during all working days with a stable, unchanging reward distribution. This conclusion is supported by the choice consistency seen between two- and three-bundle option sets (Fig. 6 and Fig. S6).

The assessment of each IP required 80 trials of five equally spaced and equally frequently tested psychophysical test points, irrespective of the animal's behavior (eight trials for each pseudorandomly alternating left and right stimulus position). Thus, a typical IC with five IPs required 400 trials (in ≥1 d). Three-option tests (Fig. 6) used two reference bundles and one variable bundle and assessed choice indifference (*P* = 0.33 each option) psychophysically with Weibull fits in analogy to the two-option choices.
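The IP estimation described above can be illustrated with a minimal sketch. It assumes a two-parameter cumulative Weibull, P(x) = 1 − exp(−(x/λ)^k), and uses a coarse grid search in place of whatever fitting routine the authors used; the parametrization, the grid ranges, the function names, and the choice counts are all hypothetical.

```python
import math


def weibull(x, lam, k):
    """Cumulative Weibull: probability of choosing the variable bundle
    when it contains x mL of the varied liquid."""
    return 1.0 - math.exp(-((x / lam) ** k))


def fit_weibull(xs, ps):
    """Coarse grid-search least-squares fit; returns (lam, k)."""
    best = None
    for i in range(1, 201):        # lam from 0.005 to 1.0 mL
        lam = i * 0.005
        for j in range(1, 101):    # k from 0.1 to 10
            k = j * 0.1
            sse = sum((weibull(x, lam, k) - p) ** 2 for x, p in zip(xs, ps))
            if best is None or sse < best[0]:
                best = (sse, lam, k)
    return best[1], best[2]


def indifference_point(lam, k):
    """Quantity at which P(choose variable bundle) = 0.5."""
    return lam * math.log(2.0) ** (1.0 / k)


# Five equally spaced test points with 16 trials each (80 trials in total).
# The counts of variable-bundle choices are made up for illustration.
test_points = [0.10, 0.25, 0.40, 0.55, 0.70]
chosen = [0, 3, 9, 14, 16]
fractions = [c / 16 for c in chosen]

lam_hat, k_hat = fit_weibull(test_points, fractions)
print(indifference_point(lam_hat, k_hat))
```

Setting P(x) = 0.5 and solving for x gives x = λ (ln 2)^(1/k), which is what `indifference_point` returns.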

### Curve Fitting.

We fit individual ICs composed of multiple Weibull-fit IPs with a linear (first-degree) polynomial (*y* = *ax* + *b*), a quadratic (second-degree) polynomial (*y* = *ax*² + *bx* + *c*; where *a* represents curvature, and *b* represents slope or currency) and a hyperbolic function (*d* = *ax* + *by* + *cxy*), using weighted least mean squares (*P* < 0.05). The quadratic polynomial provided the best combination of good fit and simplicity (Dataset S1*A*). We fit a single homothetic function to a whole indifference map using the common, single, best-fitting quadratic polynomial for all its ICs (“homothety” refers to the curviparallel character of lines within a given map) (*SI Methods* and Dataset S1*C*). To find the best fit, we let both polynomial coefficients vary within a constrained range, starting with the coefficients shown in Dataset S1*A*. We used the Matlab Global Optimization Toolbox to implement this coefficient search.
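As an illustration of the quadratic IC fit, the sketch below solves the weighted least-squares normal equations for the second-degree polynomial and derives the marginal rate of substitution from the fitted slope. It is written in plain Python rather than Matlab, does not implement the homothetic map fit or the constrained coefficient search, and uses hypothetical function names and data.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x


def fit_quadratic_ic(xs, ys, ws=None):
    """Weighted least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    ws = ws or [1.0] * len(xs)
    # Normal equations (X^T W X) beta = X^T W y with basis [x^2, x, 1].
    m = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for x, y, w in zip(xs, ys, ws):
        basis = (x * x, x, 1.0)
        for i in range(3):
            v[i] += w * basis[i] * y
            for j in range(3):
                m[i][j] += w * basis[i] * basis[j]
    a, b, c = solve3(m, v)
    return a, b, c


def mrs(a, b, x):
    """Marginal rate of substitution: negative of the IC slope 2*a*x + b."""
    return -(2.0 * a * x + b)


# Hypothetical IPs of one IC (mL of good A on x, mL of good B on y).
ip_x = [0.0, 0.2, 0.4, 0.6, 0.8]
ip_y = [0.60, 0.43, 0.29, 0.17, 0.09]
a_hat, b_hat, c_hat = fit_quadratic_ic(ip_x, ip_y)
print(a_hat, b_hat, c_hat, mrs(a_hat, b_hat, 0.4))
```

With this parametrization the slope at *x* = 0 equals *b*, and a positive *a* makes a negatively sloped IC convex, so the MRS declines as the quantity on the *x* axis grows.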

### Severity of Transitivity Violation.

For the transitivity test based partly on physically inferred preference relationships (the third transitivity test), we assessed the severity of transitivity violation with Afriat’s Critical Cost Efficiency Index that usually is applied to budget lines (25–27). Our test used a line connecting the test bundles (b and c in Fig. 5*A*) instead of the budget line in the standard Afriat Index. Obtaining the Index required repeated parallel displacement of the test bundles b and c with their connecting line toward the origin of the indifference map until complete transitivity satisfaction was reached. Each Afriat-like test involved on average three displacements of 0.05 mL of liquid. Each step required one direct preference test between the displaced bundles (b and c) and one transitive closure test (a versus d, a versus c, or b versus d). The Index was calculated as *e* = *y*₁/*y*₂, *y*₁ and *y*₂ being the *y* axis intercepts of the displaced and the initial bundle-connecting line, respectively [see C/D ratio in Varian (27)]; the range from 0.0 to 1.0 inversely reflects the severity of transitivity violation, 1.0 indicating no required line displacement and thus no violation.
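The arithmetic of the Index can be sketched as follows. The text does not specify the direction of each 0.05-mL displacement, so the sketch assumes that every step removes a fixed amount of liquid from both bundles along a caller-supplied shift vector, which keeps the connecting line parallel. The function names, the shift vector, and the bundle coordinates are hypothetical.

```python
def y_intercept(p, q):
    """y-axis intercept of the line through bundles p=(x, y) and q=(x, y)."""
    (x1, y1), (x2, y2) = p, q
    slope = (y2 - y1) / (x2 - x1)
    return y1 - slope * x1


def displace(bundle, shift, steps):
    """Move a bundle toward the origin by `steps` times the shift vector."""
    return (bundle[0] - steps * shift[0], bundle[1] - steps * shift[1])


def afriat_like_index(b, c, shift, steps):
    """e = y1 / y2, with y1 the intercept of the displaced line and
    y2 the intercept of the initial line connecting b and c."""
    y2 = y_intercept(b, c)
    y1 = y_intercept(displace(b, shift, steps), displace(c, shift, steps))
    return y1 / y2


# Hypothetical test bundles (mL) and an assumed displacement of 0.05 mL
# of the y-axis liquid per step, applied three times.
bundle_b = (0.2, 0.6)
bundle_c = (0.6, 0.2)
print(afriat_like_index(bundle_b, bundle_c, shift=(0.0, 0.05), steps=3))
```

With zero displacements the two intercepts coincide and the Index equals 1.0, matching the no-violation case described in the text.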

## Acknowledgments

We thank Aled David for invaluable help with animal training and Matt Shum, Kim Border, and three anonymous reviewers for helpful comments on the manuscript. This work was supported by The Wellcome Trust, the European Research Council, and National Institutes of Health Caltech Conte Center.

## Footnotes

- ¹To whom correspondence should be addressed. Email: ws234{at}cam.ac.uk.

Author contributions: A.P.-B., C.R.P., and W.S. designed research; A.P.-B. performed research; A.P.-B. analyzed data; and A.P.-B., C.R.P., and W.S. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1612010114/-/DCSupplemental.

Freely available online through the PNAS open access option.

## References

1. Samuelson P
2. …
3. Andreoni J, Gillen BJ, Harbaugh WT
4. Mas-Colell A, Whinston M, Green J
5. Perloff JM
6. …
7. …
8. Kagel JH, Battalio RC, Green L
9. Genest W, Stauffer WR, Schultz W
10. Grabenhorst F, Hernádi I, Schultz W
11. …
12. …
13. YouTube
14. Green DM, Swets JA
15. Luce RD
16. Chipman JS, McFadden DL, Richter MK; McFadden DL, Richter MK
17. McFadden DL
18. …
19. Sugrue LP, Corrado GS, Newsome WT
20. Samejima K, Ueda Y, Doya K, Kimura M
21. …
22. Kobayashi S, Schultz W
23. …
24. Lak A, Stauffer WR, Schultz W
25. …
26. …
27. Varian H
28. …
29. Battalio RC, Kagel JH, Kogut CA
30. …
31. …
32. Yamada H, Tymula A, Louie K, Glimcher PW
33. Bouret S, Richmond BJ
34. …
35. Critchley HD, Rolls ET
36. Kobayashi S, Pinto de Carvalho O, Schultz W
37. Cantillo V, Amaya J, de Dios Ortuzar J
38. …
39. …
40. …
41. …
42. Richter M. *Preferences, Utility and Demand*, eds Chipman J, Hurwicz L, Richter M, Sonnenschein H (Harcourt Brace Jovanovich, New York).

## Article Classifications

- Social Sciences
- Economic Sciences

- Biological Sciences
- Neuroscience