# Network model of top-down influences on local gain and contextual interactions in visual cortex

Contributed by Charles D. Gilbert, September 11, 2013 (sent for review January 22, 2013)

## Significance

Perceptual grouping links line segments that define object contours and distinguishes them from background contours. This process is reflected in the responses to contours of neurons in primary visual cortex (V1), and depends on long-range horizontal cortical connections. We present a network model, based on an interaction between recurrent inputs to V1 and intrinsic connections within V1, which accounts for task-dependent changes in the properties of V1 neurons. The model simulates top-down modulation of effective connectivity of intrinsic cortical connections among biophysically realistic neurons. It quantitatively reproduces the magnitude and time course of the facilitation of V1 neuronal responses to contours.

## Abstract

The visual system uses continuity as a cue for grouping oriented line segments that define object boundaries in complex visual scenes. Many studies support the idea that long-range intrinsic horizontal connections in early visual cortex contribute to this grouping. Top-down influences in primary visual cortex (V1) play an important role in the processes of contour integration and perceptual saliency, with contour-related responses being task dependent. This suggests an interaction between recurrent inputs to V1 and intrinsic connections within V1 that enables V1 neurons to respond differently under different conditions. We created a network model that simulates parametrically the control of local gain by hypothetical top-down modification of local recurrence. These local gain changes, as a consequence of network dynamics in our model, enable modulation of contextual interactions in a task-dependent manner. Our model displays contour-related facilitation of neuronal responses and differential foreground vs. background responses over the neuronal ensemble, accounting for the perceptual pop-out of salient contours. It quantitatively reproduces the results of single-unit recording experiments in V1, highlighting salient contours and replicating the time course of contextual influences. We show by means of phase-plane analysis that the model operates stably even in the presence of large inputs. Our model shows how a simple form of top-down modulation of the effective connectivity of intrinsic cortical connections among biophysically realistic neurons can account for some of the response changes seen in perceptual learning and task switching.

For a visual system, one of the most important steps in object recognition is to distinguish what belongs to the object and what is part of the background. The borders of most objects consist of smooth edges that for the most part change direction slowly, with only occasional abrupt changes. Local contour elements can thus be integrated along the edge following the rule of “good continuation” discovered by the Gestalt school of psychologists (1).

Studies in awake behaving monkeys show that the responses of single neurons in primary visual cortex (V1) to contours embedded in a complex background correlate closely with the animal’s performance in detecting the contours (2). Moreover, the contour-related responses change with experience: before training on contour detection the V1 responses contain little information about the camouflaged contour, but after perceptual learning the responses increase in parallel with the animals’ improvement in contour detection (3). The contour-related responses are subject to task-specific top-down influences, being strong when the animal performs the contour detection task, much weaker when it performs an irrelevant task, and entirely absent under anesthesia (3).

Previous models have attempted to account for some of the mechanisms underlying contour integration in V1 (4–10), but they do not account for the top-down control of the strength of contextual interactions. A particular challenge is to construct a network with massive horizontal connectivity that is stable to large variations in input intensity, particularly while exhibiting temporal dynamics similar to those observed experimentally. In this study, we constructed and compared several large-scale network models for visual contour integration with the aim of categorizing minimal requirements for all of the following properties to be displayed in the same network: The network should allow top-down influences to change (or “gate”) the effective contextual interactions mediated by long-range horizontal connections, should perform nonlinear contour integration similar to that seen in behaving animals, should reproduce the time course of V1 responses to complex stimuli, should be stable to variations in input intensity, and should be robust to noise. A large-scale network of paired excitatory and inhibitory nodes fulfilled all these requirements. Stability could be achieved in these networks with conductance-based model neurons with subtractive inhibition or, alternatively, with current-based model neurons with divisive inhibition. We present simulations using both types of networks. These simulations demonstrate contour integration and sharpening in a realistic image, the time course of contour pop-out in a simpler image for which experimental data are available, and robustness to noise. However, stable performance of a third type of model (current-based neurons with subtractive inhibition) could not be achieved under conditions that would permit adequate agreement with time course data from experiments. Phase-plane analyses of network dynamics were performed to analyze the conditions under which stable responses were possible with the three network node types; these analyses are summarized here with computational details given in *SI Text*.

## Materials and Methods

Our aim is to explore the simplest network configurations that allow changes in the effective strength of long-range connectivity in V1 (for examples, see refs. 2 and 11) without altering the dynamical parameters of the network nodes themselves in the absence of long-range inputs, and without directly changing the synaptic strengths of any of the connections. We find that this can be accomplished using only two network nodes for the local circuit. With this basic module, we explore how changing the strengths of putative top-down interactions could give rise to the desired result.

We first describe the local circuit that forms the basic module of the full network. We then present the equations governing network node responses for the three types of models considered. We describe how multiple orientation-selective copies of the basic module are connected to form the full model. We show how a recasting of connection-strength parameters permits a clearer description of network dynamics when top-down modulation is included, and therefore, how changing the local connection parameters allows the effective strength of long-range connections to change without affecting the responses to bottom-up inputs alone. Finally, we list the parameters that we used to account for the experimentally observed responses in V1 superficial layers.

### Structure of the Basic Network Module.

Taking into account the characteristics of real neurons, the basic network module comprises reciprocally interconnected excitatory (designated “E”) and inhibitory (designated “I”) nodes; the excitatory node is also connected to itself (Fig. 1*A*). Each such pair of network nodes models the population averages of the activities of the excitatory and the inhibitory neurons in the superficial layers of a single orientation column of monkey V1 and their local interactions.

### Network Node Dynamics.

A key feature of conductance-based models is that the current through each conductance is proportional to the difference between the membrane potential and the equilibrium potential for that conductance, thus reducing the influence of the conductance when the potential is near equilibrium. (We ignore changes in the equilibrium potential caused by local changes in ion concentration due to passage of ions through channels or for other reasons.) This feature enables this network, with only subtractive inhibition, to remain stable to large variations in input current; as we show in *SI Text*, a network with simpler current-based node dynamics requires divisive inhibition to attain a similar degree of stability. The dynamics of the model can be represented graphically by phase-plane analysis (Fig. 2). We model the dynamics of the excitatory and the (here subtractive) inhibitory network node with the following two differential equations:

In these equations, *x* is an internal state variable that represents the activity of the excitatory node; it may be thought of as an abstraction of the membrane potential. *f*_{x}(*x*) is the response function that transforms the activity, *x*, to the cell's response rate; it represents the membrane voltage-to-firing rate transformation that is often characterized via the current-firing relationship of a neuron. The constant *τ*_{x} is the cell time constant; *g*_{xx}, the conductance of the connection of the excitatory network node onto itself; *g*_{xy}, the conductance of the connection of the inhibitory node onto the excitatory node; and *g*_{e}, the conductance of the combined external excitatory input to the excitatory node. Similarly, *y* denotes the activity of the inhibitory node; *f*_{y}(*y*), its response rate; *τ*_{y}, its cell time constant; *g*_{yx}, the conductance of the connection from the excitatory network node onto the inhibitory one; and *g*_{i}, the conductance of any external excitatory inputs to the inhibitory node. (These equations include the case that both types of nodes receive bottom-up input, but we show later that the network behavior is quite similar if only the excitatory nodes receive such input.)
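A minimal sketch of the two-node dynamics, assuming the usual conductance-based form implied by the definitions above: every conductance multiplies the difference between its reversal potential and the node activity, and the leak reversal potential is 0. The reversal potentials, slopes, thresholds, time constants, and *g*_{yx} = 0.225 are the values listed under Parameter Values; `g_xx`, `g_xy`, `g_e`, `g_i`, the Euler step size, and all function names are illustrative assumptions rather than the authors' settings or code.

```python
# Values taken from the Parameter Values section of the text.
V_E, V_I = 14.0 / 3.0, -2.0 / 3.0   # normalized excitatory / inhibitory reversal potentials
T_X, T_Y = 1.0, 1.0                 # firing thresholds
M_X, M_Y = 1.0, 2.0                 # response-function slopes
TAU_X, TAU_Y = 1.0, 1.0             # cell time constants


def rate(activity, slope, threshold):
    """Threshold-linear response function."""
    return slope * max(0.0, activity - threshold)


def derivatives(x, y, g_xx, g_xy, g_yx, g_e, g_i):
    """Assumed conductance-based form with subtractive inhibition (leak reversal = 0)."""
    fx = rate(x, M_X, T_X)
    fy = rate(y, M_Y, T_Y)
    dx = (-x + (g_xx * fx + g_e) * (V_E - x) + g_xy * fy * (V_I - x)) / TAU_X
    dy = (-y + (g_yx * fx + g_i) * (V_E - y)) / TAU_Y
    return dx, dy


def simulate(params, t_end=50.0, dt=0.01, x0=0.0, y0=0.0):
    """Forward-Euler integration from (x0, y0); returns the final state."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        dx, dy = derivatives(x, y, **params)
        x += dt * dx
        y += dt * dy
    return x, y


# g_yx is from the text; the other four values are hypothetical.
PARAMS = dict(g_xx=0.2, g_xy=0.3, g_yx=0.225, g_e=0.4, g_i=0.27)
x_ss, y_ss = simulate(PARAMS)
dx_ss, dy_ss = derivatives(x_ss, y_ss, **PARAMS)
```

With these illustrative values the trajectory settles on a fixed point above the excitatory threshold, and the reversal-potential terms keep both activities below *v*_{e} regardless of input size, which is the stabilizing property the text attributes to conductance-based nodes.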

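The two response functions are threshold-linear functions (Eqs. **3** and **4**):

*f*_{x}(*x*) = *m*_{x} max(0, *x* − *T*_{x}) and *f*_{y}(*y*) = *m*_{y} max(0, *y* − *T*_{y}).

Their slopes are *m*_{x} and *m*_{y}, respectively, and their thresholds are *T*_{x} and *T*_{y}.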

The dynamics for the current-based model with subtractive inhibition is given by the following:

These equations are similar to those above for the conductance-based model, except that the conductances *g*_{xx}, etc., are replaced by the connection strengths *J*_{xx} and so on, respectively, and there is no dependence of the currents on the membrane potential. The response functions *f*_{x}(*x*) and *f*_{y}(*y*) are the same as given in Eqs. **3** and **4**.

Finally, when subtractive inhibition is replaced by divisive inhibition, we have the following dynamics:

In this instantiation, instead of subtracting from the other inputs to the excitatory network node, the activity of the inhibitory node, *f*_{y}(*y*), after multiplication by the connection strength, *J*_{xy}, divisively modulates the self- and external inputs.

### Long-Range Connectivity.

In the full network, each orientation column is modeled by a copy of the local circuit, and multiple columns are connected by long-range horizontal connections (Fig. 1*B*). Specifically, at each spatial location, *K* = 12 orientation columns form a hypercolumn (hexagons in Fig. 1*C*). Each local circuit is tuned to a different orientation *θ* and circuits representing different optimal orientations are separated by Δ*θ* = π/*K* = 15° (wedges in Fig. 1*C*).

We define an angular tuning function that is based on the cosine and has as arguments an orientation, an optimal orientation, and a full width at half-maximum (FWHM) tuning width. This function is 1 when the input angle is identical to the optimal angle, 0.5 when it is 1/2 tuning width away from the optimal angle to either side, and 0 if it is one width or more away:
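One function that satisfies all three stated constraints is a raised cosine. The sketch below is an assumption in that sense: the name `cos_fwhm` follows the later reference to the cos_{fwhm} function, but the exact expression, and the wrapping of orientation differences with period π, are illustrative choices.

```python
import math


def cos_fwhm(theta, theta_opt, fwhm):
    """Raised-cosine tuning: 1 at the optimum, 0.5 at fwhm/2 away, 0 at fwhm or more."""
    # Orientation is periodic with pi, so wrap the difference into [-pi/2, pi/2).
    # (Wrapping is an assumption; the text does not state how differences are taken.)
    diff = (theta - theta_opt + math.pi / 2.0) % math.pi - math.pi / 2.0
    if abs(diff) >= fwhm:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * diff / fwhm))


# K = 12 orientation columns per hypercolumn, separated by pi/K = 15 degrees.
K = 12
orientations = [k * math.pi / K for k in range(K)]

# Tuning of every column to a horizontal stimulus, with the 45-degree width from the text.
tuning = [cos_fwhm(theta, 0.0, math.pi / 4.0) for theta in orientations]
```

Because of the wrapping, the columns at 15° and 165° receive the same weight for a 0° optimum, which is the behavior one would want for orientation (as opposed to direction) tuning.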

With these definitions, the formula for the strengths of the long-range excitatory-to-excitatory connections (also referred to as “E-E” connections) is as follows:

Here *l*_{xx} is a factor that determines the overall synaptic strength, Δ*x* and Δ*y* are the *x* and *y* distances between the two nodes relative to a falloff distance parameter *d*_{σ}, and *θ*_{1} and *θ*_{2} are, respectively, the angles between the preferred orientation of the local neuron (*θ*_{1}) and the “remote” neuron (*θ*_{2}) and the line connecting the two neurons. In short, there is Gaussian decay with distance, the optimal connection strength is achieved for cocircular geometry, and there is a penalty for curvature (with the optimal configuration being collinearity). The chosen form for the curvature penalty (and the rest of the function) additionally ensures that the connectivity is symmetric between any two given neurons. Note also that the connections from the local inhibitory node to the local excitatory node do not just target the same orientation—they also connect to the nodes at the same spatial location with different optimal orientations and with the same tuning width as the bottom-up input tuning (not shown in Fig. 1*C*).

The formula for the strengths of the long-range excitatory-to-inhibitory connections (“E-I” connections) and the reverse (“I-E” connections) is almost identical to Eq. **10**:

The only difference is that the optimal configuration is now a side-by-side parallel line rather than a collinear one. The long-range connection strengths are shown in Fig. 1*D* for an orientation column that responds optimally to a horizontal short line.

These connection strengths are also referred to as kernel strengths because the same connection pattern is repeated at each node. The overall decay of the kernel strength with distance can be well fit with a Gaussian function with a decay constant of three hypercolumn distances [i.e., the density at ∼2.25–2.4 mm is 1/*e* = 36.8%, in agreement with the known anatomy (12)]. Within this envelope, the strongest connection strength between two elements is for collinear or cocircular configuration, as expected given the frequency of such configurations in natural images (13, 14). Last, there is a penalty for increasing curvature, and the maximal connection strength of E-E connections is for collinear segments, whereas the maximal connection strength for E-I connections is for parallel flanking elements. This ensures an overall preference for connecting columns of similar optimal orientation (12, 15–17) and gives very good agreement with the psychophysical and physiological measures of the contextual interactions (18). The exact shape of the long-range kernels is not essential for the results in this paper as long as they facilitate smooth contours and suppress parallel flankers; similar results can be obtained by using other kernels with an equivalent spatial arrangement, e.g., those from refs. 7, 19, and 20. The kernels used here were chosen to be as simple as possible while being in agreement with the physiological, psychophysical, and anatomical evidence.

When analyzing the dynamics of the full network, it is necessary to distinguish between bottom-up and lateral input into the basic network circuit. We do so by splitting the input current *i*_{e} or the corresponding conductance *g*_{e} to the excitatory network node into three components: (*i*) the bottom-up input *I* (or *G*); (*ii*) the long-range input from the horizontal connections Δ*i*_{e} (or Δ*g*_{e}); and (*iii*) a constant background input *I*_{0e} (or *G*_{0e}):

*i*_{e} = *I* + Δ*i*_{e} + *I*_{0e} (or *g*_{e} = *G* + Δ*g*_{e} + *G*_{0e}).

Correspondingly, the input current *i*_{i} or conductance *g*_{i} to the inhibitory network node can be split into (*i*) the component from bottom-up input *αI* or *αG* (with *α* being the input strength relative to that of the excitatory nodes; *α* = 0 was chosen for simplicity in the figures, i.e., no bottom-up input to inhibitory network nodes); (*ii*) the input received via long-range horizontal connections Δ*i*_{i} (or Δ*g*_{i}); and (*iii*) the background input *I*_{0i} (or *G*_{0i}):

*i*_{i} = *αI* + Δ*i*_{i} + *I*_{0i} (or *g*_{i} = *αG* + Δ*g*_{i} + *G*_{0i}).

For the naturalistic stimuli in Figs. 3 and 4 *D1*, *E1*, and *F1* (the Lena image), only the signal strength at the optimal orientation at each spatial location was used as input to the network, convolved with a tuning function, viz. the cos_{fwhm} function in Eq. **9**, to allow for the network nodes to have a realistic input tuning. This simplification significantly reduces computation time by eliminating calculation of the responses for the nonoptimum orientations, which, in reality, would have been inhibited by the one orientation that is computed. The necessary signal characteristics were determined as in Sigman et al. (13) by using steerable filters (21). This approach provides an efficient way to calculate the maximum-strength orientation and corresponding energy at any number of orientations at once, while using filters similar to complex cell receptive fields (RFs), namely, a quadrature pair of simple-cell RF type filters. Such filters have been used in related studies (e.g., refs. 10, 13, and 22). We chose to use G2 and H2 filter pairs resembling simple cells with 2 and 3 subfields, respectively, with values of 1 pixel for the radius of the Gaussian decay of the filters to ensure that the size of the classical RF is as small as possible. However, the exact function or size of the kernels does not have much influence on the results: increasing and decreasing the size of the kernel, or using different input functions (pairs of Gabor functions for each orientation) gave very similar results (results not shown).

In much of the following discussion, we are motivated to define two additional combinations of the connection strength parameters *g*_{xx}, *g*_{xy} (or *J*_{xx}, *J*_{xy} for the current-based models). It will often be desirable that, in the steady-state limit and in the absence of any long-range horizontal inputs, the inhibitory node response rate be a multiple *γ* of the excitatory one:

*f*_{y}(*y*) = *γ* *f*_{x}(*x*).

To accomplish this, it will be useful to define first, what we have termed the feedback set point:

and second, the feedback gain:

These additional parameters allow us to change the effective strength of the long-range horizontal connections without altering the steady-state response to the bottom-up component of the input. That is, at least for certain values of the connection parameters and of the bottom-up input, changing the FB_{gain} does not change the output of either excitatory or inhibitory network nodes in the absence of long-range inputs. Note that this reformulation does not increase the number of free parameters in the network, but only provides convenient names for two combinations of parameters that we want to manipulate independently. Also note that, in the figures, *γ* was always chosen to be 1, both for simplicity and because the firing rates of excitatory and inhibitory neurons in our experiments were always quite similar (compare Fig. 4 *C1* and *C2*).

### Parameter Values.

The parameters in the model were chosen to be as generic as possible while still taking into account the quantitative relationships found in experimental data.

*K* = 12 different orientations are explicitly represented so that there are 12 excitatory and 12 inhibitory nodes at each spatial location with a tuning separation of π/12 = 15°. The tuning widths for these long-range connections (*ϕ*_{fwhm} in Eq. **9**) were chosen to be π/4 = 45° in all cases (Eqs. **10** and **11**).

A normalized voltage scale was used in the model. Based on commonly accepted cellular parameters (23) (leakage reversal potential *V*_{L} = −70 mV, excitatory reversal potential *V*_{E} = 0 mV, inhibitory reversal potential *V*_{I} = −80 mV, and firing threshold *V*_{T} = −55 mV), and with the reversal potentials chosen the same way as in Shelley (24) by defining the leakage potential to be *v*_{L} = 0 and the threshold *v*_{t} to be 1, the normalized excitatory reversal potential is *v*_{e} = 14/3 and the inhibitory reversal potential is *v*_{i} = −2/3. The excitatory and inhibitory node thresholds *T*_{x} and *T*_{y} were chosen to be 1 and the response function slopes *m*_{x} and *m*_{y} were 1 and 2, respectively. The reason for choosing *m*_{y} as 2 is that, although the current-firing relationship of inhibitory neurons is six times as steep as the one for excitatory neurons (e.g., ref. 25), any inhibitory nodes in the real V1 would probably be a mixed pool of excitatory and inhibitory neurons for divisive inhibition and thus would likely contain more excitatory neurons; so a compromise value of 2 was chosen. In addition, choosing a higher value would not affect the results qualitatively as long as the connection strengths/conductances were scaled to be correspondingly weaker.
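The normalization described above is a linear rescaling that maps the leakage potential to 0 and the firing threshold to 1. A short sketch reproduces the quoted values; the function name `normalize` is hypothetical.

```python
def normalize(v_mv, v_leak_mv=-70.0, v_thresh_mv=-55.0):
    """Map a potential in mV onto the scale where leak = 0 and threshold = 1."""
    return (v_mv - v_leak_mv) / (v_thresh_mv - v_leak_mv)


v_l = normalize(-70.0)   # leakage reversal potential
v_t = normalize(-55.0)   # firing threshold
v_e = normalize(0.0)     # excitatory reversal potential, 70/15 = 14/3
v_i = normalize(-80.0)   # inhibitory reversal potential, -10/15 = -2/3
```

One normalized unit therefore corresponds to the 15 mV between the leakage potential and the firing threshold.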

The cell time constant *τ*_{x} was defined as 1. Because both excitatory and inhibitory neurons have similar cell time constants in vivo in V1 (e.g., ref. 25: regular spiking cells 10.4 ± 3.5 ms and fast spiking/inhibitory cells 7.6 ± 4.2 ms) and the inhibitory pool for divisive inhibition would likely contain both types of neurons, *τ*_{y} was set to 1 as well. In our model, choosing a faster inhibitory time constant would reduce the transient responses and would improve network stability further.

In the current-based models, the excitatory-to-inhibitory connection strength was *J*_{yx} = 1/*m*_{y} = 0.5 so that the response rates of the excitatory and inhibitory network nodes were identical in the absence of horizontal inputs (and quite similar even with horizontal inputs; Fig. 4 *C1* and *C2*). The excitatory input *i*_{i} to the inhibitory network nodes was set to the inhibitory threshold, *i*_{i} = *T*_{y} = 1, and for simplicity, these nodes do not receive any further input. The excitatory input *i*_{e} to the excitatory nodes was *i*_{e} = 0.75 plus the input stimulus, which was the strength of the maximum orientation with a tuning FWHM of π/4 = 45°, and scaled from 0 to 2. In the conductance-based model, the variables *g*_{xx}, *g*_{xy}, *l*_{x}, *l*_{y}, *g*_{e}, and *g*_{i} play the same roles as the variables *J*_{xx}, *J*_{xy}, *l*_{x}, *l*_{y}, *i*_{e}, and *i*_{i}, respectively, in the current-based models, but they are multiplied by the difference between the normalized excitatory potential *v*_{e} and the activity *x*. Thus, we have scaled the conductances by dividing the corresponding connection strengths by the difference between *v*_{e} and the threshold *T*_{x}, 14/3 − 1 = 11/3. The value for *g*_{yx} was chosen to be 0.225, because this gives a good fit to the desired firing rate relationship between excitatory and inhibitory nodes (i.e., they should be equal).
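The rate-matching argument for the current-based models can be checked in a few lines. The sketch assumes that the inhibitory node obeys *τ*_{y} d*y*/d*t* = −*y* + *J*_{yx} *f*_{x}(*x*) + *i*_{i}, the form implied by the description of the current-based equations, so that its steady state is *y* = *J*_{yx} *f*_{x}(*x*) + *i*_{i}. Function and variable names are hypothetical.

```python
M_Y = 2.0          # inhibitory response-function slope (from the text)
T_Y = 1.0          # inhibitory threshold (from the text)
J_YX = 1.0 / M_Y   # connection strength onto the inhibitory node, 0.5
I_I = T_Y          # constant input to the inhibitory node, set to its threshold


def f_y(y):
    """Threshold-linear inhibitory response function."""
    return M_Y * max(0.0, y - T_Y)


def inhibitory_steady_state(excitatory_rate):
    """Steady state of the assumed equation tau_y * dy/dt = -y + J_yx * f_x + i_i."""
    return J_YX * excitatory_rate + I_I


excitatory_rates = [0.0, 0.25, 1.0, 3.0]
inhibitory_rates = [f_y(inhibitory_steady_state(r)) for r in excitatory_rates]

# Factor used in the text to convert connection strengths into conductances.
V_E, T_X = 14.0 / 3.0, 1.0
conductance_scale = V_E - T_X   # 11/3
g_i = I_I / conductance_scale   # 3/11, about 0.27
```

Setting the input to the threshold cancels the offset in the response function, and the slope *m*_{y} cancels *J*_{yx}, so the inhibitory rate equals the excitatory rate for every input level.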

The decay parameter of the Gaussian was chosen to be , which provides a good match to the measured axonal density distribution of the long-range horizontal connections seen in V1 (12). The long-range connectivity scaling constants were chosen for all simulations as *l*_{xx} = *l*_{yx} = 0.1. Choosing the two constants to be identical ensured agreement with experimental data that find no clear orientation preference in any cortical direction (12). Because the long-range connections in our model are mainly connecting to excitatory neurons along the collinear, and to inhibitory neurons along the flanking direction, choosing *l*_{xx} and *l*_{yx} equal ensures this. However, it is important to note that dynamically the effective strength of E-I connections is greater than that of E-E connections because inhibitory nodes have a larger response-rate slope than excitatory ones (*m*_{y} = 2 vs. *m*_{x} = 1; the equivalent of a steeper *f–I* relationship). This leads to an overall suppression for stimuli activating E-E and E-I connections equally.

Last, the noise inputs for Fig. 4 *E* and *F* were chosen to be additive noise with a Gaussian amplitude distribution of *σ*_{n} = 0.2 (Fig. 4*E*) and *σ*_{n} = 0.4 (Fig. 4*F*), both with an exponential distribution of 0.1 cell time constant in the time domain. The second *σ*_{n} is twice as large as the first one; given that the threshold is 0.25 away from the background input, this is about as large as possible without driving a lot of spontaneous background activity. The noise in Fig. 4 *E* and *F* was added to the input *i*_{e}, but similar results can be obtained by adding it to both the excitatory and inhibitory network node activity.

### Implementation.

The phase-plane analysis was implemented in Mathematica (Wolfram Research). The full network was implemented in MATLAB (The Mathworks) with convolution via 3D-Fourier transforms to accelerate the calculations. Additionally, the Jacket toolbox (Accelereyes) was used to offload the simulation onto a graphics processing unit, accelerating the simulation by an order of magnitude. The calculations were performed on an NVIDIA GTX285 graphics card (NVIDIA Corporation). Simulation of the full network (220 × 220 × 24 network nodes), with each node connected to more than 500 others, could be performed in under 1 min for simulation durations of 20 cell time constants.
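The speed-up from Fourier-domain convolution can be illustrated with NumPy rather than MATLAB. This is a sketch of the general technique, not the authors' implementation: the array sizes, the random kernel, and the periodic (wrap-around) boundary handling are all assumptions, and the text does not say how image borders were treated.

```python
import numpy as np


def circular_convolve_fft(field, kernel):
    """N-dimensional circular convolution via the convolution theorem."""
    return np.real(np.fft.ifftn(np.fft.fftn(field) * np.fft.fftn(kernel)))


def circular_convolve_direct(field, kernel):
    """Reference implementation: sum of shifted copies of the field."""
    axes = tuple(range(field.ndim))
    out = np.zeros_like(field, dtype=float)
    for shift in np.ndindex(kernel.shape):
        # np.roll(field, shift)[n] == field[n - shift] with wrap-around.
        out += kernel[shift] * np.roll(field, shift=shift, axis=axes)
    return out


# Small stand-in for the (x, y, orientation) activity volume and a kernel of equal shape.
rng = np.random.default_rng(0)
activity = rng.random((8, 8, 4))
kernel = rng.random((8, 8, 4))

lateral_input_fft = circular_convolve_fft(activity, kernel)
lateral_input_direct = circular_convolve_direct(activity, kernel)
```

The direct version costs on the order of *N*² operations for *N* nodes, whereas the FFT version costs on the order of *N* log *N*, which is why the Fourier route pays off for a volume of 220 × 220 × 24 nodes.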

## Results

Our results are presented in two sections. The first section uses phase-plane analysis to demonstrate how changing the strength of the local interactions between the excitatory and the inhibitory network nodes can gate long-range horizontal connections without affecting the responses to local inputs. This same phase-plane analysis also demonstrates the stability of the network configurations. The second section presents a number of simulations that show how the network can gate long-range horizontal connections in a stable fashion. We show how the network responds to a large-scale naturalistic stimulus and also compare its responses with neurophysiological data from superficial layer neurons in V1 of monkeys performing a contour detection task. *SI Text* contains more mathematical details of the models including their analytical derivation and stability analysis, plus a simplified analytical model of the large-scale simulation of neural responses in the contour detection task.

### Phase-Plane Stability Analysis.

We used phase-plane analysis to explore the stability of dynamical models with specific reference to the conductance-based model with subtractive inhibition. For simplicity, we present only the analysis for the basic two-node network module. The extension to include long-range horizontal connections and the corresponding analyses of the current-based models are presented in *SI Text*.

In phase-plane analysis, the network dynamics of the local circuitry are depicted on a Cartesian plane (Fig. 2). This permits one to visualize the dynamics of two coupled differential equations at once on a grid with one axis representing the activity of the excitatory network node (*x*) and the other the activity of the inhibitory node (*y*). At each point on the grid, a small arrow is plotted, whose *x* and *y* components are the rates of change (*dx*/*dt* and *dy*/*dt*) at that grid point.

We first calculate the nullclines, the paths where the time derivatives of the network activities are zero. At potentially stable solutions to the network dynamics, both derivatives must be 0, and the two nullclines intersect. From any initial network state, one can find the evolution of the network dynamics by integrating the equations with small time steps, which is equivalent graphically to moving along the directions of the small arrows. For example, when starting from the initial state depicted by the black circles in Fig. 2 *Aa* and *Ba*, the network state follows the vector field in the phase plane (solid blue lines) until it reaches the stable equilibrium point where the two nullclines intersect.

The nullclines can be calculated by setting the derivatives to zero. For the excitatory nullcline, the result is as follows:

To calculate the shape of this nullcline when both the excitatory and inhibitory node activities are above threshold (*x* > *T*_{x}, *y* > *T*_{y}), we can substitute the threshold functions *f*_{x}(*x*) and *f*_{y}(*y*) with their definitions (Eqs. **3** and **4**):

For the inhibitory nullcline, the result is as follows:

To find the shape above threshold, we replace *f*_{x}(*x*) with its definition from Eq. **3**:

Thus, the inhibitory nullcline is of the general form *bx*/(1 + *x*) and has a horizontal asymptote *y* = *v*_{e} as *x* grows to infinity.
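The saturating shape of the inhibitory nullcline can be checked numerically. The sketch assumes that the inhibitory node obeys *τ*_{y} d*y*/d*t* = −*y* + (*g*_{yx} *f*_{x}(*x*) + *g*_{i})(*v*_{e} − *y*), a conductance-based form consistent with the definitions in Network Node Dynamics; setting the derivative to zero and solving for *y* gives the expression in the code. The value of `g_i` and the function name are hypothetical, while *g*_{yx} = 0.225 is taken from the text.

```python
V_E = 14.0 / 3.0   # normalized excitatory reversal potential
T_X = 1.0          # excitatory threshold
M_X = 1.0          # excitatory response-function slope


def inhibitory_nullcline(x, g_yx=0.225, g_i=0.27):
    """y at which dy/dt = 0 for the assumed conductance-based inhibitory equation."""
    drive = g_yx * M_X * max(0.0, x - T_X) + g_i
    return V_E * drive / (1.0 + drive)


xs = [1.0, 2.0, 5.0, 10.0, 100.0, 1.0e6]
ys = [inhibitory_nullcline(x) for x in xs]
```

The values rise monotonically with *x* and approach *v*_{e} = 14/3 from below, so the inhibitory activity cannot grow without bound however strongly the excitatory node is driven.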

In the absence of long-range horizontal connections, it is possible analytically to derive the steady-state solution of the network equations by letting the time constant *τ*_{y} of the inhibitory node approach 0, i.e., by having the inhibitory current act instantaneously. This has no influence on the position of the nullcline intersection but only affects the dynamics of approach to the steady state. However, this approach leads to complex quadratic equations for the values of *x* and *y* at equilibrium. More insight can be obtained by following the stepwise procedure outlined here. We begin by assuming *f*_{y}(*y*) is a simple multiple *γ* of *f*_{x}(*x*) as in Eq. **14**:

*f*_{y}(*y*) = *γ* *f*_{x}(*x*).

The biological rationale for this assumption is that the firing rates of excitatory and inhibitory neurons in our experiments were very similar in a post hoc analysis (compare the dashed lines in Fig. 4 *C1* and *C2*; the firing rates are very similar, i.e., *γ* ∼ 1). Substituting the response rate functions from Eqs. **3** and **4**, it follows that, above threshold, the following linear relationship must be fulfilled by the steady-state solution to the network. We call this the response rate relationship:

*m*_{y}(*y* − *T*_{y}) = *γ* *m*_{x}(*x* − *T*_{x}).

The excitatory and inhibitory nullclines must intersect on this line in the absence of long-range horizontal input.

To specify where along this line the intersection should lie, consider that ideally the intersection should not change for any value of the local recurrency, so we can let the local recurrency go to zero, i.e., *g*_{xx} → 0 (and thus *g*_{xy} → 0; Eq. **25**). This determines what we call the feedback set point:

As one of our main goals is to be able to change the strength of the interaction between the excitatory and the inhibitory network node (i.e., the feedback gain) without changing their responses to bottom-up inputs only, we seek a stable equilibrium point where the nullcline intersection does not change its position in the phase plane as the bottom-up inputs change. This is the case when these inputs are matched to the feedback set point, and the results of this matching can be seen in the simulations of Fig. 3 *D* and *F*: The feedback gain only changes the strength of the long-range horizontal interactions, without having any effect on the bottom-up inputs.

By using Eq. **18** for the excitatory nullcline above firing threshold, we get the following relationship for the intersection, specifying the activity of the excitatory network node *x* for a given input strength *g*_{e} that should be fulfilled for a feedback set point-matched input:

We can then calculate the intersection between the nullclines and show that it will not change when FB_{set point} is changed. In addition, the inhibitory nullcline of this model can be approximated by the linear response rate relationship (Eq. **23**; compare straight dashed red line with the inhibitory nullcline, solid red line, in Fig. 2 *Aa* and *Ba*). This simplifies the expression for the relationship between *g*_{xy} and *g*_{xx} (the feedback set point) to the following equation, which was also used for the simulation in Fig. 3*F*:

The corresponding calculations and stability analyses for the current-based models and the extension to networks with horizontal connections are given in *SI Text*. In summary, we show that, with a few simple assumptions, the presence of horizontal connections does not change the intrinsic stability of the basic two-node module.

### Simulations.

Having established the stability of the model, we performed simulations that demonstrated the behavior of the network in tests of nonlinear contour integration, the time course of V1 responses to complex stimuli, and robustness to noise. We first examined the effectiveness of the network in responding to a naturalistic stimulus, namely, a commonly used test image in computer science, that of a young woman known as Lena wearing a hat (Fig. 3*A1*). The simulations were performed both using the current-based model with divisive inhibition (Fig. 3 *B–D*) and using the conductance-based model with subtractive inhibition (Fig. 3 *E* and *F*). The point of Fig. 3 is not to propose that one type of processing result is preferred over another, but rather to show how changing the FB_{gain} changes the amount of contour integration and noise suppression, to show that the network leads to stable results with a complex naturalistic image, and to give an understanding of the role of the FB_{set point}, whether it is fixed or matched to the input strength. We expect that this demonstration complements the more formal mathematical formulations.

The network itself never directly received the raw image (Fig. 3*A1*) as input; instead, it received the results of calculating the local orientation strength with the help of a quadrature pair of steerable filters (21), mimicking the processing of the visual inputs by retina, lateral geniculate nucleus (LGN), and complex cells in V1. The input strength at the orientation that was strongest at each location in the image is shown in Fig. 3*A2*. When this input was presented to the network with all local and long-range horizontal connections inactive, the network output was that shown in Fig. 3*A3*, which is just a thresholded version of the input.
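As a rough illustration of how a quadrature pair yields a local orientation strength, the sketch below sums the squared responses of an even and an odd filter at each candidate orientation and reports the strongest one. It substitutes a plain Gabor pair for the steerable filters of ref. 21, and the filter size, wavelength, number of orientations, and synthetic bar stimulus are all illustrative assumptions rather than the parameters used for Fig. 3*A2*.

```python
import math

def gabor_pair(x, y, theta, sigma=3.0, wavelength=6.0):
    """Even and odd Gabor values at (x, y) for a filter tuned to lines at angle theta.

    Assumption: a Gabor quadrature pair stands in for the steerable filters of ref. 21.
    """
    u = -x * math.sin(theta) + y * math.cos(theta)  # signed distance across the preferred line
    envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
    phase = 2.0 * math.pi * u / wavelength
    return envelope * math.cos(phase), envelope * math.sin(phase)

def bar_image(x, y, phi, half_width=1.0):
    """Synthetic stimulus: 1 inside a bar through the origin at angle phi, else 0."""
    return 1.0 if abs(-x * math.sin(phi) + y * math.cos(phi)) < half_width else 0.0

def orientation_energy(phi, n_orient=8, radius=7):
    """Quadrature energy (even^2 + odd^2) at the patch center for each orientation."""
    thetas = [k * math.pi / n_orient for k in range(n_orient)]
    energies = []
    for theta in thetas:
        even = odd = 0.0
        for x in range(-radius, radius + 1):
            for y in range(-radius, radius + 1):
                g_even, g_odd = gabor_pair(x, y, theta)
                pixel = bar_image(x, y, phi)
                even += g_even * pixel
                odd += g_odd * pixel
        energies.append(even ** 2 + odd ** 2)
    return thetas, energies

# A bar at 45 degrees; the strongest orientation channel is taken as the local input.
thetas, energies = orientation_energy(phi=math.pi / 4)
best = max(range(len(energies)), key=energies.__getitem__)
print("strongest orientation (deg):", math.degrees(thetas[best]))
```

Squaring and summing the two phases makes the measure insensitive to the exact position of the bar within the filter, which is the property usually attributed to complex cells.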

Each of the next three rows (Fig. 3 *B–D*) shows a different network configuration, and the columns show the response of that configuration at a weak and at a strong FB_{gain}.

In the configuration without long-range horizontal connections (i.e., *L*_{xx} = *L*_{yx} = 0; Fig. 3*B*), as the FB_{gain} was increased, weak inputs were amplified more and some of the strongest ones were attenuated, but without taking any information from the other orientation columns into account. This is equivalent to a nonlinear scaling of the inputs. Not too surprisingly, the effect of this transformation is not very useful; it is just a nonlinear compression of the local orientation contrast.
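The nonlinear scaling described here can be made concrete with a toy two-node module. The sketch assumes a steady-state relation x = (i_e + J_xx·x) / (1 + J_xy·y) with the inhibitory node mirroring the excitatory one (y = x), FB_{gain} = J_xx, and FB_{set point} = J_xx/J_xy. This functional form, the omission of firing thresholds, and the set point value of 2 are assumptions made for illustration; they are not the equations of *Materials and Methods*.

```python
import math

def steady_state(i_e, fb_gain, fb_set_point):
    """Equilibrium of the assumed toy module without long-range connections.

    Setting y = x in x = (i_e + J_xx*x) / (1 + J_xy*y) gives the quadratic
    J_xy*x**2 + (1 - J_xx)*x - i_e = 0; the positive root is returned.
    """
    if fb_gain == 0.0:
        return i_e  # no local recurrence: output equals input
    j_xy = fb_gain / fb_set_point
    b = 1.0 - fb_gain
    return (-b + math.sqrt(b * b + 4.0 * j_xy * i_e)) / (2.0 * j_xy)

SET_POINT = 2.0  # assumed constant set point
for gain in (1.0, 3.0):
    outputs = [round(steady_state(i, gain, SET_POINT), 3) for i in (0.5, 2.0, 4.0)]
    print("FB_gain =", gain, "outputs for inputs 0.5, 2, 4:", outputs)
```

In this toy, inputs below the set point are amplified, inputs above it are attenuated, and both effects grow with the gain, which is the compression of local orientation contrast described above.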

In the network configuration with a constant FB_{set point} = 2 and long-range connectivity restored (Fig. 3*C*), the network circuit is the one shown in Fig. 1*B* without the gray connections. As a result, the gain of the local circuit is higher for weak inputs, leading to both an amplification of weak bottom-up inputs and a more pronounced effect of long-range horizontal connections upon the response to weak contours. At the highest FB_{gain} level (Fig. 3*C2*), one can see both how strong the weakest contours have become because of the nonlinear contour integration, and how much they have been sharpened at the same time, almost to the point of a sketch. In addition, the stronger ability of the network to “squash” inputs at higher FB_{gain} prevented runaway excitation of the output to smooth contours.

In the last network configuration (Fig. 3*D*), the FB_{set point} was matched to the input strength at every location in a bottom-up manner by the gray connections in Fig. 1*B*. That is, the FB_{set point} = *J*_{xx}/*J*_{xy} was made linearly proportional to the input *i*_{e}, and thus for a given FB_{gain} = *J*_{xx}, the effective I-E connection strength *J*_{xy} was assumed to be adjusted inversely to the input strength *i*_{e} by an unspecified mechanism. A biologically plausible implementation could be inhibition of the inhibitory network nodes in a feedforward manner with added divisive inhibition. The result of this set point matching was that increasing the FB_{gain} only affected contextual interactions mediated by long-range horizontal connections without influencing bottom-up inputs, because the intersection of the nullclines is independent of the FB_{gain} as explained above. At the highest FB_{gain} level (Fig. 3*D2*), one can see that, although the contours were “sharpened” and “cleaned up” by the nonlinear contour integration, the full dynamic range of the input was not lost as in Fig. 3*C2*. This represents a kind of exclusive gating of the long-range horizontal connections and contextual interactions.

However, if contextual stimuli of multiple orientations and positions (a “noisy” background) send input to an orientation column via long-range horizontal connections, the response of the orientation column is suppressed because with E-E and E-I inputs of similar strength, the higher response-rate slope for inhibitory neurons (steeper *f–I* curve) makes E-I inputs more effective (twice as effective with our choice of parameters; compare stimuli with and without the “noisy” background in Fig. 4 *B1* and *B2*). The higher the FB_{gain}, the stronger this suppression. Conversely, a contour running through a noisy background is comparatively strengthened because of a number of factors: an orientation column representing a contour element receives less E-I than E-E input due to the geometry of the connectivity; there is reduced E-I input from the surround (suppression is stronger at higher FB_{gain}); and there is recurrent facilitation due to E-E connections between collinear contour elements. Additionally, increasing the FB_{gain} increases the competition between orientation columns inhibiting one another. All these factors together can account for the observed contour enhancement and noise suppression.
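The effect of set point matching in Fig. 3*D* can be checked numerically on a toy two-node module. The sketch integrates assumed dynamics dx/dt = −x + (i_e + J_xx·x)/(1 + J_xy·y) and dy/dt = −y + x, in units of the cell time constant, with FB_{set point} = J_xx/J_xy set equal to the input i_e. The equations, the equal time constants, and the Euler step are illustrative assumptions, not the model of *Materials and Methods*, and long-range connections are left out.

```python
def simulate(i_e, fb_gain, fb_set_point, dt=0.01, t_end=50.0):
    """Forward-Euler integration of the assumed toy E-I module, starting from rest."""
    j_xy = fb_gain / fb_set_point      # I-E strength implied by gain and set point
    x = y = 0.0
    for _ in range(int(t_end / dt)):
        drive = (i_e + fb_gain * x) / (1.0 + j_xy * y)  # divisive inhibition
        dx = -x + max(drive, 0.0)      # excitatory node (rectified)
        dy = -y + x                    # inhibitory node follows x (J_yx = 1/m_y)
        x += dt * dx
        y += dt * dy
    return x, y

def matched_response(i_e, fb_gain):
    """Set point matched to the input, as assumed for the Fig. 3D configuration."""
    return simulate(i_e, fb_gain, fb_set_point=i_e)

for i_e in (0.5, 2.0, 4.0):
    finals = [round(matched_response(i_e, g)[0], 4) for g in (0.5, 1.0, 3.0)]
    print("input", i_e, "-> excitatory output at gains 0.5, 1, 3:", finals)
```

With the set point matched to the input, the equilibrium of this toy stays at x = i_e for every gain tested, so a change in FB_{gain} would alter only how strongly additional lateral input moves the node away from that equilibrium.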

Fig. 3 *E* and *F* shows the corresponding results for Fig. 3 *C* and *D*, but using the conductance-based model with subtractive inhibition. Although the relationship between the local E-E and E-I effective connection strengths is more complicated than with the current-based model as described above, the same definition for the feedback gain could be used and the processing of the inputs as a function of the feedback gain was remarkably similar. The main difference is that the stimulus processing with the chosen parameters was a bit weaker than that of the current-based model with divisive inhibition (compare Fig. 3 *E* and *F* with Fig. 3 *C* and *D*). This is because we chose the connection strengths to have the same effective magnitude at the node's activity thresholds (*Materials and Methods*), but above threshold, the driving force, i.e., the difference between the reversal potential and the activity, becomes smaller and the effective connection strength, weaker. It is possible to increase the value of the conductances a bit to take this effect into account, but we chose not to do so to make the results more directly comparable.

The next simulation (Fig. 4 *A–C*) reproduces the results of contour integration and perceptual learning of contour detection (2, 3). In brief, monkeys were trained to detect collinear contours consisting of one through nine bars embedded in a patch of randomly oriented bars, while electrophysiological recordings were made from neurons in the superficial layers of V1. Before learning, the contours did not elicit strong neural responses, but after learning there was a strong dependency of the response on the length of the contour. The size of the response increase depended on the actual task being performed (2, 3).

These effects were simulated by proposing an envelope of task-dependent modulation mediated by feedback connections to V1. Here, this envelope had the shape of a Gaussian distribution with the same SD (same spatial extent) as the long-range horizontal connections convolved with the stimulus, but without orientation specificity (ref. 12, but see ref. 26). This envelope is hypothesized to change the FB_{gain} in our model from the default value of 1 to a peak value of ∼3 at the center of the location where the contour would appear. This is shown in Fig. 4*A1* together with the bottom-up input derived from an embedded nine-bar stimulus. The same attentional envelope was used for all simulations because the animal in the actual experiment did not know how many collinear elements the stimulus on the next trial would contain.
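A minimal sketch of such an attentional envelope follows. It convolves an isotropic Gaussian with the positions of nine collinear elements, normalizes the result to its peak, and maps it onto FB_{gain} values between 1 and 3. The grid size, element spacing, Gaussian SD, and peak normalization are assumptions chosen for illustration, and `fb_gain_map` is a hypothetical helper name.

```python
import math

def fb_gain_map(contour, size, sigma, base=1.0, peak=3.0):
    """FB_gain on a size x size grid: Gaussian convolved with the contour elements.

    The envelope carries no orientation information; it depends only on distance
    to the expected contour elements. Returned as rows indexed [y][x].
    """
    raw = [[sum(math.exp(-((cx - x) ** 2 + (cy - y) ** 2) / (2.0 * sigma ** 2))
                for cx, cy in contour)
            for x in range(size)]
           for y in range(size)]
    top = max(max(row) for row in raw)
    return [[base + (peak - base) * value / top for value in row] for row in raw]

# Nine collinear elements centered in a 41 x 41 grid (assumed layout).
contour = [(x, 20) for x in range(16, 25)]
gain = fb_gain_map(contour, size=41, sigma=4.0)
print("gain at contour center:", gain[20][20])
print("gain at grid corner:", gain[0][0])
```

Because the same map is used regardless of how many collinear elements are actually shown, the envelope reflects only where the contour is expected, in line with the animal not knowing the contour length in advance.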

Fig. 4*A2* shows the network output to an embedded nine-bar stimulus averaged over 50 cell time constants. In this display, the thickness of the oriented bars is proportional to the average output. The pop-out of the global contour was very obvious due to the high FB_{gain} at its location. Outside of the attentional spotlight, there were only relatively small but noticeable differences between elements forming smooth contours and their surround bars.

Next, Fig. 4*B* shows the time course in response to a single bar without background (dashed line), and to one, five, and nine bars embedded in a random background (light, darker, and darkest line), in green in Fig. 4*B1* for the center excitatory network node, and in red in Fig. 4*B2* for the center divisive-inhibitory network node.

This simulation was in close agreement with the time course of 24 putative excitatory neurons shown in Fig. 4*C1* and 6 putative inhibitory neurons shown in Fig. 4*C2* that were identified in a post hoc analysis of the spike waveforms recorded in the monkey experiments. It is striking how similar were the time courses of both putative excitatory and inhibitory neurons. We can fully account for this in our model by choosing the E-I connection strengths as the inverse of the slope of the response function of the divisive-inhibitory neuron, i.e., *J*_{yx} = 1/*m*_{y}. Of particular note is the delay in the contour-related responses, which is seen in both the electrophysiological recordings and in the model. This delay of the contour-related response is mirrored by a delay of the suppression of the bars surrounding the contour, which results from the competition between contour elements and background: Once the contour starts to pop out a bit more, it tends to suppress the background more, which in turn inhibits the contour less and makes it pop out more (a random background has a suppressive effect), etc., until equilibrium is reached. This recurrent process is much slower than the initial suppression of the stimulus by the random background because it involves small differences that grow over time rather than the bulk inhibition by the contextual background that leads to the initial suppression.

Fig. 4 *D–F* demonstrate the robustness of the current-based model with divisive inhibition in the presence of noise. Although the model was recurrent, it did not exhibit strong oscillations, and even at the highest noise level simulated [effectively eight times that simulated in a comparable model (7)], it failed gracefully: it did not generate artificial contours, but rather lost a bit more detail (e.g., compare the weak contours in the top quarter of the hat in Fig. 4 *D1*, *E1*, and *F1*). Moreover, the onset response was much less susceptible to noise in Fig. 4 *E2* and *F2*. This agreed with experimental data (compare the onset response of the neuronal data in Fig. 4 *C1* and *C2* with the later response) and would be useful for rapid decision making essential for survival.

## Discussion

Recent physiological experiments have suggested that contextual interactions are mediated by long-range horizontal connections, and that they are under top-down control. It has been shown that contextual interactions can depend on the location of spatial attention (27) and can change when different tasks are performed with the same visual display (28). The contextual effect is subject to perceptual learning (3, 29). The modulatory effect can be so strong that the number of spikes that a single V1 neuron fires during a single trial can predict the performance of the animal on the task (2). These physiological findings suggest that an interaction between top-down feedback projections and the horizontal connections in V1 plays an important role in the selective integration of visual information. However, it is unknown how top-down control can actually gate the lateral interactions, and current physiological approaches have difficulties probing this gating mechanism.

The network presented in this paper proposes a simple mechanism by which top-down influences can provide gain control and allow long-range horizontal connections to be modulated via a feedback gain parameter that controls the amount of local “self-recurrency” in the network. Phase-plane analysis demonstrates the superior stability properties of the circuit and shows how the gain of the circuit can be changed without affecting overall stability. Large-scale simulations illustrate how gain control in a system with long-range horizontal connections can lead to nonlinear contour integration and sharpening of edges. Our model reproduces quantitatively the time course of neuronal responses in V1 to complex stimuli and accounts for the immediate suppression by surround stimuli, the delayed facilitatory component associated with contour saliency, and its dependence on attention. Last, we show that the network behaves very stably in the presence of noise.

### Intrinsic Local Connectivity and Long-Range Horizontal Connections.

Recurrently connected excitatory and inhibitory neurons make up an essential “basic circuit” of a kind proposed in many brain circuit models (30). In experiments studying contour integration, nearly all cells in V1 superficial layers show facilitation for more salient embedded contours after learning (2, 3). To preserve this relationship and have excitatory and inhibitory nodes behave similarly even when lateral influences strongly excite the inhibitory nodes, it is important that both excitatory and inhibitory nullclines have a positive slope. This can be achieved by connecting the excitatory network node recurrently to itself (31, 32). This type of connectivity is an essential component of our model with the following extension: The connection strengths *g*_{xx} and *g*_{xy} (or *J*_{xx} and *J*_{xy} in the current-based model) are directly under control of top-down influences, allowing such influences to control the gain of the circuit so that contour integration and pop-out are under control of the perceptual task.

### Top-Down Influences.

Previous work has shown that neurons in the superficial layers of V1 can adjust their properties to encode more information pertaining to a task when an animal is performing that specific task compared with an irrelevant one (2, 28). In our model, this is simulated by increasing the FB_{gain} at the spatial location where the stimulus is expected (Fig. 4*A*). This view is compatible with either spatial- or feature-based attention (33, 34), but also with object-based attention (35), insofar as the embedded contour is viewed as an object that gives rise to an attentional envelope via feedback connections. Indeed, differentiating between these modes of attention may not be necessary as long as they all lead to a similar attentional envelope.

Biophysically realistic implementation of the FB_{gain} could be achieved in at least two different ways. One possibility is that active dendritic currents could serve as a source of synaptic gain control (36). Voltage-activated sodium channels have been shown to strongly amplify the postsynaptic potentials of long-range horizontal input in V1 at elevated membrane potentials (37). It seems reasonable that descending axons conveying the feedback gain parameter could change the local dendritic membrane potential and thus the effective strength of self-recurrency. A second proposal, relevant to models with divisive inhibition, is related to a potential biophysical implementation of that type of inhibition (38, 39). Changing the gain of the excitatory “foreground” neurons by varying the firing rate of background inputs would alter the FB_{gain} because both *J*_{xx} and *J*_{xy} would be equally affected. Alternatively, one could change the number of neurons contributing to the recurrent loop, by varying the proportion of neurons in “off” and “on” states.

One extension of our model that still has to be explored is to allow feedback to gate different sets of long-range horizontal connections for different tasks, and to allow perceptual learning for the set that is engaged in the task. This level of specificity of the top-down influences is suggested by experiments in which different perceptual tasks, exercised on the same visual stimulus, can selectively gate different contextual influences (28) and top-down anticipatory influences can even dynamically modify neuronal selectivity for contour shapes (40).

### Timing of Context-Related Responses.

Placing stimuli outside of the classical RF often leads to a suppression of neural responses in V1 (41–46). Moreover, this contextual inhibition is very quick, often lowering the peak of the initial onset response (18, 41, 43–45, 47). However, a facilitatory component delayed by 100–150 ms has been seen in recent studies of the neuronal correlates of the pop-out of embedded contours in a complex background in V1 (2, 3). Previous studies investigating the influence of higher-order, contextually dependent properties have shown similar delays (35, 44, 48–55).

These results have given rise to a debate on the origin of the delayed facilitatory component: It has been argued that the delay must be due to feedback (50–52, 55), although others have shown that at least some feedback may actually be very fast, already affecting the first 10 ms of the response (56), and is associated with an early inhibitory component (41). A different view is that the relatively slow conduction velocity of the unmyelinated axons of the long-range horizontal connections may be responsible for the delay (57). However, delays associated with slow conduction times seem inadequate to account for the considerable delay associated with contextual influences in V1. Rather, we have proposed that delays may result from the time required for the network to move from one stable state to another (58), and this idea is supported by our model.

### Relationship to Previous Models.

Our model focuses on the question of how top-down feedback could gate long-range horizontal interactions in a circuit that can be readily mapped onto the cortical anatomy and physiology of the superficial layers of V1. The aim was to create as simple a model as possible that is able to reproduce the time course of electrophysiological responses and the psychophysical performance of awake behaving primates, and to use phase-plane analysis to get an analytical understanding of the network’s responses. Last, we wanted to demonstrate that the model scales up from a “toy problem” to natural stimuli.

Although earlier models simulate the phenomenon of contour saliency and incorporate horizontal connections, they do not address the idea that contour integration, which may be beneficial to some tasks but detrimental to others, involves gating these connections by feedback. This needs to be achieved in a manner that is compatible with the known orientation dependence of the contour integration process. Thus, directly changing the strength of long-range connections with non–orientation-selective feedback connections (12) seems an unlikely solution (and difficult to implement in a biophysically realistic way without axo-axonal synapses).

Many components of our model are also central to many other well-established models of V1: a network of coupled excitatory and inhibitory node pairs with similar-shaped long-range horizontal connection kernels (7, 19, 20, 59) provides the most complete explanation of pop-out and texture segmentation in V1; an inhibition-stabilized self-excitatory network has been proposed to explain data from intracellular recordings in surround-inhibition experiments (32); and divisive inhibition has been used to explain many of the properties of the classical RF (60, 61). A recent model (5) used multiplicative kernels derived from image statistics to account for horizontal cortical interactions in the detection of closed contours, but this model was not cell-based and did not allow for top-down response modulation. To our knowledge, no other model can gate long-range horizontal connections exclusively with a simple network mechanism based on top-down feedback by changing the local gain while allowing for stable responses despite strong recurrent connections over a large range of feedback gain.

## Acknowledgments

We thank J. McManus for categorizing excitatory and inhibitory neurons based on the action potential waveform. This work was supported by the Neurosciences Research Foundation (V.P.), National Eye Institute Grant EY007968 (to C.D.G.), and The James S. McDonnell Foundation.

## Footnotes

^{1}To whom correspondence should be addressed. E-mail: gilbert{at}rockefeller.edu.

Author contributions: V.P., W.L., G.N.R., and C.D.G. designed research; V.P. and G.N.R. performed research; V.P., W.L., G.N.R., and C.D.G. analyzed data; and V.P., W.L., G.N.R., and C.D.G. wrote the paper.

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1317019110/-/DCSupplemental.

## References

- (4) Ernst UA, Mandon S, Pawelzik KR, Kreiter AK
- (8) Ullman S
- (9) VanRullen R, Delorme A, Thorpe SJ
- (13) Sigman M, Cecchi GA, Gilbert CD, Magnasco MO
- (18) Kapadia MK, Westheimer G, Gilbert CD
- (23) Koch C
- (25) Nowak LG, Azouz R, Sanchez-Vives MV, Gray CM, McCormick DA
- (26) Angelucci A, et al.
- (30) Shepherd GM
- (31) Tsodyks MV, Skaggs WE, Sejnowski TJ, McNaughton BL
- (33) McAdams CJ, Maunsell JH
- (36) Segev I, London M (2003) Dendritic processing. *The Handbook of Brain Theory and Neural Networks*, ed Arbib MA (The MIT Press, Cambridge, MA), 2nd Ed, pp 324–332.
- (38) Ayaz A, Chance FS
- (40) McManus JN, Li W, Gilbert CD
- (41) Bair W, Cavanaugh JR, Movshon JA
- (42) Bishop PO, Coombs JS, Henry GH
- (43) Knierim JJ, van Essen DC
- (44) Li W, Thier P, Wehrhahn C
- (47) Kapadia MK, Westheimer G, Gilbert CD
- (49) Kinoshita M, Komatsu H
- (53) Rossi AF, Desimone R, Ungerleider LG
- (55) Zipser K, Lamme VAF, Schiller PH
- (56) Hupé JM, et al.
- (57) Bringuier V, Chavane F, Glaeser L, Frégnac Y
- (60) Carandini M, Heeger DJ
