Kashtan and Alon. 10.1073/pnas.0503610102.
Fig. 6. Electronic circuit genome description. The binary genome is composed of n + 1 genes: n genes that code for gates (here n = 12) and a single gene that codes for the output. In the experiment the gate-type field was not used (we used only one gate type, a NAND gate). Input address bit field was 4 bits (Table 1).
Fig. 7. Neural network genome description. The genome is composed of n = 15 genes, each codes for a neuron. We used r = 4 layers of neurons arranged in layers as follows: L1 = 8, L2 = 4, L3 = 2, L4 = 1. In our setting the activation function field was 0 bits length, as we evolved networks with a single activation function (integrate and fire). Neurons are arranged in the genome according to their layer order. A gene length varies according to the layer. The gene length corresponds to the number of inputs and to the input addresses space required for the specific layer (Table 2).
Fig. 8. A typical nonnormalized significance profile of circuits evolved under varying nonmodular random goals. (a) Significance of all three-node subgraphs compared with randomized networks. Mean Z-score ± SE is shown for 40 networks. Red circles indicate network evolved under randomly varying goals (condition d), Black squares indicate corresponding random networks. (b) Significance of four-node subgraphs. The 10 subgraphs with highest absolute Z-scores were selected.
Fig. 9. Comparison of the Z-score profiles of different subgraphs between modularly varying goal (MVG) networks and control modular networks. Red circles indicate networks evolved under MVGs. Black squares indicate control modular networks evolved to have the same modularity measure as the MVG networks but without an information processing goal. (a) Electronic circuits. Significance of three- and four-node subgraphs. The control networks were evolved with a goal of minimizing (Q - Qreal)2, where Qreal = 0.49 (the mean Qreal measure of MVG circuits). (b) Neural networks. Significance of four-node subgraphs. The control networks were evolved with a goal of minimizing (Q - Qreal)2, where Qreal = 0.45 (the mean Qreal measure of MVG neural networks).
Table 1. Electronic circuits genotype–phenotype mapping
|
Address code |
Physical address |
Address code |
Physical address |
|
0000 |
Input 1 |
1000 |
Gate 5 |
|
0001 |
Input 2 |
1001 |
Gate 6 |
|
0010 |
Input 3 |
1010 |
Gate 7 |
|
0011 |
Input 4 |
1011 |
Gate 8 |
|
0100 |
Gate 1 |
1100 |
Gate 9 |
|
0101 |
Gate 2 |
1101 |
Gate 10 |
|
0110 |
Gate 3 |
1110 |
Gate 11 |
|
0111 |
Gate 4 |
1111 |
Gate 12 |
The address code describes an input address field of a gene. Physical address describes the physical wiring.
Table 2. Neural networks genotype–phenotype mapping
|
Address code first layer |
Physical address |
Address code second layer |
Physical address |
|
000 |
Retina – pixel 1 |
000 |
Neuron 1 layer 1 |
|
001 |
Retina – pixel 2 |
001 |
Neuron 2 layer 1 |
|
010 |
Retina – pixel 3 |
010 |
Neuron 3 layer 1 |
|
011 |
Retina – pixel 4 |
011 |
Neuron 4 layer 1 |
|
100 |
Retina – pixel 5 |
100 |
Neuron 5 layer 1 |
|
101 |
Retina – pixel 6 |
101 |
Neuron 6 layer 1 |
|
110 |
Retina – pixel 7 |
110 |
Neuron 7 layer 1 |
|
111 |
Retina – pixel 8 |
111 |
Neuron 8 layer 1 |
|
Address code third layer |
Physical address |
Address code fourth layer |
Physical address |
|
00 |
Neuron 1 layer 2 |
0 |
Neuron 1 layer 3 |
|
01 |
Neuron 2 layer 2 |
1 |
Neuron 2 layer 3 |
|
10 |
Neuron 3 layer 2 |
||
|
11 |
Neuron 4 layer 2 |
||
|
Threshold code first and second layers |
Neuron threshold |
Threshold code third layer |
Neuron threshold |
|
000 |
4 |
00 |
2 |
|
001 |
3 |
01 |
1 |
|
010 |
1 |
10 |
-1 |
|
011 |
2 |
11 |
0 |
|
100 |
-1 |
Threshold code fourth layer |
Neuron threshold |
|
101 |
-2 |
||
|
110 |
0 |
00 |
1 |
|
111 |
-3 |
01 |
0 |
|
Weights code |
Connection weight |
||
|
0 |
-1 |
||
|
1 |
1 |
An address code represents a gene input address field. Physical address describes the physical wiring of the neurons inputs. A threshold code represents a gene threshold field. Note that the threshold code mapping was set such that a single-bit mutation increase or decrease the threshold by one (a Gray code).
Table 3. Modularity measurements of networks evolved under fixed goal and modularly varying goals (MVG)
|
Model system |
Conditions |
Qreal ± SE |
Qrand ± SE |
Qmax ± SE |
Qm ± SE |
|
Electronic circuits |
Fixed goal G1,G2 |
0.38 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
0.12 ± 0.02 |
|
Electronic circuits |
MVG G1 and G2 |
0.49 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
0.54 ± 0.02 |
|
Neural networks |
Fixed goal G1,G2 |
0.40 ± 0.01 |
0.36 ± 0.00 |
0.62 ± 0.01 |
0.15 ± 0.02 |
|
Neural networks |
MVG G1 and G2 |
0.45 ± 0.00 |
0.36 ± 0.00 |
0.62 ± 0.01 |
0.35 ± 0.02 |
Table 4. Modularity measurements of circuits evolved under varying nonmodular random goals
|
Random varying goal |
Qreal ± SE |
Qrand ± SE |
Qmax ± SE |
Qm ± SE |
|
a |
0.33 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
-0.08 ± 0.02 |
|
b |
0.35 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
0.00 ± 0.02 |
|
c |
0.38 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
0.12 ± 0.02 |
|
d |
0.36 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
0.04 ± 0.02 |
|
e |
0.36 ± 0.01 |
0.35 ± 0.01 |
0.61 ± 0.01 |
0.04 ± 0.02 |
Each row presents Q measures of 40 networks (20 experiments) of each of the control experiments a-e.
Table 5. Modularity measure of several biological networks
|
Network |
Qreal |
Qrand ± SE |
Qmax |
Qm |
|
|
1 |
E. coli transcription network |
0.77 |
0.70 ± 0.001 |
0.85 |
0.54 |
|
2 |
Neuronal network of C. elegans (threshold = 5) |
0.66 |
0.56 ± 0.001 |
0.74 |
0.54 |
|
3 |
Signal transduction in human cells |
0.67 |
0.53 ± 0.002 |
0.77 |
0.58 |
Supporting Text
Spontaneous Evolution of Modularity and Network Motifs
Electronic Circuit Evolution Simulations
We used a simple and standard genetic algorithm (1-3) to evolve combinatorial logic circuits. The target function was a predefined Boolean function of the inputs. The fitness of each circuit was the relative number of correct outputs for all possible input combinations. The circuits were composed of a single gate type: a two-input NAND gate. We used a binary genome of a fixed size of 13 genes: 12 genes coded 12 NAND gates and one gene coded the output wiring. We used mutations only, with elite strategy (4). The population size was S = 1,000 and the elite number was L = 300. For each generation all of the circuits were evaluated, and the best L circuits passed unchanged to the next generation. The L worst circuits were removed and replaced by a new copy of all of the elite circuits (4). Then, each individual was randomly mutated (mutation probability was Pm = 0.7 per genome). Lower mutational probabilities yielded similar results but on slower time scales. Alternatively, we had to increase population size to keep the same time scales. We did not impose any restrictions on the evolved circuits. Gates could have self inputs or two identical inputs. Feedbacks were not restricted. The circuit’s inputs could be used or ignored, and the number of connection from each input was free to be decided by evolution. Before each evaluation every circuit was reset such that the output of all gates was set to zero. To impose a pressure on the size of the evolving circuits, a fitness penalty of 0.2 was given for every additional gate above a predefined number of effective gates (here we used 11 gates). We define "effective gates" as gates with a directed path between them and the output. Genome and genotype-phenotype mapping are described in Fig. 6 and Table 1, respectively.
We performed 50 experiments of fixed-goal evolution for G1 and G2 and 50 experiments for modularly varying goals (in short we term this approach MVG). Statistical analysis of the networks for motifs and modularity was based on 77 perfect networks (with a fitness equals one) evolved under fixed goal (of the 100 experiments for G1 and G2), and 128 perfect networks evolved under MVG. The MVG networks were chosen from periods in evolution where perfect adaptation occurred (Fig. 1b).
We note that although there are many parameters to set in the evolutionary simulations, the results are not sensitive to a specific choice of parameters. We got qualitatively similar results under a range of the parameters, such as larger populations and smaller mutation rates. In addition, the same simulations were run with both a crossover operator and mutations. We received qualitatively similar results.
Larger Electronic Circuit Evolution Simulations
We used the same settings for the genetic algorithm as described above with a few changes. The population size was S = 2000 and the elite number was L = 500. The genome and genotype-phenotype definitions remained qualitatively the same as for the smaller circuit. Here n = 21, Input address bit field was 5 bits instead of 4 bits. No selection for circuits’ size was introduced.
We defined four different six-inputs, three-outputs goals. Let A = (X XOR Y), B = (Z XOR W), and C = (Q XOR R). Then the goals were defined as:
G1 = (A OR B; A AND C; B AND C),
G2 = (A AND B; A OR C; B AND C),
G3 = (A AND B; A AND C; B OR C),
G4 = (A AND B; A AND C; B AND C).
We used MVG where the goals switched every 20 generations in the following order: G1, G4, G2, G4, G3, G4. This set of six epochs was repeated in a cyclic manner over the evolution.
We performed 20 experiments, each of them lasting 300,000 generations. In 16 of the experiments a perfect circuit for at least two or three of the goals was found (after 120,000 ± 80,000 generations).
Neural Networks Evolution Simulations
We used a simple standard genetic algorithm to evolve feed-forward neural networks (5). The goal was to recognize objects in the left or right side of a 4-×-2-pixels retina (Fig. 5a). The fitness of each network was the relative number of correct recognitions in the environment. The genome was of a fixed size of 15 genes: each gene coded for a neuron. The neurons were set in four layers as follows: eight neurons in the first layer, four neurons in the second layer, two in the third, and one in the fourth layer. Connections were restricted to only between neighboring layers and in a feed-forward manner (lower layer to higher layer). Connections from the retina were to the first layer only (consider each pixel in the retina as a sensory neuron connecting to the first-layer neurons). The output was defined to be the output of the single neuron at the fourth layer. Each neuron was given a maximal number of incoming connections as follows: three inputs for a neuron in the first, second, and third layers and two inputs for a neuron in the fourth layer. Each connection could be one of two weight values: -1 or 1. We used a simple threshold function: a neuron fires if the sum of its inputs multiplied by their weights exceed the neuron firing threshold. Each neuron was assigned a threshold of integer value in the range (- (2ˆ(i - 1)) , 2ˆ(i - 1) - 1), where i is the maximal number of inputs of the neuron. For example if the neuron could receive three inputs , the threshold range was [-4, -3,..,2 ,3]. Genome and genotype-phenotype mapping are described in Fig. 7 and Table 2, respectively.
We performed 24 experiments of fixed-goal evolution toward G1 = L AND R and G2 = L OR R and 16 experiments for MVG switching between G1 and G2. In each experiment four island populations evolved (independent populations that evolve in a parallel manner). We note that unlike engineering applications of neural networks generalizations or cross-validation were not implemented. To study the pure effect of the environment on the network structure, each network evolved and was evaluated on the same set of inputs (environment). Statistical analysis of the networks for motifs and modularity was based on 92 nearly perfect networks (fitness ³0.95) evolved under fixed goal (of the 24 × 4 experiments for G1 and G2) and 190 nearly-perfect networks evolved under MVG. The MVG networks were chosen from periods in evolution where nearly perfect adaptation occurred.
We used mutations and crossovers as evolutionary operators, with elite strategy. To accelerate the simulations we used a population composed of four islands (6). Use of islands is not essential to obtain the present results. Each island population size was S = 600; elite number was L = 150. In each generation all of the networks were evaluated, and the top L networks passed unchanged to the next generation. The L worst networks died and were replaced by a new copy of all of the elite networks. Crossover occurred by probability Pc = 0.5 on randomly chosen couples of networks. Then, each network was randomly mutated (mutation probability was Pm = 0.5 per genome). Every 10 generations copies of the 50 best networks from each island were added to each of the other islands, replacing eliminated networks.
Note that we did not impose a strict number of incoming connections to a neuron. If two of the inputs are identical then the effective weight of the connection is the sum of the weights of the two connections. To impose a selective pressure on the size of the evolving circuits, a penalty of 0.01 was given to a network that effectively used more than a predefined number of neurons (here we used 13 neurons).
Computing a Nonnormalized Subgraph Significance Profile
We found the network motifs in each evolved network by using previously described algorithms (7, 8). To detect network motifs, we counted the number of appearances of different subgraphs in the evolved (real) network and compared it with the number of times they appeared in randomized networks. The randomized networks preserved the single-node characteristics of the real network, such as the incoming and outgoing degree sequence and the layered structure (in the case of neuronal networks), as described in the next two sections.
To display the network motifs for each network we computed a nonnormalized subgraph significance profile as described (9). Briefly, to compare the distribution of real networks with the distribution of an ensemble of random networks we did the following: For each real network we used the MFINDER1.2 software to find the significance of all three- and four-node subgraphs (network motifs detection tool). The significance profile is composed of the Z-score for each subgraph [Z = (Nreal - Nrand) / sigma , where Nreal and Nrand are the counts in the real and average count in the randomized networks, and sigma is the SD in the randomized networks].
To evaluate the Z-score distribution in randomized networks, for every network we randomly chose a randomized network as the real network and compared it with an ensemble of 100 other randomized networks.
In Fig. 1 c and d, we present the mean Z-score ± SE of the two evolutionary scenarios: fixed-goal evolution and MVG evolution.
Representing Combinatorial Circuits as Networks for Z-Score Calculations and Motifs Detection
We considered only the effective part of a circuit (i.e., nodes with a directed path from them to the output). We generated a network in which each NAND gate is represented by a node. Additional nodes represent the four inputs and an output. Edges were set according to the gates’ input addresses. In case that the two input addresses of a gate were identical we ignored one of them (to avoid multiple edges). Self edges were ignored.
The combinatorial circuit networks were compared with random networks that conserved the number of incoming edges, outgoing edges, and mutual edges for each node.
Representing Neural Networks for Z-Score Calculations and Motifs Detection
We considered only the effective part of a network. We generated a network in which each neuron is represented by a node. Additional nodes represented the eight inputs and the output. Edges were set according to the neural network wiring. In cases where a neuron receives two or more connections from the same neuron we set only a single edge. Weights were not considered for motif detection.
The neural networks were compared with randomized networks that conserved the number of incoming edges, outgoing edges, and mutual edges for each node. In addition, to conserve the layer structure, in the randomization process, we allowed switches between edges only if they were between neurons of the same layer (the switching method of random network generation is reviewed in ref. 10).
Quantifying Modularity
Modularity measurements results are summarized in Tables 3-5.
Simulations of Evolution Under Varying Nonmodular Random Goals
We asked whether varying between random goals (as opposed to varying between modular goals) can yield modularity and network motifs. For this purpose, we performed simulations of combinatorial circuits with the same settings as described for the fixed goal and modularly varying goals, with epoch length of E = 20 generations.
We used the following goals in five different control experiments:
(a) Switching between two given, randomly chosen Boolean functions: Two goals, G1 and G2, were set each to a four-input, one-output random Boolean function. G1 and G2 were kept throughout the experiment.
(b) Switching between random Boolean goals: In every epoch the goal was set to a new four-input, one-output random Boolean function.
(c) Switching between G1 = (X XOR Y) AND (W XOR Z) and a randomly chosen four-input, one-output random Boolean function G2. G2 was kept the same throughout the experiment.
(d) Switching between G1 = (X XOR Y) AND (W XOR Z) and a four-input, one-output random Boolean function. A new G2 was chosen randomly each time the goal was switched.
(e) The goal is exposed to a small random change every epoch. First epoch was set to G1 = (X XOR Y) AND (W XOR Z). Each epoch a single bit in the function truth table was randomly flipped. This gives a slow diffusion away from the original goal, but with no apparent modularity between the goals.
We find that evolution under any of these five types of varying nonmodular random goals did not yield modular network structure (Table 4). In addition, these networks did not show significant network motifs (Fig. 8). In contrast, evolution under modularly varying goals showed spontaneous evolution of modularity and motifs (Table 3).
Comparing the Subgraph Significance Profiles of MVG Networks to Control Modular Networks
We found that networks evolved under MVG have significantly more network motifs than networks evolved under fixed-goal evolution. One possible explanation of the origin of these motifs is that the modular networks are denser than nonmodular randomized networks, because the number of subgraphs in a random network of a given size is known to increase with its density (number of edges per node) (12). The more edges a subgraph has, the stronger the increase in its average number of appearances as a function of network density.
To test this explanation, we compared the MVG networks (the real networks) to networks of the same size evolved to have the same modularity measure, but without any information processing goal (control modular networks). To construct the control modular networks, we used the same evolutionary algorithm with a goal of minimizing (Q - Qreal)2, where Qreal is the mean Q value of the MVG networks. The networks were constrained to have the same number of nodes as the MVG networks. Therefore, they represent the ensemble of networks that are as modular as the MVG networks, but which have not been selected to perform an information-processing goal.
We find that the control modular networks show no significant network motifs (Fig. 9). The absolute Z-scores of all subgraphs are smaller than |Z| = 2. In contrast, MVG networks showed significant motifs and antimotifs. This finding suggests that modularity alone cannot explain evolution of network motifs in the MVG networks.
The control networks also highlight the specific choice of motifs and antimotifs in the real networks. For example, three-node feedback loops (electronic circuits, subgraph 8) are found somewhat more often than at random in the control modular networks, as expected for dense networks. In contrast, the three-node feedback loop is an antimotif in the MVG networks, as it is in many biological networks. Selection against this pattern overcomes the tendency of dense networks to form such loops.
Modularity Measure, Qm, of Biological Networks
We applied the present modularity measure to three biological networks: transcription network of E. coli (7) that describes direct transcription interactions between 424 nodes (that represent operons) based on RegulonDB (13) and additional literature search described by Shen-Orr et al. (7), the network of synaptic connections between 252 C. elegans neurons (edges represent connections with five or more synapses) (14), and a network of signal transduction interactions in human cells between 94 proteins, made up primarily of phosphorylations in kinase cascades, compiled mainly from selected data from the STKE database studied in ref. 15. All networks showed high modularity scores as shown in Table 5.
Qrand and Qmax were estimated from networks constrained to conserve the degree sequence of the real networks (note that any other random network model could be used for this purpose). To estimate Qmax we applied a simulated annealing approach (16) that switches between edges while trying to minimize the energy E = (1 - Q)2.
1. Goldberg, D. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning (Addison-Wesley, Reading, MA).
2. Mitchell, M. (1996) An Introduction to Genetic Algorithms (MIT Press, Cambridge MA).
3. Holland, J. (1975) Adaptation in Natural and Artificial Systems (University of Michigan Press, Ann Arbor).
4. Vasconcelos, J. A., Ramirez, J. A., Takahashi, R. H. C. & Saldanha , R. R. (2001) Proc. IEEE Trans. Magnetics 37, 3414-3418.
5. Yao, X. (1992) A Review of Evolutionary Artificial Neural Networks (Commonwealth Scientific and Industrial Research Organization, Victoria, Australia).
6. Cantu-Paz, E. (1995) A Summary of Research on Parallel Genetic Algorithms (Illinois Genetic Algorithms Laboratory, University of Illinois at Urbana-Champaign, Urbana).
7. Shen-Orr, S., Milo, R., Mangan, S. & Alon, U. (2002) Nat. Genet. 31, 64-68.
8. Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D. & Alon, U. (2002) Science 298, 824-827.
9. Milo, R., Itzkovitz, S., Kashtan, N., Levitt, R., Shen-Orr, S., Ayzenshtat, I., Sheffer, M. & Alon, U. (2004) Science 303, 1538-1542.
10. Milo, R., Kashtan, N., Itzkovitz, S., Newman, M. E. J. & Alon, U. (2003) e-Print Archive, http://xxx.lanl.gov/abs/cond-mat?0312028.
12. Itzkovitz, S., Milo, R., Kashtan, N., Ziv, G. & Alon, U. (2003) Phys Rev E 68, 026127.
13. Salgado, H., Santos-Zavaleta, A., Gama-Castro, S., Millan-Zarate, D., Diaz- Peredo, E., Sanchez-Solano, F., Perez-Rueda, E., Bonavides-Martinez, C. & Collado-Vides, J. (2001) Nucleic Acids Res. 29, 72-74.
14. White, J. G., Southgate, E., Thomson, J. N. & Brenner, S. (1986) Philos. Trans. R. Soc. London B 314, 1-340.
15. Itzkovitz, S., Levitt, R., Kashtan, N., Milo, R., Itzkovitz, M. & Alon, U. (2005) Phys. Rev. E 71, 016127.
16. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. (1987) Optimization by Simulated Annealing (Morgan Kaufmann, San Francisco).