New Research In
Physical Sciences
Social Sciences
Featured Portals
Articles by Topic
Biological Sciences
Featured Portals
Articles by Topic
- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology
Searching with iterated maps
-
Edited by Margaret H. Wright, New York University, New York, NY, and approved November 9, 2006 (received for review July 26, 2006)

Abstract
In many problems that require extensive searching, the solution can be described as satisfying two competing constraints, where satisfying each independently does not pose a challenge. As an alternative to tree-based and stochastic searching, for these problems we propose using an iterated map built from the projections to the two constraint sets. Algorithms of this kind have been the method of choice in a large variety of signal-processing applications; we show here that the scope of these algorithms is surprisingly broad, with applications as diverse as protein folding and Sudoku.
Phase-retrieval algorithms provide an indispensable service to structural biology by deriving from x-ray diffraction data the shapes of proteins, viruses, etc. The task of these algorithms is to discover unique combinations of values for large sets of variables, called phase angles, that satisfy equally large sets of constraints. Because the constraints in phase retrieval are special to this discipline, algorithm development has progressed independently of mainstream optimization theory. An interesting outcome of this parallel development is the widespread use in phase retrieval of a style of searching based on iterating a mathematical mapping (1–3). Although the generality of iterative methods have been appreciated for some time in the area of signal processing (4, 5), we show here that the technique performs surprisingly well in search problems of a purely combinatorial character.
When reconstructing an object, phase-retrieval algorithms try to reproduce the object's measured Fourier magnitudes, while also satisfying certain other general characteristics (bounded size, positivity, atomicity, etc.). We can make a rough analogy with the process used by a human to unscramble an anagram, such as “ilnoostu.” The solution here lies in the intersection of two sets: set A, consisting of the 20,160 distinct permutations of the given letters, and set B, the collection of eight-letter English words (≈30,000). Most humans do not solve the anagram by exhaustively searching one of these large sets (and checking for membership in the other). The actual process is a much smaller search, perhaps a restriction to elements of A that are in some sense “close” to B.
In phase-retrieval algorithms, the notion of “closeness” is made precise by embedding the sets A and B in a Euclidean space. To reconstruct a real-valued object in a two-dimensional field of view with N pixels requires a space with N dimensions. The Fourier transform will have N/2 independent magnitudes and the same number of phases. Knowledge of just the magnitudes corresponds to a set A of dimension N/2 (geometrically, a Cartesian product of circles). In some applications, one knows that the object occupies only M < N pixels of the field of view. The linear subspace spanned by these pixels then is the second constraint set B. By limiting the size of M relative to N, a unique reconstruction is possible in principle and A ∩ B is a single image (or symmetry-related family).
A large number of problems can be formulated in the language “find an element of A ∩ B” with the understanding that finding elements of A and B independently is easy. The general purpose algorithm we describe below requires one more attribute of the sets A and B: efficient methods for computing the projections to them. Given an arbitrary element x of the Euclidean space, the projection P A(x) is the typically ‡ unique element of A that minimizes the distance to x. Together, the projections P A and P B allow one to realize the heuristic of a restricted search that maintains closeness to both sets.
The Difference Map
Naive iterative schemes, such as the alternating map x ↦ P B(P A(x)), can stagnate at near-solutions, where the distance between A and B has a local, nonzero minimum (6). A slightly more elaborate alternative was proposed by Fienup (7) in the context of image reconstruction. Known as the “hybrid input–output” algorithm, recent interest in this algorithm has grown through the development of diffraction microscopy (8–10). When generalized and expressed in geometric language (11, 12), Fienup's iterated map has the form x ↦ D(x) = x + P A(2P B(x) − x) − P B(x).
It is easy to see why this scheme, unlike the alternating map, is immune to the problem of false fixed points. Suppose the algorithm has discovered a fixed point x* = D(x*); then, because P B(x*) = P A(2P B(x*) − x*), we have found a point that lies in both sets (a solution).
Fixed points of D and solutions often are confused; these have, in general, a many-to-one relationship. As an example, take the sets A and B to be the X and Y axes of a Cartesian coordinate system in three dimensions. P A in this case maps to the nearest point on X, (x, y, z) ↦ (x, 0, 0), and similarly for P B; the resulting D is the map (x, y, z) ↦ (0, 0, z). We see that every point on the Z axis is a fixed point of D and that all of these are associated with the unique solution X ∩ Y = (0, 0, 0). In general, soluble problems always have a fixed point because every solution (element of A ∩ B) is itself a fixed point.
A second map that codes the same solutions, but in general has different fixed points, is obtained by interchanging the two projections in the formula for D. Alternatively, we can define a further generalization, called the “difference map” (12), that interpolates between these through a parameter β:
where
The expression above for D reduces to the previous form for β = 1; changing the sign of β corresponds to interchanging A and B. Fienup's original map (7) also contained a free parameter β, and it coincides with the difference map when β = 1 (12). When the task is generalized to finding a point common to more than two sets, Bauschke et al. (13) observed that a product space construction reduces this to the case above, that of a constraint set pair.
The fixed point sets of D are not useful for solving the set intersection problem unless they are attracting. This is where the distance-minimizing property of the projections, as well as the detailed forms of the maps f A and f B, are critical. It easily can be checked that locally, when the sets A and B are linear and orthogonal, the form above reduces to a projection onto the linear space of fixed points (convergence in one iteration) (12, 14). This local model also illustrates how the algorithm avoids getting stuck at a near-solution. Consider a modification of the earlier example, where X is shifted along the Z axis by an amount c (the line defined by y = 0, z = c), thereby eliminating the solution. The sequence of difference map iterates is now (x, y, z) → (0, 0, z + βc) → (0, 0, z + 2βc), continuing at a uniform rate away from the near-solution. What previously was the manifold of fixed points (Z) has become, for c ≠ 0, the attractor on which the search is carried out. Taking this local model one step further, we arrive at a global picture of the space that is searched by the algorithm. Roughly it corresponds to the cobbled-together local attractors of numerous near-solutions. It is this more limited search space, of candidates that are all nearly elements of A ∩ B, that the difference map is able to access. Quantitative support of this interpretation is given below.
Convergence (to a fixed point set) has been proven only when A and B are convex (11), a characterization that excludes most difficult problems. Our reporting of the algorithm therefore is based almost entirely on a large body of empirical evidence. This evidence supports the central hypothesis that the space being searched is a strange attractor when there is no solution (A ∩ B = Ø) and that the presence of solutions only modifies this picture by introducing isolated traps (fixed point sets that reduce the strange attractor to a simple attractor). In the strongly mixing regime, the dynamics on a strange attractor quickly becomes uncorrelated in time. Because this behavior usually applies, the sampling of the attractor, and in particular the discovery of solutions, can be modeled as a Poisson process. The number of iterations to reach a solution, sampled over random initial conditions, therefore should have a decaying exponential distribution (15). The mean of this distribution is a useful measure of complexity for hard problems because the computation of each iterate should require only easy computations (projections). Finally, our experience with the algorithm suggests that typically there is only one strange attractor, the benefit being that the outcome of the search is independent of initial conditions. We attribute the absence of multiple attractors, including short cycles, to the strong mixing generated by the sharp decisions made when projecting to highly nonconvex sets (nearest integer, etc.).
One of the simplest applications of the algorithm is finding solutions to linear systems of diophantine equations. As an example, consider the three equations in four unknowns: x 1 + x 2 = 1, x 1 + x 3 = 1, and x 2 + x 4 = 2. The solution is unique if we restrict the values of the variables to be 0 or 1 (binary). Given positive integers m and n, an interesting ensemble of random instances is defined by the matrix equation Cx = b, where C is an m × n matrix with 0/1 entries (uniformly sampled) and b is a vector of m integers. A soluble instance is generated by first selecting C and a random binary solution vector x and then evaluating the corresponding b. Given C and b, how hard is it to recover x (with the guarantee that it exists)? Clearly, there are two limits where this task is easy. When m is small, a greedy approach succeeds because few variables appear in more than one equation. At the other extreme, m = n, the matrix C is square and the solution is obtained by inverting C. When m and n are large, a sharp transition occurs for the random ensemble in the expected number of solutions, from exponentially large (m < m 0) to exponentially small (m > m 0). In the latter case, there is a probabilistic guarantee of solution uniqueness in instances known, by construction, to be soluble. For fixed n, we expect the complexity of finding a solution to be a maximum when m is near m 0. The transition point m 0 = 2n/log2(πn/4) is obtained by evaluating the density of points Cx (fixed C, random x) with the central limit theorem.
Adapting the difference map algorithm to the problem of linear diophantine equations is very straightforward. The n-dimensional space of solution vectors x defines the Euclidean space where the projections are performed. One of the constraint sets (A) is the affine space Cx = b, obtained by relaxing the binary constraint on x. The projection to this constraint is given by the matrix formula P aff(x) = (1 − C −1 C)x + C −1 b, where C −1 is the pseudoinverse matrix of C. The second constraint set (B) is the hypercube of binary vectors, and the corresponding projection, P bin, simply rounds each component of x to 0 or 1.
Once the algorithmic details of the projections have been worked out, the protocol for finding solutions is the same for any application of the difference map. First, a value of β is selected. With no prior experience the values β = ±1 are preferred because this uses only half as many projection evaluations per iteration. Second, an initial element x 0 is chosen, usually with a random number generator. Although this is the only explicit use of randomness, round-off errors in the finite precision arithmetic may lead to unpredictable machine-specific outcomes. This point reflects the strongly chaotic nature of the underlying dynamical system. The arrival at a fixed point is signaled by a small value of the displacement Δ = ‖D(x) − x‖. When this value falls below a chosen threshold, the validity of the solution is checked and, if successful, the algorithm is terminated. In the diophantine equations problem, for example, the binary vector x sol = P B ○ f A(x) would be confirmed as a solution of Cx = b.
To demonstrate the solution of diophantine equations, we generated random solvable instances with n = 80 and various values of m. All instances attempted were solved with the difference map parameter β = −0.6, an empirically determined optimum. § Fig. 1 a shows the evolution of Δ for overconstrained (m = 36) and underconstrained (m = 13) instances. The mean fluctuating value of Δ has the expected variation with m, because this diagnostic measures directly the currently achieved distance between projections on the two constraint sets. Phase retrieval normally is carried out in the overconstrained regime because image/molecule reconstructions are uninteresting unless there is some guarantee of uniqueness. We see from Fig. 1 a that, in this case especially, the arrival at the fixed point is an unambiguous event. The distribution of run times (iterations), plotted in Fig. 1 b for a particular instance, has the exponential form expected of a Poisson process. One of the more interesting outcomes of this study is that the variation in the mean iteration count with the parameter m, shown in Fig. 1 c, has the same behavior reported for other hard problems (16): a clear maximum at the (over/under)-constrained boundary (m ≈ m 0). Table 1 compares difference map performance with a linear programming-based solver (CPLEX; ILOG, Inc., Mountain View, CA), the leading methodology for this kind of problem.
The difference map applied to linear diophantine equations. (a) Time series of the difference map displacements Δ for overconstrained (blue) and underconstrained (red) instances. A solution was found for both instances in <1,000 iterations. (b) Exponential distribution of iteration counts for 2,000 difference map solutions of a diophantine equation instance with n = 80 and m = 27. The horizontal axis is scaled to the mean iteration count (1.3 × 106), and the curve is the exponential distribution with unit mean. (c) Complexity peak in the mean iteration counts for difference map solutions of diophantine equations with n = 80 variables. The maximum occurs near the solvability threshold of random instances: m = m 0 ≈ 27. Vertical bars correspond to standard deviations in experiments with five instances.
The difference map compared with the CPLEX solver on hard instances of the diophantine equations problem
The most direct way of assessing the size of the space being searched by the difference map is to enumerate the places visited by the algorithm in the course of trying to solve an insoluble instance. In this scenario, the iterates x ceaselessly sample a strange attractor while generating a stream of candidate solutions (binary strings of length n) given by x bin = P bin ○ f aff(x). A probability distribution p(x bin) on the binary strings is defined by the frequency that the dynamical system visits x bin. The corresponding entropy S = −Σxbin p(x bin)log2 p(x bin) measures the effective size of the search space. Finding the solution of a soluble instance with similar parameters n and m should require ≈2S iterations. We confirmed this estimate with experiments on instances of size n = 32. By choosing m = 15 > m 0 and a particular random C, the uncontrived choice b = (n/4, n/4, …) almost always gives an insoluble instance. Separate runs of the algorithm up to 105 iterations gave practically indistinguishable probability distributions on the set of <2,000 binary strings that were visited; the entropy was just below S = 9. We then generated soluble instances for the same matrix C; these were solved typically in a few hundred iterations, as expected.
Example Applications
There is a large class of problems that are formulated in terms of independently constrained “primary variables” that are linearly related to a set of “secondary variables” whose values are independently constrained as well. In the difference map scheme, one of the constraints (A) is the linear relationship between the two sets of variables; the other constraint (B) restricts the allowed values for the combined set. Below, we summarize our experiences using the difference map on a number of much studied problems that fit this description. In addition, we also discuss three problems not in this class, where the difference map can take advantage of special kinds of projections. In comparisons with other solvers, the difference map performance (mean iterations, timing) is averaged over at least 10 runs.
Graph Coloring.
This application is interesting because it shows the difference map is chaotic even in a highly symmetric setting. The task is to color the edges of a complete graph with three colors while avoiding monochromatic triangles. The minimum number of vertices n for which this is impossible defines the Ramsey number R 3(3). Analysis has shown that R 3(3) = 17, and that for n = 16 there are just two nonisomorphic colorings (17). In Fig. 2, we show the time evolution of one difference map solution for this most difficult case (n = 16). Time (iteration number) increases vertically, and color assignments to the 120 edge variables are plotted horizontally. We see that although the color updates in each iteration are global in nature, there is persistence in a large fraction of the assignments over many iterations. This problem has many symmetry equivalent solutions; the particular one found by the algorithm is set by initial conditions.
Deterministic time evolution (vertically) of edge colorings in the difference map solution of a graph coloring problem. The solution (top) was found after 2,148 iterations.
Our embedding in Euclidean space respects all of the symmetries in the problem. Associated with each of the n e graph edges is a primary variable e i, a 3-vector that symmetrically encodes a coloring when restricted to the set E = {(2, −1, −1), (−1, 2, −1), (−1, −1, 2)}. Secondary variables t j are associated with each of the n t triangles in the graph. The two sets of variables are related by linear compatibility equations, e i + e j + e k + s t l = 0, one for each triangle, and containing (by symmetry) a single free scale parameter s. Combining edge and triangle variables into a single vector z, these equations, when written C z = 0, comprise the first constraint (A) in the difference map construction. When coloring a general graph, all of the graph-specific information is contained in the n t × (n e + n t) sparse matrix C and its dense pseudoinverse C −1. The projection to the compatibility constraint, P com(z) = (1 − C −1 C)z, through its dependence on s, can be tuned to favor changes to edge or triangle variables and thereby improve performance.
The second difference map projection, P
val, imposes value constraints on the variables: each edge variable is mapped to the nearest element of E, and each triangle variable is projected to a set that excludes monochromatic triangles. From the compatibility equation, we see that valid values (colorings) for triangle variables are (0, 0, 0), and the six permutations of (3, 0, −3)/s, with (3, 3, −6)/s and its permutations excluded. We chose to project each triangle variable to the disk of radius
Through experimentation we found good performance on all solvable instances of this coloring problem for the scale factor value s = 3 and β = 1. When n is <10, the algorithm finds a coloring in few iterations, with the displacement Δ making systematic progress toward zero. For the hardest instance, n = 16, there is considerable searching and an exponentially decaying histogram of iteration counts. Even though this coloring problem is straightforward to formulate as an integer program, we find the CPLEX solver (ILOG, Inc.) is not competitive with the difference map on the harder instances (see Table 2).
The difference map compared with the CPLEX solver on finding three colorings of the edges of complete graphs that avoid monochromatic triangles
Logical Satisfiability.
The task in 3-SAT, one of the most extensively studied problems in the NP-complete complexity class, is to decide the simultaneous satisfiability of m Boolean clauses, each comprising three literals from a list of n variables.
A natural embedding in Euclidean space calls for n real numbers x associated with the Boolean variables and another m variables y for the clauses. Compatibility between the two sets of variables then is expressed by m linear equations. For example, the logical statement p 1 ⋁ p 4 ⋁ p̄ 7 = q 1 is transcribed as the linear equation x 1 + x 4 − x 7 = s y 1, where s > 0 is an arbitrary scale parameter that, for simplicity, we fix at the same value for all m clauses. The projection P com to the constraint Cx = y, where C is sparse, can be computed to sufficient accuracy in time linear in n and m by using conjugate gradient minimization. For the second constraint, we impose discrete values on the x variables: s for TRUE and −s for FALSE. These assignments, when the compatibility constraint is satisfied, imply y ∈ {−3, −1, 1, 3}. To complete the second constraint, we impose y ≥ −1, because a clause is unsatisfied only when y = −3. The corresponding projection, P val, is given by the maps x ↦ s sgn(x) and y ↦ max(−1, y).
We tested the difference map algorithm on random 3-SAT instances in the critical region, m/n = 4.2, up to sizes n = 10,000. The performance is summarized in Table 3 and compared with the local search algorithm Walksat (18). Every instance found by Walksat to be satisfiable was solved by the difference map for the parameter choices s = 3 and β = 0.85. Interestingly, considering the very different algorithmic frameworks, Walksat is faster by roughly a constant factor 100 over the whole range. Walksat flips one Boolean variable at a time, based on a simple decision tree with greedy and stochastic elements; the difference map, by contrast, updates n + m variables synchronously and deterministically at every iteration. All our difference map solutions were found without random restarts, a strategy often used by local search algorithms when the configuration space is believed to have disconnected components (19).
Difference map performance compared with Walksat (18) on the same set of random 3-SAT instances
Spin Glass Ground States.
In statistical physics, a spin glass is defined by a Hamiltonian H = x t J x for n spin variables x 1, …, x n. Given a symmetric coupling matrix J (with zero diagonal elements), a ground state is an assignment of ±1 values to the spins that minimizes the energy H. In optimization terms, the spin glass ground state problem is an integer quadratic program with nonpositive objective function. This application shows how an objective function can be incorporated into the difference map scheme. We studied the Sherrington–Kirkpatrick (infinite-range) model (20), where the couplings J ij (i < j) are independently sampled from the same Gaussian distribution.
This problem has a natural embedding in a space of 2n dimensions, where the first n comprise the primary spin variables. A set of secondary variables, y
1, …, y
n, interpreted as the amplitudes of a spin configuration in the basis of eigenvectors of the coupling matrix J, are more useful for projecting to particular energy states of H. The two sets of variables satisfy the linear equations y = Vx, where the rows of V are the eigenvectors of J normalized by their corresponding eigenvalue magnitudes: V = diag(
Combining the x and y variables into a vector z of length 2n, the first difference map projection (A) restores compatibility Cz = 0 as in the previous applications. Because we did not explore performance enhancement by introducing relative scale parameters, this is just the equations y = Vx. Value constraints on the two parts of z are enforced by the second projection (B). For the x variables, this is just the map x ↦ sgn(x). The constraint on the values of the y variables comes from a target energy h and is defined by the inequality y t η y < h. If y already satisfies the inequality, there is no change; otherwise, y is projected to the nearest point on the normalized quadric y t η y = h. This computation is just a rescaling of y when η is proportional to the identity matrix and nearly as simple for the general (hyperbolic) case.
To find ground states without knowledge of their energy H, we modified the basic difference map protocol as follows. Rather than terminate by a criterion based on the displacement Δ, we evaluated the current energy H = sgn(x)t J sgn(x) after every iteration and compared this with the best achieved so far, H 0. If we found H < H 0, then H 0 was replaced by H, and the energy target h was decremented by the corresponding amount. Iterations were continued (without updating the target) if H > H 0, and terminated if H = H 0. The latter situation, in light of our continuously distributed couplings J, occurs only if the same spin configuration (up to global sign) is encountered twice, an event that should be extremely unlikely unless there is a unique configuration with energy bounded above by H 0. To improve confidence in the putative ground state, this protocol was repeated multiple times with the usual result that the same spin configuration was found each time. Finally, for systems below n = 20, we were able to verify ground states obtained by this procedure by exhaustive searching.
Candidate ground states for n = 100 typically were found in only a few thousand iterations of the difference map with β = 1. Although these have yet to be verified, the size of the search space is surprisingly small. Comparisons with other algorithms are planned for the future.
Bit Retrieval.
Bit retrieval may be viewed as the phase problem for a one-dimensional crystal that, for simplicity, has a binary-valued contrast function defined at n equally spaced points in the unit cell: σ = (σ1, …, σn). It is natural to use the shifted contrast values σi = ±½ because then the autocorrelation α = (α1, …, αn), given by αi = Σj=1 n σjσj+i, has values symmetrically distributed about zero. Here indices are treated modulo n, and the autocorrelation component αn is special in that it always equals n/4. In bit retrieval, we are given α and asked to find a valid σ. For example, if α = (−1, −1, −1, −1, −1, −1, 7)/4, then σ = (1, 1, −1, 1, −1, −1, 1)/2 is a solution (as are its reversal and cyclic shifts).
The magnitudes of the contrast function's Fourier transform, |F(σ)| = |σ̃|, are considered known because they are given explicitly by the Fourier transform of the autocorrelation:
As shown in Table 4, the difference map is clearly superior to integer-programming methods on this problem. We tried two encodings of the autocorrelation constraint, on “bond” variables σiσj or directly as “box” inequalities on the complex numbers σ̃k [see details in supporting information (SI)], and were unable to solve instances beyond n = 60 with CPLEX (ILOG, Inc.). Simulated annealing algorithms do only slightly better, the largest reported instance having size n = 64 (21).
Bit retrieval performance of the difference map compared with two integer-programming encodings on CPLEX
Sudoku.
Sudoku is the exercise of constraint satisfaction best known to the general public (22). The related problem of completing Latin squares (quasi-group tables) has attracted the interest of computer scientists (23). Our interest in these problems is the property that, unlike the previous problems, their constraint sets A and B are similar in structure. The natural embedding for Sudoku and Latin squares is an n × n × n cube of numbers x ijk, where x ijk = 1 denotes occupation of row–column position (i, j) of the n × n puzzle grid with symbol k (n = 9 in standard Sudoku). For Latin squares, one of the constraints is the requirement that each of the n rows is a permutation of the n symbols. The projection to this constraint, beginning from an arbitrary cube of numbers, involves the solution of n independent assignment problems, one for each row.
Consider row 1. Given initial values x
1jk, the projection seeks the assignment j ↦ p(j) that minimizes the Euclidean distance
Achieving the minimum distance therefore is equivalent to finding the permutation p(j) = k that minimizes “cost” in the bipartite matching problem with weights −x
1jk. The Hungarian algorithm (24) finds these matchings in time n
3 for each row or in time n
4 for the projection overall.
For the second constraint in the Latin squares problem, we apply the foregoing to the columns and symbols or to the rows and columns (for each symbol independently). Whichever we choose, the two constraints together specify the constraints in the Latin squares problem completely. The situation in Sudoku is similar. We may let one assignment constraint (A) apply to the rows and columns and the other to the block elements and symbols (B). The output of P A then is a valid arrangement of each of the symbols (one per row and column) without regard to collisions between different symbols. P B, on the other hand, correctly assigns all n symbols to each of the blocks but is blind to multiple occurrences of symbols on rows or columns. Again, the time to compute each projection grows as n 4. The symbols given on the puzzle grid as clues are easily accommodated by treating the corresponding x ijk weights as infinite.
We have implemented a difference map Sudoku solver based on the two projections just described. The iterations in this application execute a walk on an integer lattice, because each step has the form P row–col(···) − P digit–block(···), and each of the projections gives a binary vector. The combinatorial flavor of the search is underscored by the fact that convergence (Δ = 0) occurs in a finite number of iterations. We find that very little actual searching occurs in most puzzles: the displacement Δ decreases quasi-monotonically to zero, an indication of systematic progress toward the solution. Only in the very hardest puzzles, as judged by the size of the tree explored by backtracking algorithms, does the dynamical system enter a quasi-steady state where it explores a strange attractor. Puzzles designed for humans are usually solved in 10 to 100 iterations. Minimal Sudokus, or puzzles that lose solution uniqueness when any of the given clues is removed, often do not yield to simple logic. Of these, the hardest we found (reproduced in SI) is solved on average in 3,300 iterations for β = ±0.5; a standard backtracking algorithm typically explores 600 tree-nodes when solving this puzzle.
Protein Folding.
We conclude with the observation that one of the most stubborn problems in global optimization, protein native-fold prediction, may yield to the difference map approach. The basic idea has been tested on simple off-lattice models (25) and amounts to deconstructing the energy landscape into two constraint sets. Fig. 3 Right is a cartoon representation, where the 3N-dimensional Euclidean space describing the positions of N atoms has been reduced to the two dimensions of the page. The circle represents one constraint set (A), where the atoms have the correct covalent bonding geometry. The contours represent the energy landscape extended to nonbonded configurations, where the molecule is a liquid of independent atoms subject to volume exclusion, etc. When the energy landscape is sampled on the circle, the resulting restricted landscape (Fig. 3 Left) has the multiple local minima that physical proteins must negotiate when folding to the global minimum. In the difference map scheme, there is another constraint set B that also is of interest, in addition to the circle representing rotameric configurations. This is the set of configurations where the energy of the landscape (associated with nonbonded interactions) is below a certain level E 0. In Fig. 3 Right, set B is the region bounded by the lowest contour shown; its intersection with the circle determines the lowest energy rotamer, the native fold.
Schematic energy landscape in protein folding (Left) extended into a configuration space where the covalent bonding of the constituent atoms is relaxed (Right). The space of rotamers, where the bond geometry is correct, is represented by the circle and serves as one of the constraint sets (A). The other constraint (B) is the region enclosed by the energy contour of the target fold. In the difference map scheme, the global minimum of the physical landscape at Left is identified by the intersection A ∩ B on the Right.
What makes this approach attractive is the ease of computing the two projections (rendered more realistically in Fig. 4). In both cases, there is a strictly downhill path ¶ from most starting configurations to the constraint set. When used with the difference map, the two projections execute large steps in configuration space while maintaining proximity to both constraint sets. Our experience with simple protein models (25) shows that low-energy configurations can be identified by the difference map method in a small fraction of the time required by the landscape sampling methods in widespread use.
Projections corresponding to constraint sets A and B in Fig. 3. Shown are the minimal atomic displacements that restore the covalent bonding geometry of the molecule (a), and pack the atoms with favorable contact energies in a solvent environment (b). Both projections are strictly downhill paths on independent energy landscapes.
Conclusions
Our survey of applications shows the difference map algorithm often achieves results comparable to much more sophisticated, special purpose algorithms. Efficient implementations of constraint projections usually are easier than designing the linear program solver at the heart of many optimization algorithms. Being able to efficiently project onto certain nonlinear constraints also is a great convenience and explains the success of the method in the spin glass problem (one quadric surface) and bit retrieval (many circles). The use of even exotic constraint sets within the same algorithmic framework, such as the permutation matrices of Sudoku or the metric constraints in a polypeptide, opens up possibilities beyond the scope of most other general schemes.
Acknowledgments
We thank John Guckenheimer and David Mermin for their comments and Carla Gomes and Mike Todd for access to the CPLEX solver. This work was supported by National Science Foundation Grant DMR-0426568 and U.S. Department of Education Graduate Assistance in Areas of National Need (GAANN) Award P200A030111.
Footnotes
- †To whom correspondence should be addressed. E-mail: ve10{at}cornell.edu
-
Author contributions: V.E. designed research; V.E. performed research; V.E., I.R., and P.T. analyzed data; and V.E. wrote the paper.
-
The authors declare no conflict of interest.
-
This article is a PNAS direct submission.
-
↵ ‡ We are not concerned with measure-zero instances, as when projecting to the nearest point on a circle (x at the origin) or rounding to the nearest integer (x = 1/2).
-
This article contains supporting information online at www.pnas.org/cgi/content/full/0606359104/DC1.
-
↵ § Tests on the hardest instances showed that performance (mean iteration count) is degraded at most by a factor of 5 when β = ±1.
-
↵ ¶ For projecting to the nearest bonded geometry (set A), we use a sum-of-squares penalty function on the bond lengths and angles.
- © 2007 by The National Academy of Sciences of the USA
References
- ↵
- ↵
-
↵
- Miller R ,
- DeTitta GT ,
- Jones R ,
- Langs DA ,
- Weeks CM ,
- Hauptman HA
-
↵
- Youla DC ,
- Webb H
-
↵
- Fienup JR
-
↵
- Levi A ,
- Stark H
- ↵
- ↵
- ↵
-
↵
- Shapiro D ,
- Thibault P ,
- Beetz T ,
- Elser V ,
- Howells MR ,
- Jacobsen C ,
- Kirz J ,
- Lima E ,
- Miao H ,
- Nieman AM ,
- Sayre D
-
↵
- Bauschke HH ,
- Combettes PL ,
- Luke DR
- ↵
- ↵
- ↵
- ↵
-
↵
- Hogg T ,
- Huberman BA ,
- Williams C
-
- Kalbfleisch JG ,
- Stanton RG
-
↵
- Selman B ,
- Kautz H ,
- Cohen B
- Johnson DS ,
- Trick MA
-
↵
- Mézard M ,
- Parisi G ,
- Zecchina R
- ↵
-
↵
- Zwick M ,
- Lovell B ,
- Marsh J
-
↵
- Gold L
-
↵
- Gomes CP ,
- Selman B
-
↵
- Jungnickel D
-
↵
- Elser V ,
- Rankenburg I