Numerical operator calculus in higher dimensions

Communicated by Ronald R. Coifman, Yale University, New Haven, CT (received for review July 31, 2001)
Abstract
When an algorithm in dimension one is extended to dimension d, in nearly every case its computational cost is taken to the power d. This fundamental difficulty is the single greatest impediment to solving many important problems and has been dubbed the curse of dimensionality. For numerical analysis in dimension d, we propose to use a representation for vectors and matrices that generalizes separation of variables while allowing controlled accuracy. Basic linear algebra operations can be performed in this representation using one-dimensional operations, thus bypassing the exponential scaling with respect to the dimension. Although not all operators and algorithms may be compatible with this representation, we believe that many of the most important ones are. We prove that the multiparticle Schrödinger operator, as well as the inverse Laplacian, can be represented very efficiently in this form. We give numerical evidence to support the conjecture that eigenfunctions inherit this property by computing the ground-state eigenfunction for a simplified Schrödinger operator with 30 particles. We conjecture and provide numerical evidence that functions of operators inherit this property, in which case numerical operator calculus in higher dimensions becomes feasible.
In almost all problems that arise from physics there is an underlying physical dimension, and in almost every case the algorithm to solve the problem will have computational complexity that grows exponentially in the physical dimension. In other words, when an algorithm in dimension one is extended to dimension d, its computational cost is taken to the power d. In this paper we present an approach that, in several important cases, allows one-dimensional algorithms to be extended to d dimensions without their computational complexity growing exponentially in d. In moderate dimensions (d = 2, 3, 4) our approach greatly accelerates a number of algorithms. In higher dimensions, such as those arising from the multiparticle Schrödinger equation, where the wave function for p particles has d = 3p variables, our approach makes algorithms feasible that would be unthinkable in a traditional approach.
As an example of the exponential growth in d, consider ordinary matrix–matrix multiplication. In dimension d a matrix has (N^{2})^{d} entries, and matrix–matrix multiplication takes (N^{3})^{d} operations. Using a “fast” one-dimensional algorithm does not help: a banded matrix has (bN)^{d} entries, and matrix–matrix multiplication takes (b^{2}N)^{d} operations. This fundamental difficulty is the single greatest impediment to solving many real-world problems and has been dubbed the curse of dimensionality (1).
In problems in physics where the underlying assumptions permit, separation of variables has been the most successful approach for avoiding the high cost of working in d dimensions. Instead of trying to find a d-dimensional function that solves the given equation (e.g., the multiparticle Schrödinger equation), one only considers functions that can be represented as a product:

f(x_{1}, … , x_{d}) = φ_{1}(x_{1})φ_{2}(x_{2}) ⋯ φ_{d}(x_{d}). [1]

By substituting Eq. 1 into the original equation, one often can produce a system of d weakly coupled one-dimensional equations for the functions {φ_{i}(x_{i})} [e.g., via Hartree or Kohn–Sham formulations (2)]. By iteratively solving the equation in x_{i} for the function φ_{i}, one obtains an approximate solution to the original equation using only one-dimensional operations, thus avoiding the exponential dependence on d. However, if the best approximate solution of the form in Eq. 1 is not good enough, there is no way to improve the accuracy.
The natural extension of Eq. 1 is the form

f(x_{1}, … , x_{d}) = ∑_{l=1}^{r} s_{l} φ^{1}_{l}(x_{1}) ⋯ φ^{d}_{l}(x_{d}). [2]

The key quantity in Eq. 2 is r, which we call the separation rank. By increasing r, the approximate solution can be made as accurate as desired. One way to use Eq. 2 is to fix the functions {φ^{i}_{l}(x_{i})} from some (basis) set and try to solve for the coefficients s_{l}. This option includes the use of a tensor product basis as well as configuration interaction methods. Although a wise choice of functions {φ^{i}_{l}(x_{i})} may reduce the number of degrees of freedom necessary to discretize the problem in each direction, N, it does not affect the exponential scaling with the dimension. A second option is to substitute Eq. 2 into the original equation. Unfortunately, the resulting equations for the functions {φ^{i}_{l}(x_{i})} are generally intractable, since the equations are strongly coupled. There are many variations on these two approaches within computational quantum mechanics (e.g., see ref. 2). Our approach is distinct from both of these traditional approaches.
Analytic consideration of Eq. 2 raises two questions. The first question is, what class of functions can be represented efficiently in this form? One can, for instance, characterize a class based on the mixed derivatives of order k, for which the approximation error is 𝒪(r^{−kd/(d−1)}) (3). Along the same lines, the “sparse-grids” methods (e.g., refs. 4 and 5) identify classes of functions compatible with certain representations. For these functions, a reduction in the computational complexity from 𝒪(N^{d}) to 𝒪(N^{d−1} ln N) can be achieved, but the complexity is still exponential. Clearly, classes of functions in multiple dimensions are extremely rich, and such approaches face great difficulties. In contrast, our approach relies on properties of physically significant operators.
The second question is, how does one find the representation in Eq. 2 for a given function? Optimized separated representations such as in Eq. 2 have been studied for more than 30 years for statistical applications, and we refer to refs. 6–8 and the references therein. Since the problems considered for statistics have as input a dense d-dimensional data cube, their applications have been limited to small dimensions (d ≪ 10, mostly d = 3).
Summary of the Paper
The contribution of this paper is twofold. First, we present a computational paradigm. In hindsight it is very natural, but achieving this perspective was the most difficult part, and it has far-reaching consequences. In our approach, we use the natural extension of separation of variables in Eq. 2 but neither fix the set {φ^{i}_{l}(x_{i})} nor try to find and solve equations for these functions. We use Eq. 2 simply as an approximation technique for functions and operators and try to minimize the number of terms at each step of the algorithm. Second, we begin the development of a theory demonstrating that separation ranks are low for many problems of interest. We show here that certain physically significant operators have small separation rank.
Operators can be represented in a form similar to Eq. 2,

𝒜 = ∑_{l=1}^{r} s_{l} 𝒜^{1}_{l} ⊗ 𝒜^{2}_{l} ⊗ ⋯ ⊗ 𝒜^{d}_{l}, [3]

where each 𝒜^{i}_{l} is an operator in dimension one acting in direction i. We observe that many linear algebra operations can be performed while keeping all objects in the form of Eqs. 3 or 2. We can perform operations in d dimensions using combinations of one-dimensional operations and so achieve computational complexity that scales linearly in d. We then solve the original equation directly by some appropriate algorithm while maintaining the intermediate functions and operators in the forms of Eqs. 2 and 3, with adaptively chosen r. The same approach applies to computing functions of operators, in particular polynomials, exponentials, inverses, and sign functions, which form the foundation for a numerical operator calculus.
Although each linear algebra operation leads to an object in the form of Eqs. 3 or 2, the result will have larger separation rank. If we allow r to grow uncontrollably, then the representation will quickly become untenable. Therefore, at each step of the algorithm, we seek to minimize the separation rank r by adaptively changing the coefficients {s_{l}} and the functions {φ^{i}_{l}} while maintaining the required accuracy. We present a numerical algorithm to perform this reduction. We note that our algorithm is similar to an algorithm used in statistics (e.g., ref. 8). In our approach, however, we never handle a d-dimensional cube.
We have observed that the reduced separation rank produced by our algorithm is typically optimal or nearly optimal. It is an interesting question whether such algorithms can, in general, guarantee optimality. For our purposes optimal representations are desirable but not required.
In this paper, we prove that both the multiparticle Schrödinger operator and the inverse Laplacian can be represented with separation rank r = 𝒪(log d). We feel strongly that the class of operators with low separation rank is much wider than we can demonstrate at present.
We conjecture that functions associated with such operators inherit low separation rank. We know, for example, that if the Green's function and the initial/boundary condition have low separation rank, then it follows immediately that the same holds for the solution of the equation. We conjecture that eigenfunctions inherit a low separation rank from the operator. We present numerical results for the computation of the ground-state eigenfunction of a simplified model of a 30-electron Schrödinger operator using the power method, with accuracies ranging from 10^{−2} to 10^{−7}. Using a direct method this computation would be impossible, since it would require on the order of 10^{80} operations, but with our approach it took a few hours on a desktop computer. We conjecture that functions of operators inherit a low separation rank and present a numerical example of the computation of a sign function.
Finally, we note that some problems considered in statistical analysis and other applications can be recast as scattered data interpolation in d dimensions, which is feasible with our approach. Although the implications of this observation are very interesting, we do not address them here.
The Separated Representation
In order to emphasize the underlying physical dimension, we define operators and functions in d dimensions. To avoid confusion between, e.g., a “vector in two dimensions” and a “matrix,” we clarify our notation and nomenclature. A function f in dimension d is a map f : R^{d} → R from d-dimensional Euclidean space to the real numbers. We write f as f(x_{1}, … , x_{d}), where x_{i} ∈ R. A vector F in dimension d is a discrete representation of a function in dimension d on a rectangular domain. We write it as F = F(j_{1}, … , j_{d}), where j_{i} = 1, … , N_{i}. A linear operator 𝒜 in dimension d is a linear map 𝒜 : S → S, where S is a space of functions in dimension d. A matrix 𝔸 in dimension d is a discrete representation of a linear operator in dimension d. We write 𝔸 = A(j_{1}, j′_{1}; … ; j_{d}, j′_{d}), where j_{i} = 1, … , N_{i} and j′_{i} = 1, … , N′_{i}. For simplicity we assume N′_{i} = N_{i} = N for all i.
Definition 1 (separated matrix representation):
For a given ɛ, we represent a matrix 𝔸 = A(j_{1}, j′_{1}; j_{2}, j′_{2}; … ; j_{d}, j′_{d}) in dimension d as

A(j_{1}, j′_{1}; … ; j_{d}, j′_{d}) = ∑_{l=1}^{r} s_{l} V^{1}_{l}(j_{1}, j′_{1}) V^{2}_{l}(j_{2}, j′_{2}) ⋯ V^{d}_{l}(j_{d}, j′_{d}), [4]

where s_{l} is a scalar, s_{1} ≥ ⋯ ≥ s_{r} > 0, and the 𝕍^{i}_{l} are matrices in dimension one with norm 1. We require the error to be less than ɛ:

∥𝔸 − ∑_{l=1}^{r} s_{l} 𝕍^{1}_{l} ⊗ ⋯ ⊗ 𝕍^{d}_{l}∥ ≤ ɛ. [5]

We call the scalars s_{l} separation values and the integer r the separation rank. The smallest r that yields such a representation for a given ɛ is the optimal separation rank.
When possible, it is preferable to use the operator norm; otherwise the Frobenius norm is used. The definition for a vector is similar, with the matrices 𝕍 replaced by the vectors V.
In dimension d = 2, the separated representation in Eq. 4 reduces to a form similar to the singular value decomposition (SVD), and in fact we can construct an optimal representation using an ordinary SVD algorithm but with an unusual pairing of indices. Instead of separating vectors in dimension 2 in the input coordinate (j′_{1}, j′_{2}) from vectors in dimension 2 in the output coordinate (j_{1}, j_{2}), we separate matrices in dimension 1 in the j_{1} direction from matrices in dimension 1 in the j_{2} direction. Thus, common matrix operators that have full rank as operators may still have low separation rank. For example, the identity is trivially represented as ℐ_{1} ⊗ ℐ_{2} ⊗ ⋯ ⊗ ℐ_{d}, with separation rank one.
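In d = 2 this construction is easy to sketch: rearranging the indices so that (j_{1}, j′_{1}) is separated from (j_{2}, j′_{2}) turns separation rank into ordinary matrix rank, computable by the SVD. The NumPy illustration below is our own sketch, assuming a dense N² × N² matrix:

```python
import numpy as np

def separation_rank_2d(A, N, tol=1e-12):
    """Separation (Kronecker) rank of an N^2 x N^2 matrix in dimension d = 2.

    Rearranges A so that the (j1, j1') indices of direction 1 are separated
    from the (j2, j2') indices of direction 2, then applies an ordinary SVD.
    """
    # A[(j1, j2), (j1p, j2p)] -> R[(j1, j1p), (j2, j2p)]
    R = A.reshape(N, N, N, N).transpose(0, 2, 1, 3).reshape(N * N, N * N)
    s = np.linalg.svd(R, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

N = 6
I = np.eye(N * N)                 # identity in dimension 2: full rank as an operator...
print(separation_rank_2d(I, N))   # ...but separation rank 1, since I = I_1 ⊗ I_2

B, C = np.random.randn(N, N), np.random.randn(N, N)
A = np.kron(B, C) + np.kron(C, B)  # sum of two Kronecker products
print(separation_rank_2d(A, N))    # separation rank 2 (generically)
```

The rearrangement is the unusual pairing of indices mentioned above: each Kronecker-product term becomes a rank-one matrix in the rearranged coordinates.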
When d > 2 this representation is not unique even when r is the optimal separation rank. The optimal representations for different values of ɛ are not nested, so we cannot simply add or delete terms from Eq. 4 when ɛ changes and retain a representation with optimal separation rank. The numbers s_{l}, matrices 𝕍, and separation rank r all will change as ɛ changes. Issues related to such generalizations of the SVD have been studied extensively (e.g., see refs. 6 and 7 and the references therein).
To illustrate many of these issues for d > 2, we consider a sine wave in the diagonal direction, sin(x_{1} + ⋯ + x_{d}), and attempt to represent it in the separated form using only real functions. We can use the usual trigonometric formulas for sums of angles to obtain a separated representation, but then we will have r = 2^{d−1} terms. The numerical algorithm described below, however, uncovered a trigonometric identity in d dimensions, using exactly d terms.
Lemma 1.
There is an identity expressing sin(x_{1} + ⋯ + x_{d}) as a sum of exactly d separable terms, valid for all choices of {α_{j}} such that sin(α_{k} − α_{j}) ≠ 0 for all j ≠ k.
This identity illustrates several key points. First, the “obvious” (analytic) separated representation may be woefully inefficient. Second, even when ɛ = 0 and r is optimal, there may be entire families of separated representations. Third, among these separated representations, some may have large separation values, leading potentially to cancellations and, hence, poor conditioning.
The Multiparticle Schrödinger Operator
We consider the Schrödinger operator ℋ for a d-electron system and show that the separation rank of an appropriate approximation of ℋ grows only logarithmically in d. Without changing the basic formalism in Eq. 4, we choose to use three-dimensional operators as our elementary building blocks. The operator ℋ is equal to −Δ + 𝒩 + ℰ, where the Laplacian is defined by

Δ = ∑_{i=1}^{d} Δ_{i}, [7]

the nuclear potential portion 𝒩 is defined by

𝒩 = ∑_{i=1}^{d} 𝒱_{i}, [8]

and the electron–electron interaction portion ℰ is defined by

ℰ = ∑_{i=1}^{d} ∑_{m=i+1}^{d} 𝒲_{im}, [9]

where Δ_{i} is the Laplacian in the (three-dimensional) variable x_{i}, 𝒱_{i} is the operator that multiplies by the nuclear potential function v(x) in the variable x_{i}, and 𝒲_{im} is multiplication by the electron–electron interaction (Coulomb) potential w(x_{i} − x_{m}). Since any numerical treatment must be on a finite domain and use a finite step size, we state our results for an approximation to ℋ that has been suitably limited in space and in wave number. We choose a fixed but arbitrary precision ɛ to which to approximate ℋ and assume that we are working in finite-precision arithmetic with roundoff error μ.
The following theorem provides bounds on the separation rank for Δ and 𝒩, and its proof provides a construction for their separated representations.
Theorem 2.
Let 𝒜 = ∑ ℬ_{i}, where ℬ_{i} is the bounded operator ℬ acting in direction i. We can represent 𝒜 to within ɛ in the operator norm with separation rank
Proof:
Consider the auxiliary operator-valued function 𝒢(t) of the real variable t (Eq. 11) and note that 𝒢′(0) = 𝒜. Using an appropriate finite-difference formula of order r, we approximate this derivative (Eq. 12), thus providing a separation rank r approximation for 𝒜. As with all finite-difference approximations, the error bound for an order-r formula has the form μ/h + h^{r}∥𝒢^{(r+1)}∥, where h is the step size. The bound on r follows by optimizing over h.
The estimate in Eq. 10 implies that, as long as ∥ℬ∥d/ɛ ≪ 1/μ, the separation rank is 𝒪[log(∥ℬ∥d/ɛ)]. Thus, for fixed ɛ, the separation rank grows only as log(d). Another implication of these estimates is the restriction ɛ > ∥ℬ∥dμ on the accuracy attainable using the form of Eq. 12. Note that if we choose to approximate with relative precision and assume ∥𝒜∥ ≈ d∥ℬ∥, then the separation rank is independent of the dimension.
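The construction behind the proof can be illustrated numerically. The sketch below is our own choice of auxiliary function (the paper's Eq. 11 may differ in normalization): we take 𝒢(t) = ⊗_{i}(ℐ + tℬ), for which each value 𝒢(t) has separation rank one and 𝒢′(0) = 𝒜, and apply a central difference:

```python
import numpy as np
from functools import reduce

def kron_chain(mats):
    return reduce(np.kron, mats)

d, n = 3, 4
rng = np.random.default_rng(0)
B = rng.standard_normal((n, n))
B /= np.linalg.norm(B, 2)      # normalize so ||B|| = 1
I = np.eye(n)

# A = sum_i B_i, where B_i is the operator B acting in direction i
A = sum(kron_chain([B if j == i else I for j in range(d)]) for i in range(d))

def G(t):
    # G(t) = (I + tB) ⊗ ... ⊗ (I + tB): separation rank one for every t
    return kron_chain([I + t * B] * d)

h = 1e-3
A_sep = (G(h) - G(-h)) / (2 * h)   # central difference: a rank-2 separated approximation
err = np.linalg.norm(A - A_sep, 2)
print(err)   # O(h^2)
```

Higher-order difference formulas use more evaluation points, i.e., more separated terms, and drive the error down geometrically, which is the source of the logarithmic rank bound.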
For ℰ, we use a similar theorem and construction.
Theorem 3.
Let 𝒜 = ∑_{i=1}^{d} ∑_{m=i+1}^{d} ℬ_{i}ℬ_{m}, where ℬ_{i} is the bounded operator ℬ acting in direction i. We can represent 𝒜 to within ɛ in the operator norm with separation rank
Proof:
With 𝒢 defined in Eq. 11, note that 𝒢″(0) = 2𝒜/∥ℬ∥. We can thus use the same approach as in Eq. 12, simply substituting a second-derivative finite difference formula.
To use this result for ℰ, we first symmetrically separate a single term in Eq. 9 to obtain the representation

𝒲_{im} = ∑_{k=1}^{K_{ɛ}} 𝒲^{k}_{i} 𝒲^{k}_{m} [14]

for some separation rank K_{ɛ} and collection of (multiplication) operators 𝒲^{k}. This can be accomplished using, for example, the SVD. We then substitute Eq. 14 into Eq. 9 and exchange the order of summation to obtain

ℰ = ∑_{k=1}^{K_{ɛ}} ∑_{i=1}^{d} ∑_{m=i+1}^{d} 𝒲^{k}_{i} 𝒲^{k}_{m}, [15]

which gives a separated representation with separation rank K_{ɛ}⋅d(d − 1)/2. For each value of k, however, we can apply Theorem 3. Since K_{ɛ} is independent of d, we conclude that the separation rank of ℰ grows only logarithmically in d rather than quadratically.
The Inverse Laplacian
It can be shown (9) that for 0 < δ < y < D and any ɛ > 0, there exist M = 𝒪{log[D/(δɛ)]} numbers α_{l}, τ_{l} > 0 such that

|y^{−1} − ∑_{l=1}^{M} α_{l} e^{−τ_{l}y}| ≤ ɛ y^{−1} for all y ∈ [δ, D]. [16]

We can use this approximation to construct a separated representation for the inverse of −Δ on part of its spectrum. Similar arguments can be made for other classical Green's functions.
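The idea behind such sums of exponentials can be sketched numerically: writing y^{−1} = ∫_{0}^{∞} e^{−ty} dt and substituting t = e^{s} gives an integrand that decays doubly exponentially, so the trapezoidal rule in s converges extremely fast. The truncation bounds and step size below are our own heuristic choices (tuned for δ = 1, D = 100), not the optimized construction of ref. 9:

```python
import numpy as np

delta, D = 1.0, 100.0
h = 0.3                                    # trapezoidal step in s
s = np.arange(np.log(1e-9 / D), np.log(40.0 / delta), h)
tau = np.exp(s)                            # exponents tau_l
alpha = h * tau                            # weights alpha_l (from dt = e^s ds)

y = np.linspace(delta, D, 1000)
approx = (alpha[:, None] * np.exp(-tau[:, None] * y[None, :])).sum(axis=0)
rel_err = np.max(np.abs(approx - 1.0 / y) * y)   # uniform relative error on [delta, D]
print(len(tau), rel_err)
```

Roughly a hundred terms already give a relative error far below 10⁻⁶ on the whole interval, and the number of terms grows only logarithmically as δ shrinks or the target accuracy tightens.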
Theorem 4.
For the interval [δ, D] of the spectrum of −Δ, we have

(−Δ)^{−1} ≈ ∑_{l=1}^{r} α_{l} e^{τ_{l}Δ_{1}} ⊗ ⋯ ⊗ e^{τ_{l}Δ_{d}}, [17]

with separation rank r = 𝒪{log[D/(δɛ)]}.
For functions with Fourier transform supported in the annulus (∑_{i} ξ_{i}^{2})^{1/2} ∈ [δ^{1/2}, D^{1/2}], we thus have a separated representation for the Green's function. For periodic problems it may be more natural to restrict the wave number to the “cubic annulus” max_{1≤i≤d} |ξ_{i}| ∈ [(δ/d)^{1/2}, D^{1/2}], in which case the approximation (Eq. 17) needs to be valid on the interval [δ/d, dD], and thus the separation rank grows as r = 𝒪{log[d^{2}D/(δɛ)]} with the dimension d.
Basic Linear Algebra
The main point of the separated representation is that the elementary objects on which we operate are one-dimensional, so that linear algebra in dimension d is performed using only one-dimensional operations. The computational complexities are linear in d rather than exponential.
We assume that all objects have been discretized using N points in each direction, so a vector in dimension d has N^{d} entries. In d dimensions, a dense matrix has (N^{2})^{d} entries, whereas a banded matrix has (bN)^{d} entries. In the separated representation (Eq. 4), we will need to store d⋅r⋅N^{2} entries if the matrices 𝕍 are dense or d⋅r⋅bN entries if they are banded. The banded case demonstrates the effect of combining the separated representation with a fast onedimensional algorithm.
Addition of two matrices in d dimensions takes (N^{2})^{d} operations if they are dense and (bN)^{d} if they are banded. In the separated representation the addition 𝔸̃ + 𝔸̂ is merely the merging of two sums and then resorting. Thus addition is essentially free but yields a matrix with separation rank r = r̂ + r̃.
Multiplication of matrices is the most important operation for the applications we have in mind. In d dimensions, multiplication takes (N^{3})^{d} operations for dense matrices and (b^{2}N)^{d} operations for banded matrices. In the separated representation, multiplication can be done using only one-dimensional operations:

𝔸̂𝔸̃ = ∑_{l̂=1}^{r̂} ∑_{l̃=1}^{r̃} ŝ_{l̂}s̃_{l̃} (𝕍̂^{1}_{l̂}𝕍̃^{1}_{l̃}) ⊗ ⋯ ⊗ (𝕍̂^{d}_{l̂}𝕍̃^{d}_{l̃}).

We have r = r̂r̃ pairings, each of which costs d one-dimensional matrix–matrix multiplications. Thus, we need d⋅r̂⋅r̃⋅N^{3} operations if the one-dimensional matrices are dense and d⋅r̂⋅r̃⋅b^{2}⋅N operations if they are banded. The resulting matrix has separation rank r = r̂r̃.
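A minimal sketch of separated matrix–matrix multiplication, using our own illustrative data layout (a list of terms, each a scalar with one N × N factor per direction), verified against the full Kronecker form for a small problem:

```python
import numpy as np
from functools import reduce

def sep_matmul(A, B):
    """Multiply two separated matrices using only one-dimensional products."""
    return [(sa * sb, [Va @ Vb for Va, Vb in zip(FA, FB)])
            for sa, FA in A for sb, FB in B]      # r_A * r_B pairings

def to_full(A):
    """Expand a separated matrix into a dense N^d x N^d matrix (for checking only)."""
    return sum(s * reduce(np.kron, F) for s, F in A)

d, N = 3, 4
rng = np.random.default_rng(1)

def random_factors():
    return [rng.standard_normal((N, N)) for _ in range(d)]

A = [(1.0, random_factors()), (0.5, random_factors())]  # separation rank 2
B = [(2.0, random_factors())]                           # separation rank 1

C = sep_matmul(A, B)   # rank 2 * 1 = 2, at cost d * r_A * r_B one-dimensional products
err = np.linalg.norm(to_full(C) - to_full(A) @ to_full(B))
print(len(C), err)     # exact up to roundoff
```

The identity (⊗_{i}V^{i})(⊗_{i}W^{i}) = ⊗_{i}(V^{i}W^{i}) is what lets each pairing be computed direction by direction; the dense expansion above is only for verification and is exactly what the separated representation avoids.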
Several other operations are also efficient in the separated representation, in particular the computation of inner products, the Frobenius norm, the trace, matrix–vector multiplication, etc. We will not discuss these, since they follow the same pattern as matrix–matrix multiplication. It is also possible to solve a linear system in the separated representation by posing the problem as a least-squares problem and then applying a variant of the separation rank reduction algorithm described below.
The following are examples of algorithms that use only matrix and vector operations that can be done in the separated representation with computational cost linear in d.
Power method (F_{k+1} = 𝔸F_{k}) to determine the largest eigenvalue and its eigenvector for a matrix 𝔸.
Schulz iteration (𝔹_{k+1} = 2𝔹_{k} − 𝔹_{k}𝔸𝔹_{k}) to construct 𝔸^{−1}.
Sign iteration [𝔸_{k+1} = (3𝔸_{k} − 𝔸_{k}^{3})/2] to construct sign(𝔸) (10).
Scaling and squaring ([exp(𝔸/2^{n})]^{2^{n}}) to construct the matrix exponential exp(𝔸).
Inverse power method (𝔸F_{k+1} =F_{k}) for computing the smallest eigenvalue and its eigenvector for a matrix 𝔸.
Since the basic linear algebra operations increase the separation rank, after each step in these algorithms we reduce it using the algorithm in the following section. In all our experiments we observe that the final and intermediate matrices have low separation rank, and we conjecture that this is true for a wide class of problems.
Finding “Optimal” Representations
The key to the success of our approach is finding a separated representation with low separation rank for matrices of interest. We assume that 𝔸 is given to us in the separated representation but with larger separation rank r than necessary, as occurs, for example, when we multiply two matrices to form 𝔸. We have found the algorithm described here to be effective in practice, although it is not guaranteed to find the optimal representation. This algorithm uncovered the trigonometric identity (Eq. 6) and the derivative formulation (Eq. 12) for the multiparticle Schrödinger operator and produced the numerical results described below. Although it is a problem of ongoing interest to find the optimal separation rank (e.g., see refs. 6–8, 11, and 12), it is not required for our approach. In fact, a suboptimal representation is preferred when it has better conditioning or can be obtained faster. For example, when d = 2 we find it much more efficient to use a faster algorithm that produces a suboptimal solution, even though the SVD is available in this dimension.
Our overall strategy is to find the best approximation with separation rank r̃, beginning with r̃ = 1, and then increase r̃ and try again until the residual is less than ɛ. This strategy allows us to take our solution at one value of r̃, add another term (obtained e.g. by a power method) and use it as our initial approximation for separation rank r̃ + 1. Although the best representations of different separation ranks are not actually nested, this strategy provides us with a good initial approximation at each r̃. Other initialization strategies exist (see also ref. 11).
Before starting, we reduce our search space by rotating each direction into an efficient basis. For each fixed direction i, the matrices {𝕍^{i}_{l}}_{l=1}^{r} each have N^{2} (or bN) entries and span a vector space of dimension M_{i} ≤ min(r, N^{2}). By computing a basis for the span, we can express each matrix 𝕍^{i}_{l} in this basis as a vector V^{i}_{l} of length M_{i}. The matrix 𝔸 is thereby expressed as the vector

A = ∑_{l=1}^{r} s_{l} V^{1}_{l} ⊗ ⋯ ⊗ V^{d}_{l}. [20]

We will reduce the separation rank of A and then undo the change of basis to recover the matrix 𝔸. For simplicity we will assume M_{i} = M for all i.
Alternating Least Squares
For a given separation rank r̃, the best separated representation is the one that minimizes the error (Eq. 5), i.e., it solves the nonlinear least-squares problem

min ∥A − Ã∥, where Ã = ∑_{l̃=1}^{r̃} s̃_{l̃} Ṽ^{1}_{l̃} ⊗ ⋯ ⊗ Ṽ^{d}_{l̃}. [21]

To make this problem tractable, we exploit the multilinearity of the problem and use an alternating least-squares approach. This approach is also used in statistics (e.g., see refs. 8 and 11). Besides a few technical details, the key conceptual difference in our approach is that our input is a vector in the separated representation rather than a dense data vector in dimension d, and thus we can consider much larger values of N and d.
In the alternating least-squares approach one starts with an initial approximation to A in Eq. 20 and then iteratively refines it. We loop through the directions k = 1, … , d. For each direction k, we fix the vectors in the other directions {Ṽ^{i}_{l̃}}_{i≠k} and solve a linear least-squares problem for the new Ṽ^{k}_{l̃} (and s̃_{l̃}). We repeat the loop in k until convergence is detected or ∥A − Ã∥ < ɛ. Although it is easy to prove that the norm of the residual ∥A − Ã∥ decreases at each step, in general this process is not guaranteed to converge to the best representation with separation rank r̃.
This linear least-squares problem naturally divides into separate problems for each coordinate. For fixed direction k, form the matrix 𝔹 with entries

B(l̃, l̃′) = ∏_{i≠k} ⟨Ṽ^{i}_{l̃}, Ṽ^{i}_{l̃′}⟩. [22]

Then, for a fixed coordinate j_{k}, form the vector b_{j_k} with entries

b_{j_k}(l̃) = ∑_{l=1}^{r} s_{l} V^{k}_{l}(j_{k}) ∏_{i≠k} ⟨V^{i}_{l}, Ṽ^{i}_{l̃}⟩. [23]

The normal equations for the direction k and coordinate j_{k} become

∑_{l̃′} B(l̃, l̃′) c_{j_k}(l̃′) = b_{j_k}(l̃), [24]

which we solve for c_{j_k}(l̃) as a vector in l̃. After computing c_{j_k}(l̃) for all coordinates j_{k}, we let s̃_{l̃} = ∥c_{j_k}(l̃)∥ and Ṽ^{k}_{l̃}(j_{k}) = c_{j_k}(l̃)/s̃_{l̃}, where the norm is taken with respect to the coordinate j_{k}.
For fixed direction k and coordinate j_{k}, it requires r̃^{2}⋅d⋅M operations to compute 𝔹, r̃⋅r⋅d⋅M to compute b_{j_k}, and r̃^{3} to solve the system. Since 𝔹 and the inner products in b_{j_k} are independent of the coordinate j_{k}, computing for another value of j_{k} has incremental cost r⋅r̃ + r̃^{2}. Similarly, many of the computations involved in 𝔹 and b_{j_k} are the same for different k. Thus, one full alternating least-squares iteration costs 𝒪[d⋅r̃(r̃^{2} + r⋅M)]. Because this algorithm uses inner products that can only be computed to within roundoff error μ, the best accuracy obtainable is ɛ ≈ √μ.
We have found it prudent to monitor the conditioning of the matrix 𝔹 in Eq. 22 by detecting small pivots during the linear solve of Eq. 24. If a small pivot is detected, it means the set of vectors {Ṽ^{1}_{l̃} ⊗ ⋯ ⊗ Ṽ^{d}_{l̃}} that make up 𝔸̃ has become nearly linearly dependent. If the separation values are large, this signals a legitimate but poorly conditioned representation, in which case we should raise r̃ to alleviate the conditioning. If the separation values are not large, we can instead discard the vector corresponding to the small pivot and reduce r̃ by one.
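One full alternating least-squares sweep can be sketched as follows. This is our own simplified implementation (factors stored as rows of r × M arrays, no conditioning safeguards, and a perturbed-exact initial guess for the demonstration), not the paper's production code:

```python
import numpy as np

def sep_inner(s, V, t, W):
    """Inner product of two vectors in separated form via 1-D inner products."""
    G = np.ones((len(s), len(t)))
    for Vi, Wi in zip(V, W):
        G *= Vi @ Wi.T               # Gram matrix of direction-i factors
    return s @ G @ t

def als_reduce(s, V, A0, sweeps=100):
    """Reduce separation rank by alternating least squares.

    s: (r,) separation values; V: list of d arrays, each r x M with unit rows.
    A0: initial guess, a list of d arrays, each r_new x M.
    """
    d, A = len(V), [a.copy() for a in A0]
    r_new = A[0].shape[0]
    t = np.ones(r_new)
    for _ in range(sweeps):
        for k in range(d):
            B = np.ones((r_new, r_new))       # normal-equations matrix
            P = np.ones((r_new, len(s)))
            for i in range(d):
                if i != k:
                    B *= A[i] @ A[i].T
                    P *= A[i] @ V[i].T
            C = np.linalg.solve(B, P @ (s[:, None] * V[k]))
            t = np.linalg.norm(C, axis=1)     # new separation values
            A[k] = C / t[:, None]             # renormalized direction-k factors
    return t, A

# Target: an exactly rank-2 vector in dimension d = 4, stored redundantly with rank 3.
d, M = 4, 8
rng = np.random.default_rng(3)
W = [rng.standard_normal((2, M)) for _ in range(d)]
W = [Wi / np.linalg.norm(Wi, axis=1, keepdims=True) for Wi in W]
s3 = np.array([1.0, 0.25, 0.25])              # second term split into two halves
V3 = [np.vstack([Wi[0], Wi[1], Wi[1]]) for Wi in W]

A0 = [V3i[:2] + 0.01 * rng.standard_normal((2, M)) for V3i in V3]
t, A = als_reduce(s3, V3, A0)

res2 = (sep_inner(s3, V3, s3, V3) - 2 * sep_inner(s3, V3, t, A)
        + sep_inner(t, A, t, A))
print(max(res2, 0.0) ** 0.5)   # residual ||A - A_approx||, near zero
```

Note that the residual itself is evaluated entirely from one-dimensional inner products, so at no point does a d-dimensional cube appear.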
Numerical Examples
In this section we provide numerical evidence to support our conjectures and to demonstrate that our approach can be used successfully.
Alternating Least-Squares Tests.
In the first series of tests we generate a random separated vector of norm one (Eq. 20) and then look for a representation (Eq. 21) with the same separation rank using our algorithm. On a typical run with d = 30, M = 100, and r = 100, the residual decreased steadily but slowly as r̃ was increased and was still 2⋅10^{−2} when r̃ = 99. At r̃ = 100 the error dropped to 2⋅10^{−8}, the best obtainable in double precision. The entire process took 2,900 seconds on a 360-MHz Sun Ultra 10.
The second test is to find a representation for sin(∑ x_{j}), starting with the separated representation with r = 2^{d−1} terms obtained via ordinary trigonometric formulas. When started with r̃ = d, the alternating least-squares algorithm quickly finds separated representations with r̃ = d terms, thereby generating an instance of the trigonometric identity (Eq. 6). When started with r̃ < d, however, it pursued poorly conditioned representations.
The third test is the proton-type potential (Eq. 8), which nominally has r = d. In dimensions d ≤ 100, our algorithm found representations of the form in Eq. 12, which was unknown to us at the time of the experiment. It did not, however, find truly optimal representations and occasionally encountered badly conditioned matrices (Eq. 22).
Computation of a Spectral Projector.
For a diagonalizable matrix 𝔸 with real eigenvalues, we can compute sign(𝔸) using the recursion

𝔸_{0} = 𝔸/∥𝔸∥,  𝔸_{k+1} = (3𝔸_{k} − 𝔸_{k}^{3})/2, [25]

so that sign(𝔸) = lim_{k→∞} 𝔸_{k}. Using the sign function, we can compute spectral projectors (10), which are useful for computing electron densities, in some wave-propagation problems, and in model reduction.
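For illustration, the recursion can be run on a small dense symmetric matrix (the point of the paper, of course, is to run it in the separated representation instead):

```python
import numpy as np

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
lam = np.array([0.9, 0.6, 0.3, -0.2, -0.5, -0.8])
A = Q @ np.diag(lam) @ Q.T         # symmetric, eigenvalues already in [-1, 1]

X = A.copy()                       # A_0 = A / ||A|| (here ||A|| <= 1 by construction)
for _ in range(40):
    X = (3 * X - X @ X @ X) / 2    # each step: one matrix cube, i.e., two products

sign_A = Q @ np.diag(np.sign(lam)) @ Q.T
print(np.linalg.norm(X - sign_A))  # near zero: X converges to sign(A)
```

Each iteration uses only matrix–matrix products and additions, precisely the operations that remain cheap in the separated representation, at the cost of a separation rank increase that must be reduced after each step.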
We now give an example of the computation of a sign function. Here we test the principle of using a separated representation to compute functions of matrices. We compare the run times using the ordinary representation with dense matrices and with sparse matrices in wavelet coordinates, and the separated representation with dense and sparse one-dimensional matrices 𝕍^{i}_{l}.
We compute sign(ℋ − 50ℐ) for a Hamiltonian ℋ on a periodic domain in dimension d = 2. This Hamiltonian is a simplified single-atom independent-electron model for a crystal. The shift 50 separates out the five eigenfunctions with the lowest eigenvalues. Even in this dimension the computation is very expensive. The sign function does not have an explicit solution and cannot be simplified by separation of variables. We perform 30 iterations (Eq. 25) and truncate the sparse matrices at 10^{−5} relative precision. Table 1 gives run times on a Sun Ultra 2 with a 300-MHz processor, demonstrating that the separated representation removes the dimension d = 2 from the exponent.
Computation of the Smallest Eigenvalue via the Power Method in Dimension 30
For a diagonalizable matrix 𝔸, we can compute the spectral radius ρ(𝔸) with the power method. Beginning with a random vector F_{0}, we compute

G_{k} = 𝔸F_{k},  F_{k+1} = G_{k}/∥G_{k}∥. [26]

The norm ∥G_{k}∥ will converge to ρ(𝔸), and if the corresponding eigenvalue is distinct, then F_{k} converges to the eigenvector.
For the operator, we choose a Hamiltonian ℋ that is a simplified single-atom interacting d-electron model. We discretize ℋ to form the matrix ℍ. Since the smallest eigenvalue of ℍ corresponds to the lowest energy state, we apply the power method to 𝔸 = C_{d}𝕀 − ℍ for a suitable shift C_{d} ≈ ∥ℍ∥/2 to determine that eigenvalue. Usually the power method is not used to compute the smallest eigenvalue, in particular because of the large number of iterations required. However, it provides a simple demonstration of the ability to compute in higher dimensions.
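The shifted power method is easy to sketch densely; the small matrix below is a hypothetical stand-in for ℍ, constructed with known eigenvalues so the answer can be checked:

```python
import numpy as np

def power_method(A, iters=500, seed=5):
    """Power method: G_k = A F_k, F_{k+1} = G_k / ||G_k||."""
    F = np.random.default_rng(seed).standard_normal(A.shape[1])
    for _ in range(iters):
        G = A @ F
        rho = np.linalg.norm(G)
        F = G / rho
    return rho, F

# Stand-in "Hamiltonian" with eigenvalues 1, 2, ..., 8; the shift C - H turns
# the smallest eigenvalue of H into the dominant eigenvalue of the iteration.
rng = np.random.default_rng(6)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
H = Q @ np.diag(np.arange(1.0, 9.0)) @ Q.T
C = 10.0
rho, F = power_method(C * np.eye(8) - H)
print(C - rho)   # recovers the smallest eigenvalue of H, here 1.0
```

In the separated setting, the matrix–vector product in each iteration is performed direction by direction, and the rank of F_{k} is reduced after every step, which is where the alternating least-squares algorithm enters.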
The initial vector F_{0} is chosen with separation rank one. After each iteration, we reduce the separation rank of F_{k} using F_{k−1} as the initial approximation in the alternating least-squares algorithm. For small values of k, F_{k} does not have the properties of the target eigenvector, and so it may have large separation rank. To prevent the power method from slowing down, we use an adaptively changing accuracy ɛ_{k}.
For this example we choose dimension d = 30 and a one-directional discretization with N = 20. These choices make ∥ℍ∥ ≈ 8⋅10^{4}, so we choose C_{d} = 5⋅10^{4}. The matrix ℍ has apparent separation rank 2d + d(d − 1) = 930 using trigonometric identities, but we represent it with separation rank r = 22 using the derivative formulation in Eq. 12, accurate to relative precision ɛ = 10^{−7} in the operator norm. We iterate until the norm ∥G_{k}∥ has converged to within 10^{−7} relative accuracy and use this as the correct value for ρ(𝔸). We then examine earlier iterates that are accurate to within ɛ for various values of ɛ. In Table 2 we vary ɛ and present the number of iterations needed to obtain that precision, the separation rank of F_{k}, and the run time in seconds on a Sun Ultra 10 with a 360-MHz processor.
Although we did not enforce antisymmetry in this example, it is possible to guide the iteration to a fermionic rather than bosonic eigenspace. We plan to perform computations with the full Schrödinger operator, including the antisymmetry condition, and will present results as they become available.
Acknowledgments
This research was supported partially by National Science Foundation Grant DMS-9902365 (to M.J.M.), Defense Advanced Research Projects Agency Grants F49620-98-1-0491 and F30602-98-1-0154, and University of Virginia Subcontract MDA972-00-1-0016 (to G.B.).
Footnotes

* To whom reprint requests should be addressed. E-mail: beylkin{at}colorado.edu.

This paper was submitted directly (Track II) to the PNAS office.
Abbreviations

SVD, singular value decomposition
 Received July 31, 2001.
 Accepted May 31, 2002.
 Copyright © 2002, The National Academy of Sciences
References
1. Bellman, R. (1961) Adaptive Control Processes: A Guided Tour (Princeton Univ. Press, Princeton, NJ).
2. Yarkony, D. R.
3. Temlyakov, V. N.
4.
5. Strömberg, J.-O.
6.
7.
8.
9.
10.
11. Leurgans, S. E.
12.