## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# A quantitative version of the commutator theorem for zero trace matrices

Edited by* Robion C. Kirby, University of California, Berkeley, Berkeley, CA, and approved July 13, 2012 (received for review February 15, 2012)

## Abstract

Let *A* be an *m* × *m* complex matrix with zero trace and let *ε* > 0. Then there are *m* × *m* matrices *B* and *C* such that *A* = [*B*,*C*] and ‖*B*‖‖*C*‖ ≤ *K*_{ε}*m*^{ε}‖*A*‖ where *K*_{ε} depends only on *ε*. Moreover, the matrix *B* can be taken to be normal.

It is well-known that a complex *m* × *m* matrix *A* is a commutator (i.e., there are matrices *B* and *C* of the same dimensions as *A* such that *A* = [*B*,*C*] = *BC* - *CB*) if and only if *A* has zero trace. In such a situation clearly ‖*A*‖ ≤ 2‖*B*‖‖*C*‖ where ‖*D*‖ denotes the norm of *D* as an operator from to itself.

Is it true that the converse holds? That is, if *A* has zero trace are there *m* × *m* matrices *B* and *C* such that *A* = [*B*,*C*] and ‖*B*‖‖*C*‖ ≤ *K*‖*A*‖ for some absolute constant *K*?

Here we provide a weaker estimate.

For every *ε* > *o* there is a constant *K*_{ε} such that if an *m* × *m* matrix *A* has zero trace then there are *m* × *m* matrices *B* and *C* such that *A* = [*B*,*C*] and ‖*B*‖‖*C*‖ ≤ *K*_{ε}*m*^{ε}‖*A*‖. Moreover, the matrix *B* can be taken to be normal.

The proof will be presented in the next section. It is self-contained except for two facts. The first is a relatively easy result of Rosenblum (1) which gives a solution for *X* of the matrix equation *A* = *SX* - *XT* where all matrices are square and *S* and *T* have separated spectra in the sense that there is a domain *D*, whose boundary is a simple curve, which contains the spectrum of *S* and is disjoint from the spectrum of *T*. The solution then is, as follows: The second fact is a heavy theorem of Bourgain and Tzafriri (2) related to restricted invertibility of matrices and to the Kadison–Singer conjecture. It is stated as Theorem 2 in the sequel.

We also remark in Claim 3 that if the statement of our main theorem, Theorem 3, which is a somewhat stronger version of the theorem above, holds with an absolute constant replacing *K*_{ε}*m*^{ε}, then the Kadison–Singer conjecture holds.

After two of us were led to this problem while considering classification problems for commutators in spaces of operators on Banach spaces, one of us raised the problem discussed here on MathOverFlow (http://mathoverflow.net/questions/27345) Although the MathOverFlow discussion did not produce a solution to the problem, it did put the authors in contact with one another and the discussion itself contains some useful tidbits.

## The Main Result

Given 0 < *ε* < 1, define a sequence of sets Λ_{n} inductively: Λ_{1} is the set of 4 points { ± 1 ± *ı*1} and Note that Λ_{n} is a subset of the square [-1,1] × [-*ı*,*ı*] of cardinality 4^{n} and that it consists of a disjoint union of four sets each of which is a translate of and for each two of them their projection on either the real or imaginary axis is 2*ε* separated.

As we shall discuss below, every square matrix with zero trace is unitarily equivalent to a matrix with zero diagonal. It is thus enough to consider such matrices *A*. In our main result the matrix *B* can then be chosen to be a diagonal matrix. This fact is the reason for the definitions of μ and λ below. We do not know if one can get better results with more general *B*.

Given a 4^{n} × 4^{n} matrix *A* with zero diagonal denote by μ(*A*) the smallest number μ such that there is a diagonal matrix *B* with diagonal elements exactly the points of Λ_{n} and a 4^{n} × 4^{n} matrix *C* such that *A* = [*B*,*C*] = *BC* - *CB* and ‖*C*‖ ≤ μ. Note that because *A* has zero diagonal, for each diagonal matrix *B* with distinct diagonal entries {*b*_{i}} such a matrix *C* exists and its nondiagonal entries are uniquely defined by *c*_{ij} = *a*_{ij}/(*b*_{i} - *b*_{j}). Put also μ(4^{n}) = max μ(*A*) where the max ranges over all zero diagonal 4^{n} × 4^{n} matrices of norm one.

Similarly, for *m* not necessarily of the form 4^{n}, we denote by λ(*A*) the smallest number λ such that there is a diagonal matrix *B* with diagonal elements in [-1,1] × [-*ı*,*ı*] and an *m* × *m* matrix *C* such that *A* = [*B*,*C*] = *BC* - *CB* and ‖*C*‖ ≤ λ. Put λ(*m*) = max λ(*A*) where the max ranges over all *m* × *m* matrices of zero diagonal and norm one.

Given an *m* × *m*, *m* = 4^{n}, matrix *A* write it as a 4 × 4 block matrix with blocks of size 4^{n-1} × 4^{n-1}

### Claim 1.

In particular Also, and [1]

Let *B*_{ii} be diagonal matrices with diagonal entries in Λ_{n-1} and *C*_{ii} 4^{n-1} × 4^{n-1} matrices with *A*_{ii} = [*B*_{ii},*C*_{ii}] and ‖*C*_{ii}‖ = μ(*A*_{ii}). Let (the order doesn’t matter), and, for *i* ≠ *j*, let be defined (uniquely) by Then by the result mentioned in the Introduction (see refs. 1 or 3), where *D*_{ij} is the boundary curve of any domain containing the spectrum of and disjoint from the spectrum of . Because we can easily find such a curve of distance at least *ε* from the spectra of and and of length 4 + 4*ε* < 8 we get that .

Let and set and Then Here the first summand on the right-hand side comes from the estimate of the diagonal part of the decomposition of *C* and the second from the rest (the constant 6 is not important here but is easy to achieve by decomposing the off diagonal part of *C* into three matrices each of which is a permutation of a diagonal block matrix). This estimate gives the claim for μ and the proof for λ is almost identical.▪

Note that to estimate λ(4^{n}) in terms of λ(4^{n-1}) we only used the fact that ‖*A*_{ii}‖ ≤ ‖*A*‖. If we could get a better estimate on ‖*A*_{ii}‖ we would get a better estimate on λ(4^{n}). This observation will be used in the proof of the main theorem below. In the proof of the main theorem we shall only use the parameter λ. The reason we also included μ here is that the matrices *B* in the proof for the property of μ depend only on *ε* and not on the matrices *A*. Optimizing over *ε* we get

(i) For each *m* there is an *m* × *m* diagonal matrix *B* with spectrum in the square [-1,1] × [-*ı*,*ı*] such that for each *m* × *m* matrix *A* with diagonal zero there is an *m* × *m* matrix *C* with norm at most such that *A* = [*B*,*C*].

(ii) For each *m* = 4^{n} there is a subset Λ_{m} of [-1,1] × [-*ı*,*ı*] such that for any trace zero *m* × *m* matrix *A* there is a normal matrix *B* with spectrum Λ_{m} and a matrix *C* with norm at most such that *A* = [*B*,*C*].

For each 0 < *ε* < 1, *m* of the form 4^{n}, and an *m* × *m* matrix *A* with norm 1 and zero diagonal, Claim 1 gives, as long as , that Let *k* be the largest natural number smaller than log _{4}*m* such that . (If no such *k* exists take *k* = log _{4}*m* and change the argument below a bit, getting a better estimate.) Then For we get Because *k* is at most log _{4}*m* we get (*i*). To get (*ii*) use the fact (see e.g. refs. 4 or 5) that any trace zero matrix is unitarily equivalent to a matrix with zero diagonal.▪

The power 1/2 of *m* in the first part of Corollary 1 cannot be lowered. Indeed, if *B* is any *m* × *m* diagonal matrix with spectrum in [-1,1] × [-*ı*,*ı*] then there are *i* ≠ *j* in {1,2,⋯,*m*} with . If *A* is the *m* × *m* matrix with 1 in the *i*,*j* place and zero elsewhere and *A* = [*B*,*C*], then it is easy to see that the absolute value of the *i*,*j* entry of *C* is at least .

Note that the constant in Eq. **1** is what leads to the power 1/2 of *m* in the Corollary above. If we could replace it with we could eliminate the power of *m* altogether and be left with only a log factor. The way we approach this strengthening was hinted at above—we make sure that the norms of the ‖*A*_{ii}|| are substantially smaller than that of ‖*A*‖. Unfortunately, we can assure that only for half of the *i*-s (and we shall say more about it before Theorem 2). The next Claim hints at how to deal with the full matrix once we take good care of half of it. We place it here rather than just before it is used because its proof is similar to that of Claim 1. Claim 2 shows that if a zero diagonal 2*m* × 2*m* matrix *A* has its two *m* × *m* central submatrices having substantially different λ values and the smaller one is substantially larger than the norm of the matrix, then λ(*A*) is, up to a multiplicative constant close to 1, basically the same as the larger of these two values. This reasoning will be used in the proof of the main theorem.

### Claim 2.

Let be a 2*m* × 2*m* matrix with zero diagonal where the *A*_{ij} are all *m* × *m* matrices. Assume also that λ(*A*_{ii}) ≤ *c*_{i} where *c*_{1}/*c*_{2} < 1/4. Then for some absolute constant *K* > 0.

Write *A*_{ii} = *B*_{ii}*C*_{ii} - *C*_{ii}*B*_{ii}, *i* = 1,2 where the *B*_{ii} are diagonal matrices with spectrum in [-1,1] × [-*ı*,*ı*] and ‖*C*_{ii}‖ = λ(*A*_{ii}) ≤ *c*_{i}. For any 1/2 > δ≥*c*_{1}/*c*_{2} put and Then and the -s are diagonal matrices with spectrum in [-1,1] × [-*ı*,*ı*]. Moreover, the spectrum of lies to the left of the vertical line *ℜz* = -1 + 2δ and that of to the right of the vertical line *ℜz* = -1 + 4δ. Also Define , *i* ≠ *j*∈1,2, by then, by the same argument as in the proof of Claim 1, using Rosenblum’s result, ‖*C*_{ij}‖ ≤ *K*‖*A*‖/δ^{2} for some universal *K*. Define then *A* = *B*^{′}*C*^{′} - *C*^{′}*B*^{′}, *B* is a diagonal matrix with spectrum in [-1,1] × [-*ı*,*ı*] and Taking δ = (*c*_{1}/*c*_{2})^{1/2} we get that for some absolute constant *K* (which, as a careful examination of the proof shows, can be taken to be 4/π).▪

Claim 2 together with the remark following the proof of Claim 1 hints that to evaluate λ(2*m*) in terms of λ(*m*) it is enough, given a 2*m* × 2*m* matrix *A* with zero diagonal, to find an *m* × *m* central submatrix and a 4 × 4 block decomposition of it so that the 4 central submatrices (corresponding to *A*_{ii} in Claim 1) have relatively small norms. To achieve this decomposition and similar ones we use a theorem of Bourgain and Tzafriri (2):

For some absolute constant *K* > 0, if *A* is an *m* × *m* matrix with zero diagonal then for all *ε* > 0 there is a central (i.e., whose diagonal is a subset of the diagonal of *A*) submatrix *A*^{′} of dimension ⌊*ε*^{2}*m* × *ε*^{2}*m*⌋ whose norm is at most *Kε*‖*A*‖. Consequently, if *A* is a norm one 2·4^{n} × 2·4^{n} matrix with zero diagonal then for all *l* ≤ *n* there are 4^{l} disjoint subsets σ_{i} of 1,2,…,2·4^{n} each of size 4^{n-l} such that all the submatrices corresponding to the entries in σ_{i} × σ_{i} have norm at most *K*2^{-l}.

The last paragraph in the theorem above is a simple and known consequence of the first one—choose σ_{1}, remove it from the set of rows and columns obtaining a somewhat smaller matrix [of size (2·4^{n} - 4^{n-l}) × (2·4^{n} - 4^{n-l})]. Choose σ_{2} from the set of rows (and columns) of this new matrix, remove it and continue in that manner 4^{l} times. This choice is possible because all the submatrices from which we are choosing 4^{n-l} × 4^{n-l} central submatrices are of size at least 4^{n} × 4^{n}.

We are now ready for the statement and proof of our main theorem, which easily implies Theorem 1.

(i) For each *ε* > 0 there is a constant *K*_{ε} such that for all *m*(ii) For each *ε* > 0 there is a constant *K*_{ε} such that for all *m* and every *m* × *m* zero trace matrix *A* there is a normal matrix *B* with spectrum in [-1,1] × [-*ı*,*ı*] and a matrix *C* with norm at most *K*_{ε}*m*^{ε}‖*A*‖ such that *A* = [*B*,*C*].

Let *A* be a 2·4^{n} × 2·4^{n} matrix with zero diagonal and norm one. Let 1 ≤ *l* ≤ *n* and let *A*^{′} be the 4^{n} × 4^{n} submatrix corresponding to the entries in where σ_{i} are given by Theorem 2. Let denote the submatrix corresponding to the entries in σ_{i} × σ_{i}, *i* = 1,2,…,4^{l}. Divide 1,2,…,4^{l} into 4^{l-1} disjoint sets each a union of 4 σ_{i}-s and let , *i* = 1,2,…,4^{l-1}, denote the 4^{n-l+1} × 4^{n-l+1} submatrices corresponding to the entries corresponding to these sets. Continue in this manner to define , *i* = 1,2,…,4^{s} for each *s* = 0,1,2,…,*l* where for *s*≥1 is a 4^{n-s} × 4^{n-s} submatrix of one of the . Note that .

Now, By Claim 1 for each *ε* > 0, where the last step is the place we use Theorem 2. Now use Corollary 1 to get that for some absolute constants *K* (not necessarily the same in each row) [2]For *ε* = 1/*l* we get and taking *l* = *n*/2 gives [3]We managed to reduce the power of *m* in the bound on λ(*A*) from *m*^{1/2} to *m*^{1/4} but only for a large submatrix. Next we are going to utilize Claim 2 to get a similar bound for the whole matrix. Let and let *A*^{′′} be the submatrix of *A* with entries in σ^{c} × σ^{c}. Put *c*_{1} = *K*(log *m*)^{3}*m*^{1/4} and *c*_{2} = max{*K*(log *m*)^{7}*m*^{1/4},λ(*A*^{′′})}. Then *A*, *A*_{11} = *A*^{′} and *A*_{22} = *A*^{′′} satisfy the assumptions of Claim 2 with *c*_{1},*c*_{2}. Consequently, where we continue to use *K* to denote a universal constant, possibly different in different occurrences, and for *m* = 4^{n}, *n*≥1, Repeating the argument again reducing from matrices of size 4^{n+1} × 4^{n+1} to ones of size 2·4^{n} × 2·4^{n} and combining with the above we get, for *m* = 4^{n}, Let *k* ≤ *m* be the largest power of 4 such that λ(*k*) ≤ *K*(log _{4}*k*)^{7}*k*^{1/4}. Then For some other absolute constant *K* this last quantity is at most *K*(log *m*)^{7}*m*^{1/4}. We thus improved the previous bound on λ(*m*) (for *m* = 4^{n}) to for some absolute *K*.

Repeating the argument one can improve the bound further: Go back to Eq. **2** and plug this new bound to get For *ε* = 1/*l* we get and taking *l* = *n*/3 gives replacing Eq. **3** with this new estimate and following the rest of the argument above leads to Iterating, this argument leads to a bounds of the form: [4]for every *m* = 4^{n} and every positive integer *k*, where *K*_{k} depends only on *k*. This inequality gives the statement of the theorem for *m* being a power of 4. For a general *m* × *m* zero diagonal matrix *A*, complete it to a 4^{n} × 4^{n} matrix *A*^{′} where 4^{n-1} < *m* ≤ 4^{n} by adding zero entries and keeping *A* supported on {1,2,⋯,*m*} × {1,2,⋯,*m*}. Apply the theorem to *A*^{′} and note that the fact that *B* is diagonal implies that we can assume that *C* has nonzero entries only in {1,2,⋯,*m*} × {1,2,⋯,*m*}. We thus proved the first part of the theorem. The second follows from the fact that any trace zero matrix is unitarily equivalent to a zero diagonal matrix.▪

## Concluding Remarks

1. Recall that the paving conjecture states that for every *ε* > 0 there is a positive integer *n*(*ε*) such that any norm one zero diagonal matrix has a paving of length at most *n*(*ε*) and norm at most *ε*. By a paving of *A* we mean a block diagonal submatrix of *A* whose diagonal is the same as that of *A*. The length of a paving is the number of blocks. Anderson (6) showed that this conjecture is equivalent to the Kadison–Singer conjecture (7) on the extension of pure states. For a recent expository paper on these conjectures see ref. 8.

It is clear from the proof above that if the paving conjecture holds with the right parameters then the proof can be simplified and the main result strengthened to get a polylog estimate on λ(*m*). We next show that the reverse holds in a very strong sense. In particular if λ(*m*) is bounded independently of *m* then the paving conjecture holds.

### Claim 3.

Assume *A* = [*B*,*C*] with *B* an *m* × *m* diagonal matrix with spectrum in [-1,1] × [-*ı*,*ı*] and *C* an *m* × *m* matrix. Then for every 0 < *ε* < 1 *A* has a paving of length and norm .

Partition [-1,1] into disjoint intervals *I*_{i} of length at most *ε* each. Let *B*(*i*,*j*) be the central (diagonal) submatrix of *B* whose diagonal entries are in *I*_{i} × *ıI*_{j}, let *A*(*i*,*j*) and *C*(*i*,*j*) be the central submatrices of *A* and *C*, respectively, with the same support as *B*(*i*,*j*). *A*(*i*,*j*), , is a paving of *A* and it is enough to prove that .

Clearly *A*(*i*,*j*) = [*B*(*i*,*j*),*C*(*i*,*j*)]. Pick *i*,*j*, let *b* be the center of the square *I*_{i} × *ıI*_{j} and note that *bI* - *B*(*i*,*j*) [with *I* the identity matrix of the same dimensions as *B*(*i*,*j*)] is a diagonal matrix with entries of absolute value at most . Therefore ▪

2. A more careful examination of the proof of Theorem 3 shows that the constant we get in Eq. **4** is for some absolute constant *K*. Optimizing over *k* gives for some absolute *K*.

3. It is quite easy to see that 1/2 is also the best constant for *K* in the second paragraph of the introduction (assume *A* has zero diagonal and take *B* to be diagonal with diagonal elements 1/2 and -1/2). It also follows that . We did not try to compute the best constants for other small values of the dimension.

4. Although the problem we discuss seems basic enough not to need further motivation, we would like to indicate one. If any trace zero matrix *A* could be written as *A* = [*B*,*C*] with ‖*B*‖‖*C*‖ ≤ *K*‖*A*‖ for a universal *K*, then we would get a simple characterization of the commutators in an important class of *II*_{1} factors, the Wright factors; an element there would be a commutator if and only if it has zero trace. See ref. 9 for this and related matters.

5. One can ask similar questions to the one addressed here for norms other than the operator norm. In particular, what is the (order of) the best constant in where *A* ranges over all trace zero *m* × *m* matrices, *A* = *BC* - *CB* and ‖·‖_{HS} denotes the Hilbert–Schmidt norm?

We checked that a proof with a similar idea but much simpler gives *K* = *O*((log *m*)^{3/2}). In this case we can also prove a lower bound for *K* of order (log *m*)^{1/2}. This proof uses a quantitative version of an argument from ref. 10. Details will appear elsewhere.

## Acknowledgments

W.B. Johnson was supported in part by National Science Foundation Grant DMS-1001321 and US-Israel Binational Science Foundation. N. Ozawa was supported in part by Japan Society for the Promotion of Science. G. Schechtman was supported in part by US-Israel Binational Science Foundation. Ozawa and Schechtman were participants of the National Science Foundation Workshop in Analysis and Probability, Texas A&M University.

## Footnotes

- ↵
^{1}To whom correspondence should be addressed. E-mail: gideon{at}weizmann.ac.il.

Author contributions: W.B.J., N.O., and G.S. performed research.

The authors declare no conflict of interest.

↵*This Direct Submission article had a prearranged editor.

## References

## Citation Manager Formats

## Sign up for Article Alerts

## Article Classifications

- Physical Sciences
- Mathematics