The Pythagorean Theorem: I. The finite case

Contributed by Richard V. Kadison
Abstract
The Pythagorean Theorem and variants of it are studied. The variations evolve to a formulation in terms of noncommutative, conditional expectations on von Neumann algebras that displays the theorem as the basic result of noncommutative, metric, Euclidean Geometry. The emphasis in the present article is finite dimensionality, both “discrete” and “continuous.”
1. Introduction and Theme
Most of us carry away from our earliest contact with elementary mathematics memories of two basic formulae from Euclidean Geometry: πr^{2}, the “area” of a circle with radius r, and a^{2} + b^{2} = c^{2}, the formula relating the lengths, a and b, of the two sides of a right triangle to the length, c, of the hypotenuse of that triangle. That last formula, the Pythagorean Theorem, is the most basic result of “metric” Euclidean Geometry.
In this article, we study that theorem and variants of it. Our study falls into two large parts: the case of "discrete dimensionality" and the case of "continuous dimensionality." Each of these parts, in turn, falls into two parts: finite dimensionality and infinite dimensionality. The primary focus of this article is discrete dimensionality in the finite case, although we discuss the continuous case in the last section (where the meaning of the discrete–continuous division will become clearer). At the same time, in that section, we formulate the Pythagorean Theorem in terms of (noncommutative) conditional expectations and note its "semicommutative" nature. In this context (noncommutative, finite continuous-dimensional, metric Euclidean Geometry), we prove a fully noncommutative version of the theorem. The next article in this series deals with discrete dimensionality in the infinite case. Arguments become more involved in that case.
Elementary, mostly finite-dimensional, variants of the Pythagorean Theorem are examined in the next section, some of them new. A converse, which we refer to as the Carpenter's Theorem, is introduced. The proof of this converse is carried out by operator–matrix methods in the third section. In this same section, we view the Pythagorean Theorem in terms of traces, in terms of indices, and in terms of stochastic matrices.
The fourth section contains a discussion of the finite continuous case. The Carpenter's Theorem is left open in that case as a subject for later elucidation.
2. Elementary Variations
To begin with, the Pythagorean Theorem refers to "plane geometry." Are there three-dimensional, n-dimensional, or even infinite-dimensional analogues of that theorem? Of course there are, and they are familiar—but first we must recast the theorem mildly. If we replace the two sides of the triangle by "orthogonal" axes and the hypotenuse by a vector x of length c, the "orthogonal projections" of that vector on the axes have lengths a and b satisfying a^{2} + b^{2} = c^{2}, by virtue of the Pythagorean Theorem. This is our first variation.
By choosing vectors e_{1} and e_{2} of length 1 (unit vectors) along the positive (orthogonal) axes, the projections of x on these axes allow us to “expand” x in terms of the orthonormal basis {e_{1}, e_{2}} (for the plane). That is, we express x as the linear combination c_{1}e_{1} + c_{2}e_{2} of e_{1} and e_{2}. In this case, c_{1} = a, c_{2} = b, and the length ∥x∥ of x is c, where a^{2} + b^{2} = c^{2}. This is our second variation of the Pythagorean Theorem.
In this form, we can take the leap (our third variation) to Hilbert space ℋ of any dimension. With {e_{a}}_{a∈𝔸} an orthonormal basis for ℋ, and x in ℋ, there is an expansion, x = ∑_{a∈𝔸} c_{a}e_{a}, where the equality refers to convergence of finite subsums to x in the "metric" of the Hilbert space. The inner product of vectors x and y in ℋ is denoted by 〈x, y〉, and the length (or norm) ∥x∥ of x is 〈x, x〉^{1/2}. Convergence of ∑_{a∈𝔸} c_{a}e_{a} is over the "net" of finite subsets of 𝔸 (directed by inclusion). The Parseval equality tells us that ∥x∥^{2} = ∑_{a∈𝔸} |c_{a}|^{2}, which is a direct extension of the Pythagorean Theorem to (Hilbert) space of any dimension.
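Parseval's equality in this form is easy to check numerically in finite dimensions. The following sketch (assuming Python with numpy, which is of course not part of the article) builds an orthonormal basis of ℂ^5 from a QR decomposition and verifies that ∥x∥^2 is the sum of the squared moduli of the expansion coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)

# An orthonormal basis for C^5: the columns of Q from a QR decomposition
# of a random complex matrix are orthonormal.
M = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
Q, _ = np.linalg.qr(M)

x = rng.normal(size=5) + 1j * rng.normal(size=5)
coeffs = Q.conj().T @ x          # c_a = <x, e_a> for each basis vector e_a
# Parseval: ||x||^2 equals the sum of |c_a|^2.
assert np.isclose(np.linalg.norm(x) ** 2, np.sum(np.abs(coeffs) ** 2))
```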
In the context of "infinite-dimensional" Hilbert space, there is more to be said. Given a potential set of coefficients {c_{a}}_{a∈𝔸}, there is a (unique) vector x in ℋ with expansion ∑_{a∈𝔸} c_{a}e_{a} if and only if ∑_{a∈𝔸} |c_{a}|^{2} converges (in which case ∑_{a∈𝔸} |c_{a}|^{2} converges to ∥x∥^{2}). Some aspect of this added information is present in the Pythagorean Theorem when that theorem is suitably formulated (our fourth variation): the positive numbers a and b are the lengths of the sides of a right triangle with hypotenuse of length c if and only if a^{2} + b^{2} = c^{2}. Carpenters use this aspect to check that their work is "true." We shall refer to this "converse" to the usual statement of the Pythagorean Theorem as the Carpenter's Theorem.
The “expansion” formulation of the Pythagorean Theorem involves projecting a vector onto orthogonal axes. We can reverse that and formulate the theorem (our fifth variation) in terms of the projections of vectors of equal length along the axes onto the line determined by a vector.
In this case, the lengths of the projections of the axis vectors of length c onto the line have lengths a and b such that a^{2} + b^{2} = c^{2}, again as a result of the Pythagorean Theorem. It is not an essential restriction in this formulation to insist that c be 1. We are, then, projecting orthonormal basis vectors onto the line. Can something of this nature be said for orthonormal bases in higher-dimensional spaces? Our sixth variation follows.
Proposition 1.
If {e_{a}}_{a∈𝔸} is an orthonormal basis for the Hilbert space ℋ, then the sum of the squares of the lengths of the orthogonal projections of each e_{a} on every one-dimensional subspace of ℋ is 1. If a real nonnegative t_{a} is specified for each a and ∑_{a∈𝔸} t_{a}^{2} = 1, then ∑_{a∈𝔸} t_{a}e_{a} is a unit vector x in ℋ that generates a one-dimensional subspace of ℋ on which each e_{a} has projection of length t_{a}.
Proof:
If x is a unit vector and 𝒱 is the one-dimensional subspace of ℋ spanned by x, then the orthogonal projection of e_{a} on 𝒱 is 〈e_{a}, x〉x and ∥〈e_{a}, x〉x∥^{2} = |〈e_{a}, x〉|^{2}∥x∥^{2} = |〈e_{a}, x〉|^{2}. From Parseval's equality, ∑_{a∈𝔸} |〈e_{a}, x〉|^{2} = ∥x∥^{2} = 1. Because 〈e_{a′}, ∑_{a∈𝔸} t_{a}e_{a}〉 = t_{a′}, each e_{a} has a projection on the one-dimensional space generated by ∑_{a∈𝔸} t_{a}e_{a} of length t_{a}. ▪
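The construction in Proposition 1 can be illustrated concretely; a minimal sketch, assuming numpy (the basis and the choice of coefficients below are ours, for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))   # columns: an orthonormal basis e_a

t = np.array([0.5, 0.5, 0.5, 0.5])             # nonnegative, sum of squares = 1
assert np.isclose(np.sum(t ** 2), 1.0)

x = Q @ t                                      # x = sum_a t_a e_a, a unit vector
# The projection of e_a on span{x} is <e_a, x> x, of length |<e_a, x>| = t_a.
lengths = np.abs(Q.T @ x)
assert np.allclose(lengths, t)
```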
Equivalently, from the Pythagorean Theorem, we can specify the distances s_{a} from e_{a} to the one-dimensional space subject to the condition that ∑_{a∈𝔸} (1 − s_{a}^{2}) = 1. Of course, the question of orthogonal projections of basis elements may be asked when the projections are made onto a subspace of ℋ of dimension other than 1. What is the situation if, for example, 𝒱 is an m-dimensional subspace of ℋ? In this case, choosing an orthonormal basis {f_{1}, … , f_{m}} for 𝒱, we have that the projection of e_{a} on 𝒱 is ∑_{j=1}^{m} 〈e_{a}, f_{j}〉f_{j} of length whose square is ∑_{j=1}^{m} |〈e_{a}, f_{j}〉|^{2}. Now, ∑_{a∈𝔸} ∑_{j=1}^{m} |〈e_{a}, f_{j}〉|^{2} converges, because all terms are real and nonnegative, and ∑_{a∈𝔸} ∑_{j=1}^{m} |〈e_{a}, f_{j}〉|^{2} = ∑_{j=1}^{m} ∑_{a∈𝔸} |〈f_{j}, e_{a}〉|^{2} = ∑_{j=1}^{m} ∥f_{j}∥^{2} = m, from Parseval's equality. We have proved our seventh variation.
Proposition 2.
The sum of the squares of the lengths of the projections of the elements of an orthonormal basis for a Hilbert space ℋ onto an m-dimensional subspace of ℋ is m.
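Proposition 2 admits a direct numerical check; a sketch assuming numpy, with the m-dimensional subspace chosen at random:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 2
# Orthonormal basis {f_1, ..., f_m} for a random m-dimensional subspace.
F, _ = np.linalg.qr(rng.normal(size=(n, m)))
# The squared length of the projection of e_j is sum_k |<e_j, f_k>|^2,
# i.e. the squared norm of the j-th row of F.
sq_lengths = np.sum(F ** 2, axis=1)
assert np.isclose(np.sum(sq_lengths), m)   # Proposition 2
```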
Our eighth variation is an interesting, although small, alteration of Proposition 2. We emphasize it as our definitive (geometric) formulation of the finite-dimensional Pythagorean Theorem because it puts in evidence a property that will play an important role in our extension of the Carpenter's Theorem to infinite dimensions.
Proposition 3.
If a is the sum of the squares of the lengths of the projections of r elements of an orthonormal basis {e_{1}, … , e_{n}} for an n-dimensional Hilbert space ℋ onto an m-dimensional subspace ℋ_{0}, and b is the sum of the squares of the lengths of the projections of the remaining n − r basis elements on the orthogonal complement ℋ′_{0}, then a − b = m − (n − r) = m − n + r.
Proof:
If a_{j} is the square of the length of the projection of e_{j} on ℋ_{0}, then 1 − a_{j} is the square of the length of its projection on ℋ′_{0}. Thus a = a_{1} + ⋯ + a_{r}, b = (1 − a_{r+1}) + ⋯ + (1 − a_{n}), and m = a_{1} + ⋯ + a_{n} from Proposition 2. It follows that a − b = a_{1} + ⋯ + a_{n} − (n − r) = m − n + r.

Another proof, one that does not make use of Proposition 2, which is not available, of course, in the infinite-dimensional case, follows. Let {e_{1}, … , e_{n}} be an orthonormal basis for ℋ. Let {f_{1}, … , f_{m}} and {f_{m+1}, … , f_{n}} be orthonormal bases for ℋ_{0} and ℋ′_{0}, respectively. The projection y of e_{j} on ℋ_{0} is ∑_{k=1}^{m} 〈e_{j}, f_{k}〉f_{k}, and ∥y∥^{2} = ∑_{k=1}^{m} |〈e_{j}, f_{k}〉|^{2}. Thus a = ∑_{j=1}^{r} ∑_{k=1}^{m} |〈e_{j}, f_{k}〉|^{2}. The projection of e_{j} on ℋ′_{0} is ∑_{k=m+1}^{n} 〈e_{j}, f_{k}〉f_{k}, and the square of its length is ∑_{k=m+1}^{n} |〈e_{j}, f_{k}〉|^{2}, which is 1 − ∑_{k=1}^{m} |〈e_{j}, f_{k}〉|^{2} because 1 = ∥e_{j}∥^{2} = ∑_{k=1}^{n} |〈e_{j}, f_{k}〉|^{2}, from Parseval's equality. Thus b = ∑_{j=r+1}^{n} (1 − ∑_{k=1}^{m} |〈e_{j}, f_{k}〉|^{2}) = (n − r) − ∑_{j=r+1}^{n} ∑_{k=1}^{m} |〈e_{j}, f_{k}〉|^{2}, and a − b = ∑_{j=1}^{n} ∑_{k=1}^{m} |〈e_{j}, f_{k}〉|^{2} − (n − r) = ∑_{k=1}^{m} ∥f_{k}∥^{2} − (n − r) = m − n + r. We note, especially, that the difference a − b is an integer however we split the basis for projection onto ℋ_{0} and ℋ′_{0}. If we move a basis element from those projected onto ℋ′_{0} to those projected onto ℋ_{0}, we increase the difference by 1; if we move a basis element in the opposite sense, we decrease the difference by 1, clearly not affecting the integrality of the difference. In the next section, we introduce matrix methods and give another proof.
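The integer-valued difference a − b is easy to test numerically; a small sketch assuming numpy (the subspace and the split of the basis below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, r = 7, 3, 5
Q, _ = np.linalg.qr(rng.normal(size=(n, m)))
P = Q @ Q.T                                   # projection onto an m-dim H_0
a = np.sum(np.diag(P)[:r])                    # first r basis vectors onto H_0
b = np.sum(1 - np.diag(P)[r:])                # remaining n - r onto complement
assert np.isclose(a - b, m - n + r)           # here 3 - 7 + 5 = 1
```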
Once again, we can ask whether the lengths of the projections of the basis elements can be specified subject to the condition that the sum of their squares is m (the Carpenter's Theorem for this case). That is, given such a specification, is there an m-dimensional subspace of ℋ on which the projections of the basis elements have those lengths? Equivalently, from the Pythagorean Theorem, can we find an m-dimensional subspace of ℋ from which the basis elements have specified distances not greater than 1, subject to the condition that subtracting their squares from 1 produces numbers that sum to m? The affirmative answer to these questions provides our ninth and tenth variations. Their proof requires more involved arguments.
3. Operator–Matrix Methods
We assume, first, that ℋ has finite dimension n, and that {e_{1}, … , e_{n}} is an orthonormal basis for ℋ. Let ℋ_{0} be an m-dimensional subspace of ℋ and E the orthogonal projection of ℋ onto ℋ_{0}. If (a_{jk}) is the matrix of E relative to {e_{j}}, then a_{jk} = 〈Ee_{k}, e_{j}〉 for all k and j. Since E = E* = E^{2}, a_{jj} = 〈E^{2}e_{j}, e_{j}〉 = 〈Ee_{j}, Ee_{j}〉 = ∥Ee_{j}∥^{2}, and ∑_{j=1}^{n} a_{jj} = ∑_{j=1}^{n} ∥Ee_{j}∥^{2}. It follows that the sum of the squares of the lengths of the projections of the basis elements e_{1}, … , e_{n} onto ℋ_{0} is the trace of E. Of course, there is a unitary transformation U of ℋ onto itself that maps ℋ_{0} onto the m-dimensional space generated by {e_{1}, … , e_{m}}. The projection E_{m} that has that space as its range has matrix (b_{jk}) relative to {e_{j}} such that b_{11}, … , b_{mm} are 1 and b_{m+1,m+1}, … , b_{nn} are 0. Since UEU^{−1} = E_{m}, E and E_{m} have the same trace m. This proves again our seventh variation, tr(E) = ∑_{j=1}^{n} ∥Ee_{j}∥^{2} = m, where "tr" is the functional that assigns to a matrix its usual (nonnormalized) trace, the sum of its diagonal entries. It tells us, too, that another (our eleventh) variation of the Pythagorean Theorem is the assertion: the sum of the diagonal entries of an idempotent selfadjoint matrix is its rank. From these same considerations, we see that prescribing the squares of the lengths of the projections of the basis elements on an m-dimensional subspace of ℋ amounts to prescribing the diagonal of the matrix, relative to that basis, of the projection with that subspace as range. Our Carpenter's Theorem question, in this case, becomes:
Is an ordered n-tuple, 〈a_{1}, … , a_{n}〉, of numbers in [0, 1] with sum m the diagonal of some idempotent selfadjoint n × n matrix?
This has an affirmative answer. (Together with the ninth variation, it provides an extension, our twelfth variation, of the fourth variation.) For its proof, we make use of a variant of a combinatorial–geometric lemma used in ref. 1.
Definition 4:
With (a_{1}, … , a_{n}) (= ã) a point in ℝ^{n} and Π the group of permutations of {1, … , n}, we let 𝒦_{ã} be the (closed) convex hull of {(a_{π(1)}, … , a_{π(n)}) (= π(ã)): π ∈ Π} (= Π(ã)). We refer to 𝒦_{ã} as the permutation polytope generated by ã.
Lemma 5.
If a_{1} ≥ a_{2} ≥ ⋯ ≥ a_{n}, b_{1} ≥ b_{2} ≥ ⋯ ≥ b_{n}, and a_{1} + ⋯ + a_{n} = b_{1} + ⋯ + b_{n}, then the following are equivalent:
 (i) (b_{1}, … , b_{n}) (= b̃) ∈ 𝒦_{ã};
 (ii) b_{1} ≤ a_{1}, b_{1} + b_{2} ≤ a_{1} + a_{2}, … , b_{1} + ⋯ + b_{n−1} ≤ a_{1} + ⋯ + a_{n−1};
 (iii) There are points (a_{1}^{(1)}, … , a_{n}^{(1)}) (= ã_{1}), … , (a_{1}^{(n)}, … , a_{n}^{(n)}) (= ã_{n}) in 𝒦_{ã} such that ã_{1} = ã, ã_{n} = b̃, and ã_{k+1} = tã_{k} + (1 − t)τ(ã_{k}) for each k in {1, … , n − 1}, some transposition τ in Π, depending on k, and some t in [0, 1], depending on k.
Proof:
(i)→(ii). From the assumption that a_{1} ≥ ⋯ ≥ a_{n}, we conclude that a_{1} + ⋯ + a_{j} ≥ a_{π(1)} + ⋯ + a_{π(j)}, for each j in {1, … , n} and π in Π. Thus for each convex combination b̃ of points in Π(ã) and j in {1, … , n}, b_{1} + ⋯ + b_{j} ≤ a_{1} + ⋯ + a_{j}.
(iii)→(i). As π(d̃) ∈ Π(ã) when d̃ ∈ Π(ã), π(c̃) ∈ 𝒦_{ã} when c̃ ∈ 𝒦_{ã}. Thus ã_{1} = ã ∈ 𝒦_{ã}, ã_{2} = tã_{1} + (1 − t)τ(ã_{1}) ∈ 𝒦_{ã}, … , b̃ = ã_{n} = t′ã_{n−1} + (1 − t′)τ′(ã_{n−1}) ∈ 𝒦_{ã}.
(ii)→(iii). If b_{1} < a_{j} for all j in {2, … , n}, then b_{j} ≤ b_{1} < a_{j} for all such j, and b_{1} + ⋯ + b_{n} < a_{1} + ⋯ + a_{n}, contrary to assumption. Let m be the smallest number in {2, … , n} such that a_{m} ≤ b_{1}. Since a_{m} ≤ b_{1} ≤ a_{1}, there is a t in [0, 1] such that b_{1} = ta_{1} + (1 − t)a_{m}. Let τ be the transposition that interchanges 1 and m. Let ã_{1} be ã and ã_{2} be tã_{1} + (1 − t)τ(ã_{1}). Then ã_{2} = (b_{1}, a_{2}, … , a_{m−1}, a_{1} + a_{m} − b_{1}, a_{m+1}, … , a_{n}) (= (a_{1}^{(2)}, … , a_{n}^{(2)})). As b_{m−1} ≤ b_{m−2} ≤ ⋯ ≤ b_{1} < a_{m−1} ≤ ⋯ ≤ a_{2}, by choice of m, b_{1} + ⋯ + b_{j} ≤ a_{1}^{(2)} + ⋯ + a_{j}^{(2)} when 2 ≤ j ≤ m − 1. If m ≤ j ≤ n − 1, then a_{1}^{(2)} + ⋯ + a_{j}^{(2)} = a_{1} + ⋯ + a_{j} ≥ b_{1} + ⋯ + b_{j}.
Suppose now that we have constructed ã_{1}, … , ã_{j} such that ã_{k+1} = tã_{k} + (1 − t)τ(ã_{k}) for each k in {1, … , j − 1} (t ∈ [0, 1] and τ is a transposition in Π depending on k), such that b_{1} = a_{1}^{(k)}, … , b_{k−1} = a_{k−1}^{(k)} for each k in {2, … , j}, and b_{1} + ⋯ + b_{h} ≤ a_{1}^{(k)} + ⋯ + a_{h}^{(k)} for each h in {1, … , n − 1} and for each k in {1, … , j}. Then b_{1} + ⋯ + b_{j} ≤ a_{1}^{(j)} + ⋯ + a_{j}^{(j)} = b_{1} + ⋯ + b_{j−1} + a_{j}^{(j)}. Hence b_{j} ≤ a_{j}^{(j)}. In addition, for k in {1, … , j − 1}, a_{1}^{(k+1)} + ⋯ + a_{n}^{(k+1)} = a_{1}^{(k)} + ⋯ + a_{n}^{(k)} (= a_{1} + ⋯ + a_{n} = b_{1} + ⋯ + b_{n}). Thus a_{n}^{(j)} ≤ b_{n} ≤ b_{j}, because b_{1} + ⋯ + b_{n−1} ≤ a_{1}^{(j)} + ⋯ + a_{n−1}^{(j)}. Let m be the smallest number in {j + 1, … , n} such that a_{m}^{(j)} ≤ b_{j}. Then (∗) b_{h} ≤ b_{j} < a_{h}^{(j)} for each h in {j + 1, … , m − 1}.
Because a_{m}^{(j)} ≤ b_{j} ≤ a_{j}^{(j)}, there is a t in [0, 1] such that b_{j} = ta_{j}^{(j)} + (1 − t)a_{m}^{(j)}. Let τ be the transposition that interchanges j and m, and let ã_{j+1} be tã_{j} + (1 − t)τ(ã_{j}). Then ã_{j+1} = (b_{1}, … , b_{j}, a_{j+1}^{(j)}, … , a_{m−1}^{(j)}, a_{j}^{(j)} + a_{m}^{(j)} − b_{j}, a_{m+1}^{(j)}, … , a_{n}^{(j)}). If j + 1 = n, we are through. If not, we must show that b_{1} + ⋯ + b_{k} ≤ a_{1}^{(j+1)} + ⋯ + a_{k}^{(j+1)}, for each k in {1, … , n − 1}, to carry the construction forward. If 1 ≤ k ≤ j, then a_{1}^{(j+1)} + ⋯ + a_{k}^{(j+1)} = b_{1} + ⋯ + b_{k}. If j + 1 ≤ k ≤ m − 1, then from (∗), b_{1} + ⋯ + b_{k} ≤ b_{1} + ⋯ + b_{j} + a_{j+1}^{(j)} + ⋯ + a_{k}^{(j)} = a_{1}^{(j+1)} + ⋯ + a_{k}^{(j+1)}. Finally, if m ≤ k ≤ n − 1, then a_{1}^{(j+1)} + ⋯ + a_{k}^{(j+1)} = a_{1}^{(j)} + ⋯ + a_{k}^{(j)} ≥ b_{1} + ⋯ + b_{k}. ▪
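The (ii)→(iii) step is constructive and can be read as an algorithm that moves ã to b̃ one coordinate at a time. A sketch assuming Python with numpy; the name transposition_path and the input arrays are ours, for illustration:

```python
import numpy as np

def transposition_path(a, b, tol=1e-12):
    """Move a to b by steps v -> t*v + (1-t)*tau(v), tau a transposition,
    following the (ii) -> (iii) construction in the proof of Lemma 5.
    Assumes a and b are nonincreasing, have equal sums, and satisfy the
    partial-sum inequalities of (ii)."""
    v = np.array(a, dtype=float)
    n = len(v)
    path = [v.copy()]
    for j in range(n - 1):
        if abs(v[j] - b[j]) < tol:
            continue
        # smallest m > j with v[m] <= b[j]; it exists by the hypotheses
        m = next(k for k in range(j + 1, n) if v[k] <= b[j] + tol)
        w = v.copy()
        # convex mix of v and tau(v), with t chosen so w[j] becomes b[j]
        w[j], w[m] = b[j], v[j] + v[m] - b[j]
        v = w
        path.append(v.copy())
    return path

a = [0.9, 0.8, 0.2, 0.1]
b = [0.7, 0.5, 0.5, 0.3]                        # both sum to 2
path = transposition_path(a, b)
assert np.allclose(path[-1], b)
```

Each step preserves the coordinate sum, so every point of the path lies in the permutation polytope 𝒦_ã.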
Theorem 6.
Let ϕ be the mapping that assigns to each selfadjoint n × n matrix (a_{jk}) the point (a_{11}, … , a_{nn}) (= ã) in ℝ^{n}, 𝒦_{m} be the range of ϕ restricted to the set 𝒫_{m} of projections of rank m, where m ∈ {0, … , n}, and 𝒦 be the range of ϕ restricted to the set 𝒫 of projections. Then ã ∈ 𝒦_{m} if and only if 0 ≤ a_{jj} ≤ 1 for each j and ∑a_{jj} = m, and ã ∈ 𝒦 if and only if 0 ≤ a_{jj} ≤ 1 for each j and ∑a_{jj} ∈ {0, … , n}.
Proof:
Let (a_{jk}) (= A) be a selfadjoint matrix and U be the unitary matrix with ξ sin θ, sin θ at the j, j and k, k entries, respectively, −cos θ, ξ cos θ at the j, k and k, j entries, respectively, 1 at all diagonal entries other than j, j and k, k, and 0 at all other entries, where ξ is a complex number of modulus 1 such that ξ̄ā_{jk} = −ξa_{jk}. Then UAU^{−1} has a_{jj} sin^{2} θ + a_{kk} cos^{2} θ at the j, j entry, a_{jj} cos^{2} θ + a_{kk} sin^{2} θ at the k, k entry, and a_{hh} at the h, h entry when h ≠ j, k. Letting t be sin^{2} θ, τ be the transposition of {1, … , n} that interchanges j and k, and ã_{τ} be (a_{τ(1),τ(1)}, … , a_{τ(n),τ(n)}), we see that ϕ(UAU^{−1}) = tã + (1 − t)ã_{τ}. Because VEV^{−1} ∈ 𝒫_{m} for each unitary V, when E ∈ 𝒫_{m}, we see that, when ã ∈ 𝒦_{m}, so is tã + (1 − t)ã_{τ}, for each t in [0, 1] and each transposition τ of {1, … , n}.
As noted, tã + (1 − t)ã_{τ} ∈ 𝒦_{m} when ã ∈ 𝒦_{m}, for each t in [0, 1] and each transposition τ of {1, … , n}. From Lemma 5, 𝒦_{m} contains the permutation polytope 𝒦_{ã} of each ã in 𝒦_{m}. Now the point ã whose first m coordinates are 1 and whose last n − m coordinates are 0 is in 𝒦_{m}. If b̃ = (b_{1}, … , b_{n}), 0 ≤ b_{j} ≤ 1 for each j in {1, … , n} and ∑ b_{j} = m, then it follows that b_{1} ≤ 1, b_{1} + b_{2} ≤ 1 + 1, … , b_{1} + ⋯ + b_{m} ≤ m, b_{1} + ⋯ + b_{m+1} ≤ m + 0, … , b_{1} + ⋯ + b_{n−1} ≤ m. Again, from our lemma, b̃ ∈ 𝒦_{ã} ⊆ 𝒦_{m}. Thus 𝒦_{m} is as described in the statement. In particular, 𝒦_{m} is convex.
Since 𝒦 = ∪ 𝒦_{m}, 𝒦 is as described in the statement. ▪
We present another proof of our twelfth variation (Theorem 6) and extend the information contained there slightly, to yield our thirteenth variation. Specifically, we prove the following result.
Theorem 7.
If 〈a_{1}, … , a_{n}〉 is an ordered n-tuple of numbers in [0, 1] with sum a positive integer, then there is an idempotent selfadjoint n × n matrix with diagonal entries a_{1}, … , a_{n} and all entries real.
Proof:
Our proof proceeds by induction on m, the sum of a_{1}, … , a_{n}. In the case where m is 1, we let E_{1} be the projection matrix (acting on ℂ^{n} in the standard manner) that has range spanned by the vector (x =) (a_{1}^{1/2}, … , a_{n}^{1/2}). Let {e_{j}} be the orthonormal basis for ℂ^{n} where e_{j} is the n-tuple with 1 at the jth coordinate and 0 at all others. The matrix for E_{1} relative to this basis has 〈E_{1}e_{k}, e_{j}〉 as its j, kth entry. Since E_{1}e_{k} = 〈e_{k}, x〉x, we have 〈E_{1}e_{k}, e_{j}〉 = 〈e_{k}, x〉〈x, e_{j}〉 = a_{k}^{1/2}a_{j}^{1/2}, so each entry of the matrix is a nonnegative real number (positive, when no a_{j} is 0, and perforce none is 1 in this case, unless n = 1). The jth diagonal entry is 〈E_{1}e_{j}, e_{j}〉 (= a_{j} ≥ 0), as desired.
We take the inductive step. Suppose our assertion has been established when a_{1}, … , a_{n} has sum m − 1 (where m is an integer 2 or greater). Assume that a_{1} + ⋯ + a_{n} = m. Let k be the smallest integer j for which a_{1} + ⋯ + a_{j} ≥ m − 1 and a be m − 1 − ∑_{r=1}^{k−1} a_{r}. By inductive hypothesis, there is a selfadjoint idempotent E_{2} with matrix (a_{jr}) relative to the basis {e_{j}}, such that each a_{jr} is real, with diagonal a_{1}, … , a_{k−1}, a, 0, … , 0. Let F_{2} be E_{2} with the k + 1, k + 1 entry replaced by 1. Each a_{jr} with j or r greater than k is 0 (since E_{2} ≥ 0). Hence F_{2} is a projection. Let W_{k}(θ) be the unitary operator whose matrix relative to the basis {e_{j}} has sin θ at the k, k and k + 1, k + 1 entries, −cos θ and cos θ at the k, k + 1 and k + 1, k entries, respectively, 1 at all other diagonal entries, and 0 at all other offdiagonal entries. Let p(k, θ, F_{2}) be W_{k}(θ)F_{2}W_{k}(θ)*. Relative to the basis {e_{j}}, the matrix of p(k, θ, F_{2}) has diagonal entries a_{1}, … , a_{k−1}, a sin^{2} θ + cos^{2} θ, a cos^{2} θ + sin^{2} θ, 0, … , 0. The j, r entry is a_{jr} when both j and r do not exceed k − 1 and 0 when either j or r is greater than k + 1. The entries in the kth row of the matrix for p(k, θ, F_{2}) are a_{k,1} sin θ, … , a_{k,k−1} sin θ, a sin^{2} θ + cos^{2} θ, (a − 1) sin θ cos θ, 0, … , 0. The entries in the k + 1st row are a_{k,1} cos θ, … , a_{k,k−1} cos θ, (a − 1) sin θ cos θ, a cos^{2} θ + sin^{2} θ, 0, … , 0. The entries in the kth column are a_{1,k} sin θ, … , a_{k−1,k} sin θ, a sin^{2} θ + cos^{2} θ, (a − 1) sin θ cos θ, 0, … , 0 and in the k + 1st column are a_{1,k} cos θ, … , a_{k−1,k} cos θ, (a − 1) sin θ cos θ, a cos^{2} θ + sin^{2} θ, 0, … , 0.
By choice of k, m − 1 ≤ ∑_{r=1}^{k−1} a_{r} + a_{k}, whence a = m − 1 − ∑_{r=1}^{k−1} a_{r} ≤ a_{k} ≤ 1. For an appropriate choice θ_{2} of θ, a sin^{2} θ_{2} + cos^{2} θ_{2} = a_{k}, and a cos^{2} θ_{2} + sin^{2} θ_{2} = 1 + a − a_{k}. Let p(k, θ_{2}, F_{2}) be F_{3}. Each entry in the matrix for F_{3} is real.
The projection p(k + 1, θ, F_{3}) has as its diagonal entries a_{1}, … , a_{k}, (∑_{r=k+1}^{n} a_{r}) sin^{2} θ, (∑_{r=k+1}^{n} a_{r}) cos^{2} θ, 0, … , 0. Again, for an appropriate choice θ_{3} of θ, (∑_{r=k+1}^{n} a_{r}) sin^{2} θ_{3} = a_{k+1}. Thus the projection p(k + 1, θ_{3}, F_{3}) (= F_{4}) has as its diagonal entries a_{1}, … , a_{k+1}, ∑_{r=k+2}^{n} a_{r}, 0, … , 0. We continue with this construction, forming p(k + 2, θ, F_{4}) next and so forth, until we consider p(n − 1, θ, F_{n−k+1}). Choosing θ_{n−k+1} appropriately, we let F_{n−k+2} be the selfadjoint idempotent matrix p(n − 1, θ_{n−k+1}, F_{n−k+1}). The diagonal entries of the matrix for F_{n−k+2} are a_{1}, … , a_{n−1}, a_{n}, and all entries are real. ▪
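The inductive proof is itself an algorithm: adjoin a rank-one summand and then splice mass down the diagonal with plane rotations. A sketch assuming numpy; projection_with_diagonal is our name for the procedure, and it exploits the fact, established in the proof, that the off-diagonal entry of each rotated 2 × 2 block is 0 at the moment it is rotated, so each rotation replaces the two diagonal entries by convex mixtures of themselves:

```python
import numpy as np

def projection_with_diagonal(a, tol=1e-12):
    """Return a real symmetric idempotent matrix whose diagonal is a,
    where each a_j lies in [0, 1] and sum(a) is a positive integer.
    A sketch of the inductive construction in the proof of Theorem 7."""
    a = np.asarray(a, dtype=float)
    n = len(a)
    m = int(round(a.sum()))
    if m == 1:
        x = np.sqrt(a)                  # unit vector, as in Proposition 1
        return np.outer(x, x)           # rank-one projection onto span{x}
    # Smallest k (1-based) with a_1 + ... + a_k >= m - 1.
    k = next(j for j in range(1, n + 1) if a[:j].sum() >= m - 1 - tol)
    s = (m - 1) - a[:k - 1].sum()       # the number called "a" in the proof
    inner = np.zeros(n)
    inner[:k - 1], inner[k - 1] = a[:k - 1], s
    F = projection_with_diagonal(inner) # rank m - 1, diagonal "inner"
    F[k, k] = 1.0                       # adjoin a rank-one direct summand
    # Splice: rotate in coordinates (j, j+1); the (j, j+1) entry of F is 0
    # at each step, so the diagonal entries mix convexly.
    for j in range(k - 1, n - 1):
        d, e = F[j, j], F[j + 1, j + 1]
        t = 1.0 if abs(d - e) < tol else (a[j] - e) / (d - e)
        t = min(max(t, 0.0), 1.0)
        W = np.eye(n)
        W[j, j] = W[j + 1, j + 1] = np.sqrt(t)
        W[j, j + 1], W[j + 1, j] = -np.sqrt(1 - t), np.sqrt(1 - t)
        F = W @ F @ W.T
    return F

diag = [0.9, 0.7, 0.6, 0.5, 0.3]        # entries in [0, 1] summing to 3
E = projection_with_diagonal(diag)
assert np.allclose(E, E.T) and np.allclose(E @ E, E)
assert np.allclose(np.diag(E), diag)
```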
Remark 8.
When we constructed p(k, θ, F_{2}) in the preceding argument, if F_{2} is replaced by A with the same matrix except that the k + 1, k + 1 entry is b rather than 1, then the matrix of p(k, θ, A) has diagonal entries a_{1}, … , a_{k−1}, a sin^{2} θ + b cos^{2} θ, a cos^{2} θ + b sin^{2} θ, 0, … , 0. The entries of the kth and k + 1st rows and columns remain the same except that “(a − 1)” becomes “(a − b),” and “a sin^{2} θ + cos^{2} θ” and “a cos^{2} θ + sin^{2} θ” become “a sin^{2} θ + b cos^{2} θ” and “a cos^{2} θ + b sin^{2} θ.” All other entries remain the same. This general process of transforming a matrix by our unitary matrix so that two segments of the diagonal are altered by replacing their terminal and initial elements by convex combinations of the two in such a way that the sum of the original elements is the same as the sum of the replacements will be referred to as splicing.
We have applied this general construction once with 1 for b and, for the rest, with 0 for b. With 1 for b, (a − 1) sin θ_{2} cos θ_{2} appears at the k + 1, k and k, k + 1 entries of F_{3}. Because a < 1 and θ_{2} ∈ (0, π/2), in general, these entries are negative, even though E_{1} has a matrix of nonnegative real entries, and F_{2} may have all its entries real and nonnegative. At a lecture on this topic, Frank Hansen raised the possibility of constructing our projection with specified diagonal so that all its entries are real and nonnegative.‡ This is accomplished in the case of a one-dimensional projection by the construction given in Proposition 1. It would be interesting to know whether this is possible in general, and whether the construction can be altered to produce such a projection.
Remark 9.
As noted at the end of Section 2, the question of whether there is an m-dimensional subspace of our n-dimensional Hilbert space from which the elements of a given orthonormal basis {e_{1}, … , e_{n}} have distances r_{1}, … , r_{n}, respectively, is equivalent, by the Pythagorean Theorem, to the existence of such a subspace on which e_{1}, … , e_{n} have orthogonal projections of lengths t_{1}, … , t_{n}, respectively, where r_{j}^{2} + t_{j}^{2} = 1. Because this latter question is answered affirmatively by Theorem 6 if and only if t_{1}^{2} + ⋯ + t_{n}^{2} = m, and ∑_{j=1}^{n} (r_{j}^{2} + t_{j}^{2}) = n, the former question is answered affirmatively if and only if 0 ≤ r_{j} ≤ 1 and r_{1}^{2} + ⋯ + r_{n}^{2} = n − m. This last variation, our fourteenth, is equivalent to the assertion that there is an "m-plane" through the origin tangent to each of the spheres S_{1}, … , S_{n} with centers at e_{1}, … , e_{n} and radii r_{1}, … , r_{n}, respectively, if and only if 0 ≤ r_{j} ≤ 1 and r_{1}^{2} + ⋯ + r_{n}^{2} = n − m.
Remark 10.
Another proof of the formula of Proposition 3 was promised earlier. With the notation established in that proposition and its proof, let F be the projection of ℋ onto ℋ_{0} and E the projection with range spanned by {e_{1}, … , e_{r}}. Then a = ∑_{j=1}^{r} 〈Fe_{j}, e_{j}〉 = tr(EFE) and b = ∑_{j=r+1}^{n} 〈(I − F)e_{j}, e_{j}〉 = tr((I − E)(I − F)(I − E)). Thus a − b = tr(EFE) − tr((I − E)(I − F)(I − E)) = tr(FE) − tr((I − E)(I − F)) = tr(FE) − [n − r − m + tr(FE)] = m − n + r. Formulated in matrix terms, this equality takes on the following form: If a is the sum of any r elements of the diagonal of an n × n matrix of a projection of rank m, and b is the sum of the result of subtracting each of the remaining n − r diagonal elements from 1, then a − b = m − n + r. In these same matrix terms, a (= tr((FE)*FE)) is the trace of the principal upper r × r block of the matrix for F (relative to {e_{j}}) and also the sum of the squares of the absolute values of the entries in the matrix for FE, that is, the sum of those squares for the entries in the first r columns of the matrix for F. This sum of squares is the square of the Hilbert–Schmidt norm of FE (and of EF). We write "∥FE∥_{2}^{2}" for that sum. (More will be said about this in the infinite-dimensional case.) In this notation, our formula is ∥FE∥_{2}^{2} − ∥(I − F)(I − E)∥_{2}^{2} = m − n + r. Surprisingly (at first sight), ∥EFE∥_{2}^{2} − ∥(I − E)(I − F)(I − E)∥_{2}^{2} is also m − n + r. To prove this, note that ∥EFE∥_{2}^{2} is the sum of the squares of the absolute values of the matrix entries of the principal upper r × r block of the matrix for F, and ∥(I − E)(I − F)(I − E)∥_{2}^{2} is the same sum for the principal lower (n − r) × (n − r) block of I − F. Since (I − E)(I − F)E = −(I − E)FE, we have ∥(I − E)(I − F)E∥_{2} = ∥(I − E)FE∥_{2}, whence ∥EFE∥_{2}^{2} − ∥(I − E)(I − F)(I − E)∥_{2}^{2} = ∥FE∥_{2}^{2} − ∥(I − E)(I − F)∥_{2}^{2} (at the same time, ∥(I − E)(I − F)∥_{2} is ∥(I − F)(I − E)∥_{2}), which is m − n + r, by the trace computation with which this remark began.
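Both Hilbert–Schmidt identities of this remark are easy to test numerically; a sketch assuming numpy, with E the projection on the span of the first r basis vectors:

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, r = 6, 4, 2
Q, _ = np.linalg.qr(rng.normal(size=(n, m)))
F = Q @ Q.T                                    # projection of rank m
E = np.diag([1.0] * r + [0.0] * (n - r))       # projection on span{e_1,...,e_r}
I = np.eye(n)

def hs2(T):
    """Squared Hilbert-Schmidt norm: sum of squared entries."""
    return np.sum(T ** 2)

val = m - n + r                                # here 4 - 6 + 2 = 0
assert np.isclose(hs2(F @ E) - hs2((I - F) @ (I - E)), val)
assert np.isclose(hs2(E @ F @ E) - hs2((I - E) @ (I - F) @ (I - E)), val)
```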
With the notation established in this remark, if we assume that the ranges of E and F and of their complements have intersections (0), then we may view a − b (= m − n + r) as the index of E(I − F). To see this, note that the null space of E(I − F) is F(ℋ) ∨ ((I − F)(ℋ) ∧ (I − E)(ℋ)), which is F(ℋ), by assumption. (See ref. 3, proposition 2.5.14.) The null space of (I − F)E (= [E(I − F)]*) is (I − E)(ℋ) ∨ (E(ℋ) ∧ F(ℋ)), which is (I − E)(ℋ), by assumption. Thus the index of the operator E(I − F) is m − (n − r) (= m − n + r).
Remark 11.
Another approach to proving our formula “a − b = m − n + r” results from stochastic–matrix methods. We describe the stochastic matrices, introducing some terminology and establishing some basic facts that will be useful to us. Although our present interest is the finite discrete case, these stochastic–matrix considerations will reappear in the infinite case.
For later use, we develop the basics in the infinite case as well as the finite. We deal with matrices having complex entries and the property that each row and each column sums to r. If r is 1 and all entries are nonnegative real numbers, the matrices are the well-studied doubly stochastic matrices (the entries representing stationary transition probabilities from one state of a discrete Markov process to another). If an "r-sum" matrix has n rows and m columns (with n and m finite and r nonzero), then n = m, for summing each row and then adding those sums yields nr as the sum of all entries, while summing each column and then adding those sums yields mr as the sum of all matrix entries.
Let A be a matrix whose rows are indexed by a set 𝔸 and whose columns are indexed by a set 𝔹. We say that the submatrix A_{0} of A consisting of those entries in the rows corresponding to a given subset 𝔸_{0} of 𝔸 and, at the same time, in the columns corresponding to a subset 𝔹_{0} of 𝔹 is a block (in A, the 𝔸_{0}, 𝔹_{0} block). The complementary block A′_{0} to A_{0} is the 𝔸′_{0}, 𝔹′_{0} block in A, where 𝔸′_{0} = 𝔸∖𝔸_{0} and 𝔹′_{0} = 𝔹∖𝔹_{0}. The weight w(A_{0}) of the block A_{0} is the sum of its entries. In the case where A_{0}, and hence A, has an infinite number of entries, this sum is taken over the net of finite subsums, directed by inclusion, provided that net converges. If the entries of A are nonnegative real numbers, r is positive, and A is infinite, then w(A) is ∞, for each row sums to r and there are an infinite number of rows. Of course, w(A_{0}) is finite when A_{0} is a finite block. In this case, the sum of the entries in the (finite number of) rows and columns corresponding to A_{0} is finite, whence w(A′_{0}), the sum of the remaining entries in A, is ∞ (still under the assumption that A is an infinite matrix). Despite these observations, there are infinite blocks A_{0}, with infinite complements A′_{0}, such that w(A_{0}) and w(A′_{0}) are both finite. The article on the infinite discrete case to follow this article will contain a description of a method for generating such blocks.
The differences of the weights of complementary blocks of doubly stochastic matrices are intimately related to the Pythagorean Theorem. To describe that relation, we note first that each pair of orthonormal bases {e_{j}}_{j∈ℤ_{0}} and {f_{j}}_{j∈ℤ_{0}} of a Hilbert space ℋ, where ℤ_{0} = ℤ_{+} ∪ ℤ_{−}, ℤ_{+} are the positive integers, and ℤ_{−} are their negatives, gives rise to a doubly stochastic matrix. If a_{jk} = |〈e_{j}, f_{k}〉|^{2}, then ∑_{k∈ℤ_{0}} a_{jk} = ∥e_{j}∥^{2} = 1 for each j in ℤ_{0}, from Parseval's equality, because e_{j} = ∑_{k∈ℤ_{0}} 〈e_{j}, f_{k}〉f_{k}. Symmetrically, ∑_{j∈ℤ_{0}} a_{jk} = ∥f_{k}∥^{2} = 1 for each k in ℤ_{0}. Thus (a_{jk}) is a doubly stochastic infinite matrix. If U is the unitary operator on ℋ such that Uf_{j} = e_{j} for each j in ℤ_{0}, then 〈Uf_{j}, f_{k}〉 = 〈e_{j}, f_{k}〉 = u_{kj}, the k, j entry of the matrix for U corresponding to the basis {f_{j}}. Thus |u_{kj}|^{2} = a_{jk}.
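A finite analogue of this correspondence, assuming numpy: the squared moduli |〈e_j, f_k〉|^2 of the inner products of two orthonormal bases form a doubly stochastic matrix.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
# Two orthonormal bases of C^4: the columns of E and of F.
E, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
F, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

A = np.abs(E.conj().T @ F) ** 2          # a_jk = |<e_j, f_k>|^2
assert np.allclose(A.sum(axis=1), 1)     # each row sums to 1 (Parseval)
assert np.allclose(A.sum(axis=0), 1)     # each column sums to 1
```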
In the case of finite doubly stochastic matrices, we derive a formula relating the weights of complementary blocks (a “Pythagorean Theorem” for doubly stochastic matrices) that provides us with another proof of our formula, a − b = m − n + r.
Proposition 12.
If A is an n × n doubly stochastic matrix and A_{0} is a block in A with p rows and q columns, then w(A_{0}) − w(A′_{0}) = p − (n − q) = p + q − n.
Proof:
The sum of the p rows of A corresponding to the p rows of A_{0} is p, and the sum of the n − q columns of A corresponding to the columns of A′_{0} is n − q. The difference of these sums, p − n + q, is w(A_{0}) − w(A′_{0}). ▪
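Proposition 12 can be spot-checked with any finite doubly stochastic matrix; squaring the entries of an orthogonal matrix gives one (a sketch assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
A = Q ** 2                       # doubly stochastic: rows and columns sum to 1
p, q = 2, 3
A0 = A[:p, :q]                   # the block with p rows and q columns
A0c = A[p:, q:]                  # its complementary block
assert np.isclose(A0.sum() - A0c.sum(), p - (n - q))
```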
Given an orthonormal basis {e_{1}, … , e_{n}} for the n-dimensional Hilbert space ℋ and an m-dimensional subspace ℋ_{0} with orthogonal complement ℋ′_{0}, choose orthonormal bases {f_{1}, … , f_{m}} and {f_{m+1}, … , f_{n}} for ℋ_{0} and ℋ′_{0}, respectively, and let a_{jk} be |〈e_{j}, f_{k}〉|^{2}. As noted in the infinite-dimensional case, (a_{jk}) is a doubly stochastic matrix A, an n × n matrix, in this case. If A_{0} is the r × m block whose entries are a_{jk} with j in {1, … , r} and k in {1, … , m}, and F is the projection of ℋ onto ℋ_{0}, then Fe_{j} is ∑_{k=1}^{m} 〈e_{j}, f_{k}〉f_{k} and (I − F)e_{j} is ∑_{k=m+1}^{n} 〈e_{j}, f_{k}〉f_{k}. Thus ∥Fe_{j}∥^{2} is the sum of the jth row of A_{0}, when 1 ≤ j ≤ r, and ∥(I − F)e_{r+j}∥^{2} is the sum of the jth row of A′_{0}, when 1 ≤ j ≤ n − r. Thus these sums are a_{j} and 1 − a_{r+j}, respectively, where a_{p} is the pth diagonal entry of the matrix for F relative to the basis {e_{1}, … , e_{n}}. It follows that ∑_{j=1}^{r} a_{j} − ∑_{j=r+1}^{n} (1 − a_{j}) = w(A_{0}) − w(A′_{0}) = m − n + r from Proposition 12. Again, with a the sum of the squares of the lengths of the projections of the r elements e_{1}, … , e_{r} of {e_{1}, … , e_{n}} onto ℋ_{0} and b the sum of the squares of the lengths of the projections of the remaining n − r basis elements onto ℋ′_{0}, a − b = m − n + r.
4. Finite Continuous Dimensionality
The Pythagorean and Carpenter's Theorems, in the form of Proposition 2 and its operator–matrix variant (referring to the trace of a rank m projection), deal with projections on an n-dimensional Hilbert space ℋ and the diagonals of their matrices with respect to a given orthonormal basis. Denoting by "ℬ(ℋ)" the algebra of all (bounded) operators on ℋ (also when ℋ is infinite dimensional) and by "𝒜" the algebra of all operators in ℬ(ℋ) with diagonal matrices relative to the given basis, we have that 𝒜 is a maximal abelian selfadjoint subalgebra of ℬ(ℋ) (a "masa," that is, if TA = AT for each A in 𝒜, then T ∈ 𝒜, and A* ∈ 𝒜 when A ∈ 𝒜). The masas are precisely the subalgebras of ℬ(ℋ) consisting of all operators whose matrices are diagonal relative to some fixed orthonormal basis for ℋ. For our purposes, orthonormal bases and masas are interchangeable.

The mapping Φ that assigns to T in ℬ(ℋ) the element Φ(T) in the masa 𝒜 corresponding to the diagonal of the matrix for T (relative to the orthonormal basis associated with 𝒜) has special properties. It is linear [from ℬ(ℋ) onto 𝒜], maps positive operators to positive operators, and maps the identity operator I in ℬ(ℋ) to I. With A and B in 𝒜, we have that Φ(ATB) = AΦ(T)B. A mapping such as Φ is said to be a conditional expectation [of ℬ(ℋ) onto 𝒜]. For this Φ, tr(TA) = tr(Φ(T)A) for each A in 𝒜. In particular, tr(T) = tr(Φ(T)), for each T in ℬ(ℋ). Conversely, if tr(T) = tr(Φ′(T)) when T ∈ ℬ(ℋ), for a conditional expectation Φ′ of ℬ(ℋ) onto 𝒜, then, again, tr(TA) = tr(Φ′(TA)) = tr(Φ′(T)A), for each A in 𝒜. Thus tr([Φ(T) − Φ′(T)]A) = 0, for each A in the algebra 𝒜, and tr([Φ(T) − Φ′(T)][Φ(T) − Φ′(T)]*) = 0. It follows that Φ(T) − Φ′(T) = 0 for each T in ℬ(ℋ) and Φ = Φ′. When the conditional expectation Φ has the property that tr(T) = tr(Φ(T)), for each T in ℬ(ℋ), we say that Φ lifts the trace [from 𝒜 to ℬ(ℋ)]. We have just proved that there is a unique conditional expectation of ℬ(ℋ) onto a masa that lifts the trace.
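For the diagonal masa of matrices, the conditional expectation Φ is simply "keep the diagonal," and its module and trace-lifting properties can be verified directly; a sketch assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 4

def Phi(T):
    """Conditional expectation onto the diagonal masa: keep the diagonal."""
    return np.diag(np.diag(T))

T = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = np.diag(rng.normal(size=n))          # elements of the masa are diagonal
B = np.diag(rng.normal(size=n))

# Module property: Phi(A T B) = A Phi(T) B for A, B in the masa.
assert np.allclose(Phi(A @ T @ B), A @ Phi(T) @ B)
# Phi lifts the trace: tr(T) = tr(Phi(T)).
assert np.isclose(np.trace(T), np.trace(Phi(T)))
```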
In these terms, with E a projection in ℬ(ℋ), tr(Φ(E)) is the sum of the squares of the lengths of the projections onto the range of E of the basis vectors corresponding to 𝒜, where tr is the unique linear functional on ℬ(ℋ) such that tr(I) = n and tr(AB) = tr(BA), for all A and B in ℬ(ℋ), and tr(E) is the rank m of E. Thus the equality

tr(Φ(E)) = tr(E) (= m)     (∗)
is the Pythagorean Theorem as expressed in Proposition 2. In these same terms, the Carpenter's Theorem states that if A ∈ 𝒜, 0 ≤ A ≤ I, and tr(A) = m, then there is a projection E in ℬ(ℋ) (necessarily of rank m), such that Φ(E) = A.
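Both statements admit a direct finite-dimensional check. The sketch below (a hedged illustration; the dimension, the rank, and the explicit 2 × 2 Carpenter construction are choices made for the example) verifies that the diagonal of a rank m projection consists of the squared lengths of the projected basis vectors and sums to m, and exhibits the 2 × 2 case of the Carpenter's Theorem:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 6, 3  # dimensions chosen only for the example

# A random rank-m projection: orthonormal columns Q, E = Q Q*
Q, _ = np.linalg.qr(rng.standard_normal((n, m)))
E = Q @ Q.T

# Pythagorean (*): the squared lengths of the projections of the basis
# vectors e_i onto ran E are the diagonal entries of E, summing to m = tr(E).
sq_lengths = np.array([np.linalg.norm(E @ np.eye(n)[:, i]) ** 2
                       for i in range(n)])
assert np.allclose(sq_lengths, np.diag(E))
assert np.isclose(sq_lengths.sum(), m)

# Carpenter's Theorem, 2x2 case: any diagonal (a, 1 - a) with 0 <= a <= 1
# is the diagonal of a rank-1 projection.
a = 0.3
s = np.sqrt(a * (1 - a))
F = np.array([[a, s],
              [s, 1 - a]])
assert np.allclose(F @ F, F) and np.allclose(F, F.T)  # F is a projection
```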
Trace considerations will play a role when we discuss the Pythagorean Theorem for the case of an infinite-dimensional projection in the next article, although there is no trace functional defined on all of ℬ(ℋ) when ℋ is infinite dimensional. There are, however, subalgebras of ℬ(ℋ), the factors of type II_{1}, that serve as an infinite-dimensional generalization of the finite-dimensional ℬ(ℋ) and are, in many ways, a more appropriate replacement than the infinite-dimensional ℬ(ℋ) itself. For one thing, these factors have a (unique) trace functional defined on them. For another, they are simple algebras, whereas the infinite-dimensional ℬ(ℋ) is not. They can be characterized as the simple algebras consisting of all operators commuting with a self-adjoint operator algebra and admitting a trace.
Examples of factors ℳ of type II_{1} are provided by (countably) infinite (discrete) groups G, each of whose conjugacy classes, other than that of the unit e of G, is infinite (i.c.c. groups). Let ℋ be l_{2}(G), the Hilbert space of complex-valued functions ϕ on G such that ∑_{g∈G} |ϕ(g)|^{2} < ∞, provided with the inner product 〈ϕ, ψ〉 = ∑_{g∈G} ϕ(g)\overline{ψ(g)}. If (R_{h}ϕ)(g) = ϕ(gh), for each ϕ in ℋ and g in G, then R_{h} is a unitary operator on ℋ (right translation by h). The family {T : TR_{h} = R_{h}T, h ∈ G} [those operators in ℬ(ℋ) commuting with all R_{h}], denoted by “ℒ_{G},” is a factor of type II_{1} (the “left von Neumann group algebra” of G).
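A genuine factor of type II_{1} requires an infinite i.c.c. group, but the operators R_{h} and the left translations commuting with them can be illustrated on a small finite group. The following sketch (using S_3 purely as a toy model, not as a II_{1} factor) checks that each R_{h} is unitary and that every left translation lies in the commutant {T : TR_{h} = R_{h}T}:

```python
import numpy as np
from itertools import permutations

# Toy model of the construction on l_2(G), with G = S_3 (6 elements).
G = list(permutations(range(3)))
idx = {g: i for i, g in enumerate(G)}

def mul(g, h):                 # composition: (g h)(x) = g(h(x))
    return tuple(g[h[x]] for x in range(3))

def inv(g):
    out = [0] * 3
    for i, gi in enumerate(g):
        out[gi] = i
    return tuple(out)

def R(h):                      # right translation: (R_h phi)(g) = phi(g h)
    M = np.zeros((6, 6))
    for i, g in enumerate(G):
        M[i, idx[mul(g, h)]] = 1.0
    return M

def L(k):                      # left translation: (L_k phi)(g) = phi(k^{-1} g)
    M = np.zeros((6, 6))
    for i, g in enumerate(G):
        M[i, idx[mul(inv(k), g)]] = 1.0
    return M

# Each R_h is unitary, and every L_k commutes with every R_h, so the
# left translations lie in the commutant defining the group algebra.
for h in G:
    assert np.allclose(R(h) @ R(h).T, np.eye(6))
    for k in G:
        assert np.allclose(L(k) @ R(h), R(h) @ L(k))
```

The commutation is the matrix form of the identity ϕ(k^{−1}(gh)) = ϕ((k^{−1}g)h), which holds verbatim for infinite G as well.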
Let ℳ be a factor of type II_{1} and τ the unique “tracial state” on ℳ [characterized as a linear functional on ℳ such that τ(I) = 1 and τ(AB) = τ(BA), for each A and B in ℳ, but possessing many other properties]. Each spectral projection E for a self-adjoint A in ℳ is a limit on vectors (“strong-operator” limit) of a sequence p_{n}(A), where each p_{n} is a polynomial function on the reals. Thus TE = ET when TA = AT, and E ∈ ℳ. It follows that ℳ is generated by (is the “norm closure” of the linear span of) the projections in ℳ. Restricted to these projections, τ is a “dimension function,” τ(E) being the dimension of the range of E “relative to ℳ.” In the case of ℬ(ℋ), where ℋ has finite dimension n, we used “tr” in place of τ, and tr(I) is n, as is appropriate, because there are minimal projections in ℬ(ℋ). There are no minimal projections in a factor of type II_{1} and no “natural” projection to which to assign trace (“dimension”) 1 other than I. The structural properties of factors ℳ of type II_{1} allow us to conclude that, for each real number a in [0, 1], there are projections E in ℳ such that τ(E) = a; that is, the range of the dimension function on ℳ is the entire closed unit interval [0, 1]. Thus the factors of type II_{1} provide us with a natural extension of the finite-dimensional ℬ(ℋ) to a central simple algebra in which each of the projections has finite “rank” and the dimensions of the projections form a “continuous” range of values.
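In finite dimensions the statement about spectral projections is exact rather than a limit: a spectral projection of a self-adjoint matrix A is literally p(A) for a polynomial p interpolating the indicator of the chosen eigenvalues. The sketch below illustrates this (the eigenvalues and the normalized trace τ = tr/n are choices made for the example, under the assumption of distinct eigenvalues):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# A self-adjoint matrix with known, well-separated eigenvalues (chosen).
V, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
lam = np.array([-1.0, 0.5, 1.0, 2.0])              # eigenvalues, ascending
A = V @ np.diag(lam) @ V.T

# Spectral projection E onto the eigenspaces of the two largest eigenvalues.
E = V[:, 2:] @ V[:, 2:].T

# Lagrange-interpolate the indicator (0 on {-1, 0.5}, 1 on {1, 2});
# then p(A) = V diag(p(lam)) V^T = E exactly.
coef = np.polyfit(lam, [0.0, 0.0, 1.0, 1.0], n - 1)   # highest degree first
pA = sum(c * np.linalg.matrix_power(A, n - 1 - k) for k, c in enumerate(coef))
assert np.allclose(pA, E)

# With the normalized trace tau = tr/n, tau(E) = rank/n lies in [0, 1];
# in a II_1 factor every value in [0, 1] occurs.
assert np.isclose(np.trace(E) / n, 0.5)
```

Since p(A) is a polynomial in A, anything commuting with A commutes with E, which is the point made in the text.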
An orthonormal basis relative to ℳ is precisely what we arrived at in the case of ℬ(ℋ), with ℋ finite dimensional, that is, a masa 𝒜 in ℳ. In this case, there is a conditional expectation Φ of ℳ onto 𝒜 that lifts the trace, although it is more complicated to construct than passing to the diagonal of a matrix (see p. 403 of ref. 2). The paraphrased version of (∗),

τ(Φ(E)) = τ(E)     (∗∗)
is the Pythagorean Theorem for the case of finite continuous dimensionality (that is, in a factor of type II_{1}). The Carpenter's Theorem for this case asserts that each A in 𝒜 such that 0 ≤ A ≤ I is Φ(E) for some projection E in ℳ. This will be proved in a later article.
A factor ℳ of type II_{1} may be thought of and studied as a noncommutative algebra of (bounded) measurable functions on a (noncommutative) measure space, the projections in ℳ serving as the “characteristic” (or “indicator”) functions on the measure space. In the case of a classical measure space, the algebra of bounded measurable functions on the space is (isomorphic to) a masa in some ℬ(ℋ). Our Pythagorean Theorem describes a property (“lifting the trace”) of a mapping (conditional expectation) from the projections in ℳ to a masa 𝒜 in ℳ. So that theorem describes a certain (trace, that is, integral) property of a mapping from a noncommutative, finite, continuous measure algebra (ℳ) to a commutative measure algebra (𝒜). In that sense, it is a semicommutative result in the metric Euclidean geometry of spaces with finite continuous dimensionality.
Let 𝒩 be a von Neumann subalgebra of a factor ℳ of type II_{1} (that is, 𝒩 is a self-adjoint subalgebra of ℳ consisting of all operators that commute with some other self-adjoint algebra). By techniques akin to those used to prove classical Radon–Nikodým results (suitably modified to apply to the case of noncommutative measure spaces), it was shown (in 1950) that there is a (unique) conditional expectation Φ of ℳ onto 𝒩 that lifts the trace. Thus τ(E) = τ(Φ(E)) for each projection E in ℳ. If 𝒩 is noncommutative, for example, if it is a subfactor of ℳ, then the domain and range of Φ are noncommutative. In that case, the equality τ(E) = τ(Φ(E)) is a (fully) noncommutative version of the Pythagorean Theorem. Again, the Carpenter's Theorem would describe the range of Φ restricted to the projections in ℳ. There is even a version of Proposition 3 that is valid in a factor ℳ of type II_{1}, with 𝒜 a masa in ℳ, E a projection in 𝒜, and F a projection in ℳ; the computation of Remark 10 applies to prove it.
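The trace-lifting conditional expectation onto a noncommutative subalgebra can be modeled in finite dimensions by a normalized partial trace. In the sketch below (a hypothetical toy, with M_4(ℂ) and M_2(ℂ) ⊗ I_2 standing in for ℳ and its subfactor 𝒩; these choices are the example's, not the paper's), Φ is checked to be an 𝒩-bimodule map that lifts the normalized trace:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy model: M = M_4(C), N = M_2(C) tensor I_2 (a "subfactor"),
# Phi = normalized partial trace over the second tensor factor.
def Phi(T):
    T4 = T.reshape(2, 2, 2, 2)                  # indices (i, a, j, b)
    small = np.trace(T4, axis1=1, axis2=3) / 2  # average out the second factor
    return np.kron(small, np.eye(2))

tau = lambda T: np.trace(T) / 4                 # normalized trace on M_4

T = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = np.kron(rng.standard_normal((2, 2)), np.eye(2))  # elements of N
B = np.kron(rng.standard_normal((2, 2)), np.eye(2))

assert np.allclose(Phi(A @ T @ B), A @ Phi(T) @ B)   # N-bimodule map
assert np.isclose(tau(T), tau(Phi(T)))               # Phi lifts the trace
```

Here both the domain and the range of Φ are noncommutative, which is the sense in which τ(E) = τ(Φ(E)) is a fully noncommutative Pythagorean statement.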
The Pythagorean investigation can be extended to include C*-algebras with faithful tracial states and their C*-subalgebras. Under what conditions are there trace-lifting conditional expectations, and what are their ranges when restricted to the projections in the algebra?
 Accepted December 17, 2001.
 Copyright © 2002, The National Academy of Sciences