The hypergraph regularity method and its applications
- *Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322; †Department of Mathematics and Statistics, University of Nevada, Reno, NV 89557; ‡Instituto de Matemática e Estatística, Universidade de São Paulo, Rua do Matão 1010, CEP 05508-090, São Paulo, Brazil; and §Institut für Informatik, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
-
Communicated by Ronald L. Graham, University of California at San Diego, La Jolla, CA, April 6, 2005 (received for review March 16, 2005)
Abstract
Szemerédi's regularity lemma asserts that every graph can be decomposed into relatively few random-like subgraphs. This random-like behavior enables one to find and enumerate subgraphs of a given isomorphism type, yielding the so-called counting lemma for graphs. The combined application of these two lemmas is known as the regularity method for graphs and has proved useful in graph theory, combinatorial geometry, combinatorial number theory, and theoretical computer science. Here, we report on recent advances in the regularity method for k-uniform hypergraphs, for arbitrary k ≥ 2. This method, purely combinatorial in nature, gives alternative proofs of density theorems originally due to E. Szemerédi, H. Furstenberg, and Y. Katznelson. Further results in extremal combinatorics also have been obtained with this approach. The two main components of the regularity method for k-uniform hypergraphs, the regularity lemma and the counting lemma, have been obtained recently: Rödl and Skokan (based on earlier work of Frankl and Rödl) generalized Szemerédi's regularity lemma to k-uniform hypergraphs, and Nagle, Rödl, and Schacht succeeded in proving a counting lemma accompanying the Rödl–Skokan hypergraph regularity lemma. The counting lemma is proved by reducing the counting problem to a simpler one previously investigated by Kohayakawa, Rödl, and Skokan. Similar results were obtained independently by W. T. Gowers, following a different approach.
In 1975, Szemerédi (1) confirmed a long-standing conjecture of Erdős and Turán (2) concerning the upper density of sets containing no arithmetic progression with a fixed number of elements.
Theorem 1 [Szemerédi's theorem ( 1 )]. For every positive integer t and every δ > 0, there exists an integer N 0 = N 0(t, δ) such that, for N ≥ N 0, any subset Z ⊂ [N] = {1,..., N} with |Z| ≥ δN contains an arithmetic progression with t elements.
In 1977, shortly after Szemerédi's combinatorial proof appeared, Furstenberg (3) gave an alternative proof of Theorem 1 using methods of ergodic theory. Refining the techniques of that proof, Furstenberg and Katznelson later were able to derive several other density versions of combinatorial partition theorems. The following theorem, which may be viewed as a density version of the Gallai–Witt theorem, is one of them (in what follows, we denote by [-t; t] the set of all integers i satisfying -t ≤ i ≤ t).
Theorem 2 [Furstenberg and Katznelson (
4
)].
Let T be a finite subset of
and let δ > 0 be given. Then there exists a finite subset C of
such that any subset Z ⊂ C with |Z| > δ|C| contains a homothetic copy of T, i.e., a set of the form z + λT for some
and some λ ≠ 0. Moreover, if T ⊂ [-t; t]d for some positive integer t, then C = [-N; N]d has the above property for every sufficiently large N ≥ N
0(t, d, δ).
Note that the special case of Theorem 2 for d = 1 implies Theorem 1. More generally, for fixed d, the above result allows us to find a homothetic copy of a full-dimensional cube [-t; t]d in any dense subset of a sufficiently large cube of the same dimension. Two other results in a similar vein address the complementary case in which the dimension is allowed to grow.
Theorem 3 [Furstenberg and Katznelson (
5
)].
Let
be the finite field with q elements. For every positive integer d and every δ > 0, there exists M
0 = M
0(q, d, δ) such that, for M ≥ M
0, any subset
with
contains a d-dimensional affine subspace.
Theorem 4 [Furstenberg and Katznelson ( 5 )]. Let G be a finite abelian group and let δ > 0 be given. Then there exists M 0 = M 0(G, δ) such that if M ≥ M 0 and Z is a subset of GM with |Z| > δ|G|M, then Z contains a coset of a subgroup of GM isomorphic to G.
The techniques introduced by Furstenberg and Katznelson have been extended to prove other generalizations of Theorems 1–4, among which are a density version of the Hales–Jewett theorem (6), again due to Furstenberg and Katznelson (7), and polynomial extensions of Szemerédi's theorem, due to Bergelson and Leibman (8) and Bergelson and McCutcheon (9).
Another area of investigation concerns estimates on N
0 and M
0 in Theorems 1–4. Szemerédi's original proof of Theorem 1 uses the regularity lemma for graphs (10), upcoming Theorem 9, which forces (see ref. 11) the upper bound on N
0 = N
0(t, δ) to exceed a tower function of height polynomial in 1/δ. Using, among others, methods of Fourier analysis, Gowers (12) gave an alternative proof of Theorem 1 rendering the immensely improved estimate
The original proofs of Theorems 2–4 and their generalizations rely on ergodic theory and do not yield any upper bounds on N
0 and M
0.
The aim of this work is to report on an extremal result for hypergraphs, which can be called “removal lemma” (see Theorem 5). This lemma has a number of applications, including purely combinatorial proofs of Theorems 1–4. This approach also yields the first quantitative bounds on N 0 and M 0 in Theorems 2–4. The bounds are, however, poor: they belong to a level of the Ackermann hierarchy that depends on the input parameters.
The proof of the removal lemma is based on an extension of the “regularity method” from graphs to “uniform hypergraphs.” We present the removal lemma and some of its consequences below. The fundamentals of the regularity method for hypergraphs are discussed in the remainder of the work.
Main Result: The Removal Lemma
For a set V and an integer k ≥ 1, let
be the set of all k-element subsets of V. A subset
is a k-uniform hypergraph or k-graph for short, on the vertex set V.
The following result was conjectured by Erdős, Frankl, and Rödl (13) in the case where
is a clique.
Theorem 5 (Removal Lemma).
For all fixed integers l ≥ k ≥ 2 and every μ > 0 there exist ζ = ζ(l, k, μ) > 0 and n
0 = n
0(l, k, μ) so that the following holds. Suppose
is a k-graph on l vertices and
is a k-graph on n ≥ n
0
vertices. If
contains at most ζn
l
copies of
, then one can delete μnk edges of
to make it
-free.
Ruzsa and Szemerédi (14) proved Theorem 5 for k = 2 and
being the triangle
, the graph consisting of three edges on three vertices. Frankl and Rödl (15) proved Theorem 5 for k = 3 and
, the 3-graph consisting of four triples on four vertices. These proofs follow what we call the regularity method for graphs
and 3-graphs, respectively. Extending concepts developed in ref. 15, the regularity method for k-graphs was established in refs. 16 and 17. The proof of Theorem 5 for general k follows lines similar to those in refs. 14 and 15 [see ref. 17 for the case
and ref. 18 for general
]. For illustrative purposes, we later sketch the derivation of Ruzsa and Szemerédi's result by using the regularity method
for graphs.
We mention that Gowers (W. T. Gowers, personal communication; see also ref. 19) has also independently developed similar regularity techniques for proving Theorem 5. Very recently, Tao (T. Tao, personal communication) announced another proof of Theorem 5.
Applications
Before discussing the components of the regularity method for k-graphs, we turn to some applications of the removal lemma. As mentioned earlier, the removal lemma implies Theorems 1–4. This was shown for Theorem 1 by Frankl and Rödl in ref. 15. Subsequently, Solymosi (20) established a similar relation for Theorem 2. Elaborating on an idea from ref. 15, Tengan, Tokushige, V.R., and M.S. (21) derived the following common generalization of Theorems 3 and 4.
Theorem 6. Let A be a finite ring with q elements. For every δ > 0, there exists M 0 = M 0(q, δ) such that, for M ≥ M 0, any subset Z ⊂ AM with |Z| > δ|AM| = δqM contains a coset of an isomorphic copy of A (as a left A-module). In other words, there exist r, u ∈ AM such that r + ϕ(A) ⊂ Z, where ϕ: A ↪ AM, ϕ(α) = αu for α ∈ A, is an injection.
Theorem 5 also implies the affirmative answer to a geometric problem of Székely (ref. 22, p. 226): For a point c = (c
1,..., ck) in the k-fold cross-product of [n] with itself, we define the jack J(c) with center c as the set of points that differ from c in at most one coordinate. For 1 ≤ i ≤ k and fixed c
1,..., ci
-1, ci
+1,..., ck ∈ [n], we also define a line as a set of n points of the form
Let LS(n, k) be the maximum cardinality of a system J of jacks for which
-
no two distinct jacks share a common line; and
-
for all distinct jacks J
1,..., Jk ∈ J.
The following result (18) gives a positive answer to Székely's problem.
Theorem 7. LS(n, k) = o(nk -1).
As one would expect from the literature for graphs (see, e.g., ref. 23), the hypergraph regularity method also may be used to obtain further results in the areas of extremal and asymptotic hypergraph problems. Here we mention one such result.
Let ℱ be a finite family of k-graphs and let
be the set of all labeled k-graphs not containing any
as a subhypergraph. Moreover, let
be those hypergraphs of
with vertex set [n] and set
Note that all subhypergraphs of any fixed
also belong to
, and in particular, this holds for
achieving
. The inequality
thus follows. The next theorem (B.N., V.R., and M.S., unpublished data) asserts this bound is essentially best possible whenever
ℱ satisfies
.
Theorem 8.
For any finite family ℱ of k-graphs,
Theorem 8 complements a collection of analogous theorems already proven for graphs (i.e., k = 2), and generalizes results from refs. 13, 24, and 25.
We now discuss the regularity method for graphs and hypergraphs.
Szemerédi's Regularity Lemma
In the course of proving Theorem 1, Szemerédi established a lemma that decomposes the edge set of any graph into relatively few “blocks,” almost all of which are random-like (10). In what follows, we give a precise account of Szemerédi's lemma.
A pair (A, B) of two disjoint vertex subsets A, B ⊂ V of a graph G = (V, E) is said to be ε-regular if for all A′ ⊂ A, B′ ⊂ B, where |A′| > ε|A|, |B′| > ε|B|, we have |d(A, B) - d(A′, B′)| < ε where d(A′, B′) = |E(A′, B′)|/(|A′∥B′|) and E(A′, B′) denotes the set of edges of G with an endpoint in each of A′ and B′. We denote by G[A, B] the bipartite subgraph of G induced on A and B, and we say that G[A, B]is ε-regular if (A, B) is ε-regular. Szemerédi's lemma may then be stated as follows.
Theorem 9 (Szemerédi's Regularity Lemma). Let ε > 0 be given and let t 0 be a positive integer. There exists a positive integer T 0 = T 0(ε, t 0) such that any graph G = (V, E) admits a partition V = V 1 υ · · · υ Vt, t 0 ≤ t ≤ T 0, satisfying
-
|V 1| ≤ · · · ≤ |Vt| ≤ |V 1| + 1, and
-
all but at most
pairs (Vi, Vj), 1 ≤ i < j ≤ t, are ε-regular.
Szemerédi's regularity lemma is a powerful tool in extremal graph theory. One of its most important consequences is that, in appropriate circumstances, it can be used to show that a given graph contains a fixed subgraph. This observation follows from the following well-known and easily proved fact, which may be called the “counting lemma” for graphs.
Fact 10 (Counting Lemma).
For all positive integers l and d > 0 and γ > 0 there exists ε = ε(l, d, γ) > 0 so that the following holds. Let G = υ1≤i<j≤l Gij be an l-partite graph with l-partition V1 υ · · · υ Vl, where Gij = G[Vi, Vj], 1 ≤ i < j ≤ l, and |V
1| = · · · = |Vl| = n. Suppose further that each graph Gij, 1 ≤ i < j ≤ l, is ε-regular with density d. Then the numberl of copies of the l-clique Kl in G is within the interval
.
We refer to the combined application of Theorem 9 and Fact 10 as the regularity method for graphs. This method yields Theorem 5 in the case of graphs (k = 2), the proof of which can be traced to Ruzsa and Szemerédi (14). We now sketch this proof in the special case in which
is the triangle.
We first address the promised constants. To that end, let μ > 0 be given. To define ζ = ζ(3, 2, μ) > 0, we first let γ = 1/2,
d
0 = μ/3, and t
0 = ⌈3/μ⌉ and take ε = ε(3, d
0, γ) > 0 to be the constant guaranteed by Fact 10. Without loss of generality, we may assume that ε < μ/3. For t
0 = ⌈3/μ⌉ and ε defined above, let T
0 = T
0(ε, t
0) be the constant guaranteed by Theorem 9. We define
. We take n
0 sufficiently large whenever needed.
Suppose
is a graph on n ≥ n
0 vertices and suppose H contains at most ζn
3 copies of the triangle
. We exhibit a triangle-free subgraph H′ of H with only μn
2 fewer edges and do so by appealing to the regularity lemma, Theorem 9.
Indeed, with input parameters ε > 0 and t 0 previously defined, apply Theorem 9 to the graph H to obtain a partition V(H) = V 1 υ · · · υ Vt, t 0 ≤ t ≤ T 0, satisfying the properties in the conclusion of that theorem. Note that the constant T 0, guaranteed by Theorem 9, is the same as we considered earlier when defining ζ. To obtain the promised triangle-free subgraph H′, delete from H any edge {vi, vj} ∈ E(H), vi ∈ Vi, vj ∈ Vj, 1 ≤ i, j ≤ t, for which either i = j, or dH(Vi, Vj) < d 0, or the pair (Vi, Vj) is not ε-regular.
It is easy to see that the process above deletes at most
edges, where the last inequality follows from our choice of constants. We claim the resulting subgraph H′ is triangle-free.
Suppose, on the contrary, that H′ contains a copy of
with vertex set {vh, vi, vj}, where, by construction of H′, we may assume vh ∈ Vh, vi ∈ Vi and vj ∈ Vj, for 1 ≤ h < i < j ≤ t. Then, it must also be the case that each of (Vh, Vi), (Vi, Vj), and (Vh, Vj) are ε-regular with respective densities at least d
0. Fact 10 then implies that the 3-partite subgraph H′[Vh, Vi, Vj] induced on Vh υ Vi υ Vj contains at least
copies of
, contradicting our hypothesis that H had at most ζn
3 such copies.
Regularity Method for Hypergraphs
Szemerédi's regularity lemma decomposes any given graph into pseudorandom blocks, the ε-regular pairs. This notion of pseudorandomness admits the companion counting statement in Fact 10. To develop a regularity method for hypergraphs, one needs a concept of pseudorandomness that allows one to prove both a regularity lemma (which decomposes any given hypergraph into pseudorandom blocks) and a counting lemma (which estimates the number of hypergraphs of a given isomorphism type in an appropriate collection of pseudorandom blocks).
Various concepts capturing the notion of pseudorandomness for hypergraphs have been studied; see Haviland and Thomason (26) and Chung and Graham (27–30). “Deviation,” the central concept in refs. 29 and 30, admits a companion counting statement, as does “discrepancy,” a different notion of pseudorandomness for hypergraphs, studied in ref. 31. No matching regularity lemma, however, is known for either deviation or discrepancy.
Before we introduce the notion of pseudorandomness to be employed, we note that, to extend Theorem 9 to hypergraphs, one also needs to establish a suitable concept of partition. Recall that in Theorem 9 the main structure that undergoes regularization is the edge set of a graph, and a certain partition of the vertex set is an auxiliary structure. Briefly, the 2-tuples (edges) are regularized versus the 1-tuples (vertices). If for k-graphs, k ≥ 3, we just regularize the k-tuples versus 1-tuples, as, e.g., in refs. 32–34, then the natural analogue to the counting lemma fails to be true (see ref. 35 for a counterexample).
A more refined approach is to consider an auxiliary partition of the j-tuples for each j < k. This idea was pursued in refs. 36 and 37 with no attempt, however, to prove a companion counting statement. Building on the ideas from ref. 37, Frankl and Rödl improved the concept of hypergraph regularity for k = 3 in ref. 15 and proved a corresponding regularity lemma for 3-graphs. They also succeeded in proving a companion counting lemma for the
special case
, the complete 3-graph on four vertices. Subsequently, the regularity lemma from ref. 15 was extended to k-graphs for arbitrary k ≥ 3 in ref. 16.
We shall now outline the regularity lemmas of refs. 15 and 16 and begin by fixing some notation. A k-uniform clique of order j, denoted by
, is a k-graph on j ≥ k vertices consisting of all
many k-tuples [i.e.,
is isomorphic to
].
Let l ≥ k ≥ 2 be fixed integers. For each integer 2 ≤ j < k, let
be an l-partite j-graph with vertex partition
. Let
be the complete l-partite j-graph, and, for 2 ≤ i < j < k, let
be the family of all j-element vertex sets that span the clique
in
.
For j ≥ 3, fix classes Vi1,..., Vij,1 ≤ i
1 < · · · < ij ≤ l. For an integer r ≥ 1, let
be a family of subhypergraphs of
, the j-partite subgraph of
induced on Vi
1 υ · · · υ Vij. We define the density or relative density
of
with respect to Q
(j - 1) as
if
and 0 otherwise.
We are now in position to introduce a notion of pseudorandomness for j-graphs. For positive reals δj and dj and integer r ≥ 1, we say that
is (δj, dj, r)-regular with respect to (w.r.t.)
if for any choice of classes Vi
1,..., Vij, 1 ≤ i
1 < · · · < ij ≤ l, and any collection
of subhypergraphs of
satisfying
we have
This concept of regularity gives control over the structure of the hypergraph
with respect to the hypergraph
. To gain more control on the structure of
, one imposes additional structural assumptions on
. In ref. 16, it is assumed that
is itself (δj
-1, dj
-1, r)-regular w.r.t. some
, which again is (δj
-2, dj
-2, r)-regular w.r.t. some
, etc.
The discussion above leads to the definition of an (l, h)-complex. For l ≥ h, an (l, h)-complex G is a system
of l-partite j-graphs
satisfying
for 2 ≤ j ≤ h, with the same vertex partition
. Given vectors of positive reals δ = (δ2,..., δh) and d = (d
2,..., dh) and an integer r ≥ 1, we say that an (l, h)-complex G is (δ, d, r)-regular if
-
for each 1 ≤ i 1 < i 2 ≤ l,
is δ2-regular with density d
2 ± δ2;
-
for each 3 ≤ j ≤ h,
is (δj, dj, r)-regular with respect to
.
Informally speaking, regular complexes are the pseudorandom blocks into which the hypergraph regularity lemma decomposes any large enough hypergraph.
We now proceed to describe the partition analogue of the hypergraph regularity lemma in ref. 16. To this end, we shall need the concept of a (μ, δ, d, r)-equitable family of partitions, which, as we shall see momentarily, is a sequence of partitions of vertices, pairs, triples,...,(k - 1)-tuples. Let μ > 0 be a real number, let δ = (δ2,..., δk
-1) and d = (d
2,..., dk
-1) be vectors of positive reals, and let r ≥ 1 be an integer. Let a = (a
1,..., ak
-1) be a vector of positive integers and V an n-element vertex set. We say that a family of partitions
on V is (μ, δ, d, r)-equitable if it satisfies the following conditions:
-
(P1)
is an equitable vertex partition of V, i.e., |V
1| ≤ · · · ≤ |Va
1| ≤ |V
1| + 1,
-
(P2)
partitions
so that if
and
, then
is partitioned into at most aj parts, all of them members of
, and, most importantly,
-
(P3) for all but at most μnk k-tuples
there is a unique (δ, d, r)-regular (k, k - 1)-complex
such that
has as members
different partition classes from
and
.
The complex
mentioned in (P3) takes the place of the pairs (Vi, Vj) in Theorem 9. We say that a k-graph
is (δk, r)-regular w.r.t. a family of partitions 𝒫 if all but at most δknk edges K of
have the property that
and if
is the unique (k, k - 1)-complex for which
, then
is
-regular w.r.t.
.
The hypergraph regularity lemma from ref. 16, stated below, asserts that for every k-graph
there exists a (μ, δ, d, r)-equitable family of partitions 𝒫 with a bounded number of complexes so that
is (δk, r)-regular w.r.t. 𝒫.
Theorem 11 (Hypergraph Regularity Lemma).
For all positive reals μ and δk and functions
there exist T
0
and n
0
so that the following holds. For every k-graph
on n ≥ n
0
vertices, there exist a family of partitions
and a vector d = (d
2,..., dk
-1) so that, for δ = (δ2,..., δk
-1), where δj = δj(dj,..., dk
-1) for all j, and r = r(a
1, d), the following holds:
-
𝒫 is a (μ, δ, d, r)-equitable family of partitions and ai ≤ T 0 for every i = 1,..., k - 1 and
-
is (δk, r)-regular w.r.t. 𝒫.
In essence, this is similar to the regularity lemma for graphs, Theorem 9, and the regularity lemma for 3-graphs (15). For instance, for a graph G, Szemerédi's regularity lemma implies that most of the edges of G belong to ε-regular, bipartite subgraphs G[Vi, Vj]. In the above concept of hypergraph regularity, the (δ, d, r)-regular (k, k)-complexes
correspond to the ε-regular pairs in Szemerédi's partition. More specifically, the auxiliary structure
corresponds to the pairs of vertex sets (Vi, Vj), while
corresponds to the edge set G[Vi, Vj] induced by Vi and Vj.)
Note that in Theorem 11, for each 2 ≤ j ≤ k - 1, the constant δj may be chosen as a function of dj,..., dk
-1. Therefore, we may ensure δj << min{dj,..., dk
-1}. However, we have no control over the relation between δj and dj
-1. In particular, Theorem 11 cannot avoid the outcome δj >> dj
-1, thus forcing us to face the following hierarchy concerning the constant δk and the constant entries of the vectors δ and d:
For a counting lemma for hypergraphs to be an appropriate counterpart of the regularity lemma in Theorem 11, it necessarily has to match the quantification stated in Eq. 1. Such a result has been obtained in ref. 17.
Theorem 12 (Hypergraph Counting Lemma).
For all integers 2 ≤ k ≤ l the following holds: ∀γ > 0 ∀dk > 0∃δk > 0 ∀dk
-1 > 0∃δk
-1 > 0 · · · ∀d
2 > 0∃δ2 > 0 and there are integers r and m
0
so that, with d = (d
2,..., dk), δ = (δ2,..., δk) and m ≥ m
0
, if
is a (δ, d, r)-regular (l, k)-complex with vertex partition
and |Vi| = m, then
Unlike the case of graphs (k = 2), Theorem 12 for k ≥ 3 was the most technical part of the approach outlined here. The proof of Theorem 12 establishes a reduction to a simpler counting problem, previously addressed in corollary 6.11 of ref. 31. The special case k = 3 of Theorem 12 has been considered by three of us (B.N., V.R., and M.S., unpublished data) following a similar but simpler approach.
Acknowledgments
V.R. was partly supported by National Science Foundation Grant DMS 0300529. J.S. was partly supported by National Science Foundation Grant INT-0305793 and National Security Agency Grant H98230-04-1-0035. M.S. was partly supported by the Deutsche Forschungsgemeinschaft through the European Graduate Program “Combinatorics, Geometry, and Computation” Grant GRK 588/2 and by Deutsche Forschungsgemeinschaft Grant SCHA 1263/1-1. Y.K. was partly supported by Fundação de Amparo à Pesquisa do Estado de São Paulo/Conselho Nacional de Desenvolvimento Científico e Tecnológico Projeto Temático-ProNEx Proc. 2003/09925-5, by Conselho Nacional de Desenvolvimento Científico e Tecnológico Processo 300334/93-1, and by the Instituto do Milênio Avanço Global e Integrado da Matemática Brasileira.





