The hypergraph regularity method and its applications

  1. V. Rödl*,
  2. B. Nagle,
  3. J. Skokan,
  4. M. Schacht§,, and
  5. Y. Kohayakawa
  1. *Department of Mathematics and Computer Science, Emory University, Atlanta, GA 30322; Department of Mathematics and Statistics, University of Nevada, Reno, NV 89557; Instituto de Matemática e Estatística, Universidade de São Paulo, Rua do Matão 1010, CEP 05508-090, São Paulo, Brazil; and §Institut für Informatik, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany
  1. Communicated by Ronald L. Graham, University of California at San Diego, La Jolla, CA, April 6, 2005 (received for review March 16, 2005)

Abstract

Szemerédi's regularity lemma asserts that every graph can be decomposed into relatively few random-like subgraphs. This random-like behavior enables one to find and enumerate subgraphs of a given isomorphism type, yielding the so-called counting lemma for graphs. The combined application of these two lemmas is known as the regularity method for graphs and has proved useful in graph theory, combinatorial geometry, combinatorial number theory, and theoretical computer science. Here, we report on recent advances in the regularity method for k-uniform hypergraphs, for arbitrary k ≥ 2. This method, purely combinatorial in nature, gives alternative proofs of density theorems originally due to E. Szemerédi, H. Furstenberg, and Y. Katznelson. Further results in extremal combinatorics also have been obtained with this approach. The two main components of the regularity method for k-uniform hypergraphs, the regularity lemma and the counting lemma, have been obtained recently: Rödl and Skokan (based on earlier work of Frankl and Rödl) generalized Szemerédi's regularity lemma to k-uniform hypergraphs, and Nagle, Rödl, and Schacht succeeded in proving a counting lemma accompanying the Rödl–Skokan hypergraph regularity lemma. The counting lemma is proved by reducing the counting problem to a simpler one previously investigated by Kohayakawa, Rödl, and Skokan. Similar results were obtained independently by W. T. Gowers, following a different approach.

In 1975, Szemerédi (1) confirmed a long-standing conjecture of Erdős and Turán (2) concerning the upper density of sets containing no arithmetic progression with a fixed number of elements.

Theorem 1 [Szemerédi's theorem ( 1 )]. For every positive integer t and every δ > 0, there exists an integer N 0 = N 0(t, δ) such that, for NN 0, any subset Z ⊂ [N] = {1,..., N} with |Z| ≥ δN contains an arithmetic progression with t elements.

In 1977, shortly after Szemerédi's combinatorial proof appeared, Furstenberg (3) gave an alternative proof of Theorem 1 using methods of ergodic theory. Refining the techniques of that proof, Furstenberg and Katznelson later were able to derive several other density versions of combinatorial partition theorems. The following theorem, which may be viewed as a density version of the Gallai–Witt theorem, is one of them (in what follows, we denote by [-t; t] the set of all integers i satisfying -tit).

Theorem 2 [Furstenberg and Katznelson ( 4 )]. Let T be a finite subset of Formula and let δ > 0 be given. Then there exists a finite subset C of Formula such that any subset ZC with |Z| > δ|C| contains a homothetic copy of T, i.e., a set of the form z + λT for some Formula and some λ ≠ 0. Moreover, if T ⊂ [-t; t]d for some positive integer t, then C = [-N; N]d has the above property for every sufficiently large NN 0(t, d, δ).

Note that the special case of Theorem 2 for d = 1 implies Theorem 1. More generally, for fixed d, the above result allows us to find a homothetic copy of a full-dimensional cube [-t; t]d in any dense subset of a sufficiently large cube of the same dimension. Two other results in a similar vein address the complementary case in which the dimension is allowed to grow.

Theorem 3 [Furstenberg and Katznelson ( 5 )]. Let Formula be the finite field with q elements. For every positive integer d and every δ > 0, there exists M 0 = M 0(q, d, δ) such that, for MM 0, any subset Formula with Formula contains a d-dimensional affine subspace.

Theorem 4 [Furstenberg and Katznelson ( 5 )]. Let G be a finite abelian group and let δ > 0 be given. Then there exists M 0 = M 0(G, δ) such that if MM 0 and Z is a subset of GM with |Z| > δ|G|M, then Z contains a coset of a subgroup of GM isomorphic to G.

The techniques introduced by Furstenberg and Katznelson have been extended to prove other generalizations of Theorems 1–4, among which are a density version of the Hales–Jewett theorem (6), again due to Furstenberg and Katznelson (7), and polynomial extensions of Szemerédi's theorem, due to Bergelson and Leibman (8) and Bergelson and McCutcheon (9).

Another area of investigation concerns estimates on N 0 and M 0 in Theorems 1–4. Szemerédi's original proof of Theorem 1 uses the regularity lemma for graphs (10), upcoming Theorem 9, which forces (see ref. 11) the upper bound on N 0 = N 0(t, δ) to exceed a tower function of height polynomial in 1/δ. Using, among others, methods of Fourier analysis, Gowers (12) gave an alternative proof of Theorem 1 rendering the immensely improved estimate Formula The original proofs of Theorems 2–4 and their generalizations rely on ergodic theory and do not yield any upper bounds on N 0 and M 0.

The aim of this work is to report on an extremal result for hypergraphs, which can be called “removal lemma” (see Theorem 5). This lemma has a number of applications, including purely combinatorial proofs of Theorems 1–4. This approach also yields the first quantitative bounds on N 0 and M 0 in Theorems 2–4. The bounds are, however, poor: they belong to a level of the Ackermann hierarchy that depends on the input parameters.

The proof of the removal lemma is based on an extension of the “regularity method” from graphs to “uniform hypergraphs.” We present the removal lemma and some of its consequences below. The fundamentals of the regularity method for hypergraphs are discussed in the remainder of the work.

Main Result: The Removal Lemma

For a set V and an integer k ≥ 1, let Formula be the set of all k-element subsets of V. A subset Formula is a k-uniform hypergraph or k-graph for short, on the vertex set V.

The following result was conjectured by Erdős, Frankl, and Rödl (13) in the case where Formula is a clique.

Theorem 5 (Removal Lemma). For all fixed integers lk ≥ 2 and every μ > 0 there exist ζ = ζ(l, k, μ) > 0 and n 0 = n 0(l, k, μ) so that the following holds. Suppose Formula is a k-graph on l vertices and Formula is a k-graph on nn 0 vertices. If Formula contains at most ζn l copies of Formula , then one can delete μnk edges of Formula to make it Formula-free.

Ruzsa and Szemerédi (14) proved Theorem 5 for k = 2 and Formula being the triangle Formula, the graph consisting of three edges on three vertices. Frankl and Rödl (15) proved Theorem 5 for k = 3 and Formula, the 3-graph consisting of four triples on four vertices. These proofs follow what we call the regularity method for graphs and 3-graphs, respectively. Extending concepts developed in ref. 15, the regularity method for k-graphs was established in refs. 16 and 17. The proof of Theorem 5 for general k follows lines similar to those in refs. 14 and 15 [see ref. 17 for the case Formula and ref. 18 for general Formula]. For illustrative purposes, we later sketch the derivation of Ruzsa and Szemerédi's result by using the regularity method for graphs.

We mention that Gowers (W. T. Gowers, personal communication; see also ref. 19) has also independently developed similar regularity techniques for proving Theorem 5. Very recently, Tao (T. Tao, personal communication) announced another proof of Theorem 5.

Applications

Before discussing the components of the regularity method for k-graphs, we turn to some applications of the removal lemma. As mentioned earlier, the removal lemma implies Theorems 1–4. This was shown for Theorem 1 by Frankl and Rödl in ref. 15. Subsequently, Solymosi (20) established a similar relation for Theorem 2. Elaborating on an idea from ref. 15, Tengan, Tokushige, V.R., and M.S. (21) derived the following common generalization of Theorems 3 and 4.

Theorem 6. Let A be a finite ring with q elements. For every δ > 0, there exists M 0 = M 0(q, δ) such that, for MM 0, any subset ZAM with |Z| > δ|AM| = δqM contains a coset of an isomorphic copy of A (as a left A-module). In other words, there exist r, uAM such that r + ϕ(A) ⊂ Z, where ϕ: AAM, ϕ(α) = αu for α ∈ A, is an injection.

Theorem 5 also implies the affirmative answer to a geometric problem of Székely (ref. 22, p. 226): For a point c = (c 1,..., ck) in the k-fold cross-product of [n] with itself, we define the jack J(c) with center c as the set of points that differ from c in at most one coordinate. For 1 ≤ ik and fixed c 1,..., ci -1, ci +1,..., ck ∈ [n], we also define a line as a set of n points of the form Formula Let LS(n, k) be the maximum cardinality of a system J of jacks for which

  1. no two distinct jacks share a common line; and

  2. Formula for all distinct jacks J 1,..., JkJ.

The following result (18) gives a positive answer to Székely's problem.

Theorem 7. LS(n, k) = o(nk -1).

As one would expect from the literature for graphs (see, e.g., ref. 23), the hypergraph regularity method also may be used to obtain further results in the areas of extremal and asymptotic hypergraph problems. Here we mention one such result.

Let ℱ be a finite family of k-graphs and let Formula be the set of all labeled k-graphs not containing any Formula as a subhypergraph. Moreover, let Formula be those hypergraphs of Formula with vertex set [n] and set Formula Note that all subhypergraphs of any fixed Formula also belong to Formula, and in particular, this holds for Formula achieving Formula. The inequality Formula thus follows. The next theorem (B.N., V.R., and M.S., unpublished data) asserts this bound is essentially best possible whenever ℱ satisfies Formula.

Theorem 8. For any finite familyof k-graphs, Formula

Theorem 8 complements a collection of analogous theorems already proven for graphs (i.e., k = 2), and generalizes results from refs. 13, 24, and 25.

We now discuss the regularity method for graphs and hypergraphs.

Szemerédi's Regularity Lemma

In the course of proving Theorem 1, Szemerédi established a lemma that decomposes the edge set of any graph into relatively few “blocks,” almost all of which are random-like (10). In what follows, we give a precise account of Szemerédi's lemma.

A pair (A, B) of two disjoint vertex subsets A, BV of a graph G = (V, E) is said to be ε-regular if for all A′ ⊂ A, B′ ⊂ B, where |A′| > ε|A|, |B′| > ε|B|, we have |d(A, B) - d(A′, B′)| < ε where d(A′, B′) = |E(A′, B′)|/(|A′∥B′|) and E(A′, B′) denotes the set of edges of G with an endpoint in each of A′ and B′. We denote by G[A, B] the bipartite subgraph of G induced on A and B, and we say that G[A, B]is ε-regular if (A, B) is ε-regular. Szemerédi's lemma may then be stated as follows.

Theorem 9 (Szemerédi's Regularity Lemma). Let ε > 0 be given and let t 0 be a positive integer. There exists a positive integer T 0 = T 0(ε, t 0) such that any graph G = (V, E) admits a partition V = V 1 υ · · · υ Vt, t 0tT 0, satisfying

  1. |V 1| ≤ · · · ≤ |Vt| ≤ |V 1| + 1, and

  2. all but at most Formula pairs (Vi, Vj), 1 ≤ i < jt, are ε-regular.

Szemerédi's regularity lemma is a powerful tool in extremal graph theory. One of its most important consequences is that, in appropriate circumstances, it can be used to show that a given graph contains a fixed subgraph. This observation follows from the following well-known and easily proved fact, which may be called the “counting lemma” for graphs.

Fact 10 (Counting Lemma). For all positive integers l and d > 0 and γ > 0 there exists ε = ε(l, d, γ) > 0 so that the following holds. Let G = υ1≤i<jl Gij be an l-partite graph with l-partition V1 υ · · · υ Vl, where Gij = G[Vi, Vj], 1 ≤ i < jl, and |V 1| = · · · = |Vl| = n. Suppose further that each graph Gij, 1 ≤ i < jl, is ε-regular with density d. Then the numberl of copies of the l-clique Kl in G is within the interval Formula.

We refer to the combined application of Theorem 9 and Fact 10 as the regularity method for graphs. This method yields Theorem 5 in the case of graphs (k = 2), the proof of which can be traced to Ruzsa and Szemerédi (14). We now sketch this proof in the special case in which Formula is the triangle.

We first address the promised constants. To that end, let μ > 0 be given. To define ζ = ζ(3, 2, μ) > 0, we first let γ = 1/2, d 0 = μ/3, and t 0 = ⌈3/μ⌉ and take ε = ε(3, d 0, γ) > 0 to be the constant guaranteed by Fact 10. Without loss of generality, we may assume that ε < μ/3. For t 0 = ⌈3/μ⌉ and ε defined above, let T 0 = T 0(ε, t 0) be the constant guaranteed by Theorem 9. We define Formula. We take n 0 sufficiently large whenever needed.

Suppose Formula is a graph on nn 0 vertices and suppose H contains at most ζn 3 copies of the triangle Formula. We exhibit a triangle-free subgraph H′ of H with only μn 2 fewer edges and do so by appealing to the regularity lemma, Theorem 9.

Indeed, with input parameters ε > 0 and t 0 previously defined, apply Theorem 9 to the graph H to obtain a partition V(H) = V 1 υ · · · υ Vt, t 0tT 0, satisfying the properties in the conclusion of that theorem. Note that the constant T 0, guaranteed by Theorem 9, is the same as we considered earlier when defining ζ. To obtain the promised triangle-free subgraph H′, delete from H any edge {vi, vj} ∈ E(H), viVi, vjVj, 1 ≤ i, jt, for which either i = j, or dH(Vi, Vj) < d 0, or the pair (Vi, Vj) is not ε-regular.

It is easy to see that the process above deletes at most Formula edges, where the last inequality follows from our choice of constants. We claim the resulting subgraph H′ is triangle-free.

Suppose, on the contrary, that H′ contains a copy of Formula with vertex set {vh, vi, vj}, where, by construction of H′, we may assume vhVh, viVi and vjVj, for 1 ≤ h < i < jt. Then, it must also be the case that each of (Vh, Vi), (Vi, Vj), and (Vh, Vj) are ε-regular with respective densities at least d 0. Fact 10 then implies that the 3-partite subgraph H′[Vh, Vi, Vj] induced on Vh υ Vi υ Vj contains at least Formula copies of Formula, contradicting our hypothesis that H had at most ζn 3 such copies.

Regularity Method for Hypergraphs

Szemerédi's regularity lemma decomposes any given graph into pseudorandom blocks, the ε-regular pairs. This notion of pseudorandomness admits the companion counting statement in Fact 10. To develop a regularity method for hypergraphs, one needs a concept of pseudorandomness that allows one to prove both a regularity lemma (which decomposes any given hypergraph into pseudorandom blocks) and a counting lemma (which estimates the number of hypergraphs of a given isomorphism type in an appropriate collection of pseudorandom blocks).

Various concepts capturing the notion of pseudorandomness for hypergraphs have been studied; see Haviland and Thomason (26) and Chung and Graham (2730). “Deviation,” the central concept in refs. 29 and 30, admits a companion counting statement, as does “discrepancy,” a different notion of pseudorandomness for hypergraphs, studied in ref. 31. No matching regularity lemma, however, is known for either deviation or discrepancy.

Before we introduce the notion of pseudorandomness to be employed, we note that, to extend Theorem 9 to hypergraphs, one also needs to establish a suitable concept of partition. Recall that in Theorem 9 the main structure that undergoes regularization is the edge set of a graph, and a certain partition of the vertex set is an auxiliary structure. Briefly, the 2-tuples (edges) are regularized versus the 1-tuples (vertices). If for k-graphs, k ≥ 3, we just regularize the k-tuples versus 1-tuples, as, e.g., in refs. 3234, then the natural analogue to the counting lemma fails to be true (see ref. 35 for a counterexample).

A more refined approach is to consider an auxiliary partition of the j-tuples for each j < k. This idea was pursued in refs. 36 and 37 with no attempt, however, to prove a companion counting statement. Building on the ideas from ref. 37, Frankl and Rödl improved the concept of hypergraph regularity for k = 3 in ref. 15 and proved a corresponding regularity lemma for 3-graphs. They also succeeded in proving a companion counting lemma for the special case Formula, the complete 3-graph on four vertices. Subsequently, the regularity lemma from ref. 15 was extended to k-graphs for arbitrary k ≥ 3 in ref. 16.

We shall now outline the regularity lemmas of refs. 15 and 16 and begin by fixing some notation. A k-uniform clique of order j, denoted by Formula, is a k-graph on jk vertices consisting of all Formula many k-tuples [i.e., Formula is isomorphic to Formula].

Let lk ≥ 2 be fixed integers. For each integer 2 ≤ j < k, let Formula be an l-partite j-graph with vertex partition Formula. Let Formula be the complete l-partite j-graph, and, for 2 ≤ i < j < k, let Formula be the family of all j-element vertex sets that span the clique Formula in Formula.

For j ≥ 3, fix classes Vi1,..., Vij,1 ≤ i 1 < · · · < ijl. For an integer r ≥ 1, let Formula be a family of subhypergraphs of Formula, the j-partite subgraph of Formula induced on Vi 1 υ · · · υ Vij. We define the density or relative density Formula of Formula with respect to Q (j - 1) as Formula if Formula and 0 otherwise.

We are now in position to introduce a notion of pseudorandomness for j-graphs. For positive reals δj and dj and integer r ≥ 1, we say that Formula is (δj, dj, r)-regular with respect to (w.r.t.) Formula if for any choice of classes Vi 1,..., Vij, 1 ≤ i 1 < · · · < ijl, and any collection Formula of subhypergraphs of Formula satisfying Formula we have Formula This concept of regularity gives control over the structure of the hypergraph Formula with respect to the hypergraph Formula. To gain more control on the structure of Formula, one imposes additional structural assumptions on Formula. In ref. 16, it is assumed that Formula is itself (δj -1, dj -1, r)-regular w.r.t. some Formula, which again is (δj -2, dj -2, r)-regular w.r.t. some Formula, etc.

The discussion above leads to the definition of an (l, h)-complex. For lh, an (l, h)-complex G is a system Formula of l-partite j-graphs Formula satisfying Formula for 2 ≤ jh, with the same vertex partition Formula. Given vectors of positive reals δ = (δ2,..., δh) and d = (d 2,..., dh) and an integer r ≥ 1, we say that an (l, h)-complex G is (δ, d, r)-regular if

  1. for each 1 ≤ i 1 < i 2l, Formula is δ2-regular with density d 2 ± δ2;

  2. for each 3 ≤ jh, Formula is (δj, dj, r)-regular with respect to Formula.

Informally speaking, regular complexes are the pseudorandom blocks into which the hypergraph regularity lemma decomposes any large enough hypergraph.

We now proceed to describe the partition analogue of the hypergraph regularity lemma in ref. 16. To this end, we shall need the concept of a (μ, δ, d, r)-equitable family of partitions, which, as we shall see momentarily, is a sequence of partitions of vertices, pairs, triples,...,(k - 1)-tuples. Let μ > 0 be a real number, let δ = (δ2,..., δk -1) and d = (d 2,..., dk -1) be vectors of positive reals, and let r ≥ 1 be an integer. Let a = (a 1,..., ak -1) be a vector of positive integers and V an n-element vertex set. We say that a family of partitions Formula on V is (μ, δ, d, r)-equitable if it satisfies the following conditions:

  1. (P1) Formula is an equitable vertex partition of V, i.e., |V 1| ≤ · · · ≤ |Va 1| ≤ |V 1| + 1,

  2. (P2) Formula partitions Formula so that if Formula and Formula, then Formula is partitioned into at most aj parts, all of them members of Formula, and, most importantly,

  3. (P3) for all but at most μnk k-tuples Formula there is a unique (δ, d, r)-regular (k, k - 1)-complex Formula such that Formula has as members Formula different partition classes from Formula and Formula.

The complex Formula mentioned in (P3) takes the place of the pairs (Vi, Vj) in Theorem 9. We say that a k-graph Formula is (δk, r)-regular w.r.t. a family of partitions 𝒫 if all but at most δknk edges K of Formula have the property that Formula and if Formula is the unique (k, k - 1)-complex for which Formula, then Formula is Formula-regular w.r.t. Formula.

The hypergraph regularity lemma from ref. 16, stated below, asserts that for every k-graph Formula there exists a (μ, δ, d, r)-equitable family of partitions 𝒫 with a bounded number of complexes so that Formula is (δk, r)-regular w.r.t. 𝒫.

Theorem 11 (Hypergraph Regularity Lemma). For all positive reals μ and δk and functions Formula there exist T 0 and n 0 so that the following holds. For every k-graph Formula on nn 0 vertices, there exist a family of partitions Formula and a vector d = (d 2,..., dk -1) so that, for δ = (δ2,..., δk -1), where δj = δj(dj,..., dk -1) for all j, and r = r(a 1, d), the following holds:

  1. 𝒫 is a (μ, δ, d, r)-equitable family of partitions and aiT 0 for every i = 1,..., k - 1 and

  2. Formula isk, r)-regular w.r.t. 𝒫.

In essence, this is similar to the regularity lemma for graphs, Theorem 9, and the regularity lemma for 3-graphs (15). For instance, for a graph G, Szemerédi's regularity lemma implies that most of the edges of G belong to ε-regular, bipartite subgraphs G[Vi, Vj]. In the above concept of hypergraph regularity, the (δ, d, r)-regular (k, k)-complexes Formula correspond to the ε-regular pairs in Szemerédi's partition. More specifically, the auxiliary structure Formula corresponds to the pairs of vertex sets (Vi, Vj), while Formula corresponds to the edge set G[Vi, Vj] induced by Vi and Vj.)

Note that in Theorem 11, for each 2 ≤ jk - 1, the constant δj may be chosen as a function of dj,..., dk -1. Therefore, we may ensure δj << min{dj,..., dk -1}. However, we have no control over the relation between δj and dj -1. In particular, Theorem 11 cannot avoid the outcome δj >> dj -1, thus forcing us to face the following hierarchy concerning the constant δk and the constant entries of the vectors δ and d: Formula

For a counting lemma for hypergraphs to be an appropriate counterpart of the regularity lemma in Theorem 11, it necessarily has to match the quantification stated in Eq. 1. Such a result has been obtained in ref. 17.

Theorem 12 (Hypergraph Counting Lemma). For all integers 2 ≤ kl the following holds: ∀γ > 0 ∀dk > 0∃δk > 0 ∀dk -1 > 0∃δk -1 > 0 · · · ∀d 2 > 0∃δ2 > 0 and there are integers r and m 0 so that, with d = (d 2,..., dk), δ = (δ2,..., δk) and mm 0 , if Formula is a (δ, d, r)-regular (l, k)-complex with vertex partition Formula and |Vi| = m, then Formula

Unlike the case of graphs (k = 2), Theorem 12 for k ≥ 3 was the most technical part of the approach outlined here. The proof of Theorem 12 establishes a reduction to a simpler counting problem, previously addressed in corollary 6.11 of ref. 31. The special case k = 3 of Theorem 12 has been considered by three of us (B.N., V.R., and M.S., unpublished data) following a similar but simpler approach.

Acknowledgments

V.R. was partly supported by National Science Foundation Grant DMS 0300529. J.S. was partly supported by National Science Foundation Grant INT-0305793 and National Security Agency Grant H98230-04-1-0035. M.S. was partly supported by the Deutsche Forschungsgemeinschaft through the European Graduate Program “Combinatorics, Geometry, and Computation” Grant GRK 588/2 and by Deutsche Forschungsgemeinschaft Grant SCHA 1263/1-1. Y.K. was partly supported by Fundação de Amparo à Pesquisa do Estado de São Paulo/Conselho Nacional de Desenvolvimento Científico e Tecnológico Projeto Temático-ProNEx Proc. 2003/09925-5, by Conselho Nacional de Desenvolvimento Científico e Tecnológico Processo 300334/93-1, and by the Instituto do Milênio Avanço Global e Integrado da Matemática Brasileira.

Footnotes

  • To whom correspondence should be addressed. E-mail: schacht{at}informatik.hu-berlin.de.

  • Abbreviations: k-graph, k-uniform hypergraph; w.r.t., with respect to.

  • See Commentary on page 8075.

References

« Previous | Next Article »Table of Contents
From the Cover