## New Research In

### Physical Sciences

### Social Sciences

#### Featured Portals

#### Articles by Topic

### Biological Sciences

#### Featured Portals

#### Articles by Topic

- Agricultural Sciences
- Anthropology
- Applied Biological Sciences
- Biochemistry
- Biophysics and Computational Biology
- Cell Biology
- Developmental Biology
- Ecology
- Environmental Sciences
- Evolution
- Genetics
- Immunology and Inflammation
- Medical Sciences
- Microbiology
- Neuroscience
- Pharmacology
- Physiology
- Plant Biology
- Population Biology
- Psychological and Cognitive Sciences
- Sustainability Science
- Systems Biology

# Bounds for cell entries in contingency tables given marginal totals and decomposable graphs

Contributed by Stephen E. Fienberg

## Abstract

Upper and lower bounds on cell counts in cross-classifications of nonnegative counts play important roles in a number of practical problems, including statistical disclosure limitation, computer tomography, mass transportation, cell suppression, and data swapping. Some features of the Fréchet bounds are well known, intuitive, and regularly used by those working on disclosure limitation methods, especially those for two-dimensional tables. We previously have described a series of results relating these bounds to theory on loglinear models for cross-classified counts. This paper provides the actual theory and proofs for the special case of decomposable loglinear models and their related independence graphs. It also includes an extension linked to the structure of reducible graphs and a discussion of the relevance of other results linked to nongraphical loglinear models.

## 1. Introduction

Upper and lower bounds
on cell counts in cross-classifications of positive counts given
certain marginal totals play important roles in a number of the
disclosure limitation procedures, e.g., see the various papers in the
1998 special issue of *The Journal of Official Statistics*
(1). In that context, if a cell count is small and the upper bound is
“close” to the lower bound, the intruder knows with certainty
that there is only a small number of individuals possessing the
characteristics corresponding to the cell and this may pose an undo
risk of disclosure of the identity of these individuals. Similarly,
such bounds also arise in a variety of other contexts including mass
transportation problems (2), computer tomography (3), ecological
inference in the social sciences (4), causal inference in imperfect
experiments (5), and are the focus of the probabilistic literature on
copulas (6). Much of the work on this problem has been focused on
bounds in the case when the marginal totals are nonoverlapping.

The class of bounds we describe is a generalization of bounds usually
attributed to Fréchet (7), whose original presentation was in
terms of cumulative distribution functions (c.d.f.) for a random vector
(*D*_{1}, *D*_{2}, … , *D*_{m})
in *R*^{m}:
1
which are essentially equivalent to contingency tables when the
underlying variables are categorical. For example, suppose we have a
two-dimensional table of counts, {*n*_{ij}}
adding up to the total *n*_{++} = *n*. If we
normalize each entry by dividing by *n* and then create a
table of partial sums, by cumulating the proportions from the first row
and first column to the present ones, we have a set of values of the
form **[1]**. Thus, Fréchet bound results for
distribution functions correspond to bounds for the cell counts where
the values {*x*_{i}} in **[1]**
represent “cut-points” between categories for the *i*th
categorical variable. Bonferroni (8) and Hoeffding (9) independently
developed related results on bounds.

We are interested in the following generalization of the
Bonferroni–Fréchet–Hoeffding bounds. Consider a
*k*-dimensional contingency table *n*_{K}
arranged as a linear list of *m* counts. The random
variable assigned to the *i*th cell will be denoted
*Y*_{i}. Let 𝒮 be a system of nonempty subsets
of {1, 2, … , *m*}, such that
∪_{S∈𝒮} *S* = {1, 2, … , *m*}.
The *Fréchet class* ℱ(𝒮) (6) is the class of
*m*-variate distributions with fixed marginals
{*F*_{S} : *S* ∈ 𝒮}, where
*F*_{S} is the joint c.d.f. of random variables
{*Y*_{i} : *i* ∈ 𝒮}. Because the indices of the
margins being fixed might be overlapping, we have to impose a
*consistency* constraint, namely
where π_{S} means integrating out the
variables that do not appear in *S*. Following
Rüschendorf (10), for a measurable function φ:
**R**^{m} → **R**, we define *M*(φ)
= sup {∫ φd*F* : *F* ∈ ℱ (𝒮)}
and *m*(φ) = inf {∫ φd*F* : *F* ∈
ℱ (𝒮)}. Our goal is to determine *M*(φ) and
*m*(φ) in the particular case when φ is the identity
function on the set **R **× … × (−
∞,*y*_{i}] × … × **R**. This is equivalent to
determining sharp upper and lower bounds for the *i*th cell in
the cross-classification *n*_{K}, given the marginals
{*n*_{S} : *S* ∈ 𝒮}.

Fienberg (11) noted that there is an intimate link between bounds
for non-negative cell entries in a cross-classification subject to
marginal constraints, and maximum likelihood estimates for the same
cell entries under the loglinear model whose minimal sufficient
statistics are the margins. This link seems especially clear in the
special case of cross-classifications of non-negative counts and
loglinear models for their expectations that are
*decomposable*, i.e., for tables where estimated expected
values can be *explicitly* written as a function of the
marginal totals (e.g., see refs. 12–14). Such models are a special
subclass of the graphical loglinear models (e.g., see refs. 14 and 15),
and these models are representable in terms of graphs that display
conditional independence relationships. We present the results here in
terms of graphs and explain how they apply to the more general
situation. In the next section, we introduce some basic notation for
the corresponding theory of *decomposable graphs*. Then, in
Section 3, we give results on Fréchet bounds when the margins
correspond to those that characterize decomposable loglinear models.
Sections 4 and 5 extend the approach to reducible graphs and provide
some explicit examples. In the final section, we present some
conjectures on how these bound results can be extended to cases
corresponding to bounds for cross-classifications that are not quite
representable in graphical form but that utilize our results for
reducible graphs.

## 2. Basic Graph Theory Results

In this section, we begin with some basic definitions and notations for graphs and then define decomposable graphs and present some results that characterize them.

### 2.1. Graph Terminology.

A *graph* is a pair 𝒢 = (*V*, *E*), where
*V* is a finite set of vertices and *E* ⊆ *V* ×
*V* is a set of edges linking the vertices. Our interest is in
*undirected* graphs, for which (*u*, *v*) ∈ *E* implies
(*v*, *u*) ∈ *E*. For any vertex set *A* ⊆ *V*, we
define the edge set associated with it as
Let 𝒢(*A*) = (*A*, *E*(*A*)) denote the subgraph of 𝒢
induced by *A*. The *section graph* 𝒢\*A* :=
𝒢(*V*\*A*) is the subgraph of 𝒢 obtained by removing a set of
vertices *A* ⊂ *V* from the graph. Two vertices *u*, *v* ∈
*V* are *adjacent (neighbors*) if (*u*, *v*) ∈ *E*.
A set of vertices of 𝒢 is *independent* if no two of its
elements are adjacent. The boundary bd(*A*) of a subset of
vertices *A* ⊂ *V* is the set of vertices in *V*\*A*
adjacent to at least one vertex in *A*:
The *closure* of *A* ⊂ *V* is cl(*A*) =
*A* ∪ bd(*A*). An induced subgraph 𝒢(*A*) is
*complete* if the vertices in *A* are pairwise
adjacent in 𝒢. We also say that *A* is *complete*
in 𝒢. A complete vertex set *A* in 𝒢 that is maximal is a
*clique*.

Let *u*, *v* ∈ *V*. A *path* (or *chain*) from
*u* to *v* is a sequence *u* =
*v*_{0}, … , *v*_{n} = *v* of distinct
vertices such that (*v*_{i−1}, *v*_{i}) ∈ *E*
for all *i* = 1, 2, … , *n*. The path is a
*cycle* if the end points are allowed to be the same,
*u* = *v*. If there is a path from *u* to
*v* we say that *u* and *v* are
*connected*. The sets *A*, *B* ⊂ *V* are
*disconnected* if *u* and *v* are not
connected for all *u* ∈ *A*, *v* ∈ *B*. The *connected
component* of a vertex *u* ∈ *V* is the set of all
vertices connected with *u*. A graph is *connected*
if all the pairs of vertices are connected.

The set *C* ⊂ *V* is an *uv-separator* if all paths
from *u* to *v* intersect *C*. The set
*C* ⊂ *V* *separates* *A* from *B*
if it is an *uv*-separator for every *u* ∈ *A*, *v* ∈
*B*. *C* is a *separator *(*cut*-*set*) of 𝒢
if two vertices in the same connected component of 𝒢 are in two
distinct connected components of 𝒢\*C* or, equivalently, if
𝒢\*C* is disconnected. In addition, *C* is a
*minimal* separator of 𝒢 if *C* is a separator and
no proper subset of *C* separates the graph. Unless otherwise
stated, the separators we work with will be complete.

Consider a connected graph 𝒢 = (*V*, *E*) having a clique
separator *C*, and let *V*_{1}, … ,
*V*_{s} be the vertex sets of the connected components of
𝒢\*C*. The subgraphs 𝒢(*V*_{1} ∪
*C*), … , 𝒢(*V*_{s} ∪ *C*) are the *leaves*
of 𝒢 produced by *C*. A graph is *bipartite* if its
set of vertices can be partitioned into two disjoint subsets
*V*_{1} and *V*_{2} such that every
edge of the graph connects between a vertex of
*V*_{1} and a vertex of *V*_{2},
i.e. *V*_{1} and *V*_{2} are
independent sets. A *tree* is a connected graph with no
cycles. It has *n* vertices and *n* − 1 edges.
In a tree, there is a unique path between any two vertices.

### 2.2. Decomposable Graphs.

*Decomposable* graphs possess the special property that
allows us to “decompose” them into components or subgraphs and
work directly with these components. They also allow us to make use of
divide-and-conquer techniques to solve any type of problem associated
with such a graphical structure. The idea is to decompose the graph 𝒢
in two possibly overlapping subgraphs 𝒢′ and 𝒢" so that no
structural information of the graph is lost when transforming 𝒢 into
𝒢′ and 𝒢". Furthermore, by “correctly” decomposing 𝒢′ and
𝒢", and so on, one ends up with a set of subgraphs of 𝒢 that allow
for no further decompositions. A set of subgraphs of 𝒢 generated in
this way is called a *derived system* of 𝒢, while its
elements are called *atoms* (16). If one does not lose any
information along the way in the decomposition, then one can solve
problems for each atom and then put together the component solutions to
solve a combined problem for the initial graph 𝒢. But first we need
to define what we mean by “correct” decomposition.

*Definition 1: *The partition (*A*_{1},
*A*_{2}, *A*_{3}) of *V* is said to form a
*decomposition *of 𝒢 if *A*_{2} is a
minimal separator of *A*_{1} and
*A*_{3}.

In this case (*A*_{1}, *A*_{2},
*A*_{3}) *decomposes* 𝒢 into the
*components* 𝒢(*A*_{1} ∪ *A*_{2})
and 𝒢(*A*_{2} ∪ *A*_{3}). The
decomposition is *proper* if *A*_{1} and
*A*_{3} are not empty. If *A*_{2}
is empty, *A*_{1} and *A*_{3} form
two nonoverlapping connected components.

Throughout the remainder of this section, we will assume that the graphs we work with are connected. No loss of generality is incurred because all the results can be applied to a disconnected graph by applying them successively to each connected component. We follow closely Blair and Barry (17) and Lauritzen (18).

*Definition 2: *The graph 𝒢 is *decomposable *if it
is complete or if there exists a proper decomposition
(*A*_{1}, *A*_{2}, *A*_{3}) into
decomposable graphs 𝒢(*A*_{1} ∪ *A*_{2})
and 𝒢(*A*_{2} ∪ *A*_{3}).

Because we require a proper decomposition of the graph at every step,
the components 𝒢(*A*_{1} ∪ *A*_{2}) and
𝒢(*A*_{2} ∪ *A*_{3}) have fewer vertices
than the original graph 𝒢, hence the procedure will stop after a
finite number of steps. The smallest nondecomposable graph is a cycle
with four vertices.

*Definition 3:* A vertex *v* ∈ *V* is
*simplicial *in 𝒢 = (*V*, *E*) if
bd(*v*) is a clique.

If *v* ∈ *V* is simplicial in 𝒢 and 𝒢 is not complete,
({*v*}, bd(*v*), *V*\cl(*v*)) is
a proper decomposition of 𝒢. Simplicial vertices have very nice and
useful properties:

Lemma 1. (*i*)* A vertex is simplicial if and
only if it belongs to precisely one clique. *(*ii*)* Any decomposable graph has at least one simplicial vertex*.

The importance of simplicial vertices in describing the structure of
decomposable graphs will soon become apparent. Assume that the graph
𝒢 has *n* vertices. An *ordering* of 𝒢 is a
bijection from the vertex set *V* to a set of labels
{1, 2, … , *n*}. Let *v*_{1},
*v*_{2}, … , *v*_{n} be an ordering of the
vertex set *V*. The *monotone adjacency set* of
*v*_{i} is given by:
2
There is a special class of orderings of 𝒢 that plays a central
role in the characterization of decomposable graphs.

*Definition 4:* The ordering *v*_{1},
*v*_{2}, … , *v*_{n} is a perfect elimination
ordering (PEO) if *v*_{i} is simplicial in the
graph 𝒢 ({*v*_{i}, *v*_{i+1}, … ,
*v*_{n}}) for every *i* = 1,
2, … , *n*.

Any decomposable graph is characterized by the possession of a PEO, as the next result shows.

Theorem 1. *A graph 𝒢 is decomposable if and only if
𝒢 has a perfect elimination ordering*.

The *maximum cardinality search* algorithm (MCS) is a
linear-time procedure for generating a perfect elimination ordering. It
starts with an arbitrary vertex *v* ∈ *V* for which it sets
*v* = *v*_{n}. The next vertex will be labeled
*n* − 1 and will be one of the unlabeled vertices with
the maximum number of labeled neighbors. The ordering
*v*_{1}, *v*_{2}, … , *v*_{n}
generated by continuing in this way will always be a PEO if the input
graph is decomposable.

Let 𝒞(𝒢) = {*C*_{1}, *C*_{2}, … ,
*C*_{p}} be the set of cliques of a decomposable graph
𝒢 and *v*_{1}, *v*_{2}, … ,
*v*_{n} be a PEO obtained by applying the MCS algorithm. We
will refer to *v*_{iq} as the
*representative vertex* of *C*_{q} whenever
*C*_{q} = {*v*_{iq}} ∪
madj(*v*_{iq}). The following result shows
how MCS can efficiently generate the cliques in 𝒞(𝒢) by identifying
their representative vertices.

Theorem 2. [*Blair* *and* *Barry* (17).] *Let*
*v*_{1}, *v*_{2}, … , *v*_{n}
*be a **PEO** obtained by applying the **MCS** algorithm to a connected
decomposable graph 𝒢. Then *𝒞(𝒢) *contains precisely
the following sets*: {*v*_{1}} ∪
madj(*v*_{1}) *and*
{*v*_{i+1}} ∪ madj(*v*_{i+1}),
1 ≤ *i* ≤ *n* − 1, *for which *|
madj(*v*_{i})| ≤ |
madj(*v*_{i+1})|.

Because MCS labels the vertices of 𝒢 in decreasing order, the cliques
also will be generated in a decreasing order with respect to the labels
of their representative vertices. More explicitly, assume that
*v*_{i1}, *v*_{i2}, … ,
*v*_{ip} are the representative vertices of the
cliques *C*_{1}, *C*_{2}, … ,
*C*_{p}, respectively, where *i*_{1} >
*i*_{2} > … > *i*_{p}. The MCS algorithm
finds the cliques in 𝒞(𝒢) in the order *C*_{1},
*C*_{2}, … , *C*_{p}. We need to introduce
one additional class of sets.

*Definition 5:* Let *V*_{1}, … ,
*V*_{k} be a sequence of subsets of the vertex set of a
graph 𝒢 = (*V*, *E*). Let *H*_{j} =
*V*_{1} ∪ … ∪ *V*_{j}, *S*_{j} =
*H*_{j−1} ∩ *V*_{j}, and
*R*_{j} = *V*_{j}\*H*_{j−1}. The
sequence is said to be *perfect *if (*i*) for all
*j* > 1, there is an *i* < *j* such that
*S*_{j} ⊆ *V*_{i}, and (*ii*) the
sets *S*_{j} are complete for all *j*.

The first condition in *Definition 5* is known as the
*running intersection property*. The sets
*S*_{j} are called the *separators* of the
sequence.

Theorem 3. [*Lauritzen* (*14*).]
*Let* *V*_{1}, … , *V*_{k} *be
a perfect sequence of sets that contains all cliques of a graph
𝒢. Then for every **j*, *S*_{j}
*separates*
*H*_{j−1}\*S*_{j} *from*
*R*_{j} *in* 𝒢(*H*_{j})
*and hence *(*H*_{j−1}\*S*_{j},
*S*_{j}, *R*_{j}) *decomposes*
𝒢(*H*_{j}).

A total ordering *C*_{1}, *C*_{2}, … ,
*C*_{p} of the cliques in 𝒞(𝒢) generated by the MCS
algorithm will always have the running intersection property (17).
Because *C*_{1}, *C*_{2}, … ,
*C*_{p} are complete in 𝒢, the vertex sets
*S*_{j} = (*C*_{1}
∪ … ∪*C*_{j−1}) ∩ *C*_{j} will also be
complete, and consequently *C*_{1},
*C*_{2}, … , *C*_{p} is a perfect sequence of
sets. By recursively applying *Theorem 3*, we obtain that
𝒞(𝒢) is a derived system of 𝒢, whereas *S*_{j}
(*j* = 2, … , *p*) is the corresponding sequence of
separators [c.f. the recursive result described by Rüschendorf
(10)]. We note that, although a clique can appear only once in
𝒞(𝒢), a separator can appear more than once in 𝒞(𝒢). Therefore,
𝒮(𝒢) is not really a set, but a “multiset” of separators
(17).

## 3. Generalized Fréchet Bounds for Decomposable Loglinear Models

Let *X* = (*X*_{1}, *X*_{2}, … ,
*X*_{k}) be a vector of discrete random variables. Denote
*K* = {1, 2, … , *k*} the index set associated
with *X*_{1}, *X*_{2}, … ,
*X*_{k}. The random variable *X*_{j} can
take the values *x*_{j} ∈ {1, 2, … ,
*I*_{j}}, for *j* = 1, 2, … , *k*. Let
*J*_{K} = *I*_{1} × *I*_{2}
× … × *I*_{k} and *x* = (*x*_{1},
*x*_{2}, … , *x*_{k}) ∈ *J*_{K}.

Consider the *k*-way contingency table *n*_{K} :=
{*n*_{K}(*x*)}_{x∈JK}. We let
*a* = {*i*_{1}, *i*_{2}, … ,
*i*_{p}} denote an arbitrary subset of *K*, and
we define *X*_{a} as the ordered tuple
*X*_{a} = (*X*_{i}; *i* ∈ *a*). Similarly,
we denote *J*_{a} = *J*_{i1} ×
*J*_{i2} × … × *J*_{ip}. The
marginal table of counts *n*_{a} :=
{*n*_{a}(*x*_{a})}_{xa}_{∈Ja}
corresponding to *X*_{a} is given by
We write *n*_{ab} instead of
*n*_{a∪b}, where *a*, *b* ⊆ *K*. The grand
total of the complete table is *n*_{∅}.

Assume we are given *m* possibly overlapping marginal tables
*n*_{C1}, *n*_{C2}, … ,
*n*_{Cp} such that *C*_{1} ∪
*C*_{2} ∪ … ∪ *C*_{p} = *K*. Moreover,
*C*_{1}, *C*_{2}, … , *C*_{p}
are the cliques of a decomposable graph 𝒢 = (*K*, *E*). Let
*S*_{2}, … , *S*_{p} be the separators
associated with (*C*_{j})_{j}. Every
*S*_{j} is included in some clique
*C*_{i}, hence the marginals
*n*_{S2}, … , *n*_{Sp} will
also be fixed.

The class of Fréchet bounds we present is linked with the theory
of decomposable loglinear models. We think of every vertex *i* ∈
*K* of 𝒢 as being associated with a variable
*X*_{i}. The structural information embedded in 𝒢
might be interpreted in the following way: If *S* separates
*A*_{1} and *A*_{2} in 𝒢, then
*X*_{A1} is conditionally independent of
*X*_{A2} given *X*_{S}. The
loglinear model with minimal sufficient statistics
*C*_{1}, *C*_{2}, … , *C*_{p}
will be decomposable because its independence graph 𝒢 is
decomposable, and consequently the maximum likelihood estimates (MLEs)
will exist and can be expressed in a closed form (14, 15). We develop
explicit formulas for the tightest upper and lower bounds for the cell
counts in the cross-classification *n*_{K} provided
that the marginals *n*_{C1},
*n*_{C2}, … , *n*_{Cp} are known by
employing a similar machinery to the one used for developing formulas
for MLEs for a decomposable loglinear model. This machinery provides us
with the tools we need for extending the usual Fréchet bounds to
more complicated graphical structures.

We begin with a slightly more general statement of the original Fréchet bound result (2, 11).

Theorem 4. (Fréchet).
(*i*) *Let* *a*_{1}, *a*_{2} ⊆ *K*
*such that* (*a*_{1}\*a*_{2},
*a*_{1} ∩ *a*_{2}, *a*_{2}\*a*_{1})
*is a proper decomposition of the graph* 𝒢
(*a*_{1} ∪ *a*_{2}). *Then the following
inequality holds:
*3
(*ii*)* The above inequality provides sharp bounds for
the cells in the contingency table*
*n*_{a1}_{∪a2}
*given the marginals* *n*_{a1}
*and* *n*_{a2}.

If two vertex sets are in two distinct connected components, they are
separated by the empty set. It is not hard to see that *Theorem
4* implies the following result.

Corollary 1. (*i*) *If*
*a*_{1} *and* *a*_{2}
*are two disjoint subsets of **K*, *we have
*
(*ii*) *The above inequality provides sharp bounds for
the cells in the contingency table*
*n*_{a1}_{∪a2}
*given the marginals* *n*_{a1}
*and* *n*_{a2}.

This immediately generalizes to a graph with any number of connected components.

Theorem 5. (*i*) *Let* {*a*_{1},
*a*_{2}, … , *a*_{m}} *denote the set
of connected components of the graph
𝒢*.
*Then the following is true*:
4
(*ii*) *The* *above* *inequality* *provides* *sharp* *bounds* *for* *the*
*cells* *in* *the* *contingency* *table*
*given the marginals* *n*_{a1},
*n*_{a2}, … , *n*_{am}.

We are now ready to explore the situation when the minimal sufficient statistics of a decomposable loglinear model define a connected graph.

Theorem 6. *Suppose* 𝒢 = (*K*, *E*)
*is connected and decomposable. Let *𝒞(𝒢)
= {*C*_{1}, *C*_{2}, … , *C*_{p}}
*the set of cliques of 𝒢 ordered in a perfect sequence and*
𝒮(𝒢) = {*S*_{2}, … ,
*S*_{p}} *the corresponding set of separators. Then
*5
*and these are sharp bounds for the cells in the contingency
table* *n*_{K} *given the marginals*
*n*_{C1}, … , *n*_{Cp}.

*Proof:* By induction. If 𝒢 decomposes in *p*
= 2 cliques, then Eq. 5 is a direct consequence of
*Theorem 4*. Suppose we know that Eq. 5 holds for
any connected decomposable graph with *p* − 1 cliques.
We want to prove Eq. 5 for a graph with *p*
cliques.

*Theorem 3* tells us that
(*H*_{p−1}\*S*_{p}, *S*_{p},
*R*_{p}) is a decomposition of the graph
𝒢(*H*_{p}) = 𝒢. By using *Theorem
4*, we obtain
6
The cliques of 𝒢(*H*_{p−1}) are
*C*_{1}, *C*_{2}, … ,
*C*_{p−1}, and this is a perfect sequence in
𝒢(*H*_{p−1}). From the induction assumption that
we made, we have
7
By combining Eqs. 6 and 7, we obtain the
desired Eq. 5. Again, because the bounds in Eq. 6
are the tightest possible for the counts in table
*n*_{K}, and the same is true for the bounds in Eq.
7 for the cell counts in table
*n*_{Hp−1}, we conclude that the bounds in Eq.
5 are also the tightest bounds for the counts in table
*n*_{K}.

Buzzigoli and Giusti (18) proposed an algorithm, which they call
*the shuttle algorithm*, that alternates iteratively between
upper and lower bounds, and that when applied to decomposable
structures appears to indirectly exploit the structure implicit in
*Theorem 6*. But it does not achieve the sharp bounds in as
computationally efficient fashion as we can by using the formula
directly.

At this point we succeeded in developing formulas for the sharpest bounds when the sets of indices defining the known marginals define a connected decomposable graph. However, the connectivity assumption is not by any means essential. We can extend the definition of decomposable graphs to include disconnected graphs with all their connected components decomposable. By employing the maximum cardinality search algorithm sequentially for every connected component, we can determine the set of cliques of such a disconnected decomposable graph as the union of the sets of cliques associated with the connected components. The corresponding set of separators can be obtained in the same way.

The next result provides an explicit formula for the generalized Fréchet bounds associated with an arbitrary decomposable graphical structure. We emphasize that the generalized Fréchet bounds are sharp bounds given the information that we assumed we have.

Theorem 7. (*i*) *Let* 𝒢 = (*K*, *E*)
*be a decomposable graph. Then the following inequality
is true*:
8
*where* 𝒞(𝒢) *is* *the* *set* *of* *cliques* *of* 𝒢, 𝒮(𝒢) *is* *the*
*set* *of* *separators* *associated* *with* 𝒞(𝒢), *and*
*m* *is the number of connected components of the graph
𝒢*. (*ii*) *The above inequality provides sharp bounds
for the cells in the contingency table* *n*_{K}
*given the marginals* {*n*_{C}|*C* ∈
𝒞(𝒢)}.

*Proof:* We apply *Lemma 6* for each connected
component of 𝒢, then *Theorem 5* to combine the resulting
inequalities. All the bounds for the marginal tables involved are
tight, hence the bounds in Eq. 8 will also be tight.

## 4. Reducible Graphs

By exploiting decomposability in an appropriate manner, we have been able to find sharp bounds for cell counts when some special sets of marginals characterizing decomposable loglinear models are given. It is natural to ask ourselves whether we could develop similar results for reducible graphs, as described in refs. 16 and 19.

*Definition 6:* A graph 𝒢 is *reducible *if 𝒢
admits a proper decomposition, otherwise 𝒢 is a *prime
*graph.

Any complete graph is prime, whereas any disconnected graph is reducible. By definition, the atoms contained in a derived system of a graph are all prime. Given that every reducible graph 𝒢 might have several derived systems (16), we would like to be able to isolate one of them that could fully characterize the input graph 𝒢.

*Definition 7:* A subgraph 𝒢(*A*) is a
*maximal prime *(mp-) subgraph of 𝒢, if
𝒢(*A*) is prime and 𝒢(*B*) is reducible for
all *B* with *A* ⊂ *B* ⊆ *V*.

The set of mp-subgraphs of 𝒢 is contained in every derived system of
𝒢. Moreover, the set of mp-subgraphs of 𝒢 is always a derived
system of 𝒢 (19), and consequently it is the *unique minimal
derived system*. If 𝒢 is decomposable, the mp-subgraphs of 𝒢
are complete, hence the unique minimal derived system of a decomposable
graph contains only its cliques (19).

Section 2 describes a procedure for finding the mp-subgraphs of a decomposable graph. The order in which the MCS algorithm identifies the mp-subgraphs along with the set of separators are needed to reconstruct the original graph from its minimal derived system. We would like to devise a similar decomposition algorithm for the more general case when the input graph is reducible, not necessarily decomposable.

It is easy to see that any decomposable graph is reducible, but the
converse is not true, as we will prove next. Gavril (20) introduced the
family of *clique separable graphs* in the following recursive
manner.

*Definition 8:* 𝒢 = (*V*, *E*) is a
*clique-separable graph *if (*i*) 𝒢 is a Type 1 or
Type 2 graph, or (*ii*) 𝒢 has a separator *C*, and
the leaves of 𝒢 produced by *C* are
clique-separable graphs.

A graph 𝒢 is a Type 1 graph if its vertex set can be partitioned in
two subsets *V*_{1}, *V*_{2}, such that
|*V*_{1}| ≥ 3, 𝒢(*V*_{1}) is
a connected bipartite graph, *V*_{2} is complete, and
every vertex of *V*_{1} is adjacent to every vertex
of *V*_{2}. In addition, 𝒢 = (*V*, *E*) is a
Type 2 graph if there exists a partition
*V*_{1}, … , *V*_{k} of *V*,
such that *V*_{1}, … , *V*_{k} are
independent sets in 𝒢, and every vertex of *V*_{i}
is adjacent to every vertex of *V*_{j}, for
*i* ≠ *j*.

By definition, any decomposable graph is also clique-separable, and any clique-separable graph is reducible. However, Type 2 graphs are clique-separable but obviously they are not necessarily decomposable, hence the class of reducible graphs is much richer than the class of decomposable graphs.

Tarjan (16) has proposed an O(*nm*)-time method for
decomposing a reducible graph with *n* vertices and
*m* edges. The downside of Tarjan's algorithm is that it
generates an arbitrary derived system of prime graphs. Leimer (19) has
adapted this algorithm so that the input graph is decomposed exactly
into its mp-subgraphs. A reducible graph 𝒢 might have several
separators that would induce a proper decomposition of 𝒢. If we could
select the “right” separator at every step of the decomposition
procedure, then we would manage to avoid including nonmaximal prime
subgraphs in the final derived system.

*Definition 9:* [Leimer (19).] Let (*A*_{1},
*A*_{2}, *A*_{3}) be a decomposition of
𝒢 into the subgraphs 𝒢′ = 𝒢(*A*_{1} ∪
*A*_{2}) and 𝒢" = 𝒢(*A*_{2} ∪
*A*_{3}). If the mp-subgraphs of 𝒢′ and
𝒢" are pairwise different and if they are all mp-subgraphs
of 𝒢, then (*A*_{1}, *A*_{2},
*A*_{3}) is called a *P-decomposition *and
*A*_{2} is called a *P-separator.*

Moreover, a decomposition (*A*_{1}, *A*_{2},
*A*_{3}) is a P-decomposition if and only if
𝒢(*A*_{2}) is not an mp-subgraph of any of the
graphs 𝒢(*A*_{1} ∪ *A*_{2}) and
𝒢(*A*_{2} ∪ *A*_{3}) (19). If a graph has
a decomposition, then it also has a P-decomposition. Therefore it is
possible to decompose a reducible graph by means of P-separators, and
in this case we are guaranteed to obtain the minimal derived system of
maximal prime subgraphs.

Assume that we somehow managed to order the vertex sets of the
mp-subgraphs 𝒢(*V*_{1}), … ,
𝒢(*V*_{k}) of a graph 𝒢 in a perfect sequence. By using
the same notations as before, we have the following result.

Theorem 8. [*Leimer* (19).]
(*H*_{k−1}\*S*_{k}, *S*_{k}, *R*_{k}) *is*
*a* *P*-*decomposition* *of* 𝒢 *into* 𝒢′ = 𝒢(*H*_{k−1}) *and* *the*
*prime* *graph* 𝒢" = 𝒢(*V*_{k}).
𝒢(*V*_{1}), … , 𝒢(*V*_{k−1}) *are* *the*
*mp*-*subgraphs* *of* 𝒢′ *and* *V*_{1}, … , *V*_{k−1}
*is* *a* *perfect* *sequence* *of* *sets* *in* 𝒢′.

*Theorem 8* can be applied recursively to generate a derived
system of 𝒢. Because the decompositions performed along the way are
P-decompositions, the minimal derived system of 𝒢 will be generated.

We are interested in the existence of a perfect sequence of the mp-subgraphs of a graph only for proving the correctness of our results. The ordering of the mp-subgraphs is not relevant when computing the generalized Fréchet bounds, and consequently, in an actual implementation of our algorithms, we would only have to obtain the set 𝒱(𝒢) of mp-subgraphs along with the corresponding sequence 𝒮(𝒢) of separators.

Leimer (19) has suggested an alternative approach that would allow us
accomplish this task by taking advantage of the MCS algorithm we
previously presented. The first step would be to transform a connected
reducible graph 𝒢 = (*V*, *E*) in a closely related
decomposable graph by adding extra edges in *E*. We would like
to keep the number of edges added to a minimum, so that a minimal
decomposable graph is derived.

*Definition 10:* [Tarjan (16).] Let π be an ordering
of the vertex set of a graph 𝒢 = (*V*,* E*). The
fill-in *F*_{π} caused by the ordering π is the
set of edges:
9
The graph 𝒢_{π} = (*V*, *E* ∪
*F*_{π}) is called the *minimal fill-in graph*
if there does not exist a numbering π′ of 𝒢 with
*F*_{π′} ⊂ *F*_{π}. It can be shown that
the fill-in graph 𝒢_{π} is decomposable for any numbering
π of 𝒢. Algorithms for generating a minimal fill-in graph can be
found in Ohtsuki and Cheung (21).

The second step consists of applying the maximum cardinality search
algorithm to the minimal fill-in graph 𝒢_{π} associated
with the input graph 𝒢. However, we will not employ the
“original” maximum cardinality search algorithm. We will make use
instead of an expanded version (17) that can find the set
𝒞(𝒢_{π}) = {*C*_{1},
*C*_{2}, … , *C*_{r}} of cliques of
𝒢_{π} along with the associated system
𝒮(𝒢_{π}) = {*S*_{2}, … ,
*S*_{r}} of separators by constructing a tree
𝒯_{π} = (𝒞(𝒢_{π}),
ℰ_{τπ}). We assume that the sequence
*C*_{1}, *C*_{2}, … , *C*_{r} is
perfect. For every clique *C*_{j}, *j* > 1, we
choose a “parent” clique *C*_{i}, *i* < *j*
such that *S*_{j} ⊂ *C*_{i}, and include the
edge (*C*_{j}, *C*_{i}) in
ℰ_{τπ}. Because the parent of a clique might
not be unique, more than one tree could be constructed on
𝒞(𝒢_{π}). Moreover, *C*_{1} cannot
have a parent and will be called the *root* of the tree. This
is certainly not a restriction because every clique can be
*C*_{1} in some perfect sequence. The tree
𝒯_{π} generated by the MCS algorithm has the additional
property that *S* ⊂ *V* is a minimal vertex separator of
𝒢_{π} if and only if *S* = *C*_{j} ∩
*C*_{i} for some edge (*C*_{j},
*C*_{i}) ∈ ℰ_{τπ}. Consequently, the
set of separators associated with 𝒞(𝒢_{π}) will be given
by 𝒮(𝒢_{π}) = {*C*_{i} ∩
*C*_{j} : (*C*_{i}, *C*_{j}) ∈
ℰ_{τπ}}. Then *S* ∈
𝒮(𝒢_{π}) will also be a minimal separator in
𝒢 if *S* is complete in 𝒢.

The last step of the algorithm is presented below in pseudo-code. With
every clique *C* ∈ 𝒞(𝒢_{π}), we associate a
vertex set Δ(*C*). Initially we set Δ(*C*) ← *C*
for all *C* ∈ 𝒞(𝒢_{π}). A clique *C*
is *terminal* in 𝒯_{π} if *C* is not the
parent of any other clique, i.e., if there is no such *C*′
with (*C*′, *C*) ∈ ℰ_{τπ}.

• 𝒱(𝒢) ← ∅; 𝒮(𝒢) ← ∅;

•

**while**ℰ_{τπ}≠ ∅ do

- 1.,
- Identify a terminal clique
*C*_{j};; - 2.,
- 𝒞(𝒢
_{π}) ← 𝒞(𝒢_{π})\{*C*_{j}};; - 3.,
- ℰ
_{τπ}← ℰ_{τπ}\{(C_{j}, C_{i})};; - 4.,
**if***C*_{j}∩*C*_{i}is complete in 𝒢 then;- ,
- 𝒱(𝒢) ← 𝒱(𝒢) ∪
{Δ(
*C*_{j})};; - ,
- 𝒮(𝒢) ← 𝒮(𝒢) ∪ {C
_{j}∩ C_{i}};; - ,
**else**;- ,
- Δ(
*C*_{i}) ← Δ(*C*_{i}) ∪ Δ(*C*_{j});; - ,
**end while**

• 𝒱(𝒢) ← 𝒱(𝒢) ∪ {Δ(

*C*_{1})}.

This algorithm provides a computational approach for identifying the maximal prime subgraphs 𝒱(𝒢) of an arbitrary connected reducible graph 𝒢, along with its associated system 𝒮(𝒢) of separators. We utilize it in the following section.

## 5. Generalized Fréchet Bounds for Reducible Loglinear Models

In Section 3, we showed that we can explicitly determine the
tightest bounds for the cells in a table of counts
*n*_{K} given a set of marginals when that set of
marginals define a decomposable graph 𝒢 = (*K*, *E*). When the
graph associated with some set of marginals is not decomposable, we
have no choice but to employ iterative methods such as the simplex
algorithm. Generally speaking, linear programming methods are
computationally expensive and might yield results that are very
difficult to interpret, so they should be used with care. The natural
question to ask is whether we could reduce the computational effort
needed to determine the tightest bounds by employing the same strategy
used for decomposable graphs, i.e., decompositions of graphs by means
of complete separators.

To be more specific, assume we want to determine the bounds for a
contingency table *n*_{K} given the marginals
*n*_{C1}, *n*_{C2}, … ,
*n*_{Cp}. In addition, *C*_{1},
*C*_{2}, … , *C*_{p} are the cliques of the
graph 𝒢 = (*K*, *E*). 𝒢 is assumed to be reducible, not
necessarily decomposable. Let *V*_{1},
*V*_{2}, … , *V*_{q} be the maximal prime
subgraphs of 𝒢 ordered in a perfect sequence, and let
*S*_{2}, *S*_{3}, … , *S*_{q} be
the sequence of separators associated with *V*_{1},
*V*_{2}, … , *V*_{q}. Suppose we could
compute tight bounds for the marginals *n*_{V1},
*n*_{V2}, … , *n*_{Vq} given
*n*_{C1}, *n*_{C2}, … ,
*n*_{Cp}, i.e., we know
*n*_{V1}^{U},
*n*_{V2}^{U}, … ,
*n*_{Vq}^{U} and
*n*_{V1}^{L},
*n*_{V2}^{L}, … ,
*n*_{Vq}^{L} such that
10
Because *S*_{j} is complete in 𝒢, there
will exist an *i* ∈ {1, 2, … , *p*} such that
*S*_{j} ⊆ *C*_{i}. Hence
*n*_{Sj} is a marginal table of
*n*_{Ci}. Therefore, once we fixed
*n*_{C1}, *n*_{C2}, … ,
*n*_{Cp}, the marginals
*n*_{S2}, … , *n*_{Sq} will
also be fixed. With the notations introduced above, we develop
explicit formulas for sharp bounds for the cells counts in table
*n*_{K}.

Theorem 9. *Suppose* 𝒢 = (*K*, *E*)
*is connected and reducible. The tightest bounds for the cell
counts in the contingency table* *n*_{K}
*given the marginals* *n*_{C1},
*n*_{C2}, … , *n*_{Cp} *are
given by*
11
*Proof:* Because *V*_{1},
*V*_{2}, … , *V*_{q} is a derived system of
𝒢, we could think about the subgraphs
𝒢(*V*_{1}), … , 𝒢(*V*_{q}) as being
the cliques of a connected decomposable graph 𝒢′. Moreover,
*S*_{2}, … , *S*_{q} will be the system
of separators associated with *V*_{1},
*V*_{2}, … , *V*_{q} in 𝒢′. By employing
*Theorem 6*, we obtain
12
Then Eq. 11 follows immediately from Eqs. 12
and 10. The bounds for the marginal tables involved are all
sharp, hence the bounds in Eq. 11 will also be tight.

Once again, we will point out the link with maximum likelihood
estimation in loglinear models. We define a *reducible loglinear
model* as one for which the corresponding minimal sufficient
statistics are margins that characterize the maximal prime subgraphs of
a reducible graph. Assuming that one has calculated maximum
likelihood estimates for the loglinear models determined by the
independence graphs
𝒢(*V*_{1}),𝒢(*V*_{2}), … ,𝒢(*V*_{q}),
then one can easily derive explicit formulae for the maximum likelihood
estimates in the reducible loglinear model with independence graph 𝒢.
By employing results of Lauritzen (14), we find that
13
(c.f. the special cases given in ref. 12).

We continue the analogy with the decomposable case we previously discussed by considering a reducible disconnected graph. We know how to find the maximal prime subgraphs (along with the corresponding sequence of separators) successively for every connected component. The set of mp-subgraphs for the complete graph is defined as the union of the sets of mp-subgraphs of every connected component. The set of separators can be determined in a similar way. We are now ready for the main result of the paper. However, we are going to postpone presenting it for the moment.

### 5.1. Example.

To clarify the concepts and the results presented so far, we use an example similar to the one proposed by Tarjan (16). The graph 𝒢 in Fig. 1 has 11 vertices and 17 edges represented by continuous lines. We want to determine the mp-subgraphs of 𝒢. The edge {3, 9} is a separator for {1, 3, 4, 5, 6, 7, 8, 9, 11} and {2, 3, 9, 10}. The latter is a four-cycle, hence cannot be further decomposed, and because it is not a complete, 𝒢 cannot be decomposable. Similarly, {4, 7} separates {1, 3, 4, 7, 8, 9, 11} and {4, 5, 6, 7}. Again, the latter is a four-cycle, hence it is a prime subgraph. Now the clique {1, 3, 4, 11} is separated from {3, 4, 7, 8, 9, 11} by the triangle {3, 4, 11}. The subgraph 𝒢({3, 4, 7, 8, 9, 11}) does not have a separator, therefore we have finished decomposing 𝒢. The set of mp-subgraphs is 𝒱(𝒢) = {{2, 3, 9, 10}, {4, 5, 6, 7}, {1, 3, 4, 11}, {3, 4, 7, 8, 9, 11}}, whereas the sequence of separators is 𝒮(𝒢) = {{3, 9}, {4, 7}, {3, 4, 11}}.

Next we illustrate how to obtain 𝒱(𝒢) and
𝒮(𝒢) by using the decomposition algorithm from Section
4. The minimal fill-in graph 𝒢_{π} is obtained by adding
six new edges to 𝒢. These edges are represented with dotted lines in
Fig. 1. The cliques of 𝒢_{π} are *C*_{1}
= {1, 3, 4, 11}, *C*_{2} = {3, 4, 7, 11},
*C*_{3} = {3, 7, 8, 11}, *C*_{4} = {4, 6,
7}, *C*_{5} = {4, 5, 6}, *C*_{6} = {3,
8, 9}, *C*_{7} = {3, 9, 10}, and
*C*_{8} = {2, 3, 10}. The tree
𝒯_{π} constructed by the MCS algorithm on
𝒞(𝒢_{π}) = {*C*_{1},
*C*_{2}, … , *C*_{8}} has edges
We proceed to the last step of the algorithm. The clique
*C*_{5} is terminal, but *C*_{5} ∩
*C*_{4} = {4, 6} is not complete in 𝒢, hence we
set Δ(*C*_{4}) = {4, 5, 6, 7}. After
eliminating *C*_{5} from the clique tree,
*C*_{4} becomes terminal. Because
*S*_{1} = *C*_{4} ∩ *C*_{2} =
{4, 7} is complete in 𝒢, we identified the first mp-subgraph
*V*_{1} = Δ(*C*_{4}) and its associated
separator *S*_{1}. We eliminate
*C*_{4} from 𝒯_{π}, and the algorithm
proceeds in a similar manner.

The set 𝒞(𝒢) of cliques is essentially the set of edges
of 𝒢 from which we take out {1, 3}, {1, 4}, {1, 11},
{4, 11}, {3, 4}, {3, 11}, and then add {1, 3, 4, 11}.
Assume we want to determine upper and lower bounds for a
cross-classification *n*_{K} with 11 dimensions.
Given the marginal tables {*n*_{C} : *C* ∈
𝒞(𝒢)}, it is possible to compute sharp bounds for the
marginal tables corresponding to the mp-subgraph of 𝒢. Because the
separators in 𝒮(𝒢) are subsets of some cliques, they
will define marginals of some tables in {*n*_{C} : *C* ∈
𝒞(𝒢)}, hence it is possible to make use of Theorem 9 to
calculate sharp bounds for the cell counts in table
*n*_{K}.

### 5.2. Bounds for Reducible Loglinear Models.

The foregoing example indicates that *Theorem 9* is applicable
in a more general setting than the one we previously suggested.
Determining bounds for cell counts in a cross-classification given the
marginals defined by the set of cliques is equivalent to the problem of
calculating the MLEs of a graphical loglinear model. The minimal
sufficient statistics of a graphical log-linear model define a graph,
and the cliques of this graph are exactly the minimal sufficient
statistics. If the minimal sufficient statistics are not cliques in the
associated graph, the model is not graphical.

For example, suppose we have a table *n*_{K}
corresponding to the graph in Fig. 1. Now assume we don't have access
to the marginal *n*_{[1,3,4,11]}, but instead we do
know *n*_{[1,3]}, *n*_{[1,4]},
*n*_{[1,11]} and also *n*_{[3,4,11]}.
These marginals no longer correspond to the cliques of a graph. Yet it
is still possible to compute sharp bounds for the marginals determined
by the mp-subgraphs of 𝒢, and then to combine these bounds using
Theorem 9 to obtain tight bounds for the complete table
*n*_{K}.

To be more explicit, suppose we are provided with a set of marginals
*n*_{D1}, *n*_{D2}, … ,
*n*_{Dr} that define a graph 𝒢 = (*K*, *E*).
We have *K* =
*D*_{i} and *E* = {(*u*, *v*) : {*u*, *v*} ⊂
*D*_{j}, for some *j* = 1, … , *r*}.
The mp-subgraphs of 𝒢 are 𝒱(𝒢) = {*V*_{1},
*V*_{2}, … , *V*_{q}}, whereas the
corresponding sequence of separators is 𝒮(𝒢) =
{*S*_{2}, … , *S*_{q}}. We emphasize that
(*D*_{j})_{j} do not have to be the set of
cliques of 𝒢 and that 𝒢 is not necessarily connected. However, we
need to impose one additional constraint, namely for every
*S*_{i}, there is a *j* ∈ {1, 2, … ,
*r*} such that *S*_{i} ⊂ *D*_{j}. This
implies that the marginals *n*_{S2}, … ,
*n*_{Sq} will be fixed once
*n*_{D1}, *n*_{D2}, … ,
*n*_{Dr} are fixed. With these notations, we announce
a more general version of *Theorem 9*.

Theorem 10. *Let* 𝒢 = (*K*, *E*) *be* *a* *reducible* *graph*.
*Then* *the* *following* *inequality* *is* *true*:
14
*where *𝒱(𝒢) *is* *the* *set* *of* *maximal* *prime* *subgraphs*
*of* 𝒢, 𝒮(𝒢) *is* *the* *set* *of* *separators* *associated* *with* 𝒱(𝒢), *and*
*m* *is* *the* *number* *of* *connected* *components* *of* *the* *graph* 𝒢. *In* *addition*,
{*n*_{V}^{U}|*V* ∈ 𝒱(𝒢)} *and*
{*n*_{V}^{L}|*V* ∈ 𝒱(𝒢)} *are* *the* *tightest* *upper*
*and* *lower* *bounds* *for* *the* *marginal* *tables* {*n*_{V}|*V*
∈ 𝒱(𝒢)}, *respectively*.

*Proof:* Because 𝒱(𝒢) is a derived system of the
graph 𝒢, we can think about the subgraphs {𝒢(*V*)|*V* ∈
𝒱(𝒢)} as being the derived system of cliques of a graph
𝒢′. In this case 𝒮(𝒢) will be the set of separators
associated with 𝒱(𝒢) in 𝒢′, hence Eq. 14
follows immediately from Eq. 8.

## 6. Conclusions

The results described in this paper are part of a
programmatic effort to understand and operationalize the computation of
upper and lower bounds for non-negative entries in
cross-classifications subject to a set of marginal constraints. From
research on mass transportation and other versions of this problem, we
know that the computational problem is typically characterized as being
*NP*-complete, and thus we cannot expect to find a simple
approach that will deal effectively with the bound calculation problem,
especially in high dimensions. Thus, instead of attempting to utilize a
general computational approach such as linear programming or the
simplex algorithm (22) or network methods (23, 24), we have opted to
exploit the structure of the underlying probability structures based on
statistical and mathematical theory.

In particular, we have worked with the graphical representation of probability distributions subject to conditional independence relationships and utilized existing results on decomposable graphs to derive explicit bounds for cell entries when the given marginals correspond to the maximal cliques of a decomposable graph. Our approach was motivated by the more specialized results for decomposable loglinear models for tables of counts where the minimal sufficient statistics are marginals and the expected cell values are explicit functions of them.

We also have extended the bound results from the decomposable to
the reducible case, and this allows us to exploit other results and
computational approaches for bounds applied to subtables corresponding
to the reducible components that are not cliques. The results of
Section 5 focus on the cases where tables still have a graphical
representation representing conditional independence relationships. But
there are many other probability structures where we would like to be
able to calculate bounds but are not graphical in this sense. For
example, a *k*-dimensional probability distribution given all
(*k* − 1)-dimensional marginals is not graphical, but we
are still able to exploit statistical theory to compute upper and lower
bounds in this case. Fienberg (11) outlines an approach for doing this
in the *k* = 3 case, and Dobra and Fienberg (25) provide
detailed algorithms for *k* > 3. Suppose that one wants
to compute bounds for a cross-classification that has a structure
similar to that in the reducible case, except that we replace a
*d*-dimensional nonclique by a *d*-dimensional
probability distribution given all (*d* − 1)-dimensional
marginals. Then we can combine the bounds computed for this
nongraphical distribution using the reducible representation of Section
5.

Cox (26) raised a very interesting question, namely, whether one can actually construct a feasible table with a prescribed set of possibly overlapping margins. The solution to the feasibility problem is straightforward if the marginal tables constitute the maximal cliques of a decomposable graph. In this case, the explicit formulas for calculating the MLEs of the associated loglinear model provide us with a feasible table. In addition, if the set of margins are the minimal sufficient statistics of a reducible loglinear model, Eq. 13 tells us how to construct a feasible table given a consistent set of marginals associated with the maximal prime subgraphs of the induced independence graph. Therefore, the results substantially reduce the computational effort needed to solve the feasibility problem by reducing it to a number of smaller and hopefully easier-to-solve problems.

These results represent only a small part of those needed to allow the computation of upper and lower bounds for high-dimensional cross-classifications of the sort that arise in disclosure limitation and other practical problems.

## Acknowledgments

Preparation of this paper was supported in part by the U.S. Bureau of the Census and the National Science Foundation under Grant EIA-9876619 to the National Institute of Statistical Sciences.

## Appendix

We give below a proof of *Theorem 4*.

*Proof:* Let *a*_{1} = {1, … ,
*p*} and *a*_{2} = {*q*, … , *m*}
where 1 ≤ *q* ≤ *p* ≤ *m*. Let
*n*_{i}_{1}^{0} … *i*_{m}^{0}
be an arbitrary cell in the table
*n*_{a1}_{∪}_{a2}.
To avoid confusion, the marginals *n*_{a1},
*n*_{a2},
*n*_{a1}_{∩}_{a2}
will be denoted by *A*^{1}, *A*^{2},
*A*^{12}, respectively. The following equalities should
hold:
Consider the sums:
With these notations, we have
15
We can write *D*_{12} = *D*_{1} +
*D*_{2} + *D*_{3}, where
It follows that
16
Clearly *D*_{1}, *D*_{2}, *D*_{3}
≥ 0. From Eqs. 15 and 16, we deduce
which concludes the proof.

## Footnotes

↵* To whom reprint requests should be addressed. E-mail: fienberg{at}stat.cmu.edu.

This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected on April 27, 1999.

## Abbreviations

- c.d.f.,
- cumulative distribution function;
- PEO,
- perfect elimination ordering;
- MCS,
- maximum cardinality search;
- MLE,
- maximum likelihood estimate;
- mp-,
- maximal prime

- Accepted August 2, 2000.

- Copyright © 2000, The National Academy of Sciences

## References

- ↵Various authors (1998)
*J. Off. Stat.***14,**Special Issue 4. - ↵
- Rachev S T,
- Rüschendorf L

- ↵
- Gutmann S,
- Kemperman J H B,
- Reeds J A,
- Shepp L A

- ↵
- King G

- ↵
- Balke A,
- Pearl J

- ↵
- Joe H

- ↵
- Fréchet M

- ↵
- Bonferroni C E

- ↵
- Hoeffding W

- ↵
- Rüschendorf L

- ↵
- Fienberg S E

- ↵
- Bishop Y M M,
- Fienberg S E,
- Holland P W

- Haberman S J

- ↵
- Lauritzen S L

- ↵
- Whittaker J

- ↵
- ↵
- Blair J R S,
- Barry P

- ↵
- Buzzigoli L,
- Giusti A

- ↵
- ↵
- ↵
- Ohtsuki T,
- Cheung L K,
- Fujisawa T

- ↵
- Roehrig S F,
- Padman R,
- Duncan G T,
- Krishnan R

- ↵
- Cox L H

- ↵
- Cox L H

- ↵
- Dobra A,
- Fienberg S E

- ↵
- Cox L H