Congruences between modular forms: Raising the level and dropping Euler factors
Abstract
We discuss the relationship among certain generalizations of results of Hida, Ribet, and Wiles on congruences between modular forms. Hida’s result accounts for congruences in terms of the value of an L-function, and Ribet’s result is related to the behavior of the period that appears there. Wiles’ theory leads to a class number formula relating the value of the L-function to the size of a Galois cohomology group. The behavior of the period is used to deduce that a formula at “nonminimal level” is obtained from one at “minimal level” by dropping Euler factors from the L-function.
An example of a congruence between modular forms is provided by the newforms $f = \sum a_n q^n$ and $g = \sum b_n q^n$ of levels 11 and 77, respectively, whose first few Fourier coefficients are found in Table 1. One can show that, in fact, $a_n \equiv b_n \bmod 3$ for all $n$ not divisible by 7. (See Theorem 5.1 below.)
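The level 11 form in this example can be computed explicitly. The sketch below assumes a standard fact that is not stated in the text: the unique newform of weight 2 and level 11 is the eta product $q\prod_{n\ge 1}(1-q^n)^2(1-q^{11n})^2$. The function name is illustrative, and the sketch does not compute the level 77 form $g$.

```python
def level11_coefficients(bound):
    """Return [a_0, ..., a_bound] for q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    # series[k] is the coefficient of q^k in the product, truncated at q^bound
    series = [0] * (bound + 1)
    series[0] = 1
    for n in range(1, bound + 1):
        for step in (n, 11 * n):
            if step > bound:
                continue
            for _ in range(2):  # each factor occurs squared
                # multiply in place by (1 - q^step); descending k keeps old values intact
                for k in range(bound, step - 1, -1):
                    series[k] -= series[k - step]
    # the leading factor q shifts every exponent up by one
    return [0] + series[:bound]


a = level11_coefficients(13)
print({n: a[n] for n in range(1, 14)})
```

In particular `a[7]` comes out as −2, the value of $a_7(f)$ used in the discussion of Theorem 5.1.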
Table 1. Fourier coefficients
We shall discuss the relationship among the following three results concerning congruences to a newform ƒ of weight 2 and level N. We assume that K is a number field containing the coefficients of ƒ and restrict our attention to congruences mod powers of a prime λ dividing ℓ.
• A formula of Hida (1) measuring congruences to ƒ in terms of the value of an L-function.
• A result of Ribet (2) that establishes the existence of certain systematic congruences between ƒ and forms of level Np (such as the one above).
• A theorem of Wiles (3), completed by his work with Taylor (4), which shows that all suitable deformations of Galois representations associated to ƒ actually arise from forms congruent to ƒ.
Hida’s formula, though not part of the logical structure of ref. 3, provides some insight into the role played in Wiles’ proof by a certain generalization of Ribet’s result. This generalization can be interpreted as the invariance of a period appearing in Hida’s formula. Using this invariance, one shows that Wiles’ theorem at minimal level implies the theorem at nonminimal level.
Remark 1.1: We are concerned here mainly with Ribet’s “raising the level” result, rather than his “lowering the level” result of ref. 5. We remark that Hida also found systematic congruences between ƒ and forms of level Nℓ^r. We shall not discuss these, but focus on congruences between ƒ and forms of level Nd with d not divisible by ℓ.
Notation and Review
We fix a prime ℓ and embeddings Q̄ → Q̄ ℓ and Q̄ → C. Suppose that K is a number field contained in C and let λ denote the prime of 𝒪K determined by our choice of embeddings. Let 𝒪 denote the localization of 𝒪K at λ.
We suppose that $f$ is a newform of weight 2, level $N_f$ and character $\chi_f$ with coefficients in $K$. The Eichler–Shimura construction associates to $f$ an $\ell$-adic representation
$$ \rho_f : \mathrm{Gal}(\bar{\mathbf{Q}}/\mathbf{Q}) \to \mathrm{GL}_2(K_\lambda) $$
such that if $p$ does not divide $N_f\ell$, then $\rho_f$ is unramified at $p$ and $\rho_f(\mathrm{Frob}_p)$ has characteristic polynomial
$$ X^2 - a_p(f)X + \chi_f(p)p. \tag{1} $$
We let $\bar\rho_f$ denote the semisimplification of the reduction of $\rho_f$. If $f$ and $g$ are newforms of weight 2, then we write $f \sim g$ if $\bar\rho_f$ is equivalent to $\bar\rho_g$. By the Cebotarev density theorem and the Brauer–Nesbitt theorem, we have $f \sim g$ if and only if $a_p(f) \equiv a_p(g)$ for all primes $p$ not dividing $N_f N_g \ell$, the congruence being modulo the maximal ideal of the integral closure of $\mathbf{Z}_\ell$ in $\bar{\mathbf{Q}}_\ell$.
We assume throughout that ℓ is odd, ℓ² does not divide N ƒ, and ℓ does not divide the conductor of χƒ. We assume also that the restriction of ρ̄ƒ to Gal(Q̄/F) is irreducible, where F is the quadratic subfield of Q(ζℓ). It is convenient to distinguish two sets of primes that can create technical problems.
• We let $S_f$ denote the set of primes $p$ such that the restriction of $\rho_f$ to a decomposition group at $p$ is not minimally ramified in the sense of ref. 6.
• We let $P_f$ denote the set of primes $p \neq \ell$ such that $\bar\rho_f^{\,I_p} = 0$, but $\mathrm{ad}^0(\bar\rho_f)^{I_p} \neq 0$.
If $p$ is not in $P_f \cup \{\ell\}$, then $p$ is in $S_f$ if and only if the powers of $p$ differ in the conductors of $\rho_f$ and $\bar\rho_f$. In the introductory example, we have $S_f = P_f = P_g = \emptyset$, and $S_g = \{7\}$.
Counting Congruences
We assume that $N$ is divisible by $N_f$ but not by $\ell^2$ and let
Let $\mathbf{T}_N$ denote the $\mathcal{O}$-subalgebra of $\prod_{g \in \mathcal{F}_N} \mathbf{C}$ generated by the set of $T_p$ for $p$ not dividing $N\ell$, where $T_p$ denotes $(a_p(g))_g$. Then $\mathbf{T}_N$ is a local ring, free over $\mathcal{O}$ of rank equal to the cardinality of $\mathcal{F}_N$.
Consider the homomorphism $\pi_f : \mathbf{T}_N \to \mathcal{O}$ defined by projection to the $f$ coordinate. Define ideals of $\mathbf{T}_N$ by
$$ I_f = \ker \pi_f, \qquad J_f = \mathrm{Ann}_{\mathbf{T}_N}(I_f). $$
Then the ideal $\pi_f(J_f)$ has finite index in $\mathcal{O}$, and is called a congruence ideal. This is a variant of the notion of a congruence module used in refs. 1 and 2.
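To see how it measures congruences, consider again the above example with $f$ of level 11. We suppose that $N = 77$ and $\ell = 3$. Then $\mathbf{T}_{77}$ can be identified with
$$ \{(a, b) \in \mathcal{O} \times \mathcal{O} : a \equiv b \bmod 3\} $$
and we find that the congruence ideal is $3\mathcal{O}$.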
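The value $3\mathcal{O}$ can be checked by hand. The following is a sketch under two assumptions that should be compared with the definitions in force: that $\mathbf{T}_{77}$ is the ring of pairs $(a,b) \in \mathcal{O} \times \mathcal{O}$ with $a \equiv b \bmod 3$, and that the congruence ideal is the image under $\pi_f$ of the annihilator of $\ker \pi_f$.

```latex
\begin{align*}
\mathbf{T} &= \{(a,b) \in \mathcal{O}\times\mathcal{O} : a \equiv b \bmod 3\},
  \qquad \pi_f(a,b) = a, \\
\ker \pi_f &= \{(0,b) : b \in 3\mathcal{O}\}, \\
\mathrm{Ann}_{\mathbf{T}}(\ker \pi_f)
  &= \{(a,b) \in \mathbf{T} : (a,b)\cdot(0,3) = (0,3b) = 0\}
   = \{(a,0) : a \in 3\mathcal{O}\}, \\
\pi_f\bigl(\mathrm{Ann}_{\mathbf{T}}(\ker \pi_f)\bigr) &= 3\mathcal{O}.
\end{align*}
```

The same computation with the congruence taken modulo $3^k$ would give $3^k\mathcal{O}$, which is the sense in which the ideal measures the depth of the congruence.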
We consider also some useful variants. Suppose that $\Sigma$ is a finite set of primes containing $S_f$. We let $\mathcal{F}_\Sigma$ denote the set
We then define $\mathbf{T}_\Sigma$ as above, but using the set $\mathcal{F}_\Sigma$ instead of $\mathcal{F}_N$. We denote the resulting congruence ideal $C_{f,\Sigma}$. If $f$ is replaced by the newform associated to a twist, then $\mathbf{T}_\Sigma$ is replaced by a ring to which it is canonically isomorphic, and we obtain the same congruence ideal. So we suppose from now on that $\chi_f$ is of order not divisible by $\ell$.
If $\Sigma$ contains $P_f$, then $\mathcal{F}_\Sigma$ can be identified with $\mathcal{F}_{N_\Sigma}$ for a certain integer $N_\Sigma$. Assuming this holds, we shall also associate to $f$ and $\Sigma$ a cohomology congruence ideal.
Let $\Gamma_H(N_\Sigma)$ denote the maximal subgroup of $\Gamma_0(N_\Sigma)$ in which $\Gamma_1(N_\Sigma)$ has $\ell$-power index. Let $\mathbf{T}$ denote the $\mathcal{O}$-subalgebra of
generated by the Hecke operators $T_n$ for $n \ge 1$. We let $f_\Sigma$ denote the normalized $\mathbf{T}$-eigenform characterized by
• the newform associated to $f_\Sigma$ is $f$;
• $a_p(f_\Sigma) = 0$ for primes $p$ in $\Sigma - \{\ell\}$;
• $a_\ell(f_\Sigma)$ is a unit in $\mathcal{O}$ if $\ell$ divides $N_\Sigma$;
where we have enlarged $K$ if necessary. Consider the prime ideal $\theta$ in $\mathbf{T}$ defined as the kernel of the map $\mathbf{T} \to \mathcal{O}$ arising from $f_\Sigma$, and let $\mathfrak{m}$ denote the maximal ideal generated by $\theta$ and $\lambda$. If $\bar\rho_f$ is irreducible, the completion of $\mathbf{T}_\Sigma$ at its maximal ideal can be identified with the completion of $\mathbf{T}$ at $\mathfrak{m}$. (See section 4.2 of ref. 7.)
We now define a cohomology congruence ideal using the cohomology of the modular curve $X_\Sigma = X_H(N_\Sigma) = \Gamma_H(N_\Sigma)\backslash\mathcal{H}^*$. We have a natural action of $\mathbf{T}$ on $H^1(X_\Sigma, \mathcal{O})$. We choose a basis $\{x, y\}$ for the rank two submodule $M = H^1(X_\Sigma, \mathcal{O})[\theta]$, the intersection of the kernels of the elements of $\theta$. We define the cohomology congruence ideal
$$ C^{\mathrm{coh}}_{f,\Sigma} = \langle x, y \rangle\,\mathcal{O}, $$
where $\langle\ ,\ \rangle$ is the perfect pairing on $H^1(X_\Sigma, \mathcal{O})$ obtained from $x \cup Wy$, where $W$ is the Atkin–Lehner involution. One checks the following (see section 4.4 of ref. 7).
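Lemma 3.1. The ideal $C_{f,\Sigma}$ is contained in $C^{\mathrm{coh}}_{f,\Sigma}$. Furthermore, if the completion $H^1(X_\Sigma, \mathcal{O})_{\mathfrak{m}}$ is free over $\mathbf{T}_{\mathfrak{m}} \cong \mathbf{T}_\Sigma$, then equality holds.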
Remark 3.2: The freeness of $H^1(X_\Sigma, \mathcal{O})_{\mathfrak{m}}$ is equivalent to $H^1(X_\Sigma, k)[\mathfrak{m}]$ being two-dimensional over $k$, which is known under our hypotheses through work of Mazur et al. (see section 2.1 of ref. 3).
Relation with L-Functions
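Hida’s formula relates $C^{\mathrm{coh}}_{f,\Sigma}$ to the value of an L-function. We consider the L-function associated to the Galois representation $\mathrm{ad}^0\rho_f$. This L-function is defined by analytic continuation of the Euler product
$$ L(\mathrm{ad}^0\rho_f, s) = \prod_p L_p(\mathrm{ad}^0\rho_f, s), $$
where for primes $p$ not dividing $N_f$, the Euler factor $L_p(\mathrm{ad}^0\rho_f, s)$ is
$$ \left[(1 - \alpha_p\beta_p^{-1}p^{-s})(1 - p^{-s})(1 - \alpha_p^{-1}\beta_p p^{-s})\right]^{-1}, $$
$\alpha_p$ and $\beta_p$ being the roots of Eq. 1.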
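As a numerical illustration, the Euler factor at a good prime can be evaluated directly from $a_p(f)$ and $\chi_f(p)$. The sketch below assumes the usual shape of the factor, $[(1 - \alpha_p\beta_p^{-1}p^{-s})(1 - p^{-s})(1 - \alpha_p^{-1}\beta_p p^{-s})]^{-1}$ with $\alpha_p, \beta_p$ the roots of $X^2 - a_p(f)X + \chi_f(p)p$; the function name and the choice of floating-point arithmetic are illustrative only.

```python
import cmath


def adjoint_euler_factor(p, a_p, chi_p, s):
    """Evaluate L_p(ad^0 rho_f, s) at a prime p not dividing N_f."""
    # alpha and beta are the roots of X^2 - a_p X + chi_p * p
    disc = cmath.sqrt(a_p * a_p - 4 * chi_p * p)
    alpha = (a_p + disc) / 2
    beta = (a_p - disc) / 2
    x = p ** (-s)
    inverse = (1 - alpha / beta * x) * (1 - x) * (1 - beta / alpha * x)
    return 1 / inverse


# Level 11 example: p = 7, a_7(f) = -2, trivial character, s = 1
value = adjoint_euler_factor(7, -2, 1, 1)
print(value)
```

For these inputs the value agrees with $343/360$ up to rounding; the numerator of the inverse factor, $360 = 6 \cdot 60$, is divisible by 3.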
We shall not give here the recipe for the Euler factors at primes $p$ dividing $N_f$. We remark, however, that $L(\mathrm{ad}^0 f, s)$ remains the same if $f$ is replaced by the newform associated to a twist, and that if $N_f$ is minimal among such newforms, then $L_p(\mathrm{ad}^0 f, s)$ for $p$ dividing $N_f$ is one of the following:
If $\Sigma$ is a finite set of primes, then we write $L_\Sigma(\mathrm{ad}^0 f, s)$ for the function obtained by omitting the Euler factors at the primes in $\Sigma$.
Suppose now that $\Sigma$ contains $P_f \cup S_f$ as at the end of the preceding section. We let $\omega$ denote the class in $H^1(X_\Sigma, \mathbf{C})$ associated to the holomorphic differential $2\pi i f_\Sigma(\tau)\,d\tau$ on $X_\Sigma$. We let $\omega'$ denote the class associated to the antiholomorphic differential
where $\omega^c$ is defined using $f_\Sigma^c = \sum \overline{a_n(f_\Sigma)}\,e^{2\pi i n\tau}$ instead of $f_\Sigma$.
Viewing $M$ as contained in $H^1(X_\Sigma, \mathbf{C})$, we find that the span of $x$ and $y$ coincides with that of $\omega$ and $\omega'$. We write $A$ for the matrix in $\mathrm{GL}_2(\mathbf{C})$ such that
Define the period $\Omega$ to be the determinant of $A$. (Note that because we have chosen a basis for $M$, $\Omega$ is well defined only up to a unit in $\mathcal{O}$.) Set $\delta = 3$ if $\ell$ is in $\Sigma$, 1 if $\ell \mid N_f$ but $\ell \notin \Sigma$, and 0 otherwise.
Hida’s formula can then be stated as follows:
Theorem 4.1. $C^{\mathrm{coh}}_{f,\Sigma}$ is generated by
The proof uses results of Shimura to express the Petersson inner product of $f$ with itself in terms of the value of the L-function. In particular, the ratio is an element of $\mathcal{O}$.
Recall that we have assumed here that Σ contains P ƒ∪S ƒ, but the formula actually holds assuming only that Σ contains S ƒ. However, we have not explained how to define Ω in that situation. We shall see that in fact
Theorem 4.2. Ω is independent of Σ.
So we could use any $\Sigma$ containing $S_f \cup P_f$ to define $\Omega$. From the theorem, we also see precisely how $C^{\mathrm{coh}}_{f,\Sigma}$ varies with $\Sigma$: adding primes other than $\ell$ to $\Sigma$ simply corresponds to dropping the corresponding Euler factors from the L-function. Furthermore, we shall see that the congruences established by Ribet are related to the theorem, which is essentially a reformulation of Wiles’ generalization (3) of Ribet’s result.
Dropping Euler Factors
Ribet’s result (2) on “raising the level” is the following theorem:
Theorem 5.1. If $p$ does not divide $N_f$, then the following are equivalent:
(a) There exists $g$ such that $f \sim g$, $\chi_f = \chi_g$ and $N_g = dp$ for some divisor $d$ of $N_f$.
(b) The congruence $a_p(f)^2 \equiv \chi_f(p)(p + 1)^2 \bmod \lambda$ holds.
The introductory example is a congruence as in the theorem. We take $p = 7$ and $\lambda$ dividing 3. Because $a_7(f) = -2$, we have $a_7(f)^2 = 4 \equiv 64 = (7 + 1)^2 \bmod 3$, so there must be a form $g$ congruent to $f$ with $N_g = 77$ (because $N_g = 7$ is impossible).
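Condition (b) is easy to test numerically. The sketch below assumes $K = \mathbf{Q}$, so that $\lambda$ is generated by the rational prime $\ell$, and a trivial character; the function name is illustrative. Apart from $a_7(f) = -2$, which is quoted in the text, the listed coefficients of the level 11 newform are assumed inputs (standard values of its $q$-expansion).

```python
def satisfies_level_raising(p, a_p, ell, chi_p=1):
    """Test a_p^2 ≡ chi(p) * (p + 1)^2 (mod ell) for rational integer data."""
    return (a_p * a_p - chi_p * (p + 1) ** 2) % ell == 0


# Assumed Fourier coefficients a_p of the weight 2 newform of level 11
level11_a_p = {2: -2, 5: 1, 7: -2, 13: 4}

for p, a_p in level11_a_p.items():
    print(p, satisfies_level_raising(p, a_p, ell=3))
```

With these inputs the condition holds at $p = 7$ and $p = 13$ and fails at $p = 2$ and $p = 5$, so modulo 3 the theorem predicts level raising at 7 and 13 among these primes.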
The direction (a) ⇒ (b) of the theorem follows from consideration of the representation $\bar\rho_f$. We give the idea of the proof in the case $p \neq \ell$: If there exists a $g$ as in the theorem, then the ratio of the eigenvalues of $\bar\rho_f(\mathrm{Frob}_p)$ must be $p^{\pm 1} \bmod \lambda$. Then one applies the formula
The direction (b) ⇒ (a) is closely related to Theorem 4.2, which shows that
if $p$ is not in $\Sigma$ and does not divide $N_f$. Ribet’s proof relies on a comparison of cohomology congruence ideals, but his setup is slightly different from the one here. He compares cohomology congruence ideals at level $N_f$ and $N_f p$, with the result that the factor of $p - 1$ does not occur.
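The link between the eigenvalue ratio, condition (b), and the Euler factor at $p$ can be made explicit with an elementary identity. This is a sketch, not necessarily the formula the argument above applies: it assumes only that $\alpha_p, \beta_p$ are the roots of $X^2 - a_p(f)X + \chi_f(p)p$ and that the Euler factor at a good prime has the usual form.

```latex
\begin{align*}
a_p(f)^2 &= (\alpha_p + \beta_p)^2
          = \chi_f(p)\,p\,\bigl(\alpha_p\beta_p^{-1} + 2 + \alpha_p^{-1}\beta_p\bigr), \\
\alpha_p\beta_p^{-1} \equiv p^{\pm 1} \bmod \lambda
  \;&\Longrightarrow\;
  a_p(f)^2 \equiv \chi_f(p)\,p\,(p + 2 + p^{-1}) = \chi_f(p)(p+1)^2 \bmod \lambda, \\
L_p(\mathrm{ad}^0\rho_f, 1)^{-1}
  &= (1 - p^{-1})\bigl(1 - (\alpha_p\beta_p^{-1} + \alpha_p^{-1}\beta_p)p^{-1} + p^{-2}\bigr) \\
  &= \frac{(p-1)\bigl((p+1)^2 - \chi_f(p)^{-1}a_p(f)^2\bigr)}{p^3}.
\end{align*}
```

For $p \neq \ell$ the inverse Euler factor therefore differs from $(p-1)\bigl(a_p(f)^2 - \chi_f(p)(p+1)^2\bigr)$ by a unit in $\mathcal{O}$, which is where a factor of $p - 1$ enters. In the level 11 example with $p = 7$ and $a_7(f) = -2$, the value is $6 \cdot 60/343 = 360/343$.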
To prove Theorem 4.2, one defines a certain $\mathbf{T}_{\Sigma'}$-linear injection
It is defined so that $\varphi(M) \subset M'$, where $'$ indicates we are using $\Sigma'$ instead of $\Sigma$. We may even normalize the map so that this restriction, tensored with $\mathbf{C}$, sends $f_\Sigma$ to $f_{\Sigma'}$, i.e., the map drops Euler factors. The key ingredient in the proof of independence is the following generalization by Wiles of a lemma of Ribet:
Lemma 5.2. φ has torsion-free cokernel.
This is proved using a result of Ihara whose role in the comparison of cohomology congruence ideals is identified in Ribet’s work.
It follows that φ induces an isomorphism M → M′, and we conclude that A = A′ using φ(x),φ(y) as a basis for M′. From Theorem 4.2 we deduce:
Corollary 5.3. Suppose that $\Sigma' \supset \Sigma \supset P_f \cup S_f$. If $\ell$ is not in $\Sigma' - \Sigma$, then let $\varepsilon = 0$. Otherwise let $\varepsilon = 2$ or 3 according to whether or not $N_f$ is divisible by $\ell$. Then
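The exponent in the corollary can be compared with the exponent $\delta$ defined before Theorem 4.1. The sketch below only checks, case by case, that the stated values satisfy $\varepsilon = \delta(\Sigma') - \delta(\Sigma)$; this relation is an observation about the two definitions rather than a statement quoted from the text, and the function names are illustrative.

```python
def delta(ell_in_sigma, ell_divides_level):
    """delta = 3 if ell is in Sigma, 1 if ell | N_f but ell is not in Sigma, else 0."""
    if ell_in_sigma:
        return 3
    return 1 if ell_divides_level else 0


def epsilon(ell_in_sigma, ell_in_sigma_prime, ell_divides_level):
    """epsilon from the corollary, for Sigma contained in Sigma'."""
    # ell lies in Sigma' - Sigma exactly when it is in Sigma' but not in Sigma
    if not (ell_in_sigma_prime and not ell_in_sigma):
        return 0
    return 2 if ell_divides_level else 3


# Since Sigma is contained in Sigma', ell cannot be in Sigma without being in Sigma'
cases = [(False, False), (False, True), (True, True)]
consistent = all(
    epsilon(s, sp, d) == delta(sp, d) - delta(s, d)
    for d in (False, True)
    for s, sp in cases
)
print(consistent)
```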
Relation with Selmer Groups
Using Mazur’s theory of deformations of Galois representations, one associates a ring $R_\Sigma$ and a universal deformation
of $\bar\rho_f$ minimally ramified outside $\Sigma$ (see ref. 6). Here we work over the completion $\hat{\mathcal{O}}$ of $\mathcal{O}$, which we view as contained in $\bar{\mathbf{Q}}_\ell$. Supposing that $\Sigma$ contains $S_f$, we obtain a homomorphism $\pi_{f,\Sigma} : R_\Sigma \to \bar{\mathbf{Q}}_\ell$ from $\rho_f$ and the universal property. The $\hat{\mathcal{O}}$-module
can be described using Galois cohomology. In fact we have a canonical isomorphism
where $L$ is obtained from $\mathrm{ad}^0\rho_f$. The group on the right is sometimes called a Selmer group. The subscript $\Sigma$ indicates that for $p \notin \Sigma$ the cohomology classes are supposed to restrict to elements of $H^1_f(G_p, L \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell/\mathbf{Z}_\ell)$ (as defined in ref. 8). There is also a possibly weaker condition imposed at $p = \ell$ if it is in $\Sigma$ (3, 9). The universal property of the deformation also yields a surjective homomorphism $\varphi_\Sigma$ from $R_\Sigma$ to the completion of $\mathbf{T}_\Sigma$. The key result of Wiles (3) and its generalization in (9) is that $\varphi_\Sigma$ is an isomorphism (6, 7).
This result turns out to be related to the comparison of the congruence ideal $C_{f,\Sigma}$ with the Fitting ideal of $\Phi_{f,\Sigma}$, which we denote $D_{f,\Sigma}$. (Recall that if $\Phi_{f,\Sigma}$ has finite length $d$, then its Fitting ideal is generated by $\lambda^d$, and if the length is infinite then the Fitting ideal is trivial.) On the one hand, an easy commutative algebra argument shows that
On the other hand, a deeper commutative algebra argument shows that equality holds in Eq. 5 if and only if the following hold: (a) $\varphi_\Sigma$ is an isomorphism, and (b) $\mathbf{T}_\Sigma$ is a complete intersection.
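The recollection about Fitting ideals used above can be seen from the structure theorem for finitely generated modules over a discrete valuation ring. The following is a sketch; the uniformizer $\varpi$ of $\hat{\mathcal{O}}$ and the exponents $d_i$ are notation introduced only for this illustration.

```latex
\begin{align*}
\Phi &\cong \hat{\mathcal{O}}/\varpi^{d_1} \oplus \cdots \oplus \hat{\mathcal{O}}/\varpi^{d_r}
  \quad\text{is presented by}\quad
  \mathrm{diag}(\varpi^{d_1}, \ldots, \varpi^{d_r}), \\
\mathrm{Fitt}_{\hat{\mathcal{O}}}(\Phi)
  &= \bigl(\det \mathrm{diag}(\varpi^{d_1}, \ldots, \varpi^{d_r})\bigr)
   = (\varpi^{d_1 + \cdots + d_r}) = \lambda^{d},
  \qquad d = \mathrm{length}(\Phi).
\end{align*}
```

For instance, $\hat{\mathcal{O}}/\lambda \oplus \hat{\mathcal{O}}/\lambda^2$ has Fitting ideal $\lambda^3$, while its annihilator is only $\lambda^2$; the Fitting ideal records the full length. If $\Phi$ has a free summand, every presentation matrix has a zero row and the Fitting ideal is zero.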
One first proves the two assertions in the case $\Sigma = \emptyset$, so to get started one needs the existence of $f$ such that $S_f = \emptyset$. This existence is a version of Serre’s epsilon conjecture, and the most difficult step in the proof is Ribet’s theorem on lowering the level (5). Assuming that we also have $P_f = \emptyset$, Taylor and Wiles (4) show that $\mathbf{T}_\emptyset$ is a complete intersection, and using this fact Wiles (3) shows that $\varphi_\emptyset$ is an isomorphism. Their proofs use the generalization of Mazur’s result discussed in Remark 3.2, from which we also deduce
if $\Sigma = S_f = P_f = \emptyset$.
Combining the inclusion Eq. 3 with its counterpart resulting from a Galois cohomology argument, we find that Eq. 6 holds for arbitrary $\Sigma$ provided $S_f = P_f = \emptyset$. Hence we have (a) and (b), and therefore $D_{f,\Sigma} = \hat{C}_{f,\Sigma}$, assuming only that $\Sigma \supset S_f$ and $P_f = \emptyset$. Applying the result of Remark 3.2, we get Eq. 6 as well in that case.
Remark 6.1: Improvements to these arguments, due to Faltings, Lenstra, Fujiwara, and the author (10) establish (a), (b), and Eq. 6 simultaneously (first for Σ = ∅, then in general) without appealing to Remark 3.2.
If P ƒ is not empty, then we can sometimes get empty P ƒ for a twist, but in general we appeal to ref. 9 to get (a) and (b) in the case of Σ = S ƒ = ∅, along with Eq. 3 if Σ′ = P ƒ. We conclude that
Theorem 6.2. Keep the above hypotheses and notation.
• For arbitrary $\Sigma$, (a) and (b) hold.
• If $\Sigma$ contains $S_f$, then
is a generator for $D_{f,\Sigma}$.
• If $\Sigma$ contains $S_f \cup P_f$, then Eq. 6 holds.
Remark 6.3: Coates and Flach have pointed out that one can deduce from the theorem a formula relating the order of $H^1_\emptyset(G_{\mathbf{Q}}, L \otimes_{\mathbf{Z}_\ell} \mathbf{Q}_\ell/\mathbf{Z}_\ell)$ to $L_\Sigma(\mathrm{ad}^0 f, 1)$. To relate the orders of $H^1_\emptyset$ and $H^1_\Sigma$, one uses a variant of proposition 5.14 (ii) of ref. 8. In the case of $f$ corresponding to an elliptic curve, see section 3 of ref. 11 for this variant and ref. 12 for a discussion of the relation with the Tamagawa number conjecture (8).
Acknowledgments
The author is grateful to M. Flach for comments on an earlier draft. This research was supported by the Engineering and Physical Sciences Research Council (Grant No. GR/J4761).
Footnotes
- Copyright © 1997, The National Academy of Sciences of the USA





