Loading [MathJax]/jax/element/mml/optable/MathOperators.js

Tuesday, May 17, 2016

Dirichlet's theorem

There have been some calls for an informal exposition of the recent formal proof of Dirichlet's theorem:
Theorem 9.4.1 (dirith, Dirichlet's theorem): If NN and (A,N)=1, then there are infinitely many primes p such that NpA.
If you have a copy of Shapiro, "Introduction to the Theory of Numbers", open it to p. 375, Theorem 9.4.1, because the proof follows the book closely. Otherwise, I will do my best to explain the steps, as well as a few things that Shapiro glossed over.

The major players in the proof are the von Mangoldt function Λ(n), which is defined by Λ(pα)=logp if pα is a prime power and Λ(n)=0 otherwise, the Möbius function μ(n), which is 1 if n is a squarefree number with an even number of factors, 1 if it is squarefree with an odd number of factors, and 0 otherwise, and the Dirichlet characters χ(n), which are completely multiplicative functions on Z/NZ that are nonzero on the units. Facts about these functions will be introduced as we go along.

The starting point is the following theorem (vmadivsum, Equation 9.2.13): nxΛ(n)n=logx+O(1).
Proof: By definition, ψ(x)=nxΛ(n) (this is the second Chebyshev function), and from chpo1ub ψ(x)=O(x). Also (logfac2), logx!=nxΛ(n)xn. We can put these together via nxΛ(n)n=1xnxΛ(n)xn=1xnxΛ(n)xn+1xnxΛ(n){xn}=1xlogx!+1xO(nxΛ(n))=1x(xlogx+O(x))+1xO(ψ(x))=logx+O(1). (We also used the asymptotic approximation of the factorial (logfaco1), logx!=xlogx+O(x) in this equation.)

Our goal will be to strengthen this to talk about a specific progression modN (we will hold N fixed throughout the proof) to achieve rpvmasum, Equation 9.4.3: ϕ(N)nxnAΛ(n)n=logx+O(1). Once we have established this, the main theorem is simple:

Theorem (rplogsum, Equation 9.4.4): ϕ(N)pxpAlogpp=logx+O(1).
Proof: (Note that p is always a variable over primes here.) nxnAΛ(n)npxpAlogpp=pαx,α>1pαAlogppαpαx,α>1logppαpPα=2logppα=pPlogpp(p1)n=2lognn(n1)n=2n1n(n1)n=2(2n12n)=2 Thus ϕ(N)nx,nAΛ(n)n=logx+O(1) implies ϕ(N)px,pAlogpp=logx+O(1). But this implies that px,pAlogpp diverges, and since a finite sum cannot diverge, this means that {pPpA(modN)} is infinite.

We are left with the task of establishing rpvmasum. At this point we turn to Dirichlet characters as a "better basis" from which to weight the sum. This works because of the relation (sum2dchr) χˉχ(A)χ(n)={ϕ(N)nA0o.w. (where the sum is taken over all Dirichlet characters χ). Using this equation we can rewrite the sum as nxnAΛ(n)n=χˉχ(A)nxχ(n)Λ(n)n.
There are two cases. The logx term comes from the principal character χ=χ0 (written χ=1 in Metamath):

Theorem (rpvmasumlem): nxχ0(n)Λ(n)n=logx+O(1) Proof: nxχ0(n)Λ(n)n=nx(n,N)=1Λ(n)n=nxΛ(n)nnx(n,N)>1Λ(n)n=logx+O(1)pαx(pα,N)>1logppα If (pα,N)>1, then p is a divisor of N, and there are only finitely many of those. Thus pαx(pα,N)>1logppαpNα=1logppα=pNlogpp1 so nx,(n,N)>1Λ(n)n=O(1) and nxχ0(n)Λ(n)n=logx+O(1).

The other case (the hard case) is when χχ0. Fix a nonprincipal character χ. Our goal is to show (dchrvmasum, Equation 9.4.8) nxχ(n)Λ(n)n=O(1). From this it follows that nxnAΛ(n)n=nxχ0(n)Λ(n)n+χχ0ˉχ(A)nxχ(n)Λ(n)n=(logx+O(1))+χχ0ˉχ(A)O(1)=logx+O(1), which will prove vmadivsum.

Theorem (vmasum, Equation 9.2.4): logn=dnΛ(d)
Proof: dnΛ(d)=pαnlogp=pανp(n)logp=pνp(n)logp=logppνp(n)=logn where νp(n) (written in Metamath as (p pCnt n)) is the exponent of p in the prime factorization of n.

Using Möbius inversion (muinv) on this equation gives Λ(n)=dnμ(d)lognd, hence (dchrvmasumlem1) nxχ(n)Λ(n)n=nxdnχ(n)μ(d)nlognd=dxmx/dχ(dm)μ(d)dmlogm=dxχ(d)μ(d)dmx/dχ(m)mlogm.
We have a similar equation for logx (dchrvmasum2lem): logx=nxχ(n)nlogxndnμ(d)=nxdnχ(n)μ(d)nlogxn=dxmx/dχ(dm)μ(d)dmlogxdm=dxχ(d)μ(d)dmx/dχ(m)mlogxdm.
The first step is an application of musum to collapse the double sum to just the n=1 term. Combining these two yields the equation nxχ(n)Λ(n)n+logx=dxχ(d)μ(d)dmx/dχ(m)mlogxd.
Our next task is to get an asymptotic expression for the sums mzχ(m)m and mzχ(m)mlogm appearing here. To that end, we introduce the following lemma:

Theorem (dchrisum, Theorem 9.4.1): Let f:R+R be an eventually monotonically decreasing function (that is, for some M, Mxy implies f(x)f(y)) which converges to 0. Then nχ(n)f(n) converges to some L, and for some c, xM|nxχ(n)f(n)L|cf(x).
Proof: It suffices to show that an<bχ(n)f(n)=O(f(x)) (for positive integers a,b with xab), because then the sequence is Cauchy and so converges to a limit L, and since the entire sequence above x is inside the closed set {zC|nxχ(n)f(n)z|f(x)}, so is the limit.

Let R(k)=k1n=0χ(n). Then R(N) sums over all the equivalence classes of Z/NZ exactly once, so R(N)=iZ/NZχ(i), and since χ is nonprincipal, by dchrsum R(N)=0. Let C|R(k)| for all 0k<N. (This is a finite set, so there is a finite supremum.) Then setting k=aN+b where b<N, we have R(k)=a1i=0N1n=0χ(iN+n)+b1n=0χ(aN+n)=a1i=0N1n=0χ(n)+b1n=0χ(n)=a1i=0R(N)+R(b)=R(b), and C|R(b)|; hence C|R(k)| for all k.

Using summation by parts, |an<bχ(n)f(n)|=|an<b(R(n+1)R(n))f(n)|=|R(b)f(b)R(a)f(a)an<bR(n+1)(f(n+1)f(n))|Cf(b)+Cf(a)+an<bC|f(n+1)f(n)|=Cf(b)+Cf(a)+an<bC(f(n)f(n+1))=Cf(b)+Cf(a)+C(f(a)f(b))2Cf(x).

With dchrisum in hand we can apply it to the decreasing functions f(x)=logxx and f(x)=1x; let L0=n=1χ(n)nL1=n=1χ(n)lognn be the limits of the series that result.

Theorem (dchrmusum2, Lemma 9.4.2): L0nxχ(n)μ(n)n=O(1).
Proof: We have: 1=nxdnχ(n)μ(d)n=dxmx/dχ(dm)μ(d)dm=dxχ(d)μ(d)dmx/dχ(m)m
dchrisum provides that |mzχ(m)mL0|Cz for some C and all z. Then: |L0nxχ(n)μ(n)n|1+|1L0nxχ(n)μ(n)n|=1+|dxχ(d)μ(d)d(mx/dχ(m)mL0)|1+dx1d|mx/dχ(m)mL0|1+dxCx1+C.

We will be doing a lot of case analysis on L0=0, so let the notation [ab] denote [ab]={aL0=0bL00.
Theorem (dchrvmasumif, Equation 9.4.24): nxχ(n)Λ(n)n=[logx0]+O(1).
Proof: Earlier, we showed that nxχ(n)Λ(n)n+[logx0]=dxχ(d)μ(d)dmx/dχ(m)mlog[x/dm]. We will show that this latter sum is bounded. Let g(x)=nxχ(n)nlog[xn], so that the sum in question is dxχ(d)μ(d)dg(x/d). Let us show that g(x)=[0L1]+O(logxx). In the case L0=0, we have g(x)=nxχ(n)nlogx=(L0+O(1x))logx=O(logxx), and if L00, we have g(x)=nxχ(n)lognn=L1+O(logxx). Since [0L1/L0]L0=[0L1] and L0dxχ(d)μ(d)d=O(1), we also have [0L1]dxχ(d)μ(d)d=O(1). nxχ(n)Λ(n)n+[logx0]=dxχ(d)μ(d)dg(x/d)=O(1)+dxχ(d)μ(d)d(g(x/d)[0L1])
Now suppose the big-O constants take xA|g(x)[0L1]|Clogxx, with A1, and set D=n<AlogAn. Then x<A implies |g(x)|nx1n|log[xn]|nx1nlogAn<A1nlogA=D Then: |dxχ(d)μ(d)d(g(x/d)[0L1])|dx1d|g(x/d)[0L1]|dx/ACdlog(x/d)x/d+x/A<dxDdCxdx(logxlogd)+D(logxlog(x/A)+1)Cx(xlogxlogx!)+D(logA+1) which is bounded using logx!=xlogx+O(x).

We're almost there: if we can prove L00 unconditionally, dchrvmasumif will give us what we want. Replay the derivation of rpvmasum, with A=1 so that the ˉχ(A) factors disappear: nxn1Λ(n)n=nxχ0(n)Λ(n)n+χχ0nxχ(n)Λ(n)n=(logx+O(1))+χχ0([logx0]+O(1))=(1|W|)logx+O(1), where W is the set of all χχ0 such that L0(χ)0. Since nx,n1Λ(n)n is nonnegative, we have a contradiction if |W|>1 (because the (1|W|)logx+O(1) will eventually be negative). Thus there is at most one χ such that L0(χ)=0. Now fix χ again and assume L0=0. Because ˉχ is also a Dirichlet character, and L0(ˉχ)=¯L0(χ)=0, we must have ˉχ=χ, so χ is a real Dirichlet character (dchrisum0re).

Now let F(n)=dnχ(d). Because a divisor sum of a multiplicative function is multiplicative (fsumdvdsmul), F is multiplicative (dchrisum0fmul). (Unlike χ it is not completely multiplicative.)

Theorem (dchrisum0flb, Equation 9.4.29): F(n){1nN0o.w..
Proof: We begin by establishing the theorem for prime powers pα, α0. In this case pαN iff α is even. Now F(pα)=dpαχ(d)=kαχ(pk)=kαχ(p)k, and since χ is a real Dirichlet character it can only take the values 1,0,1. If χ(p)=1, then F(pα)=α+11; if χ(p)=0, then F(pα)=11; and if χ(p)=1 then F(pα)=kα(1)k=1+(1)α2 which is 1 when α is even and 0 if it is odd. Thus in any case the theorem is proven for prime powers.

Now to cover the general case, by complete induction. If n=1 then it is a prime power so the theorem is proven; otherwise let p be a prime divisor of n, and let n=pαk where p, so F(n)=F(p^\alpha)F(k). We have proven that F(p^\alpha) satisfies the theorem, and F(k) satisfies the theorem by the induction hypothesis, because k<n. The product of two nonnegatives is nonnegative, so F(n)\ge0. If n is a square then \nu_p(n)=\alpha is even, so p^\alpha is a square, and so k is a quotient of squares, which is thus also a square. Thus F(p^\alpha),F(k)\ge1, so F(n)\ge1. \tag*{$\blacksquare$}

Therefore, \sum_{n\le x}\frac{F(n)}{\sqrt n}\ge\sum_{m^2\le x}\frac{F(m^2)}{m}\ge\sum_{m\le\sqrt x}\frac1m\ge\frac12\log x, so \sum_{n\le x}\frac{F(n)}{\sqrt n} is unbounded.

Theorem (dchrisumn0, Lemma 9.4.4): L_0\ne0.
Proof: We will show that L_0=0 implies \sum_{n\le x}\frac{F(n)}{\sqrt n} is bounded. We have: \begin{align} \sum_{n\le x}\frac{F(n)}{\sqrt n} &=\sum_{n\le x}\sum_{d\mid n}\frac{\chi(d)}{\sqrt n}\\ &=\sum_{d\le x}\sum_{m\le x/d}\frac{\chi(d)}{\sqrt{dm}}\\ &=\sum_{d\le\sqrt x}\sum_{m\le x/d}\frac{\chi(d)}{\sqrt{dm}} +\sum_{m\le\sqrt x}\sum_{\sqrt x<d\le x/m}\frac{\chi(d)}{\sqrt{dm}}\\ &=\sum_{d\le y}\sum_{m\le y^2/d}\frac{\chi(d)}{\sqrt{dm}} +\sum_{m\le y}\sum_{y<d\le y^2/m}\frac{\chi(d)}{\sqrt{dm}} \end{align} (changing variables y\mapsto\sqrt x in the last step). Using dchrisum once more, we obtain L_2 such that |\sum_{n\le x}\frac{\chi(d)}{\sqrt d}-L_2|\le\frac C{\sqrt x}. Then \begin{align} |\sum_{m\le y}\sum_{y<d\le y^2/m}\frac{\chi(d)}{\sqrt{dm}}| &\le\frac{2C}{\sqrt y}\sum_{m\le y}\frac1{\sqrt m}. \end{align} We need here the asymptotic estimate divsqrsum for \sum_{m\le x}\frac1{\sqrt m}=2\sqrt x+c+O(x^{-1/2}), so that we get |\sum_{m\le y}\sum_{y<d\le y^2/m}\frac{\chi(d)}{\sqrt{dm}}|\le4C+O(\frac1{\sqrt y})=O(1).

For the other sum, we can write \begin{align} \sum_{d\le y}\frac{\chi(d)}{\sqrt d}\sum_{m\le y^2/d}\frac1{\sqrt m} &=\sum_{d\le y}\frac{\chi(d)}{\sqrt d}\left(2\frac{y}{\sqrt d}+c+O(\frac{\sqrt d}{y})\right)\\ &=2y\sum_{d\le y}\frac{\chi(d)}{d}+c\sum_{d\le y}\frac{\chi(d)}{\sqrt d}+O(1)\\ &=2y(L_0+O(\frac1y))+c(L_2+O(\frac1{\sqrt y}))+O(1)=2yL_0+O(1)=O(1). \end{align}
Thus \sum_{n\le x}\frac{F(n)}{\sqrt n}=O(1), a contradiction. Thus our assumption L_0=0 was false, so L_0\ne0. \tag*{$\blacksquare$} Then dchrvmasumif simplifies to dchrvmasum, and we are done. \tag*{$\blacksquare$}

1 comment: