Theorem 9.4.1 (dirith, Dirichlet's theorem): If N∈N and (A,N)=1, then there are infinitely many primes p such that N∣p−A.If you have a copy of Shapiro, "Introduction to the Theory of Numbers", open it to p. 375, Theorem 9.4.1, because the proof follows the book closely. Otherwise, I will do my best to explain the steps, as well as a few things that Shapiro glossed over.
The major players in the proof are the von Mangoldt function Λ(n), which is defined by Λ(pα)=logp if pα is a prime power and Λ(n)=0 otherwise, the Möbius function μ(n), which is 1 if n is a squarefree number with an even number of factors, −1 if it is squarefree with an odd number of factors, and 0 otherwise, and the Dirichlet characters χ(n), which are completely multiplicative functions on Z/NZ that are nonzero on the units. Facts about these functions will be introduced as we go along.
The starting point is the following theorem (vmadivsum, Equation 9.2.13): ∑n≤xΛ(n)n=logx+O(1).
Proof: By definition, ψ(x)=∑n≤xΛ(n) (this is the second Chebyshev function), and from chpo1ub ψ(x)=O(x). Also (logfac2), log⌊x⌋!=∑n≤xΛ(n)⌊xn⌋. We can put these together via ∑n≤xΛ(n)n=1x∑n≤xΛ(n)xn=1x∑n≤xΛ(n)⌊xn⌋+1x∑n≤xΛ(n){xn}=1xlog⌊x⌋!+1xO(∑n≤xΛ(n))=1x(xlogx+O(x))+1xO(ψ(x))=logx+O(1). (We also used the asymptotic approximation of the factorial (logfaco1), log⌊x⌋!=xlogx+O(x) in this equation.)
Our goal will be to strengthen this to talk about a specific progression modN (we will hold N fixed throughout the proof) to achieve rpvmasum, Equation 9.4.3: ϕ(N)∑n≤xn≡AΛ(n)n=logx+O(1). Once we have established this, the main theorem is simple:
Theorem (rplogsum, Equation 9.4.4): ϕ(N)∑p≤xp≡Alogpp=logx+O(1).
Proof: (Note that p is always a variable over primes here.) ∑n≤xn≡AΛ(n)n−∑p≤xp≡Alogpp=∑pα≤x,α>1pα≡Alogppα≤∑pα≤x,α>1logppα≤∑p∈P∞∑α=2logppα=∑p∈Plogpp(p−1)≤∞∑n=2lognn(n−1)≤∞∑n=2√n−1n(n−1)≤∞∑n=2(2√n−1−2√n)=2 Thus ϕ(N)∑n≤x,n≡AΛ(n)n=logx+O(1) implies ϕ(N)∑p≤x,p≡Alogpp=logx+O(1). But this implies that ∑p≤x,p≡Alogpp diverges, and since a finite sum cannot diverge, this means that {p∈P∣p≡A(modN)} is infinite.
We are left with the task of establishing rpvmasum. At this point we turn to Dirichlet characters as a "better basis" from which to weight the sum. This works because of the relation (sum2dchr) ∑χˉχ(A)χ(n)={ϕ(N)n≡A0o.w. (where the sum is taken over all Dirichlet characters χ). Using this equation we can rewrite the sum as ∑n≤xn≡AΛ(n)n=∑χˉχ(A)∑n≤xχ(n)Λ(n)n.
There are two cases. The logx term comes from the principal character χ=χ0 (written χ=1 in Metamath):
Theorem (rpvmasumlem): ∑n≤xχ0(n)Λ(n)n=logx+O(1) Proof: ∑n≤xχ0(n)Λ(n)n=∑n≤x(n,N)=1Λ(n)n=∑n≤xΛ(n)n−∑n≤x(n,N)>1Λ(n)n=logx+O(1)−∑pα≤x(pα,N)>1logppα If (pα,N)>1, then p is a divisor of N, and there are only finitely many of those. Thus ∑pα≤x(pα,N)>1logppα≤∑p∣N∞∑α=1logppα=∑p∣Nlogpp−1 so ∑n≤x,(n,N)>1Λ(n)n=O(1) and ∑n≤xχ0(n)Λ(n)n=logx+O(1).
The other case (the hard case) is when χ≠χ0. Fix a nonprincipal character χ. Our goal is to show (dchrvmasum, Equation 9.4.8) ∑n≤xχ(n)Λ(n)n=O(1). From this it follows that ∑n≤xn≡AΛ(n)n=∑n≤xχ0(n)Λ(n)n+∑χ≠χ0ˉχ(A)∑n≤xχ(n)Λ(n)n=(logx+O(1))+∑χ≠χ0ˉχ(A)O(1)=logx+O(1), which will prove vmadivsum.
Theorem (vmasum, Equation 9.2.4): logn=∑d∣nΛ(d)
Proof: ∑d∣nΛ(d)=∑pα∣nlogp=∑p∑α≤νp(n)logp=∑pνp(n)logp=log∏ppνp(n)=logn where νp(n) (written in Metamath as (p pCnt n)) is the exponent of p in the prime factorization of n.
Using Möbius inversion (muinv) on this equation gives Λ(n)=∑d∣nμ(d)lognd, hence (dchrvmasumlem1) ∑n≤xχ(n)Λ(n)n=∑n≤x∑d∣nχ(n)μ(d)nlognd=∑d≤x∑m≤x/dχ(dm)μ(d)dmlogm=∑d≤xχ(d)μ(d)d∑m≤x/dχ(m)mlogm.
We have a similar equation for logx (dchrvmasum2lem): logx=∑n≤xχ(n)nlogxn∑d∣nμ(d)=∑n≤x∑d∣nχ(n)μ(d)nlogxn=∑d≤x∑m≤x/dχ(dm)μ(d)dmlogxdm=∑d≤xχ(d)μ(d)d∑m≤x/dχ(m)mlogxdm.
The first step is an application of musum to collapse the double sum to just the n=1 term. Combining these two yields the equation ∑n≤xχ(n)Λ(n)n+logx=∑d≤xχ(d)μ(d)d∑m≤x/dχ(m)mlogxd.
Our next task is to get an asymptotic expression for the sums ∑m≤zχ(m)m and ∑m≤zχ(m)mlogm appearing here. To that end, we introduce the following lemma:
Theorem (dchrisum, Theorem 9.4.1): Let f:R+→R be an eventually monotonically decreasing function (that is, for some M, M≤x≤y implies f(x)≥f(y)) which converges to 0. Then ∑nχ(n)f(n) converges to some L, and for some c, x≥M⟹|∑n≤xχ(n)f(n)−L|≤cf(x).
Proof: It suffices to show that ∑a≤n<bχ(n)f(n)=O(f(x)) (for positive integers a,b with x≤a≤b), because then the sequence is Cauchy and so converges to a limit L, and since the entire sequence above x is inside the closed set {z∈C∣|∑n≤xχ(n)f(n)−z|≤f(x)}, so is the limit.
Let R(k)=∑k−1n=0χ(n). Then R(N) sums over all the equivalence classes of Z/NZ exactly once, so R(N)=∑i∈Z/NZχ(i), and since χ is nonprincipal, by dchrsum R(N)=0. Let C≥|R(k)| for all 0≤k<N. (This is a finite set, so there is a finite supremum.) Then setting k=aN+b where b<N, we have R(k)=a−1∑i=0N−1∑n=0χ(iN+n)+b−1∑n=0χ(aN+n)=a−1∑i=0N−1∑n=0χ(n)+b−1∑n=0χ(n)=a−1∑i=0R(N)+R(b)=R(b), and C≥|R(b)|; hence C≥|R(k)| for all k.
Using summation by parts, |∑a≤n<bχ(n)f(n)|=|∑a≤n<b(R(n+1)−R(n))f(n)|=|R(b)f(b)−R(a)f(a)−∑a≤n<bR(n+1)(f(n+1)−f(n))|≤Cf(b)+Cf(a)+∑a≤n<bC|f(n+1)−f(n)|=Cf(b)+Cf(a)+∑a≤n<bC(f(n)−f(n+1))=Cf(b)+Cf(a)+C(f(a)−f(b))≤2Cf(x).
With dchrisum in hand we can apply it to the decreasing functions f(x)=logxx and f(x)=1x; let L0=∞∑n=1χ(n)nL1=∞∑n=1χ(n)lognn be the limits of the series that result.
Theorem (dchrmusum2, Lemma 9.4.2): L0∑n≤xχ(n)μ(n)n=O(1).
Proof: We have: 1=∑n≤x∑d∣nχ(n)μ(d)n=∑d≤x∑m≤x/dχ(dm)μ(d)dm=∑d≤xχ(d)μ(d)d∑m≤x/dχ(m)m
dchrisum provides that |∑m≤zχ(m)m−L0|≤Cz for some C and all z. Then: |L0∑n≤xχ(n)μ(n)n|≤1+|1−L0∑n≤xχ(n)μ(n)n|=1+|∑d≤xχ(d)μ(d)d(∑m≤x/dχ(m)m−L0)|≤1+∑d≤x1d|∑m≤x/dχ(m)m−L0|≤1+∑d≤xCx≤1+C.
We will be doing a lot of case analysis on L0=0, so let the notation [ab] denote [ab]={aL0=0bL0≠0.
Theorem (dchrvmasumif, Equation 9.4.24): ∑n≤xχ(n)Λ(n)n=−[logx0]+O(1).
Proof: Earlier, we showed that ∑n≤xχ(n)Λ(n)n+[logx0]=∑d≤xχ(d)μ(d)d∑m≤x/dχ(m)mlog[x/dm]. We will show that this latter sum is bounded. Let g(x)=∑n≤xχ(n)nlog[xn], so that the sum in question is ∑d≤xχ(d)μ(d)dg(x/d). Let us show that g(x)=[0L1]+O(logxx). In the case L0=0, we have g(x)=∑n≤xχ(n)nlogx=(L0+O(1x))logx=O(logxx), and if L0≠0, we have g(x)=∑n≤xχ(n)lognn=L1+O(logxx). Since [0L1/L0]L0=[0L1] and L0∑d≤xχ(d)μ(d)d=O(1), we also have [0L1]∑d≤xχ(d)μ(d)d=O(1). ∑n≤xχ(n)Λ(n)n+[logx0]=∑d≤xχ(d)μ(d)dg(x/d)=O(1)+∑d≤xχ(d)μ(d)d(g(x/d)−[0L1])
Now suppose the big-O constants take x≥A⟹|g(x)−[0L1]|≤Clogxx, with A≥1, and set D=∑n<AlogAn. Then x<A implies |g(x)|≤∑n≤x1n|log[xn]|≤∑n≤x1nlogA≤∑n<A1nlogA=D Then: |∑d≤xχ(d)μ(d)d(g(x/d)−[0L1])|≤∑d≤x1d|g(x/d)−[0L1]|≤∑d≤x/ACdlog(x/d)x/d+∑x/A<d≤xDd≤Cx∑d≤x(logx−logd)+D(logx−log(x/A)+1)≤Cx(xlogx−log⌊x⌋!)+D(logA+1) which is bounded using log⌊x⌋!=xlogx+O(x).
We're almost there: if we can prove L0≠0 unconditionally, dchrvmasumif will give us what we want. Replay the derivation of rpvmasum, with A=1 so that the ˉχ(A) factors disappear: ∑n≤xn≡1Λ(n)n=∑n≤xχ0(n)Λ(n)n+∑χ≠χ0∑n≤xχ(n)Λ(n)n=(logx+O(1))+∑χ≠χ0(−[logx0]+O(1))=(1−|W|)logx+O(1), where W is the set of all χ≠χ0 such that L0(χ)≠0. Since ∑n≤x,n≡1Λ(n)n is nonnegative, we have a contradiction if |W|>1 (because the (1−|W|)logx+O(1) will eventually be negative). Thus there is at most one χ such that L0(χ)=0. Now fix χ again and assume L0=0. Because ˉχ is also a Dirichlet character, and L0(ˉχ)=¯L0(χ)=0, we must have ˉχ=χ, so χ is a real Dirichlet character (dchrisum0re).
Now let F(n)=∑d∣nχ(d). Because a divisor sum of a multiplicative function is multiplicative (fsumdvdsmul), F is multiplicative (dchrisum0fmul). (Unlike χ it is not completely multiplicative.)
Theorem (dchrisum0flb, Equation 9.4.29): F(n)≥{1√n∈N0o.w..
Proof: We begin by establishing the theorem for prime powers pα, α≥0. In this case √pα∈N iff α is even. Now F(pα)=∑d∣pαχ(d)=∑k≤αχ(pk)=∑k≤αχ(p)k, and since χ is a real Dirichlet character it can only take the values −1,0,1. If χ(p)=1, then F(pα)=α+1≥1; if χ(p)=0, then F(pα)=1≥1; and if χ(p)=−1 then F(pα)=∑k≤α(−1)k=1+(−1)α2 which is 1 when α is even and 0 if it is odd. Thus in any case the theorem is proven for prime powers.
Now to cover the general case, by complete induction. If n=1 then it is a prime power so the theorem is proven; otherwise let p be a prime divisor of n, and let n=pαk where p∤, so F(n)=F(p^\alpha)F(k). We have proven that F(p^\alpha) satisfies the theorem, and F(k) satisfies the theorem by the induction hypothesis, because k<n. The product of two nonnegatives is nonnegative, so F(n)\ge0. If n is a square then \nu_p(n)=\alpha is even, so p^\alpha is a square, and so k is a quotient of squares, which is thus also a square. Thus F(p^\alpha),F(k)\ge1, so F(n)\ge1. \tag*{$\blacksquare$}
Therefore, \sum_{n\le x}\frac{F(n)}{\sqrt n}\ge\sum_{m^2\le x}\frac{F(m^2)}{m}\ge\sum_{m\le\sqrt x}\frac1m\ge\frac12\log x, so \sum_{n\le x}\frac{F(n)}{\sqrt n} is unbounded.
Theorem (dchrisumn0, Lemma 9.4.4): L_0\ne0.
Proof: We will show that L_0=0 implies \sum_{n\le x}\frac{F(n)}{\sqrt n} is bounded. We have: \begin{align} \sum_{n\le x}\frac{F(n)}{\sqrt n} &=\sum_{n\le x}\sum_{d\mid n}\frac{\chi(d)}{\sqrt n}\\ &=\sum_{d\le x}\sum_{m\le x/d}\frac{\chi(d)}{\sqrt{dm}}\\ &=\sum_{d\le\sqrt x}\sum_{m\le x/d}\frac{\chi(d)}{\sqrt{dm}} +\sum_{m\le\sqrt x}\sum_{\sqrt x<d\le x/m}\frac{\chi(d)}{\sqrt{dm}}\\ &=\sum_{d\le y}\sum_{m\le y^2/d}\frac{\chi(d)}{\sqrt{dm}} +\sum_{m\le y}\sum_{y<d\le y^2/m}\frac{\chi(d)}{\sqrt{dm}} \end{align} (changing variables y\mapsto\sqrt x in the last step). Using dchrisum once more, we obtain L_2 such that |\sum_{n\le x}\frac{\chi(d)}{\sqrt d}-L_2|\le\frac C{\sqrt x}. Then \begin{align} |\sum_{m\le y}\sum_{y<d\le y^2/m}\frac{\chi(d)}{\sqrt{dm}}| &\le\frac{2C}{\sqrt y}\sum_{m\le y}\frac1{\sqrt m}. \end{align} We need here the asymptotic estimate divsqrsum for \sum_{m\le x}\frac1{\sqrt m}=2\sqrt x+c+O(x^{-1/2}), so that we get |\sum_{m\le y}\sum_{y<d\le y^2/m}\frac{\chi(d)}{\sqrt{dm}}|\le4C+O(\frac1{\sqrt y})=O(1).
For the other sum, we can write \begin{align} \sum_{d\le y}\frac{\chi(d)}{\sqrt d}\sum_{m\le y^2/d}\frac1{\sqrt m} &=\sum_{d\le y}\frac{\chi(d)}{\sqrt d}\left(2\frac{y}{\sqrt d}+c+O(\frac{\sqrt d}{y})\right)\\ &=2y\sum_{d\le y}\frac{\chi(d)}{d}+c\sum_{d\le y}\frac{\chi(d)}{\sqrt d}+O(1)\\ &=2y(L_0+O(\frac1y))+c(L_2+O(\frac1{\sqrt y}))+O(1)=2yL_0+O(1)=O(1). \end{align}
Thus \sum_{n\le x}\frac{F(n)}{\sqrt n}=O(1), a contradiction. Thus our assumption L_0=0 was false, so L_0\ne0. \tag*{$\blacksquare$} Then dchrvmasumif simplifies to dchrvmasum, and we are done. \tag*{$\blacksquare$}
How to understand the big equation...
ReplyDelete