## Wednesday, September 24, 2014

### Mathbox update: Functions of Number Theory

I should probably apologize for having nothing but mathbox updates on this blog of late, but this is the only type of post I release regularly, in conjunction with the actual theorems being uploaded to the Metamath site. (I try to make up for it by making frequent, hopefully enlightening remarks in the list below so it's not just a list.) This update is primarily concerned with preparations for an eventual proof of the Prime Number Theorem, and actual results include the Chebyshev bound on the prime pi function $a\frac x{\log x}\le\pi(x)\le A\frac x{\log x}$, which is a weak form of the PNT (which shows that this is true for any $a<1<A$). We also define several number-theoretic functions, and formalize the notions $f\to a$ and $f\in O(1)$ for real functions (we already have a limit notion $(a_n)\to a$, but it only works on sequences).
• df-rlim: Define the limit predicate $f(x)\to a$ as $x\to\infty$. It is defined so that it will work on sequences defined on an upper integer set or on real functions; we say that $f\to a$ if $f$ is a partial function $\Bbb R\to\Bbb C$ and for all $\epsilon>0$ there exists an $M\in\Bbb R$ such that for all $x\ge M$ in the domain of $f$, $|f(x)-a|<\epsilon$. Note that if the domain of $f$ is upper bounded, then the rest is vacuously true, so $f\to a$ for every $a\in\Bbb C$ in this case. (But if the domain of $f$ is not upper bounded, then $f\to a$ for at most one $a$, as we expect.)
• df-o1: Define the set of eventually bounded functions, that is, $f\in O(1)$ iff there exists an $c,M$ such that $|f(x)|\le M$ for all $x\ge c$. The notation $O(1)$ is meant to be suggestive of Bachmann-Landau notation, and indeed it matches the notation for $f=O(1)$ (although of course we use $\in$ in place of the misleading $=$ in big-O notation). However, we do not define the complete hierarchy of notations, as $O(1)$ and the $\to 0$ predicate of df-rlim is sufficient to unify all of these, in the following manner:
$$f(x)=O(g(x))\iff g(x)=\Omega(f(x))\iff\frac{f(x)}{g(x)}\in O(1)\\ f(x)=o(g(x))\iff g(x)=\omega(f(x))\iff\frac{f(x)}{g(x)}\to0\\ f(x)=\Theta(g(x))\iff\frac{f(x)}{g(x)},\frac{g(x)}{f(x)}\in O(1)\\ f(x)\sim g(x)\iff\frac{f(x)}{g(x)}\to1$$
Thus chebbnd1, chebbnd2 together make the statement $\pi(x)\in\Theta(x/\log x)$.

Since usually the functions $g(x)$ we use in this sense are eventually non-zero, the equivalences above are exact. For functions that are not eventually non-zero, the usual definition of the statement $f(x)=O(g(x))$ also requires that $f(x)=0$ wherever $g(x)=0$ for sufficiently large $x$, while $f/g\in O(1)$ leaves $f$ unconstrained whenever $g(x)=0$ because in these cases $f(x)/g(x)$ is undefined. I don't see that this edge case is very useful, though, so the discrepancy is not enough to be relevant. (If anyone has a good example of a use for the notation $f(x)=O(g(x))$ or $f(x)=o(g(x))$ where $g$ is not eventually nonzero, I'm all ears.)
• o1const, rlimconst: The constant function $f(x)=a$ satisfies $f\in O(1)$ and $f\to a$.
• rlimclim: For functions $f:\Bbb Z^{\ge n}\to\Bbb C$, $f\to_{\Bbb Z}a\iff f\to_r a$, i.e. the old $\Bbb Z$-limit predicate df-clim and the new df-rlim are equivalent. Since df-clim applies to non-functions, but $f\to_{\Bbb Z} a\iff (x\in\Bbb N\mapsto f(x))\to_{\Bbb Z} a$, i.e. we can replace $f$ with a function that takes the same values and has the same behavior under $\to_{\Bbb Z}$, it follows that df-rlim is a generalization of df-clim.
• rlimres, o1res: If $f\to a$, then $f\upharpoonright X\to a$, that is, $f$ restricted to any subset $X$ still has limit $a$. Similarly, $f\in O(1)$ implies $f\upharpoonright X\in O(1)$. The unconstrained nature of this restriction may seem peculiar, since then it seems that even the empty function has limit $a$, but considering the note about upper bounded domain implying that $f$ takes every value as a limit, this should be no surprise.
• rlimresb: If $X=[b,\infty)$ for some $b\in\Bbb R$, then $f\to a\iff f\upharpoonright X\to a$. In other words, the limit of $f$ is not affected by its values on any upper bounded set.
• rlimuni: Since as pointed out above, $f$ often fails to have a unique limit, we may ask under what conditions we are ensured that if a limit exists it is unique. The correct answer is that the limit is unique iff the domain of $f$ has no upper bound, but here we prove the weaker assertion (which is sufficient for most applications) that the limit is unique if the domain of $f$ contains an upper integer set.
• rlimcn1, rlimcn2: One of the most important properties of continuous functions (almost a definitional property) is that it preserves limits, that is, if $f$ is continuous and $g\to a$ then $f\circ g\to f(a)$. Here we derive this for one and two-dimensional continuous functions. (This is a bit early in the development to be worrying about two-dimensional continuity, but we need it for the important continuous functions $+,-,\cdot$.)
• reccn2: Previously a lemma for the much later divcn, here we show that the reciprocal function is continuous (away from zero).
• o1of2: The property that characterizes $O(1)$-preserving functions is a bit different from continuity, and I don't know if it has a name: For all $\delta\in\Bbb R$, there exists an $\epsilon$ such that for all $x$, $|x|\le\delta\to|f(x)|\le\epsilon$. (Note that the roles of $\delta$ and $\epsilon$ is reversed from the usual definition of a limit.) The most nontrivial example of this I have seen so far is the function $f(x)=x^a$ where $a\in\Bbb C$ with $\Re a\ge0$; in this case we have the complicated bound $|x|\le\delta\to|x^a|\le\delta^{\Re a}e^{|\Im a|\pi}$. Note that $x^a$ is not even a continuous function due to the branch cut. I believe that any continuous function is also $O(1)$-preserving, but the proof seems to need the extreme value theorem and so I would not call it an "easy" theorem. $+,-,\cdot$ are all $O(1)$-preserving, of course (o1add, o1sub, o1mul), but division is not.
• rlimo1: If $f\to a$, then $f\in O(1)$. This encompasses the rules
• $f(x)=o(g(x))\implies f(x)=O(g(x))$,
• $f(x)\sim g(x)\implies f(x)=O(g(x))$
as well as a few others. Similarly, the rule $f,g\in O(1)\implies f\cdot g\in O(1)$ encompasses the rules
• $f(x)=O(g(x)),\, g(x)=O(h(x))\implies f(x)=O(h(x)),$
• $f_1(x)=O(g_1(x)),\, f_2(x)=O(g_2(x))\implies f_1(x)f_2(x)=O(g_1(x)g_2(x)),$
• $f(x)=O(g(x))\implies cf(x)=O(g(x)),$
and the sum rule implies $f_1(x),f_2(x)\in O(g(x))\implies f_1(x)+f_2(x)\in O(g(x))$. This phenomenon where many rules are expressed as special cases of one rule when written using a more restricted notation is why I support the approach taken in Metamath where multiple notations for one concept are eliminated as much as possible. For example, in Metamath there is a less-than sign $<$, but no greater-than sign, because any formula of the form $y>x$ can be rewritten as $x<y$ and it means that we don't need separate theorems dealing with transitivity of greater-than as well as those with less-than. (Even the use of $x\le y$ instead of $\lnot(y<x)$ is a controlled concession, although in this case it is used in enough contexts and helps in reading enough that it is considered worthwhile to define.)
• o1rlimmul: If $f\in O(1)$ and $g\to0$, then $f\cdot g\to0$. This expresses the rule $f(x)=O(g(x)),g(x)=o(h(x))\implies f(x)=o(h(x))$. Of course this is not true for limits other than $0$; if $f\in O(1)$ and $g\to a\ne0$, then $f\cdot g$ may not have a limit at all. It is still bounded, though; but this already follows from rlimo1 and o1mul.
• rlimadd, rlimsub, rlimmul, rlimdiv: I'm sure at this point it is no surprise that we have the theorems $f\to a,g\to b$ implies $f+g\to a+b$,$f-g\to a-b$,$f\cdot g\to a\cdot b$, and (if $g(x)\ne 0$ and $b\ne0$) $f/g\to a/b$. The last one is new, however - we had no $\to_{\Bbb Z}$-equivalent for division of limits before this point.
• divrcnv: The most basic nontrivial convergent function: if $f(x)=\frac ax$, then $f\to0$.

• logimcl: The imaginary part of $\log x$ is defined to satisfy $-\pi<\Im[\log x]\le\pi$. This, and the property $e^{\log x}=x$ (eflog), is sufficient to uniquely define the number $\log x$, and is generally needed when making arguments involving the complex logarithm function (the situation is a bit simpler for the real logarithm, which has no branch cut issues).
• lognegb: Any number with argument $\pi$ (recall that $\operatorname{Arg}z:=\Im[\log z]$) is a negative real number and vice-versa.
• relog: The dual equation to absef, which is nice for having no branch cut restrictions, unlike many other $\log$ formulas: $\Re[\log z]=\log|z|$.
• rplogcl: Common special case of logltb: if $x>1$, then $\log x>0$.
• angval: Define a function $\angle(x,y)$ which measures the angle of two vectors in the plane, using the complex numbers as a convenient proxy for $\Bbb R^2$. We define $\angle(x,y)=\operatorname{Arg}(y/x)=\Im[\log(y/x)]$. Note that $x,y$ are nonzero complex numbers, not real numbers. This measures the angle from the line $\overline{0x}$ to $\overline{0y}$, as a number in $(-\pi,\pi]$. A positive number indicates that $y$ is counterclockwise ahead of $x$, and the function is antisymmetric as long as $x,y$ are not $180^\circ$ from each other. For example, $\angle(1,1)=0$, $\angle(1,i)=\angle(i,-1)=90^\circ$, $\angle(i,1)=-90^\circ$. (I'm using degrees because I want to treat this as a tool for geometry, not complex analysis.)
• ang180: Let $a,b,c$ be three distinct points in the plane. Then they form the vertices of a (possibly degenerate) triangle, and the interior angles of the triangle are respectively $\angle(c-b,a-b)$, $\angle(a-c,b-c)$, $\angle(b-a,c-a)$. Under these circumstances, this theorem states that the sum of these angles is $\pm180^\circ$, where the sign choice depends on whether the three points are labeled in clockwise or counterclockwise order. This is actually one of the Metamath 100 theorems, so now this makes the total count 38 theorems. The proof is more intricate than I expected it to be, although it's not too hard to understand at a high level.
• abscxp, abscxp2, abscxpbnd: I was disappointed to discover there is no nice formula for $|a^b|$ in the general case. If $a$ is positive real, then $|a^b|=a^{\Re b}$; and if $b$ is real, then $|a^b|=|a|^b$. If both $a,b$ are complex, then $|a^b|=|a|^{\Re b}e^{-\operatorname{Arg}a\cdot\Im b}$; as a bound we can simplify this to $|a^b|\le |a|^{\Re b}e^{|\Im b|\pi}$ or $|a^b|\le |a|^{\Re b}e^{|b|\pi}$ (which is the form we quoted earlier).
• cxplim: In this section we establish the basic facts about the order of growth of powers, logarithms, and exponentials. If $a<0$ then $x^a\to0$ as $x\to\infty$.
• rlimcxp, o1cxp: If $f(x)\to0$ and $a>0$, then $f(x)^a\to0$, and similarly $f(x)\in O(1)$ and $\Re a\ge0$ imply $f(x)^a\in O(1)$.
• cxp2limlem: If $b>1$ then $x\in o(b^x)$, i.e. $x/b^x\to 0$.
• cxp2lim: If $a\in\Bbb R$ and $b>1$ then $x^a\in o(b^x)$, i.e. $x^a/b^x\to 0$. In words, any increasing exponential function eventually dominates every polynomial.
• cxploglim: If $a>0$, then $\log x\in o(x^a)$ - a logarithm grows slower than any polynomial.
• df-cht: Define the first Chebyshev function, $\theta(x)=\sum_{p\le x}\log p$, where the sum is only over primes.
• df-ppi: Define the prime $\pi$ function, $\pi(x)=\sum_{p\le x}1=|[0,x]\cap\Bbb P|$.
• df-mmu: Define the MÃ¶bius function, $\mu(x)=0$ if $x$ is not squarefree and $\mu(x)=(-1)^n$ where $n$ is the number of prime factors of $x$ otherwise.
• df-sgm: Define the divisor function, $\sigma(x)=$ the number of divisors of $x$.
• efchtcl: The Chebyshev function is closed in the log-integers, i.e. $e^{\theta(x)}\in\Bbb N$. (This is one of the most unusual sets I have applied fsumcllem to, showing that the set $\{x\in\Bbb R:e^x\in\Bbb N\}$ is closed under addition and zero as a consequence of the natural numbers being closed under multiplication and one.)
• efchtdvds: The exponential Chebyshev function is actually the same as the primorial, $e^{\theta(x)}=x\#=\prod_{p\le x}p$, so it's not surprising that it shares all the properties of the primorial, like this one, that says that $e^{\theta(x)}|e^{\theta(y)}$ whenever $x\le y$.
• ppifl: The Chebyshev and prime $\pi$ functions are actually functions on the reals, not the integers; we extend them to the reals by means of this theorem, $\pi(x)=\pi(\lfloor x\rfloor)$.
• ppip1le: A local version of ppiltx, showing that $\pi(x+1)\le\pi(x)+1$. The obvious extension of this formula, $\pi(x+n)\le\pi(x)+n$, is only true when $n$ is an integer, because if it was true for all $n\in\Bbb R$ then $\pi$ would have to be continuous, and of course it has jump discontinuities at the primes.
• ppiltx: A simple cardinality argument shows $\pi(x)<x$.
• prmorcht: As remarked earlier, the primorial is the same as the exponential of the Chebyshev function, $e^{\theta(x)}=x\#$. Later, in the proof of bpos, we will make a careful analysis of the primorial function in order to establish a bound; now this proof has been converted to show a bound on $\theta(x)$ instead.
• ppiub: This theorem used to be called prmpiub, and stated as $|\Bbb P\cap\{1\dots n\}|\le\frac n3+2$ and used as part of the proof of bpos. Now that we have defined a function for this, we can just write $\pi(x)\le\frac x3+2$, although the content hasn't changed much beyond being extended to the reals.
• chtleppi: A simple sum inequality shows $$\theta(x)=\sum_{p\le x}\log p\le\sum_{p\le x}\log x=\pi(x)\log x.$$
• chtub: Previously called prmorub and stated as $\prod_{p\le n}p<2^{2n-3}$, now we can write $\theta(x)<(2x-3)\log 2$ and extend the domain to all reals greater than $2$.
• chebbnd1: The largest original contribution in this update aside from ang180, we show that $\pi(x)\in\Omega(x/\log x)$ by borrowing a lot of lemmas that were used for bpos. We have $4^n/n<{2n\choose n}=\prod_pp^{e_p}$, where the exponents $e_p=\#_p{2n\choose n}$ of the product all satisfy $p^{e_p}\le 2n$, by bposlem1. Then no prime larger than $2n$ enters the product, and each term is bounded, so the largest that sum can be is $(2n)^{\pi(2n)}$, and after taking logs you get $\pi(2n)\frac{\log(2n)}{2n}>\frac{\log 4}2-\frac{\log n}{2n}\ge\frac{\log 4}2-\frac1{2e}$, and with a little relaxing of the constants you can extend this to all real numbers, so in the end you get that $\pi(x)\in\Omega(x/\log x)$. The other inequality goes via $\theta(x)$, as we will see.
• chtppilim: Here we show that $\theta(x)\sim\pi(x)\log x$. One inequality, chtleppi, is easy and mentioned above; the other direction fixes $0<a<1$ and uses the trick $$\theta(x)\ge\sum_{x^a<p\le x}\log p\ge\sum_{x^a<p\le x}a\log x\ge a\log x(\pi(x)-x^a),$$ and since $x^a\in o(x/\log x)$ and $x/\log x\in O(\pi(x))$, we can make $x^a\le (1-a)\pi(x)$ and then $\theta(x)\ge a^2\log x\,\pi(x)$; now since $a<1$ was arbitrary, we have the desired statement $\frac{\theta(x)}{\pi(x)\log x}\to1$.
• chto1ub: A simple consequence of chtub: Since $\theta(x)<(2x-3)\log 2<2x\log 2$, it immediately follows that $\theta(x)\in O(x)$. The PNT will strengthen this to $\theta(x)/x\to1$.
• chebbnd2: The other direction of chebbnd1, showing $\pi(x)\in O(x/\log x)$, by leveraging chto1ub and chtppilim: since $\theta(x)\in O(x)$ and $\theta(x)\sim\pi(x)\log x$, $\pi(x)\log x\in O(x)$, or $\pi(x)\in O(x/\log x)$.