Wednesday, September 24, 2014

Mathbox update: Functions of Number Theory

I should probably apologize for having nothing but mathbox updates on this blog of late, but this is the only type of post I release regularly, in conjunction with the actual theorems being uploaded to the Metamath site. (I try to make up for it by making frequent, hopefully enlightening remarks in the list below so it's not just a list.) This update is primarily concerned with preparations for an eventual proof of the Prime Number Theorem, and actual results include the Chebyshev bound on the prime pi function $a\frac x{\log x}\le\pi(x)\le A\frac x{\log x}$, which is a weak form of the PNT (which shows that this is true for any $a<1<A$). We also define several number-theoretic functions, and formalize the notions $f\to a$ and $f\in O(1)$ for real functions (we already have a limit notion $(a_n)\to a$, but it only works on sequences).
  • df-rlim: Define the limit predicate $f(x)\to a$ as $x\to\infty$. It is defined so that it will work on sequences defined on an upper integer set or on real functions; we say that $f\to a$ if $f$ is a partial function $\Bbb R\to\Bbb C$ and for all $\epsilon>0$ there exists an $M\in\Bbb R$ such that for all $x\ge M$ in the domain of $f$, $|f(x)-a|<\epsilon$. Note that if the domain of $f$ is upper bounded, then the rest is vacuously true, so $f\to a$ for every $a\in\Bbb C$ in this case. (But if the domain of $f$ is not upper bounded, then $f\to a$ for at most one $a$, as we expect.)
  • df-o1: Define the set of eventually bounded functions, that is, $f\in O(1)$ iff there exists an $c,M$ such that $|f(x)|\le M$ for all $x\ge c$. The notation $O(1)$ is meant to be suggestive of Bachmann-Landau notation, and indeed it matches the notation for $f=O(1)$ (although of course we use $\in$ in place of the misleading $=$ in big-O notation). However, we do not define the complete hierarchy of notations, as $O(1)$ and the $\to 0$ predicate of df-rlim is sufficient to unify all of these, in the following manner:
    $$f(x)=O(g(x))\iff g(x)=\Omega(f(x))\iff\frac{f(x)}{g(x)}\in O(1)\\ f(x)=o(g(x))\iff g(x)=\omega(f(x))\iff\frac{f(x)}{g(x)}\to0\\ f(x)=\Theta(g(x))\iff\frac{f(x)}{g(x)},\frac{g(x)}{f(x)}\in O(1)\\ f(x)\sim g(x)\iff\frac{f(x)}{g(x)}\to1$$
    Thus chebbnd1, chebbnd2 together make the statement $\pi(x)\in\Theta(x/\log x)$.

    Since usually the functions $g(x)$ we use in this sense are eventually non-zero, the equivalences above are exact. For functions that are not eventually non-zero, the usual definition of the statement $f(x)=O(g(x))$ also requires that $f(x)=0$ wherever $g(x)=0$ for sufficiently large $x$, while $f/g\in O(1)$ leaves $f$ unconstrained whenever $g(x)=0$ because in these cases $f(x)/g(x)$ is undefined. I don't see that this edge case is very useful, though, so the discrepancy is not enough to be relevant. (If anyone has a good example of a use for the notation $f(x)=O(g(x))$ or $f(x)=o(g(x))$ where $g$ is not eventually nonzero, I'm all ears.)
  • o1const, rlimconst: The constant function $f(x)=a$ satisfies $f\in O(1)$ and $f\to a$.
  • rlimclim: For functions $f:\Bbb Z^{\ge n}\to\Bbb C$, $f\to_{\Bbb Z}a\iff f\to_r a$, i.e. the old $\Bbb Z$-limit predicate df-clim and the new df-rlim are equivalent. Since df-clim applies to non-functions, but $f\to_{\Bbb Z} a\iff (x\in\Bbb N\mapsto f(x))\to_{\Bbb Z} a$, i.e. we can replace $f$ with a function that takes the same values and has the same behavior under $\to_{\Bbb Z}$, it follows that df-rlim is a generalization of df-clim.
  • rlimres, o1res: If $f\to a$, then $f\upharpoonright X\to a$, that is, $f$ restricted to any subset $X$ still has limit $a$. Similarly, $f\in O(1)$ implies $f\upharpoonright X\in O(1)$. The unconstrained nature of this restriction may seem peculiar, since then it seems that even the empty function has limit $a$, but considering the note about upper bounded domain implying that $f$ takes every value as a limit, this should be no surprise.
  • rlimresb: If $X=[b,\infty)$ for some $b\in\Bbb R$, then $f\to a\iff f\upharpoonright X\to a$. In other words, the limit of $f$ is not affected by its values on any upper bounded set.
  • rlimuni: Since as pointed out above, $f$ often fails to have a unique limit, we may ask under what conditions we are ensured that if a limit exists it is unique. The correct answer is that the limit is unique iff the domain of $f$ has no upper bound, but here we prove the weaker assertion (which is sufficient for most applications) that the limit is unique if the domain of $f$ contains an upper integer set.
  • rlimcn1, rlimcn2: One of the most important properties of continuous functions (almost a definitional property) is that it preserves limits, that is, if $f$ is continuous and $g\to a$ then $f\circ g\to f(a)$. Here we derive this for one and two-dimensional continuous functions. (This is a bit early in the development to be worrying about two-dimensional continuity, but we need it for the important continuous functions $+,-,\cdot$.)
  • reccn2: Previously a lemma for the much later divcn, here we show that the reciprocal function is continuous (away from zero).
  • o1of2: The property that characterizes $O(1)$-preserving functions is a bit different from continuity, and I don't know if it has a name: For all $\delta\in\Bbb R$, there exists an $\epsilon$ such that for all $x$, $|x|\le\delta\to|f(x)|\le\epsilon$. (Note that the roles of $\delta$ and $\epsilon$ is reversed from the usual definition of a limit.) The most nontrivial example of this I have seen so far is the function $f(x)=x^a$ where $a\in\Bbb C$ with $\Re a\ge0$; in this case we have the complicated bound $|x|\le\delta\to|x^a|\le\delta^{\Re a}e^{|\Im a|\pi}$. Note that $x^a$ is not even a continuous function due to the branch cut. I believe that any continuous function is also $O(1)$-preserving, but the proof seems to need the extreme value theorem and so I would not call it an "easy" theorem. $+,-,\cdot$ are all $O(1)$-preserving, of course (o1add, o1sub, o1mul), but division is not.
  • rlimo1: If $f\to a$, then $f\in O(1)$. This encompasses the rules
    • $f(x)=o(g(x))\implies f(x)=O(g(x))$,
    • $f(x)\sim g(x)\implies f(x)=O(g(x))$
    as well as a few others. Similarly, the rule $f,g\in O(1)\implies f\cdot g\in O(1)$ encompasses the rules
    • $f(x)=O(g(x)),\, g(x)=O(h(x))\implies f(x)=O(h(x)),$
    • $f_1(x)=O(g_1(x)),\, f_2(x)=O(g_2(x))\implies f_1(x)f_2(x)=O(g_1(x)g_2(x)),$
    • $f(x)=O(g(x))\implies cf(x)=O(g(x)),$
    and the sum rule implies $f_1(x),f_2(x)\in O(g(x))\implies f_1(x)+f_2(x)\in O(g(x))$. This phenomenon where many rules are expressed as special cases of one rule when written using a more restricted notation is why I support the approach taken in Metamath where multiple notations for one concept are eliminated as much as possible. For example, in Metamath there is a less-than sign $<$, but no greater-than sign, because any formula of the form $y>x$ can be rewritten as $x<y$ and it means that we don't need separate theorems dealing with transitivity of greater-than as well as those with less-than. (Even the use of $x\le y$ instead of $\lnot(y<x)$ is a controlled concession, although in this case it is used in enough contexts and helps in reading enough that it is considered worthwhile to define.)
  • o1rlimmul: If $f\in O(1)$ and $g\to0$, then $f\cdot g\to0$. This expresses the rule $f(x)=O(g(x)),g(x)=o(h(x))\implies f(x)=o(h(x))$. Of course this is not true for limits other than $0$; if $f\in O(1)$ and $g\to a\ne0$, then $f\cdot g$ may not have a limit at all. It is still bounded, though; but this already follows from rlimo1 and o1mul.
  • rlimadd, rlimsub, rlimmul, rlimdiv: I'm sure at this point it is no surprise that we have the theorems $f\to a,g\to b$ implies $f+g\to a+b$,$f-g\to a-b$,$f\cdot g\to a\cdot b$, and (if $g(x)\ne 0$ and $b\ne0$) $f/g\to a/b$. The last one is new, however - we had no $\to_{\Bbb Z}$-equivalent for division of limits before this point.
  • divrcnv: The most basic nontrivial convergent function: if $f(x)=\frac ax$, then $f\to0$.

  • logimcl: The imaginary part of $\log x$ is defined to satisfy $-\pi<\Im[\log x]\le\pi$. This, and the property $e^{\log x}=x$ (eflog), is sufficient to uniquely define the number $\log x$, and is generally needed when making arguments involving the complex logarithm function (the situation is a bit simpler for the real logarithm, which has no branch cut issues).
  • lognegb: Any number with argument $\pi$ (recall that $\operatorname{Arg}z:=\Im[\log z]$) is a negative real number and vice-versa.
  • relog: The dual equation to absef, which is nice for having no branch cut restrictions, unlike many other $\log$ formulas: $\Re[\log z]=\log|z|$.
  • rplogcl: Common special case of logltb: if $x>1$, then $\log x>0$.
  • angval: Define a function $\angle(x,y)$ which measures the angle of two vectors in the plane, using the complex numbers as a convenient proxy for $\Bbb R^2$. We define $\angle(x,y)=\operatorname{Arg}(y/x)=\Im[\log(y/x)]$. Note that $x,y$ are nonzero complex numbers, not real numbers. This measures the angle from the line $\overline{0x}$ to $\overline{0y}$, as a number in $(-\pi,\pi]$. A positive number indicates that $y$ is counterclockwise ahead of $x$, and the function is antisymmetric as long as $x,y$ are not $180^\circ$ from each other. For example, $\angle(1,1)=0$, $\angle(1,i)=\angle(i,-1)=90^\circ$, $\angle(i,1)=-90^\circ$. (I'm using degrees because I want to treat this as a tool for geometry, not complex analysis.)
  • ang180: Let $a,b,c$ be three distinct points in the plane. Then they form the vertices of a (possibly degenerate) triangle, and the interior angles of the triangle are respectively $\angle(c-b,a-b)$, $\angle(a-c,b-c)$, $\angle(b-a,c-a)$. Under these circumstances, this theorem states that the sum of these angles is $\pm180^\circ$, where the sign choice depends on whether the three points are labeled in clockwise or counterclockwise order. This is actually one of the Metamath 100 theorems, so now this makes the total count 38 theorems. The proof is more intricate than I expected it to be, although it's not too hard to understand at a high level.
  • abscxp, abscxp2, abscxpbnd: I was disappointed to discover there is no nice formula for $|a^b|$ in the general case. If $a$ is positive real, then $|a^b|=a^{\Re b}$; and if $b$ is real, then $|a^b|=|a|^b$. If both $a,b$ are complex, then $|a^b|=|a|^{\Re b}e^{-\operatorname{Arg}a\cdot\Im b}$; as a bound we can simplify this to $|a^b|\le |a|^{\Re b}e^{|\Im b|\pi}$ or $|a^b|\le |a|^{\Re b}e^{|b|\pi}$ (which is the form we quoted earlier).
  • cxplim: In this section we establish the basic facts about the order of growth of powers, logarithms, and exponentials. If $a<0$ then $x^a\to0$ as $x\to\infty$.
  • rlimcxp, o1cxp: If $f(x)\to0$ and $a>0$, then $f(x)^a\to0$, and similarly $f(x)\in O(1)$ and $\Re a\ge0$ imply $f(x)^a\in O(1)$.
  • cxp2limlem: If $b>1$ then $x\in o(b^x)$, i.e. $x/b^x\to 0$.
  • cxp2lim: If $a\in\Bbb R$ and $b>1$ then $x^a\in o(b^x)$, i.e. $x^a/b^x\to 0$. In words, any increasing exponential function eventually dominates every polynomial.
  • cxploglim: If $a>0$, then $\log x\in o(x^a)$ - a logarithm grows slower than any polynomial.
  • df-cht: Define the first Chebyshev function, $\theta(x)=\sum_{p\le x}\log p$, where the sum is only over primes.
  • df-ppi: Define the prime $\pi$ function, $\pi(x)=\sum_{p\le x}1=|[0,x]\cap\Bbb P|$.
  • df-mmu: Define the Möbius function, $\mu(x)=0$ if $x$ is not squarefree and $\mu(x)=(-1)^n$ where $n$ is the number of prime factors of $x$ otherwise.
  • df-sgm: Define the divisor function, $\sigma(x)=$ the number of divisors of $x$.
  • efchtcl: The Chebyshev function is closed in the log-integers, i.e. $e^{\theta(x)}\in\Bbb N$. (This is one of the most unusual sets I have applied fsumcllem to, showing that the set $\{x\in\Bbb R:e^x\in\Bbb N\}$ is closed under addition and zero as a consequence of the natural numbers being closed under multiplication and one.)
  • efchtdvds: The exponential Chebyshev function is actually the same as the primorial, $e^{\theta(x)}=x\#=\prod_{p\le x}p$, so it's not surprising that it shares all the properties of the primorial, like this one, that says that $e^{\theta(x)}|e^{\theta(y)}$ whenever $x\le y$.
  • ppifl: The Chebyshev and prime $\pi$ functions are actually functions on the reals, not the integers; we extend them to the reals by means of this theorem, $\pi(x)=\pi(\lfloor x\rfloor)$.
  • ppip1le: A local version of ppiltx, showing that $\pi(x+1)\le\pi(x)+1$. The obvious extension of this formula, $\pi(x+n)\le\pi(x)+n$, is only true when $n$ is an integer, because if it was true for all $n\in\Bbb R$ then $\pi$ would have to be continuous, and of course it has jump discontinuities at the primes.
  • ppiltx: A simple cardinality argument shows $\pi(x)<x$.
  • prmorcht: As remarked earlier, the primorial is the same as the exponential of the Chebyshev function, $e^{\theta(x)}=x\#$. Later, in the proof of bpos, we will make a careful analysis of the primorial function in order to establish a bound; now this proof has been converted to show a bound on $\theta(x)$ instead.
  • ppiub: This theorem used to be called prmpiub, and stated as $|\Bbb P\cap\{1\dots n\}|\le\frac n3+2$ and used as part of the proof of bpos. Now that we have defined a function for this, we can just write $\pi(x)\le\frac x3+2$, although the content hasn't changed much beyond being extended to the reals.
  • chtleppi: A simple sum inequality shows $$\theta(x)=\sum_{p\le x}\log p\le\sum_{p\le x}\log x=\pi(x)\log x.$$
  • chtub: Previously called prmorub and stated as $\prod_{p\le n}p<2^{2n-3}$, now we can write $\theta(x)<(2x-3)\log 2$ and extend the domain to all reals greater than $2$.
  • chebbnd1: The largest original contribution in this update aside from ang180, we show that $\pi(x)\in\Omega(x/\log x)$ by borrowing a lot of lemmas that were used for bpos. We have $4^n/n<{2n\choose n}=\prod_pp^{e_p}$, where the exponents $e_p=\#_p{2n\choose n}$ of the product all satisfy $p^{e_p}\le 2n$, by bposlem1. Then no prime larger than $2n$ enters the product, and each term is bounded, so the largest that sum can be is $(2n)^{\pi(2n)}$, and after taking logs you get $\pi(2n)\frac{\log(2n)}{2n}>\frac{\log 4}2-\frac{\log n}{2n}\ge\frac{\log 4}2-\frac1{2e}$, and with a little relaxing of the constants you can extend this to all real numbers, so in the end you get that $\pi(x)\in\Omega(x/\log x)$. The other inequality goes via $\theta(x)$, as we will see.
  • chtppilim: Here we show that $\theta(x)\sim\pi(x)\log x$. One inequality, chtleppi, is easy and mentioned above; the other direction fixes $0<a<1$ and uses the trick $$\theta(x)\ge\sum_{x^a<p\le x}\log p\ge\sum_{x^a<p\le x}a\log x\ge a\log x(\pi(x)-x^a),$$ and since $x^a\in o(x/\log x)$ and $x/\log x\in O(\pi(x))$, we can make $x^a\le (1-a)\pi(x)$ and then $\theta(x)\ge a^2\log x\,\pi(x)$; now since $a<1$ was arbitrary, we have the desired statement $\frac{\theta(x)}{\pi(x)\log x}\to1$.
  • chto1ub: A simple consequence of chtub: Since $\theta(x)<(2x-3)\log 2<2x\log 2$, it immediately follows that $\theta(x)\in O(x)$. The PNT will strengthen this to $\theta(x)/x\to1$.
  • chebbnd2: The other direction of chebbnd1, showing $\pi(x)\in O(x/\log x)$, by leveraging chto1ub and chtppilim: since $\theta(x)\in O(x)$ and $\theta(x)\sim\pi(x)\log x$, $\pi(x)\log x\in O(x)$, or $\pi(x)\in O(x/\log x)$.

Monday, September 15, 2014

Mathbox update: Fundamental Theorem of Algebra

The main theorems in this short update are Van der Waerden's theorem (already discussed in a previous post) and the Fundamental Theorem of Algebra. This last one is one of the metamath 100, and the proof was surprisingly straightforward. The hardest part was the first step, to show that a closed disk in $\Bbb C$ is compact via the Heine-Borel theorem; then it is just a matter of verifying some explicit bounds. By the way, this proof is based on the first "topological proof" on Wikipedia.
  • vdw: Van der Waerden's theorem: for any $k$ and finite set $R$ of colors, there is an $N$ such that any $R$-coloring of $\{1\dots N\}$ contains a length-$k$ monochromatic arithmetic progression. See also the previous post I wrote on this theorem.
  • vdwnn: Infinitary Van der Waerden's theorem: for any finite coloring of $\Bbb N$, there is a color that contains arbitrarily long arithmetic progressions.
  • txcmp, txcmpb: The topological product of two nonempty spaces is compact iff both factors are compact.
  • hmeores: The restriction of a homeomorphism of topological spaces is a homeomorphism on the induced subspace topologies.
  • cnheibor: Heine-Borel theorem for the complex numbers: A subset of $\Bbb C$ is compact iff it is closed and bounded. Note that we already have the more general heibor (a metric space is compact iff it is complete and totally bounded), but this direct proof avoids the axiom of countable choice.
  • dgrco: The degree of a composition of polynomials is the product of the degrees.
  • fta: The Fundamental Theorem of Algebra: Every non-constant polynomial has a root.

Thursday, September 11, 2014

Mathbox update: Ostrowski's theorem

Hello all,

This update has two main foci: the rounding off of the integration theorems with the integration by parts and integration by substitution formulas, and an unrelated work to formalize the concept of an absolute value over a ring (in the algebra sense) and prove Ostrowski's theorem. I dare say that the real analysis section is finally "open for business" now that most of the workhorse theorems are now available for use. On to the list:
  • suplub2, suprleub, infmxrgelb, ...: All of the supremum theorems of the form $\forall x \in A: x\le B\to\sup A\le B$ or $B<\sup A\to\exists x \in A: B<x$ have now been made bidirectional. I find this interesting, because it means that one can characterize "less than supremum" exactly in terms of a conjunction of individual inequalities. Another way to think about it is to consider the sets $(-\infty,x)$ for $x\in\Bbb R$. These are open sets, so any union of them is also open; moreover, they are all convex and unbounded below, and it's not hard to see that such sets are also closed under unions. Thus any nonempty union $\bigcup_{x\in A}(-\infty,x)$ is an open, convex, unbounded-below set, and the only such sets on the reals are those of the form $(\infty,x)$ or $(-\infty,\infty)$, since $\Bbb R$ is complete. If $A$ is bounded above, then the last case is not possible, and so we can define $\sup A$ as the upper endpoint of the union: $\bigcup_{x\in A}(-\infty,x)=(-\infty,\sup A)$.

    Note that a similar characterization with less-or-equal is not possible, because the infinite union of closed intervals is not necessarily closed.
  • limsupval, etc.: The definition of the limit supremum already existed, but there were no theorems for working with it beyond the bare minimum. I added the basics, with the most nontrivial theorems being limsupval2 (the limsup is not affected by any finite prefix) and limsupgre (if a sequence of real numbers has a bounded limit supremum, then the quantities $\sup_{n>k}f(n)$ appearing in the definition of the limsup are all bounded as well).
  • caurcvg: The original proof showed that if $f$ is a Cauchy sequence, then it converges to the supremum of the set $\{x:\exists j\forall k\ge j:x<f(k)\}$. Since this set is so similar to the limit supremum set, I modified the proof to show that it converges to the limit supremum instead (which then yields the corollary that any convergent sequence converges to the limit supremum of the sequence).
  • pcqmul, pczdvds, pcid: generalize some proofs about the prime count function from integers to rational numbers
  • pcadd: To show that $p^{-\#_p(r)}$ is an ultrametric absolute value on the rational numbers, you need two main facts about the prime count function $\#_p(r)$ (which is defined so that $p^{\#_p(r)}$ is the highest power of $p$ that divides the integer or rational number $r$): the multiplicative property pcqmul ($\#_p(ab)=\#_p(a)\#_p(b)$), and this theorem, which says that $\#_p(a+b)\ge \min(\#_p(a),\#_p(b))$. When you exponentiate both sides, this gives $p^{-\#_p(a+b)}\le\max(p^{-\#_p(a)},p^{-\#_p(b)})$, which is the ultrametric inequality. The proof is not as easy as you might think.
  • df-abv: Given a ring $R$, an absolute value on $R$ is a function $f:R\to\Bbb R$ such that for all $x,y\in R$:
    • $f(x)\ge 0$
    • $f(x)=0\iff x=0_R$
    • $f(x\cdot_R y)=f(x)f(y)$
    • $f(x+_R y)\le f(x)+f(y)$
    This definition is of course inspired by the standard absolute value function $|x|$ on $\Bbb C$; and all except the third axiom are very reminiscent of the analogous axioms for a metric. (Below I will denote an absolute value using the $|x|$ notation as is customary.) Other interesting simple properties of absolute values include $|0|=0$, $|1|=1$ (unless $1=0$), $\left|-x\right|=|x|$, and $|x/y|=|x|/|y|$.
  • abvdom: If $R$ is a ring with an absolute value, then $R$ is actually a domain, which is to say, $xy=0$ implies $x=0\vee y=0$.
  • abvtriv: As a converse to abvdom, if $R$ is a domain, then it has an absolute value, namely the trivial absolute value defined by $|x|=0$ if $x=0$ and $|x|=1$ otherwise. (Since we don't have a good notation for domain, we strengthen the hypotheses to say that $R$ is a division ring instead. However, this theorem now provides the alternative notation ${\rm AbsVal}(R)\ne\emptyset$ in order to specify that $R$ is a non-commutative domain.)
  • cncfmpt1f, cncfmpt2f: These theorems are the analogues of cnmpt11f and cnmpt12f, respectively, specialized to continuous functions on the complex numbers (rather than continuous functions in arbitrary topological spaces). We still need the topological continuity predicate for the two-arg version, though, because the version on $\Bbb C$ is designed only for functions $f:\Bbb C\to\Bbb C$ in one variable, which misses the common use-case $h(x)=R(f(x),g(x))$ where $R:\Bbb C\times\Bbb C\to\Bbb C$ is continuous in the product topology on $\Bbb C\times\Bbb C$ (particularly when $R$ is one of $+,\cdot,-$).
  • mbfsub: The difference of two measurable functions is measurable.
  • mbfsup, mbfinf, mbflimsup, mbflim: The supremum, infimum, limit supremum, and limit of a sequence of measurable functions is measurable. The main reason for developing limsup in the first place is because it allows a limit to be written in terms of a supremum and infimum, which are the basic measurability-preserving operations we have to work with.
  • mbfi1flim: A real measurable function is the pointwise limit of a sequence of simple functions. (The restriction to real functions is only because in our definition simple functions have to be real, so that they can be compared.)
  • mbfmul: The pointwise product of two measurable functions is measurable. The proof is very general; you just take two sequences of simple functions converging to $f$ and $g$ and multiply them pointwise to get a sequence of simple functions that converges to $fg$; then $fg$ is a limit of simple functions and so is measurable. I think it could easily be adapted to show that $h(f(x),g(x))$ is a measurable function whenever $h$ is jointly continuous and $f,g$ are measurable.
  • bddmulibl, bddibl: A bounded function is integrable, and a bounded function times an integrable function is also integrable. (This also yields a shorter proof for cniccibl.)
  • dvcn: A function differentiable on its entire domain is continuous.
  • ftc2ditg: Restate ftc2 for directed integrals, so that we can drop the ordering of the endpoints.
  • itgparts: Integration by parts. If $A:[x,y]\to\Bbb C$ and $C:[x,y]\to\Bbb C$ are continuous functions, $A':(x,y)\to\Bbb C$ and $C':(x,y)\to\Bbb C$ are defined and continuous (so $A,C$ are differentiable on $(x,y)$), and $A\cdot C'$ and $A'\cdot C$ are integrable on $[x,y]$, then $$\int_x^yA(t)C'(t)\,dt=(A(y)C(y)-A(x)C(x))-\int_x^yA'(t)C(t)\,dt.$$
  • itgsubst: Integration by $u$-substitution. If $I$ is an open interval, $A:[x,y]\to I$ and $C:I\to\Bbb C$ are continuous functions, and $A':(x,y)\to\Bbb R$ is continuous and integrable, then $$\int_{A(x)}^{A(y)}C(u)\,du=\int_x^yC(A(t))A'(t)\,dt.$$ When I was learning these theorems in school, I don't remember there being so many assumptions, but it just goes to show that it's easy to drop details in a regular math class if you're not being careful. These theorems are not as general as they could be, but usually making the hypotheses more general also means making them more complicated, and I don't want to go too far in that direction. (I certainly don't want to define absolute continuity, which has little use outside these theorems and the FTC.)
  • cxpaddle: A tricky proof that $(a+b)^c\le a^c+b^c$ when $0<c\le1$.
  • abvcxp: We're back to the absolute value after that integration interlude because this list is in database order and we had to wait for the logarithm and complex power functions to be defined. This theorem shows that if $|x|$ is an absolute value on $R$, then $|x|^c$ is as well, when $0<c\le 1$. For $c>1$, it usually is not because the triangle inequality is broken. (I have seen an alternative definition of absolute value with an auxiliary parameter, the "order" $c$ of the absolute value, such that for all $a,b$, $|a+b|\le c\max(|a|,|b|)$; then $|x|$ is a "triangular" absolute value iff $c\le 2$ and $|x|$ is an ultrametric absolute value iff $c=1$. Under this setup, if $|x|$ is an absolute value of order $n$, then $|x|^c$ is an absolute value of order $n^c$.)
  • abvmet: I mentioned earlier that the axioms of an absolute value look a lot like the axioms of a metric; well, here we show the correspondence, where an absolute value $|x|$ induces a metric $d(x,y)=|x-y|$. In fact, in the proof we never use the third property abvmul; and if we drop that (and add $\left|-x\right|=|x|$ as an axiom, since this is no longer derivable) then there is no need to even have the multiplication operation, so we end up with a statement about pure groups. I don't even think they need to be abelian groups, since we never use that either.
  • qrng and related theorems: Define the ring $\langle \Bbb Q,+,\cdot\rangle$, and show that it is actually a division ring, and the operations $+,\cdot,0,1,-,/$ map to their ring operation equivalents.
  • qabvle: From here on, we are interested in absolute values on the ring of rational numbers. For brevity and unambiguity, I will denote the "traditional" absolute value function by $|x|_\infty$, and reserve $|x|$ for any absolute value under our new definition. Induct the triangle inequality to show that $|n|\le n$ for all $n\in\Bbb N_0$.
  • ostthlem2: Any two absolute values that agree on the primes are equal.
  • qabsabv: The regular absolute value $|x|_\infty$ is an absolute value on $\Bbb Q$.
  • padicabv, padicabvf, padicabvcxp: The $p$-adic absolute value $|x|=n^{\#_p(x)}$ is an absolute value whenever $0<n<1$. We define "the" $p$-adic absolute value $|x|_p$ by setting $n=p^{-1}$; other choices of $n$ yield equivalent (off by an exponent) absolute values.
  • ostth: Ostrowski's theorem. If $|x|$ is an absolute value, then (1) $|x|=|x|_0$ is trivial, that is, $|x|=1$ for all $x\ne0$; (2) $|x|=|x|_\infty^c$ for some $c\in(0,1]$, i.e. $|x|$ is "regular"; or (3) there is a prime $p$ such that $|x|=|x|_p^c$ for some $c\in(0,\infty)$, i.e. $|x|$ is equivalent to a $p$-adic absolute value. We separate the proof into three parts according to these cases. If there is some $n\in\Bbb N$ with $|n|>1$ then we are in case (2), otherwise if there is a prime $|p|$ such that $|p|<1$ then we are in case (3). If both of these are false then $|p|=1$ for each prime, so case (1) shows that $|x|=|x|_0$ is the trivial absolute value.
I will probably write a post later going into a bit more detail for this proof, which is quite involved. It's a shame it's not on the metamath 100 list - I think it's a pretty nice and notable theorem.

Wednesday, September 3, 2014

Mathbox update: The Fundamental Theorem of Calculus

Today I just put the finishing touches on a proof of the Fundamental Theorem of Calculus, which is the first theorem from the Formalizing 100 Theorems list to be proven since the creation of this blog, and has been the main reason for my recent fixation on real analysis and basic properties of integrals. Hopefully this means that integrals can now actually be useful tools for more interesting proofs rather than objects of study of themselves. There is also a proof of the Mean Value Theorem and Rolle's theorem, among other things. I may describe proof overviews for some of the theorems in this list in later posts; the theorems below that are proven on this blog are marked with an asterisk and linked at the end.

Note that almost all of these theorems were necessary in order to complete the proof; it is an interesting exercise to see how some of these seemingly unrelated theorems in totally different fields are used in the final proof of the FTC. (To track the usage of a theorem, use the "This theorem is referenced by:" list at the end of the proof page.)
  • fofinf1o: Complementary to f1finf1o, this shows that a surjection from a finite set to another of the same size is a bijection (f1finf1o shows the same assuming an injection).
  • wlogle: Inspired by a similar theorem in HOL Light (another formalization language), we show that if a predicate satisfies $\varphi(x,y)\leftrightarrow \varphi(y,x)$, i.e. it is symmetric under interchange of $x,y$, then "without loss of generality" we can assume that $x\le y$ in a proof that $\forall x,y:\varphi(x,y)$.
  • ifle: A nice property of the conditional operator ${\rm if}(\varphi,x,y)$ (read "if $\varphi$ then $x$, else $y$", or in piecewise notation as $\begin{cases}x&\varphi\\y&\lnot\varphi\end{cases}$) is that it can transform an implication into an inequality. To be precise, if $y\le x$, then $\varphi\to\psi$ implies ${\rm if}(\varphi,x,y)\le{\rm if}(\psi,x,y)$ (and the converse is also true if $x<y$).
  • cnrecnv, cnrehmeo: The canonical map $\Bbb R^2\to\Bbb C$ is the function $\langle x,y\rangle\mapsto x+iy$. We already know from cnref1o that this map is invertible; here we show that the inverse is $z\mapsto\langle\Re\,z,\Im\,z\rangle$. Since all the functions involved here are known to be continuous, this allows us to deduce that the map is a homeomorphism, and so $\Bbb R^2$ under the product topology is homeomorphic to $\Bbb C$ under the metric topology.
  • max0add: The identity $f^++f^-=|f|$, where $f^+=\max(f,0)$ and $f^-=\max(-f,0)$ are the positive and negative part functions. See also max0sub which shows $f^+-f^-=f$.
  • txbas2: If $B$ and $C$ are topology bases, then $\{u\times v\,|\,u\in B,v\in C\}$ is also a basis for a topology, where $u\times v$ is the cartesian product of sets. This generalizes txbas which assumes that $B$ and $C$ are topologies.
  • txbasval: The topological product $\scr B\times\scr C$ is defined as the topology generated by the basis discussed in the last bullet, but it is only defined on topologies, not topological bases. Here we show that if $B$ is a basis for $\scr B$ and $C$ is a basis for $\scr C$, then $\{u\times v\,|\,u\in B,v\in C\}$ is a basis for $\scr B\times\scr C$. Note the difference from the definition, which defines $\scr B\times\scr C$ as the topology generated by $\{u\times v\,|\,u\in{\scr B},v\in{\scr C}\}$ - we see that it is sufficient to consider rectangles of basis elements and not topology elements.
  • iccntr: The interior of the closed interval $[x,y]$ is the open interval $(x,y)$, in the standard topology on the reals.
  • volsup2*, volcn*, volivth*: The intermediate value theorem for Lebesgue measure: Given a measurable set $A$, there is a measurable subset of $A$ of any measure $0\le x\le\operatorname{vol}(A)$.
  • I changed the token for the set of all (Lebesgue) integrable functions from the made-up $f\in{\rm Itgbl}$ to $f\in L^1$, since the latter actually seems to be in use and so may be more understandable to the lay-mathematician.
  • mbfimaopn, mbfimaopn2: The preimage of an open set under a measurable function is measurable. Note in particular that these are open sets in the complex topology, not the real topology; the proof essentially proceeds by showing that open rectangles with rational coordinates form a countable basis for the topology on $\Bbb C$.
  • cncombf: The composition $g\circ f$ is measurable if $f$ is measurable and $g$ is continuous. This could be generalized to $g$ Borel-measurable, but I don't feel like defining the Borel
  • itg2gt0*, itggt0*: If $\operatorname{vol}(A)>0$ and $f>0$ on $A$, then $\int_A f>0$. This is a nice proof that is discussed in a previous post.
  • itg2cn, itgcn: Absolute continuity of the Lebesgue integral. If $f$ is integrable, then for any $\varepsilon>0$ there is a $\delta>0$ such that for any measurable set $A$ with $\operatorname{vol}(A)<\delta$, $\int_A|f|<\varepsilon$. Note that no assumptions on $f$ other than integrability are needed.
  • itgeqa: If two functions are equal almost everywhere, then they have the same integrable and one is integrable iff the other is.
  • itgss3: If $A\subseteq B$ and $B\setminus A$ is a nullset, then $f$ is integrable on $A$ iff it is integrable on $B$ and the integrals are the same.
  • iblsub, itgsub: If $f,g$ are integrable, then $f-g$ is integrable and $\int(f-g)=\int f-\int g$.
  • itgfsum: If $f_1,f_2,\dots f_n$ are integrable, then $\sum_{k=1}^nf_k$ is integrable and $\int_A\sum_{k=1}^nf_k=\sum_{k=1}^n\int_Af_k$.
  • iblabs, iblabsr: If $f$ is a measurable function, then $f$ is integrable iff $|f|$ is integrable.
  • iblmulc2, itgmulc2: If $f$ is integrable and $c\in\Bbb C$, then $c\cdot f$ is integrable and $\int_Ac\cdot f=c\int_Af$.
  • itgabs: If $f$ is integrable, then $|\int_Af|\le\int_A|f|$. This is another nice proof that works on arbitrary complex functions $f$.
  • dvcj: The derivative $(f(x)^*)'$ is $f'(x)^*$, where ${}^*$ denotes the conjugate function. This does not follow from dvcof (the chain rule) because ${}^*$ is not differentiable as a complex function and is just the identity as a real function.
  • dvfre: The derivative of a real function is a real function.
  • dvmptid, dvmptc, dvmptadd, $\dots$, dvmptco: These theorems are nothing new in terms of actual content, in the sense that we already have formulas for the derivative of a sum, product, etc.; rather these are designed to be easy to use in proofs so that one can quickly differentiate a function given by a formula, in a sense providing an "algorithm" for differentiation. Supported here are the rules: $$c'=0\quad x'=1\quad (f+g)'=f'+g'\quad (fg)'=f'g+g'f\quad (cf)'=cf'\quad (-f)'=-f'$$ $$(f-g)'=f'-g'\quad (f^*)'=f'^*\quad (\Re\,f)'=\Re\,f'\quad (\Im\,f)'=\Im\,f'$$ Also the chain rule $(f(g(x)))'=f'(g(x))g'(x)$ as well as restriction theorems: if $f:X\to\Bbb C$ is differentiable on $X$ and $Z\subseteq X$, $Y=Z^\circ$ is the interior of $Z$, then $(f\upharpoonright Z)'=f'\upharpoonright Y$.
  • rolle: Rolle's theorem. If $f:[a,b]\to\Bbb R$ is continuous on $[a,b]$, differentiable on $(a,b)$, and $a<b$ and $f(a)=f(b)$, then there is an $x\in(a,b)$ such that $f'(x)=0$.
  • mvth: The Mean Value Theorem. If $f:[a,b]\to\Bbb R$ is continuous on $[a,b]$, differentiable on $(a,b)$, and $a<b$, then there is an $x\in(a,b)$ such that $f'(x)=\frac{f(b)-f(a)}{b-a}$.
  • dveq0: If $f:[a,b]\to\Bbb R$ is continuous on $[a,b]$ and differentiable with derivative $0$ on $(a,b)$, then $f$ is a constant function.
  Finally we get to the main theorem. Recall, though, that the "fundamental theorem" is really several separate but related theorems; we state them here.
  • ftc1a: Let $f:D\to\Bbb C$ be an integrable function, and $(a,b)\subseteq D\subseteq\Bbb R$ with $a\le b$. Define $G:[a,b]\to\Bbb C$ by $G(x)=\int_a^xf(t)\,dt$. Then $G$ is a continuous function on $[a,b]$.
  • ftc1: Now assume that $f$ is also continuous at some $c\in(a,b)$. Then $G$ is differentiable at $c$, and $G'(c)=f(c)$.
  • ftc1cn: Simplifying the assumptions, if $f:(a,b)\to\Bbb C$ is continuous and integrable, then $G'=f$.
  • ftc2: Let $f:[a,b]\to\Bbb C$ be a function differentiable on $(a,b)$ such that $f'$ is integrable and continuous. Then $\int_a^bf'=f(b)-f(a)$.
If you are familiar with the theorems, you'll see I kinda copped out on the last statement; the assumption that $f'$ is continuous is unnecessary but makes the proof a lot easier. I found a proof in Rudin that uses upper-semicontinuous functions via the Vitali–Carathéodory theorem, but I have not formalized the notion of semicontinuity or the Sorgenfrey line at all, so that seemed like a pain. Maybe I'll revisit it later if it becomes necessary to strengthen the theorem.

This will be theorem #15 of 100 at Formalizing 100 Theorems; as of now there are thus 35 theorems formalized in Metamath. You can see the others at the metamath 100 page.

* These theorems were discussed with natural-language proofs in integral of a positive function.