Wednesday, August 20, 2014

Van der Waerden's theorem

To take a break from all this real analysis, I thought I'd change gears and formalize a gem of number theory, which goes by the name of Van der Waerden's theorem. The proof is not yet complete, so instead I'll just lay out the high-level strategy of the proof. (Well, any regular mathematician would call this a proof, and a pretty detailed one at that, but for me it's only a sketch of the "real" formal proof.)

Theorem (vdw): For any finite set $R$ (the set of "colors") and any $k\in\Bbb N$, there is an $N$ such that for any "coloring" $f:\{1,\dots,N\}\to R$, there is a monochromatic arithmetic progression. This last phrase is expressed in symbols as: $$\exists c\in R;a,d\in\Bbb N:\forall m\in\{0,\dots,k-1\}:a+md\in f^{-1}(\{c\}).$$ By demanding that $a+md$ lie in the preimage of $f$ we can indirectly also demand that the entire AP lies within $\{1,\dots,N\}$.

Proof: We begin by defining some predicates, which look horrible in metamath because they take so many parameters. First, we define the "arithmetic progression function": $$A_k:\Bbb N\times\Bbb N\to{\cal P}\Bbb N\quad A_k(a,d)=\{a+md\,|\,m\in\{0,\dots,k-1\}\}$$ This function takes a length $k$, a start point and step $a,d$, and outputs the set of all points in this arithmetic progression. The parameter $k$ has been separated from the other two so that we can consider the set $\operatorname{ran}A_k$ of all length-$k$ arithmetic progressions.

Next, we define the notion that a given coloring $\langle R,N,f\rangle$ (where $f:\{1,\dots,N\}\to R$) contains a length-$k$ monochromatic AP. In metamath this predicate looks like $k\,(R\,O\,N)\,f$, because $O$ is a function which takes $R,N$ as input and returns a binary relation whose two parameters are $k,f$. In any case, we define this notion to be: $$k\,(R\,O\,N)\,f\iff\exists c\in R,x\in\operatorname{ran}A_k:x\subseteq f^{-1}(\{c\}).$$
Straightforward enough. Now we turn to our last and most complex definition, which states that a given coloring $\langle R,N,f\rangle$ contains a polychromatic $m$-tuple of arithmetic progressions of length $k$. (In metamath this looks like $k\,((R\,P\,N)m)\,f$.) A polychromatic $m$-tuple of arithmetic progressions is a set of arithmetic progressions $a+\{1,\dots,k\}d_1,a+\{1,\dots,k\}d_2,\dots,a+\{1,\dots,k\}d_m$ which all share a common basepoint $a$ (but do not actually include $a$), and we demand that each progression be monochromatic, but no two APs have the same color. We express this as: $$k\,((R\,P\,N)m)\,f\iff\exists a\in\Bbb N, d:\{1,\dots,m\}\to\Bbb N:\\ \quad\forall i\in 1,\dots,m:A_k(a+d_i,d_i)\subseteq f^{-1}(\{f(a+d_i)\})\,\wedge\\ \quad\big|\{f(a+d_i)\,|\,i\in1,\dots,m\}\big|=m$$

The first conjunct says that the entire AP $A_k(a+d_i,d_i)$ is all the same color as $f(a+d_i)$, the first element of the progression, i.e. the progression is monochromatic; the second conjunct says that the set $\{f(a+d_i)\,|\,i\in1,\dots,m\}$ of colors of the first element of each progression has size exactly $m$, i.e. they are all distinct. (We could just as well say that the $f$-image of the union of all the AP's has size $m$, but it's less trivial to show that the size of this larger set is already at most $m$.)

Now, onto the proof proper. The main body of the proof is a double induction; at the top level we induct on $k$, where our induction hypothesis is that for any $R$ there is an $N$ such that all $R$-colorings of $\{1,\dots,N\}$ contain a monochromatic length-$k$ AP. For the base case, we treat $k=1$ and $k=2$ separately (because some of our arguments in the induction step $k\to k+1$ don't work if $k=1$). If $k=1$, then any point is an AP, so take $N=1$ and so $A_1(1,1)=\{1\}\subseteq f^{-1}(\{f(1)\})$. If $k=2$, then we can use the pigeonhole principle. Take $N=|R|+1$, and assume that $f$ contains no length-$2$ monochromatic APs. If $f(x)=f(y)$ for $x<y$, then $A_2(x,y-x)=\{x,y\}\subseteq f^{-1}(\{f(x)\})$, so $f$ has a length-$2$ AP contrary to the hypothesis, so therefore $f$ is injective and $|\{1,\dots,N\}|=N=|R|+1\le|R|$, a contradiction.

In our induction step, we assume (for $k\ge2$) that the hypothesis is true for $k$, and derive it for $k+1$. Fix $R\ne\emptyset$; if $R=\emptyset$ then there are no functions $f:\{1\}\to R$ so the statement is vacuously true for $N=1$. Let us then start another induction on $m\in\Bbb N$, where the induction hypothesis is that there is an $N$ such that every coloring on $\{1,\dots,N\}$ contains either a monochromatic length-$k+1$ AP or a polychromatic $m$-tuple of length-$k$ APs. The base case is not entirely trivial, because a $1$-tuple of length-$k$ APs is not quite the same as a monochromatic length $k$ AP (which is guaranteed to us by the $k$-induction hypothesis) because it "reserves space" for the base point of a $k+1$-long AP - it just doesn't demand that this initial point have any particular color.

To fix this, suppose that $N$ is such that every coloring of $\{1,\dots,N\}$ contains a monochromatic length-$k$ AP, and let $N_2=2N$. Given a coloring $f:\{1,\dots,2N\}$, by defining $g:\{1,\dots,N\}\to R$ by $g(x)=f(x+N)$, we know $g$ contains a monochromatic length-$k$ AP, so there is a $c,a,d$ such that for all $t=0,\dots,k-1$ we have $a+td\in\{1,\dots,N\}\wedge f(a+td)=c$. Then we can define $a'=N+a-d$; this is a positive integer because $a,a+d\in\{1,\dots,N\}$ imply $a-d=2a-(a+d)\ge 2-N>-N$. (This is why we needed $k\ge2$ at the start.) Now we have the necessary setup for a polychromatic $1$-tuple of APs, using $a'$ and $d_1=d$.

Now, for the $m$-induction step, we finally get to the interesting part of the proof. Let $W\in\Bbb N$ be such as specified by the $m$-induction hypothesis. Then the set $R^{\{1,\dots,W\}}$ of all functions $\{1,\dots,W\}\to R$ is finite, so by the $k$-induction hypothesis there is a $V$ such that all colorings $f:\{1,\dots,V\}\to R^{\{1,\dots,W\}}$ have a length-$k$ monochromatic AP. Let $N=2WV$, and now we want to show that every function $h:\{1,\dots,N\}\to R$ contains either a monochromatic $k+1$ AP or a polychromatic $m,k$ AP tuple. Define $$f(x)=\big(y\in\{1,\dots,W\}\mapsto h(y+W(x-1+V))\big);$$ effectively we are cutting the set $\{1,\dots,N\}$ into blocks of length $W$. Again, we "reserve space" of size $WV$ in our set $1,\dots,N$ so that we can move backwards a little. The hypotheses give us that $f$ contains an AP with parameters $g,a,d$ such that $f(x)=g$ for $x=a,a+d,\dots,a+(k-1)d$. Now since $f$ takes values in the function space $R^{\{1,\dots,W\}}$, $g:\{1,\dots,W\}\to R$ is really a function, which satisfies the assumptions for the $m$-induction hypothesis, so either $g$ contains a $k+1$-long AP with parameters $c,a',d'$ (in which case $h$ has a $k+1$-long AP with parameters $c,W(a-1+V)+a',d'$), or there is a polychromatic $m$-tuple of APs $a',d_1,\dots,d_m$.

In this case, consider the prospective $m+1$-tuple defined by $a''=a'+W(a-d-1+V)$ and $d''_i=d_i+dW$, $d''_{m+1}=dW$. We can verify that $a''$ is positive, because $$V-1+a-d=V-1+2a-(a+d)\ge V-1+2-V=1,$$ and for each $i\in\{1,\dots,m\}$ and $t\in\{1,\dots,k\}$, \begin{align}
h(a''+td''_i)&=h(a'+td_i+W(a+(t-1)d-1+V))\\
&=f(a+(t-1)d)(a'+td_i)\\
&=g(a'+td_i)=g(a+d_i),
\end{align}
so each tuple is individually monochromatic. For the last AP, $h(a''+td''_{m+1})=f(a+(t-1)d)(a')=g(a')$ is monochromatic as well. All that remains is to verify that the colors are distinct. Since we already know that $g(a'+d_i)\ne g(a'+d_j)$ for any $i\ne j$, if additionally $g(a')\ne g(a'+d_i)$ for all $i=1,\dots,m$ then we are done because $h$ contains a polychromatic $m+1$-tuple. Otherwise, there is some $i$ such that $g(a')=g(a'+d_i)$. But we already know $g(a'+d_i)=g(a'+td_i)$ for $t\in\{1,\dots,k\}$, so then $g$ contains a $k+1$ monochromatic AP, and so does $h$ (with parameters $g(a'),a'+W(a-1+V),d_i$).

This completes the $m$-induction. To complete the $k$-induction, let $m=|R|$ (which is valid since $R\ne\emptyset$). Then there is an $N$ such that all colorings have a $k+1$ monochromatic AP or a polychromatic $|R|$-tuple of length-$k$ APs. In the first case we are done, and in the second case let the parameters of the tuple be $a,d_1,\dots,d_{|R|}$. Since $|\{f(a+d_i)\,|\,i\in\{1,\dots,|R|\}\}|=|R|$ and $R$ is finite, the function $i\mapsto f(a+d_i)$ must be bijective to $R$, so there is an $i$ such that $f(a)=f(a+d_i)$, and as before this yields a monochromatic AP of length $k+1$ with parameters $a,d_i$. Thus there is a $k+1$ monochromatic AP in any case, which completes the $k$-induction and the proof. $$\tag*{\blacksquare}$$

There is one more nice extension to this theorem, which is the so-called "infinitary" version of Van der Waerden's theorem.

Theorem (vdwnn): For any finite set $R$ and any coloring $f:\Bbb N\to R$, there is a color $c\in R$ such that $f^{-1}(\{c\})$ contains arbitrarily long arithmetic progressions. (In symbols, $\forall k\in\Bbb N:\exists a,d\in\Bbb N:\forall m\in\{0,\dots,k-1\}:a+md\in f^{-1}(\{c\})$.)

Proof: Fix $R$ and $f$ and suppose not. Then for each $c\in\Bbb R$, there is a $k$ such that there are no $c$-colored APs of length $k$, and since each length $n\ge k$ AP contains a length $k$ AP, there are also no $c$-colored APs of length $n$ for any $n\ge k$. Thus the set of $k\in\Bbb N$ such that there are $c$-colored APs of length $k$ is bounded, and since the finite union of bounded sets is bounded, there is a uniform bound $K$ such that there are no monochromatic length $K$ APs in any color. But this is a contradiction to vdw above. $$\tag*{\blacksquare}$$