Exploring Metamath: Extensible structures, and "Every vector space has a basis"

I'd like to take a little time to talk about the proof of lbsex, which says that every vector space has a basis, or every vector space is free (a free module is a module with a basis; general modules are not necessarily free). This one is interesting because it uses some tricks that as far as I am aware are not in any standard proofs, which I had to use because finite linear combinations are not yet easy to state over arbitrary modules. Note that these articles are best read by also consulting the frequent links to the associated formal proofs, so that you can appreciate the similarities and differences between a hand proof and a formal proof.

First, some background, and also a little window into how Metamath does structures. A left module, written $\rm{LMod}$ in metamath, is an example of an extensible structure, which is characterized by having various "accessor" functions defined on it to get multiple components. In this case, a left module is an extension of a group, which is another extensible structure with two defined operations: the base set $\operatorname{Base}(W)$ and the group operation $+_g(W)$. These functions are simply defined as $\operatorname{Base}(W)=W(1)$ and $+_g(W)=W(2)$, so that a group object is really a function from the natural numbers or a subset thereof to the operations. This is a little different from the standard literature definition "A group is an ordered pair $\langle V,+\rangle$ such that..." which does not allow for extensions of the set operations.

Given all the above, given a set $W$, define (throughout this article) $V=\operatorname{Base}(W)$ and $+=+_g(W)$ (note that these operations always return something even if $W$ is not a function, so it is always valid to make such a definition). Then $W$ is a group iff:

$\forall a,b,c\in V:(a+b\in V\wedge(a+b)+c=a+(b+c))$
$\exists e\in V\,\forall a\in V:(e+a=a\wedge\exists m\in V:m+a=e)$

It is not too hard to see that the first axiom is just closure and associativity and the second is existence of an identity element and inverses. Now, let us define a ring, which shows off the "extensible" part of our extensible structures. Define the function $\cdot_r(W)=W(3)$. Letting $\cdot=\cdot_r(W)$, $W$ is a ring iff $W$ is an abelian group (which adds the axiom $\forall a,b:a+b=b+a$ to the group axioms) and:

$\forall x,y,z\in V:$

$x\cdot y\in V$
$(x\cdot y)\cdot z=x\cdot(y\cdot z)$
$x\cdot(y+z)=(x\cdot y)+(x\cdot z)$
$(x+y)\cdot z=(x\cdot z)+(y\cdot z)$

$\exists u\in V\,\forall x\in V:u\cdot x=x=x\cdot u$

Here we have closure, associativity of multiplication, left- and right-distributivity, and existence of a multiplicative unit. (This is a unital ring.) Note how now we are demanding that $W$ contain at least three meaningful operations, stored at $W(1)$, $W(2)$, $W(3)$, so that a ring still satisfies the axioms of a group, even though it has "extra elements" that were not mentioned in the group definition.

Finally, let us define a left module. Define $\operatorname{Scalar}(W)=W(5)$ and $\operatorname{vsca}(W)=W(6)$. (We skipped $4$ because that is used for the conjugation of a star field, and we don't want to overlap slot numbers.) A left module is an abelian group $W$ such that $F=\operatorname{Scalar}(W)$ is a ring and (abbreviating the operations of $F$ by a subscript $F$, and abbreviating $\cdot_s=\operatorname{vsca}(W)$ and $K=\operatorname{Base}(F)$):

$\forall q,r\in K,x,w\in V:$

$r\cdot_s w\in V$
$r\cdot_s(w+x)=(r\cdot_sw)+(r\cdot_sx)$
$(q+_Fr)\cdot_sw=(q\cdot_sw)+(r\cdot_sw)$
$(q\cdot_Fr)\cdot_sw=q\cdot_s(r\cdot_sw)$
$1_F\cdot_sw=w$

These are most of the basic axioms of a vector space, except that there are no commutativity-type axioms for the multiplication operations $\cdot_F$ and $\cdot_s$, and there is no division. The only difference between a left module and a (left) vector space is the ability to divide elements: a left vector space is a left module such that $\operatorname{Scalar}(W)$ is a division ring; a division ring is a ring such that $\langle V\setminus\{0\},\cdot\rangle$ is a group (which is a fancy way of saying that for any $x\ne0$, there is a $y$ such that $x\cdot y=1$).

We need a few more definitions before we can get to the proof: a subspace of the left module $W$ (using the same abbreviations as above) is a subset $S$ of $V$ such that $a\cdot_sx+y\in S$ for any $a\in K$, $x,y\in S$; the span of a subset $S$ of $V$ is the smallest subspace containing $S$; and a basis is a set $B\subseteq V$ such that $\DeclareMathOperator{\span}{span}\span(B)=V$ and $x\notin\span(B\setminus\{x\})$ for all $x\in B$. A set satisfying only the first condition is called a spanning set, while a set satisfying only the second condition is called linearly independent.

Theorem (lbsex): Any vector space has a basis.

Proof: This is a special case of the following more general theorem, which we prove instead.

Theorem (lbsext): For any linearly independent subset $C\subseteq V$, there is a basis $B\supseteq C$ for the vector space $W$.

Proof: This theorem is a well-known equivalent to the Axiom of Choice, so it's no surprise that we will be using Zorn's lemma (zornn0) as part of the proof. The statement that we will be using says that if $S$ is a nonempty collection of sets such that for all nonempty $y\subseteq S$ with $\forall s,t\in y:(s\subseteq t\vee t\subseteq s)$, we also have $\bigcup y\in S$, then there is an $s\in S$ with no proper superset in $S$. This is usually read as "if any chain in $S$ has an upper bound in $S$, then $S$ has a maximal element", but this requires more definitions than necessary for understanding.

The set $S$ in question here is the set of all linearly independent subsets of $V$ containing $C$. This is nonempty because $C\in S$, and so next we have to establish that chains have an upper bound; so suppose $A\subseteq S$ is a nonempty collection of sets such that any two elements are in a subset relation. Actually, we will trade this for the weaker assumption that $A$ is closed under binary union, that is, for all $s,t\in A:s\cup t\in A$; this follows from the chain assumption because if $s\subseteq t$ then $s\cup t=t\in A$ and if $t\subseteq s$ then $s\cup t=s\in A$.

Our goal is to show that $\bigcup A\in S$, which is to say, $\bigcup A$ is a linearly independent subset of $V$ containing $C$. Now clearly $C\subseteq\bigcup A$ because $A$ is nonempty and every element of $A$ has $C$ as a subset; so now we need to show that for any $x\in\bigcup A$, $x\notin\span(\bigcup A\setminus\{x\})$. Rather than exploiting the "finite linear combinations" characterization of span, which we don't have, we leverage the "smallest subspace" characterization, by showing that a particular set is a subspace containing $\bigcup A\setminus\{x\}$.

Lemma (lbsextlem2): Let $T=\bigcup_{u\in A}\span(u\setminus\{x\})$. Then $T$ is a subspace, and $\bigcup A\setminus\{x\}\subseteq T$.

Proof: Let $r\in K,v,w\in T$. Then there is an $m,n\in A$ such that $v\in\span(m\setminus\{x\})$ and an $w\in\span(n\setminus\{x\})$. Since $A$ is closed under binary union, $m\cup n\in A$, and also $v,w\in\span(m\cup n\setminus\{x\})$ because span respects the subset relation. Since a span is a subspace, we then have $rv+w\in\span(m\cup n\setminus\{x\})$, so $rv+w\in T$. Thus $T$ is a subspace. To show $\bigcup A\setminus\{x\}\subseteq T$, let $y\in u\in A$ with $y\ne x$. Then $y\in u\setminus\{x\}\subseteq\span(u\setminus\{x\})$, so $y\in T$. $$\tag*{$\blacksquare$}$$

It follows that $\span(\bigcup A\setminus\{x\})\subseteq T$, so it suffices to show that for any $x\in\bigcup A$, $x\notin T$.

Lemma (lbsextlem3): If $A\subseteq S$ is nonempty and $\forall s,t\in A: s\cup t\in A$, then $\bigcup A\in S$.

Proof: From earlier remarks, all that remains is to show that for all $x\in\bigcup A$, $x\notin T$, so suppose that $x\in y\in A$ and $u\in A$, $x\in\span(u\setminus\{x\})$. Then since $y\cup u\in A$, $y\cup u$ is linearly independent, and so $x\in y\subseteq y\cup u\to x\notin\span(y\cup u\setminus\{x\})$, but since $x\in\span(u\setminus\{x\})\subseteq\span(y\cup u\setminus\{x\})$, this is a contradiction. $$\tag*{$\blacksquare$}$$

This finishes the prerequisites to zornn0, so now suppose that $B\in S$ such that for all $y\in S,y\not\subset B$. We wish to show that this $B$ is a basis containing $C$. It is immediate that $B$ is linearly independent and contains $C$, since $B\in S$; it remains to prove (lbsextlem4) that for any $w\in V$, $w\in\span(B)$. We will need one more lemma, where the division ring property comes into play; we state it now.

Theorem (lspsolv): For any vector space $W$, $A\subseteq V$, and $y\in V$, if $x\in\span(A\cup\{y\})$ but $x\notin\span(A)$, then $y\in\span(A\cup\{x\})$.

Proof: This is clear from the "finite linear combinations" characterization of span; if $x=\sum_{v\in A\cup\{y\}}a_vv$ where only finitely many $a_v$ are nonzero, then the coefficient $a_y$ must be nonzero or else this would show that $x\in\span(A)$, so we can "solve for $y$" in the equation to write $y=a_y^{-1}(x-\sum_{v\in A}a_vv)$ and thus show that $y$ is in the span of $A\cup\{x\}$.

In our characterization, we have to work with subspaces, so the subspace we consider for this proof is $$Q=\{z\in V:\exists r\in K\, z+ry\in\span(A)\}.$$ To show this is a subspace, suppose $a\in K,u,v\in Q$. Then there are $s,t\in K$ such that $u+sy,v+ty\in\span(A)$, so since $\span(A)$ is a subspace, $$(au+v)+(as+t)y=a(u+sy)+(v+ty)\in\span(A),$$ so since $as+t\in K$, $au+v\in Q$ and $Q$ is a subspace.

Furthermore, $A\cup\{y\}\subseteq Q$ because $z\in A\to z+0y\in\span(A)$, and $y+(-1)y=0\in\span(A)$. Therefore $x\in\span(A\cup\{y\})\subseteq Q$, and there is an $r\in K$ such that $x+ry\in\span(A)$. (This is all proved in lspsolvlem.)

Now we are almost there; if $r=0$ then $x\in\span(A)$ against the hypothesis, so by the division property $r^{-1}$ exists, and we can write $y=r^{-1}((x+ry)-x)$, and since $\span(A\cup\{x\})$ is a subspace containing $x$ and $x+ry$, $y$ is also in $\span(A\cup\{x\})$. $$\tag*{$\blacksquare$}$$

Returning to the proof of lbsext, our strategy is to show that if $w\in V\setminus\span(B)$, then $B\cup\{w\}\supset B$ is also linearly independent, contradicting the maximality of $B$. For this to be true, we need that $\forall x\in B\cup\{w\}:x\notin\span(B\cup\{w\}\setminus\{x\})$. For $x=w$, this is easy, because $B\cup\{w\}\setminus\{w\}=B$ and $w\notin\span(B)$ by assumption. If $x\in B$, suppose that $x\in\span(B\cup\{w\}\setminus\{x\})$ for a contradiction. Since $x\ne w$, $B\cup\{w\}\setminus\{x\}=B\setminus\{x\}\cup\{w\}$, so we can apply lspsolv above: $x\in\span(B\setminus\{x\}\cup\{w\})$ but $x\notin\span(B\setminus\{x\})$ (because $B$ is linearly independent), so $w\in\span(B\setminus\{x\}\cup\{x\})=\span(B)$, contrary to hypothesis. Thus $B$ is a basis containing $C$, which completes the proof. $$\tag*{$\blacksquare$}$$

Exploring Metamath

Sunday, August 17, 2014

Extensible structures, and "Every vector space has a basis"

No comments:

Post a Comment