# Formal definition of the free group

Van der Waerden's trick lets us define the free groups in a slick but mostly incomprehensible way.

[summary: The Free group may be constructed formally using van der Waerden's trick, which is not intuitive at all but leads to a definition that is very easy to work with. This page will detail van der Waerden's construction, and will prove that the trick yields a group which has all the right properties to be the free group.]

The Free group may be constructed formally using van der Waerden's trick, which is not intuitive at all but leads to a definition that is very easy to work with. This page will detail van der Waerden's construction, and will prove that the trick yields a group which has all the right properties to be the free group.

# The construction

Write $X^r$ for the set that contains all the freely reduced words over $X \cup X^{-1}$ (so, for instance, excluding the word $aa^{-1}$). %%note:We use the superscript $r$ to denote "reduced".%%

We define the free group $F(X)$, or $FX$, on the set $X$ to be a certain Subgroup of the Symmetric group $\mathrm{Sym}(X^r)$: namely that which is generated by the following elements, one for each $x \in X \cup X^{-1}$:

• $\rho_x : \mathrm{Sym}(X^r) \to \mathrm{Sym}(X^r)$, sending $a_1 a_2 \dots a_n \mapsto a_1 a_2 \dots a_n x$ if $a_n \not = x^{-1}$, and $a_1 a_2 \dots a_{n-1} x^{-1} \mapsto a_1 a_2 \dots a_{n-1}$.
• $\rho_{x^{-1}} : \mathrm{Sym}(X^r) \to \mathrm{Sym}(X^r)$, sending $a_1 a_2 \dots a_n \mapsto a_1 a_2 \dots a_n x^{-1}$ if $a_n \not = x$, and $a_1 a_2 \dots a_{n-1} x \mapsto a_1 a_2 \dots a_{n-1}$.

Recall that each $\rho_x$ lies in $\mathrm{Sym}(X^r)$, so each is a Bijective function from $X^r$ to $X^r$. We specify it by stating what it does to every element of $X^r$ (that is, to every freely reduced word over $X$).

We first specify what it does to those words which don't end in $x^{-1}$: $\rho_x$ simply appends an $x$ to such words. We then specify it to the remaining words, those which do end in $x^{-1}$: then $\rho_x$ just removes the $x^{-1}$.

It's easy to check that if $\rho_x$ is given a freely-reduced word as input, then it produces a freely-reduced word as output, because the only change to the word is at the end and we make sure to provide a separate definition if $x$ is to be cancelled. Therefore each $\rho_x$ is a function $X^r \to X^r$.

Then we do it all again for all the inverses $x^{-1}$, creating the functions $\rho_{x^{-1}}$; and finally, we add in the Identity element, denoted $\rho_{\varepsilon}$, which simply returns its input unchanged.

Notice that the $\rho_x$ and $\rho_{x^{-1}}$ are all indeed bijective (and therefore members of $\mathrm{Sym}(X^r)$), because in fact $\rho_x$ and $\rho_{x^{-1}}$ are inverse to each other (each cancelling off what the other did), and a function with an inverse is bijective.

So, we've defined the free group as a certain subgroup of the symmetric group. Remember that the subgroup has as its group operation "function composition"; so $\rho_x \cdot \rho_y = \rho_x \circ \rho_y$, for instance. We will write $\rho_x \rho_y$ for this, omitting the group operation.

Something key to notice is that if we apply $\rho_{a_n} \rho_{a_{n-1}} \dots \rho_{a_1}$ to the empty word $\varepsilon$, we get $$\rho_{a_n} \rho_{a_{n-1}} \dots \rho_{a_1}(\varepsilon) = \rho_{a_n} \rho_{a_{n-1}} \dots \rho_{a_3}(\rho_{a_2}(a_1)) = \rho_{a_n a_{n-1} \dots a_3}(a_1 a_2) = \dots = a_1 a_2 \dots a_n$$ if $a_1 a_2 \dots a_n$ is a freely reduced word. (Indeed, if the word is freely reduced then none of the successive $\rho_{a_i}, \rho_{a_{i+1}}$ can have cancelled each other's effect out, so every application of a $\rho_{a_i}$ must be appending a letter.) Hence we might hope to have captured the freely reduced words in our subgroup.

# The formal definition is the same as the intuitive definition

We'll show that there is a bijection between the free group and the set of reduced words, by "converting" each reduced word into a corresponding member of the free group.

Take a reduced word, $w = a_1 a_2 \dots a_n$, and produce the member of the free group (that is, the function) $\rho_{a_1} \rho_{a_2} \dots \rho_{a_n}$. %%note:Recall that the group operation here is composition of functions, so this is actually the function $\rho_{a_1} \circ \rho_{a_2} \circ \dots \circ \rho_{a_n}$.%% This really does produce a member of the free group (i.e. of the subgroup of the symmetry group), because each $a_i$ is an element of $X \cup X^{-1}$ and we have already specified how to make $\rho_{a_i}$ from such an element.

Now, we claim that in fact this map is injective: that is, we can't take two words $a_1 a_2 \dots a_n$ and $b_1 b_2 \dots b_m$ and produce the same member of the free group. (That is, we show that $\rho_{a_1} \rho_{a_2} \dots \rho_{a_n} = \rho_{b_1} \rho_{b_2} \dots \rho_{b_m}$ implies $a_1 \dots a_n = b_1 \dots b_m$.) Indeed, if the two functions ("elements of the free group") are equal, then they must in particular do the same thing when they are applied to the empty word $\varepsilon$. But by the "key notice" above, when we evaluate $\rho_{a_1} \rho_{a_2} \dots \rho_{a_n}$ at the empty word, we get $a_n a_{n-1} \dots a_2 a_1$; and when we evaluate $\rho_{b_1} \rho_{b_2} \dots \rho_{b_m}$ at the empty word, we get $b_m b_{m-1} \dots b_2 b_1$; so the two words must be equal after all. [todo: we've got this backwards]

Finally, the map is surjective: we can make any member of the free group by "converting the appropriate reduced word into a function". Indeed, the free group is generated by the $\rho_x$ and $\rho_{x^{-1}}$ for $x \in X$, so every element is some $\rho_{x_1} \dots \rho_{x_n}$ for some selection of $x_1, \dots, x_n \in X \cup X^{-1}$. Note that $x_1 \dots x_n$ need not necessarily be freely reduced as a word at the moment; but if it is indeed not freely reduced, so some $x_i, x_{i+1}$ cancel each other out, then removing that pair completely doesn't change the function $\rho_{x_1} \dots \rho_{x_n}$. For example, $\rho_{x_1} \rho_{x_1^{-1}} \rho_{x_2} = \rho_{x_2}$. Hence the process of "performing one step of a free reduction" (i.e. removing a cancelling pair) doesn't change the member of the free group as a function; and since each such removal makes the word shorter, it must eventually terminate. It remains to show that it doesn't matter in what order we remove the cancelling pairs; but that is immediate because we've already shown that our "conversion" process is injective: we started with a member of the free group, so if it corresponds to a freely reduced word then it corresponds to a unique freely reduced word. Since we've just shown that it does indeed correspond to a freely reduced word (by repeatedly removing cancelling pairs), we are done.

The above shows that the free group can be considered just to be the set of reduced words.