The Normal Basis Theorem


Just for this post, I’ll specialize my definition of “character” to the case where the representation is 1-dimensional. This means the representation is a homomorphism G\to GL_1(k)=k^\times, so the trace of each element under this map (i.e. the character afforded by the representation) is just a homomorphism to k^\times.
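
For a quick concrete example (not needed for anything below): if G=\mathbb{Z}/n and k=\mathbb{C}, then the map sending the class of j to e^{2\pi ij/n} is such a character, since it is literally the trace of the 1\times 1 matrices of the corresponding representation and is a homomorphism into \mathbb{C}^\times.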

A famous theorem of Dedekind says that any finite list of distinct characters of a group, G, into a field, k, is linearly independent over k. I’m not going to prove this, since I proved something very similar when proving Hilbert’s Theorem 90 (in fact, I may have done it already). Anyway, it is just your standard induction-plus-contradiction argument, though there is a slick trick in the middle, so beware if you attempt it yourself.
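
Just to see the statement in the smallest interesting case (this is not a proof): the two automorphisms \mathrm{id} and \sigma (conjugation) of \mathbb{Q}(\sqrt{2}) restrict to distinct characters \mathbb{Q}(\sqrt{2})^\times\to\mathbb{Q}(\sqrt{2})^\times. If a\cdot\mathrm{id}+b\cdot\sigma=0 with a,b\in\mathbb{Q}(\sqrt{2}), then evaluating at 1 gives a+b=0 and evaluating at \sqrt{2} gives (a-b)\sqrt{2}=0, so a=b=0.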

I’d like to quickly prove a nice, useful result that seems to escape the standard curriculum: the Normal Basis Theorem. Roughly, it says that a Galois field extension, viewed as a vector space over the base field, has a basis consisting of the Galois conjugates of a single element (equivalently, the roots of a single irreducible polynomial over the base field). I’ll state it more precisely when we actually get there.

First let’s set up a lemma: If L/k is Galois with Gal(L/k)=\{\sigma_1, \ldots , \sigma_n\} and e_1, \ldots , e_n are a basis for L over k, then W=\{(\sigma_1(e_i), \ldots , \sigma_n(e_i)) \}_{i=1}^n is an L-basis for L^n.

Proof: Let S=\mathrm{span}_L(W). So the lemma claims that S=L^n. Suppose not. Then there exists some 0\neq \delta\in (L^n)^* (the dual space) such that \delta vanishes on all of S.

Since L^n is finite dimensional, there is some l=(l_1, \ldots , l_n)\in L^n such that \delta(\cdot)=\langle \cdot, l\rangle (the standard dot product). Since \delta vanishes on S, we get \langle (\sigma_1(e_i), \ldots , \sigma_n(e_i)), l\rangle=0 for all i, i.e. \sum_j l_j\sigma_j(e_i)=0 for all i. The e_i form a k-basis of L and each \sigma_j is k-linear, so \sum_j l_j\sigma_j vanishes on all of L, contradicting Dedekind, since the \sigma_j are distinct characters L^\times\to L^\times and some l_j is non-zero.
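
For concreteness, here is the lemma in the quadratic case I’ll keep coming back to: L=\mathbb{Q}(\sqrt{2}) over k=\mathbb{Q}, with \sigma_1=\mathrm{id}, \sigma_2 the conjugation \sqrt{2}\mapsto -\sqrt{2}, and basis e_1=1, e_2=\sqrt{2}. Then W=\{(1,1), (\sqrt{2}, -\sqrt{2})\}, and the matrix with these rows has determinant -2\sqrt{2}\neq 0, so W really is an L-basis of L^2.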

Alright. Time to prove the big theorem now: every finite Galois extension L/k has a normal basis, i.e. an element x\in L whose Galois conjugates \sigma_1(x), \ldots , \sigma_n(x) form a basis of L over k.

Proof: Assume the notation from the lemma. We want to construct a normal basis for L. Fix x\in L and suppose \sigma_1(x), \ldots, \sigma_n(x) are linearly dependent over k, say \sum_j k_j\sigma_j(x)=0 with k_1, \ldots , k_n\in k not all zero. Applying \sigma_i^{-1}, and using that each k_j is fixed by the Galois group, we get \sum_j k_j(\sigma_i^{-1}\sigma_j)(x)=0 for every i.

Note that A=\left((\sigma_i^{-1}\sigma_j)(x)\right) is an n\times n matrix over L. The equations above say exactly that A(k_1, \ldots , k_n)^T=(0, \ldots , 0)^T. So if we can choose x\in L making A invertible, the only possibility is k_1=\cdots=k_n=0, i.e. the conjugates of x are linearly independent over k.
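
In the running quadratic example this matrix is easy to write down: with \sigma_1=\mathrm{id} and \sigma_2=\sigma, we get A=\begin{pmatrix} x & \sigma(x) \\ \sigma(x) & x \end{pmatrix}, so \det(A)=x^2-\sigma(x)^2, and for x=p+q\sqrt{2} this works out to 4pq\sqrt{2}. So here any x with both p and q non-zero works; the rest of the proof is about showing such an x always exists.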

Recall \{e_1, \ldots , e_n\} is our basis for L over k, so any x\in L can be written x=\sum_l b_le_l with b_l\in k, and then (\sigma_i^{-1}\sigma_j)(x)=\sum_l b_l(\sigma_i^{-1}\sigma_j)(e_l).

Now let p(x_1, \ldots, x_n)=\det\left(\sum_l (\sigma_i^{-1}\sigma_j)(e_l)x_l\right)\in L[x_1, \ldots , x_n] (note the coefficients live in L, not necessarily in k). Order the group so that \sigma_1=\mathrm{id}\in Gal(L/k). Then by the lemma there are constants c_1, \ldots , c_n\in L such that \sum_i c_i(\sigma_1(e_i), \ldots , \sigma_n(e_i))=(1, 0, \ldots , 0).

Component-wise this says \sum_l c_l\sigma_m(e_l)=1 if m=1 and 0 otherwise. Since \sigma_i^{-1}\sigma_j is again an element of the group, and it equals \sigma_1=\mathrm{id} exactly when \sigma_i=\sigma_j, this gives \sum_l c_l(\sigma_i^{-1}\sigma_j)(e_l)=\begin{cases} 1 & \text{if } \sigma_i=\sigma_j \\ 0 & \text{otherwise} \end{cases}.

In other words, plugging in (c_1, \ldots , c_n) turns the matrix inside the determinant into the identity matrix, so p(c_1, \ldots , c_n)=1, which shows p is not identically zero.
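
Carrying the quadratic example through as a sanity check: there p(x_1,x_2)=\det\begin{pmatrix} x_1+\sqrt{2}x_2 & x_1-\sqrt{2}x_2 \\ x_1-\sqrt{2}x_2 & x_1+\sqrt{2}x_2 \end{pmatrix}=4\sqrt{2}x_1x_2\in L[x_1,x_2], and solving c_1(1,1)+c_2(\sqrt{2},-\sqrt{2})=(1,0) gives c_1=\frac{1}{2}, c_2=\frac{1}{2\sqrt{2}}, so p(c_1,c_2)=4\sqrt{2}\cdot\frac{1}{2}\cdot\frac{1}{2\sqrt{2}}=1.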

Now suppose k is infinite. A non-zero polynomial in n variables over a field cannot vanish at every point of S^n when S is an infinite subset of that field, so there are b_1, \ldots , b_n\in k such that p(b_1, \ldots, b_n)\neq 0.

Take x=\sum_l b_le_l\in L. Then, since each b_l is fixed by the Galois group,

0\neq p(b_1, \ldots , b_n)
= \det\left(\sum_l (\sigma_i^{-1}\sigma_j)(e_l)b_l\right)
= \det\left((\sigma_i^{-1}\sigma_j)\left(\sum_l b_le_l\right)\right)
= \det\left((\sigma_i^{-1}\sigma_j)(x)\right)=\det(A)

Thus A is invertible, so \sigma_1(x), \dots, \sigma_n(x) are linearly independent over k. Since there are n=[L:k] of them, they form a basis of L over k: a normal basis.
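
In the running example we could have taken b_1=b_2=1, i.e. x=1+\sqrt{2}: its conjugates are 1+\sqrt{2} and 1-\sqrt{2}, whose sum is 2 and whose difference is 2\sqrt{2}, so they span \mathbb{Q}(\sqrt{2}) over \mathbb{Q} and form a normal basis.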

Unfortunately, we made the assumption that k is infinite for one quick step, and there is no easy remedy for the finite case. The theorem is still true there, but it needs an entirely different approach. I’m not sure if I’ll do that next time.


10 thoughts on “The Normal Basis Theorem”

  1. I’d say a classic text on representation theory is Fulton and Harris’ Representation Theory: A First Course. I’ve also been reading through a neat little book called An Introduction to Group Rings by Milies and Sehgal. I’m not sure if either have what you are looking for. To my knowledge there isn’t a well-established notion of the determinant of a group…but it is possible I just haven’t seen it.

  2. I found helpful the notes by Pavel Etingof (http://math.mit.edu/~etingof/replect.pdf), the Curtis-Reiner books, and Serre’s _Linear Representations of Finite Groups._

    I actually think I saw the book you mentioned on group rings (I don’t remember the authors, but I don’t suppose there are that many with the title…) although I did not actually pick it up for lack of time; I didn’t realize it went into representation theory.

  3. I checked Pavel Etingof’s notes. At least the introduction mentions the history of representation theory, which started with Dedekind’s letter to Frobenius about this determinant of a finite group. Hopefully I’ll find what I’m looking for.

    Thanks guys……… 😀

  4. Dedekind’s group determinant is defined as follows:

    Let G be a finite group and consider |G| indeterminates x_g, one for each g\in G. The determinant of G is defined to be the determinant of the |G|\times |G| matrix (x_{gh^{-1}}), where g and h run over the elements of G (in some fixed arbitrary order).
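
    For instance, with G=\mathbb{Z}/2=\{e,g\} the matrix is \begin{pmatrix} x_e & x_g \\ x_g & x_e \end{pmatrix}, so the group determinant is x_e^2-x_g^2=(x_e+x_g)(x_e-x_g); the two linear factors correspond to the two characters of G, which is the kind of factorization Dedekind observed for abelian groups.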

  5. I hope my comments won’t “pollute” this nice paper. I took the simple example of a quadratic extension to find a normal basis and follow your proof. Say L=\mathbb{Q}(\sqrt{b}), a=\sqrt{b} in short, and the k (power) basis (e_1,e_2)=(1,a) of L, and G=\{e:(a\mapsto a),\sigma:(a\mapsto -a)\}.
    The matrix A is \begin{pmatrix} e & \sigma \\ \sigma & e \end{pmatrix}. Then computing the determinant polynomial p(x_1,x_2), I found p=(x_1+x_2a)^2-(x_1-x_2a)^2=4x_1x_2a\in L[x_1,x_2], and not in k[x_1,x_2]. How is p G-invariant?
    Then to find c_1,c_2 such that c_1(1,a)+c_2(1,-a)=(1,0), I obtain c_1=c_2=\frac12.
    Then, component wise, for k=1, we have c_1 id(1)+c_2 id(a)=\frac12+\frac12 a, and for k=2, we have c_1 \sigma (e_1)+c_2\sigma (a)=\frac12 -\frac12 a. I don’t get 1 or 0.
    Could you give more explanations on what is wrong in my understanding?
