Classical Local Systems

I lied to you a little. I may not get into the arithmetic stuff quite yet. I’m going to talk about some “classical” things in modern language. In the things I’ve been reading lately, these ideas seem to be implicit in everything said. I can’t find this explained thoroughly anywhere. Eventually I want to understand how monodromy relates to bad reduction in the {p}-adic setting. So we’ll start today with the different viewpoints of a local system in the classical sense that are constantly switched between without ever being explained.

You may need to briefly recall the old posts on connections. The goal for the day is to relate the three equivalent notions of a local system, a vector bundle plus flat connection on it, and a representation of the fundamental group. There may be some inaccuracies in this post, because I can’t really find this written anywhere and I don’t fully understand it (that’s why I’m making this post!).

Since I said we’d work in the “classical” setting, let’s just suppose we have a nice smooth variety over the complex numbers, {X}. In this sense, we can actually think about it as a smooth manifold, or complex analytic space. If you want, you can have the picture of a Riemann surface in your head, since the next post will reduce us to that situation.

Suppose we have a vector bundle on {X}, say {E}, together with a connection {\nabla : E\rightarrow E\otimes \Omega^1}. We’ll fix a basepoint {p\in X} that will always secretly be lurking in the background. Let’s try to relate this connection to a representation of the fundamental group. If we look at some old posts we’ll recall that a choice of connection is exactly the same data as a notion of “parallel transport”. This means that if I have some path on {X}, the connection tells me how a vector in the fiber over the starting point moves to a vector in the fiber over the ending point.

Remember that we already fixed a basepoint {p}. So if I take some loop {\sigma} based at {p}, then a vector {V\in E_p} can be transported around that loop to give me another vector {\sigma(V)\in E_p}. If my vector bundle has rank {n}, then {E_p} is just an {n}-dimensional vector space, and I’ve now told you an action of the loops based at {p} on this vector space.

[Figure: a vector {V} at {p} being transported around a loop on a torus, returning to {p} as the vector {\sigma(V)}.]

For a general connection this doesn’t quite give me a representation of the fundamental group (based at {p}), since we can’t pass to the quotient: the transport of a vector around a nullhomotopic loop might be non-trivial. We are saved if we started with a flat connection. It can be checked that the flatness assumption gives a trivial action around nullhomotopic loops. Thus the parallel transport only depends on the homotopy class of the loop, and we get a group homomorphism {\pi_1(X, p)\rightarrow GL(E_p)\cong GL_n(\mathbb{C})}.
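As a numerical sanity check on the passage from a flat connection to a representation, here is a small Python sketch. The example is an assumption chosen for illustration and does not come from anything above: take the trivial line bundle on the punctured plane {\mathbb{C}^*} with connection {\nabla = d - a\,dz/z} (flat for free, since the base is one-dimensional). Along the loop {z(t)=e^{it}} a horizontal section satisfies {s'(t)=ia\,s(t)}, so one trip around should multiply a vector by {e^{2\pi i a}}. The function name and step count are hypothetical.

```python
import cmath

def transport_around_unit_circle(a, s0=1.0 + 0j, steps=2000):
    """Parallel-transport s0 once around z(t) = exp(it), 0 <= t <= 2*pi,
    for the assumed connection d - a*dz/z, using classical RK4."""
    def rhs(t, s):
        z = cmath.exp(1j * t)
        dz_dt = 1j * z
        # Horizontal sections satisfy ds = a * s * dz / z along the path.
        return a * s * dz_dt / z

    h = 2 * cmath.pi / steps
    t, s = 0.0, s0
    for _ in range(steps):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, s + h / 2 * k1)
        k3 = rhs(t + h / 2, s + h / 2 * k2)
        k4 = rhs(t + h, s + h * k3)
        s += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return s

monodromy = transport_around_unit_circle(0.5)
print(monodromy)  # numerically close to exp(2*pi*i*0.5) = -1
```

The loop here is not nullhomotopic in {\mathbb{C}^*}, so a non-trivial answer is perfectly consistent with flatness; it is the image of the generator of {\pi_1(\mathbb{C}^*, 1)\cong \mathbb{Z}} under the representation.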

Modulo a few details, the above process can essentially be reversed, and hence given a representation you can produce a unique pair {(E,\nabla)}, a vector bundle plus flat connection associated to it. This relates the latter two ideas I started with. The one that gave me the most trouble was how local systems fit into the picture. A local system is just a locally constant sheaf of {n}-dimensional vector spaces. At first it didn’t seem likely that the data of a local system should be equivalent to these other two things, since the sheaf is locally constant. This seems like no data at all to work with rather than an entire vector bundle plus flat connection.

Here is why algebraically there is good motivation to believe this. Recall that one can think of a connection as essentially a generalization of a derivative. It is just something that satisfies the Leibniz rule on sections. Recall that we call a section, {s}, horizontal for the connection if {\nabla (s)=0}. But if this is the derivative, this just means that the section should be constant. In this analogy, we see that if we pick a vector bundle plus flat connection, we can form a local system, namely the horizontal sections (which are the locally constant functions). If you want an exercise to see that the analogy is actually a special case, take the vector bundle to be the globally trivial line bundle {\mathcal{O}_X} and the connection to be the honest exterior derivative {d:\mathcal{O}_X\rightarrow \Omega^1}.

The process can be reversed again, and given any locally constant sheaf of vector spaces, you can cook up a vector bundle and flat connection whose horizontal sections are precisely the sections of the sheaf. Thus our three seemingly different notions are actually all equivalent. I should point out that part of my oversight on the local system side was thinking that a locally constant sheaf somehow doesn’t contain much information. Recall that it is still a sheaf, so we can be associating lots of information on large open sets and we still have restriction homomorphisms giving data as well. Next time we’ll talk about some classical theorems in differential equation theory that are most easily proved and stated in this framework.


Irreducible Character Basis

I’d just like to expand a little on the topic of the irreducible characters being a basis for the class functions of a group cf(G) from two times ago.

Let’s put an inner product on cf(G). Suppose \alpha, \beta \in cf(G). Then define \displaystyle \langle \alpha, \beta \rangle =\frac{1}{|G|}\sum_{g\in G} \alpha(g)\overline{\beta(g)}.

The proof of the day is that the irreducible characters actually form an orthonormal basis of cf(G) with respect to this inner product.

Let e_i be the identity element of the i-th simple component of \mathbb{C}G (a central idempotent), and write e_i=\sum_{g\in G} a_{ig}g. Then we have that a_{ig}=\frac{n_i\chi_i(g^{-1})}{|G|} (although just a straightforward calculation, it is not all that short, so we’ll skip it for now). Thus e_j=\frac{1}{|G|}\sum_{g\in G} n_j\chi_j(g^{-1})g.

So now examine \frac{\chi_i(e_j)}{n_j}=\frac{1}{|G|}\sum \chi_j(g^{-1})\chi_i(g)
=\frac{1}{|G|}\sum \chi_i(g)\overline{\chi_j(g)}
= \langle \chi_i, \chi_j \rangle.

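Here we used that, since \chi_j is a character, \chi_j(g^{-1})=\overline{\chi_j(g)}. On the other hand, e_j acts as the identity on the j-th simple module and as zero on the others, so \chi_i(e_j)=n_i\delta_{ij}. Thus we have that \langle \chi_i, \chi_j \rangle = \delta_{ij}.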
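If you want to see the orthonormality in a concrete case, here is a quick Python check for S_3. The character table used below is the standard one for S_3 (trivial, sign, and the 2-dimensional character), which is an input I am assuming rather than something derived in this post. Since characters are class functions, the sum over G collapses to a sum over conjugacy classes weighted by class size.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]

# Assumed (standard) character table of S_3, one value per class.
characters = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

order = sum(class_sizes)  # |G| = 6

def inner(alpha, beta):
    """<alpha, beta> = (1/|G|) * sum over g of alpha(g) * conj(beta(g)).
    All values here are real, so conjugation does nothing."""
    total = sum(c * a * b for c, a, b in zip(class_sizes, alpha, beta))
    return Fraction(total, order)

gram = {(i, j): inner(characters[i], characters[j])
        for i in characters for j in characters}
for pair, value in gram.items():
    print(pair, value)
```

The printed table of inner products is the identity matrix: 1 on matching pairs and 0 otherwise.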

This fact can be used to get some neat results about the character table of a group, and as consequences of those we get new ways to prove lots of familiar things, like |G|=\sum n_i^2 where the n_i are the degrees of the characters. You also get a new proof of Burnside’s Lemma. I’m not very interested in any of these things, though.

I may move on to induced representations and induced characters. I may think of something entirely new to start in on. I haven’t decided yet.

Class sums

Let’s define a new concept that seems to be really important in algebraic number theory, that will help us peek inside some of the things we’ve been seeing.

Let C_j be a conjugacy class in a finite group. Then we call z_j=\sum_{g\in C_j}g a class sum (for pretty obvious reasons, it is the sum of all the elements in a conjugacy class).

Lemma: The number of conjugacy classes in a finite group G is the dimension of the center of the group ring. Or if we let r denote the number of conjugacy classes, then r=dim_k(Z(kG)).

We prove this by showing that the class sums form a basis. First, given a class sum, we show that z_j\in Z(kG). Well, let h\in G, then hz_j h^{-1}=z_j\Rightarrow hz_j=z_j h, since conjugation just permutes elements of the conjugacy class, thus they live in the right place. They are also linearly independent since the elements of the sums z_j and z_k are disjoint (they are orbits which partition the group) if j\neq k.

Now all we need is that they span. Let u=\sum a_gg\in Z(kG). Then for any h\in G, we have that huh^{-1}=u, so by comparing coefficients, a_{hgh^{-1}}=a_g for all g\in G. This gives that all the coefficients on elements in the same conjugacy class are the same, i.e. we can factor out that coefficient and have the class sum left over. Thus u is a linear combination of the class sums, and hence they span.
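Here is a small Python sketch that checks the first half of the argument for G=S_3. The encoding is an assumption made for illustration: permutations are tuples p with p[i] the image of i, and an element of kG is a dict from group elements to coefficients. The helper names are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))  # S_3 as tuples p with p[i] = image of i

def mul(p, q):
    """Composition p∘q: first apply q, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    out = [0] * 3
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)

def conjugacy_classes():
    seen, classes = set(), []
    for g in G:
        if g not in seen:
            cls = frozenset(mul(mul(h, g), inv(h)) for h in G)
            classes.append(cls)
            seen |= cls
    return classes

def algebra_mul(u, v):
    """Multiply two elements of kG stored as {group element: coefficient}."""
    out = {}
    for g, a in u.items():
        for h, b in v.items():
            x = mul(g, h)
            out[x] = out.get(x, 0) + a * b
    return {x: c for x, c in out.items() if c != 0}

classes = conjugacy_classes()
class_sums = [{g: 1 for g in cls} for cls in classes]

# Each class sum z_j commutes with every group element h, hence with all of kG.
all_central = all(algebra_mul({h: 1}, z) == algebra_mul(z, {h: 1})
                  for z in class_sums for h in G)
print(len(classes), all_central)  # 3 True
```

Since the group elements span kG, commuting with every h is enough to conclude that each class sum is central.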

As a corollary we get that, under the hypotheses of Molien’s theorem (k algebraically closed with characteristic not dividing |G|), the number of simple components of kG is the same as the number of conjugacy classes of G. This is because Z(M_{n_i}(k)) is the subspace of scalar matrices, which is one dimensional. So if there are m simple components, we get 1 dimension for each of these by our decomposition in Artin-Wedderburn, and so r=dim_k(Z(kG))=m.

Another consequence is that the number of irreducible k-representations of a finite group is equal to the number of its conjugacy classes.

The proof is just to note that the number of simple kG-modules is precisely the number of simple components of kG which correspond bijectively with the irreducible k-representations, and now I refer to the paragraph above.

Now we can compute \mathbb{C}S_3 in a different way and confirm our answer from before. We know that it is 6 dimensional, since the dimension is the order of the group. We also know that there are three conjugacy classes, so there are three simple components, so the dimensions of these must be 1, 1, and 4. Thus \mathbb{C}S_3\cong \mathbb{C}\times\mathbb{C}\times M_2(\mathbb{C}).

If we want another quick one, let Q_8 be the quaternion group of order 8. Try to figure out why \mathbb{C}Q_8\cong \mathbb{C}^4\times M_2(\mathbb{C}).

So I think I’m sort of done with Artin-Wedderburn and its consequences for now. Maybe I’ll move on to some character theory as Akhil brought up in the last post…

A-W Consequences

I said I’d do the uniqueness part of Artin-Wedderburn, but I’ve decided not to prove it. Here is the statement: Every left semisimple ring R is a direct product R\cong M_{n_1}(\Delta_1)\times\cdots \times M_{n_m}(\Delta_m) where \Delta_i are division rings (so far the same as before), and the numbers m, n_i, and the division rings \Delta_i are uniquely determined by R.

The statement here is important since if we can figure one of those pieces of information out by some means, then we’ve completely figured it out, but I think the proof is rather unenlightening since it is just fiddling with simple components.

Let’s use this to write down the structure of kG where G is finite and k is algebraically closed with characteristic not dividing |G|. The result, due to Molien, is that kG\cong M_{n_1}(k)\times\cdots \times M_{n_m}(k).

By Maschke we know that kG is semisimple, and by Artin-Wedderburn, we get then that kG\cong\prod M_{n_i}(\Delta_i). In fact, the proof of Artin-Wedderburn even tells us that \Delta_i=End_{kG}(L_i) where L_i is a minimal left ideal of kG. Thus, given some minimal left ideal L, it suffices to show that \Delta=End_{kG}(L)\cong k.

Note that \Delta is a subspace of kG as a vector space over k. Thus it is finite dimensional. Now we have both L and \Delta as finite dimensional vector spaces (over k). Let a\in k, then this element acts on L by u\mapsto au. But au=ua, so k\subset Z(\Delta). Choose d\in \Delta, then adjoin it to k to get k(d). Since this is commutative and a division subring of \Delta, it is a field. Moreover k(d)/k is a finite field extension, hence algebraic, so d is algebraic over k. But we assumed k algebraically closed, so d\in k. Thus \Delta=k, and we are done.

As a Corollary to this, we get that under the same hypotheses, |G|=n_1^2+\cdots + n_m^2. This is just counting dimensions under the isomorphism above, since dim_k(kG)=|G| and dim_k(M_{n_i})=n_i^2. Note also that we can always take one of the n_i to be 1, since we always have the trivial representation.

Let’s end today with an example to see how nice this is. Without needing to peek inside or know anything about representations of S_3, we know that \mathbb{C}S_3\cong \mathbb{C}\times\mathbb{C}\times M_2(\mathbb{C}), since the only ways to write 6 as a sum of squares are 1+1+1+1+1+1 and 1+1+4, and the first one gives \mathbb{C}^6, which is commutative; this can’t happen since S_3 is non-abelian. Thus it must be the second one.
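The counting in this example is easy to automate. Below is a minimal Python sketch (the function name is hypothetical) that lists every way to write a number as a sum of squares of positive integers, which are the only candidates for the degrees n_1, ..., n_m allowed by |G|=n_1^2+\cdots + n_m^2.

```python
def square_decompositions(n, smallest=1):
    """All nondecreasing tuples (n_1, ..., n_m) of positive integers
    with n_1^2 + ... + n_m^2 == n and every n_i >= smallest."""
    if n == 0:
        return [()]
    results = []
    k = smallest
    while k * k <= n:
        for rest in square_decompositions(n - k * k, k):
            results.append((k,) + rest)
        k += 1
    return results

print(square_decompositions(6))  # [(1, 1, 1, 1, 1, 1), (1, 1, 2)]
```

For 6 this recovers exactly the two options discussed above. For larger groups the list grows, and one needs extra information (such as the number of simple components) to pick out the right decomposition.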

Maschke and Schur

As usual, ordering of presenting this material and level of generality are proving to be difficult decisions. For my purposes, I don’t want to do things as generally as they can be done. But on the other hand, most of the proofs are no harder in the general case, so it seems pointless to avoid generality.

First we prove Maschke’s Theorem. Note that there are lots of related statements and versions of what I’m going to write. This says that if G is a finite group and k is a field whose characteristic does not divide the order of the group, then kG is a left semisimple ring.

Proof: We’ll do this using the “averaging operator.” Let’s use the version of semisimple that every left ideal is a direct summand. Let I be a left ideal of kG. Then since kG can be regarded as a vector space over k, I is a subspace. So there is a subspace, V, such that kG=I\oplus V. We are done if it turns out that V is a left ideal.

Let \pi : kG\to I be the projection. (Since any u can be written uniquely as u=i + v with i\in I and v\in V, define \pi(u)=i.) If \pi were a kG-map, it would be a retraction of kG onto I, and hence I would be a direct summand as a left ideal. Unfortunately, \pi need not be a kG-map. So we’ll force it to be one by averaging.

Define D:kG\to kG by D(u)=\frac{1}{|G|}\sum_{x\in G}x\pi(x^{-1}u). By our characteristic condition, |G|\neq 0 in k, so dividing by |G| makes sense.

Claim: Im(D)\subset I. Let u\in kG and x\in G. Then \pi(x^{-1}u)\in I by definition, and I is a left ideal, so x\pi(x^{-1}u)\in I. Since I is closed under sums and scalar multiples, D(u)\in I, which shows the claim.

Claim: D(b)=b for all b\in I. This is just computation: x^{-1}b\in I since I is a left ideal, and \pi is the identity on I, so x\pi(x^{-1}b)=xx^{-1}b=b. Hence D(b)=\frac{1}{|G|}(|G|b)=b.

Claim: D is a kG-map. i.e. we want to prove that D(gu)=gD(u) for g\in G and u\in kG. Here the averaging pays off:

\displaystyle gD(u)=\frac{1}{|G|}\sum_{x\in G}gx\pi(x^{-1}u)
\displaystyle = \frac{1}{|G|}\sum_{x\in G} gx\pi(x^{-1}g^{-1}gu)
\displaystyle = \frac{1}{|G|}\sum_{y=gx\in G} y\pi(y^{-1}gu)
= D(gu).

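Thus D is a kG-map from kG onto I that restricts to the identity on I, so kG=I\oplus \ker D with \ker D a left ideal, and we have proved Maschke’s Theorem.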
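To see the averaging trick do its job in a concrete case, here is a small Python sketch. All the specific choices are assumptions made for illustration: G=C_3 generated by g, k=\mathbb{Q}, elements of kG written as coordinate triples in the basis (1, g, g^2), the left ideal I spanned by 1+g+g^2, and the vector space complement V spanned by 1 and g.

```python
from fractions import Fraction

N = 3  # order of the cyclic group C_3

def act(k, u):
    """Left multiplication by g^k on kG, in the basis (1, g, g^2)."""
    return tuple(u[(i - k) % N] for i in range(N))

def pi(u):
    """Vector space projection onto I = span{(1,1,1)} along V = span{1, g}.
    Writing u = c*(1,1,1) + a*(1,0,0) + b*(0,1,0) forces c = u[2]."""
    return (u[2],) * N

def D(u):
    """Averaged projection D(u) = (1/|G|) * sum over x of x * pi(x^{-1} u)."""
    total = [Fraction(0)] * N
    for k in range(N):
        term = act(k, pi(act(-k, u)))
        total = [t + s for t, s in zip(total, term)]
    return tuple(t / N for t in total)

u = (1, 2, 5)
print(pi(act(1, u)) == act(1, pi(u)))  # False: pi is not a kG-map
print(D(act(1, u)) == act(1, D(u)))    # True: D is a kG-map
print(D((1, 1, 1)) == (1, 1, 1))       # True: D is the identity on I
```

In this example D sends u to the average of its coordinates times 1+g+g^2, which is visibly compatible with the cyclic shift even though the naive projection \pi is not.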

The other tool we’ll need next time is that of Schur’s Lemma: Let M and N be simple left R-modules. Then every non-zero R-map f:M\to N is an iso. And End_R(M) is a division ring.

Proof: \ker f\neq M since f is a non-zero map. So \ker f=\{0\}, since it is a submodule of the simple module M, and we have an injection. Likewise, im f is a non-zero submodule of N, and hence must be all of N, so we have a surjection and hence an iso. The other part of the lemma is just noting that since every map in End_R(M) that is non-zero is an iso, it has an inverse.

Next time I’ll talk about how some of these things relate to representations.

Representation Theory III

Let’s set up some notation first. Recall that if \phi: G\to GL(V) is a representation, then it makes V into a kG-module. Let’s denote this module by V^\phi. Now we want to prove that, given two representations \phi and \sigma into GL(V), we have V^\phi \cong V^\sigma if and only if there is an invertible linear transformation T: V \to V such that T\phi(x)=\sigma(x)T for every x\in G.

The proof of this is basically unwinding definitions: Let T: V^\phi \to V^\sigma be a kG-module isomorphism. Then for free we get that T is a vector space iso with T(xv)=xT(v) for x\in G and v\in V. Now note that the multiplication in V^\phi is xv=\phi(x)(v) and in V^\sigma it is xv=\sigma(x)(v). So T(xv)=xT(v)\Rightarrow T(\phi(x)(v))=\sigma(x)(T(v)), which is what we needed to show. The converse is even easier. Just check that T is a kG-module iso by checking it preserves scalar multiplication.

This should look really familiar (especially if you are picking a basis and thinking in terms of matrices). We’ll say that T intertwines \phi and \sigma. Essentially this is the same notion as similar matrices.
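Here is a tiny Python check of the intertwining condition T\phi(x)=\sigma(x)T in matrix form. The example is an assumption picked for illustration: G=C_2=\{0,1\}, with \phi sending the generator to the swap matrix and \sigma sending it to diag(1,-1). The matrix T below diagonalizes the swap, so the two representations should be equivalent.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

identity = [[1, 0], [0, 1]]

# Two (assumed) representations of C_2 = {0, 1} on a 2-dimensional space.
phi = {0: identity, 1: [[0, 1], [1, 0]]}
sigma = {0: identity, 1: [[1, 0], [0, -1]]}

# Candidate intertwiner; it is invertible since its determinant is -2.
T = [[1, 1], [1, -1]]

intertwines = all(matmul(T, phi[x]) == matmul(sigma[x], T) for x in phi)
print(intertwines)  # True
```

Rewriting the condition as \sigma(x)=T\phi(x)T^{-1} makes the comparison with similar matrices explicit.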

Now we will define some more concepts. Let \phi: G\to GL(V) be a representation. Then if W\subset V is a subspace, then it is “\phi-invariant” if \phi(x)(W)\subset W for all x\in G. If the only \phi-invariant subspaces are 0 and V, then we say \phi is irreducible.

Let’s look at what happens if \phi is reducible. Let W be a proper non-trivial \phi-invariant subspace. Then we can take a basis for W and extend it to a basis for V such that the matrix of \phi(x) has the block form \phi(x)=\left(\begin{matrix} A(x) & C(x) \\ 0 & B(x) \end{matrix}\right), where A(x) and B(x) are matrix representations of G (the degrees being dim W and dim(V/W) respectively).
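To make the block form concrete, here is a Python sketch using an example that is my own assumption: the permutation representation of S_3 on a 3-dimensional space, where W is the line spanned by (1,1,1). We extend (1,1,1) to the basis (1,1,1), e_0, e_1 and compute each matrix in that basis. The first column should always be (1,0,0), i.e. A(x)=1 with zeros below it.

```python
from itertools import permutations

def apply(p, v):
    """Permutation matrix of p acting on v: sends e_i to e_{p[i]}."""
    out = [0] * 3
    for i in range(3):
        out[p[i]] = v[i]
    return tuple(out)

# Basis adapted to W = span{(1,1,1)}: first a basis of W, then an extension.
basis = [(1, 1, 1), (1, 0, 0), (0, 1, 0)]

def coords(v):
    """Coordinates of v in the adapted basis:
    v = c0*(1,1,1) + c1*(1,0,0) + c2*(0,1,0)."""
    return (v[2], v[0] - v[2], v[1] - v[2])

def matrix_in_basis(p):
    """Matrix of the permutation p in the adapted basis, as a list of rows."""
    cols = [coords(apply(p, b)) for b in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

for p in permutations(range(3)):
    print(p, matrix_in_basis(p))

print(matrix_in_basis((0, 2, 1)))  # [[1, 0, 1], [0, 1, -1], [0, 0, -1]]
```

The last line shows a case where the upper right block C(x) is non-zero, so this particular basis exhibits the block upper triangular shape but not a block diagonal one.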

In fact, given a representation on V, \phi and a representation on W, \psi, we have a representation on V\oplus W, \phi \oplus \psi given in the obvious way: (\phi \oplus \psi)(x) : (v, w)\mapsto (\phi(x)v, \psi(x)w). The matrix representation in the basis \{(v_i, 0)\}\cup \{(0, w_j)\} is just \left(\begin{matrix}\phi(x) & 0 \\ 0 & \psi(x)\end{matrix}\right) (hence is reducible since it has both V\oplus 0 and 0\oplus W as invariant subspaces).

I’m going to continue with representation theory, but I’ll start titling more appropriately now that the basics have sort of been laid out.

Representation Theory I

I know everyone and their brother does a series of posts on basic representation theory, and I said I would try to avoid very overtly repeat posts of other math blogs, but I can’t help it. I don’t know representation theory very well at all, and I feel my time has come to wrestle with the beast. This is one of the main points of this blog, so may as well try.

Goal: Carefully build as slowly as I can all the way to the Artin-Wedderburn Theorem.

Beginning: What is a representation? Well, let G be a finite group and V any finite dimensional vector space over \mathbb{C}. Then a representation of G is just a homomorphism \phi: G\to GL(V). So we can already see some nice uses of this. If we choose a basis, then we get a matrix representation (every element is sent to some invertible matrix, and the group operation is preserved). It would be even nicer if this were an embedding, so that we could actually think of the elements of our group as matrices. We say a 1-1 representation is faithful.

We have an arsenal of examples already, but probably don’t even realize it. The trivial representation is just sending every group element to the identity transformation. What I like to call the “almost trivial representation” (term my own, so don’t use this somewhere and expect people to know what you are talking about) is to embed G in S_n for some large n, which we know is possible. Then under this embedding, a group element is either even or odd. If it is even, send it to the identity transformation. If it is odd, send it to the negative identity transformation. Probably a better way to say this is: S_n has the representation \phi(x)=sgn(x)1_V, and we restrict it to G.
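As a quick sanity check that the sign really is multiplicative (so that \phi(x)=sgn(x)1_V is a homomorphism), here is a short Python sketch for S_3. Permutations are encoded as tuples p with p[i] the image of i, which is just a convention assumed for this example, and the sign is computed by counting inversions.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation p (tuple with p[i] = image of i),
    computed from the parity of its number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """Composition p∘q: first apply q, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

S3 = list(permutations(range(3)))
is_hom = all(sign(compose(p, q)) == sign(p) * sign(q) for p in S3 for q in S3)
print(is_hom)  # True
```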

Let’s define the group ring (aka the group algebra). Let k be a field and G a finite group. Then kG is the set \{\sum_{g\in G} a_g g : a_g\in k\}. In a sense, we have formed a vector space over k with basis G. Our addition is \sum a_g g+\sum b_g g=\sum (a_g+b_g) g. Our multiplication requires slightly more effort: (\sum_g a_g g)(\sum_h b_h h)=\sum_{g,h} a_g b_h gh = \sum_x (\sum_{gh=x}a_g b_h) x.
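The multiplication rule is just “multiply everything out and collect terms,” which is easy to sketch in Python. In this sketch an element of kG is stored as a dict from group elements to coefficients, and the group law is passed in as a function; both choices, and the names used, are assumptions for illustration. The example group is \mathbb{Z}/4 written additively, so the element g^j is stored as the integer j.

```python
from collections import defaultdict

def group_ring_product(u, v, op):
    """Product in kG: (sum a_g g)(sum b_h h) = sum over x of
    (sum over gh = x of a_g * b_h) x, where op(g, h) is the group law."""
    out = defaultdict(int)
    for g, a in u.items():
        for h, b in v.items():
            out[op(g, h)] += a * b
    return {x: c for x, c in out.items() if c != 0}

def add_mod_4(g, h):
    """Group law of Z/4, standing in for multiplication of powers of g."""
    return (g + h) % 4

u = {0: 1, 1: 2}    # 1 + 2g
v = {1: 3, 3: -1}   # 3g - g^3
product = group_ring_product(u, v, add_mod_4)
print(product)      # represents -2 + 3g + 6g^2 - g^3
```

Here the term 2g times -g^3 lands on g^4=1, which is where the constant -2 comes from.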

This structure will be of great importance soon, but I don’t want to throw too much out there at once. Remember, we’re going to go slowly. But if you want to think ahead, the first thing of tomorrow’s post will be that a representation equips the vector space V with the structure of a \mathbb{C}G-module. And there was nothing special about \mathbb{C} there. A k-representation equips V with the structure of a kG-module.