Handle Decomposition

Today I’ll just prove that a Morse function will give a handle decomposition of a closed manifold. Let’s use all the notation already set up (meaning critical points, values, attaching maps, dimension, Morse function, gradient-like vector field, etc).

We just induct on the subscripts of critical points. We’ve already done the base case (it is a min and hence a 0-handle from here). So we just need to show that if M_{t} is a handlebody for t\in (c_{i-1}, c_i), then M_{c_i+\varepsilon} is a handlebody with the appropriate handle attached.

So we’ve assumed that we have some decomposition M_{c_{i-1}+\varepsilon}\cong \mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1}). We also know that we attach a handle of index \lambda_i when crossing c_i, so M_{c_i+\varepsilon} is diffeomorphic to M_{c_i-\varepsilon} with a \lambda_i-handle attached via an attaching map \phi: \partial D^{\lambda_i}\times D^{m-\lambda_i}\to \partial M_{c_i-\varepsilon}.

Note that [c_{i-1}+\varepsilon, c_i-\varepsilon] contains no critical values, so by flowing along X we get a diffeo M_{c_{i-1}+\varepsilon}\cong M_{c_i-\varepsilon}. Let \psi:M_{c_{i-1}+\varepsilon}\to M_{c_i-\varepsilon} be this diffeo.

So by inductive hypothesis, M_{c_i-\varepsilon}\cong \mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1}), so we can assume \psi actually maps from the handlebody to M_{c_i-\varepsilon}. Now by composing we get our actual attaching map (note that before now the handle was attached to M_{c_i-\varepsilon} and not the handlebody itself).

i.e. \psi^{-1}\circ \phi : \partial D^{\lambda_i}\times D^{m-\lambda_i}\to \partial (\mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1})). So let \phi_i=\psi^{-1}\circ \phi, and we get that M_{c_i+\varepsilon}\cong \mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1}, \phi_i), so we are done.
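To make the bookkeeping of the induction concrete, here is a minimal Python sketch. It only records the order and the indices of the handles, not the attaching maps \phi_i (those depend on the gradient-like vector field). The names `CriticalPoint` and `handle_sequence` are made up for illustration, and the sample data is the height function on the unit S^2, with one index 0 and one index 2 critical point.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CriticalPoint:
    value: float   # critical value c_i = f(p_i)
    index: int     # index lambda_i of p_i

def handle_sequence(critical_points):
    """Return the handle indices in the order they get attached.

    Assumes the critical values are distinct, which we can always
    arrange by raising and lowering critical points.
    """
    ordered = sorted(critical_points, key=lambda p: p.value)
    values = [p.value for p in ordered]
    if len(set(values)) != len(values):
        raise ValueError("critical values must be distinct")
    handles = []            # indices of the handles attached so far
    for p in ordered:       # crossing c_i attaches a lambda_i-handle
        handles.append(p.index)
    return handles

# Height function on the unit S^2: south pole (value -1, index 0),
# north pole (value 1, index 2).
sphere = [CriticalPoint(1.0, 2), CriticalPoint(-1.0, 0)]
print(handle_sequence(sphere))
```

The list that comes out is the "ascending" order of the proof: a 0-handle first, then whatever handles the later critical points contribute.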

So I sort of dragged on longer than probably necessary there, since there was essentially nothing new. It was just being pedantic about the diffeo of the manifold and the handlebody.

There are some subtleties that should be pointed out, though. The index of the critical point did determine the index of the handle, and we went in “ascending” order. The other much more important and also more subtle point is that the choice of gradient-like vector field was how we constructed the attaching map. So even the same Morse function with a different choice of gradient-like vector field could actually give a “different” handle decomposition when considering attaching maps as part of the data.


Handlebodies II

Let’s think back to our example to model our \lambda-handle (where the critical point is not a max or min). Well, it was a “saddle point”. So it consisted of both a downward arc and an upward arc. If you got close enough, it would probably look like D^1\times D^1.

Well, generally this will fit with our scheme. An n-handle looked like D^n … or better yet D^n\times 0, and a 0-handle looked like 0\times D^n, so maybe it is the case that a \lambda-handle looks like D^\lambda\times D^{n-\lambda}. Let’s call D^\lambda\times 0 the core of the handle, and 0\times D^{n-\lambda} the co-core.

By doing the same trick of writing out what our function looks like at a critical point of index \lambda in some small enough neighborhood using the Morse lemma, we could actually prove this, but we’re actually more interested now in how to figure out what happens with M_t as t crosses this point.

By that I mean, it is time to figure out what exactly it is to “attach a \lambda-handle” to the manifold.

Suppose as in the last post that c_i is a critical value of index \lambda. Then I propose that M_{c_i+\varepsilon} is diffeomorphic to M_{c_i-\varepsilon}\cup D^\lambda\times D^{m-\lambda} (sorry again, recall my manifold is actually m-dimensional with n critical values).

I wish I had a good way of making pictures to get some of the intuition behind this across. I’ll try in words. A 1-handle for a 3-manifold will be D^1\times D^2, i.e. a solid cylinder. So we can think of this as literally a handle: we bend the cylinder and attach its two ends to the existing manifold. This illustration is quite useful in bringing up a concern we should have. Attaching in this manner is going to create “corners” and we want a smooth manifold, so we need to make sure to smooth it out. But we won’t worry about that now, and we’ll just call the smoothed out M_{c_i-\varepsilon}\cup D^\lambda\times D^{m-\lambda}, say M'.

Let’s use our gradient-like vector field again. Let’s choose \varepsilon small enough so that we are in a coordinate chart centered at p_i such that f=c_i-x_1^2-\cdots - x_\lambda^2 + x_{\lambda +1}^2+\cdots + x_m^2 is in standard Morse lemma form.

Let’s see what happens on the core D^\lambda\times 0. At the center, f takes the critical value c_i and it decreases everywhere from there (as we move from 0, only the first \lambda coordinates change). This decreasing goes all the way to the boundary where it is c_i-\varepsilon. Thus it is the upside down bowl (of dimension \lambda). Likewise, on the co-core f starts at the critical value and increases (as in the right side up bowl) out to the boundary of an (m-\lambda)-disk at a value c_i+\delta (where 0<\delta<\varepsilon).
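Here is a small numerical illustration of the two bowls, assuming the standard Morse form with the critical value added back in. The function name `morse_form` and the sample values m=3, \lambda=1, c_i=0 are my own choices for the sketch, nothing more.

```python
def morse_form(x, lam, c):
    """f = c - x_1^2 - ... - x_lam^2 + x_{lam+1}^2 + ... + x_m^2."""
    return c - sum(t * t for t in x[:lam]) + sum(t * t for t in x[lam:])

m, lam, c = 3, 1, 0.0   # a 1-handle in a 3-manifold, critical value 0 (sample values)

# Walk outward from the critical point along the core and the co-core.
radii = [0.0, 0.1, 0.2, 0.3]
core_values = [morse_form([r] + [0.0] * (m - 1), lam, c) for r in radii]
cocore_values = [morse_form([0.0] * lam + [r] + [0.0] * (m - lam - 1), lam, c)
                 for r in radii]

print(core_values)     # starts at c and decreases: the upside down bowl
print(cocore_values)   # starts at c and increases: the right side up bowl
```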

Let's carefully figure out the attaching procedure now. If we think of our 3-manifold for intuition, we want to attach D^\lambda\times D^{m-\lambda} to M_{c_i-\varepsilon} by pasting \partial D^\lambda\times D^{m-\lambda} along \partial M_{c_i-\varepsilon}.

So I haven't talked about attaching procedures in this blog, but basically we want a map \phi: \partial D^\lambda\times D^{m-\lambda}\to \partial M_{c_i-\varepsilon} and then forming the quotient space of the disjoint union under the relation of identifying p\in \partial D^\lambda\times D^{m-\lambda} with \phi (p). Sometimes this is called an adjunction space.

So really \phi is a smooth embedding of a thickened sphere S^{\lambda - 1}, since \partial D^\lambda=S^{\lambda-1}. And the number of dimensions in which it was thickened is m-\lambda. Think about the "handle" in the 3-dimensional 1-handle case. We gave the two endpoints of a line segment (two points = S^0) a 2-dimensional thickening by a disk.

Now it is the same old trick to get the diffeo. The gradient-like vector field, X, flows from \partial M' to \partial M_{c_i+\varepsilon}, so just multiply X by a smooth function that will make M' match M_{c_i+\varepsilon} after some time. This is our diffeomorphism and we are done.

Handlebodies I

We now come to the main point of all these Morse theory posts. We want to somehow figure out what a closed manifold looks like based on a Morse function that it admits (who knows how long I’ll develop this theory, maybe we’ll even get to how Smale proved the Poincare Conjecture in dimensions greater than or equal to 5).

Suppose M is closed and f:M\to\mathbb{R} a Morse function. We’ll use the convenient notation M_t=\{p\in M : f(p)\leq t\}. So again, with the height analogy, as t increases, we will be looking at the entire manifold up to that height. Since M is compact, there is some finite interval [a,b] such that M_a=\emptyset and M_b=M.

Note that with essentially no modification, we have already proved the Theorem that if [c,d] contains no critical values, then M_c\cong M_d. So really, the point is to now figure out what happens as we pass through the critical values.

First off, there are only finitely many critical points, and we can assume that each of these has distinct critical values by raising and lowering critical values. So if p_0, \ldots, p_n are the critical points and c_k=f(p_k), we can order the indices so that c_0 < c_1 < \cdots < c_n.

To be explicit, c_0 is the min, so M_t=\emptyset for t < c_0 and M_t=M for t greater than c_n, since c_n is the max (also, wordpress hates inequalities, or me, I haven't decided yet, but it always cuts out lots of stuff and I just have to write the inequality in words).

These two critical points would be a nice place to start our examination. By the Morse lemma and the fact that a min has index 0, we know that there exists a neighborhood of p_0 on which f=x_1^2+\cdots + x_m^2+c_0 (Alright, I’m sorry about that, but I just realized I have n critical points, so the dimension of my manifold is now m).

More explicitly there is some \varepsilon>0 such that M_{c_0+\varepsilon}=\{(x_1, \ldots , x_m) : x_1^2+\cdots + x_m^2\leq \varepsilon\}\cong B^m. So if we are thinking of height (of a 2-dim manifold), we’ll want to visualize this as a “bowl”: the bottom of the bowl is the min, the sides slope upward in every direction, and the rim is the boundary circle at height c_0+\varepsilon.

So note that the only thing we used about this critical point is that it had index 0. This shape is called a (m-dimensional) 0-handle.

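The reverse happens at our max. Since the critical point has index m, in a chart around p_n we have f=c_n-x_1^2-\cdots -x_m^2, so inside that chart M_{c_n-\varepsilon}=\{(x_1, \ldots , x_m) : x_1^2+\cdots +x_m^2\geq \varepsilon\}. What gets added as we pass c_n is the ball x_1^2+\cdots +x_m^2\leq \varepsilon. This is an m-handle and thinking in 2-d height, it is a downward facing bowl.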

Again, there is nothing special about being the absolute max, any index m critical point will locally be an m-handle.

Index k critical points where k\neq 0,m are more complicated so I’ll leave those for next time.

Now we have a nice overview of how this will work. We just need to figure out what a k-handle looks like, then as t increases through a critical value with index k, M_t will “attach a k-handle”. When we are not near a critical value, the M_t will not change diffeomorphism-type. We just need to make this a little more precise next time (or maybe even the time after).

Altering the Critical Points

I officially have a new favorite search for which someone found this blog: How to write a Japanese satire.

Let’s introduce a new term. Two Morse functions are considered equivalent if they have the same critical points and same index at each critical point.

The hope here is that two equivalent Morse functions will give the same topological data about our manifold, and so we want to develop techniques of altering our Morse function to something extremely nice to work with, but having it be equivalent to the original one.

Our first excursion into this technique is the following: If M is a compact manifold and f is a Morse function on M, then we can find an equivalent Morse function g such that all the critical values are distinct.

If we’re going back to the height intuition, this is the technique that corresponds to “raising” or “lowering” critical points. So if you have two strange things happening at the same height (two mountain peaks that have the same height), the idea is sort of that you can slightly move the manifold around so that one is now higher than the other. Of course, we won’t actually move the manifold in any real sense, we’re going to construct the function.

This is going to be really nice, because it says that we can always get a Morse function in which only a single “change” can happen at any given height.

We’ll do this by first proving a Lemma which does all the work for us. Let f be our Morse function, and p a critical point. Then there is some \varepsilon>0 such that for all c\in (-\varepsilon, \varepsilon) there is an equivalent Morse function h that has the same critical values as f, except for h(p)=f(p)+c.

The arguments here are essentially the same as in previous posts, so I’ll be a little looser and only outline the proof.

Since the critical points are isolated we can take a small coordinate chart centered at p that contains no other critical points. Now let \psi be a bump function that is 1 on some small neighborhood of p and dies to zero before getting to the edge of the chart.

Then we define h_c=f+c\psi. We definitely have that all the critical points of f are still critical points of h_c and since on a neighborhood of any of those points the functions either agree or differ by adding a constant, they have the same index. Also, h_c(p)=f(p)+c, so we have constructed our desired function as long as we don’t have any extra critical points.

But in the same way as before, \Big|Dh_c\Big|=\Big|Df+cD\psi\Big|\geq \delta-|c|a>0 for all |c|<\varepsilon where \varepsilon=\delta/a, since we're only concerned with the compact set on which \psi is decaying, where \Big|Df\Big| has a positive min \delta and \Big|D\psi\Big| has a finite max a. Thus we do not gain any critical points in that set and we are done.

To get to the whole theorem all we need to do is note that there are only finitely many critical points (since compact). So if any of the values are shared, we can use the lemma to give an equivalent Morse function with shifted critical value, where we shift by a small enough value that it can't make it to any other critical value. We only have to apply this a finite number of times.
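Here is a one-dimensional sketch of the lemma in Python, under assumptions that are all mine: the sample function f(x)=x^4-2x^2 (two minima at x=\pm 1 sharing the value -1), a bump \psi built from \exp(-1/t), and the shift c=0.1. The derivative of \psi is only approximated numerically, so treat the printed \varepsilon as an estimate.

```python
import math

def g(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    """Smooth function equal to 0 for t <= 0 and 1 for t >= 1."""
    return g(t) / (g(t) + g(1.0 - t))

def f(x):
    return x**4 - 2 * x**2            # minima at x = -1 and x = 1, both with value -1

def df(x):
    return 4 * x**3 - 4 * x

def psi(x):
    """Bump: 1 for |x - 1| <= 0.25, 0 for |x - 1| >= 0.5."""
    return smooth_step((0.5 - abs(x - 1.0)) / 0.25)

def dpsi(x, step=1e-6):
    return (psi(x + step) - psi(x - step)) / (2 * step)   # numerical derivative

def h(x, c):
    return f(x) + c * psi(x)

# Grid offset by half a step so it never lands exactly on -1, 0 or 1.
grid = [-2.0 + (k + 0.5) * 0.001 for k in range(4000)]

# delta = min |Df| and a = max |D psi| on the set where psi is decaying.
decay = [x for x in grid if 0.25 <= abs(x - 1.0) <= 0.5]
delta = min(abs(df(x)) for x in decay)
a = max(abs(dpsi(x)) for x in decay)
eps = delta / a

c = 0.1
dh = [df(x) + c * dpsi(x) for x in grid]
sign_changes = sum(1 for u, v in zip(dh, dh[1:]) if u * v < 0)

print(eps, sign_changes, h(-1.0, c), h(1.0, c))
```

If c stays below the estimated \varepsilon, the derivative of h_c changes sign the same number of times as the derivative of f, so no critical points were gained, while the value at x=1 has moved up by c and the value at x=-1 has not moved.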

Gradient-Like Vector Fields Exist

Now we want to start building some technique that will allow us to figure out what our closed manifold looks like based on the Morse functions it admits.

We’ll call a vector field X, a gradient-like vector field for f, if X\cdot f>0 away from critical points, and if p\in M is a critical point of index \lambda, then there is a coordinate neighborhood about p such that f has the standard form as in the Morse lemma, and X=-2x_1\frac{\partial}{\partial x_1}-\cdots - 2x_{\lambda}\frac{\partial}{\partial x_\lambda}+2x_{\lambda+1}\frac{\partial}{\partial x_{\lambda+1}}+\cdots + 2x_m\frac{\partial}{\partial x_m} (i.e. it is the gradient in this neighborhood).
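As a quick sanity check that the two conditions in the definition are compatible, we can compute X\cdot f inside the Morse chart, where f=f(p)-x_1^2-\cdots -x_\lambda^2+x_{\lambda+1}^2+\cdots +x_m^2. This is just a routine calculation:

```latex
\begin{aligned}
X\cdot f &= \sum_{i=1}^{\lambda}(-2x_i)\frac{\partial f}{\partial x_i}
          + \sum_{i=\lambda+1}^{m}(2x_i)\frac{\partial f}{\partial x_i} \\
         &= \sum_{i=1}^{\lambda}(-2x_i)(-2x_i) + \sum_{i=\lambda+1}^{m}(2x_i)(2x_i) \\
         &= 4\left(x_1^2+\cdots+x_m^2\right)
\end{aligned}
```

So X\cdot f is positive everywhere in the chart except at the critical point itself, where it vanishes.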

Intuitively, if we think back to our example, we visualize Morse functions as “height functions”. So we are attempting to construct in some sense an everywhere “upward” pointing vector field. If we’re thinking of the entire manifold flowing along this, then the only places where it is allowed to get “stuck” is at the critical points of f.

The theorem is that there always exists a gradient-like vector field for a Morse function on a compact manifold.

Proof: As before, let \{U_i\}_1^k be a finite subcover of coordinate charts, and \{K_i\}_1^k be a compact refinement. Since the critical points are isolated (immediate corollary to the Morse lemma), there can only be finitely many since our manifold is compact. So we can assume that each critical point has a neighborhood small enough so that it is entirely contained in exactly one of the U_i, and that the U_i were chosen so that f has standard form in those coordinates.

Let \psi_i: U_i\to \mathbb{R} be a bump function for K_i supported in U_i. Then we get a smooth function on the entire manifold by letting \psi_i\equiv 0 outside of U_i.

Let X_i be the gradient of f on U_i. Let \displaystyle X=\sum_{j=1}^k \psi_jX_j. The claim is that this is our gradient-like vector field for f.

Let’s check X\cdot f at non-critical points. If x\in M is not a critical point, then since the K_i cover M we have x\in K_i for some i, so \psi_i(x)=1 and (\psi_i X_i\cdot f)(x)>0 since X_i is the gradient. Every other term (\psi_j X_j\cdot f)(x) of the sum is \geq 0: it vanishes if x\notin U_j, and otherwise \psi_j(x)\geq 0 and X_j is again a gradient. Thus (X\cdot f)(x)>0.

The other condition holds by construction: each critical point has a neighborhood that is contained in precisely one of the U_i, so on that neighborhood f is in standard form and X=\psi_iX_i. After shrinking the neighborhood so that \psi_i\equiv 1 on it, this is of the correct form. Thus X is gradient-like for f.

As a preview of things to come, I’ll prove our first result about what our manifold looks like using Morse functions. This is often called the Regular Interval Theorem.

Suppose that f has no critical value in [a,b], then M_{[a,b]}=\{p\in M : a\leq f(p)\leq b\} is diffeomorphic to f^{-1}(a)\times [0,1].

Let X be gradient-like for f. Define \displaystyle Y=\frac{1}{X\cdot f}X which is smooth off of the critical points of f, but since M_{[a,b]} contains no critical points it is a smooth vector field there (in fact, on an open set containing M_{[a,b]}).

Let \theta^p(t) be an integral curve for Y starting at p\in f^{-1}(a). But now \displaystyle \frac{d}{dt}\Big|_{t=t_0}f(\theta^p(t))=\frac{d\theta^p}{dt}(t_0)(f)
\displaystyle = Y_{\theta^p(t_0)}(f)
\displaystyle = \frac{1}{X\cdot f}X\cdot f=1.

Thus, the integral curve continues along at constant speed 1 for the entire time it is in M_{[a,b]}. But it starts at f=a at time 0, so it reaches f=b at time t=b-a.

Thus h: f^{-1}(a)\times [0,b-a]\to M_{[a,b]} by (p,t)\mapsto \theta^p(t) is a diffeomorphism. But rescaling gives the diffeo to f^{-1}(a)\times [0,1].
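The key step, that f increases at unit speed along integral curves of Y, is easy to watch numerically. Below is a minimal sketch, assuming a sample function f(x,y)=x^2+2y^2 on the plane (my choice, not from the argument above), X equal to the honest gradient, and a plain RK4 integrator. The names `Y` and `flow` are just for this example.

```python
def f(x, y):
    return x * x + 2.0 * y * y       # sample function; only critical point is the origin

def Y(x, y):
    """Y = X / (X . f) with X = grad f, so that Y . f = 1."""
    gx, gy = 2.0 * x, 4.0 * y        # gradient of f
    norm2 = gx * gx + gy * gy        # X . f = |grad f|^2 > 0 away from the origin
    return gx / norm2, gy / norm2

def flow(x, y, total_time, steps=1000):
    """Integrate the integral curve of Y through (x, y) with classical RK4."""
    dt = total_time / steps
    for _ in range(steps):
        k1 = Y(x, y)
        k2 = Y(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = Y(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = Y(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

p = (1.0, 0.5)                       # a point on the level set f = 1.5
a, b = f(*p), 2.5
q = flow(*p, total_time=b - a)
print(f(*q))                         # should be close to b
```

Starting on the level set f=a and flowing for time b-a lands (up to integration error) on the level set f=b, which is exactly what makes (p,t)\mapsto \theta^p(t) a map onto M_{[a,b]}.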

This basically says that between critical points of a Morse function, we must have the manifold looking like a cylinder built off of a single slice of the function (if we’re thinking in terms of height, we can pick any height, and at anywhere between the two nearest critical heights, all the level sets will look the same).

Morse Functions Exist

The astute reader at this point may be getting a little anxious that despite the fact that I found Morse functions in two easy low dimensional cases, my eventual goal of saying very general things about manifolds by using Morse functions is going to rely on the fact that they exist.

If these things are really as powerful as I have been making them out to be, then it would seem that there probably isn’t an abundance of them. But surprisingly, it turns out that basically every smooth function is Morse.

Let M^n be a closed manifold, and g:M\to \mathbb{R} be a smooth function. Then there is a Morse function f:M\to\mathbb{R} arbitrarily close to g.

Recall Sard’s Theorem (I’m assuming some familiarity with it, which is probably not a good idea): The set of critical values of a smooth map f: U\to \mathbb{R}^n has measure zero in \mathbb{R}^n.

Now we’ll first need a lemma. Let U\subset \mathbb{R}^n be an open set and f:U\to\mathbb{R} a smooth function. Then there are real numbers \{a_k\} such that f(x_1, \ldots, x_n)-(a_1x_1+a_2x_2+\cdots + a_nx_n) is a Morse function on U. We can also choose \{a_k\} to be arbitrarily small in absolute value.

Define h=Jac(f)^T (a smooth map h:U\to\mathbb{R}^n). Then for any p\in U, Jac(h)\Big|_p is the Hessian H_f(p). Thus, p is a critical point of h iff det(H_f(p))=0.

By Sard’s Theorem, we can choose a=(a_1, \ldots , a_n)\in\mathbb{R}^n where each a_k has arbitrarily small absolute value such that a is not a critical value of h.

The claim is that \overline{f}(x_1, \ldots , x_n)=f(x_1, \ldots, x_n)-(a_1x_1+\cdots + a_nx_n) is a Morse function on U.

Well, if p is a critical point of \overline{f}, then since \frac{\partial \overline{f}}{\partial x_i}\Big|_p=\frac{\partial f}{\partial x_i}\Big|_p - a_i=0, by the definition of h, we get h(p)=a.

But we chose a to not be a critical value of h. Thus, p is not a critical point of h. So as noted, det(H_f(p))\neq 0. But H_f(p)=H_{\overline{f}}(p), so p is a non-degenerate critical point. Since p was an arbitrary critical point, all critical points are non-degenerate and hence \overline{f} is Morse, completing the proof of the Lemma.
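The simplest example I can think of for this lemma is f(x)=x^3 on U=\mathbb{R} (this example is mine, chosen only for illustration). Here h=f'=3x^2, whose only critical value is 0, so any a\neq 0 is allowed. The sketch below, with made-up helper names, checks that x^3-ax is Morse for small non-zero a while x^3 itself is not.

```python
import math

def critical_points(a):
    """Critical points of fbar(x) = x**3 - a*x, i.e. solutions of 3x^2 = a."""
    if a < 0:
        return []
    if a == 0:
        return [0.0]
    r = math.sqrt(a / 3.0)
    return [-r, r]

def second_derivative(x):
    return 6.0 * x                   # Hessian of fbar, the same as the Hessian of x**3

def is_morse(a):
    """True if every critical point of x**3 - a*x is non-degenerate."""
    return all(second_derivative(x) != 0.0 for x in critical_points(a))

print(is_morse(0.0))     # the unperturbed x**3
print(is_morse(1e-6))    # a tiny linear perturbation
print(is_morse(-1e-6))   # no critical points at all, so vacuously Morse
```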

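We also need another Lemma. Let K\subset M be a compact subset. If g:M\to\mathbb{R} has no degenerate critical points in K, then we can choose \varepsilon >0 small enough so that any function within \varepsilon of g in the C^2 sense (meaning its values, first derivatives and second derivatives are all within \varepsilon of those of g on K) also has no degenerate critical points in K.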

Since our manifold is closed, it is compact. So we can choose a finite subcover of coordinate charts, and compactly refine it (I’ll do this construction if someone asks in the comments), so that \{U_i\}_1^m cover M and there are compact sets K_i\subset U_i such that \cup K_i=M.

But with this, we can look at any of the U_k, and in these coordinates, g has no degenerate critical points in K\cap K_k (alright, that was probably a poor choice of notation) iff \displaystyle\Big|\frac{\partial g}{\partial x_1}\Big|+\cdots + \Big|\frac{\partial g}{\partial x_n}\Big|+\Big| det(H_g)\Big|>0 for every point in K\cap K_k.

But for a small enough \varepsilon we can definitely still make that inequality hold for any C^2 approximation. Thus we have proved the lemma.

Now let’s do the actual existence proof. Take the U_i, K_i as before. We will inductively build our C^2 approximations on C_l=K_1\cup \cdots \cup K_l. Our base step is to build f_0 on C_0=\emptyset, so we’re done.

For our inductive hypothesis, suppose we have f_{l-1}:M\to\mathbb{R} having no degenerate critical points in C_{l-1}.

Let’s work with the coordinate neighborhood U_l with coordinates (x_i). By the first lemma, there are arbitrarily small numbers \{a_i\} so that f_{l-1}(x_1, \ldots , x_n)-(a_1x_1+\cdots + a_nx_n) is Morse on U_l. But note, we only have a definition on U_l and we need one everywhere.

Let \psi be a bump function that is 1 on K_l and supported in V, where K_l\subset V\subset U_l.

Define f_l=\begin{cases} f_{l-1}-\psi\cdot (a_1x_1+\cdots + a_nx_n) & in \ U_l \\  f_{l-1} & outside \ V\end{cases}.

(So I have this same cases problem again, just ignore the “line break” symbol, it is actually readable this time).

This gives us a nice well-defined function on all of M (just need to check the overlaps). Also f_l is our first lemma function on K_l, so it is Morse on K_l and hence has no degenerate critical points there.

Since 0\leq \psi \leq 1 (and we’re on a compact set), we can make \{a_i\} small enough so that f_l is an arbitrarily close C^2 approximation of f_{l-1} (I won’t do this since it is fairly long and tedious, but quite straightforward for the reasons I gave).

But now by the second lemma, since f_{l-1} has no degenerate critical points in C_{l-1}, we have that f_l has no degenerate critical points in C_{l-1} either. We already checked on K_l, and thus there are no deg. critical points on C_{l-1}\cup K_l=C_l.

Thus inductively we can get a Morse function on all of M that is C^2-close to our original smooth function.

The Morse Lemma

Today we prove what is known as The Morse Lemma. It tells us exactly what our Morse function looks like near its critical points.

Let p\in M be a non-degenerate critical point of f:M\to \mathbb{R}. Then we can choose coordinates about p, (x_i), such that in these coordinates f=-x_1^2-x_2^2-\cdots -x_\lambda^2+x_{\lambda+1}^2+\cdots +x_n^2+f(p). Moreover, \lambda is the index of the critical point. (Note that 0\mapsto f(p)).

Proof: Choose local coordinates, (x_i), centered at p. Without loss of generality f(p)=0 by replacing f with f-f(p). Thus in coordinates, since p corresponds to 0, f(0)=0 (it is a little sloppy, but I’ll probably call the actual function and the function in coordinates the same thing and go back and forth).

By a general theorem of multi-variable calculus (I don’t know if it has a name, it might be Taylor’s theorem? I always get confused at how much is actually included in that), we have smooth functions g_1, \ldots, g_n such that f(x_1, \ldots, x_n)=\sum_{i=1}^n x_ig_i(x_1, \ldots, x_n) and \displaystyle \frac{\partial f}{\partial x_i}\Big|_0=g_i(0).

But 0 is a critical point of f, so g_i(0)=0 and we can apply the theorem again to each g_i. We’ll suggestively call the smooth functions g_k(x_1, \ldots, x_n)=\sum_{i=1}^n x_i h_{ki}(x_1, \ldots, x_n).

Thus, we now have \displaystyle f=\sum_{k,i}x_kx_i h_{ki}. Let \displaystyle H_{ki}=\frac{(h_{ki}+h_{ik})}{2}.

Then \displaystyle f=\sum_{k, i}x_kx_i H_{ki}, and H_{ki}=H_{ik}.

But in that form we see that the second partial derivatives are \displaystyle \frac{\partial^2 f}{\partial x_k \partial x_i}\Big|_0=2H_{ki}(0).

By assumption 0 is a non-degenerate critical point, so det(H_{ki}(0))\neq 0 and hence we can apply a linear transformation to our current coordinates and get that \frac{\partial^2 f}{\partial x_1^2}\Big|_0\neq 0. Thus H_{11}(0)\neq 0.

Now H_{11} is continuous, so that means it is non-zero in a neighborhood of 0.

Let (y_1, x_2, \ldots, x_n) be new coordinates near 0 where y_1=\sqrt{|H_{11}|}\left(x_1+\sum_{i=2}^n x_i\frac{H_{1i}}{H_{11}}\right). (Note this is actually a coordinate system, since the determinant of the Jacobian of the transformation at 0 is \sqrt{|H_{11}(0)|}, which is non-zero).

Now \displaystyle y_1^2=|H_{11}|\left(x_1+\sum_{i=2}^nx_i \frac{H_{1i}}{H_{11}}\right)^2
= H_{11}x_1^2 + 2\sum_{i=2} x_1x_i H_{1i} +\left(\sum_{i=2} x_i H_{1i}\right)^2/H_{11} if H_{11}>0, and the same thing with minus signs everywhere if H_{11} is negative.

Thus the function is y_1^2+\sum_{i,j=2}x_ix_jH_{ij}-\left(\sum_{i=2} x_i H_{1i}\right)^2/H_{11} if H_{11}>0 or
-y_1^2 +\sum_{i,j=2} x_ix_j H_{ij} -\left(\sum_{i=2}x_i H_{1i}\right)^2/H_{11} otherwise.

(I awkwardly wrote this with words, because I couldn’t get cases to look right, and was having weird errors I couldn’t figure).

Now just isolate the stuff after the \pm y_1^2. It satisfies the same conditions as f, but has fewer variables, so we can induct on the number of variables until we have f(y_1, \ldots , y_n)=-y_1^2-\cdots - y_\lambda^2 +y_{\lambda +1}^2+\cdots +y_n^2.

And since the plus and minus signs came from changing basis to put the Hessian into diagonal form with plus and minus 1’s, the number of minus signs is indeed the index.
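If you would rather see the completing-the-square step on actual numbers, here is a sketch for the simplest case where the H_{ki} are constants, so f is a pure quadratic form. That restriction, the function names, and the sample matrix are all my own assumptions. In the proof the H_{ki} are functions and a linear change of coordinates may be needed first, which this sketch does not attempt.

```python
def split_off_first(H, x):
    """One step of the proof for a constant symmetric matrix H.

    Returns (sign, y1, rest) with f(x) = sign * y1**2 + rest, where
    f(x) = sum_{k,i} x_k x_i H[k][i].  Assumes H[0][0] != 0.
    """
    n = len(x)
    h11 = H[0][0]
    lin = sum(x[i] * H[0][i] for i in range(1, n))        # sum_{i>=2} x_i H_{1i}
    y1 = abs(h11) ** 0.5 * (x[0] + lin / h11)
    rest = sum(x[i] * x[j] * H[i][j] for i in range(1, n) for j in range(1, n))
    rest -= lin * lin / h11
    return (1 if h11 > 0 else -1), y1, rest

def index(H):
    """Count minus signs by repeating the step on the remaining variables.

    Assumes every pivot met along the way is non-zero.
    """
    H = [row[:] for row in H]
    minus = 0
    while H:
        h11 = H[0][0]
        if h11 == 0:
            raise ValueError("zero pivot: change coordinates first")
        if h11 < 0:
            minus += 1
        # Matrix of the leftover form: H_ij - H_1i H_1j / H_11 for i, j >= 2.
        H = [[H[i][j] - H[0][i] * H[0][j] / h11 for j in range(1, len(H))]
             for i in range(1, len(H))]
    return minus

H = [[1.0, 2.0], [2.0, 1.0]]          # f = x1^2 + 4 x1 x2 + x2^2
x = [0.3, -0.7]
f_val = sum(x[k] * x[i] * H[k][i] for k in range(2) for i in range(2))
sign, y1, rest = split_off_first(H, x)
print(f_val, sign * y1**2 + rest, index(H))
```

For this sample matrix the first pivot is positive and the leftover form in x_2 is negative, so there is exactly one minus sign.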

The proof of this tended to be sort of tedious to check everything, so don’t worry if you didn’t go through it. I don’t think there is really any insight you get from going through it. This is one of those rare instances that I think the result is more important than the proof.

Now we have really good reason to believe the index will be n or 0 if we are at a local max or min. What does a max or min look like near the point? Well, it slopes all in the same direction, i.e. it will locally look like a sphere. But this is exactly what the Morse lemma tells us about index n and 0 critical points. We’ll make this more precise later.

I wasn’t sure how I was going to proceed. My two options seemed to be to build the Morse theory I need for Lefschetz, and then do Lefschetz, then come back to Morse theory. But I think I’m just going to continue as far as I want to go ignoring what is needed for the Hyperplane Theorem, then reference what I need.

A Better Example

The example I gave last time was awful I’ve realized. I need something a little more complicated to better motivate why we’d believe some of these things, and to illustrate what happens in certain situations.

So let’s take a surface embedded in \mathbb{R}^3 given by the equation z=x^2(x+1)-y^2. It is a “mountain landscape”:

(Created by WolframAlpha)

It might be hard to tell, but there is the one peak, and it forever decreases to the general left, and forever increases to the general right.

We have a global chart to work with. Our Morse function will again be the “height function”. So f(x,y,z)=z. We have two critical points. One will occur when we reach the “saddle point” at z=0 and one when we reach the peak of the mountain at z=4/27. Since this is a conceptual example, I won’t go through all the technical stuff to show that this is actually a Morse function.

Now as we stated before, at a non-critical value, i.e. a regular value, the level set is an embedded submanifold. Thus if c<0, then f(x,y,z)=c is something that vaguely looks like:


This is because it is below the saddle point. As the height increases to 0, our level set starts to close in, and when we reach f(x,y,z)=0, we get:


This is not an embedded submanifold, because the point of intersection is not locally Euclidean. Continuing up, we get that 0<c<4/27 will look something like:


Then we hit the critical value c=4/27:


It doesn’t show up on the graph, but there is a point at (-2/3, 0) which is why this one is not a manifold. There would be no well-defined dimension since it is a 0-dimensional object union a 1-dimensional object. Then everything above 4/27 looks like the last picture but without the dot.

Let’s analyze a little bit. Between critical values all of our embedded submanifolds seemed to be diffeomorphic to each other (you may be able to guess the proof of this even if you haven’t seen it). But when we cross a critical value we don’t even maintain homotopy type.

If anyone actually worked out the math behind this example, then they would see that we also now have an example of an index 1 critical point at the saddle. The top of the mountain is index 2, which still fits with the local min/max conjecture from last time.
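For anyone who does want the math behind this example, here is a short exact-arithmetic check in Python. The gradient and Hessian formulas in the comments were worked out by hand from z=x^2(x+1)-y^2, and the helper names are just for this sketch.

```python
from fractions import Fraction

def height(x, y):
    return x * x * (x + 1) - y * y

def gradient(x, y):
    return (3 * x * x + 2 * x, -2 * y)

def index_at(x, y):
    """Index at a critical point; the Hessian is diag(6x + 2, -2)."""
    diagonal = (6 * x + 2, -2)
    return sum(1 for d in diagonal if d < 0)

saddle = (Fraction(0), Fraction(0))
peak = (Fraction(-2, 3), Fraction(0))

for point in (saddle, peak):
    # point, gradient (should vanish), critical value, index
    print(point, gradient(*point), height(*point), index_at(*point))
```

The gradient vanishes at both points, the critical values come out as 0 and 4/27, and the indices are 1 at the saddle and 2 at the peak.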

I may post again later today on actual Morse theory, but I decided that I really needed a better example to reference once we got going.

What is a Morse function?

A Morse function is a smooth function from a smooth manifold, M, to \mathbb{R} that is in some sense “non-degenerate.”

Suppose p\in M, then define the Hessian of f at p, H_f(p): T_pM\times T_pM\to \mathbb{R} to be the bilinear form that sends \displaystyle \left(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right)\mapsto \frac{\partial^2f}{\partial x_i\partial x_j}\Big|_p.

So picking a basis, the Hessian is just the matrix of second partial derivatives.

Now we call f:M\to\mathbb{R} a Morse function if for any a\in\mathbb{R} we have f^{-1}((-\infty, a]) is compact, and for any critical point p of f (where the derivative is 0), H_f(p) is non-singular. So in matrix form, it would have non-zero determinant. In bilinear form terms, it is non-degenerate, or equivalently zero is not an eigenvalue.

The index at p is the index of the Hessian at p as a bilinear form. Recall that the index of a bilinear form is the maximal dimension of a linear subspace on which the form is negative definite. (This is sort of backwards of the intuition of counting how big the positive dimension can be. So note that a form is positive semidefinite iff it has index 0).

We really have to check that the property of being a “Morse function” actually is a well-defined concept for smooth manifolds. i.e. is it a diffeomorphism invariant?

We’ll work locally in coordinates. Suppose we have \phi : V\to U a diffeomorphism where \phi(q)=p. Define g= f\circ \phi (i.e. the change of coordinates of our so-called Morse function). The well-defined claim is that if p is a non-degenerate critical point of f, then q is a critical point of g and the Hessian of g at q is non-singular.

Well, the critical point claim is just the chain rule. Now we’ll actually compute the Hessian. I propose that it is H_g(q)=(D\phi(q))^T H_f(p) (D\phi(q)) to make it easier to follow.

We’ll do the right hand side first. The j-th column \displaystyle (H_f(p)D\phi(q))_j= \left(\sum_{l=1}^n \frac{\partial^2 f}{\partial x_1\partial x_l}(\phi(q))\frac{\partial \phi^l}{\partial x_j}(q), \ldots , \sum_{l=1}^n \frac{\partial^2 f}{\partial x_n\partial x_l}(\phi(q))\frac{\partial \phi^l}{\partial x_j}(q)\right).

Thus the i-j entry of the right side is multiplying on the left by the ith row of (D\phi(q))^T, which gives \displaystyle \sum_{k=1}^n \sum_{l=1}^n \frac{\partial^2 f}{\partial x_k \partial x_l}(p)\frac{\partial \phi^k}{\partial x_i}(q)\frac{\partial \phi^l}{\partial x_j}(q).

Now we’ll calculate the i-j entry of the left side and see if it is the same. So we’ll need the chain rule for partial derivatives.

\displaystyle (H_g(q))_{ij}=\frac{\partial^2}{\partial x_i \partial x_j}(f\circ \phi)(q)
\displaystyle = \frac{\partial}{\partial x_i}\sum_{k=1}^n\frac{\partial f}{\partial x_k}(\phi (q))\frac{\partial \phi^k}{\partial x_j}(q)
\displaystyle = \sum_{k=1}^n\frac{\partial}{\partial x_i}(\frac{\partial f}{\partial x_k}(p))\frac{\partial \phi^k}{\partial x_j}(q) + \sum_{k=1}^n\frac{\partial f}{\partial x_k}(p)\frac{\partial}{\partial x_i}(\frac{\partial \phi^k}{\partial x_j})(q)
= \displaystyle \sum_{k=1}^n \sum_{l=1}^n \frac{\partial^2 f}{\partial x_k \partial x_l}(p)\frac{\partial \phi^k}{\partial x_i}(q)\frac{\partial \phi^l}{\partial x_j}(q) +\sum_{k=1}^n\frac{\partial^2\phi^k}{\partial x_i \partial x_j}(q)\frac{\partial f}{\partial x_k}(p).

But that last line is the same as the right side with that extra plus stuff. But since f is critical at p, that term is zero, and the two sides are equal.

So there is no problem calling a smooth function Morse, but I also introduced the idea of the index of f at a point. Hopefully this doesn’t change under diffeomorphism. Let’s check.

Suppose index_f(p)=k. Then since \phi is a diffeo, D\phi is non-singular. But the index is a well-defined notion of a bilinear form at a point, so it is independent of choice of basis. Our previous calculation showed that H_g(q)=(D\phi(q))^TH_f(p)(D\phi(q)) which is just a change of basis, so index_g(q)=k as well.
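Here is a numerical spot check of the formula H_g(q)=(D\phi(q))^TH_f(p)(D\phi(q)), under assumptions of my own choosing: f(x,y)=x^2-y^2 with critical point p=(0,0), and a sample map \phi(u,v)=(u+v+u^2, v+uv), which is a diffeomorphism near q=(0,0) since D\phi(q) has determinant 1. The Hessian of g is approximated by finite differences, so the two sides agree only up to a small error.

```python
def f(x, y):
    return x * x - y * y                     # critical point at p = (0, 0)

def phi(u, v):
    return u + v + u * u, v + u * v          # sample local diffeo with phi(0, 0) = (0, 0)

def g(u, v):
    return f(*phi(u, v))

def hessian_at_origin(func, h=1e-3):
    """Central finite-difference Hessian of func at (0, 0)."""
    f0 = func(0.0, 0.0)
    fuu = (func(h, 0.0) - 2 * f0 + func(-h, 0.0)) / h**2
    fvv = (func(0.0, h) - 2 * f0 + func(0.0, -h)) / h**2
    fuv = (func(h, h) - func(h, -h) - func(-h, h) + func(-h, -h)) / (4 * h**2)
    return [[fuu, fuv], [fuv, fvv]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

D = [[1.0, 1.0], [0.0, 1.0]]                 # D phi at q = (0, 0), computed by hand
Dt = [[D[j][i] for j in range(2)] for i in range(2)]
Hf = [[2.0, 0.0], [0.0, -2.0]]               # Hessian of f at p

lhs = hessian_at_origin(g)                   # H_g(q), numerically
rhs = matmul(Dt, matmul(Hf, D))              # (D phi)^T H_f (D phi)
print(lhs)
print(rhs)
```

Both matrices have one positive and one negative eigenvalue (the determinant is negative), so the index is 1 before and after the change of coordinates.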

I don’t want to leave you without some sort of concrete idea of what is going on. So define f:S^2\to \mathbb{R} to be the “height function” (x_1, x_2, x_3)\mapsto x_3. If I’m not at the north or south pole, then I can write this function in one of the “side” coordinate patches, i.e. f(\sqrt{1-x_2^2-x_3^2}, x_2, x_3)=x_3. Hence in the coordinates (x_2, x_3) the Jacobian is (0, 1), which is non-zero. So every point that is not the north or south pole is a regular point.

The north and south poles are critical points. Now write f(u,v)=\sqrt{1-u^2-v^2} in the “north patch.” Then at the north pole we are at u=v=0. Thus H_f(N)=\left(\begin{matrix} -1 & 0 \\ 0 & -1 \end{matrix}\right). Not only does this tell us that the critical point is non-degenerate, but it tells us the index is 2. In fact, the index of the south pole is 0.
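In case the Hessian at the north pole looks like it came out of nowhere, here are the derivatives of f(u,v)=\sqrt{1-u^2-v^2} written out (my own computation, worth double-checking):

```latex
\begin{aligned}
\frac{\partial f}{\partial u} &= -\frac{u}{\sqrt{1-u^2-v^2}}, &
\frac{\partial f}{\partial v} &= -\frac{v}{\sqrt{1-u^2-v^2}}, \\
\frac{\partial^2 f}{\partial u^2} &= -\frac{1-v^2}{(1-u^2-v^2)^{3/2}}, &
\frac{\partial^2 f}{\partial v^2} &= -\frac{1-u^2}{(1-u^2-v^2)^{3/2}}, \\
\frac{\partial^2 f}{\partial u\,\partial v} &= -\frac{uv}{(1-u^2-v^2)^{3/2}}.
\end{aligned}
```

At u=v=0 the first derivatives vanish, the two pure second derivatives are both -1, and the mixed one is 0, which gives the matrix above.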

Some final points that our example might have just revealed. The index seems to actually give us some information. Note that we could have done the same thing for S^n\to \mathbb{R}. In this case, the two critical points are the same, but the indexes are 0 and n. Is it in fact the case that local mins of Morse functions have index 0 and local maxes on an n-manifold have index n? What does it even mean for a critical point to be something other than a local max or min (i.e. if the previous conjecture holds what is the meaning of an index strictly between 0 and n)? Non-critical values are regular values, and since f is a smooth map to a 1-manifold (\mathbb{R}), the level sets over them are codimension 1 properly embedded submanifolds (“hypersurfaces?”). What are the relations between these families of submanifolds as we cross critical values?

Alright. I think that is enough of a preview of what is coming up.