# Handle Decomposition

Today I’ll just prove that a Morse function will give a handle decomposition of a closed manifold. Let’s use all the notation already set up (meaning critical points, values, attaching maps, dimension, Morse function, gradient-like vector field, etc).

We just induct on the subscripts of critical points. We’ve already done the base case (it is a min and hence a 0-handle from here). So we just need to show that if $M_{t}$ is a handlebody for $t\in (c_{i-1}, c_i)$, then $M_{c_i+\varepsilon}$ is a handlebody with the appropriate handle attached.

So we’ve assumed that we have some decomposition $M_{c_{i-1}+\varepsilon}\cong \mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1})$. We also know that we attach a handle of index $\lambda_i$ when crossing $c_i$, so we do have a diffeo to a manifold $M_{c_i-\varepsilon}$ with a $\lambda_i$-handle attached with attaching map $\phi: \partial D^{\lambda_i}\times D^{m-\lambda_i}\to \partial M_{c_i-\varepsilon}$.

Note that $[c_{i-1}+\varepsilon, c_i-\varepsilon]$ contains no critical values, so by flowing along $X$ we get a diffeo $M_{c_{i-1}+\varepsilon}\cong M_{c_i-\varepsilon}$. Let $\psi:M_{c_{i-1}+\varepsilon}\to M_{c_i-\varepsilon}$ be this diffeo.

So by inductive hypothesis, $M_{c_i-\varepsilon}\cong \mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1})$, so we can assume $\psi$ actually maps from the handlebody to $M_{c_i-\varepsilon}$. Now by composing we get our actual attaching map (note that before now the handle was attached to $M_{c_i-\varepsilon}$ and not the handlebody itself).

i.e. $\psi^{-1}\circ \phi : \partial D^{\lambda_i}\times D^{m-\lambda_i}\to \partial (\mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1}))$. So let $\phi_i=\psi^{-1}\circ \phi$, and we get that $M_{c_i+\varepsilon}\cong \mathcal{H}(D^m;\phi_1, \ldots , \phi_{i-1}, \phi_i)$, so we are done.

So I sort of dragged on longer than probably necessary there, since there was essentially nothing new. It was just being pedantic about the diffeo of the manifold and the handlebody.

There are some subtleties that should be pointed out, though. The index of the critical point did determine the index of the handle, and we went in “ascending” order. The other much more important and also more subtle point is that the choice of gradient-like vector field was how we constructed the attaching map. So even the same Morse function with a different choice of gradient-like vector field could actually give a “different” handle decomposition when considering attaching maps as part of the data.
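The “ascending order” bookkeeping can be sketched in a few lines of Python. This is only an illustration under assumptions: the critical values below are made-up numbers for the upright torus height function (whose indices are 0, 1, 1, 2), and `handle_order` is a hypothetical helper name. It records the order in which handles get attached, and says nothing about the attaching maps.

```python
def handle_order(critical_points):
    """Sort (critical value, index) pairs by value and return the handle indices in order."""
    values = [value for value, _ in critical_points]
    if len(set(values)) != len(values):
        raise ValueError("critical values must be distinct")
    return [index for _, index in sorted(critical_points)]

# Hypothetical data: (critical value, index) pairs, listed out of order on purpose.
torus = [(3.0, 2), (-3.0, 0), (1.0, 1), (-1.0, 1)]
handles = handle_order(torus)
print(handles)
```

The first handle is always a 0-handle (the min) and the last one an m-handle (the max), which matches the induction above.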

# Handlebodies II

Let’s think back to our example to model our $\lambda$-handle (where the critical point is neither a max nor a min). Well, it was a “saddle point”. So it consisted of both a downward arc and an upward arc. If you got close enough, it would probably look like $D^1\times D^1$.

Well, generally this will fit with our scheme. An n-handle looked like $D^n$ … or better yet $D^n\times 0$, and a 0-handle looked like $0\times D^n$, so maybe it is the case that a $\lambda$-handle looks like $D^\lambda\times D^{n-\lambda}$. Let’s call $D^\lambda\times 0$ the core of the handle, and $0\times D^{n-\lambda}$ the co-core.

By doing the same trick of writing out what our function looks like at a critical point of index $\lambda$ in some small enough neighborhood using the Morse lemma, we could actually prove this, but we’re actually more interested now in how to figure out what happens with $M_t$ as $t$ crosses this point.

By that I mean, it is time to figure out what exactly it is to “attach a $\lambda$-handle” to the manifold.

Suppose as in the last post that $c_i$ is a critical value of index $\lambda$. Then I propose that $M_{c_i+\varepsilon}$ is diffeomorphic to $M_{c_i-\varepsilon}\cup D^\lambda\times D^{m-\lambda}$ (sorry again, recall my manifold is actually m-dimensional with n critical values).

I wish I had a good way of making pictures to get some of the intuition behind this across. I’ll try in words. A 1-handle for a 3-manifold, will be $D^1\times D^2$, i.e. a solid cylinder. So we can think of this as literally a handle that we will bend the cylinder into, and attach those two ends to the existing manifold. This illustration is quite useful in bringing up a concern we should have. Attaching in this manner is going to create “corners” and we want a smooth manifold, so we need to make sure to smooth it out. But we won’t worry about that now, and we’ll just call the smoothed out $M_{c_i-\varepsilon}\cup D^\lambda\times D^{m-\lambda}$, say $M'$.

Let’s use our gradient-like vector field again. Let’s choose $\varepsilon$ small enough so that we are in a coordinate chart centered at $p_i$ such that $f=c_i-x_1^2-\cdots - x_\lambda^2 + x_{\lambda +1}^2+\cdots + x_m^2$ is in standard Morse lemma form.

Let’s see what happens on the core $D^\lambda\times 0$. At the center, it takes the critical value $c_i$ and it decreases everywhere from there (as we move from 0, only the first $\lambda$ coordinates change). This decreasing goes all the way to the boundary where it is $c_i-\varepsilon$. Thus it is the upside down bowl (of dimension $\lambda$). Likewise, the co-core goes from the critical value and increases (as in the right side up bowl) to the boundary of a $m-\lambda$ disk at a value $c_i+\delta$ (where $0<\delta<\varepsilon$).
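Here is a small Python sketch of that behaviour for a 1-handle in a 3-manifold. The critical value 5 and the sample radii are arbitrary choices of mine, and `model_f` is just a name for the standard Morse lemma form.

```python
def model_f(x, lam, c=0.0):
    """Standard Morse-lemma form: minus squares on the first lam coordinates, plus on the rest."""
    return c - sum(t * t for t in x[:lam]) + sum(t * t for t in x[lam:])

lam, c = 1, 5.0   # index-1 critical point in a 3-manifold, critical value 5 (both arbitrary)
radii = [0.0, 0.1, 0.2, 0.3]

# Core D^1 x 0: move only in the first coordinate.
core_values = [model_f([r, 0.0, 0.0], lam, c) for r in radii]
# Co-core 0 x D^2: move only in the last two coordinates.
cocore_values = [model_f([0.0, r, 0.0], lam, c) for r in radii]
print(core_values)
print(cocore_values)
```

Both lists start at the critical value. The core values go down from there and the co-core values go up, which is the upside down bowl and the right side up bowl.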

Let's carefully figure out the attaching procedure now. If we think of our 3-manifold for intuition, we want to attach $D^\lambda\times D^{m-\lambda}$ to $M_{c_i-\varepsilon}$ by pasting $\partial D^\lambda\times D^{m-\lambda}$ along $\partial M_{c_i-\varepsilon}$.

So I haven't talked about attaching procedures in this blog, but basically we want a map $\phi: \partial D^\lambda\times D^{m-\lambda}\to \partial M_{c_i-\varepsilon}$ and then forming the quotient space of the disjoint union under the relation of identifying $p\in \partial D^\lambda\times D^{m-\lambda}$ with $\phi (p)$. Sometimes this is called an adjunction space.

So really $\phi$ is a smooth embedding of a thickened sphere $S^{\lambda - 1}$, since $\partial D^\lambda=S^{\lambda-1}$. And the number of dimensions in which it is thickened is $m-\lambda$. Think about the "handle" in the 3-dimensional 1-handle case. We gave the two endpoints of a line segment (two points = $S^0$) a 2-dimensional thickening by a disk.

Now it is the same old trick to get the diffeo. The gradient-like vector field, $X$, flows from $\partial M'$ to $\partial M_{c_i+\varepsilon}$, so just multiply $X$ by a smooth function that will make $M'$ match $M_{c_i+\varepsilon}$ after some time. This is our diffeomorphism and we are done.

# Handlebodies I

We now come to the main point of all these Morse theory posts. We want to somehow figure out what a closed manifold looks like based on a Morse function that it admits (who knows how long I’ll develop this theory, maybe we’ll even get to how Smale proved the Poincare Conjecture in dimensions greater than or equal to 5).

Suppose $M$ is closed and $f:M\to\mathbb{R}$ a Morse function. We’ll use the convenient notation $M_t=\{p\in M : f(p)\leq t\}$. So again, with the height analogy, as t increases, we will be looking at the entire manifold up to that height. Since M is compact, there is some finite interval $[a,b]$ such that $M_a=\emptyset$ and $M_b=M$.

Note that with essentially no modification, we have already proved the Theorem that if $[c,d]$ contains no critical values, then $M_c\cong M_d$. So really, the point is to now figure out what happens as we pass through the critical values.

First off, there are only finitely many critical points, and we can assume that each of these has distinct critical values by raising and lowering critical values. So if $p_0, \ldots, p_n$ are the critical points and $c_k=f(p_k)$, we can order the indices so that $c_0 < c_1 < \cdots < c_n$.

To be explicit, $c_0$ is the min, so $M_t=\emptyset$ for $t < c_0$ and $M_t=M$ for t greater than or equal to $c_n$, since $c_n$ is the max (also, wordpress hates inequalities, or me, I haven't decided yet, but it always cuts out lots of stuff and I just have to write the inequality in words).

These two critical points would be a nice place to start our examination. By the Morse lemma and the fact that a min has index 0, we know that there exists a neighborhood of $p_0$ on which $f=x_1^2+\cdots + x_m^2+c_0$ (Alright, I’m sorry about that, but I just realized I have n critical points, so the dimension of my manifold is now m).

More explicitly there is some $\varepsilon>0$ such that $M_{c_0+\varepsilon}=\{(x_1, \ldots , x_m) : x_1^2+\cdots + x_m^2\leq \varepsilon\}\cong B^m$. So if we are thinking of height (of a 2-dim manifold), we’ll want to visualize this as a “bowl” where you have the bottom of the bowl the min and then it slopes upward along a sphere, and then you have the boundary circle at height $c_0+\varepsilon$.

So note that the only thing we used about this critical point is that it had index 0. This shape is called a (m-dimensional) 0-handle.

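The reverse happens at our max. Since the critical point has index m, near $p_n$ we have $f=c_n-x_1^2-\cdots -x_m^2$, so in this chart $M_{c_n-\varepsilon}$ is $\{(x_1, \ldots , x_m) : x_1^2+\cdots +x_m^2\geq \varepsilon\}$ and the piece still missing from $M$ is the disk $\{(x_1, \ldots , x_m) : x_1^2+\cdots +x_m^2\leq \varepsilon\}$. This disk is an $m$-handle and thinking in 2-d height, it is a downward facing bowl.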

Again, there is nothing special about being the absolute max, any index m critical point will locally be an $m$-handle.

Index k critical points where $k\neq 0,m$ are more complicated so I’ll leave those for next time.

Now we have a nice overview of how this will work. We just need to figure out what a $k$-handle looks like, then as t increases through a critical value with index k, $M_t$ will “attach a k-handle”. When we are not near a critical value, the $M_t$ will not change diffeomorphism-type. We just need to make this a little more precise next time (or maybe even the time after).

# Altering the Critical Points

I officially have a new favorite search for which someone found this blog: How to write a Japanese satire.

Let’s introduce a new term. Two Morse functions are considered equivalent if they have the same critical points and same index at each critical point.

The hope here is that two equivalent Morse functions will give the same topological data about our manifold, and so we want to develop techniques of altering our Morse function to something extremely nice to work with, but having it be equivalent to the original one.

Our first excursion into this technique is the following: If M is a compact manifold and $f$ is a Morse function on M, then we can find an equivalent Morse function $g$ such that all the critical values are distinct.

If we’re going back to the height intuition, this is the technique that corresponds to “raising” or “lowering” critical points. So if you have two strange things happening at the same height (two mountain peaks that have the same height), the idea is sort of that you can slightly move the manifold around so that one is now higher than the other. Of course, we won’t actually move the manifold in any real sense, we’re going to construct the function.

This is going to be really nice, because it says that we can always get a Morse function in which only a single “change” can happen at any given height.

We’ll do this by first proving a Lemma which does all the work for us. Let $f$ be our Morse function, and $p$ a critical point. Then there is some $\varepsilon>0$ such that for all $c\in (-\varepsilon, \varepsilon)$ there is an equivalent Morse function $h$ that has the same critical values as $f$, except for $h(p)=f(p)+c$.

The arguments here are essentially the same as in previous posts, so I’ll be a little looser and only outline the proof.

Since the critical points are isolated we can take a small coordinate chart centered at $p$ that contains no other critical points. Now let $\psi$ be a bump function that is 1 on some small neighborhood of $p$ and dies to zero before getting to the edge of the chart.

Then we define $h_c=f+c\psi$. We definitely have that all the critical points of $f$ are still critical points of $h_c$ and since on a neighborhood of any of those points the functions either agree or differ by adding a constant, they have the same index. Also, $h_c(p)=f(p)+c$, so we have constructed our desired function as long as we don’t have any extra critical points.

But in the same way as before, $\Big|Dh_c\Big|=\Big|Df+cD\psi\Big|\geq \delta-|c|a>0$ for all $|c|<\varepsilon$ where $\varepsilon=\delta/a$, since we're only concerned with the compact set on which $\psi$ is decaying, where $\Big|Df\Big|$ has a positive min $\delta$, and $\Big|D\psi\Big|$ has a finite max $a$. Thus we do not gain any critical points in that set and we are done.
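To see the lemma in action, here is a one-variable Python sketch. Everything specific in it is my own choice for illustration: the function $x^4/4-x^2/2$ (whose two minima at $\pm 1$ share the value $-1/4$), the bump radii 0.2 and 0.4, and the shift $c=0.01$, which is comfortably below $\delta/a$ (roughly 0.03 for this bump).

```python
import math

def smooth_step(t):
    """Smooth function equal to 0 for t <= 0 and 1 for t >= 1."""
    def g(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return g(t) / (g(t) + g(1.0 - t))

def bump(x, center=1.0, inner=0.2, outer=0.4):
    """Bump function: 1 within `inner` of `center`, 0 beyond `outer`."""
    return smooth_step((outer - abs(x - center)) / (outer - inner))

def f(x):
    # Critical points at -1, 0, 1; the two minima share the value -1/4.
    return x**4 / 4 - x**2 / 2

c = 0.01  # small shift of the critical value at x = 1

def h(x):
    return f(x) + c * bump(x)

def derivative(func, x, eps=1e-6):
    """Central-difference approximation of the derivative."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

# Sample the two transition regions, where the bump is neither 0 nor 1.
transition = [0.6 + 0.2 * k / 200 for k in range(201)]
transition += [1.2 + 0.2 * k / 200 for k in range(201)]
min_slope = min(abs(derivative(h, x)) for x in transition)
print(h(1.0) - f(1.0), h(-1.0) - f(-1.0), min_slope)
```

The critical value at $x=1$ moves by $c$, the one at $x=-1$ stays put, and the derivative of $h$ stays away from zero on the region where the bump decays, so no new critical points appear there.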

To get to the whole theorem all we need to do is note that there are only finitely many critical points (since compact). So if any of the values are shared, we can use the lemma to give an equivalent Morse function with shifted critical value, where we shift by a small enough value that it can't make it to any other critical value. We only have to apply this a finite number of times.

Now we want to start building some technique that will allow us to figure out what our closed manifold looks like based on the Morse functions it admits.

We’ll call a vector field $X$, a gradient-like vector field for f, if $X\cdot f>0$ away from critical points, and if $p\in M$ is a critical point of index $\lambda$, then there is a coordinate neighborhood about $p$ such that f has the standard form as in the Morse lemma, and $X=-2x_1\frac{\partial}{\partial x_1}-\cdots - 2x_{\lambda}\frac{\partial}{\partial x_\lambda}+2x_{\lambda+1}\frac{\partial}{\partial x_{\lambda+1}}+\cdots + 2x_m\frac{\partial}{\partial x_m}$ (i.e. it is the gradient in this neighborhood).
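As a quick sanity check of the local form, the following Python sketch computes $X\cdot f$ in the Morse chart, where it should come out as $4(x_1^2+\cdots+x_m^2)$. The function names and the sample points are mine, chosen only for illustration.

```python
def grad_model(x, lam):
    """Partial derivatives of f = -x_1^2 - ... - x_lam^2 + x_{lam+1}^2 + ... + x_m^2."""
    return [(-2.0 if i < lam else 2.0) * t for i, t in enumerate(x)]

def X_applied_to_f(x, lam):
    """X . f = sum_i X_i * df/dx_i, where X has the local form in the definition."""
    X = grad_model(x, lam)    # in these coordinates X is the gradient itself
    df = grad_model(x, lam)
    return sum(a * b for a, b in zip(X, df))

samples = [(0.5, -1.0, 2.0), (0.0, 0.25, 0.0), (0.0, 0.0, 0.0)]
values = [X_applied_to_f(p, lam=1) for p in samples]
print(values)
```

The value is strictly positive away from the origin and vanishes only at the critical point itself.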

Intuitively, if we think back to our example, we visualize Morse functions as “height functions”. So we are attempting to construct in some sense an everywhere “upward” pointing vector field. If we’re thinking of the entire manifold flowing along this, then the only places where it is allowed to get “stuck” is at the critical points of $f$.

The theorem is that there always exists a gradient-like vector field for a Morse function on a compact manifold.

Proof: As before, let $\{U_i\}_1^k$ be a finite subcover of coordinate charts, and $\{K_i\}_1^k$ be a compact refinement. Since the critical points are isolated (immediate corollary to the Morse lemma), there can only be finitely many since our manifold is compact. So we can assume that each critical point has a neighborhood small enough so that it is entirely contained in exactly one of the $U_i$, and that the $U_i$ were chosen so that $f$ has standard form in those coordinates.

Let $\psi_i: U_i\to \mathbb{R}$ be a bump function for $K_i$ supported in $U_i$. Then we get a smooth function on the entire manifold by letting $\psi_i\equiv 0$ outside of $U_i$.

Let $X_i$ be the gradient of $f$ on $U_i$. Let $\displaystyle X=\sum_{j=1}^k \psi_jX_j$. The claim is that this is our gradient-like vector field for $f$.

Let’s check $X\cdot f$ at non-critical points. If $x\in M$ is not a critical point, then $x\in K_i$ for some $i$ since the $K_i$ cover $M$, so $\psi_i(x)=1$ and $(\psi_i X_i\cdot f)(x)>0$ since $X_i$ is the gradient. Every other term $(\psi_j X_j\cdot f)(x)$ of the sum is non-negative: it is 0 when $x\notin U_j$ since $\psi_j(x)=0$ there, and otherwise $\psi_j(x)\geq 0$ and $X_j$ is the gradient on $U_j$. Thus $(X\cdot f)(x)>0$.

The other condition we have set up to work since each critical point has a neighborhood that is contained in precisely one of the $U_i$, thus on that neighborhood $f$ is in standard form, and $X=\psi_iX_i$ which is of the correct form. Thus $X$ is gradient-like for $f$.

As a preview of things to come, I’ll prove our first result about what our manifold looks like using Morse functions. This is often called the Regular Interval Theorem.

Suppose that $f$ has no critical value in $[a,b]$, then $M_{[a,b]}=\{p\in M : a\leq f(p)\leq b\}$ is diffeomorphic to $f^{-1}(a)\times [0,1]$.

Let $X$ be gradient-like for $f$. Define $\displaystyle Y=\frac{1}{X\cdot f}X$ which is smooth off of the critical points of $f$, but since $M_{[a,b]}$ contains no critical points it is a smooth vector field there (in fact, on an open set containing $M_{[a,b]}$).

Let $\theta^p(t)$ be an integral curve for $Y$ starting at $p\in f^{-1}(a)$. But now $\displaystyle \frac{d}{dt}\Big|_{t=t_0}f(\theta^p(t))=\frac{d\theta^p}{dt}(t_0)(f)$
$\displaystyle = Y_{\theta^p(t_0)}(f)$
$\displaystyle = \frac{1}{X\cdot f}X\cdot f=1$.

Thus, the integral curve continues along at constant speed 1 for the entire time it is in $M_{[a,b]}$. But it starts at $f=a$ at time 0, so it reaches $f=b$ at time $t=b-a$.

Thus $h: f^{-1}(a)\times [0,b-a]\to M_{[a,b]}$ by $(p,t)\mapsto \theta^p(t)$ is a diffeomorphism. But rescaling gives the diffeo to $f^{-1}(a)\times [0,1]$.
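Here is a numerical sketch of that flow in Python. As a stand-in I use $f(x,y)=x^2+y^2$ on the plane with $X$ the honest gradient, $a=1$ and $b=4$ (so the only critical value, 0, is outside $[a,b]$). This is not a closed manifold, so treat it purely as an illustration of the unit-speed claim. The integrator is a plain fourth-order Runge-Kutta scheme.

```python
def f(p):
    x, y = p
    return x * x + y * y

def Y(p):
    """Y = X / (X . f) with X = grad f = (2x, 2y), so X . f = 4 (x^2 + y^2)."""
    x, y = p
    scale = 1.0 / (4.0 * (x * x + y * y))
    return (2.0 * x * scale, 2.0 * y * scale)

def flow(p, total_time, steps=1000):
    """Integrate dp/dt = Y(p) with classical fourth-order Runge-Kutta."""
    dt = total_time / steps
    for _ in range(steps):
        k1 = Y(p)
        k2 = Y((p[0] + 0.5 * dt * k1[0], p[1] + 0.5 * dt * k1[1]))
        k3 = Y((p[0] + 0.5 * dt * k2[0], p[1] + 0.5 * dt * k2[1]))
        k4 = Y((p[0] + dt * k3[0], p[1] + dt * k3[1]))
        p = (p[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             p[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return p

a, b = 1.0, 4.0
start = (0.6, 0.8)            # a point on the level set f = a
end = flow(start, b - a)
print(f(start), f(end))
```

Starting on $f=a$ and flowing for time $b-a$ lands on $f=b$, up to integration error, which is exactly what makes $(p,t)\mapsto\theta^p(t)$ hit all of $M_{[a,b]}$.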

This basically says that between critical points of a Morse function, we must have the manifold looking like a cylinder built off of a single slice of the function (if we’re thinking in terms of height, we can pick any height, and at anywhere between the two nearest critical heights, all the level sets will look the same).

# Morse Functions Exist

The astute reader at this point may be getting a little anxious that despite the fact that I found Morse functions in two easy low dimensional cases, my eventual goal of saying very general things about manifolds by using Morse functions is going to rely on the fact that they exist.

If these things are really as powerful as I have been making them out to be, then it would seem that there probably isn’t an abundance of them. But surprisingly, it turns out that basically every smooth function is Morse.

Let $M^n$ be a closed manifold, and $g:M\to \mathbb{R}$ be a smooth function. Then there is a Morse function $f:M\to\mathbb{R}$ arbitrarily close to $g$.

Recall Sard’s Theorem (I’m assuming some familiarity with it, which is probably not a good idea): The set of critical values of a smooth map $f: U\to \mathbb{R}^n$ has measure zero in $\mathbb{R}^n$.

Now we’ll first need a lemma. Let $U\subset \mathbb{R}^n$ be an open set and $f:U\to\mathbb{R}$ a smooth function. Then there are real numbers $\{a_k\}$ such that $f(x_1, \ldots, x_n)-(a_1x_1+a_2x_2+\cdots + a_nx_n)$ is a Morse function on $U$. We can also choose $\{a_k\}$ to be arbitrarily small in absolute value.

Define $h=Jac(f)^T$ (a smooth map $h:U\to\mathbb{R}^n$). Then for any $p\in U$, $Jac(h)\Big|_p$ is the Hessian $H_f(p)$. Thus, p is a critical point of $h$ iff $det(H_f(p))=0$.

By Sard’s Theorem, we can choose $a=(a_1, \ldots , a_n)\in\mathbb{R}^n$ where each $a_k$ has arbitrarily small absolute value such that $a$ is not a critical value of $h$.

The claim is that $\overline{f}(x_1, \ldots , x_n)=f(x_1, \ldots, x_n)-(a_1x_1+\cdots + a_nx_n)$ is a Morse function on U.

Well, if $p$ is a critical point of $\overline{f}$, then since $\frac{\partial \overline{f}}{\partial x_i}\Big|_p=\frac{\partial f}{\partial x_i}\Big|_p - a_i=0$, by the definition of h, we get $h(p)=a$.

But we chose $a$ to not be a critical value of h. Thus, p is not a critical point of h. So as noted, $det(H_f(p))\neq 0$. But $H_f(p)=H_{\overline{f}}(p)$, so $p$ is a non-degenerate critical point. Since p was an arbitrary critical point, all critical points are non-degenerate and hence $\overline{f}$ is Morse, completing the proof of the Lemma.
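A one-variable example makes the lemma concrete. Take $f(x)=x^3$, which has a degenerate critical point at 0. Here $h(x)=3x^2$, whose only critical value is 0, so any $a\neq 0$ should work. The Python sketch below (function names are mine) checks this for one small $a$.

```python
import math

def critical_points_of_perturbation(a):
    """Critical points of fbar(x) = x**3 - a*x, i.e. the solutions of h(x) = 3*x**2 = a."""
    if a < 0:
        return []
    if a == 0:
        return [0.0]
    r = math.sqrt(a / 3.0)
    return [-r, r]

def second_derivative(x):
    """fbar''(x) = f''(x) = 6x; the linear term does not change the Hessian."""
    return 6.0 * x

a = 1e-4
points = critical_points_of_perturbation(a)
hessians = [second_derivative(x) for x in points]
print(points, hessians)
```

For $a>0$ there are two critical points, both non-degenerate, and for $a<0$ there are none at all. Only $a=0$, the critical value of $h$, gives back the degenerate point.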

We also need another Lemma. Let $K\subset M$ be a compact subset. Then if $g:M\to\mathbb{R}$ has no degenerate critical points in $K$, then we can choose $\varepsilon >0$ small enough so that any function within $\varepsilon$ of $g$ in the $C^2$ sense on $K$ also has no degenerate critical points in $K$.

Since our manifold is closed, it is compact. So we can choose a finite subcover of coordinate charts, and compactly refine it (I’ll do this construction if someone asks in the comments), so that $\{U_i\}_1^m$ cover $M$ and there are compact sets $K_i\subset U_i$ such that $\cup K_i=M$.

But with this, we can look at any of the $U_k$, and in these coordinates, $g$ has no degenerate critical points in $K\cap K_k$ (alright, that was probably a poor choice of notation) iff $\displaystyle\Big|\frac{\partial g}{\partial x_1}\Big|+\cdots + \Big|\frac{\partial g}{\partial x_n}\Big|+\Big| det(H_g)\Big|>0$ for every point in $K\cap K_k$.

But for a small enough $\varepsilon$ we can definitely still make that inequality hold for any $C^2$ approximation. Thus we have proved the lemma.
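In one variable the quantity in that inequality is just $|g'|+|g''|$, and a short Python sketch shows why a positive minimum on a compact set is what we want. The sample functions and the grid on $K=[-1,1]$ are my own choices for illustration.

```python
def nondegeneracy_measure(dg, d2g, x):
    """|g'(x)| + |g''(x)|: the one-variable version of the quantity in the lemma."""
    return abs(dg(x)) + abs(d2g(x))

K = [i / 100 for i in range(-100, 101)]   # sample points of the compact set [-1, 1]

# g(x) = x^3 has a degenerate critical point at 0.
cubic_min = min(nondegeneracy_measure(lambda x: 3 * x * x, lambda x: 6 * x, x) for x in K)

# g(x) = x^3 - 0.03 x has only non-degenerate critical points (at x = -0.1 and x = 0.1).
perturbed_min = min(
    nondegeneracy_measure(lambda x: 3 * x * x - 0.03, lambda x: 6 * x, x) for x in K
)
print(cubic_min, perturbed_min)
```

The second minimum is strictly positive, so any function whose first and second derivatives are close enough to those of $x^3-0.03x$ on $K$ keeps the quantity positive, and hence has no degenerate critical points there.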

Now let’s do the actual existence proof. Take the $U_i, K_i$ as before. We will inductively build our $C^2$ approximations on $C_l=K_1\cup \cdots \cup K_l$. Our base step is $C_0=\emptyset$: take $f_0=g$, and there is nothing to check, so we’re done.

For our inductive hypothesis, suppose we have $f_{l-1}:M\to\mathbb{R}$ having no degenerate critical points in $C_{l-1}$.

Let’s work with the coordinate neighborhood $U_l$ with coordinates $(x_i)$. By the first lemma, there are arbitrarily small numbers $\{a_i\}$ so that $f_{l-1}(x_1, \ldots , x_n)-(a_1x_1+\cdots + a_nx_n)$ is Morse on $U_l$. But note, we only have a definition on $U_l$ and we need one everywhere.

Let $\psi$ be a bump function that is 1 on $K_l$ and supported in $V$, where $K_l\subset V\subset U_l$.

Define $f_l=\begin{cases} f_{l-1}-\psi\cdot (a_1x_1+\cdots + a_nx_n) & in \ U_l \\ f_{l-1} & outside \ V\end{cases}$.

(So I have this same cases problem again, just ignore the “line break” symbol, it is actually readable this time).

This gives us a nice well-defined function on all of $M$ (just need to check the overlaps). Also $f_l$ is our first lemma function on $K_l$, so it is Morse on $K_l$ and hence has no degenerate critical points there.

Since $0\leq \psi \leq 1$ (and we’re on a compact set), we can make $\{a_i\}$ small enough so that $f_l$ is an arbitrarily close $C^2$ approximation of $f_{l-1}$ (I won’t do this since it is fairly long and tedious, but quite straightforward for the reasons I gave).

But now by the second lemma, since $f_{l-1}$ has no degenerate critical points in $C_{l-1}$, we have that $f_l$ has no degenerate critical points in $C_{l-1}$ either. We already checked on $K_l$, and thus there are no deg. critical points on $C_{l-1}\cup K_l=C_l$.

Thus inductively we can get a Morse function on all of $M$ that is $C^2$-close to our original smooth function.

# The Morse Lemma

Today we prove what is known as The Morse Lemma. It tells us exactly what our Morse function looks like near its critical points.

Let $p\in M$ be a non-degenerate critical point of $f:M\to \mathbb{R}$. Then we can choose coordinates about p, $(x_i)$, such that in these coordinates $f=-x_1^2-x_2^2-\cdots -x_\lambda^2+x_{\lambda+1}^2+\cdots +x_n^2+f(p)$. Moreover, $\lambda$ is the index of the critical point. (Note that $0\mapsto f(p)$).

Proof: Choose local coordinates, $(x_i)$, centered at $p$. Without loss of generality $f(p)=0$ by replacing $f$ with $f-f(p)$. Thus in coordinates, since p corresponds to 0, $f(0)=0$ (it is a little sloppy, but I’ll probably call the actual function and the function in coordinates the same thing and go back and forth).

By a general theorem of multi-variable calculus (I don’t know if it has a name, it might be Taylor’s theorem? I always get confused at how much is actually included in that), we have smooth functions $g_1, \ldots, g_n$ such that $f(x_1, \ldots, x_n)=\sum_{i=1}^n x_ig_i(x_1, \ldots, x_n)$ and $\displaystyle \frac{\partial f}{\partial x_i}\Big|_0=g_i(0)$.

But 0 is a critical point of $f$, so $g_i(0)=0$ and we can apply the theorem again to each $g_i$. We’ll suggestively call the smooth functions $g_k(x_1, \ldots, x_n)=\sum_{i=1}^n x_i h_{ki}(x_1, \ldots, x_n)$.

Thus, we now have $\displaystyle f=\sum_{k,i}x_kx_i h_{ki}$. Let $\displaystyle H_{ki}=\frac{(h_{ki}+h_{ik})}{2}$.

Then $\displaystyle f=\sum_{k, i}x_kx_i H_{ki}$, and $H_{ki}=H_{ik}$.

But in that form we see that the second partial derivatives are $\displaystyle \frac{\partial^2 f}{\partial x_k \partial x_i}\Big|_0=2H_{ki}(0)$.

By assumption $0$ is a non-degenerate critical point, so $det(H_{ki}(0))\neq 0$ and hence we can apply a linear transformation to our current coordinates and get that $\frac{\partial^2 f}{\partial x_1^2}\Big|_0\neq 0$. Thus $H_{11}(0)\neq 0$.

Now $H_{11}$ is continuous, so that means it is non-zero in a neighborhood of 0.

Let $(y_1, x_2, \ldots, x_n)$ be a new coordinate system where $y_1=\sqrt{|H_{11}|}\left(x_1+\sum_{i=2}^n x_i\frac{H_{1i}}{H_{11}}\right)$. (Note this is actually a coordinate system near 0, since the determinant of the Jacobian of the transformation from this one to the old one is non-zero at 0).

Now $\displaystyle y_1^2=|H_{11}|\left(x_1+\sum_{i=2}^nx_i \frac{H_{1i}}{H_{11}}\right)^2$
$= H_{11}x_1^2 + 2\sum_{i=2} x_1x_i H_{1i} +\left(\sum_{i=2} x_i H_{1i}\right)^2/H_{11}$ if $H_{11}>0$, and the same thing with minus signs everywhere if $H_{11}$ is negative.

Thus the function is $y_1^2+\sum_{i,j=2}x_ix_jH_{ij}-\left(\sum_{i=2} x_i H_{1i}\right)^2/H_{11}$ if $H_{11}>0$ or
$-y_1^2 +\sum_{i,j=2} x_ix_j H_{ij} -\left(\sum_{i=2}x_i H_{1i}\right)^2/H_{11}$ otherwise.

(I awkwardly wrote this with words, because I couldn’t get cases to look right, and was having weird errors I couldn’t figure).

Now just isolate the stuff after the $\pm y_1^2$. It satisfies the same conditions as $f$, but has fewer variables, so we can induct on the number of variables until we have $f(y_1, \ldots , y_n)=-y_1^2-\cdots - y_\lambda^2 +y_{\lambda +1}^2+\cdots +y_n^2$.

And since the plus and minus signs came from changing basis to put the Hessian into diagonal form with plus and minus 1’s, the number of minus signs is indeed the index.
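The induction in the proof is completing the square one variable at a time. For a constant symmetric matrix (think of the Hessian at the critical point) the same procedure is symmetric elimination, and the number of negative pivots is the index. Here is a minimal Python sketch, assuming every pivot we meet is non-zero. The proof arranges that with a preliminary linear change of coordinates, which I omit, and the function name is mine.

```python
def index_by_completing_squares(H):
    """Count negative pivots of a symmetric matrix via symmetric elimination.

    Assumes every pivot met along the way is non-zero; the proof arranges this
    with a preliminary linear change of coordinates, which this sketch omits.
    """
    A = [list(map(float, row)) for row in H]
    n = len(A)
    negatives = 0
    for k in range(n):
        pivot = A[k][k]
        if abs(pivot) < 1e-12:
            raise ValueError("zero pivot: change coordinates first")
        if pivot < 0:
            negatives += 1
        # Subtract the completed square: H_ij -> H_ij - H_ki * H_kj / H_kk.
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] -= A[k][i] * A[k][j] / pivot
    return negatives

print(index_by_completing_squares([[2, 0], [0, -2]]))   # saddle
print(index_by_completing_squares([[-2, 0], [0, -2]]))  # local max
print(index_by_completing_squares([[1, 2], [2, 1]]))    # not diagonal to begin with
```

The update inside the double loop is the same coefficient change as in the displayed formula for the function after isolating $\pm y_1^2$.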

The proof of this tended to be sort of tedious to check everything, so don’t worry if you didn’t go through it. I don’t think there is really insight you get from going through it. This is one of those rare instances that I think the result is more important than the proof.

Now we have real good reason to believe the index will be $n$ or 0 if we are at a local max or min. What does a max or min look like near the point? Well, it slopes all in the same direction, i.e. it will locally look like a sphere. But this is exactly what the Morse lemma tells us about index n and 0 critical points. We’ll make this more precise later.

I wasn’t sure how I was going to proceed. My two options seemed to be to build the Morse theory I need for Lefschetz, and then do Lefschetz, then come back to Morse theory. But I think I’m just going to continue as far as I want to go ignoring what is needed for the Hyperplane Theorem, then reference what I need.

# A Better Example

The example I gave last time was awful I’ve realized. I need something a little more complicated to better motivate why we’d believe some of these things, and to illustrate what happens in certain situations.

So let’s take a surface embedded in $\mathbb{R}^3$ given by the equation $z=x^2(x+1)-y^2$. It is a “mountain landscape”:

It might be hard to tell, but there is the one peak, and it forever decreases to the general left, and forever increases to the general right.

We have a global chart to work with. Our Morse function will again be the “height function”. So $f(x,y,z)=z$. We have two critical points. One will occur when we reach the “saddle point” at $z=0$ and one when we reach the peak of the mountain at $z=4/27$. Since this is a conceptual example, I won’t go through all the technical stuff to show that this is actually a Morse function.

Now as we stated before, at a non-critical value, i.e. a regular value, the level set is an embedded submanifold. Thus if $c<0$, then $f(x,y,z)=c$ is something that vaguely looks like:

This is because it is below the saddle point. As the height increases to 0, our level set starts to close in, and when we reach $f(x,y,z)=0$, we get:

This is not an embedded submanifold, because the point of intersection is not locally Euclidean. Continuing up, we get that for $0<c<4/27$ the level set will look something like:

Then we hit the critical value $c=4/27$:

It doesn’t show up on the graph, but there is a point at $(-2/3, 0)$ which is why this one is not a manifold. There would be no well-defined dimension since it is a 0-dimensional object union a 1-dimensional object. Then everything above 4/27 looks like the last picture but without the dot.

Let’s analyze a little bit. Between critical values all of our embedded submanifolds seemed to be diffeomorphic to each other (you may be able to guess the proof of this even if you haven’t seen it). But when we cross a critical value we don’t even maintain homotopy type.

If anyone actually worked out the math behind this example, then they would see that we also now have an example of an index 1 critical point at the saddle. The top of the mountain is index 2, which still fits with the local min/max conjecture from last time.
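For anyone who does want the math, here is a short Python sketch of it. The partial derivatives are computed by hand from $z=x^2(x+1)-y^2$ and the helper names are mine.

```python
def height(x, y):
    return x**2 * (x + 1) - y**2

def gradient(x, y):
    return (3 * x**2 + 2 * x, -2 * y)

def hessian_diagonal(x, y):
    # The mixed partial derivative vanishes, so the Hessian is diagonal.
    return (6 * x + 2, -2)

# dz/dx = x (3x + 2) = 0 and dz/dy = -2y = 0 give exactly two critical points.
critical_points = [(0.0, 0.0), (-2.0 / 3.0, 0.0)]

summary = []
for (x, y) in critical_points:
    index = sum(1 for entry in hessian_diagonal(x, y) if entry < 0)
    summary.append((height(x, y), index))
print(summary)
```

The saddle sits at height 0 with one negative Hessian entry, and the peak sits at height $4/27$ with two, matching the indices 1 and 2 claimed above.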

I may post again later today on actual Morse theory, but I decided that I really needed a better example to reference once we got going.

# What is a Morse function?

A Morse function is a smooth function from a smooth manifold, $M$, to $\mathbb{R}$ that is in some sense “non-degenerate.”

Suppose $p\in M$, then define the Hessian of f at p, $H_f(p): T_pM\times T_pM\to \mathbb{R}$ to be the bilinear form that sends $\displaystyle \left(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right)\mapsto \frac{\partial^2f}{\partial x_i\partial x_j}\Big|_p$.

So picking a basis, the Hessian is just the matrix of second partial derivatives.

Now we call $f:M\to\mathbb{R}$ a Morse function if for any $a\in\mathbb{R}$ we have $f^{-1}((-\infty, a])$ is compact, and for any critical point $p$ of f (the derivative is 0 at $p$), $H_f(p)$ is non-singular. So in matrix form, it would have non-zero determinant. In bilinear form terms, it is non-degenerate or zero is not an eigenvalue.

The index at p is the index of the Hessian at p as a bilinear form. Recall that the index of a bilinear form is the maximal dimension of a linear subspace on which the form is negative definite. (This is sort of backwards of the intuition of counting how big the positive dimension can be. So note that a form is positive semidefinite iff it has index 0).

We really have to check that the property of being a “Morse function” actually is a well-defined concept for smooth manifolds. i.e. is it a diffeomorphism invariant?

We’ll work locally in coordinates. Suppose we have $\phi : V\to U$ a diffeomorphism where $\phi(q)=p$. Define $g= f\circ \phi$ (i.e. the change of coordinates of our so-called Morse function). The well-defined claim is that $q$ is a critical point of $g$ and that the Hessian of g at q is non-singular.

Well, the critical point claim is just the chain rule. Now we’ll actually compute the Hessian. I propose that it is $H_g(q)=(D\phi(q))^T H_f(p) (D\phi(q))$ to make it easier to follow.

We’ll do the right hand side first. The j-th column $\displaystyle (H_f(p)D\phi(q))_j= \left(\sum_{l=1}^n \frac{\partial^2 f}{\partial x_1\partial x_l}(\phi(q))\frac{\partial \phi^l}{\partial x_j}(q)\cdots \sum_{l=1}^n \frac{\partial^2 f}{\partial x_n\partial x_l}(\phi(q))\frac{\partial \phi^l}{\partial x_j}(q)\right)$.

Thus the i-j entry of the right side is multiplying on the left by the ith row of $(D\phi(q))^T$, which gives $\displaystyle \sum_{k=1}^n \sum_{l=1}^n \frac{\partial^2 f}{\partial x_k \partial x_l}(p)\frac{\partial \phi^k}{\partial x_i}(q)\frac{\partial \phi^l}{\partial x_j}(q)$.

Now we’ll calculate the i-j entry of the left side and see if it is the same. So we’ll need the chain rule for partial derivatives.

$\displaystyle (H_g(q))_{ij}=\frac{\partial^2}{\partial x_i \partial x_j}(f\circ \phi)(q)$
$\displaystyle = \frac{\partial}{\partial x_i}\sum_{k=1}^n\frac{\partial f}{\partial x_k}(\phi (q))\frac{\partial \phi^k}{\partial x_j}(q)$
$\displaystyle = \sum_{k=1}^n\frac{\partial}{\partial x_i}(\frac{\partial f}{\partial x_k}(p))\frac{\partial \phi^k}{\partial x_j}(q) + \sum_{k=1}^n\frac{\partial f}{\partial x_k}(p)\frac{\partial}{\partial x_i}(\frac{\partial \phi^k}{\partial x_j})(q)$
$= \displaystyle \sum_{k=1}^n \sum_{l=1}^n \frac{\partial^2 f}{\partial x_k \partial x_l}(p)\frac{\partial \phi^k}{\partial x_i}(q)\frac{\partial \phi^l}{\partial x_j}(q) +\sum_{k=1}^n\frac{\partial^2\phi^k}{\partial x_i \partial x_j}(q)\frac{\partial f}{\partial x_k}(p)$.

But that last line is the same as the right side with that extra plus stuff. But since f is critical at p, that term is zero, and the two sides are equal.

So there is no problem calling a smooth function Morse, but I also introduced the idea of the index of f at a point. Hopefully this doesn’t change under diffeomorphism. Let’s check.

Suppose $index_f(p)=k$. Then since $\phi$ is a diffeo, $D\phi$ is non-singular. But the index is a well-defined notion of a bilinear form at a point, so it is independent of choice of basis. Our previous calculation showed that $H_g(q)=(D\phi(q))^TH_f(p)(D\phi(q))$ which is just a change of basis, so $index_g(q)=k$ as well.
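The transformation rule is easy to test numerically. In the Python sketch below, $f(x,y)=x^2-y^2$ and the change of coordinates `phi` are hypothetical choices of mine (`phi` fixes the origin and has invertible derivative there), and the Hessian of $g=f\circ\phi$ is approximated by central differences.

```python
def f(x, y):
    return x * x - y * y          # critical point at p = (0, 0), Hessian diag(2, -2)

def phi(u, v):
    # Hypothetical change of coordinates: phi(0, 0) = (0, 0), D(phi)(0) = [[1, 1], [0, 1]].
    return (u + v + u * u, v - u * v)

def g(u, v):
    return f(*phi(u, v))

def hessian_at_origin(func, step=1e-3):
    """Central-difference approximation of the 2x2 Hessian at (0, 0)."""
    s = step
    uu = (func(s, 0) - 2 * func(0, 0) + func(-s, 0)) / s**2
    vv = (func(0, s) - 2 * func(0, 0) + func(0, -s)) / s**2
    uv = (func(s, s) - func(s, -s) - func(-s, s) + func(-s, -s)) / (4 * s**2)
    return [[uu, uv], [uv, vv]]

def matmul(A, B):
    """Product of two small matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

Hf = [[2.0, 0.0], [0.0, -2.0]]
D = [[1.0, 1.0], [0.0, 1.0]]
Dt = [[1.0, 0.0], [1.0, 1.0]]      # transpose of D
predicted = matmul(matmul(Dt, Hf), D)
numeric = hessian_at_origin(g)
print(predicted)
print(numeric)
```

Both matrices agree up to discretization error, and both have negative determinant, so the index is 1 before and after the change of coordinates.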

I don’t want to leave you without some sort of concrete idea of what is going on. So define $f:S^2\to \mathbb{R}$ to be the “height function” $(x_1, x_2, x_3)\mapsto x_3$. If I’m not at the north or south pole, then I can write this function in one of the “side” coordinate patches, i.e. $f(\sqrt{1-x_2^2-x_3^2}, x_2, x_3)=x_3$. Hence the Jacobian $(0, 1)$ is non-zero. So every point that is not the north or south pole is a regular point.

The north and south poles are critical points. Now write $f(u,v)=\sqrt{1-u^2-v^2}$ in the “north patch.” Then at the north pole we are at $u=v=0$. Thus $H_f(N)=\left(\begin{matrix} -1 & 0 \\ 0 & -1 \end{matrix}\right)$. Not only does this tell us that the critical point is non-degenerate, but it tells us the index is 2. In fact, the index of the south pole is 0.
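If you would rather not differentiate square roots by hand, a finite-difference check in Python does the job. The patch functions follow the formula above (with a minus sign for the south patch), and the helper names and step size are my own choices.

```python
import math

def north_patch(u, v):
    return math.sqrt(1.0 - u * u - v * v)

def south_patch(u, v):
    return -math.sqrt(1.0 - u * u - v * v)

def hessian_at_pole(func, step=1e-3):
    """Central-difference approximation of the 2x2 Hessian at u = v = 0."""
    s = step
    uu = (func(s, 0) - 2 * func(0, 0) + func(-s, 0)) / s**2
    vv = (func(0, s) - 2 * func(0, 0) + func(0, -s)) / s**2
    uv = (func(s, s) - func(s, -s) - func(-s, s) + func(-s, -s)) / (4 * s**2)
    return [[uu, uv], [uv, vv]]

def index_2x2(H):
    """Number of negative eigenvalues of a symmetric 2x2 matrix."""
    tr = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigenvalues = [(tr - disc) / 2, (tr + disc) / 2]
    return sum(1 for e in eigenvalues if e < 0)

H_north = hessian_at_pole(north_patch)
H_south = hessian_at_pole(south_patch)
print(H_north, index_2x2(H_north))
print(H_south, index_2x2(H_south))
```

The north pole Hessian comes out close to minus the identity and the south pole one close to the identity, giving indices 2 and 0.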

Some final points that our example might have just revealed. The index seems to actually give us some information. Note that we could have done the same thing for $S^n\to \mathbb{R}$. In this case, the two critical points are again the poles, but the indexes are 0 and n. Is it in fact the case that local mins of Morse functions have index 0 and local maxes on an n-manifold have index n? What does it even mean for a critical point to be something other than a local max or min (i.e. if the previous conjecture holds what is the meaning of an index strictly between 0 and n)? Non-critical values are regular values, and since f is a smooth map to a 1-manifold ($\mathbb{R}$), the level sets over them are codimension 1 properly embedded submanifolds (“hypersurfaces?”). What are the relations between these families of submanifolds as we cross critical values?

Alright. I think that is enough of a preview of what is coming up.