Surviving Upper Division Math

It’s that time of the year. Classes are starting up. You’re nervous and excited to be taking some of your first “real” math classes, called things like “Abstract Algebra” or “Real Analysis” or “Topology.”

It goes well for the first few weeks as the professor reviews some stuff and gets everyone on the same page. You do the homework and seem to be understanding.

Then, all of a sudden, you find yourself sitting there, watching an hour-long proof of a theorem you can’t even remember the statement of, using techniques you’ve never heard of.

You panic. Is this going to be on the test?

We’ve all been there.

I’ve been that teacher, I’m sad to say, where it’s perfectly clear in my head that the students are not supposed to regurgitate any of this. The proof is merely there for rigor and exposure to some ideas. It’s clear in my head which ideas are the key ones, though maybe I forgot to point that out carefully.

It’s a daunting situation for the best students in the class and a downright nightmare for the weaker ones.

Then it gets worse. Once your eyes glaze over that first time, it seems the class gets more and more abstract as the weeks go by, filled with more and more of these insanely long proofs and no examples to illuminate the ideas.

Here’s some advice for surviving these upper division math classes. I’m sure people told me this dozens of times, but I tended to ignore it. I only learned how effective it was when I got to grad school.

Disclaimer: Everyone is different. Do what works for you. This worked for me and may only end up frustrating someone with a different learning style.

Tip Summary: Examples, examples, examples!

I used to think examples were something given in a textbook to help me work the problems. They gave me a model of how to do things.

What I didn’t realize was that examples are how you’re going to remember everything: proofs, theorems, concepts, problems, and so on.

Every time you come to a major theorem, write out the converse, the inverse, switch some quantifiers, remove hypotheses, weaken hypotheses, strengthen conclusions, and whatever else you can think of to mess it up.

When you do this you’ll produce a bunch of propositions that are false! Now come up with examples to show they’re false (and get away from that textbook when you do this!). Maybe some rearrangement of the theorem turns out to be true, and so you can’t figure out a counterexample.

This is good, too! I cannot overstate how much you will drill into your memory by merely trying, unsuccessfully, to find a counterexample to a true statement. You’ll start to understand and see why it’s probably true, which will help you follow along with the proof.

As someone who has taught these classes, I assure you that a huge number of the problems students have on a test would be solved by doing this. Students try to memorize too much, and then when they get to the test, they start to question: was that a “for every” or a “there exists”? Does the theorem go this way or that?

You must make up your own examples, so when you have a question like that, the answer comes immediately. It’s so easy to forget the tiniest little hypothesis under pressure.

It’s astounding the number of times I’ve seen someone get to a point in a proof where it looks like everything is in place, but it’s not. Say you’re at a step where f: X\to Y is a continuous map of topological spaces, and X is connected. You realize you can finish the proof if Y is connected.

You “remember” this is a theorem from the book! You’re done!

Whoops. It turns out that f has to be surjective to make that true.

But now imagine, before the test, you read that theorem and you thought: what’s a counterexample if I remove the surjective hypothesis?

The example you came up with was so easy and took no time at all. It’s f: [0,1] \to \{0\} \cup \{1\} given by f(x) = 1. This example being in your head saves you from bombing that question.
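
For reference, the correct statement to keep next to that counterexample (a standard fact, written out here for convenience) is:

\text{If } f: X \to Y \text{ is continuous and surjective and } X \text{ is connected, then } Y \text{ is connected.}

Drop surjectivity and all you can conclude is that the image f(X) is connected, which is exactly what the constant map above shows.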

If you just try to memorize the examples in the book or that the professor gives you, that’s just more memorization, and you could run into trouble. By going through the effort of making your own examples, you’ll have the confidence and understanding to do it again in a difficult situation.

A less talked-about benefit is that having a bunch of examples that you understand gives you something concrete to think about when watching these proofs. So when the epsilons and deltas and neighborhoods of functions and uniform convergence and on and on start to make your eyes glaze over, you can picture the examples you’ve already constructed.

Instead of thinking in abstract generality, you can think: why does that step of the proof work or not work if f_n(x) = x^n?
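
To make that concrete with the f_n(x) = x^n example (a standard one, spelled out here for convenience): on [0,1] these functions converge pointwise to

f(x) = 0 \text{ for } 0 \le x < 1, \quad f(1) = 1.

Each f_n is continuous but the limit f is not, so the convergence cannot be uniform on [0,1]. Checking which steps of a proof survive or break on this one example already does a lot of the work for you.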

Lastly, half the problems on undergraduate exams are going to be examples. So, if you already know them, you can spend all your time on the “harder” problems.

Other Tip: Partial credit is riskier than in lower division classes.

There’s this thing that a professor will never tell you, but it’s true: saying wrong things on a test is worse than saying nothing at all.

Let me add another disclaimer. Being wrong and confused is soooo important to the process of learning math. You have to be unafraid to try things out on homework and quizzes and tests and office hours and on your own.

Then you have to learn why you were wrong. When you’re wrong, make more examples!

Knowing a bunch of examples will make it almost impossible for you to say something wrong.

Here’s the thing. There comes a point every semester where the professor has to make a judgment call on how much you understand. If they know what they’re doing, they’ll wait until the final exam.

The student that spews out a bunch of stuff in the hopes of partial credit is likely to say something wrong. When we’re grading and see something wrong (like misremembering that theorem above), a red flag goes off: this student doesn’t understand that concept.

A student who writes nothing on a problem, or only a very small amount that is totally correct, will be seen as superior. This is because it’s okay to not be able to do a problem if you understand that you didn’t know how to do it. That’s a way to demonstrate your understanding. In other words: know what you don’t know.

Now, you shouldn’t be afraid to try, and this is why the first tip is so much more important than this other tip (and will often depend on the instructor/class).

And the best way to avoid using a “theorem” that’s “obviously wrong” is to test any theorem you quote against your arsenal of examples. As you practice this, it will become second nature and make all of these classes far, far easier.


New Site and Future Plans

We’re closing in on September, and since my brain still thinks of September-to-September as “the proper year” from all that time in academia, I got to thinking about my plans for this upcoming year.

I decided it no longer made sense to have my internet presence so spread out. I originally created “matthewwardbooks.com” as the professional site containing the information about my writing and career. That way I could spew an unprofessional mess of random thoughts at this blog without worrying about how that would look.

I’ve now migrated that site over to this blog (it redirects). I’ll probably keep experimenting with the look, themes, sidebar, menus, and those things for a few weeks. I’ve gone to a cleaner theme and removed the header image, since that seemed to do nothing but clutter things.

Nothing should change for regular blog readers, but I’m considering changing the name of the site in general. It’s a bit hard to do this without changing the name of the blog in RSS aggregators and social media, etc., which could be confusing to longtime readers, but I’ll keep thinking about it.

So, why did I do this?

The main reason is that I’m basically killing off my name as a writer. I currently write under three names: my real name, a romance genre pseudonym, and a LitRPG/GameLit pseudonym.

The two pseudonyms were chosen to keep genres separate for advertising and “clean also boughts” (if you don’t know that phrase, don’t worry about it; it would take too long to explain here).

But they were also carefully chosen to be searchable and identifiable. One thing people don’t really think about when they start writing is if their name is “usable.” Well, it turns out my name is not usable at all. It’s about as bad as possible.

Matthew Ward writes the Fantastic Family Whipple series. Another Matthew Ward is a translator of French literature. Another is a dead child whose mother channels him and writes his stories from beyond the grave in his name. Another is a self-published fantasy writer (who isn’t me!!). Another writes cookbooks and diet books. And so on.

Yeah. Not good from a branding standpoint or Google or even just trying to figure out which other books are mine if you like them. Using my real name was an epic mistake.

My real name books are somewhat “arty” and not all that marketable. The books under my pseudonyms are in line with genre conventions and are doing reasonably well. So, it only makes sense from a professional standpoint to stop writing books under this name. Hence, paying for a separate professional webpage for a writer who is going to cease to exist doesn’t make sense, either.

I’ll bring up one more point as a word of caution for people considering self-publishing under their real name.

People know me from this blog, and this has led to some pretty questionable behavior from someone who wants to sabotage me for some reason. I assume they asked for help with something in the comments section, and I didn’t do their homework for them. So they retaliated out of anger by leaving me 1 star reviews.

If that was you, I would greatly appreciate it if you would delete that now that you’ve had time to cool down. It’s easy to forget that this is my livelihood now. I’m a real person. You think that leaving a fake 1 star review is just going to “troll” or “anger” me, but it actually hurts my business. It’s a very serious thing to do.

Anyway, because I have ten years of content on this blog plus people who know me in real life with various opinions on my life choices, things like this are bound to happen again in the future.

So I have to take that into account from a business standpoint. It’s just not worth the risk of spending 1.5 years working on a single work of art to have it get trashed by someone who hasn’t even read it merely because they don’t like that I left math (or whatever their reason was).

Undoubtedly, I’m going to have the itch to produce something strange and important that defies genre conventions within the next year or two. So I’ll have to figure out what to do about that, because I’ll definitely write it. That might mean using a new name that I advertise here, or I might keep it secret. Or maybe after a few years, I decide it’s not that bad to use my real name again.

I’m going to keep blogging here as usual. Nothing about that will change.

I’d love to hear any thoughts on the new setup. Likes? Dislikes?

Become a Patron!

I’ve come to a crossroads recently.

I write a blog post every week. It takes time. The last one was close to 2,000 words and required reading a book. For the past three years I’ve been writing full time, and so blogging can be a burden that cuts into this with no monetary rewards.

This blog is now over nine years old, and I’ve done nothing to monetize it. I think this is mostly a good thing. I do not and will not run any sort of advertisements. Even upon the release of my first book, I only did a brief mention and then no promotion afterward (and as far as I can tell, this converted to literally 0 sales).

I want this to be about the blog content. I do not want it to turn into some secret ad campaign to sell my work. I can think of many authors who have done this, and I ended up unsubscribing from them.

This brings me to the point. Putting this much work into something is not really sustainable anymore without some sort of support, so I’ve started a Patreon page. As you’ll see, my initial goal is quite modest and will barely cover the expenses to run my blog and website. But without anything, I will slowly phase out writing here regularly.

If this concept is new to you, Patreon is a site dedicated to supporting creative work. Patrons can pledge money to support people creating content they like. It can be as little as $1 a month (or as many podcasters say: “less than a coffee a month”), and in return, you not only help the site to keep running, you’ll receive bonus content as well.

Because of the scattered nature of my posts, I know a lot of you are probably scared to support, because you might not get content of interest for the month. Some of you like the math and tune out for the writing advice. Some of you like the critical analysis of philosophy and wish the articles on game mechanics didn’t exist.

For consistency, the vast majority of posts from now on will be ones that would get tagged “literature.” Anything else will show up once a month or less and probably never two months in a row (i.e., six per year, spread out evenly). This “literature” tag includes, but is not limited to, most posts on philosophy that touch on narrative or language somehow, editing rules, writing advice, book reviews, story structure analysis, examining pro’s prose, movie reviews, and so on.

Again, the core original vision for the blog included game and music and math posts, but these will be intentionally fewer now. If you check the past few years, I basically already did this anyway, but this way you know what you’re signing up for.

I think people are drawn to my literature analysis because I’m in a unique position. This month I’m about to submit my fifth romance novel under a pseudonym. This is the “commercial” work I do for money, and it’s going reasonably well. I’ve come to understand the ins and outs of genre fiction through this experience, and it has been a valuable part of learning the craft of writing for me.

My main work under my real name is much more literary. I’ve put out one novel of literary fiction. Next month I’ll put out my second “real” novel, which is firmly in the fantasy genre but hopefully doesn’t give up high-quality prose.

These two opposite experiences have given me an eye for what makes story work and what makes prose work. All over this blog I’ve shown that I love experimental writing, but I’ve also been one of the few people to unapologetically call out BS where I see it.

As you can imagine, writing several genre novels and a “real” novel every year makes it tough to justify this weekly blog for the fun of it.

If I haven’t convinced you that the quality here is worth supporting, I’ll give you one last tidbit. I get to see incoming links thanks to WordPress, so I know that more than one graduate seminar and MFA program has linked to various posts I’ve made on critical theory and difficult literature. Since I’m not in those classes, I can’t be sure of the purpose, but graduate programs tend to only suggest reading things that are worth reading. There just isn’t enough time for anything else.

I know, I know. Print is dead. You’d rather support people making podcasts or videos, but writing is the easiest way to get my ideas across. I listen to plenty of podcasts on writing, but none of them get to dig into things like prose style. The format isn’t conducive to it. One needs to see the text under analysis to really get the commentary on it.

Don’t panic. I won’t decrease blog production through the end of 2017, but I’m setting an initial goal of $100 per month. We’ll go from there, because even that might not be a sustainable level long-term. If it isn’t met, I’ll have to adjust accordingly. It’s just one of those unfortunate business decisions. Sometimes firing someone is the right move, even if they’re your friend.

I’ve set up a bunch of supporter rewards, and I think anyone interested in the blog will find them well worth it. I’m being far more generous than most Patreon pages making similar content. Check out the page for details. The rewards involve seeing me put into practice what I talk about, with video of me editing a current project with live commentary; extra fiction I write for free; free copies of my novels; extra “Examining Pro’s Prose” articles; and more!

I hope you find the content here worth supporting (I’m bracing myself for the humiliation of getting $2 a month and knowing it’s from my parents). If you don’t feel you can support the blog, feel free to continue reading and commenting for free. The community here has always been excellent.

What is an Expert?

I’ll tread carefully here, because we live in a strange time of questioning the motives and knowledge of experts to bolster every bizarre conspiracy theory under the sun. No one trusts any information anymore. It’s not even clear if trusting/doubting expert opinion is anti/hyper-intellectual. But that isn’t the subject of today’s topic.

I listen to quite a few podcasts, and several of them have made me think about expertise recently.

For example, Gary Taubes was on the Sam Harris podcast and both of them often get tarred with the “you don’t have a Ph.D. in whatever, so you’re an unknowledgeable/dangerous quack” brush. Also, Dan Carlin’s Hardcore History podcast is insanely detailed, but every ten minutes he reminds the audience “I’m not a historian …”

Many people who value the importance of expertise think that the degree (the Ph.D. in particular, but maybe an MFA for the arts) is the be-all and end-all of the discussion. If you have the Ph.D., then you’re an expert. If you don’t, then you’re not.

The argument I want to present is that if you believe this, you really should be willing to extend your definition of expertise to a wider group of people who have essentially done the equivalent work of one of these degrees.

Think of it this way. Person A goes to Subpar University, scrapes by with the minimal work, kind of hates it, and then teaches remedial classes at a Community College for a few years. Person B has a burning passion for the subject, studies all of the relevant literature, and continues to write about and develop novel ideas in the subject for decades. I’d be way more willing to trust Person B as an expert than Person A despite the degree differences.

Maybe I’ve already convinced you, and I need not go any further. Many of you are probably thinking, yeah, but there are parts to doing a degree that can’t be mimicked without the schooling. And others might be thinking, yeah, but Person B is merely theoretical. No one in the real world exists like Person B. We’ll address each of these points separately.

I think of a Ph.D. as having three parts. Phase 1 is demonstration of competence of the basics. This is often called the Qualifying or Preliminary Exam. Many students don’t fully understand the purpose of this phase while going through it. They think they must memorize and compute. They think of it as a test of basic knowledge.

At least in math and the hard sciences, this is not the case. It is almost a test of attitude. Do you know when you’re guessing? Do you know what you don’t know? Are you able to admit this or will you BS your way through something? Is the basic terminology internalized? You can pass Phase 1 with gaps in knowledge. You cannot pass Phase 1 if you don’t know where those gaps are.

Phase 2 is the accumulation of knowledge of the research done in your sub-sub-(sub-sub-sub)-field. This basically amounts to reading thousands of pages, sometimes from textbooks to get a historical view, but mostly from research papers. It also involves talking to lots of people engaged in similar, related, or practically the same problems as your thesis. You hear their opinions and intuitions about what is true and start to develop your own intuitions.

Phase 3 is the original contribution to the literature. In other words, you write the thesis. To get a feel for the difficulty and time commitment of each step: in a five-year Ph.D., Phase 1 is ideally done in around a year, Phase 2 takes two to four years, and Phase 3 takes around a year (there is overlap between phases).

I know a lot of people aren’t going to like what I’m about to say, but the expertise gained from a Ph.D. is almost entirely the familiarization with the current literature. It’s taking the time to read and understand everything being done in the field.

Phase 1 is basically about not wasting people’s time and money. If someone isn’t going to understand what they read in Phase 2 and is going to make careless mistakes in Phase 3, it’s best to weed them out with Phase 1. But you aren’t gaining any expertise in Phase 1, because it’s all just the basics still.

One of the main reasons people don’t gain Ph.D.-level expertise without actually doing the degree is because being in such a program forces you to compress all that reading into a small time-frame (yes, reading for three years is short). It’s going to take someone doing it as a hobby two or three times longer, and even then, they’ll be tempted to just give up without the external motivation of the degree looming over them.

Also, without a motivating thesis problem, you won’t have the narrow focus to make the reading and learning manageable. I know everyone tackles this in different ways, but here’s how it worked for me. I’d take a paper on a related topic, and I’d try to adapt the techniques and ideas to my problem. This forced me to really understand what made these techniques work, which often involved learning a bunch of stuff I wouldn’t have if I had just read through it to see the results.

Before moving on, I’d like to add that upon completion of a Ph.D. you know pretty much nothing outside of your sub-sub-(sub-sub-sub)-field. It will take many years of continued teaching and researching and reading and publishing and talking to people to get any sense of your actual sub-field.

Are there people who complete the equivalent of the three listed phases without an actual degree?

I’ll start with the more controversial example of Gary Taubes. He got a physics undergrad degree and a master’s in aerospace engineering. He then went into science journalism. He stumbled upon how complicated and shoddy the science of nutrition was and started to research a book.

Five years later, he had read and analyzed pretty much every single nutrition study done. He interviewed six hundred doctors and researchers in the field. If this isn’t Phase 2 of a Ph.D., I don’t know what is. Most students won’t have gone this in-depth to learn the state of the field in an actual Ph.D. program.

Based on all of this, he then wrote a meticulously cited book, Good Calories, Bad Calories. The bibliography is over 60 pages long. If this isn’t Phase 3 of a Ph.D., I don’t know what is. He’s continued to stay abreast of studies and has done at least one of his own in the past ten years. He certainly has more knowledge of the field than any fresh Ph.D.

Now you can disagree with his conclusions all you want. They are quite controversial (but lots of Ph.D. theses have controversial conclusions; this is partially how knowledge advances). Go find any place on the internet with a comments section that has run something about him and you’ll find people who write him off because “he got a physics degree, so he’s not an expert on nutrition.” Are we really supposed to ignore 20 years of work just because it wasn’t done at a university and because, years earlier, he earned an unrelated degree? It’s a very bizarre sentiment.

A less controversial example is Dan Carlin. Listen to any one of his Hardcore History podcasts. He loves history, so he obsessively reads about it. Those podcasts are each an example of completing Phase 3 of the Ph.D. And he clearly knows the literature: he references hundreds of pieces of research per episode, off the top of his head. What is a historian? Supposedly it’s someone who has a Ph.D. in history. But Dan has completed all the same phases; it just wasn’t at a university.

(I say this is less controversial, because I think pretty much everyone considers Dan an expert on the topics he discusses except for himself. It’s a stunning display of humility. Those podcasts are the definition of having expertise on a subject.)

As a concluding remark/warning: there are a lot of cranks out there who try to pass themselves off as experts but really aren’t. It’s not easy for most people to tell the difference, so it’s definitely best to err on the side of the degree that went through the gatekeeper of a university when you’re not sure.

But also remember that Ph.D.’s are human too. There are plenty of people like Person A in the example above. You can’t just believe a book someone wrote because that degree is listed after their name. They might have made honest mistakes. They might be conning you. Or, more likely, they might not have a good grasp on the current state of knowledge of the field they’re writing about.

What is an expert? To me, it is someone who has dedicated themselves with enough seriousness and professionalism to get through the phases listed above. This mostly happens with degree programs, but it also happens a lot in the real world, often because someone moves into a new career.

On Google’s AlphaGo

I thought I’d get away from critiques and reviews and serious stuff like that for a week and talk about a cool (or scary) development in AI research. I won’t talk about the details, so don’t get scared off yet. This will be more of a high level history of what happened. Many of my readers are probably unaware this even exists.

Let’s start with the basics. Go is one of the oldest games in existence. And despite appearances, it’s one of the simplest. Each player takes a turn placing a stone on the intersections of a 19×19 board. If you surround a stone or group of stones of your opponent, you capture them (remove them from the board). If you completely surround other intersections, that counts as your “territory.”

The game ends when both sides pass (no more moves can be made to capture or surround territory). The side that has more territory + captures wins. There’s no memorization of how pieces move. There are no rules to learn (except ko, which basically says you can’t play an infinite loop that keeps the game from ever ending). It’s really that simple.

And despite the simplicity, humans have continued to get better and produce more and more advanced theory about the game for over 2,500 years.

Let’s compare Go to Chess for a moment, because most people in the West think of Chess as the gold standard of strategy games. One could study Chess for a whole lifetime and still pale in comparison to the top Grandmasters. When Deep Blue beat Kasparov in 1997, it felt like a blow to humanity.

If you’re at all in touch with the Chess world, you will have succumbed to the computer overlords by now. We can measure the time since Deep Blue’s victory in decades. The AIs have improved so much since then that it is commonly accepted across the whole community that a human will never be able to win against a machine at Chess ever again.

A few years ago, we could at least have said, “But wait, there’s still Go.” To someone who doesn’t have much experience with Go, it might be surprising to learn that computers weren’t even close to winning against a human a few years ago.

Here’s the rough idea why. Chess can be won by pure computation of future moves. There is no doubt that humans use pattern recognition and positional judgment and basic principles when playing, but none of that stands a chance against a machine that just reads out every single combination of the next 20 moves and then picks the best one.

Go, on the other hand, has pattern recognition as a core element of the strategy. One might try to argue that this is only because the calculations are so large, no human could ever do them. Once we have powerful enough computers, a computer could win by pure forward calculation.

As far as I understand it, this is not true. And it was the major problem in making an AI strong enough to win. Even at a theoretical level, having the computer look ahead a few dozen moves would generate more combinations than there are atoms in the known universe. A dozen moves in Chess takes you deep into the game. A dozen moves in Go tells you nothing; it wouldn’t even cover a short opening sequence.
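
To get a rough feel for the gap, here is a minimal back-of-the-envelope sketch. The branching factors (roughly 35 legal moves per position in Chess, roughly 250 in Go) are commonly cited approximations rather than exact figures, so treat the output as an order-of-magnitude illustration only.

```python
# Rough order-of-magnitude comparison of brute-force lookahead.
# The branching factors below are approximate, commonly cited figures.
CHESS_BRANCHING = 35
GO_BRANCHING = 250

def order_of_magnitude(n):
    """Power of ten of a positive integer, i.e. floor(log10(n))."""
    return len(str(n)) - 1

for depth in (12, 24, 36):
    chess_positions = CHESS_BRANCHING ** depth
    go_positions = GO_BRANCHING ** depth
    print(f"{depth} moves ahead: Chess ~10^{order_of_magnitude(chess_positions)}, "
          f"Go ~10^{order_of_magnitude(go_positions)}")
```

Even a 36-move lookahead in Go lands above the oft-quoted ~10^80 atoms in the observable universe while covering only a fraction of a typical game, which is why pure forward calculation was never going to be enough.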

Go definitely has local sections of the game where pure “reading ahead” wins you the situation, but there is still the global concept of surrounding the most territory to consider. It’s somewhat hard to describe in words to someone unfamiliar with the game what exactly I mean here.

[Image: a san-ren-sei opening, with black stones on the three star points along the right side of the board]

Notice how on the right the black stones sort of surround that area. That could quickly turn into territory by fully surrounding it. So how do you get an AI to understand this loose, vague surrounding of an area? One could even imagine much, much looser and vaguer surrounding as well. Humans can instantly see it, but machines cannot, and no amount of calculating further sequences of moves will help.

For years, every winter break from college, I’d go home and watch famous and not-so-famous people easily win matches against the top AI. Even as late as 2014, it wasn’t clear to me that I’d ever see a computer beat a human. The problem was that intractable.

Along came Google. They used a machine learning technique called “Deep Learning” to teach an AI to develop these intuitions. The result was the AlphaGo AI. In March 2016, AlphaGo beat Lee Sedol, arguably the top Go player in the world. It was a five-game match, and AlphaGo won 4-1. This gave humanity some hope that the top players could still manage a match here and there (unlike in Chess).

But then the AI was put on an online Go server secretly under the name “Master.” It has since played pretty much every single top pro in the world. It has won every single game with a record around 60-0. It is now believed that humans will never win against it, just like in Chess.

More theory has been developed about Go than any other game. We’ve had 2,500 years of study. We thought we had figured out sound basic principles and opening theory. AlphaGo has shaken this up. It will often play moves that look bad to a trained eye, but we’re coming to see that many of the basics we once thought of as optimal are not.

It’s sort of disturbing to realize how quickly the machine learned the history of human development and then went on to innovate its own superior strategies. It will be interesting to see if humans can adapt to these new strategies the AI has invented.

The Carter Catastrophe

I’ve been reading Manifold: Time by Stephen Baxter. The book is quite good so far, and it presents a fascinating probabilistic argument that humans will go extinct in the near future. It is sometimes called the Carter Catastrophe, because Brandon Carter first proposed it in 1983.

I’ll use Bayesian arguments, so you might want to review some of my previous posts on the topic if you’re feeling shaky. One thing we didn’t talk all that much about is the idea of model selection. This is the most common thing scientists have to do. If you run an experiment, you get a bunch of data. Then you have to figure out the most likely reason for what you see.

Let’s take a basic example. We have a giant tub of golf balls, and we can’t see inside the tub. There could be 1 ball or a million. We’re told the owner accidentally dropped a red ball in at some point. All the other balls are the standard white golf balls. We decide to run an experiment where we draw a ball out, one at a time, until we reach the red one.

First ball: white. Second ball: white. Third ball: red. We stop. We’ve now generated a data set from our experiment, and we want to use Bayesian methods to give the probability of there being three total balls or seven or a million. In probability terms, we need to calculate the probability that there are x balls in the tub given that we drew the red ball on the third draw. Any time we see this language, our first thought should be Bayes’ theorem.

Define A_i to be the model of there being exactly i balls in the tub. I’ll use “3” inside of P( ) to be the event of drawing the red ball on the third try. We have to make a finiteness assumption, and although this is one of the main critiques of the argument, we can examine what happens as we let the size of the bound grow. Suppose for now the tub can only hold 100 balls.

A priori, we have no idea how many balls are in there, so we’ll assume all “models” are equally likely. This means P(A_i)=1/100 for all i. By Bayes’ theorem we can calculate:

P(A_3|3) = \frac{P(3|A_3)P(A_3)}{\sum_{i=1}^{100}P(3|A_i)P(A_i)} = \frac{(1/3)(1/100)}{(1/100)\sum_{i=3}^{100}1/i} \approx 0.09

So there’s around a 9% chance that there are only 3 balls in the tub. (Note that P(3|A_i) = 0 for i < 3, which is why the sum effectively starts at i = 3.) That bottom summation remains exactly the same when computing P(A_n|3) for any n and equals about 3.69, and the (1/100) cancels out every time. So we can compute explicitly that for n > 3:

P(A_n|3)\approx \frac{1}{n}(0.27)

This is a decreasing function of n, and this shouldn’t be surprising at all. It says that as we guess there are more and more balls in the tub, the probability of that guess goes down. This makes sense, because it’s unreasonable to think we’d see the red one that early if there are actually 100 balls in the tub.
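
Here is a minimal sketch of that calculation in code, using the same uniform prior over 1 to 100 balls (the variable names are mine, purely for illustration):

```python
# Posterior probability of "there are exactly i balls in the tub,"
# given that the red ball was drawn on exactly the 3rd draw.
N_MAX = 100
prior = 1 / N_MAX  # uniform prior over the models A_1, ..., A_100

def likelihood(i):
    # P(red on exactly the 3rd draw | i balls) = 1/i for i >= 3, else 0
    return 1 / i if i >= 3 else 0.0

evidence = sum(likelihood(i) * prior for i in range(1, N_MAX + 1))
posterior = {i: likelihood(i) * prior / evidence for i in range(1, N_MAX + 1)}

print(round(posterior[3], 2))   # ~0.09
print(round(posterior[10], 3))  # ~0.027, i.e. roughly (1/10)(0.27)
print(posterior[3] > posterior[10] > posterior[100])  # True: decreasing in n
```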

There’s lots of ways to play with this. What happens if our tub could hold millions but we still assume a uniform prior? It just takes all the probabilities down, but the general trend is the same: It becomes less and less reasonable to assume large amounts of total balls given that we found the red one so early.

You could also only care about this “earliness” idea and redo the computations where you ask how likely is A_n given that we found the red ball by the third try. This is actually the more typical way the problem is formulated in the Doomsday arguments. It’s more complicated, but the same idea pops out, and this should make intuitive sense.

Part of the reason these computations were somewhat involved is because we tried to get a distribution on the natural numbers. But we could have tried to compare heuristically to get a super clear answer (homework for you). What if we only had two choices “small number of total balls (say 10)” or “large number of total balls (say 10,000)”? You’d find there is around a 99% chance that the “small” hypothesis is correct.
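
And here is a sketch of that two-hypothesis homework version, assuming equal priors on the two models and conditioning on “the red ball was found by the third draw”:

```python
# Two competing models with equal prior probability:
#   "small" = 10 total balls, "large" = 10,000 total balls.
# Data: the red ball showed up somewhere in the first three draws.
def p_red_in_first_three(n_balls):
    # The red ball is equally likely to sit in any of the n positions.
    return 3 / n_balls

like_small = p_red_in_first_three(10)
like_large = p_red_in_first_three(10_000)

post_small = (0.5 * like_small) / (0.5 * like_small + 0.5 * like_large)
print(round(post_small, 4))  # ~0.999 -- the "small" hypothesis dominates
```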

Here’s the leap. Now assume the fact that you exist right now is random. In other words, you popped out at a random point in the existence of humans. So the totality of humans to ever exist are the white balls and you are the red ball. The same type of argument above applies, and it says that the most likely thing is that you aren’t born at some super early point in human history. In fact, it’s unreasonable from a probabilistic standpoint to think that humans will continue much longer at all given your existence.

The “small” total population of humans is far, far more likely than the “large” total population, and the interesting thing is that this remains true even if you mess with the uniform prior. You could assume it is much more likely a priori for humans to continue to make improvements and colonize space and develop vaccines giving a higher prior for the species existing far into the future. But unfortunately the Bayesian argument will still pull so strongly in favor of humans ceasing to exist in the near future that one must conclude it is inevitable and will happen soon!

Anyway. I’m travelling this week, so I’m sorry if there are errors in those calculations. I was in a hurry and never double checked them. The crux of the argument should still make sense even if you don’t get my exact numbers. There’s also a lot of interesting and convincing rebuttals, but I don’t have time to get into them now (including the fact that unlikely hypotheses turn out to be true all the time).

The Infinite Cycle of Gladwell’s David and Goliath

I recently finished reading Malcolm Gladwell’s David and Goliath: Underdogs, Misfits, and the Art of Battling Giants. The book is like most Gladwell books. It has a central thesis, and then interweaves studies and anecdotes to make the case. In this one, the thesis is fairly obvious: sometimes things we think of as disadvantages have hidden advantages and sometimes things we think of as advantages have hidden disadvantages.

The opening story makes the case from the Biblical story of David and Goliath. Read it for more details, but roughly he says that Goliath’s giant strength was a hidden disadvantage because it made him slow. David’s shepherding was a hidden advantage because it made him good with a sling. It looks like the underdog won that fight, but it was really Goliath who was at a disadvantage the whole time.

The main case I want to focus on is the chapter on education, since that is something I’ve talked a lot about here. The case he makes is both interesting and poses what I see as a big problem for the thesis. There is an infinite cycle of hidden advantages/disadvantages that makes it hard to tell if the apparent (dis)advantages are anything but a wash.

Gladwell tells the story of a girl who loves science. She does so well in school and is so motivated that she gets accepted to Brown University. Everyone thinks of an Ivy League education as being full of advantages. It’s hard to think of any way in which there would be a hidden disadvantage that wouldn’t be present in someplace like Small State College (sorry, I don’t remember what her actual “safety school” was).

It turns out that she ended up feeling like a completely inadequate failure despite being reasonably good at it. The people around her were so amazing that she got impostor syndrome and quit science. If she had gone to Small State College, she would have felt amazing, gotten a 4.0, and become a scientist like she wanted.

It turns out we have quite a bit of data on this subject, and this is a general trend. Gladwell then goes on to make just about the most compelling case against affirmative action I’ve ever heard. He points out that letting a minority into a college that they otherwise wouldn’t have gotten into is not an advantage. It’s a disadvantage. Instead of excelling at a smaller school and getting the degree they want, they’ll end up demoralized and quit.

At this point, I want to reiterate that this has nothing to do with actual ability. It is entirely a perception thing. Gladwell is not claiming the student can’t handle the work or some nonsense. The student might even end up an A student. But even the A students at these top schools quit STEM majors because they perceive themselves to be not good enough.

Gladwell implies that this hidden disadvantage is bad enough that the girl at Brown should have gone to Small State College. But if we take Gladwell’s thesis to heart, there’s an obvious hidden advantage within the hidden disadvantage. Girl at Brown was learning valuable lessons by coping with (perceived) failure that she wouldn’t have learned at Small State College.

It seems kind of insane to shelter yourself like this. Becoming good at something always means failing along the way. If girl at Brown had been a sheltered snowflake at Small State College and graduated with her 4.0 never being challenged, that seems like a hidden disadvantage within the hidden advantage of going to the “bad” school. The better plan is to go to the good school, feel like you suck at everything, and then have counselors to help students get over their perceived inadequacies.

As a thought experiment, would you rather have a surgeon who was a B student at the top med school in the country, constantly understanding their limitations, constantly challenged to get better, or the A student at nowhere college who was never challenged and now has an inflated sense of how good they are? The answer is really easy.

This gets us to the main issue I have with the thesis of the book. If every advantage has a hidden disadvantage and vice-versa, this creates an infinite cycle. We may as well throw up our hands and say the interactions of advantages and disadvantages are too complicated to ever tell if anyone is at a true (dis)advantage. I don’t think this is a fatal flaw for Gladwell’s thesis, but I do wish it had been addressed.