Become a Patron!

I’ve come to a crossroads recently.

I write a blog post every week. It takes time. The last one was close to 2,000 words and required reading a book. For the past three years I’ve been writing full time, so blogging can become a burden that cuts into that work with no monetary reward.

This blog is now over nine years old, and I’ve done nothing to monetize it. I think this is mostly a good thing. I do not and will not run any sort of advertisements. Even upon the release of my first book, I only did a brief mention and then no promotion afterward (and as far as I can tell, this converted to literally 0 sales).

I want this to be about the blog content. I do not want it to turn into some secret ad campaign to sell my work. I can think of many authors who have done this, and I ended up unsubscribing from them.

This brings me to the point. Putting this much work into something is not really sustainable anymore without some sort of support, so I’ve started a Patreon page. As you’ll see, my initial goal is quite modest and will barely cover the expenses to run my blog and website. But without anything, I will slowly phase out writing here regularly.

If this concept is new to you, Patreon is a site dedicated to supporting creative work. Patrons can pledge money to support people creating content they like. It can be as little as $1 a month (or, as many podcasters say, “less than a coffee a month”), and in return you not only help keep the site running, you’ll also receive bonus content.

Because of the scattered nature of my posts, I know a lot of you are probably scared to support, because you might not get content of interest for the month. Some of you like the math and tune out for the writing advice. Some of you like the critical analysis of philosophy and wish the articles on game mechanics didn’t exist.

For consistency, the vast majority of posts from now on will be ones that would be tagged “literature.” Everything else will appear once a month or less, and probably never two months in a row (i.e. around six per year, spread out evenly). The “literature” tag includes, but is not limited to, most posts on philosophy that touch on narrative or language somehow, editing rules, writing advice, book reviews, story structure analysis, examining pro’s prose, movie reviews, and so on.

Again, the core original vision for the blog included game and music and math posts, but these will be intentionally fewer now. If you check the past few years, I basically already did this anyway, but this way you know what you’re signing up for.

I think people are drawn to my literature analysis because I’m in a unique position. This month I’m about to submit my fifth romance novel under a pseudonym. This is the “commercial” work I do for money, and it’s going reasonably well. I’ve come to understand the ins and outs of genre fiction through this experience, and it has been a valuable part of learning the craft of writing for me.

My main work under my real name is much more literary. I’ve put out one novel of literary fiction. Next month I’ll put out my second “real” novel, which is firmly in the fantasy genre but hopefully doesn’t give up high-quality prose.

These two opposite experiences have given me an eye for what makes story work and what makes prose work. All over this blog I’ve shown that I love experimental writing, but I’ve also been one of the few people to unapologetically call out BS where I see it.

As you can imagine, writing several genre novels and a “real” novel every year makes it tough to justify this weekly blog for the fun of it.

If I haven’t convinced you that the quality here is worth supporting, I’ll give you one last tidbit. I get to see incoming links thanks to WordPress, so I know that more than one graduate seminar and MFA program has linked to various posts I’ve made on critical theory and difficult literature. Since I’m not in those classes, I can’t be sure of the purpose, but graduate programs tend to only suggest reading things that are worth reading. There just isn’t enough time for anything else.

I know, I know. Print is dead. You’d rather support people making podcasts or videos, but writing is the easiest way to get my ideas across. I listen to plenty of podcasts on writing, but none of them get to dig into things like prose style. The format isn’t conducive to it. One needs to see the text under analysis to really get the commentary on it.

Don’t panic. I won’t decrease blog production through the end of 2017. I’m setting an initial goal of $100 per month, and we’ll go from there, because even that might not be a sustainable level long-term. If it isn’t met, I’ll have to adjust accordingly. It’s just one of those unfortunate business decisions. Sometimes firing someone is the right move, even if they’re your friend.

I’ve set up a bunch of supporter rewards, and I think anyone interested in the blog will find them well worth it. I’m being far more generous than most Patreon pages making similar content. Check out the page for details. The rewards involve seeing me put into practice what I talk about with video of me editing a current project with live commentary; extra fiction I write for free; free copies of my novels; extra “Examining Pro’s Prose” articles; and more!

I hope you find the content here worth supporting (I’m bracing myself for the humiliation of getting $2 a month and knowing it’s from my parents). If you don’t feel you can support the blog, feel free to continue reading and commenting for free. The community here has always been excellent.

What is an Expert?

I’ll tread carefully here, because we live in a strange time of questioning the motives and knowledge of experts to bolster every bizarre conspiracy theory under the sun. No one trusts any information anymore. It’s not even clear whether trusting or doubting expert opinion counts as anti-intellectual or hyper-intellectual. But that isn’t today’s topic.

I listen to quite a few podcasts, and several of them have made me think about expertise recently.

For example, Gary Taubes was on the Sam Harris podcast and both of them often get tarred with the “you don’t have a Ph.D. in whatever, so you’re an unknowledgeable/dangerous quack” brush. Also, Dan Carlin’s Hardcore History podcast is insanely detailed, but every ten minutes he reminds the audience “I’m not a historian …”

Many people who value the importance of expertise think that the degree (the Ph.D. in particular, but maybe an MFA for the arts) is the be-all and end-all of the discussion. If you have the Ph.D., you’re an expert. If you don’t, you’re not.

The argument I want to present is that if you believe this, you really should be willing to extend your definition of expertise to a wider group of people who have essentially done the equivalent work of one of these degrees.

Think of it this way. Person A goes to Subpar University, scrapes by with the minimal work, kind of hates it, and then teaches remedial classes at a Community College for a few years. Person B has a burning passion for the subject, studies all of the relevant literature, and continues to write about and develop novel ideas in the subject for decades. I’d be way more willing to trust Person B as an expert than Person A despite the degree differences.

Maybe I’ve already convinced you, and I need not go any further. Many of you are probably thinking, yeah, but there are parts to doing a degree that can’t be mimicked without the schooling. And others might be thinking, yeah, but Person B is merely theoretical. No one in the real world exists like Person B. We’ll address each of these points separately.

I think of a Ph.D. as having three parts. Phase 1 is the demonstration of competence in the basics. This is often called the Qualifying or Preliminary Exam. Many students don’t fully understand the purpose of this phase while going through it. They think they must memorize and compute. They think of it as a test of basic knowledge.

At least in math and the hard sciences, this is not the case. It is almost a test of attitude. Do you know when you’re guessing? Do you know what you don’t know? Are you able to admit this or will you BS your way through something? Is the basic terminology internalized? You can pass Phase 1 with gaps in knowledge. You cannot pass Phase 1 if you don’t know where those gaps are.

Phase 2 is the accumulation of knowledge of the research done in your sub-sub-(sub-sub-sub)-field. This basically amounts to reading thousands of pages, sometimes from textbooks to get a historical view, but mostly from research papers. It also involves talking to lots of people engaged in similar, related, or practically the same problems as your thesis. You hear their opinions and intuitions about what is true and start to develop your own intuitions.

Phase 3 is the original contribution to the literature. In other words, you write the thesis. To get a feel for the difficulty and time commitment of each step: in a five-year Ph.D., Phase 1 would ideally be done in around a year, Phase 2 takes two to four years, and Phase 3 takes around a year (with overlap between the phases).

I know a lot of people aren’t going to like what I’m about to say, but the expertise gained from a Ph.D. is almost entirely the familiarization with the current literature. It’s taking the time to read and understand everything being done in the field.

Phase 1 is basically about not wasting people’s time and money. If someone isn’t going to understand what they read in Phase 2 and will make careless mistakes in Phase 3, it’s best to weed them out in Phase 1. But you aren’t gaining any expertise in Phase 1, because it’s all still just the basics.

One of the main reasons people don’t gain Ph.D.-level expertise without actually doing the degree is because being in such a program forces you to compress all that reading into a small time-frame (yes, reading for three years is short). It’s going to take someone doing it as a hobby two or three times longer, and even then, they’ll be tempted to just give up without the external motivation of the degree looming over them.

Also, without a motivating thesis problem, you won’t have the narrow focus needed to make the reading and learning manageable. I know everyone tackles this in different ways, but here’s how it worked for me. I’d take a paper on a related topic and try to adapt its techniques and ideas to my problem. This forced me to really understand what made those techniques work, which often involved learning a bunch of things I wouldn’t have if I had just read through the paper to see the results.

Before moving on, I’d like to add that upon completion of a Ph.D. you know pretty much nothing outside of your sub-sub-(sub-sub-sub)-field. It will take many years of continued teaching and researching and reading and publishing and talking to people to get any sense of your actual sub-field.

Are there people who complete the equivalent of the three listed phases without an actual degree?

I’ll start with the more controversial example of Gary Taubes. He got a physics undergrad degree and a master’s in aerospace engineering. He then went into science journalism. He stumbled upon how complicated and shoddy the science of nutrition was and started to research a book.

Five years later, he had read and analyzed pretty much every single nutrition study done. He interviewed six hundred doctors and researchers in the field. If this isn’t Phase 2 of a Ph.D., I don’t know what is. Most students won’t have gone this in-depth to learn the state of the field in an actual Ph.D. program.

Based on all of this, he then wrote a meticulously cited book Good Calories, Bad Calories. The bibliography is over 60 pages long. If this isn’t Phase 3 of a Ph.D., I don’t know what is. He’s continued to stay abreast of studies and has done at least one of his own in the past ten years. He certainly has more knowledge of the field than any fresh Ph.D.

Now you can disagree with his conclusions all you want. They are quite controversial (but lots of Ph.D. theses have controversial conclusions; this is partially how knowledge advances). Go find any place on the internet with a comments section that has run something about him, and you’ll find people who write him off because “he got a physics degree, so he’s not an expert on nutrition.” Are we really supposed to ignore twenty years of work just because it wasn’t done at a university and the degree he earned years earlier was in an unrelated field? It’s a very bizarre sentiment.

A less controversial example is Dan Carlin. Listen to any one of his Hardcore History podcasts. He loves history, so he obsessively reads about it. Each of those podcasts is an example of completing Phase 3 of the Ph.D. He also clearly knows the literature, since he references hundreds of pieces of research per episode off the top of his head. What is a historian? Supposedly it’s someone who has a Ph.D. in history. But Dan has completed all the same phases; it just wasn’t at a university.

(I say this is less controversial, because I think pretty much everyone considers Dan an expert on the topics he discusses except for himself. It’s a stunning display of humility. Those podcasts are the definition of having expertise on a subject.)

As a concluding remark/warning: there are a lot of cranks out there who try to pass themselves off as experts but really aren’t. It’s not easy for most people to tell the difference, so when you’re not sure, it’s definitely best to err on the side of the degree that went through the gatekeeping of a university.

But also remember that Ph.D.’s are human too. There are plenty of people like Person A in the example above. You can’t just believe a book someone wrote because that degree is listed after their name. They might have made honest mistakes. They might be conning you. Or, more likely, they might not have a good grasp on the current state of knowledge in the field they’re writing about.

What is an expert? To me, it is someone who has dedicated themselves with enough seriousness and professionalism to get through the phases listed above. This mostly happens with degree programs, but it also happens a lot in the real world, often because someone moves into a new career.

On Google’s AlphaGo

I thought I’d get away from critiques and reviews and serious stuff like that for a week and talk about a cool (or scary) development in AI research. I won’t talk about the details, so don’t get scared off yet. This will be more of a high level history of what happened. Many of my readers are probably unaware this even exists.

Let’s start with the basics. Go is arguably the oldest game in existence. And despite appearances, it’s one of the simplest. Each player takes a turn placing a stone on the intersections of a 19×19 board. If you surround a stone or group of stones of your opponent, you capture them (remove them from the board). If you completely surround other intersections, that counts as your “territory.”

The game ends when both sides pass (no more moves can be made to capture or surround territory). The side with more territory plus captures wins. There’s no memorization of how pieces move. There are no other rules to learn (except ko, which basically says you can’t endlessly repeat the same position, which would keep the game from ever ending). It’s really that simple.

And despite the simplicity, humans have continued to get better and produce more and more advanced theory about the game for over 2,500 years.

Let’s compare Go to Chess for a moment, because most people in the West think of Chess as the gold standard of strategy games. One could study chess for a whole lifetime and still pale in comparison to the top Grand Masters. When Deep Blue beat Kasparov in 1997, it felt like a blow to humanity.

If you’re at all in touch with the Chess world, you will have succumbed to the computer overlords by now. We can measure the time since Deep Blue’s victory in decades. The engines have improved so much since then that it is commonly accepted across the whole community that a human will never be able to win against a top machine at Chess again.

A few years ago, we could at least have said, “But wait, there’s still Go.” To someone who doesn’t have much experience with Go, it might be surprising to learn that computers weren’t even close to winning against a human a few years ago.

Here’s the rough idea why. Chess can be won by pure computation of future moves. There is no doubt that humans use pattern recognition and positional judgment and basic principles when playing, but none of that stands a chance against a machine that just reads out every single combination of the next 20 moves and then picks the best one.

Go, on the other hand, has pattern recognition as a core element of the strategy. One might try to argue that this is only because the calculations are so large, no human could ever do them. Once we have powerful enough computers, a computer could win by pure forward calculation.

As far as I understand it, this is not true. And it was the major problem in making an AI strong enough to win. Even at a theoretical level, having the computer look ahead a dozen moves generates an astronomically large number of combinations. A dozen moves in Chess takes you deep into the game. A dozen moves in Go tells you nothing; it wouldn’t even cover a short opening sequence.

Go definitely has local sections of the game where pure “reading ahead” wins you the situation, but there is still the global concept of surrounding the most territory to consider. It’s somewhat hard to describe in words to someone unfamiliar with the game what exactly I mean here.

[Image: a san-ren-sei opening position]

Notice how on the right the black stones sort of surround that area. That could quickly turn into territory by fully surrounding it. So how do you get an AI to understand this loose, vague surrounding of an area? One could imagine a much, much looser and vaguer surrounding as well. Humans can instantly see it, but machines cannot, and no amount of calculating further sequences of moves will help.

For years, every winter break from college, I’d go home and watch famous and not-so-famous people easily win matches against the top AI. Even as late as 2014, it wasn’t clear to me that I’d ever see a computer beat a human. The problem was that intractable.

Along came Google. They used a machine learning technique called “deep learning” to teach an AI to develop these intuitions. The result was the AlphaGo AI. In March 2016, AlphaGo beat Lee Sedol, arguably the top Go player in the world. It was a five-game match, and AlphaGo won 4-1. This gave humanity some hope that the top players could still manage a game here and there (unlike in Chess).

But then the AI was secretly put on an online Go server under the name “Master.” It proceeded to play pretty much every top pro in the world and won every single game, finishing with a record of around 60-0. It is now believed that humans will never win against it, just like in Chess.

More theory has been developed about Go than any other game. We’ve had 2,500 years of study. We thought we had figured out sound basic principles and opening theory. AlphaGo has shaken this up. It will often play moves that look bad to a trained eye, but we’re coming to see that many of the basics we once thought of as optimal are not.

It’s sort of disturbing to realize how quickly the machine learned from the history of human development and then went on to innovate its own superior strategies. It will be interesting to see if humans can adapt to these new strategies the AI has invented.

The Carter Catastrophe

I’ve been reading Manifold: Time by Stephen Baxter. The book is quite good so far, and it presents a fascinating probabilistic argument that humans will go extinct in the near future. It is sometimes called the Carter Catastrophe, because Brandon Carter first proposed it in 1983.

I’ll use Bayesian arguments, so you might want to review some of my previous posts on the topic if you’re feeling shaky. One thing we didn’t talk all that much about is the idea of model selection. This is the most common thing scientists have to do. If you run an experiment, you get a bunch of data. Then you have to figure out the most likely reason for what you see.

Let’s take a basic example. We have a giant tub of golf balls, and we can’t see inside the tub. There could be 1 ball or a million. We’re told the owner accidentally dropped a red ball in at some point. All the other balls are the standard white golf balls. We decide to run an experiment where we draw a ball out, one at a time, until we reach the red one.

First ball: white. Second ball: white. Third ball: red. We stop. We’ve now generated a data set from our experiment, and we want to use Bayesian methods to give the probability of there being three total balls or seven or a million. In probability terms, we need to calculate the probability that there are x balls in the tub given that we drew the red ball on the third draw. Any time we see this language, our first thought should be Bayes’ theorem.

Define A_i to be the model of there being exactly i balls in the tub. I’ll use “3” inside of P( ) to be the event of drawing the red ball on the third try. We have to make a finiteness assumption, and although this is one of the main critiques of the argument, we can examine what happens as we let the size of the bound grow. Suppose for now the tub can only hold 100 balls.

A priori, we have no idea how many balls are in there, so we’ll assume all “models” are equally likely. This means P(A_i)=1/100 for all i. By Bayes’ theorem we can calculate:

P(A_3|3) = \frac{P(3|A_3)P(A_3)}{\sum_{i=1}^{100}P(3|A_i)P(A_i)} = \frac{(1/3)(1/100)}{(1/100)\sum_{i=3}^{100}1/i} \approx 0.09

So there’s around a 9% chance that there are only 3 balls in the tub. The summation in the denominator remains exactly the same when computing P(A_n | 3) for any n and equals about 3.69, and the (1/100) cancels out every time. So we can compute explicitly that for n ≥ 3:

P(A_n|3)\approx \frac{1}{n}(0.27)

This is a decreasing function of n, and this shouldn’t be surprising at all. It says that as we guess there are more and more balls in the tub, the probability of that guess goes down. This makes sense, because it’s unreasonable to think we’d see the red one that early if there are actually 100 balls in the tub.
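If you want to check these numbers, here’s a minimal sketch in Python (the variable and function names are my own, not anything from the book or the argument) that computes the posterior over the models directly from Bayes’ theorem:

```python
from fractions import Fraction

N_MAX = 100   # assume the tub holds at most 100 balls
DRAW = 3      # the red ball appeared on the third draw

# Likelihood P(red on draw 3 | exactly n balls): with n balls the red one
# is equally likely to sit in any position, so this is 1/n (and 0 if n < 3).
def likelihood(n, draw=DRAW):
    return Fraction(1, n) if n >= draw else Fraction(0)

# Uniform prior over the models A_1, ..., A_100.
prior = Fraction(1, N_MAX)

# Bayes' theorem: posterior is likelihood * prior, normalized by the evidence.
evidence = sum(likelihood(n) * prior for n in range(1, N_MAX + 1))
posterior = {n: likelihood(n) * prior / evidence for n in range(1, N_MAX + 1)}

print(float(posterior[3]))                                      # ~0.09
print(float(sum(likelihood(n) for n in range(1, N_MAX + 1))))   # ~3.69
print(float(posterior[10]), 0.27 / 10)                          # both ~0.027
```

Increasing N_MAX (and switching the Fractions to plain floats if you make it very large) shows the millions-of-balls behavior mentioned next: every posterior shrinks, but the same decreasing trend in n remains.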

There’s lots of ways to play with this. What happens if our tub could hold millions but we still assume a uniform prior? It just takes all the probabilities down, but the general trend is the same: It becomes less and less reasonable to assume large amounts of total balls given that we found the red one so early.

You could also only care about this “earliness” idea and redo the computations where you ask how likely is A_n given that we found the red ball by the third try. This is actually the more typical way the problem is formulated in the Doomsday arguments. It’s more complicated, but the same idea pops out, and this should make intuitive sense.

Part of the reason these computations were somewhat involved is because we tried to get a distribution on the natural numbers. But we could have tried to compare heuristically to get a super clear answer (homework for you). What if we only had two choices “small number of total balls (say 10)” or “large number of total balls (say 10,000)”? You’d find there is around a 99% chance that the “small” hypothesis is correct.
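For anyone who wants to check their homework, here’s how that comparison works out under the assumption that the two hypotheses are equally likely a priori. Seeing the red ball on the third draw has likelihood 1/10 under the “small” model and 1/10,000 under the “large” one, so Bayes’ theorem gives

P(\text{small}|3) = \frac{(1/10)(1/2)}{(1/10)(1/2)+(1/10000)(1/2)} = \frac{1000}{1001} \approx 0.999

which is where the “around 99%” figure comes from.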

Here’s the leap. Now assume the fact that you exist right now is random. In other words, you popped out at a random point in the existence of humans. So the totality of humans to ever exist are the white balls and you are the red ball. The same type of argument above applies, and it says that the most likely thing is that you aren’t born at some super early point in human history. In fact, it’s unreasonable from a probabilistic standpoint to think that humans will continue much longer at all given your existence.

The “small” total population of humans is far, far more likely than the “large” total population, and the interesting thing is that this remains true even if you mess with the uniform prior. You could assume it is much more likely a priori for humans to continue to make improvements and colonize space and develop vaccines giving a higher prior for the species existing far into the future. But unfortunately the Bayesian argument will still pull so strongly in favor of humans ceasing to exist in the near future that one must conclude it is inevitable and will happen soon!

Anyway. I’m travelling this week, so I’m sorry if there are errors in those calculations. I was in a hurry and never double checked them. The crux of the argument should still make sense even if you don’t get my exact numbers. There’s also a lot of interesting and convincing rebuttals, but I don’t have time to get into them now (including the fact that unlikely hypotheses turn out to be true all the time).

The Infinite Cycle of Gladwell’s David and Goliath

I recently finished reading Malcolm Gladwell’s David and Goliath: Underdogs, Misfits, and the Art of Battling Giants. The book is like most Gladwell books. It has a central thesis, and then interweaves studies and anecdotes to make the case. In this one, the thesis is fairly obvious: sometimes things we think of as disadvantages have hidden advantages and sometimes things we think of as advantages have hidden disadvantages.

The opening story makes the case from the Biblical story of David and Goliath. Read it for more details, but roughly he says that Goliath’s giant strength was a hidden disadvantage because it made him slow. David’s shepherding was a hidden advantage because it made him good with a sling. It looks like the underdog won that fight, but it was really Goliath who was at a disadvantage the whole time.

The main case I want to focus on is the chapter on education, since that is something I’ve talked a lot about here. The case he makes is both interesting and poses what I see as a big problem for the thesis. There is an infinite cycle of hidden advantages/disadvantages that makes it hard to tell if the apparent (dis)advantages are anything but a wash.

Gladwell tells the story of a girl who loves science. She does so well in school and is so motivated that she gets accepted to Brown University. Everyone thinks of an Ivy League education as being full of advantages. It’s hard to think of any way in which there would be a hidden disadvantage that wouldn’t be present in someplace like Small State College (sorry, I don’t remember what her actual “safety school” was).

It turns out that she ended up feeling like a completely inadequate failure despite doing reasonably well. The people around her were so amazing that she got impostor syndrome and quit science. If she had gone to Small State College, she would have felt amazing, gotten a 4.0, and become a scientist like she wanted.

It turns out we have quite a bit of data on this subject, and this is a general trend. Gladwell then goes on to make just about the most compelling case against affirmative action I’ve ever heard. He points out that letting a minority into a college that they otherwise wouldn’t have gotten into is not an advantage. It’s a disadvantage. Instead of excelling at a smaller school and getting the degree they want, they’ll end up demoralized and quit.

At this point, I want to reiterate that this has nothing to do with actual ability. It is entirely a perception thing. Gladwell is not claiming the student can’t handle the work or some nonsense. The student might even end up an A student. But even the A students at these top schools quit STEM majors because they perceive themselves to be not good enough.

Gladwell implies that this hidden disadvantage is bad enough that the girl at Brown should have gone to Small State College. But if we take Gladwell’s thesis to heart, there’s an obvious hidden advantage within the hidden disadvantage. Girl at Brown was learning valuable lessons by coping with (perceived) failure that she wouldn’t have learned at Small State College.

It seems kind of insane to shelter yourself like this. Becoming good at something always means failing along the way. If girl at Brown had been a sheltered snowflake at Small State College and graduated with her 4.0 never being challenged, that seems like a hidden disadvantage within the hidden advantage of going to the “bad” school. The better plan is to go to the good school, feel like you suck at everything, and then have counselors to help students get over their perceived inadequacies.

As a thought experiment, would you rather have a surgeon who was a B student at the top med school in the country, constantly understanding their limitations, constantly challenged to get better, or the A student at nowhere college who was never challenged and now has an inflated sense of how good they are? The answer is really easy.

This gets us to the main issue I have with the thesis of the book. If every advantage has a hidden disadvantage and vice versa, this creates an infinite cycle. We may as well throw up our hands and say the interactions of advantages and disadvantages are too complicated to ever tell whether anyone is at a true (dis)advantage. I don’t think this is a fatal flaw for Gladwell’s thesis, but I do wish it had been addressed.

On Switching to Colemak

There’s something many people will probably go their whole lives without ever knowing about: a ton of alternative keyboard layouts exist besides the default “QWERTY” (named for the letters along the top row of the keyboard). There is a subculture obsessed with this.

The two most common ones are Dvorak and Colemak. Last Saturday I started learning where the letters on Colemak are located. By the end of Sunday, I had them memorized. This meant I could type very slowly (3-5 wpm) with near perfect accuracy.

It didn’t take long to learn at all. Now, a few days later, I no longer have to think about where the letters are, but it will probably be another week or so before I get back to full speed.

Let’s talk about the burning question in everyone’s mind: why would anyone subject themselves to such an experience? I type a lot. For the past year or so I’ve experienced some mild pain in my wrists. I’ve never had it diagnosed to know if it is repetitive strain injury, but my guess is it’s a bad sign if you experience any pain, no matter how small.

I tried to alleviate some stress by tilting my keyboard and giving my wrists something to rest on:

[Image: my keyboard tilted up, with a makeshift wrist rest in front of it]

[Yes, that’s Aristotle’s Poetics under the front of the keyboard.]

This helped a little, but the more I looked into it, the more I realized there was a fundamental issue with the keyboard layout itself that could be part of the problem. Most people probably assume such a strange layout must exist for a reason, and it did. But we’ve outgrown that purpose.

The history of this is long and somewhat interesting, but it basically boils down to making sure the hands alternate and that common digraphs (two-letter combinations) are separated by large distances, so that a mechanical typewriter would be least likely to jam when someone typed quickly.

If one were to design a keyboard to minimize injury, one would put the most common letters on the home row, minimize long stretches, and make sure common digraphs use different but nearby fingers. This is almost exactly the philosophy of the Colemak layout.

The Colemak layout lets you type roughly 34 times as many words entirely on the home row as QWERTY does. It’s sort of insane that “j” is on the QWERTY home row while “e” and “i” are not. Colemak also distributes the workload more evenly between the hands: it favors the right hand only slightly, at about 6%, compared to QWERTY’s much larger imbalance of around 15% (toward the left hand). You can go look up the stats if you want to know more. I won’t bore you by listing them here.

You will definitely lose some work time to slow typing while making the change, but the layout is measurably more efficient, so in the long run you’ll end up more than compensated for the short-term losses.

I’d like to end by reflecting on what a surreal experience this has been. I think I first started learning to type around the age of eight. I’m now thirty. That’s twenty-two years of constant ingraining of certain actions that had to be undone. Typing has to be subconscious to be effective. We don’t even think about letters or spelling when doing it. Most words are just patterns that roll off the fingers.

This is made explicitly obvious when I get going at a reasonable speed. I can type in Colemak without confusion letter-by-letter, but I still slip up when my speed hits that critical point where I think whole words at a time. At that point, a few words of nonsense happen before I slide back into correct words. It’s very strange, because I don’t even notice it until I look back and see that it happened.

I’ve never become fluent in another language, but I imagine a similar thing must happen when one is right on the edge of being able to think in the new language. You can speak fluently, but occasionally the subconscious brain takes over for a word, even if you know the word.

If you’re at all interested, I’d recommend checking into it. I already feel a huge difference in comfort level.

Confounding Variables and Apparent Bias

I was going to call this post something inflammatory like #CylonLivesMatter but decided against it. Today will be a thought experiment to clarify some confusion over whether apparent bias is real bias based on aggregate data. I’ll unpack all that with a very simple example.

Let’s suppose we have a region, say a county, and we are trying to tell if car accidents disproportionately affect cylons due to bias. If you’re unfamiliar with this term, it comes from Battlestar Galactica. They were the “bad guys,” but they had absolutely no distinguishing features. From looking at them, there was no way to tell if your best friend was one or not. I want to use this for the thought experiment so that we can be absolutely certain there is no bias based on appearance.

The county we get our data from has roughly two main areas: Location 1 and Location 2. Location 1 has 5 cylons and 95 humans. Location 2 has 20 cylons and 80 humans. This means the county is 12.5% cylon and 87.5% human.

Let’s assume that there is no behavioral reason for the people of Location 1 to have safer driving habits. Let’s assume it is merely an environmental thing, say the roads are naturally wider and the speed limits lower or something. They only average 1 car accident per month. Location 2, on the other hand, has poorly designed roads and bad visibility in areas, so it has 10 car accidents per month.

At the end of the year, if there is absolutely no bias at all, we would expect to see 12 car accidents uniformly distributed among the population of Location 1 and 120 car accidents uniformly distributed among the population of Location 2. This means Location 1 had roughly 1 cylon and 11 humans in accidents (5% of 12 accidents, rounded up), and Location 2 had 24 cylons and 96 humans in accidents.

We work for the county, and we take the full statistics: 25 cylon accidents and 107 human accidents. That means 19% of car accidents involve cylons, even though their population in the county is only 12.5%. As an investigator into this matter, we now try to conclude that since there is a disproportionate number of cylons in car accidents with respect to their baseline population, there must be some bias or speciesism present causing this.
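Here’s a minimal sketch of that arithmetic in Python, using only the numbers from the thought experiment above, just to make the aggregation step explicit:

```python
# Population and yearly accident counts for the two locations.
locations = {
    "Location 1": {"cylons": 5,  "humans": 95, "accidents": 12},   # 1 per month
    "Location 2": {"cylons": 20, "humans": 80, "accidents": 120},  # 10 per month
}

total_cylon_accidents = 0.0
total_accidents = 0

for name, loc in locations.items():
    population = loc["cylons"] + loc["humans"]
    cylon_share = loc["cylons"] / population
    # No bias: accidents hit the local population uniformly at random.
    cylon_accidents = loc["accidents"] * cylon_share
    total_cylon_accidents += cylon_accidents
    total_accidents += loc["accidents"]
    print(f"{name}: {cylon_share:.0%} cylon, {cylon_accidents:.1f} cylon accidents")

print(f"County population: {25 / 200:.1%} cylon")                                # 12.5%
print(f"County accidents: {total_cylon_accidents / total_accidents:.1%} cylon")  # ~19%
```

The county-wide accident share comes out around 19% even though no location treats cylons any differently from humans.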

Now I think everyone knows where this is going. It is clear from the example that combining together all the numbers from across the county, and then saying that the disproportionately high number of cylon car accidents had to be indicative of some underlying, institutional problem, was the incorrect thing to do. But this is the standard rhetoric of #blacklivesmatter. We hear that blacks make up roughly 13% of the population but are 25% of those killed by cops. Therefore, that basic disparity is indicative of racist motives by the cops, or at least is an institutional bias that needs to be fixed.

Recently, a more nuanced study has been making the news rounds that claims there isn’t a bias in who cops kill. How can this be? Well, what happened in our example case to cause the misleading information? A disproportionate number of cylons lived in environmental conditions that caused the car accidents. It wasn’t anyone’s fault. There wasn’t bias or speciesism at work. The lack of nuance in analyzing the statistics caused apparent bias that wasn’t there.

The study by Fryer does this. It builds a model that takes into account one uncontroversial environmental factor: we expect more accidental, unnecessary shootings by cops in more dangerous locations. In other words, we expect that, regardless of race, cops will shoot out of fear for their lives in locations where higher chances of violent crimes occur.

As with any study, there is always pushback. Mathbabe had a guest post pointing to some potential problems with sampling. I’m not trying to make any sort of statement with this post. I’ve talked about statistics a lot on the blog, and I merely wanted to show how such a study is possible with a less charged example. I know a lot of the initial reaction to the study was: But 13% vs 25%!!! Of course it’s racism!!! This idiot just has an agenda, and he’s manipulating data for political purposes!!!

Actually, when we only look at aggregate statistics across the entire country, we can accidentally pick up apparent bias where none exists, as in the example. The study just tries to tease these confounding factors out. Whether it did a good job is the subject of another post.

The Ethics of True Knowledge

This post will probably be a mess. I listen to lots of podcasts while running and exercising. There was a strange confluence of topics that seemed to hit all at once from several unrelated places. Sam Harris interviewed Neil deGrasse Tyson, and they talked a little about recognizing alien intelligence and the rabbit hole of postmodernist interpretations of knowledge (more on this later). Daniel Kaufman talked with Massimo Pigliucci about philosophy of math.

We’ll start with a fundamental fact that must be acknowledged: we’ve actually figured some things out. In other words, knowledge is possible. Maybe there are some really, really, really minor details that aren’t quite right, but the fact that you are reading this blog post on a fancy computer is proof that we aren’t just wandering aimlessly in the dark when it comes to the circuitry of a computer. Science has succeeded in many places, and it remains the only reliable way to generate knowledge at this point in human history.

Skepticism is the backbone of science, but there is a postmodernist rabbit hole one can get sucked into by taking it too far. I won’t make the standard rebuttals to radical skepticism, but instead I’ll make an appeal to ethics. I’ve written about this many times, two of which are here and here. It is basically a variation on Clifford’s paper The Ethics of Belief.

The short form is that good people will do good things if they have good information, but good people will often do bad things unintentionally if they have bad information. Thus it is an ethical imperative to always strive for truth and knowledge.

I’ll illuminate what I mean with an example. The anti-vaccine people have their hearts in the right place. They don’t intend to cause harm. They actually think that vaccines are harmful, so it is the bad information causing them to act unethically. I picked this example because it exemplifies the main problem I wanted to get to.

It is actually very difficult to criticize their arguments in general terms. They are skeptical of the science for reasons that are usually good. They claim big corporations stand to lose a lot of money, so they are covering up the truth. Typically, this is one of the times it is good to question the science, because there are actual examples where money has led to bad science in the past. Since I already mentioned Neil deGrasse Tyson, I’ll quote him for how to think about this.

“A skeptic will question claims, then embrace the evidence. A denier will question claims, then deny the evidence.”

This type of thing can be scary when we, as non-experts, still have to figure out what is true or risk unintentional harm in less clear-cut cases. No one has time to examine all of the evidence on every issue to figure out what to embrace. So we have to rely on experts to tell us what the evidence says. But then the skeptic chimes in: an appeal to authority is a logical fallacy, and those experts are paid by people with a stake in the outcome, which creates a conflict of interest.

Ah! What is one to do? My answer is to go back to our starting point. Science actually works for discovering knowledge. Deferring to scientific consensus on issues is the ethically responsible thing to do. If they are wrong, it is almost certainly going to be an expert within the field that finds the errors and corrects them. It is highly unlikely that some Hollywood actor has discovered a giant conspiracy and also has the time and training to debunk the scientific papers that go against them.

Science has been wrong; anything is possible, but one must go with what is probable.

I said this post would be a mess and brought up philosophy of math at the start, so how does that have anything to do with what I just wrote? Maybe nothing, but it’s connected in my mind in a vague way.

Some people think mathematical objects are inherent in nature. They “actually exist” in some sense. This is called Platonism. Other people think math is just an arbitrary game where we manipulate symbols according to rules we’ve made up. I tend to take the embodied mind philosophy of math as developed by Lakoff and Nunez.

They claim that mathematics itself is purely a construct of our embodied minds, but it isn’t an “arbitrary” set of rules like chess. We’ve struck upon axioms (Peano or otherwise) and logic that correspond to how we perceive the world. This is why it is useful in the real world.

To put it more bluntly: Aliens, whose embodied experience of the world might be entirely different, might strike upon an entirely different mathematics that we might not even recognize as such but be equally effective at describing the world as they perceive it. Therefore, math is not mind independent or even universal among all intelligent minds, but is still useful.

To tie this back to the original point, I was wondering if we would even recognize aliens as intelligent if their way of expressing it was so different from our own that their math couldn’t even be recognized as such to us. Would they be able to express true knowledge that was inaccessible to us? What does this mean in relation to the ethics of belief?

Anyway, I’m thinking about making this a series on the blog. Maybe I’ll call it RRR: Random Running Ramblings, where I post random questions that occur to me while listening to something while running.

Draw Luck in Card Games, Part 2

A few weeks ago I talked about draw luck in card games. I thought I’d go a little further today with the actual math behind some core concepts in card games where you build your own deck. The same ideas work for computing probabilities in poker, so you don’t need to get too hung up on the particulars here.

I’m going to use Magic: The Gathering (MTG) as an example. Here are the relevant axioms we will use:

1. Your deck will consist of 60 cards.
2. You start by drawing 7 cards.
3. Each turn you draw 1 card.
4. Each card has a “cost” to play it (called mana).
5. Optimal strategy is to play a cost 1 card on turn 1, a cost 2 card on turn 2, and so on. This is called “playing on curve.”

You don’t have to know anything about MTG now that you have these axioms (in fact, writing them this way allows you to convert everything to Hearthstone, or your card game of choice). Of course, every single one of those axioms can be affected by play, so this is a vast oversimplification. But it gives a good reference point if you’ve never seen anything like this type of analysis before. Let’s build up the theory little by little.

First, what is the probability of being able to play a 1-cost card on turn 1 if you put, say, 10 of these in your deck? We’ll simplify axiom 2 to get started. Suppose you only draw one card to start. Basically, by definition of probability, you have a 10/60, or 16.67% chance of drawing it. Now if you draw 2 cards, it already gets a little bit trickier. Exercise: Try to work it out to see why (hint: the first card could be 1-cost OR the second OR both).

Let’s reframe the question. What’s the probability of NOT being able to play a card turn 1 if you draw 2 cards? You would have to draw a non-1-cost AND another non-1-cost. The first card you pick up has a 50/60 chance of this happening. Now the deck only has 59 cards left, and 49 of those are non-1-cost. So the probability of not being able to play turn 1 is {\frac{50}{60}\cdot\frac{49}{59}}, or about a 69% chance.

To convert this back, we get that the probability of being able to play the 1-cost card on turn 1 (if start with 2 cards) is {\displaystyle 1- \frac{50\cdot 49}{60\cdot 59}}, or about a 31% chance.

Axiom 2 says that in the actual game we start by drawing 7 cards. The pattern above continues in this way, so if we put {k} 1-cost cards in our deck, the probability of being able to play one of these on turn 1 is:

{\displaystyle 1 - \frac{(60-k)\cdot (60-k-1)\cdots (60-k-6)}{60\cdot 59\cdots 54} = 1 - \frac{{60-k \choose 7}}{{60 \choose 7}}}.

To calculate the probability of hitting a 2-cost card on turn 2, we just change the 7 to an 8, since we’ll be getting 8 cards by axiom 3. The {k} becomes however many 2-cost cards we have.
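If you want to play with these numbers yourself, here’s a minimal sketch in Python (the function name is my own; math.comb handles the binomial coefficients):

```python
from math import comb  # Python 3.8+

DECK_SIZE = 60

def prob_on_curve(k, cards_seen):
    """Probability of having drawn at least one of the k copies of a card
    after seeing `cards_seen` cards from a 60-card deck."""
    return 1 - comb(DECK_SIZE - k, cards_seen) / comb(DECK_SIZE, cards_seen)

# 10 one-cost cards, drawing only 2 cards: ~31%, as computed above.
print(round(prob_on_curve(10, 2), 2))  # 0.31

# Turn 1 with a 7-card hand and 6 one-cost cards: just over 50%.
print(round(prob_on_curve(6, 7), 2))   # 0.54

# Turn 2 sees 8 cards; 5 two-cost cards also clears 50%.
print(round(prob_on_curve(5, 8), 2))   # 0.52
```

Plugging in different values of k and the number of cards seen lets you check curves like the ones below, keeping in mind that everything rests on the simplifying axioms above.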

Here’s a nice little question: Is it possible to make a deck where we have a greater than 50% chance of playing on curve every turn for the first 6 turns? We just compute the {k} above that makes each probability greater than {0.5}. This requires putting the following number of cards in your deck:

6 1-cost
5 2-cost
5 3-cost
4 4-cost
4 5-cost
3 6-cost

Even assuming you put 24 lands in your deck, this still gives you tons of extra cards. Let’s push this a little further. Can you make a deck that has a better than 70% chance of playing on curve every turn? Yes!

9 1-cost
8 2-cost
7 3-cost
7 4-cost
6 5-cost
6 6-cost

Warning: This mana curve would never be used by any sort of competitive deck. This is a thought experiment with tons of simplifying assumptions. The curve for your deck is going to depend on a huge number of things. Most decks will probably value playing on curve in the 2, 3, 4 range way more than on other turns. If you have an aggressive deck, you might value the early game. If you play a control deck, you might value the later game.

Also, the longer the game goes, the fewer cards you probably need in the high-cost range to get those probabilities up, because there will be ways to hunt through your deck that increase the chance of finding them. What’s more, all of these estimates are conservative, because MTG allows you to mulligan a bad starting hand. This means many worst-case scenarios get thrown out, giving you an even better chance at playing on curve.

This brings us back to the point being made in the previous post. Sometimes what feels like “bad luck” could be poor deck construction. This is an aspect you have full control over, and if you keep feeling like you aren’t able to play a card, you might want to work these probabilities out to make a conscious choice about how likely you are to draw certain cards at certain points of the game.

Once you know the probabilities, you can make more informed strategic decisions. This is exactly how professional poker is played.

Draw Luck in Card Games

Every year, around this time, I like to do a post on some aspect of game design in honor of the 7DRL Challenge. Let’s talk about something I hate: card games (though I sometimes become obsessed with, and love, well-made ones). For a game to be competitive, luck must be minimized or controlled in some way.

My family is obsessed with Canasta. I don’t get the appeal at all. This is a game that can take 1-2 hours to play and amounts to taking a random hand of cards and sorting them into like piles.

I’ve seen people say there is “strategy” on various forums. I’ll agree in a limited sense. There is almost always just one correct play, and if you’ve played a few times, that play will be immediately obvious to you. This means that everyone playing the game will play the right moves. This isn’t usually what is meant by “strategy.” By definition, the game is completely decided by the cards you draw.

This is pure tedium. Why would anyone want to sit down, flip a coin without looking at it, perform a sorting task over and over for an hour or more, and then look at the result of the coin flip to determine that whoever won the flip won the “game”? That is almost exactly the game of Canasta. There are similar (but less obnoxious) bureaucratic jobs that people are paid to do, and those people hate their jobs.

Not to belabor this point, but imagine you are told to put a bunch of files into alphabetical order, and each time you finish, someone comes into the room and throws the files into the air. You then have to pick them up and sort them again. Why would you take on this task as a leisure activity?

I’ve asked my family this before, and the answer is always something like: it gives us something to do together or it is bonding time or similar answers. But if that’s the case, why not sit around a table and talk rather than putting this tedious distraction into it? If the point is to have fun playing a game, why not play a game that is actually fun?

This is an extreme example, but I’d say that most card games actually fall into this pure coin flip area. We get so distracted by playing the right moves and the fact that it is called a “game” that we sometimes forget the winner of the activity is nothing more than a purely random luck of the draw.

Even games like Pitch or Euchre or other trick taking games, where the right plays take a bit more effort to come up with, are the same. It’s a difficult truth to swallow, but the depth of these games is so shallow that a few hours of playing and you’ll be making the correct moves, without much thought, every single hand. Once every player makes the right plays, it only amounts to luck.

It’s actually really difficult to design a game with a standard deck of cards that gets around this problem. I’ve heard Bridge has depth (I know nothing of the game, but I take people’s word on this considering there is a professional scene). Poker has depth.

How does Poker get around draw luck? I’d say there are two answers. The first is that we don’t consider any individual hand a “game” of Poker. Obviously, the worst Poker player in the world could be dealt a straight flush and win the hand against the best Poker player in the world. Skill in Poker comes into play over the long run. One unit of Poker should be something like a whole tournament, where enough games are played to overcome the draw luck.

Now that we aren’t referring to a single hand, the ability to fold with minimal consequences also mitigates draw luck. This means that if you get unlucky with your initial cards, you can just choose to not play that hand. There are types of Poker that straight up let you replace bad cards (we’ll get to replacing in a moment). All of these things mitigate the luck enough that it makes sense to talk about skill.

Another card game with a professional scene is Magic: The Gathering (MTG). Tournament types vary quite a bit, but one way to mitigate draw luck is again to consider a whole tournament as a unit rather than an individual game. Or you could always play best of five or something.

But one of the most interesting aspects is the deck itself. Unlike traditional playing cards, you get to make the deck you play with. This means that over the course of many games, you can only blame yourself for bad drawing. Did you only draw lands on your first turn for five matches in a row? Then maybe you have too many land cards. That’s your fault. Did you draw no land many times in a row? Also, your own fault again. Composing a deck that takes all these probabilities into account is part of the skill of the game (usually called the “curve” of the deck).

Here’s an interesting question: is there a way to mitigate draw luck without having to play a ton of games? Most people want to play something short and not have to travel for a few days to play in a tournament to test their skill.

In real life, replacing cards is obnoxious to implement, but I think it is a fascinating and underutilized rule. The replacement idea allows you to tone down draw luck even at the level of a single game. If your card game exists online only, it is easy to do, and some recent games actually utilize this like Duelyst.

Why does it work? Well, if you have a bad draw, you can just replace one or all of your cards (depending on how the rule is worded). Not only does this create strategic depth through planning ahead for which cards will be useful, it almost completely eliminates the luck of the draw.

I really want to see someone design a card game with a standard deck of cards that makes this idea work. The one downside is that the only way I can see a “replace” feature working is if you shuffle after each replacement. This is pretty annoying, but I don’t see a way around it. You can’t just stick the card you replace into the middle of the deck and pretend like that placement is random. Everyone will know that it isn’t going to be drawn in the next few turns and can play around that.

Anyway. That’s just something I’ve been thinking about, since roguelikes have tons of randomness in them, and the randomness of card games has always bothered me.