Today I’ll sketch an idea that I first learned about from David Corfield’s excellent book *Towards a Philosophy of Real Mathematics*. I read it about six years ago while doing my undergraduate honors thesis and my copy is filled with notes in the margins. It has been interesting to revisit this book. What I’m going to talk about is covered far more thoroughly, with tons of examples, in that book. So check it out if this is interesting to you.

There are lots of ways we could use Bayesian analysis in the philosophy of math. I’ll just use a single example to show how we can use it to describe how confident we are in certain conjectures. In other words, we’ll come up with a probability for how plausible a conjecture is given the known evidence. As usual we’ll denote this P(C|E). Before doing this, let’s address the question of why we would want to do this.

To me, there are two main answers to this question. The first is that mathematicians already do this colloquially. When someone proposes something in an informal setting, you hear phrases like, “I don’t believe that at all,” or “How could that be true considering …” or “I buy that, it seems plausible.” If you think that the subject of philosophy of mathematics has any legitimacy, then certainly one of its main goals would be to take such statements and try to figure out what is meant by them and whether or not they seem justified. This is exactly what our analysis will do.

The second answer is much more practical in nature. Suppose you conjecture something as part of your research program. As we’ve been doing in these posts, you could use Bayes’ theorem to give two estimates of the plausibility of your conjecture being true. One uses the most generous probabilities given the evidence, and the other uses the least generous. You’ll get some sort of Bayesian confidence interval for the probability of the conjecture being true. If the entire interval is low (say below 60% or something), then before spending several months trying to prove it your time might be better spent gathering more evidence for or against it.

Again, mathematicians already do this at some subconscious level, so being aware of one way to analyze what it is you are actually doing could be very useful. Humans have tons of cognitive biases, so maybe you have greatly overestimated how likely something is and doing a quick Bayes’ theorem calculation can set you straight before wasting a ton of time. Or you could write all this off as nonsense. Whatever. It’s up to you.

If you’ve followed the posts up to now, you’ll probably find this calculation quite repetitive. You can probably guess what we’ll do. We want to figure out P(C|E), the probability that a conjecture is true given the evidence you’ve accumulated. What goes into Bayes’ theorem? Well, P(E|C) the probability that we would see the evidence we have supposing the conjecture is true; P(C) the prior probability that the conjecture is true; P(E|-C) the probability we would see the evidence we have supposing the conjecture is not true; and P(-C) the prior probability that the conjecture is not true.
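Written out, those four ingredients fit together as follows. This is just the standard two-hypothesis form of Bayes’ theorem, with ¬C standing for what I write as -C in the text:

```latex
P(C \mid E) = \frac{P(E \mid C)\, P(C)}{P(E \mid C)\, P(C) + P(E \mid \neg C)\, P(\neg C)},
\qquad P(\neg C) = 1 - P(C).
```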

Clearly the problem of assigning some exact probability to any of these is insanely subjective. But also, as before, it should be possible to find the most optimistic person about a conjecture to overestimate the probability and the most skeptical person to underestimate the probability. This type of interval forming should be a lot less subjective and fairly consistent. One should even have strong arguments to support the estimates which will convince someone who questions them.
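To make the interval idea concrete, here is a minimal Python sketch. The function names and every number in it are made up for illustration (they are not estimates for any real conjecture); the only thing it does is apply Bayes’ theorem once with a skeptic’s inputs and once with an optimist’s.

```python
def posterior(prior, p_e_given_c, p_e_given_not_c):
    """Return P(C|E) from Bayes' theorem for a conjecture C and its negation."""
    numerator = p_e_given_c * prior
    denominator = numerator + p_e_given_not_c * (1.0 - prior)
    return numerator / denominator


def plausibility_interval(skeptic, optimist):
    """Apply Bayes' theorem to the least and most generous estimates."""
    return posterior(**skeptic), posterior(**optimist)


# Hypothetical inputs: a skeptic and an optimist looking at the same evidence.
skeptic = dict(prior=0.3, p_e_given_c=0.9, p_e_given_not_c=0.6)
optimist = dict(prior=0.7, p_e_given_c=1.0, p_e_given_not_c=0.2)

low, high = plausibility_interval(skeptic, optimist)
print(f"P(C|E) lies roughly between {low:.2f} and {high:.2f}")
```

If even the optimist’s end of the interval comes out low, that is the signal to go gather more evidence before committing months to a proof.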

Let’s use the Riemann hypothesis as an example. In our modern age, we have massive numerical evidence that the Riemann hypothesis is true. Recall that it just says that all the zeroes of the Riemann zeta function in the critical strip lie on the line with real part 1/2. Something like the first 10,000,000,000,000 zeroes have been checked by computer plus lots (billions?) have been checked in random other places after this.

Interestingly enough, if this were our “evidence” our estimation of P(E|C) may as well be 1, but P(E|-C) would have to contribute a significant non-trivial factor in the denominator of Bayes’ theorem. This is because we estimate this probability based on what we’ve seen in the past in similar situations. It turns out that in analytic number theory we have several prior instances of a conjecture looking true for exceedingly large numbers before a counterexample turned up. In fact, Mertens’ conjecture is explicitly connected to the Riemann hypothesis and the first counterexample could be around (no explicit counterexample is known, just that one exists, but we know by checking that it is exceedingly large).
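To see why P(E|-C) matters so much here, take Bayes’ theorem for C and its negation and set P(E|C) = 1 (writing ¬C for -C). The numbers in the second line are purely illustrative, not an actual estimate for the Riemann hypothesis:

```latex
P(C \mid E) = \frac{P(C)}{P(C) + P(E \mid \neg C)\,\bigl(1 - P(C)\bigr)}
% Illustrative values only: P(C) = 1/2 and P(E \mid \neg C) = 1/2 give
P(C \mid E) = \frac{1/2}{1/2 + (1/2)(1/2)} = \frac{2}{3}
```

So if the numerical evidence would be fairly likely even with the conjecture false, the posterior barely moves past the prior; only pushing P(E|-C) toward 0 drives P(C|E) toward 1.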

It probably isn’t unreasonable to say that most mathematicians believe the Riemann hypothesis. Even with generous prior probabilities, the above analysis would give only a modest level of confidence. So where does the confidence come from? Remember that in Bayesian analysis it is easy to accidentally leave out some of the available evidence (subconscious bias may play a role in this process).

I could do an entire series on the analogies and relations between the Riemann hypothesis for curves over finite fields and the standard Riemann hypothesis, so I won’t explain it here. The curves over finite fields case has been proven and provides quite good evidence in terms of making P(E|-C) small.

The Bayesian calculation becomes much, much more complicated in terms of modern mathematics because of all the analogies and, more concretely, the ways in which the RH is interrelated with theorems about number fields and Galois representations and cohomological techniques. We have conjectures equivalent to (or implying or implied by) the RH, which allow us to transfer evidence for and against these other conjectures.

In some sense, essentially all this complication will only increase the Bayesian estimate, so we could simplify our lives and make some baseline estimate taking into account the clearest of these and then just say that our confidence is at least that much. That is one explanation of why many mathematicians believe the RH even if they’ve never explicitly thought of it that way. Well, this has gone on too long, but I hope the idea has been elucidated.
