# Decision Theory 1

Today we’ll start looking at a branch of math called Decision Theory. It uses the types of things in probability and statistics that we’ve been looking at to make rational decisions. In fact, in the social sciences when bias/rationality experiments are done, seeing how closely people make decisions to these optimal decisions is the base line definition of rationality.

Today’s post will just take the easiest possible scenarios to explain the terms. I think most of this stuff is really intuitive, but all the textbooks and notes I’ve looked at make this way more complicated and confusing. This basically comes from doing too much too fast and not working basic examples.

Let’s go back to our original problem which is probably getting old by now. We have a fair coin. It gets flipped. I have to bet on either heads or tails. If I guess wrong, then I lose the money I bet. If I guess right, then I double my money. The coin will be flipped 100 times. How should I bet?

Let’s work a few things out. A decision function is a function from the space of random variables ${X}$ (technically we can let ${X}$ be any probability space) to the set of possible actions. Let’s call ${A=\{0,1\}}$ our set of actions where ${0}$ corresponds to choosing tails and ${1}$ corresponds to heads. Our decision function is a function that assigns to each flip a choice of picking heads or tails, ${\delta: X \rightarrow A}$. Note that in this example ${X}$ is also just a discrete space corresponding to the 100 flips of the coin.

We now define a loss function, ${L:X\times A \rightarrow \mathbb{R}}$. To make things easy, suppose we bet 1 cent every time. Then our loss is ${1}$ cent every time we guess wrong and ${-2}$ cents if we guess right. Because of the awkwardness of thinking in terms of loss (i.e. a negative loss is a gain) we will just invert it and use a utility function in this case which measures gains. Thus ${U=-1}$ when we guess wrong and ${U=2}$ when we guess right. Notationally, suppose ${F: X\rightarrow A}$ is the function that tells us the outcome of each flip. Explicitly,

$\displaystyle U(x_i, \delta(x_i)) = \begin{cases} -1 \ \text{if} \ F(x_i) \neq \delta(x_i) \\ 2 \ \text{if} \ F(x_i) = \delta(x_i) \end{cases}$

The last thing we need is the risk involved. The risk is just the expected value of the loss function (or the negative of the expected value of the utility). Suppose our decision function is to pick ${0}$ every time. Then our expected utility is just ${100(1/2(-1)+1/2(2))=50}$. This makes sense, because half the time we expect to lose and half we expect to win. But we double our money on a win, so we expect a net gain. Thus our risk is ${-50}$, i.e. there is no risk involved in playing this way!
This is a weird example, because in the real world we have to make our risk function up and it does not usually have negative expected value, i.e. there is almost always real risk in a decision. Also, our typical risk will still be a function. It is only because everything is discrete that some concepts have been combined which will need to be pulled apart later.

The other reason this is weird is that even though there are ${2^{10}}$ different decision functions, they all have the same risk because of the symmetry and independence of everything. In general, each decision function will give a different risk, and they are ordered by this risk. Any minimum risk decision function is called “admissible” and it corresponds to making a rational decision.

I want to point out that if you have the most rudimentary programming skills, then you don’t have to know anything about probability, statistics, or expected values to figure these things out in these simple toy examples. Let’s write a program to check our answer (note that you could write a much simpler program which is only about 5 lines, has no functions, etc to do this):

import random
import numpy as np
import pylab

def flip():
return random.randint(0,1)

def simulate(money, bet, choice, length):
for i in range(length):
tmp = flip()
if choice == tmp:
money += 2*bet
else:
money -= bet
return money

results = []
for i in range(1000):
results.append(simulate(10, 1, 0, 100))

pylab.plot(results)
pylab.title('Coin Experiment Results')
pylab.xlabel('Trial Number')
pylab.ylabel('Money at the End of the Trial')
pylab.show()

print np.mean(results)


This python program runs the given scenario 1000 times. You start with 10 cents. You play the betting game with 100 flips. We expect to end with 60 cents at the end (we start with 10 and have an expected gain of 50). The plot shows that sometimes we end with way more, and sometimes we end with way less (in these 1000 we never end with less than we started with, but note that is a real possibility, just highly unlikely):

It clearly hovers around 60. The program then spits out the average after 1000 simulations and we get 60.465. If we run the program a bunch of times we get the same type of thing over and over, so we can be reasonably certain that our above analysis was correct (supposing a frequentist view of probability it is by definition correct).

Eventually we will want to jump this up to continuous variables. This means doing an integral to get the expected value. We will also want to base our decision on data we observe, i.e. inform our decisions instead of just deciding on what to do ahead of time and then plugging our ears, closing our eyes, and yelling, “La, la, la, I can’t see what’s happening.” When we update our decision as the actions happen it will just update our probability distributions and turn it into a Bayesian decision theory problem.

So you have that to look forward to. Plus some fun programming/pictures should be in the future where we actually do the experiment to see if it agrees with our analysis.