[summary: Consider evaluating, in June of 2016, the question: "What is the probability of Hillary Clinton winning the 2016 US presidential election?"

- On the **propensity** view, Hillary has some fundamental chance of winning the election. To ask about the probability is to ask about this objective chance.
- On the **subjective** view, saying that Hillary has an 80% chance of winning the election summarizes our *knowledge about* the election, or, equivalently, our *state of uncertainty* given what we currently know.
- On the **frequentist** view, we cannot formally or rigorously say anything about the 2016 presidential election, because it only happens once.]

## Betting on one-time events

Consider evaluating, in June of 2016, the question: "What is the probability of Hillary Clinton winning the 2016 US presidential election?"

On the **propensity** view, Hillary has some fundamental chance of winning the election. To ask about the probability is to ask about this objective chance. If we see a prediction market in which prices move after each new poll — so that it says 60% one day, and 80% a week later — then clearly the prediction market isn't giving us very strong information about this objective chance, since it doesn't seem very likely that Clinton's *real* chance of winning is swinging so rapidly.

On the **frequentist** view, we cannot formally or rigorously say anything about the 2016 presidential election, because it only happens once. We can't *observe* a frequency with which Clinton wins presidential elections. A frequentist might concede that they would cheerfully buy for \$1 a ticket that pays \$20 if Clinton wins, considering this a favorable bet in an *informal* sense, while insisting that this sort of reasoning isn't sufficiently rigorous, and therefore isn't suitable for being included in science journals.

On the **subjective** view, saying that Hillary has an 80% chance of winning the election summarizes our *knowledge about* the election or our *state of uncertainty* given what we currently know. It makes sense for the prediction market prices to change in response to new polls, because our current state of knowledge is changing.

## A coin with an unknown bias

Suppose we have a coin, weighted so that it lands heads somewhere between 0% and 100% of the time, but we don't know the coin's actual bias.

The coin is then flipped three times where we can see it. It comes up heads twice, and tails once: HHT.

The coin is then flipped again, where nobody can see it yet. An honest and trustworthy experimenter lets you spin a wheel-of-gambling-odds,%note:The reason for spinning the wheel-of-gambling-odds is to reduce the worry that the experimenter might know more about the coin than you, and be offering you a deliberately rigged bet.% and the wheel lands on (2 : 1). The experimenter asks if you'd enter into a gamble where you win \$2 if the unseen coin flip is tails, and pay \$1 if the unseen coin flip is heads.

On a **propensity** view, the coin has some objective probability between 0 and 1 of being heads, but we just don't know what this probability is. Seeing HHT tells us that the coin isn't all-heads or all-tails, but we're still just guessing — we don't really know the answer, and can't say whether the bet is a fair bet.

On a **frequentist** view, the coin would (if flipped repeatedly) produce some long-run frequency $~$f$~$ of heads that is between 0 and 1. If we kept flipping the coin long enough, the actual proportion $~$p$~$ of observed heads is guaranteed to approach $~$f$~$ arbitrarily closely, eventually. We can't say that the *next* coin flip is guaranteed to be H or T, but we can make an objectively true statement that $~$p$~$ will approach $~$f$~$ to within epsilon if we continue to flip the coin long enough.

To decide whether or not to take the bet, a frequentist might try to apply an [unbiased_estimator unbiased estimator] to the data we have so far. An "unbiased estimator" is a rule for taking an observation and producing an estimate $~$e$~$ of $~$f$~$, such that the expected value of $~$e$~$ is $~$f$~$. In other words, a frequentist wants a rule such that, if the hidden bias of the coin was in fact to yield 75% heads, and we repeat many times the operation of flipping the coin a few times and then asking a new frequentist to estimate the coin's bias using this rule, the *average* value of the estimated bias will be 0.75. This is a property of the *estimation rule* which is objective. We can't hope for a rule that will always, in any particular case, yield the true $~$f$~$ from just a few coin flips; but we can have a rule which will provably have an *average* estimate of $~$f$~$, if the experiment is repeated many times.

In this case, a simple unbiased estimator is to guess that the coin's bias $~$f$~$ is equal to the observed proportion of heads, or 2/3. In other words, if we repeat this experiment many times, and whenever we see $~$p$~$ heads in 3 tosses we guess that the coin's bias is $~$\frac{p}{3},$~$ then this rule is provably an unbiased estimator. Under the estimate $~$f = \frac{2}{3},$~$ a bet that wins \$2 on tails and loses \$1 on heads is exactly fair: its expected profit is $~$2 \cdot \frac{1}{3} - 1 \cdot \frac{2}{3} = 0,$~$ so we have no reason to take the bet.
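To make the unbiasedness claim concrete, here is a small simulation (a sketch; the function names are my own) showing that the proportion-of-heads rule, averaged over many repetitions of the three-flip experiment, converges to the coin's true bias:

```python
import random

def estimate_bias(flips):
    """Unbiased estimator: guess the observed proportion of heads."""
    return sum(flips) / len(flips)

def average_estimate(true_bias, n_flips=3, n_trials=100_000, seed=0):
    """Average the estimator's guess over many repeated experiments."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        flips = [1 if rng.random() < true_bias else 0 for _ in range(n_flips)]
        total += estimate_bias(flips)
    return total / n_trials

# Any single 3-flip estimate can only be 0, 1/3, 2/3, or 1, yet the
# *average* estimate converges to the hidden bias of 0.75.
print(round(average_estimate(0.75), 2))  # ≈ 0.75
```

Note that unbiasedness is a property of the rule under repetition, exactly as the frequentist wants; it says nothing about how good the guess is on any particular run of three flips.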

On a **subjectivist** view, we start out personally unsure of where the bias $~$f$~$ lies within the interval [0, 1]. Unless we have any knowledge or suspicion leading us to think otherwise, the coin is just as likely to have a bias between 33% and 34%, as to have a bias between 66% and 67%; there's no reason to think it's more likely to be in one range or the other.

Each coin flip we see is then evidence about the value of $~$f,$~$ since a flip H happens with different probabilities depending on the different values of $~$f,$~$ and we update our beliefs about $~$f$~$ using Bayes' rule. For example, H is twice as likely if $~$f=\frac{2}{3}$~$ as it is if $~$f=\frac{1}{3},$~$ so by Bayes' rule we should now think $~$f$~$ is twice as likely to lie near $~$\frac{2}{3}$~$ as it is to lie near $~$\frac{1}{3}$~$.
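This update can be sketched numerically. The code below (the function name and the discretized grid of candidate biases are illustrative assumptions, standing in for the continuous interval) starts from a uniform prior over possible values of $~$f$~$ and multiplies in the likelihood of each observed flip:

```python
from fractions import Fraction

def posterior(flips, denom=9):
    """Posterior over candidate biases f = 1/denom, ..., (denom-1)/denom,
    starting from a uniform prior and applying Bayes' rule per flip."""
    fs = [Fraction(i, denom) for i in range(1, denom)]
    weights = {f: Fraction(1) for f in fs}        # uniform prior
    for flip in flips:                            # 1 = heads, 0 = tails
        for f in fs:
            weights[f] *= f if flip else (1 - f)  # likelihood of this flip
    total = sum(weights.values())
    return {f: w / total for f, w in weights.items()}

post = posterior([1])  # observe a single H
# H is twice as likely under f = 2/3 as under f = 1/3, so the posterior
# probability at 2/3 is twice that at 1/3.
print(post[Fraction(2, 3)] / post[Fraction(1, 3)])  # prints 2
```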

When we start with a uniform prior, observe multiple flips of a coin with an unknown bias, see M heads and N tails, and then try to estimate the odds of the next flip coming up heads, the result is Laplace's Rule of Succession, which gives odds of (M + 1) : (N + 1) for heads vs. tails, i.e., a probability of $~$\frac{M + 1}{M + N + 2}$~$ for heads.

In this case, after observing HHT, we estimate odds of 2 : 3 for tails vs. heads on the next flip. This makes a gamble that wins \$2 on tails and loses \$1 on heads profitable in expectation, so we take the bet.
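The arithmetic above can be checked directly. This sketch (the function name is illustrative) computes the Laplace estimate after HHT and the expected profit of the offered gamble:

```python
from fractions import Fraction

def laplace_heads_probability(heads, tails):
    """Laplace's Rule of Succession: P(next flip is heads)."""
    return Fraction(heads + 1, heads + tails + 2)

p_heads = laplace_heads_probability(2, 1)  # after observing HHT
p_tails = 1 - p_heads
print(p_heads)  # 3/5

# Expected profit of the gamble: win $2 on tails, lose $1 on heads.
# (Sanity check against the continuous uniform prior: P(H | HHT) =
#  ∫f·f²(1-f)df / ∫f²(1-f)df = (1/20)/(1/12) = 3/5, the same answer.)
expected_profit = 2 * p_tails - 1 * p_heads
print(expected_profit)  # 1/5, so the bet is favorable
```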

Our choice of a uniform prior over $~$f$~$ was a little dubious — it's the obvious way to express total ignorance about the bias of the coin, but obviousness isn't everything. (For example, maybe we actually believe that a fair coin is more likely than a coin biased 50.0000023% towards heads.) However, all the reasoning after the choice of prior was rigorous according to the laws of probability theory, which is the [probability_coherence_theorems only method of manipulating quantified uncertainty] that obeys obvious-seeming rules about how subjective uncertainty should behave.

## Probability that the 98,765th decimal digit of $~$\pi$~$ is $~$0$~$

What is the probability that the 98,765th digit in the decimal expansion of $~$\pi$~$ is $~$0$~$?

The **propensity** and **frequentist** views regard as nonsense the notion that we could talk about the *probability* of a mathematical fact. Either the 98,765th decimal digit of $~$\pi$~$ is $~$0$~$ or it's not. If we're running *repeated* experiments with a random number generator, and looking at different digits of $~$\pi,$~$ then it might make sense to say that the random number generator has a 10% probability of picking numbers whose corresponding decimal digit of $~$\pi$~$ is $~$0$~$. But if we're just picking a non-random number like 98,765, there's no sense in which we could say that the 98,765th digit of $~$\pi$~$ has a 10% propensity to be $~$0$~$, or that this digit is $~$0$~$ with 10% frequency in the long run.

The **subjectivist** considers probabilities to just refer to their own uncertainty. So if a subjectivist has picked the number 98,765 without yet knowing the corresponding digit of $~$\pi,$~$ and hasn't made any observation that is known to them to be entangled with the 98,765th digit of $~$\pi,$~$ and they're pretty sure their friend hasn't yet looked up the 98,765th digit of $~$\pi$~$ either, and their friend offers a whimsical gamble that costs \$1 if the digit is non-zero and pays \$20 if the digit is zero, the subjectivist takes the bet.
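As a sketch of the subjectivist's arithmetic, assigning 1/10 to each of the ten possible digits purely as a description of our own ignorance (not as a fact about $~$\pi$~$):

```python
from fractions import Fraction

# Subjective probability that the unknown digit is 0: one in ten.
p_zero = Fraction(1, 10)

# The gamble pays $20 if the digit is zero and costs $1 otherwise.
expected_profit = p_zero * 20 - (1 - p_zero) * 1
print(expected_profit)  # 11/10: worth +$1.10 in expectation, so bet
```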

Note that this demonstrates a difference between the subjectivist interpretation of "probability" and Bayesian probability theory. A perfect Bayesian reasoner that knows the rules of logic and the definition of $~$\pi$~$ must, by the axioms of probability theory, assign probability either 0 or 1 to the claim "the 98,765th digit of $~$\pi$~$ is a $~$0$~$" (depending on whether or not it is). This is one of the reasons why [bayes_intractable perfect Bayesian reasoning is intractable]. A subjectivist who is not a perfect Bayesian nevertheless claims that they are personally uncertain about the value of the 98,765th digit of $~$\pi.$~$ Formalizing the rules of subjective probabilities about mathematical facts (in the way that probability theory formalized the rules for manipulating subjective probabilities about empirical facts, such as which way a coin came up) is an open problem; this is known as the problem of logical uncertainty.