Newcomb's Problem

by Eliezer Yudkowsky Aug 1 2016 updated Oct 13 2016

[summary: A powerful alien named Omega has presented you with the following dilemma: There are two boxes in front of you, Box A and Box B. You can take both boxes, or only Box B. Box A contains \$1,000. Box B contains \$1,000,000 if and only if Omega predicted you'd take only Box B.

Newcomb's Problem was historically responsible for the invention of causal decision theory and its widespread adoption over evidential decision theory. For a discussion of Newcomblike decision problems in general, see Guide to Logical Decision Theory.]

Newcomb's Problem is the original Newcomblike decision problem that inspired the creation of causal decision theory as distinct from evidential decision theory, spawning a vast array of philosophical literature in the process. It is sometimes called Newcomb's Paradox (despite not being a paradox). The dilemma was originally formulated by William Newcomb, and presented to the philosophical community by Robert Nozick.

The original formulation of Newcomb's Problem was as follows:

An alien named Omega has come to Earth, and has offered some people the following dilemma.

Before you are two boxes, Box A and Box B.

You may choose to take both boxes ("two-box"), or take only Box B ("one-box").

Box A is transparent and contains \$1,000.

Box B is opaque and contains either \$1,000,000 or \$0.

The alien Omega has already set up the situation and departed, but previously put \$1,000,000 into Box B if and only if Omega predicted that you would one-box (take only the opaque Box B and leave Box A and its \$1,000 behind).

Omega is an excellent predictor of human behavior. For the sake of quantifying this assertion and how we know it, we can assume e.g. that Omega has run 67 previous experiments and not been wrong even once. Since people are often strongly opinionated about their choices in Newcomb's Problem, it isn't unrealistic to suppose this is the sort of thing you could predict by reasoning about, e.g., a scan of somebody's brain.

Newcomb originally specified that Omega would leave Box B empty in the case that you tried to decide by flipping a coin; since this violates algorithm-independence, we can alternatively suppose that Omega can predict coinflips.

We may also assume, e.g., that Box A combusts if it is left behind, so nobody else can pick up Box A later; that Omega adds \$1 of pollution-free electricity to the world economy for every \$1 used in Its dilemmas, so that the currency does not represent a zero-sum wealth transfer; etcetera. Omega never plays this game with a person more than once.

The two original opposing arguments given about Newcomb's Problem were, roughly:

- Take both boxes: Omega has already made Its prediction and departed, so nothing you do now can change the contents of Box B; whatever Box B contains, taking both boxes leaves you \$1,000 richer.
- Take only Box B: Agents who one-box predictably walk away with \$1,000,000, while agents who two-box predictably walk away with \$1,000; you expect to end up richer by one-boxing.

For the larger argument of which this became part, see one of the introductions to logical decision theory. As of 2016, the most academically common view of Newcomb's Problem is that it surfaces the split between Evidential decision theories and Causal decision theories, and that causal decision theory is correct. However, both that framing and that conclusion have been variously disputed, most recently by Logical decision theories.

%todo: add a diagram of a causal model for Newcomb's Problem.%

The more extensive Wikipedia page on Newcomb's Problem may be found under "Newcomb's Paradox".

Replies by different decision theories

(This section does not remotely do justice to the vast literature on Newcomb's Problem.)

Pretheoretic reactions

Evidential decision theory

Evidential decision theory can be seen as a form of decision theory that was originally written down by historical accident--writing the expected utility formula as if it [action_conditional conditioned] on actions using Bayesian updating, because Bayesian updating is usually the way we condition probability functions. Historically, though, evidential decision theory was explicitly named as such in an (arguably failed) attempt to rationalize the pretheoretic answer of "I expect to do better if I one-box" on Newcomb's Problem.

On Evidential decision theories, the [-principle_of_rational_choice principle of rational choice] is to choose so that your act is the best news you could have received about your action; in other words, imagine being told that you had in fact made each of your possible choices, imagine what you would believe about the world in that case, and output the choice which would be the best news. Thus, evidential agents one-box on Newcomb's Problem.
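As an illustrative sketch, the EDT rule amounts to comparing expectations conditioned on each possible action as if it were evidence. The 99%/1% conditional probabilities below are assumed for illustration; they are not part of the original problem statement.

```python
# A toy sketch of the EDT calculation on Newcomb's Problem.
# The conditional probabilities are assumed, not canonical.

P_FULL_IF_ONE_BOX = 0.99  # assumed P(Box B is full | you one-box)
P_FULL_IF_TWO_BOX = 0.01  # assumed P(Box B is full | you two-box)

def edt_value(action):
    """Expected dollars, treating the chosen action as Bayesian evidence."""
    p_full = P_FULL_IF_ONE_BOX if action == "one-box" else P_FULL_IF_TWO_BOX
    box_a = 1_000 if action == "two-box" else 0
    return box_a + p_full * 1_000_000

print(edt_value("one-box"))  # ~ $990,000: one-boxing is the "best news"
print(edt_value("two-box"))  # ~ $11,000
```

Since conditioning on "I one-boxed" makes the \$1,000,000 very probable, the evidential agent one-boxes.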

Although the EDT answer happens to conform with "the behavior of the agents that end up rich" on Newcomb's Problem, LDT proponents note that it does not do so in general; see e.g. the Transparent Newcomb's Problem.

Causal decision theory

On causal decision theories, the principle of rational choice is to choose according to the causal consequences of your physical act; formally, to calculate expected utility by conditioning using a [causal_counterfactual causal counterfactual]. To choose, imagine the world as it is right up until the moment of your physical act; suppose that your physical act is different, without anything else about the world up until that point changing; then imagine time running forward from there under what your model says are the rules or physical laws.

A causal agent thus believes that Box B is already empty, and takes both boxes. When they imagine the (counterfactual) result of taking only Box B instead, they imagine the world being the same up until that point in time--including Box B remaining empty--and then imagine the result of taking only Box B under physical laws past that point, namely, going home with \$0.
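The CDT calculation can be sketched as follows (an illustrative toy model, not the formal theory): the agent's credence that Box B is full is held fixed across the counterfactual, whatever its value, so two-boxing dominates.

```python
# A toy sketch of the CDT calculation on Newcomb's Problem.
# p_full is the agent's credence that Omega already filled Box B;
# crucially, the causal counterfactual holds it fixed across both acts.

def cdt_value(action, p_full):
    """Causal expected dollars; p_full does not vary with the action."""
    box_a = 1_000 if action == "two-box" else 0
    return box_a + p_full * 1_000_000

# For any fixed credence, two-boxing comes out exactly $1,000 ahead.
for p_full in (0.0, 0.5, 1.0):
    assert cdt_value("two-box", p_full) == cdt_value("one-box", p_full) + 1_000
```

Because the \$1,000 from Box A is the only term that depends on the act, the causal agent two-boxes no matter what it believes about Box B.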

Historically speaking, causal decision theory was first invented to justify two-boxing on Newcomb's Problem; we can see CDT as formalizing the pretheoretic intuition, "Omega's already gone, so I can't get more money by leaving behind Box A."

Logical decision theories

On logical decision theories, the principle of rational choice is "Decide as though you are choosing the logical output of your decision algorithm." E.g., on [-timeless_dt], our extended causal model of the world would include a logical proposition for whether the output of your decision algorithm is 'one-box' or 'two-box'; and this logical fact would affect both Omega's prediction of you, and your actual decision. Thus, an LDT agent prefers that its algorithm have the logical output of one-boxing.
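The LDT framing can be caricatured in code: a deliberately simplified sketch in which Omega is assumed to be a perfect predictor, i.e., Omega evaluated the same decision algorithm the agent runs.

```python
# A toy sketch of the LDT framing: one decision algorithm (`policy`)
# is the common logical fact behind both Omega's prediction and the
# agent's actual act.  Omega is assumed to predict perfectly here.

def outcome(policy):
    """Dollars received when Omega's prediction equals policy's output."""
    act = policy()         # the agent's actual choice
    prediction = policy()  # Omega evaluated the same algorithm earlier
    box_b = 1_000_000 if prediction == "one-box" else 0
    box_a = 1_000 if act == "two-box" else 0
    return box_a + box_b

print(outcome(lambda: "one-box"))  # 1000000
print(outcome(lambda: "two-box"))  # 1000
```

Since the same logical output feeds both Omega's prediction and the act, an algorithm whose output is "one-box" receives more, which is why the LDT agent prefers that output.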

%todo: add graph for TDT on NP%