Transparent Newcomb's Problem

by Eliezer Yudkowsky Aug 4 2016 updated Sep 9 2016

Omega has left behind a transparent Box A containing $1,000, and a transparent Box B containing $1,000,000 or nothing. Box B is full iff Omega thinks you one-box on seeing a full Box B.

[summary: A version of Newcomb's Problem in which Box B is transparent. That is:

Omega has presented you with two boxes, Box A which transparently contains \$1,000, and Box B which transparently contains \$0 or \$1,000,000. You may take either both boxes, or only Box B. Omega has already filled Box B iff Omega predicted that you would, upon seeing a full Box B, take only Box B.

The Transparent Newcomb's Problem is noteworthy in that evidential decision theory and causal decision theory agree that a [principle_rational_choice rational] agent should take both boxes; only logical decision theory agents leave behind Box A and become rich. This is an apparent counterexample to a [edt_cdt_dichotomy widespread view] that EDT and CDT divide Newcomblike problems between them, with EDT agents accepting 'Why aincha rich?' arguments.]

Like Newcomb's Problem, but Box B is also transparent. That is:

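Omega has presented you with the following dilemma:

• Box A is transparent and contains \$1,000.

• Box B is also transparent and contains either \$0 or \$1,000,000.

• You may take either both boxes, or only Box B.

• Omega has already filled Box B iff Omega predicted that you would, upon seeing a full Box B, take only Box B.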

This Newcomblike dilemma is structurally similar to Parfit's Hitchhiker (no decision theory disputes this structural similarity, so far as we know).

Note that it is not, in general, possible to have a transparent Newcomb's Problem in which, for every possible agent, Omega fills Box B iff Omega predicts unconditionally that the agent ends up one-boxing. Some agent could two-box on seeing a full Box B and one-box on seeing an empty Box B, making the general rule impossible for Omega to fulfill.
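The impossibility can be checked by brute force. The following is a minimal sketch, not part of the original problem statement: it models an agent as a hypothetical lookup table from what it sees ("full" or "empty") to what it does, and asks which box states Omega could set up without violating a given filling rule. The names `consistent_fillings`, `"unconditional"`, and `"conditional"` are illustrative assumptions.

```python
from itertools import product

ACTIONS = ("one-box", "two-box")
OBSERVATIONS = ("full", "empty")


def consistent_fillings(policy, rule):
    """Return the Box B states Omega can set up without violating `rule`.

    `policy` maps the observed state of Box B to the agent's action.
    """
    ok = []
    for state in OBSERVATIONS:
        if rule == "unconditional":
            # Fill iff the agent actually ends up one-boxing in this state.
            should_fill = policy[state] == "one-box"
        else:
            # Fill iff the agent would one-box on seeing a full Box B.
            should_fill = policy["full"] == "one-box"
        if should_fill == (state == "full"):
            ok.append(state)
    return ok


# All four deterministic policies: (action on "full", action on "empty").
policies = [dict(zip(OBSERVATIONS, acts)) for acts in product(ACTIONS, repeat=2)]

for p in policies:
    print(p,
          "| unconditional:", consistent_fillings(p, "unconditional"),
          "| conditional:", consistent_fillings(p, "conditional"))
```

Under the unconditional rule, the agent that two-boxes on a full box and one-boxes on an empty box leaves Omega with no consistent option, while the conditional rule used in this problem always leaves exactly one.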

Similarly, the problem setup stipulates that Omega is not known to be infallible: it seems not entirely impossible that Omega will get the prediction wrong next time. Otherwise, seeing a full Box B and considering two-boxing would introduce a new and distracting problem of [action_conditional conditioning] on a visible impossibility.


Causal decision theory

Two-boxes, because one-boxing cannot cause Box B to be full or empty, since Omega has already predicted and departed.

Evidential decision theory

Two-boxes, because one-boxing cannot be further good news about Box B being full, because the agent has already seen that Box B is full. The agent, upon imagining being told that it one-boxes here, imagines concluding "Omega made its first mistake!" rather than "My eyes are deceiving me and Box B is actually empty." (Thus, EDT agents never see a full Box B to begin with.)
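The CDT and EDT verdicts above come down to the same arithmetic once the contents of Box B are treated as settled by observation. Here is a minimal sketch under that assumption; the function names are illustrative and do not come from any decision theory library.

```python
BOX_A = 1_000
ACTIONS = ("one-box", "two-box")


def payoff(action, box_b_contents):
    """Money received: Box B's contents, plus Box A if both boxes are taken."""
    return box_b_contents + (BOX_A if action == "two-box" else 0)


def choose_with_contents_fixed(observed_contents):
    """Pick the action with the higher payoff, holding Box B's contents fixed.

    CDT holds the contents fixed because the choice cannot cause them.
    EDT holds them fixed because, after seeing the box, the choice carries
    no further news about them.
    """
    return max(ACTIONS, key=lambda a: payoff(a, observed_contents))


for contents in (1_000_000, 0):
    print(contents, "->", choose_with_contents_fixed(contents))
```

In both cases two-boxing comes out exactly \$1,000 ahead, which is why both theories two-box regardless of what they see.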

Logical decision theory

One-boxes, because:

• On [timeless_dt timeless decision theory] without the updateless feature: Even after observing Box B being full, we conclude from our extended causal model that in the counterfactual case where our algorithm output "Take both boxes", Box B would have counterfactually been empty. (Updateful TDT does not [counterfactual_mugging in general] output the behavior corresponding to the highest score on problems in this decision class, but updateful TDT happens to get the highest score in this particular scenario.)

• On updateless decision theories: The policy of mapping the sensory input "Box B is full" onto the action "Take only one box" leads to the highest expected utility (as evaluated relative to our non-updated prior).
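The updateless evaluation can be sketched as a comparison of whole policies from the standpoint of the prior. The sketch below assumes a hypothetical accuracy parameter for Omega (the problem only stipulates that an error is not entirely impossible; the value 0.99 is an illustrative assumption) and scores each mapping from observation to action.

```python
from itertools import product

BOX_A = 1_000
BOX_B_FULL = 1_000_000
ACTIONS = ("one-box", "two-box")


def payoff(action, box_b_full):
    """Money received for `action` given whether Box B is full."""
    contents = BOX_B_FULL if box_b_full else 0
    return contents + (BOX_A if action == "two-box" else 0)


def prior_expected_utility(policy, accuracy):
    """Expected payoff of a policy, evaluated before seeing Box B.

    Omega predicts what the policy does on seeing a full box and is
    right with probability `accuracy` (an assumed parameter).
    """
    would_one_box_on_full = policy["full"] == "one-box"
    p_full = accuracy if would_one_box_on_full else 1.0 - accuracy
    return (p_full * payoff(policy["full"], True)
            + (1.0 - p_full) * payoff(policy["empty"], False))


policies = [{"full": f, "empty": e} for f, e in product(ACTIONS, repeat=2)]

for p in policies:
    print(p, round(prior_expected_utility(p, accuracy=0.99), 2))

best = max(policies, key=lambda p: prior_expected_utility(p, accuracy=0.99))
print("best policy:", best)
```

With these assumed numbers, both policies that map "full" to one-boxing score far above both policies that map "full" to two-boxing, which is the comparison the updateless argument relies on.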

The Transparent Newcomb's Problem is significant because it is an apparent counterexample to a [edt_cdt_dichotomy widespread view] that EDT and CDT split the Newcomblike problems between them, with EDT being the decision theory that accepts 'Why aincha rich?' arguments.

Truly clever LDT and EDT agents

Truly clever agents will realize that the (transparently visible) state of Box B reflects oracular reasoning by Omega about any factor that could affect our decision whether to one-box after seeing a full Box B. The value of an advance prediction about any possible observable factor determining our decision could easily exceed a million dollars.

For example, suppose you have until the end of the day to actually decide how many boxes to take. On finding yourself in a transparent Newcomb's Problem, you could postcommit to an obvious strategy, such as one-boxing iff the S&P 500 closes up on the day. If you then see that Box B is full, you can load up on margin and buy short-term call options (and then wait, and actually one-box at the end of the day iff the S&P 500 goes up).

You could also carry out the converse strategy (buy put options if you see Box B is empty), but only if you're confident that the S&P 500's daily movement is independent of any options you buy and that both of your possible selves converge on the same postcommitment, since what you're learning from seeing Box B in this case is what your action would have been at the end of the day if Box B had been full.
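The information flow in this strategy can be made explicit with a toy model. The sketch below assumes an idealized Omega with perfect foresight of both the market and the agent, and an agent that reliably follows the postcommitment whichever box state it sees; `postcommitment`, `omega_fills_box`, and `trade_on_seeing` are hypothetical names introduced for illustration.

```python
def postcommitment(sees_full_box, market_up):
    """Postcommitted policy: on a full Box B, one-box iff the index closes up."""
    if sees_full_box:
        return "one-box" if market_up else "two-box"
    # Assumption: two-box on an empty box. Omega's rule never consults this branch.
    return "two-box"


def omega_fills_box(policy, market_up):
    """Idealized Omega: fill Box B iff the agent would one-box on seeing it full."""
    return policy(True, market_up) == "one-box"


def trade_on_seeing(box_full):
    """Trade implied by the visible state of Box B under the postcommitment."""
    return "buy call options" if box_full else "buy put options"


for market_up in (True, False):
    box_full = omega_fills_box(postcommitment, market_up)
    print(f"market_up={market_up}: box_full={box_full}, trade={trade_on_seeing(box_full)}")
```

In this toy model the box is full exactly when the market closes up, so the visible state of Box B acts as the advance prediction. The put-option branch is only sound under the caveats stated above about independence and about both possible selves converging on the same postcommitment.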

This general strategy was observed by Eliezer Yudkowsky and Jack LaSota.