Death in Damascus

https://arbital.com/p/death_in_damascus

by Eliezer Yudkowsky Aug 2 2016 updated Mar 21 2017

Death tells you that It is coming for you tomorrow. You can stay in Damascus or flee to Aleppo. Whichever decision you actually make is the wrong one. This gives some decision theories trouble.


[summary: In the city of Damascus, a man encounters the skeletal visage of Death. Death, upon seeing the man, looks surprised; but then says, "I ᴀᴍ ᴄᴏᴍɪɴɢ ғᴏʀ ʏᴏᴜ ᴛᴏᴍᴏʀʀᴏᴡ." The terrified man buys a camel and flees to Aleppo. After being killed in Aleppo by falling roof tiles, the man looks around and sees Death waiting.

"I thought you would be looking for me in Damascus," says the man.

"Nᴏᴛ ᴀᴛ ᴀʟʟ," says Death. "Tʜᴀᴛ ɪs ᴡʜʏ I ᴡᴀs sᴜʀᴘʀɪsᴇᴅ ᴛᴏ sᴇᴇ ʏᴏᴜ ʏᴇsᴛᴇʀᴅᴀʏ, ғᴏʀ I ᴋɴᴇᴡ I ʜᴀᴅ ᴀɴ ᴀᴘᴘᴏɪɴᴛᴍᴇɴᴛ ᴡɪᴛʜ ʏᴏᴜ ɪɴ Aʟᴇᴘᴘᴏ."

In the Death in Damascus dilemma, we can either stay in Damascus or flee to Aleppo. Death, an excellent predictor of human behavior, has informed us that whatever decision we end up making after being warned will be the wrong one.

If we decide to stay in Damascus, we conclude that staying in Damascus will be fatal. If we observe ourselves fleeing to Aleppo, we conclude that we'll die if we go to Aleppo and that Damascus would be safe.

This standard dilemma can send some decision theories into infinite loops, while other decision theories break the loop in ways that (arguably) lead to other problems.]

In the city of Damascus, a man encounters the skeletal visage of Death. Death, upon seeing the man, looks surprised; but then says, "I ᴀᴍ ᴄᴏᴍɪɴɢ ғᴏʀ ʏᴏᴜ ᴛᴏᴍᴏʀʀᴏᴡ." The terrified man buys a camel and flees to Aleppo. After being killed in Aleppo by falling roof tiles, the man looks around and sees Death waiting.

"I thought you would be looking for me in Damascus," says the man.

"Nᴏᴛ ᴀᴛ ᴀʟʟ," says Death. "Tʜᴀᴛ ɪs ᴡʜʏ I ᴡᴀs sᴜʀᴘʀɪsᴇᴅ ᴛᴏ sᴇᴇ ʏᴏᴜ ʏᴇsᴛᴇʀᴅᴀʏ, ғᴏʀ I ᴋɴᴇᴡ I ʜᴀᴅ ᴀɴ ᴀᴘᴘᴏɪɴᴛᴍᴇɴᴛ ᴡɪᴛʜ ʏᴏᴜ ɪɴ Aʟᴇᴘᴘᴏ."

In the Death in Damascus dilemma for decision theories, Death has kindly informed us that whatever decision we end up making will, in fact, have been the wrong one. It's not that Death follows us wherever we go, but that Death has helpfully predicted our future decision and found that it takes us to a city in which a fatal accident will befall us.

If we observe ourselves deciding to stay in Damascus, we know that staying in Damascus will be fatal and that we would be safe if only we fled to Aleppo. If we observe ourselves fleeing to Aleppo, we will conclude that we are to die in Aleppo for no reason other than that we fled there.

This dilemma can send some decision theories into infinite loops, while other decision theories break the loop in ways that (arguably) lead to other problems.

For a related dilemma with some of the same flavor of [ratification looking for a stable policy], without involving Death or other perfect predictors, see the Absent-Minded Driver.

Analysis

Death in Damascus is a standard problem in decision theory and has a sizable literature concerning it. (We haven't found a good online collection, so try a Google search for some analyses within the mainstream view.)

Causal decision theory

The first-order version of CDT just considers counterfactuals--$~$\operatorname{do}()$~$ operations--on our possible actions, meaning that we don't update our background beliefs at all at the time of calculating our action. It's not clear in this case what we should believe about Aleppo and Damascus after Death delivers Its warning, since that would seem to require prior probabilities on our going to Aleppo or staying in Damascus. Let's say that we thought we only had a 0.01% chance under normal circumstances of suddenly traveling to Aleppo; then after updating on Death's statement, we'll think that Damascus has a 99.99% chance of being fatal and Aleppo has a 0.01% chance of being the fatal city, and we'll flee to Aleppo.

This does deliver a prompt answer, but it involves a false calculation of expected utility--at the time of making the decision, we think we have a 99.99% chance of surviving (since we think Aleppo is only 0.01% likely to prove fatal). The actual number, by hypothesis, is 0%.
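A minimal sketch of this first-order calculation, with made-up numbers (the 0.01% prior on fleeing and the unit utility of survival are illustrative assumptions, not anything specified by the dilemma):

```python
# Sketch of the first-order CDT calculation described above.
# All numbers (prior on fleeing, utility of survival) are illustrative assumptions.

P_FLEE_PRIOR = 0.0001          # prior probability we'd suddenly travel to Aleppo
U_LIVE, U_DIE = 1.0, 0.0       # utility of surviving vs. dying

# After Death's warning, the fatal city is whichever city we end up in,
# so under this naive update P(Aleppo is fatal) = P(we flee) = 0.01%.
p_aleppo_fatal = P_FLEE_PRIOR
p_damascus_fatal = 1 - p_aleppo_fatal

# CDT's do()-style expected utilities, holding those background beliefs fixed:
eu_stay = p_damascus_fatal * U_DIE + p_aleppo_fatal * U_LIVE
eu_flee = p_damascus_fatal * U_LIVE + p_aleppo_fatal * U_DIE

print(f"EU(stay in Damascus) = {eu_stay:.4f}")   # 0.0001
print(f"EU(flee to Aleppo)   = {eu_flee:.4f}")   # 0.9999 -> the agent flees
# The agent therefore believes it has a 99.99% chance of surviving,
# whereas by hypothesis its actual chance of surviving is 0%.
```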

In turn, this could let a mischievous bookie pump money out of the CDT agent. Suppose that besides choosing between Aleppo and Damascus, the agent also needs to choose whether to buy a ticket that costs \$1, and pays out \$11 if the agent survives. This is a good bet if you have a 99.99% chance of survival; not so much if you have a 0% chance of survival.

We can suppose the agent must choose both $~$D$~$ vs $~$A$~$ for Damascus vs. Aleppo, and simultaneously choose $~$Y$~$ vs $~$N$~$ for whether to yes-buy or not-buy the \$1 ticket that pays \$11 if the agent survives. That is, the agent is facing four buttons $~$DY, AY, DN, AN$~$ and this outcome table:

$$~$ \begin{array}{r|c|c} & \text{Damascus fatal} & \text{Aleppo fatal} \\ \hline DN & \text{Die} & \text{Live} \\ \hline AN & \text{Live} & \text{Die} \\ \hline DY & \text{Die, \$-1} & \text{Live, \$+10} \\ \hline AY & \text{Live, \$+10} & \text{Die, \$-1} \end{array} $~$$

A causal decision theory that doesn't update its background beliefs at all while making the decision will select $~$AY$~$ instead of $~$AN.$~$ (And then the CDT agent predictably updates afterwards to thinking that the ticket is worthless, so we can buy the ticket back for \$0.01 at a profit of \$0.99, justifying our regarding this as a "money pump".)
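A sketch of the four-button calculation under those same (unrevised) beliefs; the dollar value placed on survival is an arbitrary illustrative assumption:

```python
# Expected dollar value of the four buttons under the CDT agent's beliefs,
# holding fixed its belief that Damascus is 99.99% likely to be the fatal city.
# V_LIVE is an arbitrary illustrative dollar value placed on survival.

V_LIVE, V_DIE = 1_000_000.0, 0.0
p_damascus_fatal = 0.9999

# (payoff if Damascus is fatal, payoff if Aleppo is fatal), per the table above
buttons = {
    "DN": (V_DIE,       V_LIVE),
    "AN": (V_LIVE,      V_DIE),
    "DY": (V_DIE - 1,   V_LIVE + 10),
    "AY": (V_LIVE + 10, V_DIE - 1),
}

def expected_value(payoffs):
    if_damascus, if_aleppo = payoffs
    return p_damascus_fatal * if_damascus + (1 - p_damascus_fatal) * if_aleppo

for name, payoffs in buttons.items():
    print(name, expected_value(payoffs))
# AY beats AN by roughly $10 under these beliefs, so the agent buys the ticket.
# By hypothesis it then dies in Aleppo, having paid $1 for a ticket it will
# later sell back for $0.01 -- the "money pump".
```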

A first response would be to allow the CDT agent to [tickle_defense observe its own initial impulse], try updating the background variables accordingly, and then reconsider its decision until it finds a decision that is stable or [ratification "self-ratifying"].

This deals with the [newcombs_tax Newcomb's Tax] dilemma, but isn't sufficient for Death in Damascus since there is no deterministic self-ratifying decision on this problem--the decision theory goes into an infinite loop as it believes that Damascus is fatal and feels an impulse to go to Aleppo, updates to believe that Aleppo is fatal and observes an impulse to stay in Damascus, etcetera.

The standard reply is to allow the decision theory to break loops like this by deploying mixed strategies. At the point where the agent thinks it will deploy the mixed strategy of staying in Damascus with 50% probability and going to Aleppo with 50% probability, any possible probabilistic mix of "stay in Damascus" and "flee to Aleppo" will seem equally attractive, with a 50% probability of dying given either decision. We then modify CDT by adding the rule that in cases like this, we output a self-consistent policy if one is found. (This does require an extra rule: at the self-consistent point it is not only the {0.5 stay, 0.5 flee} policy that looks acceptable--all policies look equally acceptable there--so we need a special rule telling the agent to stop at that point and output the self-consistent policy.)
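A sketch of this ratification search: deterministic candidate decisions oscillate forever between the two cities, while the mixed strategy with p = 0.5 is the unique point that survives reconsideration. The mapping from "believed policy" to "believed fatal city" follows directly from Death predicting the decision.

```python
# Sketch of the self-ratification search described above.
# A candidate policy is "stay in Damascus with probability p, flee with 1 - p".
# Death's prediction matches the policy, so believing you follow policy p
# means believing Damascus is the fatal city with probability p.

def reconsider(p_stay_believed):
    """Reply to the belief that you follow the policy 'stay with probability p'."""
    p_damascus_fatal = p_stay_believed
    eu_stay = 1 - p_damascus_fatal        # you survive iff the other city was fatal
    eu_flee = p_damascus_fatal
    if eu_stay > eu_flee:
        return 1.0                        # strictly prefer staying
    if eu_flee > eu_stay:
        return 0.0                        # strictly prefer fleeing
    return p_stay_believed                # indifferent: keep the candidate policy

# Deterministic candidates just oscillate: stay -> flee -> stay -> ...
p = 1.0
for step in range(6):
    print(f"step {step}: candidate P(stay) = {p}")
    p = reconsider(p)

# The mixed strategy p = 0.5 is self-ratifying: believing you'll play it,
# every mix looks equally good (50% survival), so it survives reconsideration.
assert reconsider(0.5) == 0.5
```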

This is a standard addendum to CDT and also appears in e.g. the most widely accepted resolution for the Absent-Minded Driver dilemma. But in this case, besides the concern that the extra rule in CDT could be taken as strange (why pick out one policy at a point where all policies seem equally attractive?), we also need to deal with two further concerns:

• That the agent will immediately reverse course as soon as it notices itself fleeing to Aleppo and reconsiders this decision a second later.

• (Raised by Yudkowsky in personal conversation with James M. Joyce.) This version of the agent will still buy for \$1 a ticket that pays \$11 if it survives, if it's offered that choice as part of the stay/flee decision. That is, the agent stabilizes on the policy {0.5 DY, 0.5 AY} instead of the policy {0.5 DN, 0.5 AN} if it's offered all four choices.

%%comment:  This objection was raised by Eliezer Yudkowsky in personal conversation with James M. Joyce at "Self-prediction in Decision Theory and Artificial Intelligence" at Cambridge 2015.  Joyce was suggesting a particular formalism for a self-ratifying CDT.  The conversation went something like the following:

Yudkowsky:  I think this agent is irrational, because at the point where it makes the decision to stay or flee with 0.5:0.5 probability, it thinks it has a 50% chance of survival.

Joyce:  I think that's rational.  Maybe after the decision the agent realizes it won't survive, but it has no way of knowing that at the time it makes the decision.

Yudkowsky:  Hm.  (Goes off and thinks.)  (Returns.)  Your agent is irrational and I can pump money out of it by offering to sell it for \$1 a ticket that pays a net \$10 if it survives.

Joyce:  That's because from your epistemic vantage point outside the agent, you know something the agent doesn't.  Obviously you can win bets against the agent when you're allowed to bet with knowledge it doesn't have.

Yudkowsky:  (Thinks.)  Your agent knows in advance that it can be money-pumped and it will pay me \$0.50 not to offer to sell it a ticket later.  So I claim that it clearly *can* know the thing you say it can't know at the time of making the decision.

Joyce:  I disagree, but let me think about it.

(Commented out because it would be unfair to cite this conversation without running it past Joyce, plus he may have come up with a further reply since then.)

%%

Evidential decision theory

An evidential decision theory agent calculates that it is doomed whether it flees to Aleppo or stays in Damascus, and so chooses whichever option corresponds to spending its last days more comfortably.
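A sketch of the evidential calculation; the comfort values are made-up illustrative numbers, and the only point is that conditioning on either action yields the same certain death, so comfort decides:

```python
# Evidential expected utility: condition on the action actually being taken.
# Death's prediction is accurate, so P(die | stay) = P(die | flee) = 1.
# The comfort terms are made-up illustrative numbers.

U_DEATH = 0.0
comfort = {"stay in Damascus": 5.0, "flee to Aleppo": 1.0}  # fleeing is a hard trip
p_die_given = {"stay in Damascus": 1.0, "flee to Aleppo": 1.0}

def evidential_eu(action):
    return p_die_given[action] * U_DEATH + comfort[action]

best = max(comfort, key=evidential_eu)
print(best)   # whichever option makes the last days more comfortable
```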

Logical decision theory

An agent using the standard updateless form of logical decision theory responds by asking: "How exactly does Death decide whether to speak to someone?"

It's not causally possible for Death to always tell people when a natural death is approaching, regardless of the person's policy. For example, there could be someone who will die if they stay in Damascus, but whose disposition causes them to flee to Aleppo (where no death waits) if they are warned.

Two possible rules for Death would be as follows:

Rule K: Death comes for you on the first day that a potentially fatal event strikes the city you are actually in, whether or not It has warned you. Death delivers Its warning the day before only if It can do so consistently--that is, only if, given the warning, you will still be present in the city where the fatal event occurs; otherwise It stays silent.

Rule L: Death may only collect you on a day for which It has consistently warned you the day before. If no consistent warning is possible, Death stays silent and does not come for you that day.

Since the UDT-optimal policy differs depending on whether Death follows Rule K or Rule L, we need at least a prior probability distribution on which rule Death follows. As Bayesians, we can just guess this probability if we don't have authoritative information, but we need to guess something to proceed.

If Death follows Rule K, the UDT reply is to stay in Damascus, and this is decisively optimal--definitely superior to the option of fleeing to Aleppo! If you always flee to Aleppo on a warning, then you are killed by any fatal event that could occur in Aleppo (Death gives you warning, you flee, you die). You are also killed by any fatal event that could occur in Damascus (Death checks if It can consistently warn you, finds that It can't, stays silent, and collects you in Damascus the next day). You will be aware, on receiving the warning, that Death awaits you in Damascus; but you'll also be aware that if-counterfactually you were the sort of person who flees to Aleppo on warning, you would have received no warning today, and possibly have died in Aleppo some time ago.

If Death follows Rule L, you should, upon receiving Death's warning, hide yourself in the safest possible circumstances--perhaps near the emergency room of a well-managed hospital, under medical supervision. You'll still expect to die after taking this precaution--something fatal will happen to you despite all nearby doctors. However, by being the sort of person who acts like this on receiving a warning from Death, you minimize Death's probability of collecting you on any given day. You know that if-counterfactually you were the sort of person whose algorithm's logical output says to stay in Damascus after receiving warning, you would probably have been killed earlier in Damascus where potentially fatal events crop up more frequently.

An updateless LDT agent computes this reply in one sweep, and without needing to observe itself or search for a self-ratifying answer.
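A toy calculation to make the dependence on the rule concrete. The daily accident probabilities are made-up assumptions, and "hospital" stands in for the safest reachable place in the Rule L discussion; the point is just that under Rule K the "stay in Damascus on a warning" policy minimizes your per-day chance of being collected, while under Rule L the "go somewhere very safe on a warning" policy does, and so maximizes expected survival time.

```python
# Toy model of Rules K and L as described above. All probabilities are
# made-up assumptions; "hospital" stands in for the safest reachable place.
# Each day, a potentially fatal event independently occurs in each location.

p_event = {"Damascus": 0.05, "Aleppo": 0.03, "hospital": 0.001}
DEFAULT = "Damascus"   # where you are on days with no warning

def daily_death_prob(rule, warned_location):
    """P(Death collects you on a given day), given where you go when warned."""
    p_warned = p_event[warned_location]
    if rule == "K":
        # Death warns you iff the warning is self-consistent (a fatal event
        # awaits wherever you'd go once warned); if It can't warn consistently,
        # It stays silent and still collects you if an event strikes your
        # default location.
        if warned_location == DEFAULT:
            return p_warned
        return 1 - (1 - p_warned) * (1 - p_event[DEFAULT])
    if rule == "L":
        # Death may only collect you on days when It can consistently warn you.
        return p_warned
    raise ValueError(rule)

for rule in ("K", "L"):
    for policy in ("Damascus", "Aleppo", "hospital"):
        p = daily_death_prob(rule, policy)
        print(f"Rule {rule}, go to {policy} when warned: "
              f"P(die)/day = {p:.4f}, expected survival ~ {1/p:.0f} days")
# Under Rule K, staying in Damascus on a warning is (weakly) best;
# under Rule L, heading for the safest place on a warning is best.
```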


Comments

Gurkenglas

Let's say that we thought we only had a 0.01% chance under normal circumstances of suddenly traveling to Aleppo; then after updating on Death's statement, we'll think that Damascus has a 99.99% chance of being fatal and Aleppo has a 0.01% chance of being the fatal city, and we'll flee to Aleppo.

This sounds wrong because it's not invariant under the introduction of new alternatives with a 0% chance.