Terminal versus instrumental goals / values / preferences


by Eliezer Yudkowsky Dec 17 2015 updated Feb 8 2017

Distinguish events wanted for their consequences, from events wanted locally.

[summary: In a human sense, we want some things for themselves ('terminally'), and other things because of their later consequences ('instrumentally').

When we get into a car on the way to the airport, we're not doing that because we enjoy opening car doors for its own sake, but because we want to be somewhere else later. This is 'instrumental value'.

When we enjoy eating chocolate, then while other goods or evils might later come from having eaten the chocolate, at least the current moment of happiness will count towards the total sum of goodness in the system. We don't derive our preference to eat the chocolate only from our beliefs about the future consequences of eating the chocolate. This is 'terminal value'.

An [-agent]'s 'instrumental goals' are preferred because of their expected future consequences. Conversely, 'terminal goals' can be evaluated as goals by considering only the local facts.

In terms of the expected utility formalism, instrumental utility is a non-local calculation that depends on subjective probabilities and distant parts of the event graph; terminal utility is evaluated locally on the objects of value inside a single possibility.]

'Instrumental goals' or 'instrumental values' are things that an agent wants for the sake of achieving other things. For example, we might want to get into a car, not because we enjoy the act of opening car doors for its own sake, but because we want to drive somewhere else.

'Terminal' goals, values, or preferences are those where the preference is derived locally rather than by looking at further or distant consequences. If you enjoy eating chocolate (and otherwise approve of this enjoyment, etcetera) then you aren't deriving your preference based on what you believe to be the further consequences of eating chocolate.

Imagine reality as an enormous web of events, linked by cause and effect. "Terminal value" is usually local and can be evaluated at a single event inside the graph; even if it's a nonlocal good thing, we'd evaluate it by evaluating the history up to some point, and then we'd have a chunk of definite goodness that would stand on its own no matter what happened later.

"Instrumental value" is a nonlocal property of an event, depending on its real or expected future, and contingent on that future; if you add up all the instrumental values on the graph, you don't get a meaningful sum because you may be double-counting some value.

On a moral or ethical level, instrumental values are justified by appealing to their consequences, while terminal values are justified without appeal to their consequences.
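
The event-graph picture above can be made concrete with a toy calculation. The following is only a minimal sketch under stated assumptions: the event names, the utility numbers, the probabilities, and the helper functions `terminal_utility` and `instrumental_value` are all hypothetical illustrations, not part of any standard formalism or library. Terminal utility is read off a single event; instrumental value is computed by looking at the event's probabilistic successors.

```python
# Hypothetical toy event graph. All names and numbers are illustrative
# assumptions, chosen to mirror the car and chocolate examples.

# Terminal utility: a purely local fact about each event.
TERMINAL = {
    "open_car_door": 0.0,
    "drive_to_airport": 0.0,
    "arrive_at_destination": 10.0,
    "miss_flight": 0.0,
    "eat_chocolate": 1.0,
}

# Causal successors of each event, as (subjective probability, next event).
SUCCESSORS = {
    "open_car_door": [(1.0, "drive_to_airport")],
    "drive_to_airport": [(0.9, "arrive_at_destination"), (0.1, "miss_flight")],
    "arrive_at_destination": [],
    "miss_flight": [],
    "eat_chocolate": [],
}


def terminal_utility(event):
    """Local evaluation: depends only on the event itself."""
    return TERMINAL[event]


def instrumental_value(event):
    """Nonlocal evaluation: expected terminal utility of everything
    downstream of the event, weighted by subjective probability."""
    return sum(
        p * (terminal_utility(nxt) + instrumental_value(nxt))
        for p, nxt in SUCCESSORS[event]
    )


if __name__ == "__main__":
    for event in SUCCESSORS:
        print(event, terminal_utility(event), instrumental_value(event))
    # Adding up instrumental values counts the same downstream good twice:
    # both "open_car_door" and "drive_to_airport" inherit their value from
    # the single expected arrival.
    print(sum(instrumental_value(e) for e in SUCCESSORS))
```

In this sketch, opening the car door and driving to the airport each have zero terminal utility but inherit the same expected value from the arrival, so summing instrumental values over the graph counts that one arrival twice. Summing terminal utilities over a single history does not have this problem, since each local chunk of value appears exactly once.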

Further reading: