Strong cognitive uncontainability

by Eliezer Yudkowsky Mar 26 2015 updated Mar 12 2016

An advanced agent can win in ways humans can't understand in advance.

[summary: A strongly uncontainable agent's best solution strategies often go through causal domains we can't model; we would not be able to see them as solutions in advance.]


Suppose somebody from the 10th century were asked how somebody from the 20th century might cool their house. They would be able to understand the problem and offer some solutions, maybe even clever solutions ("Locate your house someplace with cooler weather", "Divert water from the stream to flow through your living room"), but the 20th century's actual solution of 'air conditioning' is not available to them as a strategy. Not just because they don't think fast enough or aren't clever enough, but because an air conditioner takes advantage of physical laws they don't know about. Even if they somehow randomly imagined an air conditioner's exact blueprint, they wouldn't expect that design to operate as an air conditioner until they were told about the relation of pressure to temperature, how electricity can power a compressor motor, and so on.

By definition, a strongly uncontainable agent can conceive of strategies that go through causal domains you can't currently model, and it has options that access those strategies. It may therefore execute high-value solutions such that, even if you were told the exact strategy, you would not assign those solutions high expected efficacy without being told further background facts.

At least in this sense, the 20th century is 'strongly cognitively uncontainable' relative to the 10th century: We can solve the problem of how to cool homes using a strategy that would not be recognizable in advance to a 10th-century observer.

Arguably, most real-world problems, if we addressed them today using the full power of modern science and technology (i.e., if we were willing to spend a lot of money on technology and perhaps run a prediction market on the relevant facts), would have best solutions that a 10th-century observer could not verify in advance.

We can imagine a cognitively powerful agent being strongly uncontainable in some domains but not others. Since every cognitive agent is containable on formal games of tic-tac-toe (at least so far as we can imagine, and so long as there isn't a real-world opponent to manipulate), strong uncontainability cannot be a universal property of an agent across all formal and informal domains.

General arguments

Arguments in favor of strong uncontainability tend to revolve around either:

Arguments against strong uncontainability tend to revolve around:

Key propositions