Relevant powerful agents will be highly optimized

by Eliezer Yudkowsky Mar 23 2015 updated Dec 16 2015

The probability that an agent that is cognitively powerful enough to be relevant to existential outcomes, will have been subject to strong, general optimization pressures. Two (disjunctive) supporting arguments are that, one, pragmatically accessible paths to producing cognitively powerful agents tend to invoke strong and general optimization pressures, and two, that cognitively powerful agents would be expected to apply strong and general optimization pressures to themselves.

An example of a scenario that negates [ RelevantPowerfulAgentsHighlyOptimized] is [ KnownAlgorithmNonrecursiveIntelligence], where a cognitively powerful intelligence is produced by pouring lots of computing power into known algorithms, and this intelligence is then somehow prohibited from self-modification and the creation of environmental subagents.

Whether a cognitively powerful agent will in fact have been sufficiently optimized depends on the disjunction of:

Ending up with a scenario along the lines of [ KnownAlgorithmNonrecursiveIntelligence] requires defeating both of the above conditions simultaneously. The second condition seems more difficult and to require more Corrigibility or [ CapabilityControl] features than the first.


Kenzi Amodei

Examples of 'strong, general optimization pressures'? Maybe the sorts of things in that table from Superintelligence. ?Optimization pressure = something like a selective filter, where "strong" means that it was strongly selected for? And maybe the reason to say 'optimization' is to imply that there was a trait that was selected for, strongly, in the same direction (or towards the same narrow target, more like?) for many "generations". Mm, or that all the many different elements of the agent were built towards that trait, with nothing else being a strong competitor. And then "general" presumably is doing something like the work that it does in "general intelligence", ie, not narrow? Ah, a different meaning would be that the agent has been subject to strong pressures towards being a 'general optimizer'. Seems less strongly implied by the grammar, but doesn't create any obvious meaningful interpretive differences.

Oh, or "general" could mean "along many/all axes". So, optimization pressure that is strong, and along many axes. Which fails to specify a set of axes that are relevant, but that doesn't seem super problematic at this moment.

It's not obvious to me that filtering for agents powerful enough to be relevant will leave mainly agents who've been subjected to strong general optimization pressures. For example the Limited Genie described on the Advanced Agent page maybe wasn't?

For self optimization, I assume this is broadly because of the convergent instrumental values claim?