Distinguish which advanced-agent properties lead to the foreseeable difficulty


by Eliezer Yudkowsky Jun 20 2016

Say what kind of AI, or threshold level of intelligence, or key type of advancement, first produces the difficulty or challenge you're talking about.

Any general project of producing a large edifice of good thinking should try to break the ideas into modular pieces, distinguish premises from conclusions, and clearly label which reasoning steps are being used. Applied to AI alignment theory, one implication is that if you propose any potentially difficult or dangerous future behavior from an AI, you should distinguish which particular kind of advancement or cognitive capability is supposed to produce that difficulty. For instance, if you claim that an AI might try to prevent its own shutdown, you should say which properties generate that incentive: consequentialist goal pursuit alone, or that plus enough big-picture awareness for the AI to model its own off-switch. In other words, supposed foreseeable difficulties should come with proposed advanced agent properties that match up to them.