"There's 6 successively stro..."


by Eliezer Yudkowsky Jan 18 2016 updated Jan 18 2016

There's 6 successively stronger arguments listed under "Arguments" in the current version of the page. Mind design space largeness and Humean freedom of preference are #1 and #2. By the time we get to the Gandhi stability argument #3, and the higher tiers of argument above (especially including the tiling agents that seem to directly show stability of arbitrary goals), we're outside the domain of arguments that could specialize equally well to supporting circular preferences. The reason for listing #1 and #2 as arguments anyway is not that they finish the argument, but that (a) before the later tiers of argument were developed #1 and #2 were strong intuition-pumps in the correct direction and (b) even if they might arguably prove too much if applied sloppily, they counteract other sloppy intuitions along the lines of "What does this strange new species 'AI' want?" or "But won't it be persuaded by…" Like, it's important to understand that even if it doesn't finish the argument, it is indeed the case that "All AIs have property P" has a lot of chances to be wrong and "At least one AI has property P" has a lot of chances to be right. It doesn't finish the story - if we took it as finishing the story, we'd be proving much too much, like circular preferences - but it pushes the story in a long way in a particular direction compared to coming in with a prior frame of mind about "What will AIs want? Hm, paperclips doesn't sound right, I bet they want mostly to be left alone."