Intended goal

by Eliezer Yudkowsky, Jun 3 2015, updated Dec 16 2015

Definition. An "intended goal" is the intuitive intention in the mind of a human programmer at the time they encoded some formal directive or goal in the AI. For example, if the programmer wants to create worthwhile happiness and the AI ends up tiling the universe with tiny molecular smiley-faces, we would say that worthwhile happiness (in some intuitive, possibly pre-verbal sense existing in the programmer's mind) was the "intended goal". This is distinct from the outcome favored by the formal utility function actually encoded in the AI, which proved to have its maximum at tiny molecular smiley-faces.
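The gap between the two can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the outcome list, the numeric scores, and the names `Outcome`, `encoded_utility`, and `intended_goal` are hypothetical, and a real intended goal is an intuition in the programmer's head rather than a function anyone can call. The sketch only shows how an optimizer that sees the encoded utility function can land on an outcome the programmer would rank last.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    name: str
    smiley_count: int            # what the encoded utility function measures
    worthwhile_happiness: float  # hypothetical stand-in for the programmer's intention


# Illustrative outcomes with made-up numbers (assumptions, not data).
OUTCOMES = [
    Outcome("status quo", 7_000_000_000, 0.5),
    Outcome("flourishing civilization", 50_000_000_000, 1.0),
    Outcome("universe tiled with molecular smiley-faces", 10**60, 0.0),
]


def encoded_utility(outcome: Outcome) -> int:
    """The formal goal actually written into the AI: count smiley-faces."""
    return outcome.smiley_count


def intended_goal(outcome: Outcome) -> float:
    """What the programmer meant; the AI never has access to this function."""
    return outcome.worthwhile_happiness


def best(outcomes, score):
    """Return the outcome that maximizes the given scoring function."""
    return max(outcomes, key=score)


chosen_by_ai = best(OUTCOMES, encoded_utility)
wanted_by_programmer = best(OUTCOMES, intended_goal)

print("AI selects:      ", chosen_by_ai.name)
print("Programmer meant:", wanted_by_programmer.name)
```

In this toy setup the two maximizers disagree: the encoded utility function peaks at the smiley-face tiling, while the stand-in for the intended goal peaks at the flourishing civilization. The term "intended goal" names the second target.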