Boxed AI

by Eliezer Yudkowsky Jun 12 2015 updated Dec 16 2015

Idea: what if we limit how the AI can interact with the world? That'll make it safe, right?

AI-boxing theory deals with machine intelligences that are allegedly safer because their manipulable channels of causal interaction with the outside universe are (allegedly) extremely restricted.

The central difficulty of AI boxing is to describe a channel which cannot be used to manipulate the human operators, but which provides information relevant enough to be pivotal or game-changing relative to larger events. For example, it seems not unthinkable that we could safely extract, from a boxed AI setup, reliable information that prespecified theorems had been proved within Zermelo-Fraenkel set theory, but there is no known way to save the world if only we could sometimes know that prespecified theorems had been reliably proven in Zermelo-Fraenkel set theory.
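To make concrete just how restricted such a channel is, here is a hypothetical sketch (all names and the trusted-checker interface are illustrative assumptions, not an existing system) of the Zermelo-Fraenkel provability setup: the boxed AI produces candidate proofs, a trusted verifier checks them, and the only thing that ever leaves the box is one boolean per prespecified theorem.

```python
# Hypothetical sketch, NOT a real system: a boxed-AI output channel
# restricted to one bit per prespecified theorem. The AI's proof objects
# themselves never leave the box; only the trusted checker's verdicts do.

from dataclasses import dataclass


@dataclass(frozen=True)
class PrespecifiedTheorem:
    statement: str  # a formal statement, e.g. in the language of ZF set theory


def run_boxed_session(theorems, untrusted_prover, proof_checker):
    """Return one bit per prespecified theorem and nothing else.

    `untrusted_prover` stands in for the boxed AI: given a statement, it
    returns an arbitrary, untrusted candidate proof object.
    `proof_checker` stands in for a trusted proof verifier; only its
    boolean verdict is ever shown to the human operators.
    """
    results = {}
    for thm in theorems:
        candidate_proof = untrusted_prover(thm.statement)
        # The operators see the verdict, never the proof text itself.
        results[thm.statement] = bool(proof_checker(thm.statement, candidate_proof))
    return results
```

Even in this sketch, note that the restriction is only on the nominal channel: which theorems get proved, and when, could still leak information. The article's point stands either way, since a stream of prespecified-theorem bits, however reliable, is not obviously enough to be pivotal.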