"My views about Eliezer's pr..."


by Paul Christiano Jun 18 2015

My views about Eliezer's preferences may depend on the reason that I am running X, rather than merely on the content of X. For example, if I am running X because I want to predict what a person will do, that's a tip-off. For this to work, the capabilities used to guide my thinking must match the capabilities used to assess whether that thinking constitutes mind crime.

But so does the whole project. You've said this well: "you just build the conscience, and that is the AI." The AI doesn't propose a way of figuring out X and then reject or not reject it because it constitutes mind crime, any more than it proposes an action to satisfy its values and then rejects or fails to reject it because the user would consider it immoral. The AI thinks the thought that it ought to think, as best it can figure out, just like it does the thing that it ought to do, as best it can figure out.

Note that you are allowed to just ask about or avoid marginal cases, as long as the total cost of asking or inconvenience of avoiding is not large compared to the other costs of the project. And whatever insight you would have put into your philosophical definition of sapience, you can instead try to communicate as clearly as possible, as a guide to predicting "what Eliezer would say about X," which can circumvent the labor of actually asking.