"It's worth pointing out tha..."


by Paul Christiano Jan 1 2016

It's worth pointing out that in our discussions of AI safety, the author (I assume Eliezer, hereafter "you") often describes the problems as being hard precisely for agents that are not (yet) epistemically efficient, especially concerning predictions about human behavior. Indeed, in this comment you seem to imply that a lack of epistemic efficiency is the primary justification for studying Vingean reflection.

Given that you think coping with epistemic inefficiency is an important part of the safety problem, this line:

"But epistemic efficiency isn't a necessary property for advanced safety to be relevant - we can conceive scenarios where an AI is not epistemically efficient, and yet we still need to deploy parts of value alignment theory. We can imagine, e.g., a Limited Genie that is extremely good with technological designs, smart enough to invent its own nanotechnology, but has been forbidden to model human minds in deep detail (e.g. to avert programmer manipulation)"

seems misleading.

In general, you seem to equivocate between a model where we can/should focus on extremely powerful agents, and a model where most of the key difficulties are at intermediate levels of power where our AI systems are better than humans at some tasks and worse at others. (You often seem to have quite specific views about which tasks are likely to be easy or hard; I don't really buy most of these particular views, but I do think that we should try to design control systems that work robustly across a wide range of capability states.)