Ideal target

https://arbital.com/p/ideal_target

by Eliezer Yudkowsky Feb 8 2017

The 'ideal target' of a meta-utility function is the value the ground-level utility function would take on if the agent updated on all possible evidence; the 'true' utilities under moral uncertainty.


The 'ideal target' of a meta-utility function $~$\Delta U$~$, which behaves as if a ground-level utility function $~$U$~$ takes on different values in different possible worlds, is the value of $~$U$~$ in the actual world; or the expected value of $~$U$~$ after updating on all possible accessible evidence. If chocolate has €8 utility in worlds where the sky is blue, and €5 utility in worlds where the sky is not blue, and the sky is in fact blue, then in the AI's 'ideal target' utility function, the utility of chocolate is €8.
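
The chocolate example can be worked through numerically. The following is a minimal sketch, not a definitive formalization: the world names, the helper functions `expected_utility` and `update`, and the 0.9 prior on a blue sky are all illustrative assumptions that do not appear in the text above. It shows how the agent's current expected utility for chocolate can differ from the ideal target, which is what that expectation becomes once the agent has conditioned on the evidence that settles which world is actual.

```python
# Toy model of the chocolate example.
# ASSUMPTION: the world names and the 0.9 prior are hypothetical, chosen
# only for illustration; the source gives just the two utilities (8 and 5).
WORLDS = {
    "sky_blue": {"prior": 0.9, "chocolate_utility": 8.0},
    "sky_not_blue": {"prior": 0.1, "chocolate_utility": 5.0},
}


def expected_utility(beliefs):
    """Expected ground-level utility of chocolate under a belief
    distribution over possible worlds (world name -> probability)."""
    return sum(p * WORLDS[w]["chocolate_utility"] for w, p in beliefs.items())


def update(beliefs, consistent_worlds):
    """Condition on evidence: drop worlds the evidence rules out,
    then renormalize the remaining probabilities."""
    kept = {w: p for w, p in beliefs.items() if w in consistent_worlds}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("evidence is inconsistent with every world")
    return {w: p / total for w, p in kept.items()}


prior = {w: info["prior"] for w, info in WORLDS.items()}

# Before looking at the sky: a probability-weighted mix of 8 and 5
# (roughly 7.7 under the assumed prior).
current_estimate = expected_utility(prior)

# After observing that the sky is blue, only one world remains,
# so the expectation collapses to that world's utility.
posterior = update(prior, {"sky_blue"})
ideal_target = expected_utility(posterior)

print("current estimate:", current_estimate)
print("ideal target:", ideal_target)
```

In this sketch the ideal target does not depend on the assumed prior: any prior that gives the blue-sky world nonzero probability yields the same value after updating, which is the sense in which the ideal target is the value of $~$U$~$ in the actual world.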