Principles in AI alignment

https://arbital.com/p/alignment_principle

by Eliezer Yudkowsky Feb 16 2017 updated Feb 16 2017

A 'principle' of AI alignment is a very general design goal like 'understand what the heck is going on inside the AI' that has informed a wide set of specific design proposals.


[summary: A 'principle' of AI alignment is something we want in a broad sense for the whole AI, which has informed narrower design proposals for particular parts or aspects of the AI.]

A 'principle' of AI alignment is something we want in a broad sense for the whole AI, which has informed narrower design proposals for particular parts or aspects of the AI.


Please be guarded about declaring things to be 'principles' unless they have already informed more than one specific design proposal and more than one person thinks they are a good idea. If you personally think an idea qualifies, you can call it a 'proposed principle' and post it under your own domain. There are a lot of possible 'broad design wishes', or things that people think are 'broad design wishes', and the principles that have actually already informed specific design proposals would otherwise get lost in the crowd.