Fair problem class


by Eliezer Yudkowsky Jan 12 2017

A problem is 'fair' (according to logical decision theory) when the payoff depends only on what the agent chooses, not on how it arrives at that choice.

A 'fair' problem class, in logical decision theory, is a problem class in which agents' payoffs depend only on their choices and policies, not directly on their algorithm. For example, suppose my problem is "If an agent chooses X, I will give them \$10, and if they choose Y, I will give them \$0." This problem is 'fair' because it makes no mention of why the agent chose X or Y; only the agent's actual choice matters. On the other hand, suppose I say, "I will give an agent \$10 if it chooses X over Y because it is an alphabetizing agent and X comes before Y in alphabetical order; but if the agent chooses X for any other reason, like wanting money, I will give the agent \$0." This problem is not 'fair' because it rewards having a particular algorithm, independently of any output or other abstract behavior of that algorithm.
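The contrast between the two example problems can be made concrete with a small sketch. Everything named here (`Agent`, `fair_payoff`, `unfair_payoff`, and the `algorithm` label) is a hypothetical illustration rather than part of any formal definition; the label is only a stand-in for "the problem can inspect why the agent chose".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Agent:
    """Hypothetical agent: an algorithm label plus the choice it outputs."""
    algorithm: str                # stand-in for "how the agent decides"
    choose: Callable[[], str]     # returns "X" or "Y"


def fair_payoff(agent: Agent) -> int:
    # Fair: the payoff is a function of the choice alone.
    return 10 if agent.choose() == "X" else 0


def unfair_payoff(agent: Agent) -> int:
    # Unfair: the payoff also inspects the algorithm that produced the choice.
    picked_x = agent.choose() == "X"
    return 10 if picked_x and agent.algorithm == "alphabetizer" else 0


# Two agents with different algorithms and the same output.
alphabetizer = Agent("alphabetizer", lambda: min("X", "Y"))  # picks the earlier letter
money_seeker = Agent("money_seeker", lambda: "X")            # picks X for the $10

for agent in (alphabetizer, money_seeker):
    print(agent.algorithm, fair_payoff(agent), unfair_payoff(agent))
# alphabetizer 10 10
# money_seeker 10 0
```

Both agents output X, so the fair problem pays them identically, while the unfair problem pays them differently even though their behavior is the same.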