Executable philosophy


by Eliezer Yudkowsky Jul 31 2015 updated Jun 6 2016

Philosophical discourse aimed at producing a trustworthy answer or meta-answer, in limited time, which can be used in constructing an Artificial Intelligence.

[summary: 'Executable philosophy' is Eliezer Yudkowsky's term for discourse about subjects usually considered in the realm of philosophy, meant to be used for designing an Artificial Intelligence.]

"Executable philosophy" is Eliezer Yudkowsky's term for discourse about subjects usually considered to belong to the realm of philosophy, meant to be applied to problems that arise in designing or aligning machine intelligence.

Two motivations of "executable philosophy" are as follows:

  1. We need a philosophical analysis to be "effective" in Turing's sense: that is, the terms of the analysis must be useful in writing programs. We need ideas that we can compile and run; they must be "executable" like code is executable.
  2. We need to produce adequate answers on a time scale of years or decades, not centuries. In the entrepreneurial sense of "good execution", we need a methodology we can execute on in a reasonable timeframe.

Some consequences:

Conversely, we can't just plug the products of standard analytic philosophy into AI problems, because:

• The academic incentives favor continuing to dispute small possibilities because "ongoing dispute" means "everyone keeps getting publications". As somebody once put it, for academic philosophy, an unsolvable problem is "like a biscuit bag that never runs out of biscuits". As a sheerly cultural matter, this means that academic philosophy hasn't accepted that e.g. everything is made out of quarks (particle fields) without any non-natural or irreducible properties attached.

In turn, this means that when academic philosophers have tried to do metaethics, the result has been a proliferation of different theories that are mostly about non-natural or irreducible properties, with only a few philosophers taking a stand on trying to do metaethics for a strictly natural and reducible universe. Those naturalistic philosophers still have to argue for a natural universe, rather than being able to take it as given and move on to further analysis within the naturalistic possibilities. To build and align Artificial Intelligence, we need to answer some complex questions about how to compute goodness; the field of academic philosophy is stuck on an argument about whether goodness ought ever to be computed.

• Many academic philosophers haven't learned the programmers' discipline of distinguishing concepts that might compile. If we imagine rewinding the state of understanding of computer chess to what obtained in the days when Edgar Allan Poe proved that no mere automaton could play chess, then the modern style of philosophy would produce, among other papers, a lot of papers considering the 'goodness' of a chess move as a primitive property and arguing about the relation of goodness to reducible properties like controlling the center of a chessboard.

There's a particular mindset that programmers have for realizing which of their own thoughts are going to compile and run, and which of their thoughts are not getting any closer to compiling. A good programmer knows, e.g., that if they offer a 20-page paper analyzing the 'goodness' of a chess move in terms of which chess moves are 'better' than other chess moves, they haven't actually come any closer to writing a program that plays chess. (This principle is not to be confused with greedy reductionism, wherein you find one thing you understand how to compute a bit better, like 'center control', and then take this to be the entirety of 'goodness' in chess. Avoiding greedy reductionism is part of the skill that programmers acquire of thinking in effective concepts.)
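As a rough illustration of the difference between an effective concept and greedy reductionism, here is a minimal sketch. It uses a toy stone-taking game as a stand-in for chess, which is an assumption made purely to keep the example small; the names `wins`, `good_moves`, and `greedy_move` are hypothetical and do not come from the original text. In this toy game, 'good move' has a definition that actually runs, while the greedy proxy ('take as many stones as possible') is computable but is not the same thing as goodness.

```python
from functools import lru_cache
from typing import List

# Assumed toy rules (not chess): players alternate removing 1..MAX_TAKE
# stones, and whoever takes the last stone wins.
MAX_TAKE = 3


@lru_cache(maxsize=None)
def wins(stones: int) -> bool:
    """True if the player to move can force a win with `stones` left.

    With 0 stones there is no legal move, so the player to move has lost.
    """
    legal = range(1, min(MAX_TAKE, stones) + 1)
    return any(not wins(stones - take) for take in legal)


def good_moves(stones: int) -> List[int]:
    """Effective definition of 'good': moves that leave the opponent lost."""
    legal = range(1, min(MAX_TAKE, stones) + 1)
    return [take for take in legal if not wins(stones - take)]


def greedy_move(stones: int) -> int:
    """Greedy reductionism: one easy proxy treated as all of 'goodness'."""
    return min(MAX_TAKE, stones)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, "good:", good_moves(n), "greedy:", greedy_move(n))
```

With five stones, the only winning move is to take one, while the greedy proxy takes three and hands the opponent a win. A definition like "a good move is one that is better than the other moves" would give `good_moves` nothing to compute at all.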

Many academic philosophers don't have this mindset of 'effective concepts', nor have they taken as a goal that the terms in their theories need to compile, nor do they know how to check whether a theory compiles. This, again, is one of the foundational reasons why, despite the very large edifice of academic philosophy, its products tend to be of little use in AGI.

In more detail, Yudkowsky lists these as some tenets or practices of what he sees as 'executable' philosophy:

A final trope of executable philosophy is to not be intimidated by how long a problem has been left open. "Ignorance exists in the mind, not in reality; uncertainty is in the map, not in the territory; if I don't know whether a coin landed heads or tails, that's a fact about me, not a fact about the coin." There can't be any unresolvable confusions out there in reality. There can't be any inherently confusing substances in the mathematically lawful, unified, low-level physical process we call the universe. Any seemingly unresolvable or impossible question must represent a place where we are confused, not an actually impossible question out there in reality. This doesn't mean we can quickly or immediately solve the problem, but it does mean that there's some way to wake up from the confusing dream. Thus, as a matter of entrepreneurial execution, we're allowed to try to solve the problem rather than run away from it; trying to make an investment here may still be profitable.
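The coin example quoted above can be written out as a small sketch, under the assumption that an observer's uncertainty is modeled as a probability assigned from whatever that observer has seen. The `Coin` class and the `belief_in_heads` function are hypothetical names introduced only for illustration. The coin itself holds exactly one definite face; only the observers' probabilities differ.

```python
import random
from fractions import Fraction


class Coin:
    """The territory: once flipped, the coin has one definite face."""

    def __init__(self, rng: random.Random) -> None:
        self.face = rng.choice(["heads", "tails"])


def belief_in_heads(observation):
    """The map: probability of heads given what this observer has seen.

    `observation` is None for an observer who has not looked at the coin.
    """
    if observation is None:
        return Fraction(1, 2)
    return Fraction(1) if observation == "heads" else Fraction(0)


coin = Coin(random.Random(0))

# Two observers, one coin: the uncertainty lives in the observers.
blind = belief_in_heads(None)
sighted = belief_in_heads(coin.face)

print("coin is definitely:", coin.face)
print("observer who has not looked:", blind)
print("observer who has looked:", sighted)
```

Nothing about the coin changes between the two calls; the 1/2 belongs to the observer who has not looked.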

Although all confusing questions must be places where our own cognitive algorithms are running skew to reality, this, again, doesn't mean that we can immediately see and correct the skew; nor that it is compilable philosophy to insist in a very loud voice that a problem is solvable; nor that when a solution is presented we should immediately seize on it because the problem must be solvable and behold here is a solution. An important step in the method is to check whether there is any lingering sense of something that didn't get resolved; whether we really feel less confused; whether it seems like we could write out the code for an AI that would be confused in the same way we were; whether there is any sense of dissatisfaction; whether we have merely chopped off all the interesting parts of the problem.

An earlier guide to some of the same ideas was the Reductionism Sequence.

[todo: tutorial: finishable philosophy applied to 'free will'. (don't forget to distinguish plausible wrong ways to do it on each step. is there a good example besides free will that can serve as a homework problem? maybe something actually unresolved like 'Why does anything exist -> why do some things exist more than others?' with Tegmark Level IV as a considered, but not accepted answer.)]


Benjy Forstadt

I have a few complaints/questions:

1) "What is goodness made out of" is not really a particularly active discussion in professional philosophy. I feel that this was put in there just to make analytic philosophers look silly. And anyways, if one believes in naturalistic moral properties (the stuff that we value), then "what is goodness made out of" really is the question "what is good," which I think is probably a fine question. In this case, rephrasing in terms of AI just makes philosophical discussions more wordy and less accessible.

2) "Faced with any philosophically confusing issue, our task is to identify what cognitive algorithm humans are executing which feels from the inside like this sort of confusion, rather than, as in conventional philosophy, to try to clearly define terms and then weigh up all possible arguments for all 'positions'."

I don't get what the problem is with clearly defining terms and weighing up pros and cons for positions. Is conceptual analysis (http://philpapers.org/browse/conceptual-analysis) so problematic that it has no place in an improved version of philosophy? I think that there are at least a few parallels between that project in philosophy and the sentiment expressed in rescue_utility.html, for example.

3) "Most 'philosophical issues' worth pursuing can and should be rephrased as subquestions of some primary question about how to design an Artificial Intelligence, even as a matter of philosophy qua philosophy."

What is "philosophy qua philosophy"?

"This imports the discipline of programming into philosophy. In particular, programmers learn that even if they have an inchoate sense of what a computer should do, when they actually try to write it out as code, they sometimes find that the code they have written fails (on visual inspection) to match up with their inchoate sense. Many ideas that sound sensible as English sentences are revealed as confused as soon as we try to write them out as code."

How would one translate questions like "Are there unverifiable truths?" or "under what conditions does the parthood relation hold?" into AI-speak?