Although the halting problem means we can't *prove* it doesn't happen, it would nonetheless be *extremely surprising* if some 100-state Turing machine turned out to print the exact text of Shakespeare's *Romeo and Juliet.* Unless something was specifically generated by a simple algorithm, the Vast supermajority of data structures that *look* like they have high algorithmic complexity actually *do* have high algorithmic complexity. Since there are at most $~$2^{101}$~$ programs that can be specified with at most 100 bits (in any particular language), we can't fit all the 1000-bit data structures into all the 100-bit programs. While *Romeo and Juliet* is certainly highly compressible, relative to most random bitstrings of the same length, it would be shocking for it to compress *all the way down* to a 100-state Turing machine. There just aren't enough 100-state Turing machines for one of their outputs to be *Romeo and Juliet*. Similarly, if you start with a 10 kilobyte text file, and 7zip compresses it down to 2 kilobytes, no amount of time spent trying to *further* compress the file using other compression programs will *ever* get that file down to 1 byte. For any given compressor there are at most 256 starting files that can ever be compressed down to 1 byte (one for each possible output byte), and your 10-kilobyte text file almost certainly isn't one of them.

This takes on defensive importance with respect to refuting the probability-theoretic fallacy, "Oh, sure, Occam's Razor seems to say that this proposition is complicated. But how can you be sure that this apparently complex proposition wouldn't turn out to be generated by some very simple mechanism?" If we consider a partition of 10,000 mutually exclusive propositions, their probabilities must sum to at most 100%, so the average probability per proposition is at most 0.01%. Then all the arguments in the world for why various propositions might have unexpectedly high probability must still add up to an average probability of 0.01%. It can't be the case that after considering that proposition 1 might have secretly high probability, and considering that proposition 2 might have secretly high probability, and so on, we end up assigning 5% probability to each proposition, because that would be a total probability of 500, i.e., 50,000%. If we assign prior probabilities using an algorithmic-complexity Occam prior as in Solomonoff induction, then the observation that "most apparently complex things can't be further compressed into an amazingly simple Turing machine" is the same observation as "most apparently Occam-penalized propositions can't turn out to be simpler than they look" or "most apparently subjectively improbable things can't turn out to have unseen clever arguments that would validly make them more subjectively probable".
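The arithmetic above is worth making explicit, using the 10,000-proposition partition from the text:

```python
# A partition of 10,000 mutually exclusive propositions can have
# probabilities summing to no more than 1 (i.e., 100%).
n = 10_000

# So the average probability per proposition is at most:
average = 1 / n
print(average)  # 0.0001, i.e. 0.01%

# If clever-sounding arguments talked us into assigning 5% to each
# proposition, the total would violate the laws of probability:
total = n * 0.05
print(total)  # 500.0 -- a "total probability" of 50,000%
```

No matter how persuasive each individual argument for "secretly high probability" sounds, the budget of 100% total probability caps how much all of them together can deliver.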

## Comments

Eric Rogstad

Not 2^100?

Kaya Fallenstein

"Similarly, if you start with a 10 kilobyte text file, and 7zip compresses it down to 2 kilobytes, no amount of time spent trying to *further* compress the file using other compression programs will *ever* get that file down to 1 byte." (Emphasis added.)

This claim seems too strong. Do you mean that an arbitrary compression algorithm is unlikely to compress the file to one byte?

Taking what's written at face value, if I assume that every compression I apply monotonically decreases the size of the file, then I need only find a large enough sequence of compressions and compose them to compress my file to one byte. This is of course silly, since the composition is just a huge highly specialized compression program optimized for this instance, but I can spend enough time to make it happen.