- I propose that this concept be called "unexpected surprise" rather than "strictly confused":

"Strictly confused" suggests logical incoherence.

"Unexpected surprise" can be motivated the following way: let $$~$ s(d) = \textrm{surprise}(d \mid H) = - \log \Pr (d \mid H) $~$$ be how surprising data $~$d$~$ is on hypothesis $~$H$~$. Then one is "strictly confused" if the observed $~$s$~$ is larger than than one would expect assuming a $~$H$~$ holds.

This terminology is nice because the average of $~$s$~$ under $~$H$~$ is the entropy or expected surprise in $~$(d \mid H)$~$. It also connects with Bayes, since $$~$\textrm{log-likelihood} = -\textrm{surprise}$~$$ is the evidential support $~$d$~$ gives $~$H$~$.
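To make the "expected surprise" reading concrete, here is a minimal Python sketch (my own toy example, assuming a Bernoulli$(p)$ hypothesis $~$H$~$) of the surprise $~$s(d)$~$ and its average under $~$H$~$:

```python
import math

def surprise(d, p):
    """s(d) = -log Pr(d | H) for an assumed Bernoulli(p) hypothesis H."""
    return -math.log(p if d == 1 else 1 - p)

def expected_surprise(p):
    """The average of s under H: the entropy of Bernoulli(p)."""
    return p * surprise(1, p) + (1 - p) * surprise(0, p)

# On a hypothesis that strongly expects heads, a single observed tail is
# far more surprising than the surprise one expects on average under H.
p = 0.9
print(surprise(0, p))        # s(tail): about 2.30 nats
print(expected_surprise(p))  # entropy: about 0.33 nats
```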

The section on "Distinction from frequentist p-values" is, I think, both technically incorrect and a bit uncharitable.

It's technically incorrect because the following isn't true:

> The classical frequentist test for rejecting the null hypothesis involves considering the probability assigned to particular 'obvious'-seeming partitions of the data, and asking if we ended up inside a low-probability partition.

Actually, the classical frequentist test involves specifying an obvious-seeming measure of surprise $~$t(d)$~$, and seeing whether $~$t$~$ is higher than expected on $~$H$~$. This is even more arbitrary than the above.

On the other hand, it's uncharitable because it's widely acknowledged that one should try to choose $~$t$~$ to be *sufficient*, which is exactly the condition that the partition induced by $~$t$~$ is "compatible" with $~$\Pr(d \mid H)$~$ for different $~$H$~$, in the sense that $$~$\Pr(H \mid d) = \Pr(H \mid t(d))$~$$ for all the considered $~$H$~$. Clearly $~$s$~$ is sufficient in this sense. But there might be simpler functions of $~$d$~$ that do the job too ("minimal sufficient statistics").

Note that $~$t$~$ being sufficient doesn't make it non-arbitrary, as it may not be a monotone function of $~$s$~$.
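A small numerical check of the sufficiency condition (my own example, assuming a uniform prior over three candidate coin biases): for Bernoulli trials, conditioning on the head count $~$t(d)$~$ gives the same posterior as conditioning on the full sequence $~$d$~$:

```python
import math

# Assumed discrete hypothesis space: candidate head-probabilities, uniform prior
hypotheses = [0.2, 0.5, 0.8]

def lik_sequence(d, p):
    """Pr(d | H=p) for a specific sequence of coin flips d."""
    k = sum(d)
    return p**k * (1 - p)**(len(d) - k)

def lik_count(k, n, p):
    """Pr(t(d)=k | H=p), where t(d) = number of heads in n flips."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def posterior(liks):
    """Normalize likelihoods into a posterior (uniform prior cancels)."""
    z = sum(liks)
    return [l / z for l in liks]

d = (1, 1, 0, 1)  # an observed sequence
k, n = sum(d), len(d)

post_d = posterior([lik_sequence(d, p) for p in hypotheses])
post_t = posterior([lik_count(k, n, p) for p in hypotheses])

# Sufficiency: Pr(H | d) = Pr(H | t(d)) for every considered H
print(post_d)
print(post_t)
```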

Finally, I think that this concept is clearly "extra-Bayesian", in the sense that it's about non-probabilistic ("Knightian") uncertainty over $~$H$~$, and one is considering probabilities attached to unobserved $~$d$~$ (i.e., not conditioning on the observed $~$d$~$).

I don't think being "extra-Bayesian" in this sense is problematic. But I think it should be owned up to.

Actually, "unexpected surprise" reveals a nice connection between Bayesian and sampling-based uncertainty intervals:

- To get a (HPD) credible interval, exclude those $~$H$~$ that are relatively surprised by the observed $~$d$~$ (or which are *a priori* surprising).
- To get a (nice) confidence interval, exclude those $~$H$~$ that are "unexpectedly surprised" by $~$d$~$.

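The confidence-interval bullet can be sketched as follows (my own construction, assuming a binomial model and a 5% threshold): exclude each candidate bias $~$p$~$ for which the observed surprise falls in the far tail of the surprise distribution under that $~$p$~$:

```python
import math

def surprise(k, n, p):
    """s(k) = -log Pr(k | H=p) for a Binomial(n, p) hypothesis."""
    return -math.log(math.comb(n, k) * p**k * (1 - p)**(n - k))

def unexpectedly_surprised(k_obs, n, p, alpha=0.05):
    """H=p is 'unexpectedly surprised' by k_obs if, under H, surprise
    at least as large as the observed one has probability below alpha."""
    s_obs = surprise(k_obs, n, p)
    tail = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1)
               if surprise(k, n, p) >= s_obs)
    return tail < alpha

# The interval: hypotheses NOT unexpectedly surprised by the data
k_obs, n = 7, 10
interval = [p / 100 for p in range(1, 100)
            if not unexpectedly_surprised(k_obs, n, p / 100)]
print(interval[0], interval[-1])
```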