Can the posterior information on a parameter be more
extreme than both the prior information and the data evidence (likelihood
function)? In one of our consulting/collaborative projects with J&J, we
encountered such a counter-intuitive (paradoxical?) phenomenon, as illustrated
in the figures below. This
counter-intuitive result on the parameter of interest is caused by
"marginalization" of the joint distributions and the likelihood function,
but it is not the same as the "marginalization paradox" described by Berger
(2006) and Dawid et al. (1973). A brief description and
discussion/interpretation of the problem are on a separate page (click here).


Caption: The figure on the left-hand side
contains contour plots of a bi-beta prior [in blue], the joint likelihood
function [in black] and the (simulated) posterior distribution [in red] of the
binomial parameters (p0, p1). These three two-dimensional distributions are
projected onto the direction of the parameter of interest d = p1 - p0
(the off-diagonal 45-degree line pointing towards the upper-left corner). This
leads to the figure on the right-hand side, in which the marginal posterior of
d = p1 - p0 [in red] is more extreme than both its prior [in blue] and the
data evidence [in black]!
Remark: When the prior is less skewed, this
counter-intuitive result may be somewhat mitigated, depending on the structure
of the likelihood function and the prior. But the phenomenon is mathematical
and is there to stay as long as skewed distributions are involved!
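
The projection described in the caption can be sketched with a short simulation. This is only an illustrative sketch under stated assumptions, not the analysis behind the figures: it assumes the prior factorizes into two independent Beta distributions (the bi-beta prior in the figure may not), it represents the "data evidence" by the normalized binomial likelihood (equivalently, the posterior under a flat prior), and the function name `simulate_d` and all numerical inputs are hypothetical.

```python
import numpy as np


def simulate_d(a0, b0, a1, b1, x0, n0, x1, n1, size=100_000, seed=0):
    """Draw d = p1 - p0 under the prior, the normalized likelihood, and the posterior.

    Assumptions (not taken from the original project):
      * independent priors p0 ~ Beta(a0, b0) and p1 ~ Beta(a1, b1);
      * binomial data: x0 successes out of n0, x1 successes out of n1;
      * the likelihood is normalized to a density, i.e. Beta(x + 1, n - x + 1).
    """
    rng = np.random.default_rng(seed)
    # Marginal prior of d: difference of independent prior draws.
    prior = rng.beta(a1, b1, size) - rng.beta(a0, b0, size)
    # Marginal "data evidence" of d: normalized likelihood projected onto d.
    likelihood = (rng.beta(x1 + 1, n1 - x1 + 1, size)
                  - rng.beta(x0 + 1, n0 - x0 + 1, size))
    # Marginal posterior of d: conjugate Beta updates for p0 and p1.
    posterior = (rng.beta(a1 + x1, b1 + n1 - x1, size)
                 - rng.beta(a0 + x0, b0 + n0 - x0, size))
    return {"prior": prior, "likelihood": likelihood, "posterior": posterior}


# Hypothetical inputs: a skewed prior on p0 with little data in that arm,
# and a symmetric prior on p1 with much more data in that arm.
draws = simulate_d(a0=1, b0=9, a1=5, b1=5, x0=5, n0=10, x1=90, n1=100)
for name, d in draws.items():
    print(f"{name:>10}: mean of d = {d.mean():.3f}")
```

Comparing a location summary of the three sets of draws (the mean here, or a histogram mode) shows whether the marginal posterior of d falls outside the range spanned by its prior and the data evidence for a given choice of inputs.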