Today’s announcement at CERN of the latest research on the Higgs boson was truly extraordinary. Not only was the scientific achievement remarkable, but the media’s reporting of 5-sigma as a measure of “certainty” was also remarkable. For instance, the science editor at the Swedish newspaper Dagens Nyheter reported that a sigma of 4.9 equals a certainty of 99.99994 %, which obviously isn’t true, simply because p( D | H0 ) is not the same as p( H0 | D ). In plain English, this means that a p-value represents the conditional probability of getting the data (or something more extreme) given that the null hypothesis is true. Nothing more, and it certainly doesn’t give the probability that the alternative hypothesis is true, i.e. the “certainty” that something has been found that’s not a random fluctuation.

The Higgs boson: 5-sigma and the concept of p-values | R Psychologist
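The distinction can be made concrete with a quick sketch: convert a sigma level into a one-sided tail probability p( D | H0 ), then note that p( H0 | D ) is a different quantity that also depends on a prior and on the likelihood under the alternative. The prior and the alternative likelihood below are made-up numbers purely for illustration, not anything from the CERN analysis.

```python
from math import erfc, sqrt

def sigma_to_p(sigma):
    """One-sided tail probability of a standard normal beyond `sigma`:
    p(data at least this extreme | H0)."""
    return 0.5 * erfc(sigma / sqrt(2))

p_5sigma = sigma_to_p(5)  # roughly 2.87e-7
print(f"p(D | H0) at 5 sigma: {p_5sigma:.3g}")

# p(H0 | D) needs extra ingredients a p-value does not supply:
# a prior on H0 and a likelihood under H1. Both values here are
# purely hypothetical.
prior_h0 = 0.5   # hypothetical prior belief in the null
lik_h1 = 0.1     # hypothetical p(D | H1)
posterior_h0 = (p_5sigma * prior_h0) / (
    p_5sigma * prior_h0 + lik_h1 * (1 - prior_h0)
)
print(f"p(H0 | D) under these assumptions: {posterior_h0:.3g}")
# The posterior moves with the prior, which is why "5 sigma" is not
# by itself a statement of certainty about the hypothesis.
```

Change `prior_h0` and the two numbers diverge further, which is the whole point: one is fixed by the data and H0, the other is not.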
The Higgs Boson and the p-value Police « Normal Deviate
The most common complaint is that physicists and journalists explain the meaning of a p-value incorrectly. For example, if the p-value is 0.000001 then we will see statements like “there is a 99.9999% confidence that the signal is real.” We then feel compelled to correct the statement: if there is no effect, then the chance of something as or more extreme is 0.000001.
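That corrected reading has a direct operational meaning, which a small simulation can illustrate: draw the test statistic many times under H0 and count how often a draw is as or more extreme than the observed value. The 2-sigma “observation” below is an arbitrary example, not the Higgs data.

```python
import random

random.seed(42)

observed = 2.0    # arbitrary example statistic, in sigma units
n_sims = 100_000

# Simulate the test statistic under H0 (standard normal) and count
# how often a draw is as or more extreme than the observed value.
extreme = sum(1 for _ in range(n_sims) if random.gauss(0, 1) >= observed)
p_value = extreme / n_sims
print(f"estimated p(as or more extreme | H0): {p_value:.4f}")
# Analytically this tail probability is about 0.0228. Note that the
# whole calculation happens under H0 -- nothing here says how likely
# H0 itself is.
```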
Fair enough. But does it really matter? The big picture is: the evidence for the effect is overwhelming. Does it really matter if the wording is a bit misleading? I think we reinforce our image as pedants if we complain about this.
“This is the basic logic of hypothesis testing—conclude that your claim is correct if the chance of alternative claims being correct is small” (via Why You Shouldn’t Conclude “No Effect” from Statistically Insignificant Slopes « Elections « Carlisle Rainey).
I think this is more an issue of data scarcity, and one that would have been attenuated, for example, by focusing on confidence intervals.
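As a sketch of what that confidence-interval focus looks like in practice, here is a minimal OLS slope with a 95% interval, in pure Python. The data points are invented stand-ins for a growth-vs-vote-share regression; a real analysis would use the actual election data.

```python
from math import sqrt

# Invented toy data standing in for "economic growth vs. vote share".
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [46.0, 48.5, 47.0, 51.0, 50.5, 53.0, 52.0, 55.5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
intercept = my - slope * mx

# Residual standard error, then the slope's standard error.
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
se = sqrt(sum(r * r for r in resid) / (n - 2)) / sqrt(sxx)

# 95% interval using t(0.975, df = n - 2 = 6), which is about 2.447.
t_crit = 2.447
ci = (slope - t_crit * se, slope + t_crit * se)
print(f"slope = {slope:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
# A wide interval makes the data scarcity visible in a way that a bare
# "significant / not significant" verdict does not.
```

With only eight observations the interval is wide, which is exactly the information a significance test hides.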
It’s also an issue of low-resolution data. The author’s example is one where the model is trying to be of rank one on the Hibbs Efficacy Scale, where you beat the crap out of the covariance matrix by predicting a presidential election victory from *one* independent variable. Try it again with state-level data to get more granularity, and the red-blue mikado plots will become much less ambiguous after a few residual-versus-predictor plots.
I do like the mikado plots, though. The eye seems to produce an intuitive credible interval out of them.