The goal of inference

The scientific process.

P < 0.05

The problem with inference

  • All swans are white (P < 0.05)

JJ Harrison (https://www.jjharrison.com.au/) - Own work CC BY-SA 4.0

What does the p-value mean?

(Svetkey 2008)

What does the p-value mean?

All groups regained weight after randomization by a mean of 5.5 kg in the self-directed, 5.2 kg in the interactive technology–based, and 4.0 kg in the personal-contact group… Those in the personal-contact group regained a mean of 1.2 kg less than those in the interactive technology–based group (95% CI, 2.1-0.3 kg; P=.008).

What does the p-value mean? Svetkey et al. …

  1. … have absolutely disproved the null hypothesis (that there is no difference between the population means).
  2. … have found the probability of the null hypothesis being true.
  3. … have absolutely proved their experimental hypothesis (that there is a difference between the population means).

Questions adapted from (Dienes 2008).

What does the p-value mean? Svetkey et al. …

  1. … can deduce the probability of the experimental hypothesis being true.
  2. … know the probability that they are making the wrong decision, if they decide to reject the null hypothesis.
  3. … have a reliable experimental finding in the sense that if, hypothetically, the experiment were repeated a great number of times, they would obtain a significant result on 95% of occasions.

Questions adapted from (Dienes 2008).

Statistical inference

Drawing inference from a sample

  • Participants are recruited to an intervention and randomized to receive either a RED or a BLUE pill.

  • Systolic blood pressure is measured after the intervention.

  • Do RED and BLUE pills affect blood pressure differently?

The results

A permutation experiment

10 000 permutations

10 000 permutations and our result

How unlikely is our result if the pills had no effect?

  • The average observed difference was 2.47.
  • 167 reshuffled differences were equal to or greater than our observed result.
  • This represents a small fraction of the reshuffled differences, in fact…
  • \(p = \frac{167}{10000} = 0.0167\)
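
The reshuffling procedure can be sketched in a few lines of Python. This is a minimal illustration, not the analysis behind the slides: the blood pressure values are invented, and the function names `permutation_differences` and `one_sided_p` are hypothetical. The idea is that if the pills had no effect, the group labels are arbitrary, so we shuffle them many times and count how often the reshuffled difference is at least as large as the observed one.

```python
import random


def group_mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def permutation_differences(red, blue, n_permutations=10_000, seed=1):
    """Reshuffle group labels and return the difference in means
    (RED minus BLUE) for every reshuffle."""
    rng = random.Random(seed)
    pooled = list(red) + list(blue)
    n_red = len(red)
    diffs = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random re-allocation of the pills
        diffs.append(group_mean(pooled[:n_red]) - group_mean(pooled[n_red:]))
    return diffs


def one_sided_p(observed, diffs):
    """Fraction of reshuffled differences equal to or greater than
    the observed difference."""
    return sum(d >= observed for d in diffs) / len(diffs)


# Hypothetical systolic blood pressure values, invented for illustration only.
red = [128.0, 135.5, 131.2, 140.1, 126.8, 133.9, 138.4, 129.7]
blue = [125.3, 130.0, 127.6, 134.2, 124.9, 129.1, 132.8, 126.4]

observed = group_mean(red) - group_mean(blue)
diffs = permutation_differences(red, blue)
p = one_sided_p(observed, diffs)
print(f"observed difference: {observed:.2f}, one-sided p: {p:.4f}")
```

With the counts from the slide, the same calculation is simply 167 / 10 000.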

To account for both extremes

  • When allocating the pills, we did not really know what to expect, yet our last test only considered an effect in one direction.
  • We should compare our result to extreme results in both directions.
  • 167 reshuffled differences were equal to or greater than our observed result, and 169 were equal to or smaller than an effect of the same size in the opposite direction (−2.47).
  • \(p = \frac{167 + 169}{10000} = 0.0336\)
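
A two-sided version of the count can be sketched as follows. The helper name `two_sided_p` and the toy list of reshuffled differences are hypothetical and only show the mechanics: count the reshuffled differences that are at least as extreme as the observed one in either direction.

```python
def two_sided_p(observed, diffs):
    """Fraction of reshuffled differences at least as extreme as the
    observed difference, counting both directions."""
    threshold = abs(observed)
    upper = sum(d >= threshold for d in diffs)   # as large or larger
    lower = sum(d <= -threshold for d in diffs)  # as small or smaller, mirrored
    return (upper + lower) / len(diffs)


# Toy reshuffled differences, invented for illustration only.
toy_diffs = [-3.0, -2.5, -1.0, 0.0, 0.5, 1.0, 2.47, 3.0]
toy_p = two_sided_p(2.47, toy_diffs)  # 2 upper + 2 lower out of 8
print(f"toy two-sided p: {toy_p:.2f}")

# The counts reported on the slide give the same arithmetic:
slide_p = (167 + 169) / 10_000
print(f"slide two-sided p: {slide_p:.4f}")
```

Counting the two tails separately, rather than doubling the one-sided value, matches the slide, where the two tails contain slightly different numbers of reshuffled differences (167 and 169).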

What is enough evidence for inference?

  • We have produced a P-value!

  • So far, given our experiment, how do you define the P-value?

  • Is the P-value low enough to conclude anything about our pills?

References

Dienes, Zoltan. 2008. Understanding Psychology as a Science: An Introduction to Scientific and Statistical Inference. New York: Palgrave Macmillan.
Svetkey, Laura P. 2008. “Comparison of Strategies for Sustaining Weight Loss: The Weight Loss Maintenance Randomized Controlled Trial.” JAMA 299 (10): 1139. https://doi.org/10.1001/jama.299.10.1139.