Appendix E — Inference
E.1 Interpretation of Negative Findings
If a confidence interval includes the null hypothesis, or equivalently if a hypothesis test fails to reject the null hypothesis, that doesn’t necessarily mean that the null hypothesis is true. Accordingly, we should not write interpretations of results as “the odds (or risks/hazards/means) are not significantly different”; instead, we should write something like “the data does not provide statistically significant EVIDENCE that the odds (or analogous estimands) differ”. Statistical significance is a characteristic of evidence, not of the estimands.
P-values do not distinguish between absence of evidence and evidence of absence.
Confidence intervals do: if the confidence interval is narrow and includes the null value, then that confidence interval represents evidence of absence. If a confidence interval includes the null value but also includes substantially non-null values, then that confidence interval represents absence of evidence.
Also, even if we do have statistically significant evidence of a non-null value, the estimated value may not be substantially different from 0, depending on what estimand is. For example, we might have statistically significant evidence that a certain exercise prolongs human lifespans by 20 seconds, but that effect would probably not be substantially different from 0 in practical terms.
Figure E.1 sketches various scenarios for confidence intervals, from office hours. To do: convert this sketch into a nicely formatted figure.
See also Vittinghoff et al. (2012) §3.7 (p64).
E.2 Confidence intervals
Definition E.1 (margin of error) The margin of error (a.k.a. the radius) is one-half the width of a confidence interval.
more: