Confidence Intervals: What the Error Bars Really Mean

A 95% confidence interval gives the range of true effect values that are reasonably compatible with the data. If the interval crosses zero (or 1.0 for odds ratios), the effect is not statistically significant at the 5% level. Wider intervals signal more uncertainty; narrower ones suggest more precision.


Understanding Uncertainty Through Intervals

When a study reports that a probiotic reduces bloating by 2 points on a 10-point scale (95% CI: 0.5-3.5), the confidence interval tells a crucial story about certainty. The interval represents the plausible range of true effects, accounting for sample variability.

Interpretation requires care. A 95% confidence interval does NOT mean "there's a 95% chance the true effect falls in this range." (That would be Bayesian credible interval language.) Instead, imagine conducting the same study 100 times and calculating a confidence interval each time. Approximately 95 of those 100 intervals would contain the true effect; 5 would miss it. This repeated-sampling framework defines frequentist confidence intervals.
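The repeated-sampling idea can be checked with a small simulation. The sketch below is illustrative only: the true effect, standard deviation, sample size, and number of simulated studies are all assumed values, and the interval uses the normal critical value 1.96 rather than a t-quantile, so the observed coverage lands close to 95% rather than exactly on it.

```python
import random
import statistics
from math import sqrt


def coverage(true_mean=2.0, sd=3.0, n=50, trials=2000, z=1.96, seed=0):
    """Fraction of simulated studies whose 95% CI contains the true mean.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated "study": n noisy measurements of the true effect.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / sqrt(n)
        # Count the study as a hit when its interval covers the truth.
        if mean - z * se <= true_mean <= mean + z * se:
            hits += 1
    return hits / trials


rate = coverage()
print(f"Intervals containing the true effect: {rate:.1%}")
```

Any single interval either contains the true effect or it does not; the 95% describes how often the procedure succeeds across many repetitions.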

A CI that crosses the null signals non-significance. For continuous measures, null = 0. If a study reports effect = 1.2 mg/dL (95% CI: -0.5 to 2.9), the interval crosses zero, meaning the true effect might be zero or even negative. This study fails to demonstrate statistical significance at the 5% level, regardless of what the abstract claims.
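As a rough sketch, the null check and an approximate p-value can both be recovered from a reported interval. This assumes the interval is symmetric and was built from a normal approximation; the function names are hypothetical helpers, not part of any library.

```python
from statistics import NormalDist


def crosses_null(lower, upper, null=0.0):
    """True when the interval contains the null value."""
    return lower <= null <= upper


def approx_p_from_ci(estimate, lower, upper, null=0.0, level=0.95):
    """Approximate two-sided p-value, assuming a symmetric normal-based CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (upper - lower) / (2 * z_crit)             # back out the standard error
    z = (estimate - null) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The two examples from the text above.
cholesterol_p = approx_p_from_ci(1.2, -0.5, 2.9)
probiotic_p = approx_p_from_ci(2.0, 0.5, 3.5)

print(crosses_null(-0.5, 2.9), round(cholesterol_p, 3))
print(crosses_null(0.5, 3.5), round(probiotic_p, 3))
```

The interval that crosses zero corresponds to a p-value above 0.05, and the one that excludes zero corresponds to a p-value below 0.05, which is why the two checks agree.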

For odds ratios and relative risks, null = 1.0 (no difference between groups). A study showing OR = 1.2 (95% CI: 0.9-1.6) crosses 1.0, indicating non-significance. The data are compatible with equal odds between groups (OR = 1.0), with the observed 20% increase (OR = 1.2), and with anything from a 10% reduction to a 60% increase.
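For ratio measures the interval is usually built on the log scale and then exponentiated, which is why it is asymmetric around the point estimate. A minimal sketch using the standard log (Woolf) method follows; the 2x2 counts are invented for illustration and do not come from any study.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% CI from a 2x2 table via the log (Woolf) method.

    a, b = events / non-events in the exposed group
    c, d = events / non-events in the control group
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 60/200 events with exposure, 50/200 without.
or_est, or_lo, or_hi = odds_ratio_ci(60, 140, 50, 150)
significant = not (or_lo <= 1.0 <= or_hi)  # null for ratios is 1.0, not 0
print(f"OR = {or_est:.2f} (95% CI: {or_lo:.2f}-{or_hi:.2f}), significant: {significant}")
```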

CI width reflects sample size and effect variability. Narrow intervals (tight bounds) suggest precision: large samples or low variability yield tightly constrained estimates. Wide intervals signal uncertainty. A weight-loss study with CI: 2-30 kg lost conveys far less certainty than one with CI: 8-12 kg lost, because the first is compatible with almost any outcome from trivial to dramatic.
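The dependence on sample size follows from the standard error of a mean, which shrinks with the square root of n. A small sketch, assuming a normal-based interval and a hypothetical standard deviation of 10 kg:

```python
from math import sqrt


def ci_half_width(sd, n, z=1.96):
    """Half-width of a normal-based 95% CI for a mean."""
    return z * sd / sqrt(n)


# Quadrupling the sample size halves the interval width.
for n in (25, 100, 400):
    print(n, round(ci_half_width(10.0, n), 2))
# 25 3.92
# 100 1.96
# 400 0.98
```

Because of the square root, halving the width of an interval requires roughly four times as many participants.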

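When comparing treatments, overlapping CIs do not establish equivalence. The CIs of two groups can overlap while the difference between them is still statistically significant, so the reliable check is the CI of the difference itself. Non-overlapping 95% CIs, by contrast, do indicate a significant difference. In practical terms, heavily overlapping intervals suggest the effects are not dramatically different.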
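A worked example makes the overlap point concrete. The group means and standard errors below are hypothetical, the groups are assumed independent, and the intervals use the normal approximation.

```python
from math import sqrt

Z = 1.96


def ci(estimate, se, z=Z):
    """Normal-based 95% CI as a (lower, upper) tuple."""
    return estimate - z * se, estimate + z * se


# Hypothetical treatment arms: (mean, standard error).
mean_a, se_a = 10.0, 1.0
mean_b, se_b = 13.0, 1.0

ci_a = ci(mean_a, se_a)
ci_b = ci(mean_b, se_b)
intervals_overlap = ci_a[1] >= ci_b[0]

# For independent groups, the SE of the difference combines both SEs.
diff = mean_b - mean_a
ci_diff = ci(diff, sqrt(se_a**2 + se_b**2))
difference_significant = not (ci_diff[0] <= 0.0 <= ci_diff[1])

print("Group CIs overlap:", intervals_overlap)
print("Difference CI excludes zero:", difference_significant)
```

The two group intervals overlap, yet the interval for the difference excludes zero, because the standard error of a difference is smaller than the sum of the two individual standard errors.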

Forest plots in meta-analyses make CIs easy to compare. Each study appears as a point estimate with horizontal error bars representing its CI. The overall meta-analytic estimate sits at the bottom, and its CI is usually narrower because it reflects the combined data. Significance of the pooled analysis depends on the pooled CI alone: if that interval crosses the null line, the pooled result is non-significant. Individual studies may each cross the null while the pooled estimate does not.
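To illustrate how pooling narrows the interval, here is a minimal fixed-effect (inverse-variance) sketch. The three studies are invented for illustration, standard errors are backed out of the reported CIs under a normal approximation, and a real meta-analysis would also need to consider heterogeneity and random-effects models.

```python
from math import sqrt

Z = 1.96

# Hypothetical studies: (estimate, CI lower, CI upper). Each crosses zero.
studies = [
    (1.0, -0.5, 2.5),
    (1.5, -0.2, 3.2),
    (1.2, -0.1, 2.5),
]


def pooled_fixed_effect(rows, z=Z):
    """Inverse-variance weighted estimate with its 95% CI."""
    weights, weighted_sum = [], 0.0
    for est, lo, hi in rows:
        se = (hi - lo) / (2 * z)   # recover the SE from the reported CI
        w = 1 / se**2              # more precise studies get more weight
        weights.append(w)
        weighted_sum += w * est
    total = sum(weights)
    pooled = weighted_sum / total
    pooled_se = sqrt(1 / total)
    return pooled, pooled - z * pooled_se, pooled + z * pooled_se


pooled, pooled_lo, pooled_hi = pooled_fixed_effect(studies)
print(f"Pooled: {pooled:.2f} (95% CI: {pooled_lo:.2f} to {pooled_hi:.2f})")
```

In this made-up example every individual interval includes zero, but the pooled interval does not, which is the pattern a forest plot shows when the diamond at the bottom clears the null line.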

Microbiome studies increasingly report CIs for alpha diversity measures, taxa relative abundances, and functional predictions. A study showing that Faecalibacterium abundance increased from 5% to 8% (95% CI for the post-intervention value: 6-10%) in the intervention group conveys both the point estimate and its uncertainty. This beats reporting only the 8% without acknowledging sampling error.

The difference between precision and accuracy matters. A narrow CI doesn't guarantee accuracy: systematic bias from a flawed study design creates false precision. Conversely, a wide CI from unbiased methods conveys honest uncertainty. Well-conducted studies show wide CIs in small samples that narrow progressively as samples grow.
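A short simulation can show false precision in action. Everything here is assumed for illustration: a true effect of zero, a systematic measurement bias of 0.5, and the two sample sizes.

```python
import random
import statistics
from math import sqrt


def mean_ci(sample, z=1.96):
    """Normal-based 95% CI for the mean of a sample."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / sqrt(len(sample))
    return m - z * se, m + z * se


rng = random.Random(1)
TRUE_EFFECT = 0.0
BIAS = 0.5  # hypothetical systematic error, e.g. a miscalibrated instrument

# Large but biased study versus small but unbiased study.
biased_large = [rng.gauss(TRUE_EFFECT + BIAS, 1.0) for _ in range(10_000)]
unbiased_small = [rng.gauss(TRUE_EFFECT, 1.0) for _ in range(20)]

biased_ci = mean_ci(biased_large)
honest_ci = mean_ci(unbiased_small)

print("Biased, n=10000:", tuple(round(x, 2) for x in biased_ci))
print("Unbiased, n=20: ", tuple(round(x, 2) for x in honest_ci))
```

The large biased study produces a very narrow interval centered near 0.5, confidently excluding the true effect of zero. More data shrinks random error but does nothing to remove systematic error.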

Reporting standards increasingly call for CIs. The CONSORT statement for randomized trials and the STROBE statement for observational studies both ask authors to report estimates together with their precision, such as 95% confidence intervals. Journals that enforce these guidelines make reported evidence easier to evaluate.
